
    Are we to understand, given the amount of time dedicated to Terry Winograd’s work in the article, that no progress has been made on this since 1972? Although I am not terribly surprised, it makes me a little sad. As an avid gamer and occasional game developer, I have wanted to see progress in conversational interfaces for a long time; I want to use the tech for NPC interactions. I even came up with the broad strokes of a framework myself, though I have never found the time to work through the details and build a prototype. It is a huge task.


      There have been some others. I remember reading about a system called CYRUS that knew all about [1970s US Secretary of State] Cyrus Vance and could answer complex queries about where he’d been and who he’d met with.

      Some interactive-fiction parsers have edged into this territory, too. Not just supporting more complex grammar, but also forming simple plans for implicit sub-actions the player never typed, like taking the key out of your pocket and unlocking the door when the user says “open door.”
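
      A toy sketch of that kind of implicit planning might look like the following; the actions, facts, and function names are all made up for illustration, not taken from any real IF engine:

      ```python
      # Hypothetical actions with preconditions and effects, plus a tiny
      # backward-chaining planner that inserts the unstated steps.
      from dataclasses import dataclass, field

      @dataclass
      class Action:
          name: str
          preconditions: set = field(default_factory=set)
          effects: set = field(default_factory=set)

      ACTIONS = [
          Action("take key from pocket", effects={"holding key"}),
          Action("unlock door", preconditions={"holding key"}, effects={"door unlocked"}),
          Action("open door", preconditions={"door unlocked"}, effects={"door open"}),
      ]

      def plan(goal, state, depth=8):
          """Return a list of action names that establishes `goal`,
          recursively satisfying unmet preconditions first."""
          if goal in state or depth == 0:
              return []
          for action in ACTIONS:
              if goal in action.effects:
                  steps = []
                  for pre in action.preconditions:
                      steps += plan(pre, state, depth - 1)
                  state.update(action.effects)
                  return steps + [action.name]
          return []  # no known action produces this fact

      # The player types "open door"; the planner fills in the rest:
      print(plan("door open", state=set()))
      # -> ['take key from pocket', 'unlock door', 'open door']
      ```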


        I wouldn’t say there’s been no progress since SHRDLU, but most research since then has become much more narrowly focused and arguably a lot less ambitious. But Winograd’s goals were also less ambitious than those of his peers; he was more interested in creating something that worked than in something that might someday lead to a perfectly general solution. I used his work as an example because I think it’s a really interesting mix of ambition and pragmatism, in addition to being a clear demonstration of how limited the current generation of slot-and-intent agents is.


          The quest for a perfectly general solution is exactly what I perceive to be the main obstacle to all worthwhile research in this and many other fields of machine learning and AI. Even proposing that we should work on conversational interfaces for games is met with scoffs, eye-rolling and comments about it being an ‘AI-complete’ problem. It annoys me no end. I wish you luck in this project and hope such people do not distract you too much from getting it done.

          One optimisation I was planning in my imaginings was to define spheres of knowledge, so you can easily switch between different use-cases without having to reinitialise too much. You would then load a list of spheres into the language agent for a given task. For example, if the user was in the Windows settings menu you might add that sphere to the most general one, but when they are using a specific application, load the sphere for that application (if available). This gives you an agent that only knows how to talk about the things that are relevant at the time.

          This might not be so important in a conversational interface assistant, but in a game where you might have hundreds or even thousands of NPCs, a modular way to define what they can and can’t talk about is absolutely necessary. NPCs often need to say “I have no idea.” or “What the hell is a Jabberwocky?” even if the language engine knows the answer to the query. It could also reduce confusion in cases where similar queries mean different things in different contexts.
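
          A minimal sketch of those two ideas together (sphere loading plus NPC gating), with hypothetical names throughout; a real version would sit in front of whatever language engine actually answers the queries:

          ```python
          # Hypothetical "spheres of knowledge": each one is just a bag of
          # topic -> answer facts that can be loaded into or unloaded from an NPC.
          class KnowledgeSphere:
              def __init__(self, name, facts):
                  self.name = name
                  self.facts = facts

          GENERAL = KnowledgeSphere("general", {"greeting": "Hello, traveller."})
          SMITHING = KnowledgeSphere("smithing", {"swords": "Finest steel in town."})
          RUMOURS = KnowledgeSphere("rumours", {"jabberwocky": "They say it haunts the old mine."})

          class NPC:
              """Answers only from its currently loaded spheres, even if the
              underlying language engine could answer more."""
              def __init__(self, *spheres):
                  self.spheres = list(spheres)

              def load(self, sphere):
                  self.spheres.append(sphere)

              def unload(self, sphere):
                  self.spheres.remove(sphere)

              def answer(self, topic):
                  for sphere in self.spheres:
                      if topic in sphere.facts:
                          return sphere.facts[topic]
                  return "I have no idea."

          smith = NPC(GENERAL, SMITHING)
          print(smith.answer("swords"))       # -> Finest steel in town.
          print(smith.answer("jabberwocky"))  # -> I have no idea.
          smith.load(RUMOURS)
          print(smith.answer("jabberwocky"))  # -> They say it haunts the old mine.
          ```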