research_dialogue_lab

The Second Workshop on Action, Perception and Language (APL'2)


SLTC workshop, November 13, 2014, Uppsala, Sweden

Workshop programme and proceedings

09:00 - 09:05 Welcome
Session 1
09:05 - 09:35 Francesco-Alessio Ursini and Aijun Huang: Objects and Nouns: Ontologies and Relations
09:35 - 10:05 Simon Dobnik, Robin Cooper and Staffan Larsson: Type Theory with Records: a General Framework for Modelling Spatial Language
10:05 - 10:25 Coffee break
Session 2
10:25 - 10:55 Robert Ross and John Kelleher: Using the Situational Context to Resolve Frame of Reference Ambiguity in Route Description
10:55 - 11:25 Raveesh Meena, Johan Boye, Gabriel Skantze and Joakim Gustafson: Using a Spoken Dialogue System for Crowdsourcing Street-level Geographic Information
11:25 - 11:55 Robert Ross: Looking Back at Daisie: A Retrospective View on Situated Dialogue Systems Development
11:55 - 12:00 Concluding remarks

Situated agents must be able to interact both with the physical environment they are located in and with their conversational partner. Such an agent receives information from its conversational partner and from the physical world, and it must integrate the two appropriately. Furthermore, since both the world and the language vary from one context to another, it must be able to adapt to such changes or learn from new information. Embodied and situated language processing addresses challenges in natural language processing such as word sense disambiguation and the interpretation of words in discourse, and it also gives us new insights into human cognition, knowledge, meaning and its representation. Research in vision relies on information represented in natural language, for example in the form of ontologies, as this captures how humans partition and reason about the world. Conversely, gestures and sign language are languages that are expressed and interpreted as visual information.

The Second Workshop on Action, Perception and Language (APL'2) is a continuation of the successful APL workshop held at SLTC 2012 in Lund and is intended to be a networking and community-building event for researchers who are interested in any form of interaction between natural language and the physical world in a computational framework. Example areas include semantic theories of human language, action and perception, situated dialogue, situated language acquisition, grounding of language in action and perception, spatial cognition, generation and interpretation of gestures, generation and interpretation of scene descriptions from images and videos, integrated robotic systems, and others. We welcome papers that describe either theoretical or practical solutions, as well as work in progress.

Research connecting language and the world is a burgeoning area to which several international conferences and workshops are devoted. It brings together several scientific communities (natural language technology, computer vision, robotics, localisation and navigation). Traditionally, natural language technology has worked separately from the other fields, but research in the last 15 years has shown that there are many synergies between them and that hybrid approaches may provide better solutions to many challenging problems, for example the interpretation and generation of spatial language and object recognition. We hope that the APL workshop, collocated with the SLTC conference, will become a local forum that leads to new collaborations between the computer vision and natural language communities in Sweden.

Image of a Crispin apple courtesy of New York Apple Association, © New York Apple Association.

Workshop organisers

  • Simon Dobnik (University of Gothenburg)
  • Staffan Larsson (University of Gothenburg)
  • Robin Cooper (University of Gothenburg)

Programme committee

Anja Belz (University of Brighton), Johan Boye (KTH), Ellen Breitholtz (University of Gothenburg), Robin Cooper (University of Gothenburg), Nigel Crook (Oxford Brookes University), Kees Van Deemter (University of Aberdeen), Simon Dobnik (University of Gothenburg), Jens Edlund (KTH), Raquel Fernández (University of Amsterdam), Joakim Gustafson (KTH), Pat Healey (Queen Mary, University of London), Anna Hjalmarsson (KTH), Christine Howes (University of Gothenburg), John Kelleher (DIT), Emiel Krahmer (Tilburg University), Torbjörn Lager (University of Gothenburg), Shalom Lappin (King's College, London), Staffan Larsson (University of Gothenburg), Pierre Lison (University of Oslo), Peter Ljunglöf (University of Gothenburg/Chalmers), Joanna Isabelle Olszewska (University of Gloucestershire), Stephen Pulman (University of Oxford), Matthew Purver (Queen Mary, University of London), Robert Ross (DIT), David Schlangen (Bielefeld University), Gabriel Skantze (KTH), Holger Schultheis (University of Bremen), and Mats Wirén (Stockholm University)

Call for papers

We welcome 2-page extended abstracts formatted according to the SLTC templates for LaTeX and Word.

Please submit your abstract as a pdf document with your author details removed through EasyChair here.

The submitted abstracts will be published on the workshop web page and the authors will be given an opportunity to present their work at the workshop as oral presentations and/or posters (depending on the type and number of submissions).

Following the workshop, the contributing authors will be invited to submit full-length (8-page) papers to be published online in the CEUR Workshop Proceedings (ISSN 1613-0073).

Important dates

  • July 14: submission opens
  • September 30 (extended from September 25): extended abstract submission deadline
  • October 9: notification of acceptance
  • October 13: SLTC early registration deadline
  • October 30: camera-ready extended abstracts for publication
  • November 13, 09:00 - 12:00: workshop

Contact details

apl@dobnik.net

TTNLS (Type Theory and Natural Language Semantics): Programme


All talks will take place in lecture room HC1 at Chalmers University of Technology. For details, see the attached map below or here. Each talk is allotted 25 minutes for the presentation and 5 minutes for questions. Full papers are available in the ACL proceedings.

8:45 - 9:00: Opening remarks
Robin Cooper

9:00 - 10:00: Invited talk: Types and Records for Predication
Aarne Ranta

10:00 - 10:30: System with Generalized Quantifiers on Dependent Types for Anaphora
Justyna Grudzinska1 and Marek Zawadowski2
1University of Warsaw, Institute of Philosophy, 2University of Warsaw, Institute of Mathematics

10:30 - 11:00: Coffee break

11:00 - 11:30: Monads as a Solution for Generalized Opacity
Gianluca Giorgolo1 and Ash Asudeh2
1University of Oxford, 2University of Oxford & Carleton University

11:30 - 12:00: The Phenogrammar of Coordination
Chris Worth
The Ohio State University

12:00 - 12:30: Natural Language Reasoning Using Proof Technology: Rich Typing and Beyond
Stergios Chatzikyriakidis1 and Zhaohui Luo2
1,2Royal Holloway, University of London

12:30 - 14:00: Lunch

14:00 - 14:30: A Type-Driven Tensor-Based Semantics for CCG
Jean Maillard1, Stephen Clark1, Edward Grefenstette2
1University of Cambridge, 2University of Oxford

14:30 - 15:00: From Natural Language to RDF Graphs with Pregroups
Antonin Delpeuch1 and Anne Preller2
1École Normale Supérieure, 2LIRMM

15:00 - 15:30: Incremental semantic scales by strings
Tim Fernando
Trinity College Dublin

15:30 - 16:00: Coffee break

16:00 - 16:30: A Probabilistic Rich Type Theory for Semantic Interpretation
Robin Cooper1, Simon Dobnik1, Shalom Lappin2, Staffan Larsson1
1University of Gothenburg, 2King's College London

16:30 - 17:00: Probabilistic Type Theory for Incremental Dialogue Processing
Julian Hough1 and Matthew Purver2
1,2Queen Mary University of London

17:00 - 17:30: Abstract Entities: a type theoretic approach
Jonathan Ginzburg1, Robin Cooper2, Tim Fernando3
1Université Paris-Diderot (Paris 7), 2University of Gothenburg, 3Trinity College Dublin

17:30 - 18:30: Concluding discussion

19:00 - 21:00: EACL Reception


Dialogue-based Search Solution

Masters Thesis proposal (in cooperation with Findwise & Talkamatic)

Background

In an enterprise environment, a search system often crawls and indexes a large number of different data sources: databases, content management systems, external web pages, file shares with different types of documents, etc. Each of the data sources or sub-sources may have a primary target group, e.g. sales, engineers, marketing, doctors, nurses, etc., depending on the type of organisation.

The purpose of the (unified) search system is to serve as a platform (a single entry point) that satisfies the information needs of all the different groups in an organisation. However, given that search queries are often short (~2.2 words) and ambiguous, and that users have different backgrounds, the system employs a number of techniques for filtering and drilling down into the search results. One such technique is facets, i.e. filtering based on data source, additional keywords, dates, time, etc.

On the other hand, there are at least two types of user behaviour: users who know exactly what they are looking for and how to find it, and who use search instead of menu clicks; and users who do not know exactly what they are looking for, nor where the relevant information may be found. We can consider these two groups as the two extremes on a much more fine-grained scale.

We would like to concentrate on the second group of users, who often engage in some sort of dialogue with the search system. Such users may interact with the system in several ways during a search session: they may rewrite and expand their original query, filter it by facets, and click on some documents until they finally discover (or fail to discover) what they were looking for.

Dialogue Systems

Spoken dialogue systems are computer systems that use speech as their primary input and output channels. Dialogue systems are primarily used in situations where the visual and tactile channels are not available, for instance while driving, but also to replace human operators, for instance in call centres. Recently, spoken dialogue systems have become more widespread with the arrival of Apple's Siri and Google's Voice Actions, even outside the traditional areas of use. As speech and voice have the potential of transmitting large quantities of information very fast compared to traditional GUI interaction, this is a development which is likely to continue.

A spoken dialogue system typically consists of a dialogue manager, an automatic speech recogniser (ASR), a text-to-speech (TTS) engine, modules for the interpretation and generation of utterances, and finally some kind of application logic.
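To make the division of labour concrete, here is a minimal, hypothetical sketch of such a pipeline in Python. The class and method names are illustrative placeholders and do not correspond to any particular toolkit:

# A minimal, hypothetical sketch of a spoken dialogue system pipeline.
# All class and method names are illustrative placeholders.

class ASR:
    def recognise(self, audio):
        """Automatic speech recognition: audio -> text hypothesis."""
        raise NotImplementedError

class Interpreter:
    def interpret(self, text):
        """Interpretation: text -> dialogue moves (a semantic representation)."""
        raise NotImplementedError

class DialogueManager:
    def next_moves(self, user_moves):
        """Update the dialogue state and decide the system's next moves."""
        raise NotImplementedError

class Generator:
    def generate(self, system_moves):
        """Generation: dialogue moves -> text."""
        raise NotImplementedError

class TTS:
    def speak(self, text):
        """Text-to-speech synthesis."""
        raise NotImplementedError

def dialogue_turn(audio, asr, interpreter, dm, generator, tts):
    """One user turn: ASR -> interpretation -> dialogue management -> generation -> TTS."""
    text = asr.recognise(audio)
    user_moves = interpreter.interpret(text)
    system_moves = dm.next_moves(user_moves)  # the application logic is consulted here
    reply = generator.generate(system_moves)
    tts.speak(reply)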

Voice search is a term which has emerged in recent years. The user speaks a search query, and the system responds by returning a hit list, much like an ordinary Google search. If the hit list does not contain the desired hit (document, music file, web site, etc.), the user needs to perform a new voice search with a modified utterance.

The idea of this project is to replace voice search with dialogue-based search, where the user and the system engage in a dialogue over the search results in order to refine the search query.

Dialogue-based Search – case study

The task of the Masters thesis is to explore the possibilities of using dialogue systems and dialogue acts to satisfy the information needs of certain groups of users of a search system. The target group consists of several types of users:

  • Users who submit very broad and ambiguous search queries (e.g. “Greece”, “food”, “pm”)
  • Users who do not employ the tools provided by the Search system such as facets (e.g. queries such as “pm pdf”)
  • Users with exploratory queries (e.g. “Abba first album”)

Document format – details

Before documents are sent for indexing in the search system, they are augmented with metadata. The metadata allows us to do a number of things:

  • Advanced queries
  • Filtering
  • Sorting
  • Faceting
  • Ranking

The format of an indexed document could look like this:

<doc>
  <field name="id">6H500F0</field>
  <field name="name">Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300</field>
  <field name="manufacturer">Maxtor Corp.</field>
  <field name="category">electronics</field>
  <field name="category">hard drive</field>
  <field name="features">SATA 3.0Gb/s, NCQ</field>
  <field name="features">8.5ms seek</field>
  <field name="features">16MB cache</field>
  <field name="price">350</field>
  <field name="popularity">6</field>
  <field name="inStock">true</field>
  <field name="manufacturedate_dt">2006-02-13T15:26:37Z</field>
</doc>

<doc>
  <field name="id">1</field>
  <field name="title">London</field>
  <field name="body">London is the capital of UK. London has 7.8 million inhabitants</field>
  <field name="places">London</field>
  <field name="date">2012-11-30</field>
  <field name="author">John Pear</field>
  <field name="author_email">john@pear.com</field>
  <field name="author_phone">+44 123 456 789</field>
</doc>
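The document format above resembles the indexing format of a Solr-style search engine. As a hedged illustration of how the metadata supports filtering, faceting and sorting, the sketch below issues one such query over an index of documents like the first example; the host, collection name and response handling assume a default Solr installation and are not part of the thesis specification:

# Hypothetical example: a faceted, filtered and sorted query against a
# Solr-style index of the documents above. Host, collection name and field
# names are assumptions for illustration.
import requests

params = {
    "q": "hard drive",                       # free-text query
    "fq": 'manufacturer:"Maxtor Corp."',     # filter query (drill-down)
    "facet": "true",
    "facet.field": ["category", "inStock"],  # facet on metadata fields
    "sort": "price asc",                     # sort on a metadata field
    "wt": "json",
}
response = requests.get("http://localhost:8983/solr/products/select", params=params)
results = response.json()

for doc in results["response"]["docs"]:
    print(doc["name"], doc["price"])
print(results["facet_counts"]["facet_fields"]["category"])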

Supervision: 

Peter Ljunglöf (CS) together with Findwise AB and Talkamatic AB.

Free Robust Parsing

Background

Open speech recognition

Talkamatic builds dialogue systems and is currently using a GF-based grammar tool for parsing and generation. A unified language description is compiled into a speech recognition grammar (for Nuance Vocon ASR, PocketSphinx and others), a parser and a generator.

The problem with this is that the parser can only handle the utterances which the ASR can recognize from the ASR grammar. The parser is thus not robust, and if an open dictation grammar is used (such as Dragon Dictate, used in Apple's Siri), the parser is mostly useless.

Ontology

Currently TDM (the Talkamatic Dialogue Manager) requires all concepts used in the dialogue to be known in advance. Hence, for a dialogue-controlled music player, all artists, songs, genres etc. need to be known and explicitly declared beforehand.

There are disadvantages with this approach. For example, it requires access to an extensive music database in order to be able to build a dialogue interface for a music player.

Problem description

To simplify the building of dialogue interfaces for this kind of application, it would be useful to have a more robust parser, which can identify sequences of dialogue moves in arbitrary user input strings.

Examples of utterances and the dialogue moves they should be mapped to:

  • "Play Like a Prayer with Madonna" → request(play_song), answer("Like a Prayer":song_title), answer("Madonna":artist_name)
  • "Play Sisters of Mercy" → request(play_song), answer("Sisters of Mercy":song_name)
  • "Play Sisters of Mercy" → request(play_artist), answer("Sisters of Mercy":artist_name)
  • "I would like to listen to Jazz" → request(play_genre), answer("Jazz":genre_name)

Method

Several different methods can be used: named entity recognizers, regular expressions, databases, etc., or combinations of these. A strong requirement is that the parser should be built automatically or semi-automatically from a small corpus or database. Computational efficiency is also desirable, but less important. The parser must have a Python interface and run on Linux.
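As a rough, non-binding illustration of the intended input/output behaviour, the sketch below combines a regular expression with small hand-written entity lists to produce dialogue-move readings of the kind listed in the examples above. The entity lists, move names and coverage are invented for the example; a real solution would be induced (semi-)automatically from a corpus or database, as required above:

# Illustrative sketch only: map free-form utterances to dialogue moves using
# a regular expression plus small hand-written entity lists. Move and type
# names follow the examples above; everything else is assumed.
import re

SONGS   = {"like a prayer", "sisters of mercy"}
ARTISTS = {"madonna", "sisters of mercy"}
GENRES  = {"jazz", "pop"}

def parse(utterance):
    """Return a list of candidate dialogue-move readings for the utterance."""
    text = utterance.lower()
    readings = []

    m = re.match(r"play (?P<song>.+?)(?: (?:with|by) (?P<artist>.+))?$", text)
    if m:
        song, artist = m.group("song"), m.group("artist")
        if song in SONGS:
            moves = ["request(play_song)", f'answer("{song}":song_title)']
            if artist in ARTISTS:
                moves.append(f'answer("{artist}":artist_name)')
            readings.append(moves)
        if artist is None and song in ARTISTS:
            readings.append(["request(play_artist)", f'answer("{song}":artist_name)'])

    m = re.search(r"listen to (?P<genre>.+)$", text)
    if m and m.group("genre") in GENRES:
        genre = m.group("genre")
        readings.append(["request(play_genre)", f'answer("{genre}":genre_name)'])

    return readings

print(parse("Play Sisters of Mercy"))            # two readings: song or artist
print(parse("I would like to listen to Jazz"))   # genre reading

Note that "Play Sisters of Mercy" comes out with two readings, mirroring the ambiguity in the examples above; resolving it would be left to the dialogue manager or to ranking.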

Supervision

Peter Ljunglöf, Chalmers, Department of Computer Science and Engineering, or Staffan Larsson, FLoV, together with Talkamatic AB. Talkamatic is a university research spin-off company based in Gothenburg.

Payment

A small compensation may be paid by Talkamatic AB when the thesis is completed.


Information extraction for dialogue interaction

 

The goal of the project is to equip a robotic companion/dialogue manager with topic modelling and information extraction from corpora, for example Wikipedia articles and topic-oriented dialogue corpora, in order to guide its conversation with a user. Rather than concentrating on a task, a companion engages in free conversation with a user and must therefore supplement traditional rule-based dialogue management with data-driven models. The project thus attempts to examine ways in which text-driven semantic extraction techniques can be integrated with rule-based dialogue management.

Possible directions of this project are:

A. Topic modelling

The system must robustly recognise the topics of the user's utterances in order to respond appropriately. This method can be used in addition to a rule-based technique. Given a suitable corpus of topic-oriented conversations (a small sketch of the second question follows the list):

  • what is the most likely topic in the user's dialogue move;
  • ... and given a sequence of topics discussed so far, what is the next most likely topic?
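A minimal sketch of the second question, assuming a corpus that is already segmented into conversations and annotated with one topic label per dialogue move; the toy corpus and topic labels are invented for illustration:

# Minimal sketch: predict the next topic from the topics discussed so far
# using a bigram model over topic sequences. The toy corpus is invented.
from collections import Counter, defaultdict

corpus = [
    ["greeting", "holidays", "travel", "food", "closing"],
    ["greeting", "weather", "holidays", "travel", "closing"],
    ["greeting", "food", "restaurants", "travel", "closing"],
]

bigrams = defaultdict(Counter)
for conversation in corpus:
    for prev, nxt in zip(conversation, conversation[1:]):
        bigrams[prev][nxt] += 1

def most_likely_next_topic(topic):
    """Return the topic that most frequently follows `topic` in the corpus."""
    if not bigrams[topic]:
        return None
    return bigrams[topic].most_common(1)[0][0]

print(most_likely_next_topic("holidays"))  # -> 'travel' on this toy corpus

The first question (which topic a single utterance is about) could be addressed with a topic classifier or an LDA-style topic model trained on the same corpus.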

B. Named entity recognition and information extraction for question
generation

The system could take the initiative and guide the conversation. It could start with some (Wikipedia) article and identify the named entities in it. If any of the entities match the domain of questions that it can handle, it should generate questions about them, for example:

User: I've been to Paris for holiday.
DM: Paris... I see. Have you been to the Eiffel tower?
...
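A rough sketch of the named-entity step, using spaCy's off-the-shelf recogniser and a small hand-written table of question templates. The model name, entity labels and templates are assumptions for the example, not a specification:

# Sketch: named-entity-driven question generation. The spaCy model name,
# the domain table and the question templates are assumptions.
import spacy

nlp = spacy.load("en_core_web_sm")  # small English model, assumed to be installed

# Hand-written domain: entities the dialogue manager knows how to ask about.
QUESTION_TEMPLATES = {
    "Paris": "Have you been to the Eiffel tower?",
    "London": "Did you take a ride on the London Eye?",
}

def questions_for(text):
    """Find named entities in the text and return questions for the known ones."""
    doc = nlp(text)
    questions = []
    for ent in doc.ents:
        if ent.label_ in {"GPE", "LOC"} and ent.text in QUESTION_TEMPLATES:
            questions.append(f"{ent.text}... I see. {QUESTION_TEMPLATES[ent.text]}")
    return questions

print(questions_for("I've been to Paris for holiday."))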

C. Question answering
 

Supervisors: Simon Dobnik and possibly others from the Dialogue Technology Lab

Learning language and perception with a robot

 

The task of the project is to learn a mapping between natural language descriptions on the one hand and sensory observations and commands issued to a simple mobile robot (Lego NX) on the other, using machine learning. The project involves building a corpus of descriptions paired with actions: one person guides the robot while another person describes what is happening. Multimodal machine-learning models would then be built from this corpus to predict a description, an action or a perceptual observation. Finally, the models should be integrated with a simple dialogue manager with which humans can interact, in order to test the success of the learning in context.

The system should be implemented in ROS (Robot Operating System), which provides access to the sensors and actuators of the robot and allows new models to be written in a simplified (well-organised) manner in Python.

Contributions/possible research directions of this thesis:

  • to examine to what extent and in what situations the Lego NX robot can be used for learning multimodal semantics;
  • to examine whether a bag-of-features approach (with both linguistic and perceptual/action features) can be used to learn multimodal semantic representations (see the sketch after this list);
  • to examine how such models can be integrated with dialogue;
  • to examine ML techniques that would actively learn/update the models through interaction with a user (clarification, correction).
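A minimal sketch of the bag-of-features idea from the second point above, assuming a toy corpus where each example pairs words from a description with discretised sensor and motor features; the feature names, labels and classifier choice are illustrative only:

# Minimal sketch of a bag-of-features model over combined linguistic and
# perceptual/action features. The toy data and feature names are invented.
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import MultinomialNB

# Each example: words from the description plus discretised sensor/motor
# features, labelled with the action the robot was performing.
examples = [
    ({"w=go": 1, "w=forward": 1, "sonar=far": 1, "motor=both_fwd": 1}, "move_forward"),
    ({"w=turn": 1, "w=left": 1, "sonar=far": 1, "motor=left_stop": 1}, "turn_left"),
    ({"w=stop": 1, "sonar=near": 1, "motor=both_stop": 1}, "stop"),
    ({"w=go": 1, "w=ahead": 1, "sonar=far": 1, "motor=both_fwd": 1}, "move_forward"),
]

features, labels = zip(*examples)
vectoriser = DictVectorizer()
X = vectoriser.fit_transform(features)

model = MultinomialNB()
model.fit(X, list(labels))

# Predict the action from a new multimodal observation.
new_obs = vectoriser.transform([{"w=turn": 1, "w=left": 1, "sonar=far": 1}])
print(model.predict(new_obs))  # expected: ['turn_left'] on this toy data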


Supervisors: Simon Dobnik and possibly others from the Dialogue Technology Lab

Workshop on Language, Action and Perception (APL)


SLTC workshop, October 25, 2012, Lund, Sweden

Call for papers

The Workshop on Language, Action and Perception (APL) is intended to be a networking and community-building event for researchers who are interested in any form of interaction between natural language and the physical world in a computational framework. Both theoretical and practical proposals are welcome. Example areas include semantic theories of human language, action and perception, situated dialogue, situated language acquisition, grounding of language in action and perception, spatial cognition, generation and interpretation of scene descriptions from images and videos, integrated robotic systems, and others. We would also like to welcome researchers from the computer vision and robotics communities, who are increasingly using linguistic representations such as ontologies to improve image interpretation, object recognition, localisation and navigation.

Programme committee

Johan Boye (KTH)
Robin Cooper (University of Gothenburg)
Nigel Crook (Oxford Brookes University)
Simon Dobnik (University of Gothenburg)
Raquel Fernandez (University of Amsterdam, The Netherlands)
John Kelleher (Dublin Institute of Technology, Ireland)
Staffan Larsson (University of Gothenburg)
Peter Ljunglöf (Chalmers University of Technology)
Robert Ross (Dublin Institute of Technology, Ireland)

Invited talks

Johan Boye (KTH) and Gabriel Skantze (KTH)

Submission details

We welcome 2-page extended abstracts formatted according to the SLTC templates for LaTeX and Word.

Please submit your abstract as a pdf document with your author details removed through EasyChair here.

The submitted abstracts will be published on the workshop web page and the authors will be given an opportunity to present their work at the workshop in the form of brief oral presentations followed by a poster session.

Following the workshop, the contributing authors will be invited to submit full-length (8-page) papers to be published online in the CEUR Workshop Proceedings (ISSN 1613-0073).

Important dates

  • 10 September 2012: abstract submission
  • 17 September 2012: extension of abstract submission deadline
  • 8 October 2012: notification of acceptance
  • 22 October 2012: camera-ready abstracts for publication online

Workshop organisation

If you are coming to the workshop, please don't forget to register for SLTC 2012 here. The registration is sponsored by the GSLT and therefore free for all participants.

The workshop will take place on October 25th 2012 in room E 1145 of the E building of LTH (Lunds tekniska högskola, part of Lund University), close to the other two SLTC 2012 workshops. You can find a map here.

The room will have a projector and a wifi connection. Eduroam should allow you to connect to the internet. If you don't have an Eduroam account, let us know in advance so that we can request wifi vouchers from the SLTC organisers.

Workshop programme and proceedings

Workshop organisers

Simon Dobnik, Staffan Larsson, Robin Cooper, Centre for Language Technology and Department of Philosophy, Linguistics, and Theory of Science, Gothenburg University

Contact details

name [dot] surname [at] gu [dot] se or apl2012 [at] easychair [dot] org

Image of a Red Rome apple courtesy of New York Apple Association, © New York Apple Association.

Multilingual FraCaS test suite

Goal

Develop a version of the FraCaS test suite in your native language.

Background

The FraCaS test suite was created as part of the FraCaS project back in the nineties. A few years ago Bill MacCartney (Stanford) made a machine-readable XML version of it, and it has been used in connection with textual entailment. This project involves developing the test suite further as a multilingual, web-accessible resource for computational semantics.
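As a hedged sketch of what "machine readable" buys us, the snippet below reads problems from the XML test suite with the Python standard library. The element and attribute names (problem, p, h, fracas_answer) follow the commonly circulated fracas.xml and should be checked against the actual file:

# Sketch: read FraCaS problems from the machine-readable XML version.
# Element/attribute names are assumptions based on the circulated fracas.xml.
import xml.etree.ElementTree as ET

tree = ET.parse("fracas.xml")  # path to the test suite (assumed)
for problem in tree.getroot().iter("problem"):
    premises = [p.text.strip() for p in problem.findall("p") if p.text]
    hypothesis = problem.findtext("h", default="").strip()
    answer = problem.get("fracas_answer")
    print(problem.get("id"), answer)
    for premise in premises:
        print("  P:", premise)
    print("  H:", hypothesis)

A translated version could keep the same structure, which would make it easy to evaluate the same inference problems across languages.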

Project description

  1. Learn about the test suite in English, reading the original literature and some recent literature about its current use in computational semantics. Write a description of the work.
  2. Focus on one of the sections of the test suite, learn about the semantic problems which it illustrates, and write a description of the semantic issues involved.
  3. Translate at least the part of the test suite you focused on in (2) into your native language and make it machine-readable.
  4. Discuss the semantic issues you raised in (2) with respect to your own language and your translations. In particular focus on difficulties in translation or differences between the original English and your translation.
  5. (optional) Implement a parser for (some of) your translations and write documentation of it.
  6. (optional) Extend your parser so that it provides semantic representations which will support the inferences. Document this.
  7. (optional) Run an experiment (perhaps involving a web form) where subjects (native speakers of your language) can express their judgements about the inferences in your translation. Document the results you obtain.

Supervisors

Robin Cooper, Department of Philosophy, Linguistics and Theory of Science. The project will be carried out in connection with the Dialogue Technology Lab, associated with the Centre for Language Technology.

Maharani: An Open-Source Python Toolkit for ISU Dialogue Management

Based on the previous TrindiKit implementation of the ISU approach to dialogue management (which used a proprietary Prolog), we are now developing Maharani, an open-source Python-based ISU dialogue manager, together with Talkamatic AB. The first release is expected in the spring of 2012.

Funding: DTL internal

Researchers: Staffan Larsson, Sebastian Berlin

Reliable Dialogue Annotation for the DICO Corpus

Our purpose is to annotate seven pragmatic categories in the DICO (Villing and Larsson, 2006) corpus of spoken language in an in-vehicle environment, in order to find out more about the distribution of these categories and how they correlate. Some of the annotations have already been made by one annotator.

To strengthen the results from this work, we are interested in establishing the degree of inter-coder reliability of the annotations. Also, as far as we know, no attempts have been made to annotate enthymemes (Breitholtz and Villing, 2008), a type of defeasible argument, in spoken dialogue. A corpus of spoken discourse annotated for enthymemes would therefore be a welcome addition to the resources that are currently available.
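A common way to quantify inter-coder reliability is Cohen's kappa between two annotators over the same items. A minimal sketch, with invented labels for a handful of dialogue segments:

# Minimal sketch: Cohen's kappa between two annotators over the same
# dialogue segments. The labels below are invented for illustration.
from sklearn.metrics import cohen_kappa_score

annotator_1 = ["enthymeme", "other", "enthymeme", "other", "other", "enthymeme"]
annotator_2 = ["enthymeme", "other", "other",     "other", "other", "enthymeme"]

print("Cohen's kappa:", cohen_kappa_score(annotator_1, annotator_2))

For seven categories and more than two annotators, a chance-corrected coefficient such as Fleiss' kappa or Krippendorff's alpha would be the corresponding choice.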

Researchers: Jessica Villing, Ellen Breitholtz, Staffan Larsson (supervisor)

Funding: CLT internal
