Hybrid dialogue systems platform (2016)

Goal

To connect the Talkamatic Dialogue Manager (TDM) to a commercially available dialogue systems infrastructure, such as Microsoft's Cortana, Nuance Mix, Wit.ai, or IBM Watson, and evaluate the resulting hybrid platform in terms of usability for app designers and end users.

Background

Voice interfaces allow users to interact with a device without using their eyes or hands. In recent years, several commercial platforms for dialogue systems have been released by major players in the field. In most of these platforms, the focus has been on high-quality speech recognition (ASR) and natural language understanding (NLU), while the dialogue management (DM) and natural language generation (NLG) components are less developed. DM and NLG are exactly the strengths of TDM, and so the integration of TDM with other commercial platforms is an attractive avenue to explore.

Problem description

The overall goal is to integrate TDM with an available infrastructure in a way that (1) allows developers to build new applications as easily as possible, without the need for "doing the same thing twice" despite working on a hybrid platform, and (2) allows users of the commercial dialogue systems platform in question to get full access to the advanced dialogue capabilities of TDM, and if possible also to the multilingual and multimodal features of TDM.

  • Read up on TDM functionality and APIs
  • Read up on the (non-TDM) commercial dialogue systems platform selected
  • Draw up a plan for integration, with regard to both development and deployment requirements
  • Implement the integration (a minimal bridging sketch is given after this list)
  • Evaluate the integrated platform with respect to development and deployment
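
As an illustration of the integration architecture, here is a minimal sketch (in Python, assuming Flask and requests) of a webhook bridge that receives NLU results from the commercial platform and forwards them to TDM. The payload fields, the dialogue-move format and the /interpret endpoint are hypothetical placeholders; the actual TDM and platform APIs would be pinned down during the project.

    # Hypothetical bridge between a commercial NLU service and TDM.
    # The payload fields ("intent", "entities") and the /interpret endpoint
    # are placeholders, not documented APIs.
    import requests
    from flask import Flask, request, jsonify

    app = Flask(__name__)
    TDM_URL = "http://localhost:9090/interpret"  # assumed TDM endpoint

    @app.route("/nlu_webhook", methods=["POST"])
    def nlu_webhook():
        nlu_result = request.get_json()
        # Map the platform's intent/entity format onto a TDM-style dialogue move.
        move = {
            "type": "request" if nlu_result.get("intent") else "answer",
            "intent": nlu_result.get("intent"),
            "parameters": nlu_result.get("entities", {}),
        }
        tdm_response = requests.post(TDM_URL, json=move).json()
        # Hand TDM's system utterance back to the commercial platform.
        return jsonify({"text": tdm_response.get("utterance", "")})

    if __name__ == "__main__":
        app.run(port=5000)

With a bridge of this kind, the commercial platform's own webhook or fulfilment mechanism would call the service, so the developer specifies the dialogue domain only once, on the TDM side.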

Recommended background knowledge

  • Python
  • XML

Supervisors

Staffan Larsson, FLoV, together with Talkamatic AB. Talkamatic is a university research spin-off company based in Göteborg.

Prosody and emotion: Towards the development of an emotional agent (2016)

Goal

To explore prosody as a communicative channel that conveys linguistic, social, and emotional meanings, and to provide a classification model of the emotional properties of speech using information from the speech signal, e.g. duration, fundamental frequency, formants, and voice quality.

Background

Emotional communicative agents rely on prosodic information for the identification of emotional states. Previous research on such emotional robots has demonstrated robust techniques for identifying affective intent in robot-directed speech. For example, by analyzing the prosody of a person's speech, robots such as Kismet and Leonardo can determine whether they are being scolded, praised, or given an attentional bid.

Most importantly, the robot can discern these affective intents from neutral, indifferent speech. Nevertheless, much more work is needed to explore the potential of prosodic information in speech interaction within a computational framework. Such models could eventually be included in robots and discourse agents, such as personal assistants.

Problem description

The aims of this work include the following:

  • to study the literature on prosody and emotion.
  • to identify the prosodic categories in speech corpora.
  • to train a classifier on corpora developed for this purpose and assess its performance on existing prosodic corpora (a minimal sketch of such a classifier is given after this list).
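
As a rough illustration of the classification step, the sketch below extracts a handful of acoustic features (duration, fundamental-frequency statistics, and MFCCs as a coarse spectral/voice-quality proxy) with librosa and fits a simple SVM; the corpus paths and labels are hypothetical placeholders, and choosing the actual feature set and classifier is part of the thesis work.

    # Minimal feature extraction + classification sketch; paths and labels are
    # placeholders for an annotated emotion corpus.
    import numpy as np
    import librosa
    from sklearn.svm import SVC

    def prosodic_features(path):
        y, sr = librosa.load(path, sr=None)
        duration = len(y) / sr
        f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)        # fundamental frequency track
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)    # coarse spectral proxy
        return np.concatenate([[duration, np.nanmean(f0), np.nanstd(f0)],
                               mfcc.mean(axis=1)])

    wav_files = ["corpus/angry_001.wav", "corpus/neutral_001.wav"]  # hypothetical
    labels = ["angry", "neutral"]

    X = np.vstack([prosodic_features(f) for f in wav_files])
    clf = SVC().fit(X, labels)
    print(clf.predict(X[:1]))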

Recommended skills

  • Classification/Machine learning
  • Python, R

Supervisor: Charalambos (Haris) Themistocleous

Towards a state of the art platform for Natural Language Inference (2016)

Goal

To propose a methodology for constructing a wide-coverage, state of the art NLI platform. To construct a small NLI platform building on this methodology that could be extended in the future.

Background

Natural Language Inference (NLI), roughly put, is the task of determining whether an NL hypothesis can be inferred from an NL premise. Inferential ability, according to Cooper et al. (1996), is the best way to test the semantic adequacy of NLP systems. In this context, and given the importance of NLI to computational semantics, a number of NLI platforms have been proposed over the years, the most important ones being the FraCaS test suite, the Recognizing Textual Entailment (RTE) platforms and the Stanford NLI corpus (SNLI). Despite their merits, all three platforms concentrate on specific aspects of inference, while NLI is a much more complex phenomenon. The project will concentrate on the requirements of a wider-coverage NLI platform, both theoretically and in terms of implementation.

Project description

  • Learn about NLI and the three main platforms for it (FraCaS, RTE and SNLI)
  • Describe the merits as well as drawbacks of each platform from both a theoretical and practical perspective. Discuss any aspects of NLI that are not covered in these platforms
  • Propose a methodology for constructing a state of the art NLI platform that will remedy the problems associated with earlier platforms. Justify the choices made.
  • Construct a small, machine-readable NLI platform based on the proposed methodology (one possible item format is sketched after this list). Discuss any potential challenges that platforms constructed using this methodology will pose for NLI systems.
  • (optional) Implement an NLI system and evaluate it against part of your constructed test suite. Provide documentation for it.
  • (optional) Evaluate current state of the art NLI systems against part of, or the whole of, the constructed NLI platform. Discuss the results, ideas for improvement, and the prospect of hybrid systems (combining a machine learning/deep learning component with a symbolic (logical) component)
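
To make the machine-readable requirement concrete, here is a minimal sketch of what a test-suite entry and a reader for it could look like. The XML element and attribute names are assumptions, not an existing standard; designing the actual schema is part of the project.

    # Sketch of a hypothetical XML format for NLI test items and a reader for it.
    import xml.etree.ElementTree as ET

    SAMPLE = """
    <testsuite>
      <item id="1" phenomenon="quantifier-monotonicity" label="entailment">
        <premise>Every student passed the exam.</premise>
        <hypothesis>Every linguistics student passed the exam.</hypothesis>
      </item>
    </testsuite>
    """

    def read_items(xml_string):
        root = ET.fromstring(xml_string)
        for item in root.findall("item"):
            yield {
                "id": item.get("id"),
                "phenomenon": item.get("phenomenon"),
                "label": item.get("label"),
                "premise": item.findtext("premise"),
                "hypothesis": item.findtext("hypothesis"),
            }

    for entry in read_items(SAMPLE):
        print(entry["label"], ":", entry["premise"], "=>", entry["hypothesis"])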

Recommended skills

  • Knowledge of semantics and pragmatics
  • XML
  • Programming skills, preferably Python, in case of implementation
  • Knowledge of current techniques used in Machine Learning, Deep Learning and Logical approaches in case of evaluation

Supervisor: Stergios Chatzikyriakidis

(Towards) a TTR parser: from strings of utterance events to types (2016)

Type Theory with Records (TTR) is a formal semantic framework that allows representing meaning closely related to action and perception. As such, we have argued [1], it is ideally suited as a unified knowledge representation system for situated dialogue systems. In understanding language, two kinds of events are involved: events in the world and speech events of utterances. Recognising the former as types allows us to model the sense and reference of words; recognising the latter as types allows us to model the syntactic structure of linguistic utterances [2].

The primary goal of the project is to explore parsing open text (which may be fragmented and incomplete, i.e. dialogue) into record-type representations, expressed as feature structures. The task might be accomplished in several ways: (i) exploring shallow information extraction techniques to identify entities and events in the text; (ii) adapting existing semantic parsers (e.g. the C&C tools for CCG or the MaltParser for dependency parsing) and rewriting their output into the desired type representations; (iii) implementing new, independent semantic parsing techniques that return types directly. As types will represent discourse rather than isolated sentences, one could also explore different discourse referent/pronoun resolution methods and named entity identification.
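
As a concrete, deliberately simplified illustration of the target output, the sketch below assembles a record-type-like feature structure from the entities and events that shallow extraction might return for "a boy hugs a dog"; the field-naming convention and the extraction step itself are assumptions to be worked out in the project.

    # Sketch: build a record type (as a nested dict / feature structure) from
    # extracted entities and events; names and layout are illustrative only.
    def record_type(entities, events):
        rec = {}
        for var in entities:
            rec[var] = "Ind"                                   # e.g. x : Ind
        for i, (pred, args) in enumerate(events, 1):
            rec[f"c{i}"] = {"pred": pred, "args": list(args)}  # e.g. c1 : boy(x)
        return rec

    # What shallow extraction might yield for "a boy hugs a dog".
    rtype = record_type(
        entities=["x", "y"],
        events=[("boy", ["x"]), ("dog", ["y"]), ("hug", ["x", "y"])],
    )
    print(rtype)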

The next step would be "(Towards) a TTR parser: from types to perception": once we have type representations of linguistic events, how can they be linked to what we perceive? See the proposal "Situated Learning Agents": http://clt.gu.se/masterthesisproposal/situated-learning-agents

[1] http://gup.ub.gu.se/publication/190853

[2] http://gup.ub.gu.se/publication/205229

Recommended skills:

Good Python programming skills, both for processing text and for working with logical formalisms.

Supervisors:

Simon Dobnik, FLoV, and other members of the Dialogue Technology Lab or the Centre for Linguistic Theory and Studies in Probability (CLASP).

Acquisition, correlation, visualization and use of lexico-syntactic and semantic features from Swedish transcribed interactions (2016)

The purpose of this project is to: (1) conduct a literature review in the area of feature extraction from dialogue data and spoken-language transcriptions; (2) implement a (large) set of lexico-syntactic and semantic features from the papers reviewed in the previous step; and (3) build or use an existing classifier using the extracted features from the previous step. The resulting application should be able to differentiate transcribed spoken dialogues from dialogues in other corpora (a minimal sketch is given below).
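
A minimal sketch of steps 2-3, assuming a toy feature set (first/second-person pronoun rate, question-mark rate, utterance length) and toy data; the real feature inventory would come out of the literature review in step 1.

    # Toy lexico-syntactic features and a classifier separating transcribed
    # dialogue from other text; features and data are illustrative assumptions.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    PRONOUNS = {"jag", "du", "vi", "ni", "mig", "dig"}  # Swedish 1st/2nd person

    def features(text):
        tokens = text.lower().split()
        n = max(len(tokens), 1)
        return [sum(t in PRONOUNS for t in tokens) / n,  # pronoun rate
                text.count("?") / n,                      # question-mark rate
                n]                                        # utterance length

    texts = ["vad tycker du om det ?", "regeringen presenterade i dag en ny budget"]
    labels = [1, 0]  # 1 = transcribed dialogue, 0 = other corpus text

    clf = LogisticRegression().fit(np.array([features(t) for t in texts]), labels)
    print(clf.predict([features("ska vi ses i morgon ?")]))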

Parsing the PROIEL syntactic formalism for historical treebanks

In the context of the project Pragmatic Resources for Old Indo-European Languages (University of Oslo, 2008-2012), a formalism and guidelines have been developed for the annotation of parts of speech, morphology and dependency syntax in several historical Indo-European languages, such as Ancient Greek, Latin, Old Church Slavonic and Gothic. This formalism and the guidelines derived from it have by now been used for 18 different languages / language stages, including Old Swedish as part of the project Methods for the automatic Analysis of Text in digital Historic Resources, which currently runs at the Department of Swedish at GU.

Syntactic annotation comes in the form of dependency trees, extended with so-called secondary edges to capture argument sharing and with empty tokens to capture ellipsis of verbs and coordinators. As far as we are aware, no one has attempted to statistically parse the PROIEL format. In this project, you will investigate how to parse this format and construct a working setup so that anyone with a PROIEL corpus can come and train a parser on their annotated data.

There is a fair amount of training material available to train parsers on, a selection of which can be used in the MA project. In total, the number of annotated tokens with some kind of PROIEL annotation is well over a million, with the largest languages having over 100k tokens of annotated data.

An important part of the project is to investigate how to tackle the fact that the PROIEL syntactic format is not a classic dependency tree -- are there existing statistical parsers that can handle a format like PROIEL's, can you adjust/extend an existing parser, or could you handle the PROIEL format with a standard parser and some pre- and post-processing? You will also evaluate your final solution's performance on some of the existing annotated material. The historical material also has other challenging features for parsing, such as non-standard orthography, a lack of uniform sentence marking and morphologically rich language, but these issues are not intended to be the focus of the project.
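
One possible pre-processing strategy, under the "standard parser plus pre- and post-processing" option above, is sketched below: empty tokens are removed and their dependents reattached to the empty token's head, and secondary edges are dropped, leaving a plain dependency tree that an off-the-shelf parser can be trained on. The token representation is an assumption about how the treebank has been read in, not PROIEL's own file format, and a complete solution would also need post-processing to reintroduce the dropped structure.

    # Sketch: flatten PROIEL-style trees into plain dependency trees.
    # Each token is assumed to be a dict with id, form, head, relation,
    # empty (bool) and slashes (secondary edges).
    def simplify(tokens):
        empties = {t["id"]: t["head"] for t in tokens if t["empty"]}

        def resolve(head):
            # Follow chains of empty heads until a real token (or the root, 0) is reached.
            while head in empties:
                head = empties[head]
            return head

        out = []
        for t in tokens:
            if t["empty"]:
                continue                                   # drop the empty token itself
            out.append({"id": t["id"], "form": t["form"],
                        "head": resolve(t["head"]), "relation": t["relation"]})
            # t["slashes"] (argument-sharing edges) are intentionally discarded here.
        return out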

This MA project combines theory (forming an understanding of the challenges of the syntactic format for statistical parsing), literature study (surveying the field for existing/adaptable solutions), implementation, and empirical research (evaluation of the final system(s)).

A working solution of decent quality is a publishable result and will be of practical interest to creators of new PROIEL treebanks, as a parser could be used to support future manual annotation efforts.

Programming skills and an NLP background are a prerequisite, as is some knowledge of statistical methods. Affinity with the linguistic side of processing is a plus as this will for instance allow you to do more insightful error analysis.

The project would be supervised by Gerlof Bouma, Yvonne Adesam or possibly others at Språkbanken.

Ticnet dialogue agent for social media platforms

Goal

The goal of the project is to evaluate and develop further an existing TDM interface to the Ticnet ticket booking service. The app communicates with users through text interaction in social media services, and optionally also using spoken interaction in a smartphone app.

Problem description

Talkamatic have developed a rough prototype for a Ticnet application which allows written (in a terminal window) or spoken (on a smartphone) interaction. Ticnet want a more extensive prototype which communicates with users through text interaction in social media services. The prototype should be tried out by a test group and evaluated using a variety of methods, including user surveys.

The role of Talkamatic will be (1) technical support concerning TDM application development and (2) to formulate requirements and give feedback on ideas and prototypes.

The role of Ticnet will be (1) technical support concerning their APIs and services, and (2) to formulate requirements and give feedback on ideas and prototypes.

Recommended skills

Python programming. Familiarity with other languages (C++, Java, PHP) is a plus. Familiarity with the concepts of APIs as well as guidelines, tools and processes for software development is also a plus.

Supervisors

  • Staffan Larsson (FLoV, GU)
  • External supervisor from Talkamatic AB.
  • Requirements, feedback, comments from Ticnet.

Ticnet is the leading marketplace in Sweden for events in sports, culture, music and entertainment. Since 2004, Ticnet has been a wholly owned subsidiary of the American company Ticketmaster. Ticnet sells about 12 million tickets per year spread over 25,000 events, and Ticnet.se has 1,100,000 unique visitors each month.

Generation of Multilingual Wikipedia articles from raw data

Description

In this project we will collect raw data about countries, cities, languages, etc., and use this data to generate Wikipedia-style articles describing the different entities. The natural language generation will be done using the GF framework, and articles must be generated in at least two different languages to demonstrate that the technology is multilingual. The current proposal is to cover the geographical domain, but applications in other domains are possible as well; suggestions for different domains can be discussed with the supervisor.
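
As a rough illustration of the generation step, the sketch below linearizes one abstract syntax tree in every concrete language of a compiled GF grammar using the pgf Python bindings; the grammar Countries.pgf and the abstract functions countryArticle and mkCountry are hypothetical placeholders that the project would have to design and implement.

    # Sketch of multilingual generation with the GF Python runtime (pgf).
    # Countries.pgf and the abstract functions used below are hypothetical.
    import pgf

    gr = pgf.readPGF("Countries.pgf")

    def linearize_all(expr_str):
        expr = pgf.readExpr(expr_str)
        # Linearize the same abstract syntax tree in every concrete language.
        return {lang: concr.linearize(expr) for lang, concr in gr.languages.items()}

    # A raw data record turned into an abstract syntax expression (names assumed).
    fact = 'countryArticle (mkCountry "Sweden" "Stockholm" 10500000)'
    for lang, text in linearize_all(fact).items():
        print(lang, ":", text)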

Supervisor

Krasimir Angelov

GF Resource Grammar for a yet unimplemented language

Description

See http://www.grammaticalframework.org/lib/doc/status.html for the current status. This is an ambitious project and hard work, requiring good knowledge of GF and the target language. But the result will be a permanent contribution to language resources, and almost certainly publishable. For instance, the Japanese grammar implemented within this Master's programme was published at JapTAL.

Supervisor

Contact and possible supervisor: Krasimir Angelov.

Infrastructure for safe in-vehicle speech interaction

Goal

The goal of the project is to further develop an existing API for the Talkamatic dialogue system, as well as guidelines, tools and processes for app development. The development would build on ongoing work in a couple of EU FP7 projects underway at Talkamatic, with input from Volvo Trucks.

Problem description

How do we enable developers of in-vehicle apps to improve the safety of their apps using voice recognition? Talkamatic AB have developed a dialogue system for in-vehicle use, and an API for it is currently being developed as part of ongoing EU projects.

Volvo Trucks are interested in the development of APIs, guidelines, tools and processes that enable developers to add safe speech interaction to their apps. The role of Volvo will be to formulate requirements and give feedback on ideas and prototypes.

Recommended skills

Python programming. Familiarity with other languages (C++, Java) is a plus. Familiarity with the concepts of APIs as well as guidelines, tools and processes for software development is also a plus.

Supervisors

  • Staffan Larsson (FLoV, GU)
  • External supervisor from Talkamatic AB.
  • Requirements, feedback, comments: contact at Volvo Group Trucks Technology.

Volvo Group Trucks Technology (VGTT) is part of the Volvo Group. The Volvo Group is one of the world's leading manufacturers of trucks, buses, construction equipment and marine and industrial engines under the leading brands Volvo, Renault Trucks, Mack, UD Trucks, Eicher, SDLG, Terex Trucks, Prevost, Nova Bus, UD Bus, Sunwin Bus and Volvo Penta. Volvo Group Trucks Technology provides Volvo Group Trucks and Business Areas with state-of-the-art research, cutting-edge engineering, product planning and purchasing services, as well as aftermarket product support.
