The aim of the project is to use machine learning to learn a mapping between natural language descriptions on the one hand and sensory observations and commands issued to a simple mobile robot (Lego NXT) on the other. The project involves building a corpus of descriptions paired with actions: one person guides the robot while another describes what is happening. Multimodal ML models would then be built from this corpus to predict a description, an action, or a perceptual observation. Finally, the models should be integrated with a simple dialogue manager with which humans can interact and test the success of learning in context.
The system should be implemented in ROS (Robot Operating System), which provides access to the robot's sensors and actuators and allows new models to be written in Python in a well-organised manner.
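To make the idea of a corpus of descriptions paired with actions more concrete, the following is a minimal Python sketch of what one corpus entry and a very simple baseline learner might look like. Everything in it is an illustrative assumption: the names Example and WordCommandModel, the command labels ("forward", "turn_left", "stop") and the sensor key "ultrasonic_cm" are hypothetical, and the word-counting model is only a stand-in for the multimodal ML models the project would actually develop. It does not use ROS; in the real system the commands and sensor readings would come from the robot.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Example:
    """One corpus entry: a description paired with a command and sensor readings."""
    description: str            # what the describing person said
    command: str                # command issued by the person guiding the robot (hypothetical labels)
    sensors: Dict[str, float]   # sensory observations at that moment (hypothetical keys)


class WordCommandModel:
    """Toy baseline: counts how often each word co-occurs with each command."""

    def __init__(self) -> None:
        self.counts: Dict[str, Counter] = defaultdict(Counter)

    def train(self, corpus: List[Example]) -> None:
        # Every word in a description casts one count for the paired command.
        for ex in corpus:
            for word in ex.description.lower().split():
                self.counts[word][ex.command] += 1

    def predict_command(self, description: str) -> str:
        # Sum the counts of all known words and return the most frequent command.
        votes: Counter = Counter()
        for word in description.lower().split():
            votes.update(self.counts.get(word, {}))
        if not votes:
            return "unknown"
        return votes.most_common(1)[0][0]


# A tiny made-up corpus, for illustration only (not real collected data).
corpus = [
    Example("the robot moves forward slowly", "forward", {"ultrasonic_cm": 80.0}),
    Example("it is turning left now", "turn_left", {"ultrasonic_cm": 75.0}),
    Example("the robot goes forward again", "forward", {"ultrasonic_cm": 60.0}),
    Example("it stops in front of the wall", "stop", {"ultrasonic_cm": 10.0}),
]

model = WordCommandModel()
model.train(corpus)
print(model.predict_command("moves forward"))
print(model.predict_command("turning left"))
```

The same paired structure could in principle be used in the other directions mentioned above, for example predicting a description from a command or from sensor readings, which is where more capable models would be needed.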
Contributions/possible research directions of this thesis:
Supervisors: Simon Dobnik and possibly others from the Dialogue Technology Lab