Deep Learning for Human-Centered Vision & Multimodal Conversational Systems
Event information
Time
-
Venue location
TS126 Linnanmaa campus
External teacher(s): Associate Professor Dr. Dinesh Babu Jayagopi
External teacher(s) organization: IIIT Bangalore
ECTS-credits: 2 ECTS
Grade: Pass or Fail
Assessment: Research report / Essay after the lectures
May 23, 10.15-12.00, Tuesday Morning: Lecture 1 Introduction to Multimodal Conversational Systems
May 23, 13.15-15.00, Tuesday Afternoon: Lecture 2 Deep Learning for scene-centered vision
May 24, 10.15-12.00, Wednesday Morning: Lecture 3 Natural Language Understanding (NLU)
May 24, 13.15-15.00, Wednesday Afternoon: Lecture 4 Dialog generation
May 25, 10.15-12.00, Thursday Morning: Lecture 5 Gesture generation – Machine Learning-based
May 25, 13.15-15.00, Thursday Afternoon: Lecture 6 Gesture generation – Deep Learning-based
Learning objectives and contents:
The course gives an introduction to building multimodal conversational systems, which involve both multimodal analysis (of the users) and multimodal synthesis (on virtual agents). First, we recap Deep Learning (DL) for scene-centered vision: object recognition and object detection. For human-centered vision, we introduce the DL problems of joint location estimation, 3D facial/body mesh estimation, and emotion recognition. We then discuss models for natural language understanding and dialog generation, both task-based and generative. Finally, we discuss models based on both Machine Learning (ML) and DL for generating speaking and listening behavior.
Tentative topics:
Deep learning, multimodal conversational systems, NLP, Computer Vision
Amount of contact teaching hours: ~12 hours of lectures
Contact person(s): Miguel Bordallo López & Praneeth Susarla