Galatea Project

Providing an open-source toolkit for anthropomorphic spoken dialogue agents, with which one can develop a life-like animated agent that talks with the user and whose face, voice, and dialog grammar can be easily customized. Refer to our chapter in the book "Life-Like Characters" from Springer-Verlag (right photo; published on 13 Nov 2003).


Keywords / Synonyms

Spoken-Dialog Anthropomorphic Agent, Animated Agent, Life-like Agent, Believable Communication Agent, Virtual Human, Digital Human, Virtual Humanoid, Humanoid Interface, Talking Head, Avatar, Multimodal Human Interface, ...

Free, Open-Source, Linux/Windows, Single/Multi-CPU, ...

Outline
  • "Galatea" is a project that provides an open-source, license-free software toolkit for building anthropomorphic spoken dialogue agents. In other words, with this toolkit you can build your own life-like visual agent that communicates with you in spoken language. (Note: anthropomorphic agents go by many names, including digital human, virtual human, avatar, animated agent, life-like agent, human-like agent, talking head, and animated human image, each with a slightly different emphasis.)

  • From 2000 to 2003, the project was run by 17 prominent researchers (we call them "Pygmalions") from 14 universities and institutes in Japan. Their backgrounds were in speech recognition, speech synthesis, natural language processing, and human image synthesis. Including students working with them in this field, the total number of researchers involved was probably 30 to 40. The project was funded by IPA (Information-technology Promotion Agency) for three fiscal years: 2000, 2001, and 2002. Since 2003, the newly established Interactive Speech Technology Consortium (ISTC) has been actively developing and improving the toolkit with an extended membership.

  • This open-source toolkit is provided to anyone free of charge. Modification and commercial use are also permitted. See the download page for terms and conditions.

  • The project name comes from the Greek myth of Pygmalion and Galatea. Pygmalion, the king of Cyprus, carved a statue of a woman and fell in love with it. Aphrodite brought the statue to life so that they could marry. Refer to related web pages 1, 2, 3, 4. In our project, 17 modern "Pygmalions" are working hard so that the same miracle will happen again.

  • Video demos.

  • Galatea Wiki.

Features
  • Adaptive/Easy-to-Customize in All Aspects:
    To enable an unlimited variety of agents, customizability is the first priority. With a photo of a person (e.g., you) and some speech data from that person for training, anyone can create his or her own agent.
    • Face Animation: State-of-the-art texture-mapped wireframe model for photo-real 3-D images produced from a single photo. GUI-based photo fitting tool bundled. (See SIGGRAPH papers by Morishima)
    • Speech Synthesis: HMM-based, speaker-adaptive speech synthesis. First open-source free-of-charge Japanese text-to-speech system.
    • Speech Recognition: Syntax-based Japanese continuous speech recognition system "Julian". Speaker-adaptive.

  • Open Platform:
    We believe dialogue agents will be the most useful application of speech technology in the near future. This toolkit is an open platform, a first step toward a future in which anyone can contribute to it or build a business on it.
    • Open-source, license-free software toolkit for Linux.
    • First free Japanese text-to-speech system included. Based on the JEITA standard for text description (XML-like).
    • VoiceXML-based dialogue control.

  • Modular, Flexible Structure:
    Modularity of the system's constituents is very important so that other modalities, such as visual input, haptic input, and mechanical output, can be added in the future.
    • Agent Manager: Distributed modules, shell-like message passing, lip synchronization.
    • Prototyping Tool: GUI, XIML-based.
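
As a rough illustration of the "distributed modules, shell-like message passing" idea listed above, the following Python sketch routes one-line text commands between modules through a central hub. Everything in it is an assumption made for illustration only: the AgentManager class as written here, the module names synth and face, the "to <module> <command>" line format, and the 80 ms-per-character duration are hypothetical and are not taken from the Galatea toolkit's actual protocol.

```python
from typing import Callable, Dict, List

# A module handler takes one command string and returns follow-up message lines.
Handler = Callable[[str], List[str]]


class AgentManager:
    """Toy hub that routes one-line text commands between named modules.

    Hypothetical illustration only; not the Galatea Agent Manager's real API.
    """

    def __init__(self) -> None:
        self._modules: Dict[str, Handler] = {}
        self.log: List[str] = []  # every routed line, in delivery order

    def register(self, name: str, handler: Handler) -> None:
        self._modules[name] = handler

    def send(self, line: str) -> None:
        # Assumed line format: "to <module> <command...>"
        parts = line.split(maxsplit=2)
        if len(parts) != 3 or parts[0] != "to":
            raise ValueError(f"malformed message: {line!r}")
        _, target, command = parts
        if target not in self._modules:
            raise KeyError(f"unknown module: {target}")
        self.log.append(line)
        # Any lines the module returns are routed as new messages.
        for reply in self._modules[target](command):
            self.send(reply)


def make_synthesizer() -> Handler:
    """Fake speech synthesizer: on 'speak <text>' it notifies the face module."""
    def handle(command: str) -> List[str]:
        verb, _, text = command.partition(" ")
        if verb == "speak":
            duration_ms = 80 * len(text)  # made-up duration model
            return [f"to face lipsync {duration_ms}"]
        return []
    return handle


def make_face(received: List[str]) -> Handler:
    """Fake face module: records the commands it receives."""
    def handle(command: str) -> List[str]:
        received.append(command)
        return []
    return handle


if __name__ == "__main__":
    manager = AgentManager()
    face_commands: List[str] = []
    manager.register("synth", make_synthesizer())
    manager.register("face", make_face(face_commands))
    manager.send("to synth speak hello")
    print(manager.log)
    print(face_commands)
```

Because each module only sees plain text lines, a new modality (for example a camera input module) could be added in this sketch by registering one more handler, without changing the existing ones.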

Project Members
  • Members in 2000-2003 (funded by IPA):
    Planning and Supervision:
      Shigeki Sagayama, Professor, University of Tokyo
      Satoshi Nakamura, Department Head, ATR Spoken Language Translation Labs.
    System Integration:
      Tsuneo Nitta, Professor, Toyohashi University of Technology
      Hiroshi Shimodaira, Assoc. Prof., JAIST (Japan Advanced Institute of Science and Technology)
      Takuya Nishimoto, Research Associate, University of Tokyo
    Speech Recognition:
      Katsunobu Itou, Assoc. Prof., Nagoya University
      Atsuhiko Kai, Assoc. Prof., Shizuoka University
      Akinobu Lee, Research Associate, NAIST (Nara Institute of Science and Technology)
    Speech Synthesis:
      Speech synthesis:
        Yoichi Yamashita, Professor, Ritsumeikan University
        Takao Kobayashi, Professor, Tokyo Institute of Technology
        Keiichi Tokuda, Assoc. Prof., Nagoya Institute of Technology
        Keikichi Hirose, Professor, University of Tokyo
        Nobuaki Minematsu, Assoc. Prof., University of Tokyo
      Text analysis:
        Yasuharu Den, Assoc. Prof., Chiba University
        Takehito Utsuro, Assoc. Prof., Kyoto University
        Atsushi Yamada, Department Head, ASTEM
    Face Image Synthesis:
      Shigeo Morishima, Professor, Seikei University

  • Extended members after 2003 in Interactive Speech Technology Consortium:
    See the ISTC page in Japanese.

  • Photos from the development camp in August 2001: discussion, debugging, and the first successful connection experiment.

  • Researchers (called "Pygmalions") involved in this project: photos from a meeting in March 2004 (discussion and a group photo).
Video Demos
Documents and Publications
  • English documentation is in progress; in the meantime, some information can be found in our English publications, such as the following.

  • Shin-ichi Kawamoto, Hiroshi Shimodaira, Tsuneo Nitta, Takuya Nishimoto, Satoshi Nakamura, Katsunobu Itou, Shigeo Morishima, Tatsuo Yotsukura, Atsuhiko Kai, Akinobu Lee, Yoichi Yamashita, Takao Kobayashi, Keiichi Tokuda, Keikichi Hirose, Nobuaki Minematsu, Atsushi Yamada, Yasuharu Den, Takehito Utsuro, Shigeki Sagayama, "Open-source software for developing anthropomorphic spoken dialog agent," Proc. of PRICAI-02, International Workshop on Lifelike Animated Agents, pp. 64-69, Aug 2002. (PDF download)

  • "Life-Like Characters - Tools, Affective Functions, and Applications" from Springer-Verlag (published on 13 Nov 2003; top right photo).

  • Some other publications (including papers in Japanese) are listed at http://iipl.jaist.ac.jp/galatea/publications-local.html.

Last update 31 July 2004.