META INTRODUCES SEAMLESSM4T
Meta has introduced SeamlessM4T, the first all-in-one multimodal and multilingual AI translation model, enabling people to communicate across language barriers. SeamlessM4T is currently available to researchers and developers under a research license. Meta has also published the metadata of SeamlessAlign, the largest open multimodal translation dataset to date, totaling 270,000 hours of mined speech and text alignments.
Creating a universal language translator is no easy task, and current speech-to-text and text-to-speech systems cover only a small fraction of the world's languages. Building on the work of researchers around the world who have long pursued a universal translator, SeamlessM4T takes a unified-model approach, improving translation quality and reducing errors and latency compared with pipelines of separate models. Drawing on those earlier efforts, it delivers multilingual and multimodal translation through a single model built from disparate speech data sources, achieving state-of-the-art results. SeamlessM4T supports:
- Automatic speech recognition for nearly 100 languages;
- Speech-to-text translation for nearly 100 input and output languages;
- Speech-to-speech translation, supporting nearly 100 input languages and 36 output languages;
- Text-to-text translation for nearly 100 languages;
- Text-to-speech translation, supporting nearly 100 input languages and 35 output languages.
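As an illustration of how such tasks can be invoked, here is a minimal sketch using the Hugging Face `transformers` port of SeamlessM4T. The checkpoint name `facebook/hf-seamless-m4t-medium` and the three-letter language codes (`eng`, `fra`) follow that port's conventions; this is a sketch of one possible integration, not the only way to run the model.

```python
def text_to_speech_translate(text: str, src_lang: str, tgt_lang: str):
    """Translate `text` from src_lang and synthesize speech in tgt_lang.

    Returns a 1-D numpy array of audio samples at the model's sample rate.
    """
    # Imported lazily so the sketch can be read without the heavy dependency.
    from transformers import AutoProcessor, SeamlessM4TModel

    checkpoint = "facebook/hf-seamless-m4t-medium"  # illustrative checkpoint
    processor = AutoProcessor.from_pretrained(checkpoint)
    model = SeamlessM4TModel.from_pretrained(checkpoint)

    # Tokenize the source text; src_lang selects the input language.
    inputs = processor(text=text, src_lang=src_lang, return_tensors="pt")

    # generate() returns the synthesized waveform when the target is speech.
    audio = model.generate(**inputs, tgt_lang=tgt_lang)[0]
    return audio.cpu().numpy().squeeze()


if __name__ == "__main__":
    # Text-to-speech translation: English text in, French speech out.
    waveform = text_to_speech_translate("Hello, world.", "eng", "fra")
    print(waveform.shape)
```

The same model object covers the other tasks (e.g. passing audio inputs to the processor for speech-to-text or speech-to-speech translation), which is the practical payoff of the unified-model design described above.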
SeamlessM4T marks a new milestone in the development of artificial intelligence technology that bridges language gaps and facilitates connection and communication between people speaking different languages.