Meta's 'Massively Multilingual' AI Model Translates Up To 100 Languages, Speech or Text

An anonymous reader quotes a report from Ars Technica: On Tuesday, Meta announced SeamlessM4T, a multimodal AI model for speech and text translations. As a neural network that can process both text and audio, it can perform text-to-speech, speech-to-text, speech-to-speech, and text-to-text translations for "up to 100 languages," according to Meta. Its goal is to help people who speak different languages communicate with each other more effectively. Continuing Meta's relatively open approach to AI, Meta is releasing SeamlessM4T under a research license (CC BY-NC 4.0) that allows developers to build on the work. They're also releasing SeamlessAlign, which Meta calls "the biggest open multimodal translation dataset to date, totaling 270,000 hours of mined speech and text alignments." That will likely kick-start the training of future translation AI models from other researchers.

Among the features of SeamlessM4T touted on Meta's promotional blog, the company says that the model can perform speech recognition (you give it audio of speech, and it converts it to text), speech-to-text translation (it translates spoken audio to a different language in text), speech-to-speech translation (you feed it speech audio, and it outputs translated speech audio), text-to-text translation (similar to how Google Translate functions), and text-to-speech translation (feed it text and it will translate and speak it out in another language). Each of the text translation functions supports nearly 100 languages, and the speech output functions support about 36 output languages.
This discussion has been archived. No new comments can be posted.

  • Accuracy? (Score:5, Funny)

    by TWX ( 665546 ) on Tuesday August 22, 2023 @05:14PM (#63788950)

    My hovercraft is full of eels.

    • by narcc ( 412956 )

      Believe it or not, I have high expectations here. Transformers are a good fit for translation tasks. It's a real shame that we don't have more quality training data, and that's only going to get more difficult now.

      I haven't read the whitepaper yet, so I hope they have a plan to minimize the amount of their machine translated text that ends up in future datasets.

      • by lsllll ( 830002 )
        I wouldn't say I have high expectations so much as high hopes. It would be nice to have my novel translated into Farsi, if only so that my mother could read it, provided the translation were accurate enough to convey the writing and the dialogue, yet also serene enough to hold the reader's attention. It may be a while still, though.
      • by AmiMoJo ( 196126 )

        I've been using Google Translate since it started and I've found that it has improved dramatically over that time, but still struggles in certain situations.

        It used to translate everything into the first person, which made headlines read a bit weirdly. They seem to have improved that, but it still struggles with a lack of context. As an example, it often gets the subject's gender confused, even when their name and/or gender is mentioned in the same sentence.

        It also has trouble with slang and informal speech, at l

  • Meta's 'Massively Multilingual' AI Model Translates Up To 100 Languages, Speech or Text

    It only translates things into an incomprehensible dead language that Professor Farnsworth calls "crazy gibberish" [youtube.com] ...

  • by sconeu ( 64226 ) on Tuesday August 22, 2023 @05:34PM (#63788992) Homepage Journal

    Unfortunately, they trained on historical data sets, so when they translate "Out of sight, out of mind", the result comes out as "invisible idiot".

  • by nuckfuts ( 690967 ) on Tuesday August 22, 2023 @05:49PM (#63789030)
    The speech-to-text feature could be useful for video players when subtitles aren't provided (or are poorly done).
  • Considering how bad their search software is, would you trust them to translate 'no' from English to Spanish?
  • What's a "minimally multilingual AI model"?

  • I asked for a translation into Navajo and got one into (bad? good?) Hopi, which is a totally unrelated language. I pointed out the mistake, and got one into something resembling Navajo, except it was half English and half really badly put-together Navajo forms.
