First multilingual machine translation model, M2M-100, introduced by Facebook

Facebook has introduced the first open-source multilingual machine translation (MMT) model that can translate between any pair of 100 languages without relying on English data. The model is trained on 2,200 language directions; for example, it trains directly on Chinese-to-French data to better preserve meaning.

Published : Oct 20, 2020, 11:07 AM IST

Updated : Feb 16, 2021, 7:31 PM IST

San Francisco: Using novel mining strategies to create translation data, Facebook built the first truly "many-to-many" data set with 7.5 billion sentences for 100 languages.
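
For scale: 100 languages give 100 × 99 = 9,900 ordered translation directions, while English-centric data covers only the 198 directions into and out of English; M2M-100's 2,200 mined directions sit in between. A quick check of that arithmetic (illustrative only, not Facebook's code):

```python
# Illustrative arithmetic: counting ordered translation directions.
languages = 100
all_directions = languages * (languages - 1)   # every ordered pair: 9,900
english_centric = 2 * (languages - 1)          # into/out of English only: 198
print(all_directions, english_centric)         # -> 9900 198
```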

  • Called "M2M-100," it is trained on a total of 2,200 language directions, 10 times more than the best previous English-centric multilingual models.
  • "Deploying M2M-100 will improve the quality of translations for billions of people, especially those that speak low-resource languages," Facebook AI said in a statement.
  • When translating, say, Chinese to French, most English-centric multilingual models train on Chinese to English and English to French, because English training data is the most widely available.
  • The new Facebook model instead trains directly on Chinese-to-French data to better preserve meaning (a usage sketch follows this list).
  • It outperforms English-centric systems by 10 points on the widely used BLEU metric for evaluating machine translations.
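
The released M2M-100 checkpoints were later ported to the Hugging Face transformers library; the following is a minimal sketch of direct Chinese-to-French translation, assuming that port and the publicly hosted "facebook/m2m100_418M" checkpoint (neither is named in the article):

```python
# Minimal sketch of direct zh -> fr translation with M2M-100.
# Assumes the Hugging Face `transformers` port of the released model;
# the checkpoint name below is an assumption, not from the article.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "zh"                       # source language: Chinese
encoded = tokenizer("机器翻译正在快速进步。", return_tensors="pt")

# The target language is chosen by forcing its language token as the first
# generated token; no intermediate English translation step is involved.
generated = model.generate(
    **encoded, forced_bos_token_id=tokenizer.get_lang_id("fr")
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

The same call translates between any supported pair by changing src_lang and the forced target-language token.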

"We're also releasing the model, training, and evaluation set up to help other researchers reproduce and further advance multilingual models," the social network announced.



"We used several scaling techniques to build a universal model with 15 billion parameters, which captures information from related languages and reflects a more diverse script of languages and morphology," the company said.



One challenge in multilingual translation is that a single model must capture information across many different languages and diverse scripts.

To address this, Facebook found a clear benefit in scaling the capacity of its model and adding language-specific parameters.

"The combination of dense scaling and language-specific sparse parameters (3.2 billion) enabled us to create an even better model, with 15 billion parameters".

For years, AI researchers have been working toward building a single universal model that can understand all languages across different tasks.

"A single model that supports all languages, dialects, and modalities will help us better serve more people, keep translations up to date, and create new experiences for billions of people equally," Facebook said.
