脸书首个100种语言互译模型 不依赖英语且开源

2020-10-22 小E英语

Facebook Develops Machine Translation System for 100 Languages

Facebook开发首个100种语言互译模型

Facebook has developed the first machine learning model that can translate between any two of 100 languages without going into English first.

Facebook开发了首个在100种语言中任意两种语言间可以进行互译,而无需先翻译成英语的机器学习模型。

Facebook says the new multilingual machine translation model was created to help its more than two billion users worldwide. The company is still testing the translation system, which it calls M2M-100, and hopes to add it to different products in the future.

Facebook表示,创建这一新的多语言机器翻译模型是为了帮助该公司在全球的20亿用户。该公司仍在测试这款名为M2M-100的翻译系统,并希望将来能将其添加到其它产品中。

The social media service says it has made the system open source -- meaning its computer code will be freely available for others to copy or change.

这家社交媒体服务公司称,已经将该系统开源——开源意味着其计算机代码可以免费供其他人复制或修改。

Angela Fan, a research assistant at Facebook, explained the new machine translation model this week on one of the company's websites. She said its development represented a "milestone" in progress after years of "foundational work in machine translation."

Facebook研究助理安吉拉·范本周在该公司的一个网站上解释了这种新的机器翻译模型。她称该模型的开发代表着多年“机器翻译基础工作”进程上的一座“里程碑。”

Fan said the model produces better results than other machine learning systems that depend on English to help in the translation process. The other systems use it as an intermediate step -- like a bridge -- to translate between two non-English languages.

范表示,与其它在翻译过程中依赖英语帮助的机器学习系统相比,该模型的翻译结果会更好。其它系统将英语作为两种非英语语言之间进行翻译的桥梁。

One example would be a translation from Chinese to French. Fan noted that many machine translation models begin by translating from Chinese to English first, and then from English to French. This is done "because English training data is the most widely available," she said. But such a method can lead to mistakes in translation.

其中一个例子是汉语译法语。范指出,许多机器翻译模型都是先将汉语翻译成英语,然后再将英语翻译成法语。她说,这样做的原因是“因为可用的英语训练数据最广泛。”但是这种做法可能会导致翻译错误。

"Our model directly trains on Chinese to French data to better preserve meaning," Fan said. Facebook said the system outperformed English-centered systems on a widely used, data-driven measure of machine translation quality.

范表示:“我们的模型直接按照中文到法语的数据来运行,以更好地保留语义。”Facebook表示,该模型在一个广泛使用的以数据来衡量机器翻译质量的系统中,要优于以英语为中心的翻译模型。
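The difference between bridging through English and translating directly can be shown with a toy word-level example. This is only a sketch under stated assumptions: the three lexicons and the function names below are invented for illustration and have nothing to do with how M2M-100 works internally. It shows one way meaning can be lost when two Chinese words share a single English translation.

```python
# Toy word-level lexicons; every entry is made up for illustration.
ZH_TO_EN = {"银行": "bank", "河岸": "bank"}    # two senses collapse into one English word
EN_TO_FR = {"bank": "banque"}                  # the second step can only pick one sense
ZH_TO_FR = {"银行": "banque", "河岸": "rive"}  # direct pairs keep the senses apart


def translate(words, lexicon):
    """Translate word by word, leaving unknown words unchanged."""
    return [lexicon.get(word, word) for word in words]


def pivot_translate(words):
    """Chinese -> English -> French, the bridge approach."""
    return translate(translate(words, ZH_TO_EN), EN_TO_FR)


def direct_translate(words):
    """Chinese -> French in a single step."""
    return translate(words, ZH_TO_FR)


source = ["河岸"]  # "riverbank"
print(pivot_translate(source))   # ['banque'] (the financial sense, which is wrong here)
print(direct_translate(source))  # ['rive']
```

A real system works on whole sentences with a neural network, but the failure mode is the same in spirit: whatever the intermediate language cannot express is gone before the second step begins.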

Facebook says about two-thirds of its users communicate in a language other than English. The company already carries out an average of 20 billion translations every day on Facebook's News Feed. But it faces a huge test with many users publishing massive amounts of content in more than 160 languages.

Facebook表示,该公司约2/3的用户使用英语以外的语言进行交流。该公司已经在Facebook网站的信息流(News Feed)上平均每天进行200亿次翻译。但是该翻译模型面临了巨大的考验,因为许多用户发布大量内容的语言超过160种。

The development team trained, or directed, the new model on a data set of 7.5 billion sentence pairs for 100 languages. In addition, the system was trained on a total of 2,200 language directions. Facebook said this is 10 times the number covered by the best earlier machine translation models.

该开发团队用100种语言的75亿对句子的数据集训练或指导这款新模型。此外,该系统还共接受了2200种语言方向的训练。Facebook表示,这一数字是过去最佳机器翻译模型的10倍。
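To put the 2,200 figure in context, the short calculation below counts how many translation directions exist among 100 languages, treating each ordered (source, target) pair as one direction. The language IDs are placeholders, since the article does not list the languages, and the "hub" comparison is only an illustration of what an English-centered setup would cover.

```python
from itertools import permutations

NUM_LANGUAGES = 100
TRAINED_DIRECTIONS = 2200  # figure reported in the article

# Placeholder language IDs; the real language list is not given in the article.
languages = [f"lang{i:03d}" for i in range(NUM_LANGUAGES)]

# Every ordered (source, target) pair is one translation direction.
all_directions = list(permutations(languages, 2))

# A system centered on one hub language only translates into and out of it.
hub = languages[0]
hub_directions = [(s, t) for s, t in all_directions if hub in (s, t)]

print(len(all_directions))   # 9900
print(len(hub_directions))   # 198
print(round(TRAINED_DIRECTIONS / len(all_directions), 3))  # 0.222
```

In other words, the model can translate between any of the 9,900 possible pairs, but it saw direct training data for roughly a fifth of them.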

One difficulty the team faced was trying to develop an effective machine translation system for language combinations that are not widely used. Facebook calls these "low-resource languages." The data used to create the new model was collected from content available on the internet. But there is limited internet data on low-resource languages.

该团队面临的困难之一是试图为未广泛使用的语言组合开发有效的机器翻译系统。Facebook称这些语言为“低资源语言。”创建新模型所用的数据是从互联网上的内容中收集的。但在低资源语言方面,互联网数据有限。

To deal with this problem, Facebook said it used a method called back-translation. This method can create "synthetic translations" to increase the amount of data used to train on low-resource languages.

为了解决这个问题,Facebook称其利用了一种名为“反向翻译”的办法。这种办法可以创建“合成翻译,”以增加用于训练低资源语言的数据量。
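Back-translation, as the technique is usually described, takes text that exists only in the target language, runs it through a reverse model to produce a machine-made source sentence, and uses the resulting pair as extra training data. The sketch below assumes a hypothetical reverse model represented by a lookup table; `back_translate`, `toy_reverse_model` and the sample sentences are illustrative names and data, not Facebook's code.

```python
from typing import Callable, List, Tuple


def back_translate(
    target_sentences: List[str],
    reverse_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Build synthetic (source, target) pairs from target-language text only."""
    pairs = []
    for target in target_sentences:
        synthetic_source = reverse_translate(target)  # machine-generated side
        pairs.append((synthetic_source, target))      # human-written side stays the target
    return pairs


# Stand-in for a trained French -> Chinese model; a lookup table is enough for the demo.
TOY_FR_TO_ZH = {"bonjour": "你好", "merci": "谢谢"}


def toy_reverse_model(sentence: str) -> str:
    """Hypothetical reverse model: returns the input unchanged if it is unknown."""
    return TOY_FR_TO_ZH.get(sentence, sentence)


monolingual_french = ["bonjour", "merci"]
synthetic_pairs = back_translate(monolingual_french, toy_reverse_model)
print(synthetic_pairs)  # [('你好', 'bonjour'), ('谢谢', 'merci')]
```

Keeping the human-written sentence on the target side means the forward model still learns to produce natural output, even though its input side is synthetic.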

For now, the company says, it plans to continue exploring new language research methods while working to improve the new model. No date has been set for launching the translation system on Facebook.

该公司表示,目前公司计划继续探索新的语言研究方法,同时努力改进这款新模型。该公司尚未确定在Facebook网站上启用该翻译系统的日期。

But Angela Fan said the new system marks an important step for Facebook, especially for the times we live in. "Breaking language barriers through machine language translation is one of the most important ways to bring people together, provide authoritative information on COVID-19, and keep them safe from harmful content," she said.

但是安吉拉·范表示,该新系统标志着Facebook迈出了重要的一步,尤其是对于我们所处的这一时期来说。她说:“通过机器翻译打破语言障碍,是将人们团结在一起、提供新冠肺炎相关权威信息,并使人们远离有害内容的最重要手段之一。”

I'm Bryan Lynn.

布莱恩·琳恩为你报道。

 
 
