
How Big Data And Machine Translation Combine To Fight COVID-19

Ryan Kh

Few if any events in history have brought the importance of big data to popular awareness more than the COVID-19 pandemic. Statistics gathered from around the world are driving public policy and shaping private behavior. Here we’ll focus on the linguistic dimension of this global struggle: communicating essential information to policymakers, healthcare providers, and the general public. The challenge is how to communicate rapidly changing data across language borders so that essential information is not lost in translation. But there are also more controversial uses, in which big data is translated along the way in order to find individual users.

Contents
  • Machine Translation using Big Data by the Leading Corporations
  • Social Media Translation and Privacy Challenges in COVID Tracking
  • Translating Privacy Concerns in Connection with Voluntary Data Collected
  • Public uses of Machine Translation and Interpretation on a Massive Scale
  • The Perils and Pitfalls of a Big Data and Machine Translation Project

Machine Translation using Big Data by the Leading Corporations

Given the scale of the problem, translation services are increasingly yielding to the efficiencies and throughput of machine translation. There are simply not enough human translators and interpreters to go around. Happily, thanks to the application of neural network methodologies in the last decade, the quality of machine translation has increased dramatically. Progress has been dominated by the biggest tech companies, collectively known by the acronym FAMGA: Facebook, Apple, Microsoft, Google, and Amazon. Each of those corporations has, in its own way, relied on big data to compete on the leading linguistic edge. Instead of crunching numbers, however, they’re crunching words.

Social Media Translation and Privacy Challenges in COVID Tracking

Facebook snagged first place in several categories of the 2019 WMT competition by leveraging large-scale sampled back-translation, a big data technique for Neural Machine Translation. Neural Machine Translation requires vast amounts of bilingual training data, that is, sentences for which reference translations are available. Bilingual data is hard to come by, so the Facebook team used back-translation as a workaround. In the end, the team used roughly 10 billion words of additional data for its task. Facebook has unmatched access to content, using the comments and posts of its 2 billion or so users as training material.
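To make the idea of sampled back-translation more concrete, here is a minimal sketch in Python. It is not Facebook's system: the tiny German-to-English lexicon, its probabilities, and the function names are all hypothetical stand-ins for a trained reverse translation model. The point it illustrates is that plentiful monolingual text in the target language can be translated "backwards" to produce synthetic bilingual pairs, and that the backwards translation is sampled rather than always taking the single most likely option.

```python
import random

# Hypothetical toy "reverse model" (German -> English). Each target-language
# word maps to candidate source-language words with made-up probabilities.
# A real system would use a trained neural translation model here instead.
REVERSE_LEXICON = {
    "das": [("the", 0.7), ("that", 0.3)],
    "haus": [("house", 0.8), ("home", 0.2)],
    "ist": [("is", 1.0)],
    "klein": [("small", 0.6), ("little", 0.4)],
}


def sample_back_translation(target_sentence, rng):
    """Sample one synthetic source sentence for a target-language sentence."""
    source_words = []
    for word in target_sentence.lower().split():
        # Words missing from the toy lexicon are simply copied through.
        candidates = REVERSE_LEXICON.get(word, [(word, 1.0)])
        words, weights = zip(*candidates)
        # Sampling (rather than always picking the top candidate) is what
        # makes this "sampled" back-translation.
        source_words.append(rng.choices(words, weights=weights, k=1)[0])
    return " ".join(source_words)


def build_synthetic_pairs(monolingual_target, seed=0):
    """Turn monolingual target sentences into (synthetic source, target) pairs."""
    rng = random.Random(seed)
    return [(sample_back_translation(t, rng), t) for t in monolingual_target]


if __name__ == "__main__":
    for source, target in build_synthetic_pairs(["Das Haus ist klein"]):
        print(source, "->", target)
```

The resulting pairs would be added to whatever genuine bilingual data is available and used to train the forward (here, English-to-German) model. The target side of each pair is real human-written text, while the source side is machine-generated and therefore noisy.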

It’s one thing to use posted language for experimental purposes in a language competition. It’s another altogether to exploit member posts on sensitive health matters like the novel coronavirus and the COVID-19 pandemic. As J. Scott Marcus of the Bruegel Institute has observed, users “volunteer” information in various ways: in their posts to social media, in their use of mobile services that provide location data, and in seeking health information. According to Marcus, big data has been used for strategic planning concerning COVID, for tracing potentially infected persons, and for the provision of guidance, advice, and information to infected individuals and the general public.

Translating Privacy Concerns in Connection with Voluntary Data Collected

Citizens may not be aware that the provision of “voluntary” data could be used to track them down, potentially quarantine them, or expose their movements. More than one country, starting with China and followed by South Korea, Taiwan, Israel, and others, has explicitly used some or all of this information. In general, high tech companies have cooperated with national governments in making their data available, although privacy protections such as GDPR have deterred such uses in the European Union.

Virus tracking initiatives use machine translation to “normalize” communications and make them accessible in a preferred language to public health officials. For example, in Israel, social media communications in Arabic are auto-translated to Hebrew by machine translation techniques for the purpose of finding potential virus carriers.

Public uses of Machine Translation and Interpretation on a Massive Scale

Another example of the massive application of machine translation has been for screening visitors at international airports. In addition to thermal imaging and the now ubiquitous “thermometer pistols”, border officials are using hand-held voice interpreters to question arriving passengers about their travel histories or medical symptoms.

The same considerations hold true for informing sectors of the public which do not speak the dominant language. Providing up-to-date information about coronavirus is a problem for migrants who do not speak the dominant language of the country in which they reside. In the Netherlands, according to a VOA report, volunteers set up a health desk to assist new immigrants who don’t speak Dutch. In Australia, the government sponsors a massive translation program at the nation’s border. Translating and Interpreting Service (TIS National) is a service provided by the Department of Immigration and Border Protection for non-English speakers, using both human interpreters and machine translation.

The need is massive in US hospitals. The New York Times reported in April 2020 on the vast scale of the difficulties facing Hispanic sufferers of COVID-19 in the United States, who have suffered disproportionately and represent some 34% of casualties from the disease in New York. To cope with the need, New York hospitals are increasingly turning to video remote interpretation, where health care providers call in to services where an interpreter is available on demand.

Last year, even before the COVID crisis broke, the not-for-profit Translators without Borders (TWB), with support from Cisco, introduced an innovative machine translation initiative called Gamayun, aimed at helping individuals who speak marginalized, minority languages. “People who speak marginalized languages lack access to critical and life-saving information,” said Grace Tang, who manages the program for TWB. Voice interpretation and text translation based on AI and big data tech will help the program scale up to 10 marginalized languages over 5 years, according to a Cisco spokesman.

The Perils and Pitfalls of a Big Data and Machine Translation Project

Perhaps the most famous, or perhaps notorious, case of a project combining big data and machine translation is Project Baseline, an initiative of Alphabet-backed Verily. U.S. President Donald Trump raised a ruckus in March 2020 when he claimed that Google was backing a nationwide initiative to track the novel coronavirus using bilingual screening questions. A similar controversy arose with Vital Software’s Covid-19 symptom-checker, translated into 15 languages for the state of Oregon. While the community-based project was launched, its scale remains at the county level in selected states, not the national level. It’s still going through “teething pains.” To its credit, the project takes data privacy concerns seriously, given the massive amounts of sensitive information being collected from individuals.

The bottom line on the use of big data for machine translation and other purposes in the COVID crisis is that it’s being done “on the fly” and under intense pressure, a fact that almost invariably results in cut corners and high expectations not always met. The data is “noisy” and sub-optimal, to quote Facebook’s report on its WMT victory. Let’s hope that efforts to combine big data and machine translation methodologies in these difficult days are nonetheless successful, so that lives are not needlessly lost in translation.

Ryan Kh is an experienced blogger, digital content & social marketer. Founder of Catalyst For Business and contributor to search giants like Yahoo Finance, MSN. He is passionate about covering topics like big data, business intelligence, startups & entrepreneurship. Email: ryankh14@icloud.com
