SmartData Collective

How Big Data And Machine Translation Combine To Fight COVID-19

Ryan Kh

Few if any events in history have brought the importance of big data to popular awareness more than the COVID-19 pandemic. Statistics gathered from around the world are driving public policy and shaping private behavior. Here we’ll focus on the linguistic dimension of this global struggle: communicating essential information to policymakers, healthcare providers, and the general public. The challenge is how to communicate rapidly changing data across language borders so that essential information is not lost in translation. But there are also more controversial uses, in which big data is translated along the way to find users.

Contents
  • Machine Translation using Big Data by the Leading Corporations
  • Social Media Translation and Privacy Challenges in COVID Tracking
  • Translating Privacy Concerns in Connection with Voluntary Data Collected
  • Public uses of Machine Translation and Interpretation on a Massive Scale
  • The Perils and Pitfalls of a Big Data and Machine Translation Project

Machine Translation using Big Data by the Leading Corporations

Given the scale of the problem, translation services are increasingly yielding to the efficiencies and throughput of machine translation. There are simply not enough human translators and interpreters to go around. Happily, thanks to the application of neural network methodologies in the last decade, the quality of machine translation has increased dramatically. Developments in this area have been dominated by the biggest tech companies, collectively known by the acronym FAMGA: Facebook, Apple, Microsoft, Google, and Amazon. Each of those corporations, in its own way, has relied on big data to compete on the leading linguistic edge. Instead of crunching numbers, however, they’re crunching words.

Social Media Translation and Privacy Challenges in COVID Tracking

Facebook snagged first place in several categories of the 2019 WMT competition by leveraging large-scale sampled back-translation, a big data technique. Neural machine translation requires vast amounts of bilingual training data, that is, sentences for which reference translations are available. Bilingual data is hard to come by, so the Facebook team used back-translation as a workaround. In the end, the team used roughly 10 billion words of additional data for its task. Facebook has unmatched access to content, using the comments and posts of its 2 billion or so users as training material.
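
To make the idea of sampled back-translation concrete, here is a minimal Python sketch. It is not Facebook's system. In back-translation, a model that translates in the reverse direction (target language to source language) turns plentiful monolingual target-language text into synthetic source sentences, and each synthetic source is paired with its original target sentence to create extra bilingual training data. "Sampled" means the reverse model's output is drawn at random from its probability distribution rather than always taking the single most likely translation. Everything named in the sketch is a hypothetical stand-in: `TOY_REVERSE_MODEL` is a word-level lookup table with made-up probabilities in place of a trained neural model, and the example sentence is invented.

```python
import random

# Hypothetical toy "reverse model" (German -> English). Each target-language
# word maps to candidate source-language words with made-up probabilities.
# A real system would use a trained neural translation model instead.
TOY_REVERSE_MODEL = {
    "das": [("the", 0.7), ("that", 0.3)],
    "virus": [("virus", 1.0)],
    "verbreitet": [("spreads", 0.6), ("propagates", 0.4)],
    "sich": [("", 1.0)],  # reflexive particle with no English counterpart here
    "schnell": [("quickly", 0.5), ("fast", 0.3), ("rapidly", 0.2)],
}


def sample_back_translation(target_sentence, rng):
    """Sample one synthetic source sentence for a monolingual target sentence."""
    source_words = []
    for word in target_sentence.lower().split():
        # Unknown words are copied through unchanged.
        candidates = TOY_REVERSE_MODEL.get(word, [(word, 1.0)])
        words, weights = zip(*candidates)
        # Sampling (not argmax) is what makes this "sampled" back-translation.
        choice = rng.choices(words, weights=weights, k=1)[0]
        if choice:
            source_words.append(choice)
    return " ".join(source_words)


def build_synthetic_pairs(monolingual_target, samples_per_sentence=2, seed=0):
    """Return (synthetic_source, real_target) pairs for training augmentation."""
    rng = random.Random(seed)
    pairs = []
    for target in monolingual_target:
        for _ in range(samples_per_sentence):
            pairs.append((sample_back_translation(target, rng), target))
    return pairs


if __name__ == "__main__":
    for source, target in build_synthetic_pairs(["Das Virus verbreitet sich schnell"]):
        print(f"{source!r} -> {target!r}")
```

The target side of every pair is genuine human-written text and only the source side is synthetic, which is why the technique lets a team turn large volumes of monolingual content into additional training material.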

It’s one thing to use posted language for experimental purposes in a language competition. It’s another altogether to exploit member posts on sensitive health matters like the novel coronavirus and the COVID-19 pandemic. As J. Scott Marcus of the Bruegel Institute has observed, users “volunteer” information in various ways: in their posts to social media, in their use of mobile services that provide location data, and in seeking health information. According to Marcus, big data has been used for strategic planning concerning COVID, for tracing potentially infected persons, and for the provision of guidance, advice, and information to infected individuals and the general public.

Translating Privacy Concerns in Connection with Voluntary Data Collected

Citizens may not be aware that the provision of “voluntary” data could be used to track them down and potentially quarantine them, or to expose their movements. More than one country, starting with China, then South Korea, Taiwan, Israel, and others, has explicitly used some or all of this information. In general, high tech companies have cooperated with national governments in making their data available, although privacy protections such as GDPR have deterred such uses in the European Union.

Virus tracking initiatives use machine translation to “normalize” communications and make them accessible in a preferred language to public health officials. For example, in Israel, social media communications in Arabic are auto-translated to Hebrew by machine translation techniques for the purpose of finding potential virus carriers.

Public uses of Machine Translation and Interpretation on a Massive Scale

Another example of the massive application of machine translation has been for screening visitors at international airports. In addition to thermal imaging and the now ubiquitous “thermometer pistols”, border officials are using hand-held voice-interpreters to question arriving passengers about their travel histories or medical symptoms.

The same considerations hold true for informing sectors of the public which do not speak the dominant language. Providing up-to-date information about coronavirus is a problem for migrants who do not speak the dominant language of the country in which they reside. In the Netherlands, according to a VOA report, volunteers set up a health desk to assist new immigrants who don’t speak Dutch. In Australia, the government sponsors a massive translation program at the nation’s border. Translating and Interpreting Service (TIS National) is a service provided by the Department of Immigration and Border Protection for non-English speakers, using both human interpreters and machine translation.

The need is massive in US hospitals. The New York Times reported in April 2020 on the vast scale of the difficulties facing Hispanic sufferers of COVID-19 in the United States, who were suffering disproportionately and represented some 34% of casualties from the disease in New York. To cope with the need, New York hospitals are increasingly turning to video remote interpretation, where health care providers call in to services where an interpreter is available on demand.

Last year, even before the COVID crisis broke, the not-for-profit Translators without Borders (TWB), with support from Cisco, introduced an innovative machine translation initiative called Gamayun, aimed at helping individuals who speak marginalized, minority languages. “People who speak marginalized languages lack access to critical and life-saving information,” said Grace Tang, who manages the program for TWB. Voice interpretation and text translation based on AI and big data tech will help the program scale up to 10 marginalized languages over 5 years, according to a Cisco spokesman.

The Perils and Pitfalls of a Big Data and Machine Translation Project

Perhaps the most famous, or perhaps notorious, case of a project combining big data and machine translation is Project Baseline, an initiative of Alphabet-backed Verily. In March 2020, U.S. President Donald Trump raised a ruckus when he claimed that Google was backing a nationwide initiative to track the novel coronavirus using bilingual screening questions. A similar controversy arose with Vital Software’s Covid-19 symptom-checker, translated into 15 languages for the state of Oregon. Although the community-based project was launched, its scale remains at the county level in selected states, not the national level. It’s still going through “teething pains.” To its credit, the project takes data privacy concerns seriously, given the massive amounts of sensitive information being collected from individuals.

The bottom line on the use of big data for machine translation and other purposes in the COVID crisis is that it’s being done “on the fly” and under intense pressure, a fact that almost invariably results in cut corners and high expectations not always met. The data is “noisy” and sub-optimal, to quote Facebook’s report on its WMT victory. Let’s hope that efforts to combine big data and machine translation methodologies in these difficult days prove just as successful, so that lives are not needlessly lost in translation.
