9 Steps to Get Your Website Localization Started

Localizing your website into different languages is a huge task – but a very rewarding one. Check out how to get your website localization started with these 9 crucial steps.

Global expansion is high on the business agenda from New York City to Beijing. CEOs are rubbing their hands with glee, while CMOs and product managers are tearing out their hair. Why? Because access to international customers may mean greater profits, but it’s also a lot more work.

If you’re in the throes of rolling out your website to a global audience, you probably know a little (or a lot) about website localization. And if you’re tasked with the job of making your site more appealing to customers around the world, lucky you! Yours is a job that requires skill, determination, and coordination, plus the strength to resist the temptation to bang your head against the wall. Why? Because website localization is a complex task, yet a rewarding one if done properly.

However, the right website localization doesn’t just involve adding a plugin to your site. Sorry if that bursts your bubble! If you want to communicate a valuable message that resonates with your customers, machine translation won’t suffice. Just because users from Japan, Italy, and Venezuela are only a click away doesn’t mean they understand your English-language site … or that it meets their needs for currency, offers, and user experience. People don’t like being spoken to by robots, or as if they were robots, not at home and not overseas either.

So, if you want to capitalize on this goldmine of potential profit, you’ll need to invest in website localization. You’ll need different language and regional versions of your website that walk and talk like a native, in a local and appealing way. With so many different languages and buying preferences, website localization can seem overwhelming. The list of website localization failures and international marketing blunders is long, very long.

If you don’t want to join the thousands of companies that have goofed internationally, you’ll need to get it right. But if you’re stopping before you start or buckling under the weight of the awesome task ahead, fear not. Localization doesn’t have to be a headache! There are ways to roll out the localization of your website hassle-free. As with many areas of your business, success lies in careful planning, research, and strategy. Check out these 9 steps to get your website localization started.

1. Plan for Website Localization from the Start

Statistics show that more than three billion people use the Internet every day. Most of them are from Asia, America, and Europe. While that doesn’t guarantee that all of them will be candidates for what your business is selling, it highlights the great potential, and the fact that if you’re only selling in one market, you’re missing out on huge potential profits. So, you should always consider the potential of international sales from the start.

If you’re just setting out to design your website or updating an existing one, factoring in website localization will be easier for you. Think big and think global. Even if your initial market is local or small, that doesn’t mean it always will be. Having access to the Internet gives you a window on the world, and the world a window on your business. Whether you sell productivity software or clothing for premature babies, global tendencies are merging. So, if you have a successful product at home, it’s just as likely to be a hit overseas.

However, you’ll need to be able to communicate with your French-speaking customers in French, with your Spanish-speaking customers in Spanish, and so on. And you’ll need to do more than communicate: you’ll need to craft an attractive message in their local language. Use everyday vocabulary that they can understand and identify with.

If you’re thinking that most of the world speaks English these days, you need to get with the program. While many foreign language speakers understand English or even speak it well, that doesn’t mean they’ll buy from an English-language website. Common Sense Advisory statistics don’t lie:

  • 72.1% of consumers spend most or all of their time on websites in their own language
  • 72.4% of consumers said they would be more likely to buy a product with information in their own language

Similarly, according to the Globalization and Localization Association (GALA), 95% of Chinese online shoppers prefer using websites in their own language. But only one percent of US-based online retailers offer sites specific to China. Why is that? It could be an epic oversight by US multinationals and service providers. However, it’s more likely that complex cultural issues, protective legislation or lack of local demand are to blame.

Considering this background, how do you keep one eye on your international future as you set out to design your site? The simplest way of factoring in localization from the start is by leaving a lot of space in your design. Why? Because all languages are different and they don’t take up the same amount of room. If you’re bound by the constraints of a cluttered design or hard-coded CTAs, you’re going to run into a handful of problems: broken design, collapsing strings, slow site speed, and a lot of going back and forth. Space will make your website breathe anyway, and good designers are already aware of its purpose.

Remember that Internet speeds are not the same globally. The fewer images and videos you include, the faster your website will run. You’ll also have fewer images to replace and tailor to your local markets and fewer assets to translate. When it comes to website localization, less is more.
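To make the point about leaving space a little more concrete, here is a minimal Python sketch that budgets room for longer translations and flags strings that would not fit a button sized for English. The expansion factors and the function names are assumptions invented for this illustration, not measured values; real ratios vary by language and by string.

```python
import math

# Hypothetical per-locale expansion factors relative to the English source.
# Placeholder values for illustration only; measure your own content.
EXPANSION_FACTOR = {"de": 1.35, "fr": 1.25, "ja": 0.9}


def width_budget(source_text: str, locale: str, default_factor: float = 1.3) -> int:
    """Return a character budget for a UI element in the given locale."""
    factor = EXPANSION_FACTOR.get(locale, default_factor)
    # Round up so the estimate never under-allocates space.
    return math.ceil(len(source_text) * factor)


def overflowing(translations: dict[str, str], max_chars: int) -> list[str]:
    """List the locales whose translation is longer than max_chars."""
    return sorted(loc for loc, text in translations.items() if len(text) > max_chars)


cta = {
    "de": "In den Warenkorb",
    "fr": "Ajouter au panier",
    "es": "Añadir al carrito",
}

# A button sized exactly for the English "Add to cart" is too small for all three.
too_long = overflowing(cta, max_chars=len("Add to cart"))
german_budget = width_budget("Add to cart", "de")
print(too_long, german_budget)
```

A check like this is only a rough guard. Character counts ignore font width and line wrapping, so it complements, rather than replaces, looking at the localized pages.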

2. Identify Your Target Market

The cost of localization rises with the number of languages you want to work with and countries you want to target. I’ve made a compelling argument for localizing your website, but you don’t need to embark on a full-scale localization project into over 1,000 languages if your core customer base is in one or two countries. Identify your target market and where you think your product will be most successful. Start with those areas first and remember that new markets may open up in the future.

Carefully analyze which countries are likely to bring in the greatest ROI for your website localization project. Even huge global players like McDonald’s have had to close down stores in some countries, or were unable to find sufficient market demand (or comply with legislation). Going global without proper research is cavalier and inadvisable. Companies like McDonald’s can absorb the cost of localization blunders, but can yours?

Try to identify your international buyer personas by conducting specific research in each geographic region you want to approach. Using generic data about an entire continent is not good enough. The French, for example, are different from the Spanish. The Spanish don’t share the same culture as the Germans. More importantly, they don’t share the same language. So, you’ll need to analyze and assess the demand. If your product is particularly popular in one region, start out there. Building a one-size-fits-all plan for Europe won’t be of any use to anyone.

Your website localization ROI will depend on how well you manage the process in all its stages. So, it’s essential to start out right with every country you approach. Try asking a few simple, but essential questions:

  • Is there interest for your product in a specific market?
  • What is that market’s growth rate?
  • How much competition is there?
  • Can local buyers afford your products?
  • What are their buying preferences?
  • How much will transportation and customer support cost you?
  • How high is the cost of website localization compared with the potential market?

It’s important to reach the highest number of potential customers without spending more than your company can afford. But consider all marketing and financial indicators when deciding what markets to approach. China clearly has huge potential if you look at the number of internet users. But plain numbers may not be relevant if you don’t have a chance of selling your product.

3. Put Your Team Together

Website localization has many stakeholders. In fact, one of the reasons that localization is so dang hard is that there are a lot of people involved. The size of your team may be bound by your budget, but that doesn’t mean the efficiency or the quality has to drop.

You may decide to carry out your market research first hand by visiting the countries in question, if you’re lucky enough to have the time and resources for this. Most likely, you’ll work with a local consultant. At the very least, check out Google Analytics. However you choose to approach it, you’ll need people on board who understand what makes your target market customers tick.

If you want to stand out from the crowd and boost your sales, don’t cut corners when it comes to translating your content. Even the smallest of translation errors can cause serious damage to your global image. So, in addition to people who know the market, you’ll need a stellar team of native translators in your target market countries. First-class translators will be able to translate and localize your message in a way that makes local people identify with it right away. You need to deliver the same amount of wit, wisdom, and charisma that your original message contains. This means you may need to look for translators who can work with a list of keywords and apply transcreation techniques. They will be less bound by the original text and have more freedom to localize your message.

Finally, you’ll also need a reliable team of developers who understand how to work with different software and languages (both programming and human). They’ll need to understand how translators work and the context needed to translate a full text. Above all, your team will need to be able to work well together, especially considering they will likely be spread across different time zones and geographical regions.

4. Get Your Keyword Research Done

Once you’ve chosen your key markets and languages, you’ll need to get your lists of keywords prepared. This is important, as getting your international SEO right is essential for your position on the search engine results page (SERP). You know how you optimize your website for search engines at home? Well, you’re going to have to do that for search engines abroad as well. Google isn’t everyone’s go-to search engine. In China, they’re fans of Baidu. In Russia, they mainly use Yandex. Not only do you need to know the terms your customers are searching for, but you’ll need to know the search engines they’re using as well.

Just as your content is optimized at home, your translators should make use of key search terms in all languages, on-page and off-page. That includes metadata, keyword density on your site, and the external link anchors you use to point to your company. There are often many ways of translating the same text, so it’s essential that you provide your translators with a list of essential keywords.

Optimizing your content doesn’t just apply to markets that speak different languages. Remember that the British and Americans don’t use all the same words, and neither do Mexicans and Argentinians. If you’re optimizing your website for “vacations in Orlando” for a UK audience, you’re off to a bad start. Brits go on “holiday,” not “vacation,” and your lack of research will be reflected in your search rankings.

5. Don’t Even Think About Localizing Your Website Manually

If you’re thinking about managing this project manually, I suggest you reconsider. There’s a reason that technology exists and that’s because (in most cases) it makes our lives easier. You wouldn’t punch out your thesis on a typewriter, or add up complex equations without a calculator. So, if you’re thinking of using spreadsheets, emails, and Word documents for website localization — stop before you start. Spreadsheets suck for localization and I’m pretty sure I heard someone say that too much manual work can kill you! There’s way too much room for error when localizing your website manually. You have global teams working in isolation. Translators guessing at what’s coming next, and programmers unsure how to break up RTL or vertical languages. It takes forever, you go back and forth and you’ll end up spending way too much on your project. And tearing your hair out at the same time! If you like your hair and want to maximize your ROI, use translation management software. Which takes me to my next point.

6. Use the Right Translation Management Software

Using the right translation management software will take the headache out of your website localization project. Your translation management software should have several key features, including:

  • An API – to make automation possible. Your programmers can then integrate all projects with ease, import locale files and interact with the data. An API is essential for streamlining your workflow.
  • Collaboration functions – to make sure that your project runs smoothly. You need to be able to communicate effectively with all team members. Leave feedback, notes, comments, screenshots and tags, all in one place.
  • Direct website translation – just as it sounds, this will speed up your website localization project. If your translation management software doesn’t allow your translators to type directly onto your website, you’re slowing down your workflow. Remember the point about lengthy emails and piles of spreadsheets? Just think how awesome it would be if all that work could go right onto the site! Your translators and programmers could collaborate in a single tool!
  • Translation Memory – to speed up your projects and record any frequently used terms. Your translation memory will also save all changes and let you look up older versions with ease. Cut down repetitive manual tasks for all.

Your translation management software will act as a project manager and coordinator at once. When you have all your team members working in harmony, you’re far more likely to get your website localized faster and at lower cost.
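To give a feel for what a translation memory does behind the scenes, here is a minimal Python sketch of a fuzzy lookup. It uses the standard library’s `difflib.SequenceMatcher` for similarity scoring. The `TranslationMemory` class, its method names, and the 0.75 threshold are assumptions made up for this example; commercial tools use their own storage and matching algorithms.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy in-memory store of previously translated segments (hypothetical)."""

    def __init__(self) -> None:
        self._segments: dict[str, str] = {}

    def add(self, source: str, target: str) -> None:
        """Remember an approved translation for a source segment."""
        self._segments[source] = target

    def lookup(self, source: str, threshold: float = 0.75):
        """Return (stored_source, target, score) for the best match, or None."""
        best = None
        for stored, target in self._segments.items():
            score = SequenceMatcher(None, source.lower(), stored.lower()).ratio()
            if score >= threshold and (best is None or score > best[2]):
                best = (stored, target, score)
        return best


tm = TranslationMemory()
tm.add("Add to cart", "Ajouter au panier")
tm.add("Proceed to checkout", "Passer à la caisse")

exact = tm.lookup("Add to cart")       # identical segment, score 1.0
fuzzy = tm.lookup("Add to your cart")  # near match a translator can adapt
miss = tm.lookup("Track my order")     # nothing similar enough, so None
```

Exact matches can be reused as they are, while fuzzy matches give the translator a starting point instead of a blank field. That is where the time savings come from.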

7. Internationalize Your Website

Internationalizing your website for other languages will make adjusting for new markets much easier. Your programmers will know all about internationalization and Unicode (if they don’t, you might need to look for new ones!) Applying this is a lot easier when you plan for website localization from the beginning, but the main things to consider are:

  • Making sure your programmers use Unicode (UTF-8). This is the industry standard when it comes to encoding systems. It supports all languages, from Greek to Russian and even Chinese. Unicode assigns a unique code point to each of the more than one hundred thousand characters it covers.
  • Having your developers ready to work with your translators. They’ll need to go through the source code and separate out the translatable strings so that all your content can be adapted to the new language.
  • Separating the content from the code and storing it. This way, the next time you need to translate into a new language, you won’t have to extract the strings again.
  • Enabling code for local preferences. This includes specific forms of address, date and time formats, number formats, or local calendars. Currency and shipping address formats are also important. People are more willing to buy when they don’t have to do the math with currency rates.
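The points above can be sketched in a few lines of Python: strings live in per-locale catalogs rather than in the code, lookups fall back to the source language when a translation is missing, and dates are formatted per locale. The catalog keys, locale codes, formats, and function names are all made up for illustration. A real project would more likely use an established i18n library (Python ships a `gettext` module, for instance) than hand-rolled dictionaries.

```python
from datetime import date

# Translatable strings are kept in per-locale catalogs, separate from the code.
# Keys, locales, and texts are hypothetical examples.
CATALOGS = {
    "en-US": {"greeting": "Welcome back, {name}!", "order_date": "Ordered on {date}"},
    "fr-FR": {"greeting": "Bon retour, {name} !", "order_date": "Commandé le {date}"},
    "de-DE": {"greeting": "Willkommen zurück, {name}!"},  # order_date not translated yet
}

# Illustrative numeric date patterns per locale.
DATE_FORMATS = {"en-US": "%m/%d/%Y", "fr-FR": "%d/%m/%Y", "de-DE": "%d.%m.%Y"}
FALLBACK_LOCALE = "en-US"


def translate(key: str, locale: str, **params: str) -> str:
    """Look up a string for a locale, falling back to the source language."""
    template = CATALOGS.get(locale, {}).get(key) or CATALOGS[FALLBACK_LOCALE][key]
    return template.format(**params)


def format_date(value: date, locale: str) -> str:
    """Format a date using the pattern registered for the locale."""
    return value.strftime(DATE_FORMATS.get(locale, DATE_FORMATS[FALLBACK_LOCALE]))


ordered = date(2024, 3, 5)
french = translate("order_date", "fr-FR", date=format_date(ordered, "fr-FR"))
german = translate("order_date", "de-DE", date=format_date(ordered, "de-DE"))
print(french)
print(german)  # falls back to the English template, but keeps the German date format
```

Adding a new language then means adding a catalog and a few format entries, without touching the functions that render the page.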

8. Test Before Launching

Always plan for localization and linguistic testing of your website. Linguistic testing is to ensure that all words are correctly translated, accurate and understood. Localization testing is about making sure your different language site versions are functioning right. To be more specific… Localization testing is about checking:

  • Encryption algorithms,
  • Hardware compatibility,
  • Names, time, date, weights, measurements, etc.,
  • Upgrades,
  • Entry fields,
  • Hyperlinks,
  • Image appropriateness,
  • Broken strings/design,
  • Form functionality,
  • Shopping Cart,
  • Load time etc.

Linguistic testing includes but is not limited to inspecting:

  • Spelling errors, wrong use of words, punctuation errors,
  • Grammatical mistakes,
  • Presence of cultural taboos,
  • Inappropriate or offensive texts,
  • Misuse of keywords,
  • Readability and appeal of message,
  • Untranslated strings.
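Some of these checks, untranslated strings in particular, can be partly automated before human reviewers step in. The Python sketch below compares a translated catalog against the source catalog and reports keys that deserve a look. The catalog layout, key names, and report categories are assumptions made up for this example.

```python
from string import Formatter


def placeholders(text: str) -> set[str]:
    """Return the names of the {placeholders} used in a format string."""
    return {field for _, field, _, _ in Formatter().parse(text) if field}


def check_catalog(source: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Report keys in a translated catalog that deserve a human look."""
    report = {"missing": [], "identical_to_source": [], "placeholder_mismatch": []}
    for key, source_text in sorted(source.items()):
        translated = target.get(key, "").strip()
        if not translated:
            report["missing"].append(key)
        elif translated == source_text:
            # May be legitimate (brand names, "OK"), so flag rather than fail.
            report["identical_to_source"].append(key)
        elif placeholders(translated) != placeholders(source_text):
            report["placeholder_mismatch"].append(key)
    return report


source_en = {
    "cart.title": "Your cart",
    "cart.empty": "Your cart is empty",
    "checkout": "Checkout",
    "greeting": "Hello, {name}!",
    "ok": "OK",
}
target_es = {
    "cart.title": "Tu carrito",
    "cart.empty": "Your cart is empty",  # left in English by mistake
    "greeting": "¡Hola, {nombre}!",      # placeholder was translated by mistake
    "ok": "OK",
}

report = check_catalog(source_en, target_es)
print(report)
```

A script like this only catches mechanical problems. Judging tone, cultural fit, and readability still needs native-speaking testers.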

Split your testers out over regions. Make sure that you check all site forms functionality and look for broken design. Ensure your site is optimized for speed and, if not, make sure that you address the issue. If you heeded my advice from the first point, your site will be as light and nimble as a South Korean gymnast. But, if it’s laden down with images and videos, your site speed may be slower than you’d like. 40% of online browsers will X out of your site if it takes more than three seconds to load. For those who stick around, they aren’t going to be impressed as they wait for your images to load or your videos to crank into life. Your programmers should have already minified your HTML, removed any unnecessary CSS and optimized your JavaScript. But if your site’s still loading slowly in your target country, think about investing in a Content Delivery Network (CDN). This will use local servers to speed up load time around the world. Ensure that all weights and measurements, dates and currency are correctly localized and that all images are appropriate. Check everything on mobile devices as well and with different browsers. Ensure that you deliver an optimal user experience around the world and that all your hard work with your website localization pays off. Test everything once, twice, three times and more. If you don’t find the issue, you can bet your customer will, and that could result in a lost sale.

9. Keep Calm and Carry on Localizing

If it seems like a lot of hard work and you keep putting it off, remember to keep calm and carry on localizing. Think about the success stories and the 4 billion Internet users worldwide. Remember the potential goldmine your business is sitting on and that any technical issues can be fixed. The sooner you start, the sooner you’ll be ready to launch your products in a new country! So, don’t wait too long to make the right decision. Every great journey begins with a single step. Yours will begin by designing your website with localization in mind. Researching your target market and assembling the right team. With a planned strategy, the right people and translation management software, you’ll pull it off to perfection. So, stop deliberating and get your web localization started today!

This post originally appeared on Phrase.com

RISE OF THE MACHINES: THE STATE OF MT – COST, QUALITY, AND COVERAGE

Introduction

Machine translation is perhaps the most rapidly evolving space in the language services industry. It is easy to get lost with new innovations being announced each day. This report will help you stay up to speed without having to research for days.

In this Insight Report, we look at top machine translation providers and how they stack up. Specifically, we analyze language coverage, pricing, integrations and, of course, output quality.

Information contained in this report:

  1. TL;DR
  2. Background
  3. How LSPs make money
  4. Market size
  5. Types of MT
  6. Ranking quality output
  7. Language coverage
  8. CAT Tool integration
  9. Pricing
  10. Summary

Note that in this report we look at output quality for the 11 top machine translation providers for 35 language pairs, both from and into English. To get this information Nimdzi has partnered with Inten.to.

“Intento benchmarks Cognitive Services and provides a single API to use all of them. We help to discover and use the best Artificial Intelligence services for every task without spending human effort on integration and switching providers.”

We know you are busy! That’s why each of our reports is formatted so that information can be quickly and easily digested. For those in a hurry, we provide the TL;DR (Too long; didn’t read!) section at the beginning of each report. At the end of each report we also summarize key points with our own Insights.

For those that think the devil is in the details, we have you covered, too! In the body of the report we go into as much detail as possible about each topic discussed. Still not satisfied? Never hesitate to reach out to us directly to request additional information. We’re here to help!

TL;DR

Market size

There is no consensus

Estimates for the size of the machine translation market vary widely. Estimating market size is difficult because the industry is constantly evolving. Rapid growth and new entrants make for a moving target that is hard to hit.

LSP Opportunities

Translating more

LSPs do not have to see machine translation as a threat. Rather, it can be an opportunity. Machine translation allows LSPs to improve margins internally by increasing translator efficiency, and also allows for translation of high-volume content that previously would have been cost-prohibitive, such as e-commerce content and user-generated content.

Impending boom

Machine Translation about to explode

Existing players are reporting record growth and new big players continue to flock to the market. Amazon has recently announced a solution to rival Google Translate, and Apple is testing its own technology, which is expected to be released soon.

Quality

Comparing engines by quality

Perhaps the most important aspect of evaluating machine translation engines is evaluating the quality of the output. Output quality varies based on many factors, such as engine, language pair, and source content. This report looks at quality data from 11 different MT engines, covering 35 language pairs. There are some clear winners and losers, though it is important to remember that this evaluation was performed on only one content type and different engines will be more effective with different content types.

Pricing

Cost models for MT providers

Pricing for MT providers can follow different models. The simplest is a markup on top of translation rates, which are already pretty well standardized. Other models include billing for support, on-premise servers, and hosted SaaS models. There are also diverse pricing models for post-editing (PE) services. There is not yet a consensus on whether it makes sense to charge for PE services per word, per hour, or as a hybrid with existing TM match pricing models.

Integrations

Playing nice with other tools

Machine translation is only useful if you can use it. This means it somehow needs to be integrated into existing translation workflows and toolsets. We look at the level of integration of over 35 providers with 10 of the top translation management systems to see who plays nice with whom.

Coverage

Which language pairs are covered

Machine translation engines are built for individual languages. Therefore, it is important to pick an engine that will handle all of the language pairs you need. Google, Yandex, and Microsoft offer the most language pairs, both in total and unique. Unique language pairs are defined as those which are only offered by one engine. If one of those unique language pairs is a “must have,” it is important to know which engine offers it.
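The definition of a unique language pair is easy to express in code. The Python sketch below counts how many engines offer each pair and keeps the pairs offered by exactly one. The engine names and the coverage data are placeholders invented for this example, not figures from the report.

```python
from collections import Counter

# Hypothetical coverage data; engine names and language pairs are placeholders.
COVERAGE = {
    "engine_a": {("en", "de"), ("en", "fr"), ("en", "sw")},
    "engine_b": {("en", "de"), ("en", "fr")},
    "engine_c": {("en", "de"), ("en", "is")},
}


def unique_pairs(coverage: dict[str, set[tuple[str, str]]]) -> dict[str, list[tuple[str, str]]]:
    """Map each engine to the language pairs that no other engine offers."""
    counts = Counter(pair for pairs in coverage.values() for pair in pairs)
    return {
        engine: sorted(pair for pair in pairs if counts[pair] == 1)
        for engine, pairs in coverage.items()
    }


unique = unique_pairs(COVERAGE)
print(unique)
```

With real coverage data, the same check shows at a glance which engine is the only option for a must-have pair.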

Evolution

From RbMT to NMT and beyond

Machine translation has come a long way over the years. Today, we have many different types of machine translation beyond rules-based and statistical. Neural machine translation is a game changer, and when combined into hybrid NMT-SMT engines, can be even more powerful.

Background

Google Translate processed 146 billion words a day in 2016. That is already three times more than all professional translators in the world together can do in a month. Since then, the scale has only been growing.

For those of you that are worried about machine translation taking over, we are sorry to report that the doomsday scenario has already snuck up on us. The machines have already surpassed us.

Contrary to common expectations, the machines have not stolen all of our jobs. Global demand for language services continues to increase every year. The demand for quality linguists is at record highs. Machine translation (at least in its raw form) does not yet match human translation, and human professional translators are still required for projects that require quality translations.

Technology is not replacing translators. Translators using technology are replacing those who don’t.

However, that doesn’t mean professional translation is completely safe from MT or that professional services providers can ignore MT. With post-editing for most types of content, and raw MT for user-generated content, it is becoming urgent for translation companies and global brands to understand and harness the power of machine translation.

How LSPs make money with MT

Based on our interviews, the clear majority of translation companies see machine translation primarily as a threat. They do not yet generate a lot of new revenue by selling machine translation solutions. Instead, the quickly growing volume of post-editing work is cannibalizing income from professional translations. PEMT often warrants lower margins, partly because the service is new and requires adaptation from LSP vendor management, project management and sales. That’s why established LSPs often try to slow down MT adoption into buyer workflows, citing “client miseducation” about MT.

LSP opportunities with MT are connected with using it internally to improve efficiency and margins, and with making available projects that were not feasible before due to schedule and budgetary constraints.

Most common uses for MT:

 

Internal

Supplementing TMs

The low-hanging fruit is to pre-translate numbers, addresses, countries, tags, and similar content with MT. In conjunction with translation memory, MT can produce even better savings. If the LSP introduces MT ahead of their buyer, they can lower translator costs while still charging full price to the customer.

UGC

Translating user generated content

To sell more, marketers in consumer industries increasingly rely on user-generated content such as reviews, social media posts and community guides to their products. An unscripted review from a happy user promotes products better than a shiny official landing page full of marketing praise written by a third-party copywriter. Marketers will look to providers that can offer them an MT solution to expand the reach of these user interactions.

E-commerce

High-volume product descriptions

Online commerce continues to expand faster than brick-and-mortar shops. E-stores have thousands of items in stock, sometimes hundreds of thousands. When they want to enter a new market, it’s impossible to translate the whole stock in a short period of time and on a feasible budget. MT is the answer.

How much money is there in MT?

The MT software market is quite small, with estimates ranging from USD 130 million to USD 400 million. This is a grain of sand compared to the USD 25,000 – 40,000 million in professional translation services.

Industry outsiders typically give the MT market a higher valuation, and insiders are more modest. For example, Global Market Insights predicted MT to hit USD 1.5 billion by 2024. Industry insider think tank TAUS made their first estimate in 2015 at USD 250 million, and corrected it down to USD 130 million two years later. They foresee a 6 percent growth rate for the coming years.

The reality is that estimates vary widely and every number we have seen so far is an educated guess. Below are some of the projections that have been published over the last three years.

2015 market sizing estimates

  • Grand View Research: projected USD 983.3 million (close to USD 1 billion) by 2022.
  • TAUS: estimated USD 250 million in 2015.

2016 market sizing estimates

  • Global Market Insights: estimated USD 400 million in 2016, with a 19 percent growth rate leading to USD 1.5 billion by 2024.

2017 market sizing estimates

  • P&S Market Research: estimated USD 123 million for the previous year (2016), with a 6.7 percent growth rate.
  • TAUS: estimated USD 130 million in 2017, with a 6 percent growth rate prediction.
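The growth rates quoted in these estimates can be sanity-checked with simple compound-growth arithmetic. The Python sketch below is illustrative only: the function names are made up, and it assumes that each growth rate applies uniformly every year, which the published estimates do not necessarily claim.

```python
def project(value_musd: float, annual_growth: float, years: int) -> float:
    """Compound a market size (in USD millions) forward by a fixed annual rate."""
    return value_musd * (1 + annual_growth) ** years


def implied_cagr(start_musd: float, end_musd: float, years: int) -> float:
    """Annual growth rate that takes start_musd to end_musd in the given years."""
    return (end_musd / start_musd) ** (1 / years) - 1


# The 2016 estimate: USD 400 million growing 19 percent a year until 2024.
high_2024 = project(400, 0.19, 2024 - 2016)

# The 2017 TAUS estimate: USD 130 million growing 6 percent a year until 2024.
low_2024 = project(130, 0.06, 2024 - 2017)

# Rate needed to get from USD 400 million to USD 1.5 billion in eight years.
needed_rate = implied_cagr(400, 1500, 2024 - 2016)

print(round(high_2024), round(low_2024), round(needed_rate * 100, 1))
```

Under these assumptions, the higher estimate compounds to roughly USD 1.6 billion by 2024, close to the quoted USD 1.5 billion, while the lower one stays under USD 200 million. That leaves a gap of about a factor of eight between the two outlooks.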

The reasons to believe the MT market is small are legitimate. With the exception of Google, Microsoft and defense contractors such as Raytheon BBN Technologies, most MT providers are microscopic companies. Important players like Omniscien, Kantan MT, Iconic, and Globalese are in the low millions of USD in terms of revenue. We estimate SDL’s machine translation software business to be at around USD 6 million. Larger providers with decades-long history such as Systran and Promt never scaled past USD 10 – 15 million.

By all estimates, the money in pure MT is negligible compared to what it does for language services and how it disrupts the industry. However, it should not be viewed in isolation. MT enables post-editing services which could potentially completely transform the professional translation market.

Will there be a boom in MT revenues?

Yes, there will be.

Even without any new advancements in this market, we are already seeing growth being driven by trends in the industry: post-editing, increasing volumes, new markets, innovation, and new integrations.

MTPE

Switching to post-editing

Translators are gradually becoming post-editors. This trend will continue as MT output quality improves.

Increased volume

Demand is skyrocketing

Business and products are increasingly global, and there aren’t enough humans to translate everything manually.

New markets

Localizing into more languages

Indian and African languages

Integration

More in-app MT

More and more apps are including machine translation options directly in the app, facilitating easy communication where there used to be strong language barriers.

Innovation

Leading industries pave the way

Leading industries: software, ecommerce, travel, media & stock information, basic healthcare

In addition to these pre-existing growth drivers, we are seeing even more growth driven by technological advancement and new entrants to the market. Google’s introduction of neural machine translation led to explosive growth in mainstream business media coverage of MT. Buyers became more aware of MT capabilities and their interest has been piqued. After all, the idea is that a universal Star Trek translator is around the corner, and that after 60 years of promises AI has finally cracked the problem of human language. This news functions as a self-fulfilling prophecy.

In 2016, SDL’s machine translation business saw a 72 percent increase. SDL’s annual report does not reveal actual numbers, but taking the growth rates into account, we have calculated that it moved from £2.8 million to £4.5 million (USD 6 million) a year.

In 2017 Amazon introduced Amazon Translate, a service to rival Google Translate. The announcement arrived at the end of the year, but about two years’ worth of preparation and investment went into this system: in 2015 Amazon acquired MT company Safaba Solutions, and later opened a dedicated “MT office” in Pittsburgh.

The next colossus to enter the MT game will be Apple. At the time of writing, the company was beta-testing an MT service internally and hiring more scientists in the language technology field for its Munich office.

IT giants such as Amazon and Apple don’t invest resources into products unless they expect to gain a billion dollars in return. So, while other industry think tanks are predicting 6 – 7 percent annual growth, we believe the potential of MT is much larger.

Types of MT

Whether you view machine translation as a threat or an opportunity, it is important to understand how it has evolved over the years. The evolution is not yet complete, of course. Machine translation improves every day. In order to provide context to our conversation, we will first look at the different types of machine translation that have been developed, classified by technology, specialization and usage scenarios.

Classified by technology

First we will look at the types of MT as classified by the type of technology used. Below is a list in order of development, broken down into three generations of MT innovation.

FIRST GENERATION

RbMT

Rule-based MT

Uses countless algorithms based on language grammar, syntax, and phraseology. Good for repetitive content, such as ecommerce product names, but adding new language combinations takes years.

SECOND GENERATION

SMT

Statistical MT

Pattern-matches millions of reference texts to find translations that are statistically most likely to be suitable. Training new engines is easy provided there is enough reference material.

RbMT/SMT

Hybrid rule based and statistical MT

A statistical engine with custom rules added on top.

Meta

Meta-language approach

An experimental approach that translates into a semantic machine language first, and then into the target language.

THIRD GENERATION

NMT

Neural Machine Translation

Uses machine learning technology to teach software how to produce the best result. This process consumes a lot of processing power, which is why it is often run on graphics processing units (GPUs). NMT arrived in 2016, and most MT providers are now switching to this technology.

NMT/SMT

Hybrid neural and statistical machine translation

A combination of neural and statistical approaches.

EBMT

Example based machine translation

Adaptive

Special type: adaptive machine translation

Adaptive MT works in CAT-tools and functions similarly to autosuggest. It offers suggestions to translators, and learns continuously from their selections. Both SMT and NMT systems can be made Adaptive.

Classified by specialization

When classifying machine translation by specialization, it is useful to look at how the engines are trained and used. Different users will have different content types and, therefore, different requirements. Below are some examples of machine translation providers that can be classified as providing generic, custom, and domain-specific solutions.

Type | Explanation | Example vendors
Generic | Doesn’t have specialization, doesn’t follow professional terminology. Equally good for anything. | Google Translate, Microsoft Translator
Custom | Uses client-specific terminology and fits their needs best, but requires training and maintenance. | KantanMT, Omniscien, Iconic, Globalese
Domain-specific | Ready-made specialized engine. Prioritizes terminology specific to the selected industry (medical, legal, automotive). Good with the chosen type of texts, bad with others. | Promt, Systran, most custom engines

Classified by usage scenarios

For a more practical view of how you can use machine translation, it is useful to classify different engines by usage scenario. Below are some examples of how to use specialized machine translation providers for different use-cases that you may have.

Scenario | Example
Standalone desktop or mobile app | Promt mobile translator
Web portal | Google Translate
API | Almost any MT system
Backend system | Intel multilingual forum search
Integrated into a CAT-tool | XTM + Crosslang, memoQ + Omniscien
Integrated into browser | Chrome + Google Translate
Integrated into another app | Facebook post translation, Tripadvisor review translation
Tied in with hardware | Google Buds, Waver.ly earphones

Which MT systems offer the best quality

To answer this question, Nimdzi Insights has licensed a report from Inten.to comparing the translation quality of 11 systems in 35 language combinations. In 2017, Nimdzi, in partnership with Inten.to, ran a series of tests using news as a proxy for general business content. DeepL, Google Translate, Yandex, gtcom and Microsoft scored higher than the rest in overall performance. It is notable that DeepL, a startup, scored higher than Google Translate: this shows that it’s possible to beat the giant with a quality database, even if it covers only a few languages.

It is very important to note here that the quality was tested with news articles, a content type that favors generic engines. Domain engines (e.g., automotive) like those used by Promt, Systran, or SAP would score much higher when tested on their respective subject matter.

The good news, though, is that if you have a specific content type that you would like evaluated, this can easily be done. Please contact us to run a specialized test with content of your choice.

Evaluating the quality of MT

MT systems developers test daily to see if new data and algorithms lead to improvements in output quality. Due to the sheer number of tests, human evaluation is usually out of the question, and developers use automated metrics instead.

The most commonly used metric is BLEU (“bilingual evaluation understudy”), which shows how closely an MT translation corresponds to a human one. It compares the MT output with reference human translations and produces a score between 0 (worst) and 1 (best). While BLEU scores are widely used by MT researchers, they can be manipulated, and it takes a specialist to make sense of the results. Besides, BLEU favors statistical models over neural ones.
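To make the idea behind the score concrete, here is a minimal sketch of a sentence-level BLEU calculation in Python. It assumes a single reference translation, whitespace tokenization, n-grams up to length four and no smoothing; production tools differ on all of these points, so treat it as an illustration rather than a reference implementation. The function names are our own.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU between 0 (worst) and 1 (best)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped matches: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        matches = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if matches == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(matches / total))

    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


score = bleu("the cat sat on a mat", "the cat sat on the mat")
```

Because the score depends on tokenization, the number of references and smoothing choices, two BLEU figures are only comparable when they were computed in exactly the same way. This is one reason why it takes a specialist to interpret the results.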

BLEU has spawned many derivative metrics, including METEOR, ROUGE, HyTER, NIST and LEPOR (which is the metric used in the above evaluation). Derivative metrics attempt to fix some of the drawbacks of BLEU. However, BLEU has the advantage of longevity. Developers using it can compare the performance of their systems all the way back to the early stages of MT development, and can clearly estimate progress with statements such as “we’ve improved quality by 20 percent over existing systems”.

For actual use it is important to run human evaluation in parallel with machine tests.

Language coverage

MT engines launched by the search companies Google, Yandex and Baidu, as well as by Microsoft, offer the best language coverage, because through search they have access to the widest selection of parallel texts.

They don’t directly cover all available combinations; instead, they translate the remaining pairs via a middle language such as English. This process is called pivot translation.
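Routing a language pair through a middle language can be sketched in a few lines of Python. Everything here is hypothetical: `translate_with_pivot`, the `direct_pairs` set and the `toy_engine` stand-in are illustrative names, not the API of any real MT provider.

```python
def translate_with_pivot(text, source, target, direct_pairs, translate, pivot="en"):
    """Translate directly if the pair is supported, otherwise go via the pivot.

    direct_pairs: set of (source, target) language codes the engine covers.
    translate: callable (text, source, target) -> translated text.
    """
    if (source, target) in direct_pairs:
        return translate(text, source, target)
    if (source, pivot) in direct_pairs and (pivot, target) in direct_pairs:
        intermediate = translate(text, source, pivot)
        return translate(intermediate, pivot, target)
    raise ValueError(f"No route from {source} to {target}")


def toy_engine(text, source, target):
    """Stand-in for a real MT call: it only records which hop was taken."""
    return f"[{source}->{target}] {text}"


pairs = {("es", "en"), ("en", "ja")}
result = translate_with_pivot("hola", "es", "ja", pairs, toy_engine)
```

Each extra hop is another chance for errors to accumulate, which is one reason quality for rarely paired languages tends to lag behind quality for pairs that involve English.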

Integrations

Google Translate and Microsoft Translator are integrated into most CAT tools. Other MT providers offer a varying set of CAT-tool integrations, with Kantan and Omniscien (ex-Asia Online) at the forefront.

Pricing

When we talk about pricing for machine translation, we could look at two different aspects of pricing:

  • Pricing for machine translation as a service (i.e., raw MT), and
  • Pricing for post-editing of the machine translation

Pricing of raw machine translation

There are several different ways that machine translation providers choose to price machine translation.

APIS

Google, Microsoft, IBM and many others sell MT per million characters.

ON-PREMISE SERVERS

In addition to APIs, companies like Promt and Systran offer standalone enterprise translation servers that can be deployed on the client’s hardware and used without per-word limits. Server pricing is not published, and typically starts at USD 15,000 – USD 30,000 per project. Purchasing servers makes sense in the following scenarios:

  • High volume, preferably > 2,000,000 characters a year
  • High security requirements – machine translating confidential information that should not leave the premises

These companies also offer engine training services that are priced per hour.

HOSTED SAAS

Custom machine translation vendors such as KantanMT and Omniscien offer browser suites to train MT engines and connect them to various environments. Customers can log in, upload their translation memory and add it to an existing baseline engine to train custom MT. The custom engine can then function separately, or be plugged into a TMS. Web solutions come on a subscription basis, and their price starts at USD 1,500 – 3,000 a month.

Pricing of post editing

Machine translation post-editing (PEMT) rates are not a standard etched in stone, and many translation companies struggle to evaluate the internal cost correctly and to offer a corresponding price to the client. There are three main approaches to pricing PEMT.

PER WORD TRANSLATION COSTS

The simplest and most common way to charge for machine translation post editing would be on a per word basis. Often this is represented as a percentage of the standard human translation rate, such as 20 – 30 percent of the translation rate per word.

Advantages

This pricing strategy is very simple and it is also very familiar. Buyers are used to thinking in terms of per-word costs when it comes to translation. This leads to predictable costs and a smoother negotiation phase.

Disadvantages

Charging in this way creates problems at the production stage. Machine translation output quality varies, and the LSP doesn’t know in advance how difficult their task will be. Often the improvement in productivity does not correspond to the discount given. Linguists trudge through the machine output, get paid less for very tedious assignments, and end up resenting the LSP and declining new PEMT tasks.
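The arithmetic behind this mismatch can be sketched as follows. The sketch assumes PEMT is billed at a fixed share of the per-word translation rate, as described above; the function names and the USD 0.10 rate are illustrative assumptions, not market data.

```python
def pemt_price(word_count, translation_rate, pemt_share):
    """Price of a post-editing job billed at a share of the per-word rate."""
    return word_count * translation_rate * pemt_share


def required_speedup(pemt_share):
    """How many times faster than translating from scratch a post-editor
    must work to earn the same hourly income at the reduced rate."""
    return 1 / pemt_share


# Hypothetical job: 10,000 words, USD 0.10 per word, PEMT at 30 percent.
price = pemt_price(10_000, 0.10, 0.30)   # about USD 300
speedup = required_speedup(0.25)         # 4.0 times faster at a 25 percent share
```

If the MT output only doubles a linguist's speed while the rate drops to a quarter, their hourly income is halved, which is the resentment described above expressed in numbers.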

PER HOUR

On the agency side, the per-hour approach requires putting in place time trackers and a means of evaluating post-editors’ productivity, for example a post-editing analysis step. Rates should be tied to productivity, and without tools it’s hard to say why one editor completes ten pages an hour while another produces only two: because of MT quality, their talent and experience, or the level of attention they pay to detail.

Advantages

Removes MT output quality risk at production stage.

Disadvantages

This makes it harder to quantify costs beforehand. The buyer ends up paying for the effort and not the result.

PER WORD WITH A PER-MATCH DISCOUNT

The LSP can give a fuzzy match discount for the segments where the final translation closely matches the raw MT output.

Advantages

This is the smartest approach so far. The buyer pays for the result (words), and the LSP can adapt to the highs and the lows of the MT output quality.

Disadvantages

It becomes difficult to predict leverage, and implementing this is only possible with specialized software.
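A rough sketch of how such software might work is shown below, using Python's standard `difflib` module to measure how much of the raw MT output survived editing. The discount bands and function names are hypothetical; real CAT tools use their own similarity measures and negotiated grids.

```python
from difflib import SequenceMatcher

# Hypothetical bands: (minimum similarity, share of the full per-word rate).
DISCOUNT_BANDS = [
    (0.95, 0.10),
    (0.85, 0.40),
    (0.75, 0.70),
    (0.00, 1.00),
]


def similarity(raw_mt, final_translation):
    """Word-level similarity (0..1) between raw MT and the edited text."""
    return SequenceMatcher(None, raw_mt.split(), final_translation.split()).ratio()


def rate_share(score):
    """Share of the full rate charged for a segment with this similarity."""
    for threshold, share in DISCOUNT_BANDS:
        if score >= threshold:
            return share
    return 1.0


def job_price(segments, full_rate):
    """Total price for (raw_mt, final_translation) pairs.

    Words are counted on the final translation.
    """
    total = 0.0
    for raw_mt, final in segments:
        words = len(final.split())
        total += words * full_rate * rate_share(similarity(raw_mt, final))
    return total


segments = [
    ("the cat sat on the mat", "the cat sat on the mat"),  # untouched MT
    ("the cat sat on a mat", "the cat sat on the mat"),    # one word changed
]
total = job_price(segments, 0.10)
```

Because the discount is only known once the editing is done, the price is settled after the fact, which is why leverage is hard to predict at the quoting stage.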

Note: difference between translation in a CAT-tool and post-editing

In the past, a translator worked with the translation memory, and a post-editor received raw MT output to edit. Today most translations happen with the support of both technologies integrated into CAT-tools and TMS, blurring the difference between editing MT and “translating.” In the future, most translations will be assisted by a smart blend of translation memory and MT.

Post-editing may feature actions needed for engine training. For example, this could be human linguists evaluating MT quality, flagging and classifying MT mistakes in addition to correcting them, feeding the corrected translations back into the engine for training purposes and experimenting with data input and output.

Below are some different models spanning across Classic PEMT, Interactive PEMT, and Adaptive MT. It is very important to know the differences between these, because there are different effort levels (and therefore, different costs) involved with each. When agreeing to a project, it is necessary to clarify up front which post editing model is to be used.

Post-editing type | Description | Example
Classic PEMT | MT output is already in the “target” segment, whether it is good or not. | https://youtu.be/Abijz71Lz8Y?t=13s
Classic light | Correct mistakes only, focus on speed |
Classic deep | Bring the robot translation to the human level |
Interactive PEMT | Target segment is empty, the linguist can choose to put a MT suggestion in place if it is good, or to translate manually if MT offers gibberish. Indistinguishable from modern translation. | https://www.youtube.com/watch?v=_5-SHTFZST4
Adaptive MT | MT engine adapts suggestions on the fly based on each input from the translator. The way translation is going. | https://www.youtube.com/watch?v=YZ7G3gQgpfI

Summary

Machine translation is improving every day. New developments are constantly rolled out and it is impossible to capture all of the necessary information in a single report. The information we have provided above is a crucial step towards defining your company’s machine translation strategy.

This post originally appeared on Nimdzi