Localization teams are always looking for ways to do more with less. They face daily requests to translate more content into more languages, under increasingly tight deadlines and with smaller budgets, while maintaining top quality.
To meet this demand, machine translation (MT) is being used more and more. But generic machine translation models often fall short when it comes to quality. Specialized or technical terms can be mistranslated, and inconsistencies are common. Generic MT struggles with idiomatic expressions, cultural references, and ambiguity, especially for creative or specialized content. Most importantly, generic online MT tools like Google Translate come with serious confidentiality, privacy, and security risks.
That’s where custom MT engines come in, says Gábor Bessenyei, Senior Product Manager at memoQ. A custom MT engine accelerates human translation, so translated content can be delivered faster, more affordably, and with greater accuracy.
A custom engine works best when the use case is clearly defined, which is why the first step is to understand what you’re trying to achieve with your translated content.
Why Create a Custom MT Engine?
Companies typically consider machine translation for two different scenarios. The first is to provide users with translated content, so they can understand the original. In this case, the machine translation is usually raw, which means it has not been reviewed or edited by a human linguist.
The second scenario is to produce higher-quality translations with a custom MT engine. This engine is trained using your glossaries, translation memories, and industry-specific data, so its output is tailored to your content, terminology, and style, which ultimately improves quality.
“When basic understanding is your goal, customizing your MT isn’t critical,” Gábor says. “But when you need higher quality, for example for a regulated industry or if you have complex technical content, a custom machine translation engine can be a practical solution.”
Most organizations strengthen the quality of their MT output with a human review. This process, known as machine translation post-editing (MTPE), is where a human linguist reviews and edits the translations for cultural appropriateness and nuance.
Some may think it’s time-consuming or expensive to create a custom engine, or that technical expertise is needed. Gábor says globalese by memoQ makes it easy.
NMT and AI-Enhanced Custom MT Engine Options
If you’ve been wondering how to build a custom machine translation engine, the process is now much more straightforward than it used to be. Organizations once spent a lot of time, effort, and money creating custom MT engines, and training required millions of segments of data, which made things very complicated. This is no longer the case.
A custom MT engine can now be built without relying on IT or engineers. Because project managers and linguists often know the most about the projects and the translated content, globalese was designed to be straightforward to use.
Two types of custom engines are available. One is neural MT (NMT), a type of machine learning translation model that is trained using your data. The other offers adaptive machine translation with large language models (LLMs). This AI-enhanced option uses custom prompts and keyword lists instead of traditional training. Both support domain-specific content, but setup and maintenance are slightly different for each.
“Building a custom MT engine in globalese is almost fully automated,” Gábor says. “Training can be done by anyone; it doesn’t need to be an engineer or a technical person.”
Getting Started with globalese by memoQ
As you get started, there are a few questions to consider:
- Do you translate different projects for different clients?
- Do you develop different products that use different vocabulary?
- Do you work with different content types that require different writing styles?
If the answer to any of these is yes, then it’s a good idea to create different groups, so that the corpora, engines, and projects for each type of content are kept separate.
How to Build a Custom Machine Translation Engine
The first step is to set up your user accounts and CAT tool server connectors. These will help you extract the TMs, term bases, and previously translated projects and import them into globalese.
When it comes to training data, remember that quality is more important than quantity.
“If there are inconsistencies in the training data, this will show up in the MT output,” Gábor says. “When data comes from a content management system like WordPress, it’s important to ensure that technical content such as URLs is included in the translation memory as tagged content rather than plain text, to preserve quality.”
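The two data-quality checks described above can be sketched in a few lines of Python. This is a minimal illustration, not part of globalese: it assumes the translation memory has already been exported to plain (source, target) text pairs, that properly tagged URLs no longer appear as raw text in that export, and the function names and sample segments are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a raw "http(s)://..." string in a segment means the URL
# was stored as plain text rather than as tagged content.
URL_PATTERN = re.compile(r"https?://\S+")


def find_inconsistent_sources(pairs):
    """Return {source: sorted targets} for sources with more than one distinct target."""
    targets_by_source = defaultdict(set)
    for source, target in pairs:
        targets_by_source[source.strip()].add(target.strip())
    return {
        source: sorted(targets)
        for source, targets in targets_by_source.items()
        if len(targets) > 1
    }


def find_plain_text_urls(pairs):
    """Return the (source, target) pairs that contain a raw URL on either side."""
    return [
        (source, target)
        for source, target in pairs
        if URL_PATTERN.search(source) or URL_PATTERN.search(target)
    ]


# Hypothetical sample segments, for illustration only.
sample_pairs = [
    ("Save changes", "Änderungen speichern"),
    ("Save changes", "Änderungen sichern"),
    ("Open the file", "Datei öffnen"),
    ("See https://example.com/help", "Siehe https://example.com/help"),
]

inconsistent = find_inconsistent_sources(sample_pairs)
with_urls = find_plain_text_urls(sample_pairs)

for source, targets in inconsistent.items():
    print(f"Inconsistent translations for {source!r}: {targets}")
for source, _ in with_urls:
    print(f"Plain-text URL found in: {source!r}")
```

Segments flagged by either check are candidates for cleanup or re-tagging before they are used as training data.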
However, LLMs are making inconsistencies less of a concern. For example, instead of spending time up front cleaning data, a keyword list can be uploaded into the LLM to automatically fine-tune the results.
After the corpora are uploaded, you select the files for the new engine and start the training process. Training settings let you define model parameters and LLM features, and progress can be monitored in the globalese UI.
If you don’t have the required minimum amount of corpora for training, stock corpora can be provided by globalese. However, using your own data is always recommended for the best output.
Fine-Tuning and Retraining Your MT Engine
Regularly retraining your engine will improve its quality. Quick retraining can be done every 1-2 months and usually takes about 10-15 minutes. Over time, this helps improve localization workflows with machine translation by keeping terminology and content current.
Full retraining can be done for larger amounts of data or when there’s a significant update, such as a global terminology change. Training time depends on the size of the data, the length of the segments, and the language. For example, Asian languages take less time because they are character-based. Full retraining normally takes 1-2 days and should be done once or twice a year.
“Traditional training will eventually disappear because of LLMs,” Gábor says. “Instead, users will add references and sample translations in the prompt to generate better results. Eventually, the LLMs will take over, and we will see real-time adaptation instead of re-training.”
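To make the idea of adding references and sample translations to a prompt more concrete, here is a minimal sketch in Python. It is an illustration only and does not reflect how globalese builds its prompts: the function name, the prompt wording, and the sample terms are all hypothetical.

```python
def build_translation_prompt(source_text, target_language, glossary, examples):
    """Assemble a plain-text prompt from a keyword list and sample translations.

    glossary: dict mapping source terms to their required translations.
    examples: list of (source, target) sample translations.
    """
    lines = [f"Translate the following text into {target_language}."]
    if glossary:
        lines.append("Use these term translations:")
        for source_term, target_term in glossary.items():
            lines.append(f"- {source_term} -> {target_term}")
    if examples:
        lines.append("Follow the style of these sample translations:")
        for example_source, example_target in examples:
            lines.append(f"- {example_source} => {example_target}")
    lines.append("Text:")
    lines.append(source_text)
    return "\n".join(lines)


# Hypothetical keyword list and reference translation, for illustration only.
prompt = build_translation_prompt(
    source_text="Open the translation memory.",
    target_language="German",
    glossary={"translation memory": "Translation-Memory"},
    examples=[("Save changes", "Änderungen speichern")],
)
print(prompt)
```

Changing the keyword list or the sample translations changes the prompt immediately, with no retraining step, which is the kind of real-time adaptation the quote anticipates.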
Secure Deployment and Data Security
globalese can be deployed in different ways depending on how your team works and where your data needs to be hosted.
For organizations with strict security requirements, an NMT engine can be installed on-premises and run entirely within your own infrastructure, without relying on the cloud.
The LLM-powered engine is available only as a cloud service due to the computing resources required to support large language models in machine translation. It can also be hosted in a private cloud if needed.
All cloud deployments are single-tenant. Data is never shared or reused, and all processing takes place in GDPR-compliant regions, including the EU and Canada.
Real-World Business Impact
Infor provides a good example of how a custom MT engine can help organizations better manage their translations.
As a global provider of enterprise software, Infor needed an MT solution that could help it scale to more languages while maintaining quality. Like many organizations, they didn’t have large volumes of high-quality training data.
Infor implemented globalese by memoQ for a large number of NMT models covering different product lines and language combinations. The custom engines improved fluency and accuracy while preserving the tag structure, helping Infor deliver a more consistent experience to its users faster and at a lower cost.
In-domain adaptation let the team work with both core and auxiliary corpora, making it easier to handle smaller or less common projects. Installing the system in-house gave Infor full control over deployment and data security, and removed the need for complex system-to-system checks.
“As Infor’s preferred machine translation provider, globalese by memoQ has been a reliable and strong partner for almost a decade now,” says John Musters, Technical Translation Support Specialist at Infor. “This allows our department to speed up the translation process of new and recurring — often large — projects while maintaining structural, terminological, and overall linguistic correctness.”
Your MT, Your Competitive Advantage
The benefits of custom machine translation engines can vary, and what’s achievable depends on many factors. Gábor believes custom MT engines could help many organizations save a significant amount of time and cost on their translations.
“With full post-editing to ensure good quality, I estimate that a 30%-50% savings is possible,” Gábor says. “The real accomplishment is that custom MT allows you to publish content that you couldn’t publish before due to budget limitations or languages with a smaller user base.”
A custom MT approach can provide real efficiency and make high-volume quality translation possible. With a platform like globalese, localization teams can translate more of the content that their users need, across more markets and languages. This helps customers get more out of the product and allows companies to provide better support.
memoQ
memoQ is among the world's leading translation management systems. The favorite productivity tool for translation professionals around the globe.