In the language industry, translation memory technology has been part of the standard toolset for most language service companies since the 1990s. Translation memory software began to be commercialized in the 1980s and hit the mainstream by the mid-90s. Many language professionals now see translation memory software (also known as TM tools or CAT tools, for computer-aided translation) as passé with the advent of other tools such as machine translation.
Nothing could be further from the truth, though, and here’s why: A translation memory tool…
- provides a standard interface for linguists
- simplifies the interaction with increasingly complex file types
- makes the sharing of work easier and more flexible
- provides equal access to translation memory and terminology databases
- streamlines project workflows
- secures and protects translations
- automates the translation process
- organizes complex projects
- allows for larger translation teams
- creates valuable linguistic resources
Provides standard interfaces
Translation memory tools provide the human translator a familiar and efficient interface for working with content to be translated. The example below shows a translation editor interface. The grid model is now the de facto standard for translation memory tools. In this example, the original document is a simple Microsoft Word file, but most translation memory tools can import myriad file types, including raw software code and other structured content types such as XML.
Translation Memory vs. TMS (Translation Management System)
These are two types of technology that are often discussed together, since their functionality overlaps; however, it is important to understand the difference.
Translation memory tools are technology that enables the recording, storage, and recall of translated content in translation units, each of which pairs a source segment with its translation.
Translation Management Systems (TMS) are a much broader type of tool that includes, among other things, translation memory functionality. A TMS may also have functions related to project management, terminology management, connectivity to third-party content management systems, and user access and management.
Figure 1 The standard translation grid has revolutionized translator productivity.
Handles complex file types
Standardizing the translator’s experience across file types is still a revolutionary feature of translation memory tools. Few translators possess both the linguistic and technical skills to translate content within the code of a website or mobile app, nor do many have all the software that is used to publish the hundreds of billions of words that are translated globally each year. As you can see from the additional examples below, translating raw XML is daunting compared to translating the same content after it has been imported into a translation memory environment.
Figure 2 Translating raw XML is cumbersome.
Figure 3 Translating the same XML content in a translation memory tool makes the task much easier.
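As a rough illustration of what happens at import, the sketch below pulls the translatable text out of a small XML document and lists it as numbered segments, using Python’s standard xml.etree.ElementTree module. The sample XML, its element names, and the extract_segments helper are all invented for this illustration; the import filters in real translation memory tools are configurable and considerably more sophisticated.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; element names are invented for illustration.
SAMPLE_XML = """<manual>
  <title>Quick Start</title>
  <step id="1">Connect the power cable.</step>
  <step id="2">Press the green button.</step>
</manual>"""


def extract_segments(xml_text):
    """Return (tag, text) pairs for every element that carries text."""
    root = ET.fromstring(xml_text)
    segments = []
    for element in root.iter():
        # Whitespace-only text (indentation between tags) is skipped.
        text = (element.text or "").strip()
        if text:
            segments.append((element.tag, text))
    return segments


segments = extract_segments(SAMPLE_XML)
for number, (tag, text) in enumerate(segments, start=1):
    print(f"{number}\t{tag}\t{text}")
```

The translator then sees only the three text segments, while the tags and attributes stay out of reach in the underlying file.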
Distributes work
To meet market demand for translated content, language service companies sometimes need multiple translators working simultaneously. Translation memory tools, especially those that are server-based, allow multiple translators to collaborate in real time. Robust tools enable the slicing and batching of substantial amounts of content.
Hosts critical resources for real-time sharing
In addition to distributing content, the server hosts centralized translation memory databases that translators can access in real time while translating. This means that a translation created by one translator can be reused by many to help them translate faster and maintain consistency. However, that is not all the server can share. It also hosts terminology databases, so technical terminology can be managed and shared among the translation team. This saves the translators from having to do extensive research to confirm which terms must be used to provide an accurate and consistent translation.
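The reuse described above can be sketched as a lookup against stored translation units. The example below is a minimal sketch, assuming a toy in-memory dictionary and using the similarity ratio from Python’s standard difflib module as a stand-in for match scoring; commercial tools use their own scoring methods and shared server databases. The best_match function, the sample entries, and the 0.75 threshold are hypothetical.

```python
from difflib import SequenceMatcher

# A toy translation memory: source segment -> stored translation.
# Entries are invented for illustration.
translation_memory = {
    "Press the green button.": "Appuyez sur le bouton vert.",
    "Connect the power cable.": "Branchez le câble d'alimentation.",
}


def best_match(source, memory, threshold=0.75):
    """Return (score, stored_source, translation) for the closest entry,
    or None if no entry reaches the threshold."""
    best = None
    for stored_source, translation in memory.items():
        score = SequenceMatcher(None, source.lower(), stored_source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, stored_source, translation)
    if best is not None and best[0] >= threshold:
        return best
    return None


exact = best_match("Press the green button.", translation_memory)
fuzzy = best_match("Press the red button.", translation_memory)
unrelated = best_match("Warranty terms apply.", translation_memory)
```

Here the first query is an exact match, the second is a close ("fuzzy") match that the translator would revise rather than retype, and the third finds nothing usable.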
Streamlines project workflows
Prior to the advent of translation memory tools, the translation process consisted of moving files from one linguist to another depending on their role. It was common for a project manager to send files to a translator who would translate them and deliver them back to the project manager, who would then send them to an editor who would edit and revise the translation. The editor would then deliver the completed work back to the project manager, who would send it to a proofreader to finalize the translation prior to delivery. This “TEP” (translate, edit, proofread) workflow has been standard for publishing translations for centuries.
When working on a server, for example, you can avoid moving files as part of your workflow. Instead, linguists work on translation segments or groups of segments assigned to them depending on the segments’ status. This innovation offers unique benefits, such as overlapping workflows, in which an editor can review a segment as soon as a translator has completed it. Such a “dovetailed” workflow can shorten the translation production timeline by as much as 30%.
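To make the status-driven model concrete, here is a minimal sketch in which an editor’s queue is simply the set of segments whose status is "translated", so editing can begin while translation is still under way. The Status values, the Segment class, and the ready_for_editor helper are hypothetical names chosen for this illustration, not the data model of any particular tool.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    NEW = "new"
    TRANSLATED = "translated"
    EDITED = "edited"


@dataclass
class Segment:
    number: int
    source: str
    target: str = ""
    status: Status = Status.NEW


def ready_for_editor(segments):
    """Segments the editor can start on while translation continues."""
    return [s for s in segments if s.status is Status.TRANSLATED]


project = [
    Segment(1, "Connect the power cable."),
    Segment(2, "Press the green button."),
    Segment(3, "Wait for the light to turn on."),
]

# The translator confirms the first two segments; the third is still open.
project[0].target, project[0].status = "Branchez le câble d'alimentation.", Status.TRANSLATED
project[1].target, project[1].status = "Appuyez sur le bouton vert.", Status.TRANSLATED

editor_queue = [s.number for s in ready_for_editor(project)]
print(editor_queue)
```

No file changes hands at any point; the editor’s work list grows as the translator confirms segments.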
Secures and protects translations
Server-based translation not only streamlines the translation workflow, but it also secures the translations. Because each translation is captured in a translation memory database as it is completed, the risk of losing that content is mitigated as soon as the translated segment is committed to the database.
Besides safeguarding the translated content itself, translation memory tools also safeguard the functional structure of the documents being translated. When a file is imported into the translation memory software, its content is parsed into segments, which separates the text from the layout of the document. The file’s structure is retained and cannot be changed by the translator. The translator can only affect the formatting of content within a sentence (for example, deciding which word should be bold or italic). Beyond formatting, translators are responsible for placing any inline tags (such as markup used in HTML or XML). The translation memory software also checks that these tags are all accounted for and properly placed. These are critical quality assurance measures that help ensure the translated version of a file will function just like the original.
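A tag check of this kind can be sketched in a few lines. The example below is a simplified illustration, assuming HTML-style inline tags and a hypothetical missing_or_extra_tags helper; it only compares which tags occur and how often, whereas real tools also verify tag order and nesting.

```python
import re
from collections import Counter

# Matches simple opening and closing tags such as <b> and </b>.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")


def missing_or_extra_tags(source, target):
    """Compare inline tags in source and target.

    Returns (missing, extra): tags that appear in the source but not in the
    target, and tags that appear only in the target.
    """
    source_tags = Counter(TAG_PATTERN.findall(source))
    target_tags = Counter(TAG_PATTERN.findall(target))
    missing = sorted((source_tags - target_tags).elements())
    extra = sorted((target_tags - source_tags).elements())
    return missing, extra


source = "Press the <b>green</b> button."
good_target = "Appuyez sur le bouton <b>vert</b>."
bad_target = "Appuyez sur le bouton <b>vert."

ok = missing_or_extra_tags(source, good_target)
problem = missing_or_extra_tags(source, bad_target)
print(ok)
print(problem)
```

The second target drops the closing tag, so the check reports it as missing before the file is ever exported.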
In addition to capturing content in real time, having all translations in a translation memory tool helps mitigate other issues, such as file corruption. In post-production (following translation), a file might become corrupted while being moved or during desktop publishing. Should this occur, it is quite easy to recreate the translation by using the translated segments stored in the translation memory to automatically re-translate the original document with minimal rework. Prior to the use of translation memory, if file corruption occurred, a project manager would have to track down another version of the translation, with all the editor’s changes in place, to continue the process. If the cause of a mishap was a hard drive failure on the translator’s machine, the work could be lost for good.
Automates the translation process
Aside from standardizing the translation workflow, translation management systems can also automate that workflow. For example, they can automate the pre-translation of documents, initiation of the translation, and passing the process from translation to editing—all without project manager intervention.
Most tools also enable a connection with a machine translation engine by way of an API. Such seamless connectivity allows for the use of both translation memory and machine translation within the translator’s interface to help speed up their work. Equipped with a well-established translation memory and an effective machine translation engine, a translator will rarely have to translate any content from scratch; instead, they only need to review and revise translation memory or MT engine matches. This can often double their productivity.
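The combination can be pictured as a simple fallback: use the translation memory when it has a match, and ask the MT engine otherwise. The sketch below assumes exact matching only, and fake_mt is a placeholder standing in for a call to an MT engine’s API; it is not any real service, and pretranslate is a hypothetical helper.

```python
def pretranslate(segments, memory, machine_translate):
    """Fill each segment from the translation memory when an exact match
    exists, otherwise from the machine translation callable."""
    results = []
    for source in segments:
        if source in memory:
            results.append((source, memory[source], "TM"))
        else:
            results.append((source, machine_translate(source), "MT"))
    return results


def fake_mt(source):
    # Placeholder standing in for a call to an MT engine's API.
    return f"[MT draft] {source}"


memory = {"Press the green button.": "Appuyez sur le bouton vert."}
document = ["Press the green button.", "Wait for the light to turn on."]

drafts = pretranslate(document, memory, fake_mt)
for source, target, origin in drafts:
    print(f"{origin}: {source} -> {target}")
```

Recording the origin of each draft matters in practice, since a translator will usually trust a translation memory match more than raw MT output and review the two differently.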
Organizes complex projects
Organizationally, translation memory tools provide a robust environment to manage complex projects that may consist of thousands of files. Using functions that allow filtering and grouping of segments, it is possible to combine content thematically and assign it to specialist translators to help improve translation quality and efficiency.
The translation memory system also retains each source file’s name and its location within a complex folder structure. Translators have no way of altering file names or the folder structure while working.
Allows for larger translation teams
The ability to manage and serve up centralized resources like translation memory and terminology databases plays a critical role not only in allowing multiple translators to work collaboratively in real time, but also in allowing large teams of them to do so. By having access to shared resources, translators can maintain better consistency. By properly configuring a server-based project, a project manager can enforce proper terminology usage and stylistic consistency, such as number and date formats, in the translation. Outside of a translation memory system, this would be a manual effort, which would make it virtually impossible for a large team to collaborate with consistency in mind.
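One way to picture terminology enforcement is as an automated check that runs on every segment. The following is a minimal sketch, assuming a tiny hypothetical termbase and a check_terminology helper that uses plain substring matching; real tools also handle inflected forms, forbidden terms, and number and date formats.

```python
import re

# Hypothetical termbase: source term -> required target term.
termbase = {"power cable": "câble d'alimentation", "button": "bouton"}


def check_terminology(source, target, terms):
    """Return source terms whose required translation is absent from the target."""
    issues = []
    for source_term, target_term in terms.items():
        in_source = re.search(rf"\b{re.escape(source_term)}\b", source, re.IGNORECASE)
        if in_source and target_term.lower() not in target.lower():
            issues.append(source_term)
    return issues


source = "Connect the power cable."
flagged = check_terminology(source, "Branchez le cordon électrique.", termbase)
clean = check_terminology(source, "Branchez le câble d'alimentation.", termbase)
print(flagged)
print(clean)
```

Because the same check runs for every translator on the team, the approved term is enforced uniformly no matter how many people contribute.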
Creates valuable linguistic resources
In the modern context of translation automation, the most valuable contribution of translation memory systems is the creation of high-quality language resources—namely translation memories and terminology databases. Aside from the multiple benefits highlighted above, these bilingual resources are the veritable gold of translation production. Within the context of daily translation work, translation memory provides immediate returns to translation teams working on similar documents, usually for the same customer.
In addition, for organizations looking to improve translation automation, translation memories can provide much longer-term returns as a basis for training machine translation engines. High-quality translation memories are the foundation for training specialized and accurate machine translation engines. Bilingual data that establishes the equivalence between the source and target languages enables machine learning and can have a meaningful impact on the quality of the generated content, even with smaller amounts of data.
Terminology databases (termbases) are highly valuable because they help linguists precisely manage critical terminology. Using the correct terminology for the subject matter being translated drastically improves accuracy, readability, and acceptance of the translation by the target audience. With machine translation engines that support the use of run-time glossaries, terminology databases can help improve the accuracy of machine translation engines as well.
Translation memory may seem like old technology given its ubiquity and longevity in the language industry. However, the age of this technology does not signal impending obsolescence. On the contrary, it underscores its critical position in the age of translation automation. Look at how your organization is using translation memory technology and whether it is getting the full benefit in translator productivity, project management, and faster adoption of automation for creating new translations.
Scott Bass
Principal Consultant, LocFluent Consulting, Inc., and memoQ Brand Ambassador