This article is based on the talk I gave on January 22, 2025, at NOVA University in Lisbon as part of the memoQ University Summit and the New Voices in Portuguese Translation Studies conference.
AI models now translate, and they translate abundantly. Still, we say, only humans can learn how to translate. Only humans know they are translating. Only they know they are learning. Only they know they want to learn. And only humans know they want to produce the best possible quality.
Six Statements About AI
AI Output Needs to Be Reviewed
AI is trained on human translations, and even human translations need editing. Early in my career, there was the case of the “lying translator” who would gloss over concepts they did not understand and cover up the gaps with fluent language – much like early neural machine translation. When I first tested a custom neural machine translation engine, something about it seemed familiar. Then I realized we had a “lying translator” here – only that, ten years earlier, it had been human.
Shorter deadlines affect the quality of human output, and human errors have certain characteristics. Under time pressure, humans will latch on to the logic and structure of the source language, failing to convey the meaning in the target language at the higher levels (utterance, narrative, context).
AI systems are trained with imperfect human translations. As a result, even without all the other factors, there is no such thing as perfect automatic translation.
AI Systems Degrade Over Time [1], [2], [3]
The performance of information systems, including AI models, degrades over time. The most common form of degradation in modern AI systems is “data drift”: as time passes, real-world inputs move away from the data the model was trained on.
AI models are trained on a body of training data that exists at a specific moment in time and tested on prompts and test data from that same moment. The training data will therefore eventually become outdated. Typical inputs will change, especially if the system is exposed to the public, and the AI model's answers will become less and less adequate.
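To make “drift” tangible, here is a minimal sketch of one common monitoring idea: compare the distribution of recent inputs with the distribution seen around training time. Everything here is illustrative – the function names are invented, and real monitoring systems use more robust statistics.

```python
# A toy drift check: compare token distributions of training-time prompts
# and recent prompts. All names are invented for illustration.
from collections import Counter
import math

def token_distribution(texts):
    """Whitespace-tokenize and return relative token frequencies."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}

def drift_score(train_texts, recent_texts, eps=1e-9):
    """Symmetric KL (Jeffreys) divergence between the two distributions.
    Zero means identical inputs; higher values mean stronger drift."""
    p = token_distribution(train_texts)
    q = token_distribution(recent_texts)
    return sum(
        (p.get(t, eps) - q.get(t, eps)) * math.log(p.get(t, eps) / q.get(t, eps))
        for t in set(p) | set(q)
    )

# A new product name dominating recent prompts pushes the score up.
print(drift_score(["set up the web server", "restart the web server"],
                  ["set up the QuantumHub appliance", "QuantumHub setup steps"]))
```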
Another cause of degradation is recursive training [4], [5]. AI-generated content has been proliferating on the internet without being marked as AI-generated. Much of this content will flow back into the training data of foundation models. Without clear marking, it is difficult and potentially costly to detect AI-generated content and exclude it from training.
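Provenance-based filtering is only possible if origin metadata exists in the first place – which is exactly the problem described above. Still, as a toy sketch of what such a curation step could look like (the field names below are invented; no universal provenance standard exists):

```python
# Hypothetical provenance filter for candidate training segments.
# The "origin" and "human_verified" fields are invented for illustration.
segments = [
    {"text": "The translation is ready.", "origin": "human", "human_verified": True},
    {"text": "Setup is complete.", "origin": "web_scrape", "human_verified": False},
    {"text": "Installation finished.", "origin": "ai_generated", "human_verified": False},
]

def usable_for_training(segment):
    # Drop anything known to be AI-generated; keep human-made content
    # or scraped content that a human has verified.
    if segment["origin"] == "ai_generated":
        return False
    return segment["origin"] == "human" or segment["human_verified"]

training_set = [s for s in segments if usable_for_training(s)]
```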
No Motivation, No Willful Conversation
AI models do not want to produce good quality. It is not the lack of motivation to produce good quality specifically – it is the lack of motivation at large. Language models do not have the ability to want anything.
Today’s AI models are entirely compulsive. Large language models only take a prompt and produce a completion. This is the one action a language model is capable of. Everything else depends on the training data.
The conversational behavior seen in public chat engines is not inherent in the language model. The language model itself cannot converse with you. It has no memory and does not remember a previous question or answer; the appearance of a conversation is created by the application around the model, which resends the entire exchange with every new turn.
When you believe you gave a prompt to ChatGPT, that's not the real prompt. The “chat” layer will first go through your input, extract keywords, run internet searches, and then expand your input into the actual prompt intended for the language model. When you type into an AI-powered chat website, you are not doing any “prompt engineering”. Prompt engineering can only be learned by direct interaction with the language model.
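The following sketch shows the shape of this wrapping, assuming a generic completion function that stands in for whatever API the model actually exposes – it illustrates the architecture, not anyone's real implementation.

```python
# The model is stateless: one prompt in, one completion out.
def complete(prompt: str) -> str:
    # Stand-in for a real model call; returns a canned completion.
    return f"(completion for a prompt of {len(prompt)} characters)"

history = []  # the memory lives in the application, not in the model

def chat_turn(user_message: str) -> str:
    history.append(f"User: {user_message}")
    # The application rebuilds the full context on every turn; the model
    # never "remembers" anything between calls.
    prompt = "\n".join(history) + "\nAssistant:"
    answer = complete(prompt)
    history.append(f"Assistant: {answer}")
    return answer

chat_turn("What is data drift?")
chat_turn("Can you give an example?")  # works only because history was resent
```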
No Connection to Reality, No Willful Learning
Language models have no connection to reality. More precisely, their only connection to reality is the training data: text, images, videos, and so on. We must remember that this is still only data, not direct perception. Any data that goes through a computer system will be turned into a sequence of numbers – and the sequence of numbers will then be treated as language. What looks multi-modal to us isn't multi-modal to the large language model. Deep down, all the input data is treated uniformly – yes, even in the reasoning modules.
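A toy illustration of this uniformity – this is not any model's real tokenizer or image encoder, just the principle that every modality ends up as one sequence of numbers:

```python
# Toy serialization: text and image both collapse into integer sequences.
text = "Bom dia"
text_ids = [ord(ch) for ch in text]              # characters -> numbers

image = [[0, 255], [128, 64]]                    # a 2x2 grayscale "image"
image_ids = [px for row in image for px in row]  # pixels -> numbers

# From here on, the model sees no "text" or "image" - only one
# uniform sequence of numbers, treated like language.
model_input = text_ids + image_ids
print(model_input)
```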
The learning process in an AI model depends entirely on human-supplied and human-interpreted data. You will not encounter a large language model that wants to – or is able to – learn new things that were not included in, or distilled from, existing training data. It cannot learn from things it observes in the real world – which is precisely how human learning works.
Some of the degradation we mentioned earlier happens because a large language model is unable to learn like a human. The large language model will not consciously recognize that its training data is outdated, and it will not be motivated to update the training data.
Because of all of the above, I do not believe that today’s AI models will evolve into artificial general intelligence. I am not saying that AGI will not emerge, but I am saying that it will emerge from something else, and the people who eventually create it will not be today’s AI giants.
Each Translation Situation Is Different [6]
The same AI model, or even the same automation approach, is not suitable for every situation. Requirements for a certain type of translation are different for every organization. Even one organization can have different quality requirements for various types of content. For example, descriptions of pharmaceutical products require radically different quality from a list of steps for setting up a web server. One can use machine-translated content for the latter, but quality cannot be compromised on technical content about a medical device. Finding the right model, the right prompt, and the right automation process is a task for a human engineer – at least for now.
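In practice, that engineering task often starts as something as mundane as a routing table. The sketch below is hypothetical – the content categories and pipeline names are invented, not features of any product:

```python
# A hypothetical mapping from content type to translation pipeline.
PIPELINES = {
    "pharma_product_description": "human_translation_with_expert_review",
    "medical_device_docs": "human_translation_with_expert_review",
    "server_setup_howto": "raw_machine_translation",
    "marketing_copy": "machine_translation_with_full_post_editing",
}

def pick_pipeline(content_type: str) -> str:
    # Unknown content defaults to the safest, most human-heavy process.
    return PIPELINES.get(content_type, "human_translation_with_expert_review")

print(pick_pipeline("server_setup_howto"))     # raw_machine_translation
print(pick_pipeline("clinical_trial_report"))  # falls back to the safest option
```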
AI Does Not Treat Every Language Equally [7], [8], [9], [10]
Language combinations are not equal. More than half of all available training data is in English; all other languages share what remains. But there is more training data for Portuguese than for Hungarian or, for instance, Latvian. It is difficult for the largest AI providers to recognize the needs of so-called under-resourced languages, and there are efforts to create balanced language technology – not only AI – for these languages.
How do you create useful language technology if you don't have enough data? Large language models (and neural machine translation engines) are all built on roughly the same deep learning technology; the difference in quality comes from the training data, which differs for each language. This means that an automation approach that works for one language combination, like English into Spanish, may not work for another.
Three Questions About the Human Role
AI systems today are very powerful and display many human-like characteristics. Organizations and translation buyers often lean toward entirely replacing human translators and other workers in the translation and localization processes. Although this will not happen across the board, the localization profession has already been disrupted. This is inevitable and irreversible, but it also raises questions:
- If translation as a profession ceases to exist, what is the risk of losing human translation?
- If the human actor is not fully replaced, but the “classic” role of a translator is gone, what is the ideal role of humans instead?
- What can today's software systems do to support sustainable practices?
What Happens If We Lose Human Translation?
For now, we can safely assume that the only way to maintain the quality of AI translation models is by training from human-made translations. If this is the case, what is the risk of eliminating human translations and ending translation as a human profession?
First, we lose the humans who know how to translate – and, eventually, we lose the language models. In other words, if we eliminate the human expert, we also lose the AI, and this will occur soon after.
What Is the (Ideal) Role of Humans Then?
Even if the tendency is not to eliminate the human expert entirely, there might not be enough work for humans to remain full-time translators. However, humans must be able to remain translators, because that is the only way the training of AI models can be supported.
This tells us that a certain percentage of high-value translations should probably always be translated by hand. There is no other way to produce human-generated data.
People need to be taught how to translate, even if the task is to post-edit most of the time. They need to know what a good translation looks like, and what cognitive decisions and processes lead to producing a good translation. Without this knowledge, there is no post-editing.
Software to Keep Human Translation Alive – and to Manage Data for AI
To ensure the proper functioning of AI systems, it is necessary to collect and maintain data. This means more than scraping the internet and using whatever we find without regard to quality or ownership. Good data management can also ensure ethical practices.
Training data will be created, collected, and curated by humans. This means you still need good translation editor interfaces. If you depend on humans to keep your data in good shape, they mustn't feel miserable while doing it.
Good data management also means taking good care of the repository of reference documents. You must know what items are included and their quality. You also need to be able to annotate them and use these annotations whenever you are training or fine-tuning any kind of machine learning application.
In other words, you need a translation management system.
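As a toy illustration of what annotation-aware curation inside such a system might look like – the schema below is invented; a real translation management system has its own – consider selecting fine-tuning data by domain and reviewed quality:

```python
# Hypothetical schema for an annotated reference repository.
from dataclasses import dataclass

@dataclass
class ReferenceDoc:
    source: str
    target: str
    domain: str          # e.g. "medical", "it"
    quality: int         # reviewer score, say 1 (poor) to 5 (publishable)
    human_reviewed: bool

def fine_tuning_corpus(repo, domain, min_quality=4):
    """Keep only reviewed, high-quality pairs from the right domain."""
    return [
        (doc.source, doc.target)
        for doc in repo
        if doc.human_reviewed and doc.domain == domain and doc.quality >= min_quality
    ]

repo = [
    ReferenceDoc("Take one tablet daily.", "Tomar um comprimido por dia.",
                 "medical", 5, True),
    ReferenceDoc("Restart the server.", "Reinicie o servidor.", "it", 3, False),
]
print(fine_tuning_corpus(repo, "medical"))  # only the reviewed medical pair
```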
Closing Words
Recently, at an airport newsstand, I came across a WIRED yearbook. It contained many predictions of where the world would go in 2025. One prediction was that smaller technology companies and smaller models would be in fashion again [11] (fair warning: this is journalism, not research).
There is a multitude of reasons why you would not present every problem to a public large language model – privacy, confidentiality, environmental impact, performance, and so on. The idea of minimum automation tells us to use the least costly and least harmful method that solves a problem – and that isn't always a large language model. The question arises, though, of how to proceed when working with a limited amount of data, particularly when no additional data is available.
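A sketch of what “minimum automation” can mean in code – every function below is a hypothetical stand-in, not a real API:

```python
# Escalation pipeline: try the cheapest adequate method first.
def tm_lookup(segment, memory):
    return memory.get(segment)                 # cheapest: reuse past human work

def small_mt(segment):
    return f"[small-model draft of: {segment}]"

def quality_ok(draft):
    return len(draft) > 0                      # stand-in for real quality estimation

def llm_translate(segment):
    return f"[LLM translation of: {segment}]"

def translate(segment, memory):
    hit = tm_lookup(segment, memory)           # 1. translation memory
    if hit is not None:
        return hit
    draft = small_mt(segment)                  # 2. small, local MT model
    if quality_ok(draft):
        return draft
    return llm_translate(segment)              # 3. the costly LLM, last

print(translate("Bom dia", {"Bom dia": "Good morning"}))  # TM hit, no model needed
```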
There is hope that AI technology will generally develop more favorably for humanity. Often, it’s a matter of power and influence. Right now, the localization profession might not have the influence to make a difference, but if we continue to speak up, this power and the shift in public trust will eventually bring about change.
[2] Bitrock team (2024): Understanding Data Drift: Causes, Effects, and Solutions. Medium.
[3] Tao Liu (2023): Monitoring Data Drift in Large Language Models. VianOps.
[11] Whittaker, M.: A new paradigm for tech investment. In: The WIRED World in 2025, p. 23.

Balázs Kis
Co-Founder & Co-CEO at memoQ