Challenges of I18n at scale — Part 1

Helder Santos

Jun 10, 2023 • 8 min read

The first part of a two-piece write-up on the internationalisation of large-scale applications in hyper-growing engineering companies.

A brief introduction…

Recently at xgeeks, we took on a very exciting challenge, one that taught us many lessons and equipped us with important tools for future endeavours involving the internationalisation of e-commerce projects in hyper-scaling engineering environments. This article is the first of a two-part write-up covering the challenges we faced and the process we ended up implementing. In the second part, we’ll go over the improvements we made to the process and how they increased the speed and quality of translations while reducing their cost.

Before we delve deeper into the topic, it is important to set the stage so that we can better understand the environment and all its constraints, which guided the rationale behind our decisions.

We bootstrapped a new team that was integrated into a larger ecosystem with over 400 engineers, in a company with over 2000 employees distributed across multiple verticals (engineering, logistics, operations, finance, legal, human resources, among others…). The main focus of this new team was to plan and implement internationalisation across all the customer-facing apps and some internal tooling, enabling these to display different propositions in multiple languages and markets. One critical factor to keep in mind is that this engineering branch was growing quickly, and was expected to become even larger in the upcoming months, as the company expanded into new markets and acquired other companies in the process. This meant the solution had to apply to multiple different technologies and be scalable to accommodate future expansion.

Kicking things off

When we got onboarded onto the project, our main concern was gathering as much information as possible to understand the business, the structure of the company, the existing codebase, and how this codebase was maintained and developed.

During this period, we made a couple of valuable discoveries that would be crucial to our decision-making moving forward.

We wanted to devise and implement a single strategy for different codebases, written in different languages and frameworks, these being:

Next.js with TypeScript applications that powered the customer-facing side of the website;
React with TypeScript applications for small pieces of logic that are integrated into the main apps;
React with TypeScript micro frontends that were used across the main apps and which shared dependencies with them;
Laravel application for operations management and product preparation.

Our goal was to provide a solution that could be performant, simple to use and seamlessly fit the technologies listed above (using a similar file structure both on PHP & JS-based projects).

The decision process

To choose the internationalisation framework and Translation Management System (TMS), it was important to get as much feedback as possible from the engineering teams. To gather this feedback, we created an RFC (Request for Comments) and made it available for everyone to comment on and improve.

This was a precious tool as it surfaced information that we didn’t have at the time, allowing a team of hundreds of engineers to discuss a single topic without repetition and background noise. Another benefit of using this approach is that all conversations on the topic are centralised and saved for historic purposes, meaning that later on, anyone can look at the document and understand why some decisions were taken and which alternatives were analysed.

For the internationalisation framework, we decided on using different approaches depending on the type of application we were working with. For large-scale React/Next.js projects, we would use react-i18next, which fulfilled all our goals and was flexible enough not to be tied to a specific framework. For smaller codebases like small React applications with a very specific scope, we would use a lightweight in-house implementation to keep them as small and performant as possible while keeping most of the functionality they required in terms of internationalisation (string interpolation for example).
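To make the idea of a lightweight implementation more concrete, here is a minimal sketch of a translator with string interpolation. It is not the in-house implementation described above: the names createTranslator and interpolate, the {{placeholder}} syntax and the sample keys are all assumptions made for illustration.

```typescript
// Hypothetical sketch, not the implementation used in the project.
type Resources = Record<string, string>;
type Params = Record<string, string | number>;

// Replace {{name}} placeholders with values from params.
// Placeholders without a matching param are left untouched.
function interpolate(template: string, params: Params = {}): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(params, name) ? String(params[name]) : match
  );
}

// Build a translate function bound to one language, with an optional
// fallback language (English here). Unknown keys return the key itself.
function createTranslator(resources: Resources, fallback: Resources = {}) {
  return function translate(key: string, params?: Params): string {
    const template = resources[key] ?? fallback[key] ?? key;
    return interpolate(template, params);
  };
}

// Sample resources (assumed keys and copy).
const en: Resources = {
  "cart.items": "You have {{count}} items in your cart",
  "cart.empty": "Your cart is empty",
};
const pt: Resources = { "cart.items": "Tem {{count}} artigos no carrinho" };

const t = createTranslator(pt, en);
console.log(t("cart.items", { count: 3 }));
console.log(t("cart.empty")); // not in pt, so the English fallback is used
```

A helper of this size keeps the bundle small, at the cost of features such as pluralisation, which a full framework would normally provide.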

For the TMS, and after weighing many technical and business-related parameters, it was decided that we would be moving forward with Smartling.

Translation workflows at scale

One important topic that has to be addressed when thinking about internationalisation at scale is defining a translation workflow to be used to deliver content to different markets as quickly and efficiently as possible. This kind of workflow will have to meet some important requirements, which can sometimes be challenging depending on the scale of the project and the timelines we have for delivery.

Below I will go through these requirements, explaining how we fulfilled them and some of the challenges they posed.

Translation Memory

An important tool to be aware of when starting to internationalise a website is a unified Translation Memory. This works like a database of translated content that allows translation tools to automatically suggest previously translated identical text, reducing the time and cost associated with repeated translation efforts.

One of the main benefits of translation memory is consistency. By reusing previously translated content, translation memory helps ensure that terminology, tone, and style are consistent across all translated versions of a website. This consistency not only enhances the user experience but also helps maintain the integrity of the brand.

In addition to consistency, translation memory also helps to streamline the translation process. Translators can easily access previously translated content and can quickly identify where changes or updates are needed. This not only saves time but also reduces the likelihood of human error, further improving the quality and accuracy of the translations.

Finally, translation memory also provides valuable insights into the localisation process. By tracking the use of specific terms and phrases, translation memory can help identify areas where additional attention may be needed, such as updating out-of-date translations or adjusting terminology to better meet the needs of a particular audience.

In summary, translation memory is an essential tool for website internationalisation and something you should have in place, especially if you are working with a large, expanding website with a strong textual component.
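Conceptually, the core of a translation memory can be sketched as a lookup from source text to an earlier translation per target language. The class below is a simplified illustration with assumed names (TranslationMemory, store, suggest) and handles exact matches only. It does not reflect how any specific TMS stores or matches its data.

```typescript
// Hypothetical sketch of an exact-match translation memory.
class TranslationMemory {
  // target language -> (normalised source text -> translation)
  private entries = new Map<string, Map<string, string>>();

  // Collapse whitespace so trivially different copies of a string still match.
  private static normalise(sourceText: string): string {
    return sourceText.trim().replace(/\s+/g, " ");
  }

  // Record a translation so it can be suggested again later.
  store(targetLanguage: string, sourceText: string, translation: string): void {
    const perLanguage = this.entries.get(targetLanguage) ?? new Map<string, string>();
    perLanguage.set(TranslationMemory.normalise(sourceText), translation);
    this.entries.set(targetLanguage, perLanguage);
  }

  // Return a previous translation of identical text, if there is one.
  suggest(targetLanguage: string, sourceText: string): string | undefined {
    return this.entries.get(targetLanguage)?.get(TranslationMemory.normalise(sourceText));
  }
}

const memory = new TranslationMemory();
memory.store("de", "Add to cart", "In den Warenkorb");

console.log(memory.suggest("de", "  Add  to cart ")); // same text, different spacing
console.log(memory.suggest("fr", "Add to cart"));     // nothing stored for French yet
```

Keeping a single shared instance of this data, rather than one per team or per application, is what allows identical strings to be translated once and reused consistently.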

Difference between market and language

Market and language are two distinct but related concepts. Market refers to the group of people or businesses that a product is intended for, while language refers to the means of communication used by that group of people. A market is defined by factors such as geographic location, demographic characteristics, and purchasing habits, while language is determined by cultural and linguistic factors. For example, you might want to sell specific products in the French market, while allowing customers to browse your website and purchase those products in either French or English.

It is vital that this distinction is made clear to everyone involved in conversations around internationalisation, especially in large technological environments and during rapid expansion phases where you might have slightly different propositions and languages for each market. If this distinction is not made clear from the get-go, it is easy for people to get lost in important conversations around proposition definition and internationalisation in general.
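One way to keep this distinction explicit in code is to model markets and languages as separate concepts, with each market declaring which languages it supports. The sketch below is illustrative only: the Market shape, the sample markets and the resolveLanguage helper are assumptions, not the data model used in the project.

```typescript
// Hypothetical sketch: market and language modelled as separate concepts.
type LanguageCode = "en" | "fr" | "de";

interface Market {
  code: string;                       // where the proposition is sold
  currency: string;                   // a market-level concern
  defaultLanguage: LanguageCode;      // used when no valid language is requested
  supportedLanguages: LanguageCode[]; // languages customers can browse in
}

// Sample configuration (assumed values).
const markets: Record<string, Market> = {
  FR: { code: "FR", currency: "EUR", defaultLanguage: "fr", supportedLanguages: ["fr", "en"] },
  DE: { code: "DE", currency: "EUR", defaultLanguage: "de", supportedLanguages: ["de", "en"] },
};

// Pick the language to render for a market, falling back to the market default
// when the requested language is missing or not supported there.
function resolveLanguage(marketCode: string, requested?: string): LanguageCode {
  const market = markets[marketCode];
  if (!market) {
    throw new Error(`Unknown market: ${marketCode}`);
  }
  const match = market.supportedLanguages.find((language) => language === requested);
  return match ?? market.defaultLanguage;
}

console.log(resolveLanguage("FR", "en")); // French market, browsed in English
console.log(resolveLanguage("FR", "de")); // German is not offered in the French market
```

With this separation, the product catalogue, pricing and legal copy hang off the market, while the strings shown to the customer depend only on the resolved language.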

Context is king

Another crucial aspect to be taken into account is ensuring translators have the right context and understanding of the proposition, as this will help ensure that technical terms and phrases related to it are accurately conveyed and that the website’s message and tone resonate with the target audience. Cultural and situational understanding is paramount for effectively communicating with customers in the target language. Without proper context, a translator may misinterpret the meaning of certain terms and phrases, resulting in confusion and miscommunication. For example, an online car marketplace targeting buyers in Germany should be translated differently than one targeting buyers in the United States, as the German market may place more emphasis on fuel efficiency and environmental impact, while the American market may place more emphasis on horsepower and performance. A translator who doesn’t understand the context of the website may not be able to accurately convey these cultural nuances, resulting in a message that falls flat with the target audience.

While this may sound easy, it can prove to be a daunting task when dealing with large-scale e-commerce projects, with multiple distributed teams working on different parts of the sales funnel.

Below are the three main tools we’ve used to deliver a unified and contextualised message to our customers:

Studying the market and adjusting the proposition
This involves researching the target audience, their needs, and preferences, as well as understanding the competitive landscape and legal constraints of the market. By understanding the market, a website can tailor its proposition to align with the cultural, linguistic, and legal nuances of the market. This can help increase its chances of success by effectively connecting with its target audience and standing out from the competition.
Maintaining a document to provide context to translators
A great way to ensure translators have context on the proposition they are translating is sharing a document with information about the website’s target audience, its message and tone, and any industry-specific terms or phrases that may be used on the website. This information can help translators understand the context in which the text is written and ensure that the meaning is accurately conveyed in the target language. Providing translators with the necessary context helps to ensure that the website’s message is effectively communicated in the target language.
Creating efficient communication channels with the translation agency
Although asynchronous communication can work for most scenarios, having a direct channel to the translation agency can be invaluable. This can make it easier to deal with issues that may arise during the translation process, such as questions about terminology or formatting. It can also be used to align deadlines between the company and the translation agency so that any potential delays are identified and addressed promptly. Overall, this helps ensure that the translation process runs smoothly and that the final product meets the company’s expectations.

Initial approach

One of our main goals in terms of technical implementation was ensuring a smooth integration of the translation process into the development cycle. There are multiple approaches available to meet this goal. I’ll go through the one we put into practice, detailing its benefits and shortcomings.

Below is an illustration of how the initial translation workflow fitted into our CI/CD pipelines.

As seen above, when a specific implementation did not require any translations, the deployment cycle was kept unchanged and there was no entropy added to the process. On the other hand, if a feature added new strings that would be shown to a customer, these strings would be placed in a resources file (in our case, we would always add the strings to the English resources, and then translate them into all other languages). The process mainly revolves around the following steps:

A mechanism inside the deployment pipeline checks for changes to the English resource files. If any file was changed, the following actions happen:
Gather the strings that were added/updated;
Send the strings to the Translation Management System;
Block the deployment pipeline to ensure we wait for translations before moving forward;
The translation agency gets notified about these strings, and will translate them directly in the TMS;
After the strings are translated, they are sent back to the code repository and the deployment pipelines are unblocked, enabling teams to deploy their code to test and production environments.
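The first step of this workflow, gathering the strings that were added or updated, can be sketched as a comparison between two versions of an English resource file. This is a simplified illustration rather than the pipeline code used in the project. The function names, the flat key/value resource shape and the sample keys are assumptions, and the actual call to the TMS is left out.

```typescript
// Hypothetical sketch of the "gather added/updated strings" step.
type ResourceFile = Record<string, string>;

interface ChangedString {
  key: string;
  value: string;
  change: "added" | "updated";
}

// Compare the previous and current English resources and collect
// every string that is new or whose text has changed.
function collectChangedStrings(previous: ResourceFile, current: ResourceFile): ChangedString[] {
  const changes: ChangedString[] = [];
  for (const [key, value] of Object.entries(current)) {
    if (!Object.prototype.hasOwnProperty.call(previous, key)) {
      changes.push({ key, value, change: "added" });
    } else if (previous[key] !== value) {
      changes.push({ key, value, change: "updated" });
    }
  }
  return changes;
}

// In the workflow described above, any change to the English resources
// means the pipeline waits for translations before deploying.
function shouldBlockDeployment(changes: ChangedString[]): boolean {
  return changes.length > 0;
}

// Sample data (assumed keys and copy).
const previousEnglish: ResourceFile = { "checkout.title": "Checkout", "checkout.pay": "Pay" };
const currentEnglish: ResourceFile = {
  "checkout.title": "Checkout",
  "checkout.pay": "Pay now",
  "checkout.terms": "Terms and conditions",
};

const changedStrings = collectChangedStrings(previousEnglish, currentEnglish);
console.log(changedStrings);
console.log(shouldBlockDeployment(changedStrings));
```

Features that leave the English resources untouched produce an empty list, so the deployment proceeds without any extra step.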

The downsides of the approach

The mechanism above allowed for an effortless integration between tech teams and the translation agency, using the TMS as a bridge. However, it came with a few shortcomings we needed to tackle, which are listed below:

Development cycle delay
With the above implementation, our deployment pipelines were blocked until a translation was fully complete, which prevented us from testing the implementations in the test environment and consequently, drastically slowed down the pace of development.
Lack of context from development teams
Another pain point early on was the lack of context some developers had on this entire process, which led to the internationalisation team having to do multiple sessions explaining the process in detail to different teams to avoid errors and confusion.
Lack of context from the translation agency
In the early stages of the implementation, there were quite a few issues with mistranslations due to the lack of context and the disconnect between the engineers and the translation agency.

These shortcomings were spotted early on and we started working on ways to tackle each one of them to ensure a smoother process, which we will go over in part two of this article.

The journey continues: improving the process

Thank you for taking the time to read this article. We hope you found it helpful and informative.

In part 2, we will dive deeper into the improvements we made to the process that helped mitigate the issues outlined above. Make sure you read it, as it will also contain suggestions for alternative approaches to internationalisation that might fit your project better.