The Ultimate Guide To Multimedia Localization

Does Multimedia Localization seem like a huge commitment? Well, there are certainly a lot of details involved. But with the consistent rise in interest in Multimedia Localization Services in industries such as eLearning, gaming, and app development, we saw fit to create a comprehensive guide that dissects every step of the Localization process. What’s more, we have included an overview of a uniform method to achieve cost-effective, quality products from the client perspective, as well as an overview of how to decipher voice-over and subtitling costs.

Reading Time: 15 mins


Chapter 1: Multimedia Localization Services

While Multimedia Localization is available as a stand-alone service, it is often part of a bigger project. Not many services are limited to text exclusively, which is why quality-focused Language Service Providers (LSPs) specialize in a wide variety of industries.

So, you may have already gathered that Multimedia Localization can get very complex. What challenges does the translation process have to offer? Well, we’ll get to that. But the good news is that the most common linguistic roadblocks can be overcome by committing to a rigorous TEP (Translation, Editing, Proofreading) Process performed by in-country, native professionals.

Computer-Assisted Translation (CAT) Tools integrated with a glossary and a Translation Memory (TM) can automate parts of this process by drawing on a master list of technical jargon, brand-specific product names, and keywords, as well as a reusable terminology base. To preserve brand identity, a Style Guide listing the stylistic, structural, and linguistic preferences should also be provided.

After the translation process is completed and any files that need conversion have been checked by the Localization Experts, a rigorous quality check should be performed to determine, among other aspects, whether the content is easy to read, whether any conversions have been implemented without errors, and whether the final product is linguistically and culturally appropriate.

Chapter 2: Transcription And Translation

Now, let’s take one step back! Before we can actually start translating, we need a transcription of the multimedia files in need of translation and localization.

Transcription is about transforming spoken language into written text. It is the first step in our subtitling, voice-over, and dubbing services, but it is also offered as a stand-alone service.

While Transcription Services can be performed manually by professional linguists, certain project types could benefit from automating at least a part of this process. Let’s dig a bit deeper into this.

Transcription and Automatic Speech Recognition (ASR): Modern Day Partners?

When you hear the term ASR, you might instantly think about the automatically generated subtitles on YouTube. Depending on the audio quality and pronunciation of the narrator, the output can leave something to be desired. 

That being said, automation has been replacing certain parts of a job initially executed by humans exclusively.

But here is the important part that we like to stress as an LSP dedicated to translation quality: not every part can be replaced by a machine, but with the right tools and workflow, automation can be an effective support for the human translator.

A typical manual transcription workflow starts by dividing the source file between multiple transcribers, which adds up not only in time but also in hourly or per-page wages.

Saving time in particular has become a prime driver behind automation tools, which can turn a single audio or video file into a transcription in a fraction of that time.

However, we have to understand both the strengths and the limitations when working with ASR Software to speed up the process. You see, ASR is still far from perfect. You won’t receive a flawless product, but rather a time-coded draft ready to be submitted to a professional, human eye for the editing process prior to translation.
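
To make the idea of a "time-coded draft" more concrete, here is a minimal sketch of how ASR output could be rendered for a human editor. It assumes the ASR tool returns segments as (start, end, text) tuples in seconds, and it uses the SubRip (SRT) layout as one common example of a time-coded format; the function names are hypothetical and no specific ASR tool is implied.

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format an offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start, end, text) tuples as numbered SRT cues.

    The segment structure is an assumption for illustration; real ASR
    tools each have their own output format.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    draft = [(0.0, 2.4, "Welcome to the course."), (2.4, 5.0, "Let's get started.")]
    print(to_srt(draft))
```

The draft produced this way still needs the human editing pass described above before it moves on to translation.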

A 2018 research paper investigating the place of ASR in the translation workflow found, through various case studies, that time can be saved and quality still maintained if the linguists are assigned to the source-text editing, translation, and revision processes. Assigning the human contribution to the wrong processes would compromise both time and quality.

Let us give you a couple of tips, because the human editing process will be helped along significantly if the ASR output is optimized. So, if you have any personal influence over the recording process, keep the following in mind:

  • Use clear speech and articulation and avoid multiple speakers taking the floor at once. ASR Software has gotten pretty smart in recognizing the nuances of the human voice, but this can be hindered by poor quality recordings.
  • Limit background noise so the voices stand out.
  • Research top-notch ASR Software that would suit your goals best. Dragon Naturally Speaking, Microsoft Translate, and Webcaptioner are popular examples.

Chapter 3: Subtitling Services

Subtitling is one of the most sought-after translation services when it comes to Multimedia Localization. 
While subtitling services involve translating the source text into the target language, captioning is the service that offers subtitled content in the same language as the spoken text.

Once the content has been transcribed in its original language, either manually or through ASR with human QA, the translation process outlined above is carried out, after which the localization engineers apply the conversion.

The process is completed by performing a final Quality Assurance check to make sure the subtitles are easy to read, perfectly timed, and accurate.

While subtitles are designed to provide the viewer with timed content in their own language, captioning exists to help those hard of hearing to follow a video in the source language. 

Our specialized team of linguists and technicians leverage our innovative tools to provide both closed and open captions that are perfectly synchronized as well as 100% accurate.

Scalable Subtitling Projects: Human Translation

While we would recommend a hybrid of a linguist’s expertise and automation for any larger projects you may have in mind (see below), smaller and mid-sized projects should be submitted to human translation exclusively.

While partial automation can be helpful in larger projects, it could actually hinder the process for any smaller or mid-sized productions.

As we will discuss below, subtitles are rather unique. They are conversational, full of unique references, and often feature incomplete sentences. While properly trained MT-Engines can overcome this, the investment is not worth it unless your project plays out on a big scale. For smaller projects, the classic TEP-Method, streamlined in a workflow supported by a TMS and CAT-Tool, is ideal.

During the Post-Production, it is essential that the engineers and linguists confirm the timing of the subtitles, register any differences in the tone of the source and target languages to resolve them, and make sure the cultural and linguistic nature of the target language is respected.

Machine Translation: Furthering The Automation Process For Larger Projects?

What if we were to take the level of automation one step further beyond the Transcription, and replace the ‘T’ in the TEP Process with another machine? Sure, it can be done. But not so fast. 

Let’s have a look at Machine Subtitling – and how this could be beneficial for those who are involved in huge subtitling productions and audiovisual translations. 

Automation was designed to save time in an industry that demands high-quality delivery in the shortest possible turnaround time (TAT). With TV-Shows, online training courses, and video marketing tools bursting onto the scene, eager to keep up with market demands, automation has been a welcome sight. And the professional linguists are here to fill in some of the holes left by machines and add human intuition and a genuine sense of connection.

The Evolution Of Machine Translation Engines

The constant pressure of providing translations at a fast rate might make you wonder about Machine Translation (MT) as well as ASR. And yet, you may instantly ask yourself the question, do subtitles and MT actually work together? 

After all, subtitles aren’t dry, informational content like in most document translations. Instead, they are more conversational and informal.

Quick answer? Yes, MT-Engines can be used. Long answer? It needs to be done correctly. For this reason, MT-Engines used for Machine Subtitling should be:

  • Specifically trained in dialogue so it can grow into a tool that can tackle more complex content and sentence structures over time
  • Provided with an assembled version of the fragmented content (which is what subtitles essentially are) so the context can become a bit more clear
  • Given correct sentence boundaries so the engines clearly know where the sentences begin and where they end
  • Selected based on the language pair as well as the training capabilities
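
The second and third points above can be sketched in a few lines. This is only an illustration, assuming the subtitle cues are available as a plain list of text fragments and that a sentence ends on ".", "!" or "?"; the function name is hypothetical, and real pipelines would need more careful boundary detection (abbreviations, ellipses, quotes).

```python
import re

# Naive assumption: a fragment closes a sentence if it ends in . ! or ?
SENTENCE_END = re.compile(r"[.!?]$")


def assemble_sentences(cues):
    """Merge consecutive subtitle fragments into full sentences.

    Returns a list of (sentence, cue_indices) pairs so the translated
    sentence can later be split back across the original cues.
    """
    sentences = []
    buffer, indices = [], []
    for i, text in enumerate(cues):
        fragment = text.strip()
        buffer.append(fragment)
        indices.append(i)
        if SENTENCE_END.search(fragment):
            sentences.append((" ".join(buffer), indices))
            buffer, indices = [], []
    if buffer:  # trailing fragment without closing punctuation
        sentences.append((" ".join(buffer), indices))
    return sentences


if __name__ == "__main__":
    cues = ["So, what I wanted", "to tell you is", "that we're leaving.", "Now?"]
    for sentence, cue_ids in assemble_sentences(cues):
        print(cue_ids, sentence)
```

Keeping the cue indices alongside each sentence is a design choice that makes it possible to map the machine output back onto the original timing afterwards.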

Post-Editing: How Human Intuition Matches Machine Speed

That’s right, even if the translation itself is provided by an engine, the Editing and Proofreading still have to be performed by a human linguist. 

Post-Editing, however, differs from the traditional editing process in that it requires a linguist with specialized knowledge of the engines that are used as well as the common challenges in working with them. Curious about which MT-Engine related challenges Post-Editors are trained to overcome? Here are a few:

  • Each locale, language, and client will have their own specific subtitle Style Guides and expectations when it comes to connecting with their audience, even in a more structural sense. Pauses, speaker changes, and the acceptable reading speed of each language will all need to be taken into account.
  • The informal and altogether more personal nature of subtitles means that cultural references and colloquialisms are more freely used. This is why native linguists should be tasked with finding appropriate alternatives that will have a similar effect on the target audience.
  • Despite their complex evolution, MT-Engines are still not completely context-aware. Linguists are needed to prevent mistranslations based on misreadings and varying grammatical structures.
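
As a small illustration of the reading-speed point, a Post-Editor's check could look something like the sketch below. It assumes cues are (start, end, text) tuples in seconds and uses a limit of 17 characters per second purely as a placeholder; the actual limit comes from the Style Guide for each language and client.

```python
def chars_per_second(text: str, start: float, end: float) -> float:
    """Reading speed of one cue, counting every character including spaces."""
    duration = end - start
    if duration <= 0:
        raise ValueError("cue must have a positive duration")
    return len(text.replace("\n", " ")) / duration


def flag_fast_cues(cues, max_cps: float = 17.0):
    """Return the indices of cues that exceed the reading-speed limit.

    max_cps is a hypothetical default, not a standard; take the real
    value from the project's subtitle Style Guide.
    """
    return [
        i
        for i, (start, end, text) in enumerate(cues)
        if chars_per_second(text, start, end) > max_cps
    ]


if __name__ == "__main__":
    cues = [
        (0.0, 2.0, "Hello there."),
        (2.0, 3.0, "This line is far too long to read in one second."),
    ]
    print(flag_fast_cues(cues))
```

Flagged cues are candidates for condensing or re-timing by the linguist; the check itself does not decide how to fix them.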

Deciphering The Cost Of Your Subtitling Services

How much do Subtitling Services cost? Well, that is a rather complicated question to answer. Subtitling Services combine different levels of expertise from the Linguistic Professionals to the Localization/Audiovisual Engineers and Project Managers.

Next to that, your personal preferences will also impact that final number. Will parts of the process be automated? Are you interested in ongoing or long-term cooperation? How long are the videos? And what are the source languages, and which are your target languages?

Let us give you a comprehensive guide that puts all these questions into perspective and allows you to enter your next translation project fully prepared!

Transcription: 100% Human Or Human & Automated?

In the Transcription phase, the audio is converted into a written document with time-codes (this is called a transcript) that can be translated. While transcription itself is usually charged per minute as calculated on the length of the file, there are some variables to consider.

  • The prices per minute are language-sensitive. English videos, for example, are cheaper than Japanese ones due to the complexity of the language and the availability of qualified linguists.
  • There are Auto Transcription Tools available that can automate this part of the process and bring the costs down. These tools are built on the concept of Automated Speech Recognition (ASR). However, it is worth keeping in mind that not every tool works for every language. 
  • If you are working with the 6 official languages of the UN (Arabic, Chinese, English, French, Russian, or Spanish), there might be plenty of tools available to you, such as Dragon Naturally Speaking, Microsoft Translate, Webcaptioner, or YouTube ASR. That being said, in the case of Arabic, you will face the challenge of it being a Right-to-Left (RTL) Language. Tools are still evolving to support this specific feature.

If you decide to take a shortcut and automate the transcription process, note that this approach will only be successful when you have a high-quality audio file with a clear distinction between speakers who don’t talk over each other. 

And even with clear-sounding audio, the extracted text would still need to be passed on to a human eye for a specialized Quality Assurance (QA) review. This QA is priced per minute, and the exact price depends on the language and the quality of the ASR output.

Translation, Linguistic QA, And Partial Automation

The translation is usually charged per word at the rate confirmed by your LSP/Translation Agency when you received your quote. But, while the rate for the translation itself is pretty straightforward, the linguistic process doesn’t stop here.

Every translation is followed by Editing and Proofreading with a specialized focus on:

  • Translation accuracy
  • Contextual relevance
  • Brand identity
  • Cultural preferences or sensitivities

Due to the complex nature of the QA-Process, the rate can remain per word or, depending on which LSP you decide to choose, hourly or flat-fee.

With the Translation Management Systems (TMS) utilized by most LSPs, parts of the translation process including scaling, preparation, content collection, and internal communications, are now largely automated. 

Not only can a project’s progress be tracked more easily through the Translation (CAT) Tool/Subtitling tool, but costs can also be cut drastically through the use of a Translation Memory (TM).

A TM makes it possible for previously translated words to be saved for future projects so that only newly translated words will be charged. The integration of glossaries and style guides maintains accuracy and consistency and helps linguists understand the brand’s identity more quickly.

While this can further trim the cost of your translation and ensure a faster turnaround time, it can also nurture the possibility of interesting long-term cooperation.
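
Here is a minimal sketch of the "only new words are charged" idea. It assumes the TM is simply a set of previously translated source segments and only looks for exact matches; the function names are hypothetical, and real CAT tools typically also score partial ("fuzzy") matches, which this sketch ignores.

```python
def normalize(segment: str) -> str:
    """Lowercase and collapse whitespace so trivial differences still match."""
    return " ".join(segment.lower().split())


def billable_words(segments, translation_memory) -> int:
    """Count the words in segments that have no exact match in the TM.

    translation_memory is assumed to be a set of normalized source
    segments that were translated in earlier projects.
    """
    new_words = 0
    for segment in segments:
        if normalize(segment) not in translation_memory:
            new_words += len(segment.split())
    return new_words


if __name__ == "__main__":
    tm = {normalize("Welcome to the course.")}
    script = ["Welcome to the course.", "Click Next to continue."]
    print(billable_words(script, tm))
```

In this example only the second segment is new, so only its words would count toward the per-word charge.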

Feel free to ask your LSP about the possibilities of any discounts related to high-volume projects and long-term strategies. And even though it is recommended to offer your LSP the opportunity to be involved in the process from the start so that everything can be streamlined, it is possible to request the services separately.

What About A Translation Without Transcript?

What if you were to send your LSP a video file without a transcription, and would like to receive a quote for the Subtitling or Video Captioning Services?

Well, it does get more complicated to provide an accurate number since the word count cannot be determined. That being said, most LSPs will still be able to give you an estimate without requiring you to provide the transcript.

First of all, as we mentioned before, different languages will hold different rates. Secondly, the file itself will also influence the rate. 

For example, documentaries with fast speech and music will be more demanding than an eLearning Video. And lastly, when it comes to the actual rate, an estimate can be made with the knowledge that, generally speaking, a minute of audio will most likely contain between 70 and 150 words.

Based on this, an average can be calculated.
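
The arithmetic behind such an estimate is straightforward. The sketch below uses the 70 – 150 words/minute range mentioned above; the per-word rate in the example is a made-up placeholder, not an actual price.

```python
def estimate_word_range(minutes: float, low_wpm: int = 70, high_wpm: int = 150):
    """Estimate the (low, high) word count for a video of the given length."""
    return round(minutes * low_wpm), round(minutes * high_wpm)


def estimate_translation_cost(minutes: float, rate_per_word: float):
    """Turn the word-count range into a (low, high) per-word cost range.

    rate_per_word is whatever rate your LSP quotes; the value used in
    the example below is purely hypothetical.
    """
    low_words, high_words = estimate_word_range(minutes)
    return low_words * rate_per_word, high_words * rate_per_word


if __name__ == "__main__":
    print(estimate_word_range(10))
    print(estimate_translation_cost(10, rate_per_word=0.10))
```

A 10-minute video would therefore be estimated at somewhere between 700 and 1,500 words, and the quote would be presented as a range rather than a single figure.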

The Localization Engineers’ Price Tag

When the linguistic side of the process is completed, it is up to the Audiovisual/Localization Engineers to make sure the technical aspects are seamlessly integrated and functioning as they should. 

To put it simply, it is the Localization Engineer’s job to make sure the video is usable for everyone. They convert the file into the format requested by the client (MP4, MOV, WMV, FLV, and so on), confirm the video quality and time codes, and determine the best method to maintain the integrity of the source file and satisfy any other technical requirements, such as whether the video caption has to be closed or open.

Considering the complex and all-encompassing nature of their job, the Engineers’ work is charged at an hourly rate.

Project Management: Ensuring A Seamless Organization

When talking about Subtitling Services, or any service really, the Project Managers (PM) are the oil that makes the gears run smoothly. They focus on which professionals would suit the projects best, design the process, and streamline the workflows that cater to the client’s demands. Long story short? They are here to make sure the team succeeds. This also includes:

  • Briefing linguists on requirements that are shared in the form of Style Guides, Glossaries, and any other important feedback a client may have
  • Managing the deadlines with an eye on allowing for a generous testing stage
  • Resolving any possible disagreements between linguists or reviewers

Since the PM-Team is present in most of the project’s proceedings, you can probably guess that costs are a bit more complicated to define. For this reason, most LSPs tend to add the PM fee as a percentage of the service in general but some LSPs don’t charge for this as a standalone service.
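
Putting the pricing units from this chapter together, a quote could be modeled roughly as follows. This is a sketch under stated assumptions: every rate in the example is a hypothetical placeholder, the class and field names are invented for illustration, and individual LSPs may structure their quotes differently (for example, charging QA hourly or as a flat fee).

```python
from dataclasses import dataclass


@dataclass
class SubtitlingQuote:
    """Illustrative quote model; all rates are supplied by the caller."""

    video_minutes: float                  # transcription is charged per minute
    word_count: int                       # translation is charged per word
    engineering_hours: float              # engineers are charged per hour
    transcription_rate_per_minute: float
    translation_rate_per_word: float
    engineering_rate_per_hour: float
    pm_fee_percent: float = 0.0           # 0 if the LSP does not charge PM separately

    def subtotal(self) -> float:
        """Sum of the per-minute, per-word, and hourly components."""
        return (
            self.video_minutes * self.transcription_rate_per_minute
            + self.word_count * self.translation_rate_per_word
            + self.engineering_hours * self.engineering_rate_per_hour
        )

    def total(self) -> float:
        """Subtotal plus the PM fee applied as a percentage."""
        return self.subtotal() * (1 + self.pm_fee_percent / 100)


if __name__ == "__main__":
    quote = SubtitlingQuote(
        video_minutes=10,
        word_count=1000,
        engineering_hours=2,
        transcription_rate_per_minute=2.0,   # hypothetical rate
        translation_rate_per_word=0.10,      # hypothetical rate
        engineering_rate_per_hour=40.0,      # hypothetical rate
        pm_fee_percent=10.0,                 # hypothetical percentage
    )
    print(round(quote.subtotal(), 2), round(quote.total(), 2))
```

Seeing the components side by side makes it easier to read a real quote and to ask your LSP which of them can be reduced through automation or a Translation Memory.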


Chapter 4: Voice-Over Translation Services

The way we are consuming information is changing and Multimedia Content isn’t an exception. 

With the rise of multiple platforms and devices, comes an increase in the so-called second screen. While we are reading an article on our laptop, we may very well also be playing an instructional video or TV-Show on our tablet. LSPs and Voice-Over Professionals are trying to accommodate this change in consumer behavior, by upping their multilingual Voice-Over game.

Localization has always been about connecting with an international audience in a very real and personal way. And when it comes to Voice-Over, the challenges are quite unique on the technical as well as the linguistic front.

Voice-Over Translation Challenges

Some translations, especially when dealing with fiction, TV shows, or specific scenes and storylines in video games, are related to dubbing rather than pure voice-overs. Here, the challenge lies in lip-syncing. In this context, the User Experience (UX) is defined by the voice-over matching the actor’s lip movements as closely as possible. This requires a very specialized service provided by a linguist with years of experience, and the accompanying intuition, to fully master lip-syncing in translation.

This brings us to certain language-specific challenges. For example, in the Arabic language, which is steadily gaining in online engagement, there is a distinct difference between the spoken language and the written language.

Additionally, when translating English into Arabic, the linguist must allow for a text expansion of up to 25 percent.

Of course, you can see where the difficulties lie here. The talent can never sound rushed and yet, they must adhere to the provided time codes and visual integrity of the video. Qualified translators will allow for breathing time and adapt the script by either condensing or rearranging the content when possible.

A large part of the Voice-Over Translation Challenges is overcome by the above-mentioned TEP-Process and the use of a CAT-Tool.

But what’s unique about video voice-overs is that the sentence flow is time-sensitive. The linguist is challenged with providing a copy that answers to the demands of the video itself and the target language, while still maintaining the original feel and message.
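
To see why text expansion matters for timing, consider this rough sketch. It applies the "up to 25 percent" expansion mentioned above and checks whether the expanded script still fits its time slot. The speaking rate of 150 words per minute is an assumed placeholder; natural pace varies by language, talent, and content.

```python
import math


def expanded_word_count(source_words: int, expansion: float = 0.25) -> int:
    """Worst-case target word count after text expansion (rounded up)."""
    return math.ceil(source_words * (1 + expansion))


def fits_time_slot(word_count: int, slot_seconds: float, words_per_minute: float = 150) -> bool:
    """Check whether a script can be spoken within the slot at a natural pace.

    words_per_minute is a hypothetical default, not a measured value.
    """
    speaking_seconds = word_count / words_per_minute * 60
    return speaking_seconds <= slot_seconds


if __name__ == "__main__":
    source_words = 100
    slot = 45.0  # seconds available in the video
    target_words = expanded_word_count(source_words)
    print(fits_time_slot(source_words, slot), fits_time_slot(target_words, slot))
```

When the expanded script no longer fits, that is the cue for the translator to condense or rearrange the content rather than ask the talent to speed up.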

Pro Tip!  Depending on your content and translation goals, you could benefit from Transcreation Services. Transcreation combines translation with creative writing and connects with your target audience in an altogether more emotional and marketing-directed way.

How To Minimize The Challenges With Voice Over Talent In Mind

To prevent the most common challenges from disrupting the recording process, the pre-production steps described above should already have weeded out many possible issues.

However, the best linguists will commit to keeping some further points in mind.

  • The original video should be reviewed prior to translation so the context is clear and the message can be maintained
  • The punctuation usage should be natural to the target language so that natural speech can be promoted.
  • The sentence structure of the source text should be respected at all times.
  • The translated content should match the length of the source content as much as possible. This way, the video can remain within its original format and the talent won’t have to alter their speed in an unnatural way.

Post-Production: Voice Over Localization Final Steps And QA

Once the voice-over translation has been recorded, it is passed on to the audio engineers, who will make sure the tracks run smoothly and adapt them to your specific application.

The audio engineer’s expertise includes but is not limited to:

  • Clearing any unfitting background noises including breathing
  • Matching the timing and speed of the source file
  • Unifying the talent’s volume and tone
  • Making sure the audio and video are perfectly aligned
  • Noting any differences between the tone of the source and target languages

After the audio engineer has worked their magic, a final QA will be applied before the final product is delivered back to you. The final QA primarily exists to weed out any possible bugs or mistakes that slipped through the cracks and ensure a flawless delivery, but the entire process from Pre-Production to Post-Production should be so streamlined that any corrections are minimal.

Delivering The Right Requirements For Your Multilingual Voice-Over Service

This part is all about you, the client, and the expectation you will be communicating. 

Take note of all the details mentioned here and know that the cost-effectiveness, the turnaround time, and the eventual quality, will be determined by how well you inform your LSP or Voice-Over Vendor of the specific aspects related to your localization project.

  1. Make A Choice Between Professional Studio Recording And Standard-Quality Recording on PC

These days, voice can be recorded with a large variety of tools from smartphones to sophisticated studios. 

Any product designed to speak to a specific audience, can benefit most from professional handling by a translation and localization provider. 

But even here, you have the choice between committing to actual professional studio time and engaging a voice-over talent working from a desktop with professional equipment. 

Have a look at what both of these options represent, and which ones would suit you best!

Professional In Studio

This option provides the highest-quality service, featuring varied and specialized equipment, and also carries the steepest price tag.

Most suitable for services including:

  • E-Learning
  • Game Voice-Overs
  • Commercials

Professional On PC

The most cost-effective option, featuring a voice-over talent working from their desktop computer fitted with a professional mic.

Most suitable for services including:

  • Presentation Voice-Overs
  • Conference Speeches
  • IVR

  2. Deliver The Scripts Of The Source Recorded Audio

Scripts or source recorded audio files are needed to determine the multilingual recording lengths.

If you have a source recorded file available, you may also share it with your Voice-over vendor along with the specific recording specs. 

This will be very useful if you would like to stay consistent across all the multilingual voice-over components of your product.

  3. Determine The Target Language(s)

Determine well in advance which languages you will be translating your content into. Each language will have different requirements and costs based on the availability of translation experts as well as language resources.

That being said, one aspect will be consistent during the entire process: the selected translators will need to be in-country, native experts with particular experience in the translation industry and knowledge of both the translation tools and the subject matter of the source content, to achieve maximum understanding and interpretation.

  4. Help Select The Perfect Voice-Over Talent

You have certainly asked yourself by now, what defines a good voice-over talent? 

Of course, this is all related to your source video and the original message you have worked to communicate to your audience.

  • Think about the number of actors used in the source video and communicate how many voiceover talents you will need.
  • Subsequently, communicate the specific age group to your vendor. Add if you are targeting a specific demographic.
  • Should the talent be male or female?
  • Be as specific as possible with the region you want to target. You see, from the perspective of a quality vendor, one specific language actually represents many different ones. 

Your audience will engage with you in a more positive way when the voice-overs are delivered not only in their native language, but also their local dialect, including the use of familiar idioms and phrases. 

For example, Modern Standard Arabic is a very common form of Arabic that is widely understood across Arabic-speaking countries, but the spoken language is divided into different dialects.

  • Consider the expectations of your audience and brand further, and think about the level of formality you want the talent to communicate.

  5. Set The Right Expectations And Request Some Samples

Sharing samples with your LSP is particularly useful for two reasons:

  • Learning what the outcome quality would look like based on the submitted requirements
  • Confirming whether you have any preferential instructions and whether you need to go back and change the requirements set in the first place. This will save you time in the end!

  6. Communicate The Technical Specs And Required Equipment

Whether you will be going with Professional Studio Quality or the Standard Quality delivered on a PC, you will need to confirm various other components that you would want included in the process. Some of these are:

  • Which Audio Software to use (ProTools, Nuendo, and so on)
  • Which type of equipment (microphones, compressors, mixing desk, speakers, …)
  • Lip-synching tools (video hardware)
  • Confirm the Plugins such as TDM, RTAS, VST, DX
  • Remote assistance
  • Confirmation that a PC is present in the booth so the script can be followed easily during the recordings

Understanding The Cost Calculation For Multilingual Voice-Over Service And How To Read The Quote

The requirements mentioned above will have a direct impact on the overall cost of your multilingual voice-over project.

That being said, it’s fair to say they are sufficient to get a preliminary quote, if not a final one for you to review and approve.

An Overview Of How Each Service Is Charged

It is probably clear to you now that Voice-Over Localization is a complex process and by extension, in terms of pricing options there is no one-size-fits-all solution. 

Typically, you will be provided with a quote that involves the Production Hours, Output Hours, use of resources, and so on.

For instance, when it comes to a professional studio-quality recording, the pricing unit could be based on either the production hours or the output hours/minutes.

On the other hand, standard recording is counted per hour/minute, and the cost for the translation itself can be included in the overall cost or listed separately.

That being said, we are not going to delve too deeply into this aspect of the Voice-Over pricing. 

After all, the pricing metrics and vendor production costs for this service could be complex depending on the requirements outlined above, and how vendors choose to introduce the costs to their clients.

So, what we want to do is to demonstrate the kind of tasks that could be involved in your project and how they would potentially be charged.

Transcription Services

A Transcription is the service that transforms the spoken language into the written script ready for translation. This service will be required if you don’t have the original scripts of your source recorded files available. Transcription is usually counted per minute, and while the outcome may vary, you can expect an estimated average of 70 – 140 words per minute.

Translation Services

The actual Translation Process comprises Translation, Editing, and Proofreading (TEP), aimed at producing the multilingual script for recording. These services are charged at a rate per word.

Script Preparation And Adaptation

The preparation and adaptation of the scripts involve:

  • Adding the time codes and breaking up long sentences to streamline the recording process for the talents if there’s no original recording to match in length
  • Matching the translated text with the length of the Source text as much as possible to synch the recording
  • Adding punctuation that is natural to the target language 

Due to the various nature of the work, this service is usually charged per hour

P.S. Note that some Voice-Over studios and LSPs don’t consider this to be a separate phase, and rather see it as part of the translation process, while others would consider it to be part of the Pre-Production tasks. If they were assigned to manage the entire translation phase, no separate costs would be charged for this service.

The Recording Stage

If you have settled on the pro-studio quality recording, the studio and the selected Voice-Over Talents, along with the technical equipment, should be confirmed prior to the recording day.

Most Voice-Over Talents charge for a full day or a half-day minimum, and so do most studios. So, make sure you are taking full advantage of the time you will be charged for!

Post Production: Putting A Price On The Final Touches

The Post-Production phase involves the technical and linguistic QA performed by the Localization Engineer.

This service is usually charged on an hourly basis, and the tasks involved in this phase may vary depending on your project’s demands. Tasks include putting the recorded audio takes together, syncing the video with audio, producing the final formats, and so on.

Generally, the scale of both the pre- and post-production work will vary hugely depending on the quality level required, and whether it is a standard or studio recording.

You have to be careful when attending to the pre-production arrangements if you decide to go with the Studio option. Getting this done incorrectly, or not according to the requirements, might lead to a redo of all or part of the production.

This is why your input is so valuable, and all the above-mentioned boxes should be ticked!

Subtitling is one of the most sought-after translation services when it comes to Multimedia Localization. 
While subtitling services involve translating the source text into the target language, captioning is the service that offers subtitled content in the same language as the spoken text.

After transcribing the content in its original language, or generating the transcript with ASR followed by human QA, the translation process outlined above is carried out, after which the localization engineers apply the conversion.

The process is completed by performing a final Quality Assurance check to make sure the subtitles are easy to read, perfectly timed, and accurate.

While subtitles are designed to provide the viewer with timed content in their own language, captioning exists to help those hard of hearing to follow a video in the source language. 

Our specialized team of linguists and technicians leverages our innovative tools to provide both closed and open captions that are perfectly synchronized as well as 100% accurate.

Scalable Subtitling Projects: Human Translation

While we would recommend a hybrid form of a linguist’s expertise with automation for any larger projects you may have in mind (see below), scalable products should be submitted to human translation exclusively.

While partial automation can be helpful in larger projects, it could actually hinder the process for smaller or mid-sized productions.

As we will discuss below, subtitles are rather unique. They are conversational, full of unique references, and often feature incomplete sentences. While properly trained MT-Engines can overcome this, the investment is not worth it unless your project plays out on a big scale. For smaller and mid-sized productions, the classic TEP-Method, streamlined in a workflow supported by a TMS and CAT-Tool, is ideal.

During the Post-Production, it is essential that the engineers and linguists confirm the timing of the subtitles, register any differences in the tone of the source and target languages to resolve them, and make sure the cultural and linguistic nature of the target language is respected.

Machine Translation: Furthering The Automation Process For Larger Projects?

What if we were to take the level of automation one step further beyond the Transcription, and replace the ‘T’ in the TEP Process with another machine? Sure, it can be done. But not so fast. 

Let’s have a look at Machine Subtitling – and how this could be beneficial for those who are involved in huge subtitling productions and audiovisual translations. 

Automation was designed to save time in an industry that is demanding high-quality delivery in the shortest possible TAT. With TV-Shows, online training courses, and video marketing tools bursting onto the scene eager to keep up with market demands, automation has been a welcome sight. And the professional linguists are here to fill in some of the holes left by machines and add human intuition and a genuine sense of connection.

The Evolution Of Machine Translation Engines

The constant pressure of providing translations at a fast rate might make you wonder about Machine Translation (MT) as well as ASR. And yet, you may instantly ask yourself the question, do subtitles and MT actually work together? 

After all, subtitles aren’t dry, informational content like in most document translations. Instead, they are more conversational and informal.

Quick answer? Yes, MT-Engines can be used. Long answer? It needs to be done correctly. For this reason, MT-Engines used for Machine Subtitling should be:

  • Specifically trained on dialogue so they can grow into tools that tackle more complex content and sentence structures over time
  • Provided with an assembled version of the fragmented content (which is what subtitles essentially are) so the context becomes clearer
  • Given correct sentence boundaries so the engines clearly know where the sentences begin and where they end
  • Selected based on the language pair as well as the training capabilities
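The second and third points above can be illustrated with a small sketch. It assumes a hypothetical `Cue` structure (start time, end time, text) and a deliberately simple rule: a sentence ends when a cue ends in ".", "!" or "?". Real subtitle files and real MT pipelines will need more careful boundary detection, so treat this as an illustration rather than a production approach.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle fragment (hypothetical structure for this sketch)."""
    start: float  # seconds
    end: float    # seconds
    text: str


def _merge(buffer):
    """Combine buffered cues into one sentence record."""
    return {
        "start": buffer[0].start,
        "end": buffer[-1].end,
        "text": " ".join(c.text.strip() for c in buffer),
        "cue_count": len(buffer),  # kept so the translation can be re-split later
    }


def assemble_sentences(cues):
    """Join consecutive cues until one ends with sentence-final punctuation."""
    sentences, buffer = [], []
    for cue in cues:
        buffer.append(cue)
        if cue.text.rstrip().endswith((".", "!", "?")):
            sentences.append(_merge(buffer))
            buffer = []
    if buffer:  # trailing fragment without closing punctuation
        sentences.append(_merge(buffer))
    return sentences


cues = [
    Cue(0.0, 1.5, "I told you"),
    Cue(1.5, 3.0, "not to come here."),
    Cue(3.0, 4.0, "Why?"),
]
for sentence in assemble_sentences(cues):
    print(sentence["text"])
```

Here the first two fragments are sent to the engine as one sentence, "I told you not to come here.", while the stored cue count lets the engineer split the translated sentence back over the original time codes.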

Post-Editing: How Human Intuition Matches Machine Speed

That’s right, even if the translation itself is provided by an engine, the Editing and Proofreading still have to be performed by a human linguist. 

Post-Editing, however, differs from the traditional editing process in that it requires a linguist with specialized knowledge of the engines that are used as well as the common challenges in working with them. Curious about which MT-Engine related challenges Post-Editors are trained to overcome? Here are a few:

  • Each locale, language, and client will have their own specific subtitle Style Guides and expectations when it comes to connecting with their audience, even in a more structural sense. Pauses, speaker changes, and the acceptable reading speed of each language will all need to be taken into account.
  • The informal and altogether more personal nature of subtitles means that cultural references and colloquialisms are used more freely. This is why native linguists should be tasked with finding appropriate alternatives that will have a similar effect on the target audience.
  • Despite their complex evolution, MT-Engines are still not completely context-aware. Linguists are needed to prevent mistranslations based on misreadings and varying grammatical structures.
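The reading-speed point in the list above is one of the few checks a Post-Editor can partly automate. Below is a minimal sketch that flags subtitles whose characters-per-second rate exceeds a limit. The limit of 17 characters per second is only a placeholder we picked for the example; the real value should come from the client's Style Guide for each language.

```python
def reading_speed(text, start, end):
    """Return characters per second for one subtitle (times in seconds)."""
    duration = end - start
    if duration <= 0:
        raise ValueError("a subtitle must have a positive duration")
    return len(text) / duration


def too_fast(subtitles, max_cps=17.0):
    """Return the subtitles that read faster than max_cps.

    `subtitles` is a list of (text, start, end) tuples; `max_cps` is an
    assumed, style-guide-dependent limit, not an industry constant.
    """
    return [s for s in subtitles if reading_speed(*s) > max_cps]


subtitles = [
    ("Hello there, how are you?", 0.0, 1.0),  # 25 characters in 1 second
    ("Yes.", 1.0, 2.0),                       # 4 characters in 1 second
]
for text, start, end in too_fast(subtitles):
    print(f"Too fast: {text!r} ({start}s to {end}s)")
```

A flagged subtitle still needs a human decision: the linguist can shorten the wording, or the engineer can extend the display time.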

Deciphering The Cost Of Your Subtitling Services

How much do Subtitling Services cost? Well, that is a rather complicated question to answer. Subtitling Services combine different levels of expertise from the Linguistic Professionals to the Localization/Audiovisual Engineers and Project Managers.

Next to that, your personal preferences will also impact that final number. Will parts of the process be automated? Are you interested in ongoing or long-term cooperation? How long is the video (or videos)? And what are the source languages, and which are your target languages?

Let us give you a comprehensive guide that puts all these questions into perspective and allows you to enter your next translation project fully prepared!

Transcription: 100% Human Or Human & Automated?

In the Transcription phase, the audio is converted into a written document with time-codes (this is called a transcript) that can be translated. While transcription itself is usually charged per minute as calculated on the length of the file, there are some variables to consider.

  • The prices per minute are language-sensitive. English videos, for example, are cheaper than Japanese ones due to the complexity of the language and the availability of qualified linguists.
  • There are Auto Transcription Tools available that can automate this part of the process and bring the costs down. These tools are built on the concept of Automatic Speech Recognition (ASR). However, it is worth keeping in mind that not every tool works for every language.
  • If you are working with the 6 official languages of the UN (Arabic, Chinese, English, French, Russian, or Spanish), there might be plenty of tools available to you, such as Dragon Naturally Speaking, Microsoft Translate, Webcaptioner, or YouTube ASR. That being said, in the case of Arabic, you will face the challenge of it being a Right-to-Left (RTL) Language. Tools are still evolving to support this specific feature.

If you decide to take a shortcut and automate the transcription process, note that this approach will only be successful when you have a high-quality audio file with a clear distinction between speakers who don’t talk over each other. 

And even with clear-sounding audio, the extracted text still needs to be passed to a human eye for a specialized Quality Assurance (QA) review. This QA is also priced per minute, and the exact price depends on the language and the quality of the ASR output.

Translation, Linguistic QA, And Partial Automation

The translation is usually charged per word at the rate confirmed by your LSP/Translation Agency when you received your quote. But, while the rate for the translation itself is pretty straightforward, the linguistic process doesn’t stop here.

Every translation is followed by Editing and Proofreading with a specialized focus on:

  • Translation accuracy
  • Contextual relevance
  • Brand identity
  • Cultural preferences or sensitivities

Due to the complex nature of the QA-Process, the rate can remain per word or, depending on which LSP you decide to choose, hourly or flat-fee.

With the Translation Management Systems (TMS) utilized by most LSPs, parts of the translation process including scaling, preparation, content collection, and internal communications, are now largely automated. 

Not only can a project’s progress be tracked more easily through the Translation (CAT) Tool/Subtitling tool, but costs can also be cut drastically through the use of a Translation Memory (TM).

A TM makes it possible for previously translated words to be saved for future projects so that only newly translated words will be charged. The integration of glossaries and style guides maintains accuracy and consistency, and helps linguists understand the brand’s identity more quickly.
While this can further trim the cost of your translation and ensure a faster turnaround time, it can also nurture the possibility of interesting long-term cooperation.
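To make the TM idea concrete, here is a simplified sketch of how the chargeable word count could be worked out. It assumes the TM is a plain dictionary of source segments to translations and only looks at exact matches. Actual CAT Tools also handle partial ("fuzzy") matches, and how those are billed differs per LSP, so the function below is an illustration only.

```python
def billable_words(segments, translation_memory):
    """Count the words in segments that have no exact match in the TM.

    `segments` is a list of source sentences; `translation_memory` maps
    previously translated source sentences to their translations.
    Returns (word_count, new_segments).
    """
    new_segments = [s for s in segments if s not in translation_memory]
    word_count = sum(len(s.split()) for s in new_segments)
    return word_count, new_segments


# Hypothetical TM with one sentence from an earlier project.
tm = {"Click Next to continue.": "Cliquez sur Suivant pour continuer."}
segments = ["Click Next to continue.", "Welcome to the course."]

words, new = billable_words(segments, tm)
print(words, new)
```

In this example only "Welcome to the course." is new, so 4 words would be charged instead of 8.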

Feel free to ask your LSP about the possibilities of any discounts related to high-volume projects and long-term strategies. And even though it is recommended to offer your LSP the opportunity to be involved in the process from the start so that everything can be streamlined, it is possible to request the services separately.

What About A Translation Without Transcript?

What if you were to send your LSP a video file without a transcription, and would like to receive a quote for the Subtitling or Video Captioning Services?

Well, it does get more complicated to provide an accurate number since the word count cannot be determined up front. That being said, most LSPs will still be able to give you an estimate without requiring you to provide the transcript.

First of all, as we mentioned before, different languages will hold different rates. Secondly, the file itself will also influence the rate. 

For example, documentaries with fast speech and music will be more demanding than an eLearning Video. And lastly, when it comes to the actual rate, an average can be taken with the knowledge that, generally speaking, a minute of transcription is likely to fall in the range of 70 – 150 words.

Based on this, an average can be calculated.
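As a rough sketch of that calculation, the snippet below turns a video length into a word-count range and a translation cost range. The 70 and 150 words-per-minute bounds come from the paragraph above; the per-word rate is a made-up figure for illustration, since the real rate depends on your LSP and language pair.

```python
def estimate_word_range(minutes, low_wpm=70, high_wpm=150):
    """Return the (low, high) expected word count for a video of `minutes`."""
    return minutes * low_wpm, minutes * high_wpm


def estimate_translation_cost(minutes, rate_per_word, low_wpm=70, high_wpm=150):
    """Return the (low, high) translation cost for a per-word rate."""
    low_words, high_words = estimate_word_range(minutes, low_wpm, high_wpm)
    return low_words * rate_per_word, high_words * rate_per_word


low, high = estimate_word_range(10)
print(f"Expected words: {low} to {high}")

# 0.10 per word is a hypothetical rate, not a quoted price.
cost_low, cost_high = estimate_translation_cost(10, rate_per_word=0.10)
print(f"Expected translation cost: {cost_low:.2f} to {cost_high:.2f}")
```

For a 10-minute video, this gives a range of 700 to 1,500 words, which is why a preliminary quote is usually presented as an estimate rather than a fixed number.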

The Localization Engineers’ Price Tag

When the linguistic side of the process is completed, it is up to the Audiovisual/Localization Engineers to make sure the technical aspects are seamlessly integrated and functioning as they should. 

To put it simply, it is the Localization Engineer’s job to make sure the video is usable for everyone. They convert the file into the format requested by the client (MP4, MOV, WMV, FLV, and so on), confirm the video quality and time codes, and determine the best method to maintain the integrity of the source file and satisfy any other technical requirements, such as whether the video captions have to be closed or open.

Considering the complex and all-encompassing nature of their job, the Engineers’ work is charged at an hourly rate.

Project Management: Ensuring A Seamless Organization

When talking about Subtitling Services, or any service really, the Project Managers (PM) are the oil that makes the gears run smoothly. They focus on which professionals would suit the projects best, design the process, and streamline the workflows that cater to the client’s demands. Long story short? They are here to make sure the team succeeds. This also includes:

  • Briefing linguists on requirements that are shared in the form of Style Guides, Glossaries, and any other important feedback a client may have
  • Managing the deadlines with an eye on allowing for a generous testing stage
  • Resolving any possible disagreements between linguists or reviewers

Since the PM-Team is present in most of the project’s proceedings, you can probably guess that costs are a bit more complicated to define. For this reason, most LSPs tend to add the PM fee as a percentage of the overall service, while some LSPs don’t charge for this as a standalone service at all.
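Putting the pricing units from this chapter together, a subtitling quote could be sketched as follows. Every rate in the example is invented for illustration, and the structure (per-minute transcription, per-word translation, hourly engineering, PM fee as a percentage) is just one of the arrangements described above; your LSP may bundle or split the items differently.

```python
from dataclasses import dataclass


@dataclass
class SubtitlingQuote:
    """Hypothetical quote model; all rates are placeholders."""
    video_minutes: float
    word_count: int
    engineering_hours: float
    transcription_rate_per_minute: float
    translation_rate_per_word: float
    engineering_rate_per_hour: float
    pm_fee_percent: float = 0.0  # 0 when PM is not charged separately

    def subtotal(self):
        """Sum of the per-minute, per-word, and hourly items."""
        return (
            self.video_minutes * self.transcription_rate_per_minute
            + self.word_count * self.translation_rate_per_word
            + self.engineering_hours * self.engineering_rate_per_hour
        )

    def pm_fee(self):
        """PM fee calculated as a percentage of the subtotal."""
        return self.subtotal() * self.pm_fee_percent / 100

    def total(self):
        return self.subtotal() + self.pm_fee()


quote = SubtitlingQuote(
    video_minutes=10,
    word_count=1000,
    engineering_hours=2,
    transcription_rate_per_minute=2.0,
    translation_rate_per_word=0.5,
    engineering_rate_per_hour=25.0,
    pm_fee_percent=10,
)
print(quote.subtotal(), quote.pm_fee(), quote.total())
```

Keeping the PM fee as a separate method makes it easy to compare a quote that lists project management explicitly with one that absorbs it into the other rates.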

Transcription is about transforming spoken language into written text. It is the first step in our subtitling, voice-over, and dubbing services, but it is also offered as a stand-alone service.

While Transcription Services can be performed manually by professional linguists, certain project types could benefit from automating at least a part of this process. Let’s dig a bit deeper into this.

Transcription and Automatic Speech Recognition (ASR): Modern Day Partners?

When you hear the term ASR, you might instantly think about the automatically generated subtitles on YouTube. Depending on the audio quality and pronunciation of the narrator, the output can leave something to be desired. 

That being said, automation has been replacing certain parts of a job initially executed by humans exclusively.
But, here is the important part that we like to stress as an LSP dedicated to translation quality: not every part can be replaced by a machine, but with the right tools and workflow, it can be an effective support for the human translator.

A typical manual transcription workflow starts by dividing the source file between multiple transcribers, which adds up not only in time but also in hourly or per-page wages.

Saving time in particular has become a prime driver behind automation tools, with which a single audio or video file can be transcribed in a fraction of that time.

However, we have to understand both the strengths and the limitations when working with ASR Software to speed up the process. You see, ASR is still far from perfect. You won’t receive a flawless product, but rather a time-coded draft ready to be submitted to a professional, human eye for the editing process prior to translation.

A 2018 research paper investigating the place of ASR in the translation workflow showed through various case studies that time can be saved, and quality still maintained, if the linguists are assigned to the source-text editing, translation, and revision processes. Assigning the human contribution to the wrong processes would compromise both time and quality.

Let us give you a couple of tips, because the human editing process will be helped along significantly if the ASR output is optimized. So, if you have any personal influence over the recording process, keep the following in mind:

  • Use clear speech and articulation and avoid multiple speakers taking the floor at once. ASR Software has gotten pretty smart in recognizing the nuances of the human voice, but this can be hindered by poor quality recordings.
  • Limit background noise so the voices stand out.
  • Research top-notch ASR Software that would suit your goals best. Dragon Naturally Speaking, Microsoft Translate, and Webcaptioner are popular examples.

Chapter 5: Dubbing Services

We haven’t forgotten about dubbing! Dubbing services allow your source video to be prepped for a new audience by bypassing the use of any text altogether and providing a voice actor to record a translated version of the source audio over the video.
While dubbing is not our most requested service, we do provide the required linguists and talent to achieve the best possible result. 

And while Dubbing follows the same process of Transcription and TEP as the other services, there are a couple of unique challenges and QA / Post-Production procedures we’d like to focus on briefly.

What Makes Dubbing One Of The Most Challenging Services

  • Dubbing is even more time-sensitive than subtitling. The linguist faces the challenge of not only translating with linguistic and cultural accuracy, but also preparing a text that can easily be pronounced by the actor. Additionally, the spoken text has to match the picture perfectly.

A dubbing linguist should make the words fit the visuals on the screen.

  • The linguist needs to keep synchronization in mind. This is particularly challenging since quality dubbing should match the ideal content choices with the lip movements on the screen. A skilled, in-country native linguist with particular experience in dubbing can find a way to strike a good balance.

Post-Production And QA

In the Post-Production phase, Audio Engineers will layer the tracks and match them with the visuals. Here, particular attention needs to be given to:

  • Neutralizing any background noise that may have snuck in
  • Lip-synching, which is, as we mentioned above, one of the big challenges. The Audio Engineer will make sure the synching reaches the highest possible quality
  • Resolving any issues in tuning and volume
  • Paying attention to the differences in tone between the source and target languages and resolving them

Chapter 6: Conclusion

Multimedia Localization is a highly diverse service that covers several industries and connects different services. 
In order to achieve the best possible result, each service should be submitted to the TEP-Process and a rigorous Post-Production and QA. 

It is key that the LSP of your choice makes use of top-market Translation Tools so that repetitive tasks can be automated, and both time and money can be saved.

We’d like to end our overview by stating that while the LSP of your choice will need to organize itself in order to deliver a quality product, the client also holds great responsibility in providing the LSP with resources and information that communicate both the brand identity, and project intention.

If you have any questions, don’t hesitate to contact us!