The PDF Problem, and Why It’s Costing You More on Translation Than It Should
We have a customer who needs to publish several PDF manuals in several languages. Simple, right? Not really. The problem is that each manual is made up of several PDF files, and the client doesn’t possess the source files that were used to generate those PDFs. So when it comes time for translation, not only are they paying extra desktop publishing costs for us to lay out the manual again, but they are also getting no benefit from Translation Memory. (More on this below.) Together, these two issues can lead a company to spend double or triple what it should on translation services.
Why is this client in this situation to begin with?
This client, like many others in the manufacturing sector, buys components of its product from other companies, repackages them, and sells the result as a different product. Take a car, for example: Ford doesn’t actually make the glass for its windows, the batteries for its hybrids, or the radios for its dashboards. It buys these products from other companies, integrates them into its own design as a Fusion, and sells that. So, in some cases, a service manual is simply a series of the OEMs’ (Original Equipment Manufacturers’) manuals compiled into one big manual. While this works for English (or whatever source language the manuals are in), when it comes time to translate the manuals into several languages, you’re facing a world of problems in getting it done at a low cost.
Making things even worse (and more expensive to deal with), many of these manuals may have been produced years ago, and their source files can no longer be located. And not only do the source files not exist, but the PDF files may actually be scanned pictures of the text rather than text that can be extracted directly. This means that the text cannot be searched, highlighted, or saved by any means except an Optical Character Recognition (OCR) process, which is slow and expensive, and still requires someone to proofread and edit the source text before translation can even start.
My Advice to You (If You Are Like This Client)
1. Require OEMs to produce editable manuals: It stands to reason that if you are buying products from OEMs, you should be able to dictate in your purchase agreement that they provide you with the editable, electronic source documents that make up their manuals. If they refuse, perhaps you should look at buying from their competitors, because of the significant extra costs you will incur in translating their manuals.
2. Pass on costs: If they can’t or won’t provide you with editable source manuals, then pass your translation surcharges on to them. This measure may get them to comply in the future.
3. Brute Force Method: If you can’t get source documents from your OEM no matter what you say or do, then you will face these translation surcharges, regardless of which translation supplier you choose:
A. Optical Character Recognition scanning and editing
B. Re-layout of the source document
C. Redesign of graphics and charts that contain text
A Word About Translation Memory
If these charges aren’t bad enough, consider how much money is wasted when you need to update a previously translated version. In these common cases, where you are merely updating a translated manual instead of launching a new one, you will not be able to get any Translation Memory benefits from text segments that are exact matches, fuzzy matches, or repetitions. You are essentially being charged 100% for each word, because there is no recycling of previously translated assets.
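To make the arithmetic concrete, here is a minimal Python sketch of how a weighted word count might be worked out when Translation Memory is available versus when it is not. The discount rates, the similarity threshold, the sample sentences, and the function names are all assumptions made up for illustration; they are not any vendor’s actual pricing, and real translation tools use their own matching logic rather than Python’s difflib.

```python
from difflib import SequenceMatcher

# All rates and the threshold below are illustrative assumptions,
# not any vendor's actual pricing.
RATES = {"exact": 0.25, "fuzzy": 0.60, "new": 1.00}  # fraction of full per-word rate
FUZZY_THRESHOLD = 0.75  # minimum similarity to count as a fuzzy match


def classify(segment, memory):
    """Label a source segment as an exact match, fuzzy match, or new text."""
    if segment in memory:
        return "exact"
    best = max(
        (SequenceMatcher(None, segment, stored).ratio() for stored in memory),
        default=0.0,
    )
    return "fuzzy" if best >= FUZZY_THRESHOLD else "new"


def weighted_word_count(segments, memory):
    """Sum each segment's word count scaled by the rate for its match type."""
    return sum(len(s.split()) * RATES[classify(s, memory)] for s in segments)


# Hypothetical sample segments from a manual.
manual = [
    "Turn off the engine before servicing.",
    "Check the oil level weekly.",
]

# No usable memory (the scanned-PDF case): every word is billed in full.
print(weighted_word_count(manual, []))
# Every segment already in memory: only the exact-match rate applies.
print(weighted_word_count(manual, manual))
```

With these made-up numbers, the 11 words in the sample are billed as 11.0 weighted words when there is no memory, and as 2.75 when every segment is already stored. The sketch ignores repetitions within a single document, which would typically be discounted as well.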
If anyone reading this has any comments or suggestions on dealing with the PDF problem – please feel free to comment below.


Interesting problem!
Perhaps a (not THE) solution was found by Caterpillar in the early 70s. With over 2000 product maintenance handbooks in 72 languages, updated after every major product modification, they decided to standardise on a 650-word English vocabulary. It became mandatory for all worldwide employees to learn this basic vocabulary.
Not good for the translation business, but an effective solution for them at that time!
More seriously, good tools are available for editing and converting *.pdf files to many other formats, and their cost is not high, in either purchase price or the time and effort needed to convert.
John Collins