Tool Box Newsletter Logo

 A computer newsletter for translation professionals


Issue 13-7-224
(the two hundred twenty-fourth edition)  
Contents
1. Translation Computing for the Newbie (Premium Edition)
2. Walking the Shire with a Coffee in Hand
3. Bee Sting Therapy
4. For Your Tool Box
5. New Password for the Tool Box Newsletter Archive
The Last Word on the Tool Box Newsletter

Euro Matters

I have a few favorite slides that I like to show when giving talks, and one is this image of a (former) Slovakian bill:

Cyril on old bill

It's just too tempting to show this and comment that early translators St. Cyril and St. Methodius, pictured here, are some of the very few translators who actually made it (on) to some money!

Saints Cyril and Methodius, of course, are the brothers who translated the Bible into Old Church Slavonic in the ninth century and also came up with a precursor of the Cyrillic alphabet. (They also came up with some characters that are no longer used today, such as the über-cool "little yus": Ѧ -- which I'm tempted to use as an abbreviation for the already-abbreviated TEnT acronym.)

This year, new EU member Slovakia came up with a commemorative euro coin celebrating the two translation heroes: 

Cyril on new coin

Unfortunately, this was nixed by the European Commission because they objected to the Christian cross on the coin. As an alternative, I'd like to propose a new coin for all of Europe:

Jeromobot on Euro coin

I know, I know: it's classy. Not only that, but to have Jeromobot, the patron saint of the modern translator, on an EU coin would make a lot of sense. After all, to quote Umberto Eco and Found in Translation, our love song to translators: "The language of Europe is translation."

1. Translation Computing for the Newbie (Premium Edition)

I was asked to put together a blog entry for folks who are just starting out as translators and who want to equip themselves with the right technology. (And, yes, I realize that most of you subscribers are veteran translation professionals, but this might help when you mentor someone else.)

First of all, technology does no good if there are no skills to use it with. No, I'm not talking about great programming or software development skills, but instead very fundamental skills that can't be assumed to be present.

  • Typing: I'm an OK typist now, but I'm sure that I lost a few thousand dollars in my early career as a translator because I never had formal training and was very slow at first. Take the time to go through some kind of typing course to increase your productivity. Make sure that you learn to type in your target language on a target language keyboard (and learn how to install different language keyboards on your computer). Also make sure to learn how to use as many keyboard shortcuts as you can so that you have to use the mouse as little as possible.
  • Word processing: You'll need to be confident with basic office software, especially word processing. This does not have to be MS Word, though I would recommend it. You should know how to use advanced search-and-replace features, be familiar with complex formatting and styles, have a good handle on tools like templates and format painting, and know what you should not do in MS Word (such as working in HTML files).
  • Browsing and querying: It's important to know the basic syntax of more advanced search queries and have a good idea of locations where you can find answers (and those don't have to be only dictionaries). I would recommend tools like IntelliWebSearch that enable you to find online content right from your desktop. You also will want to know how to quickly find information on your desktop or cloud-based personal storage.
  • Basic computer maintenance: You don't have to have the skill level of a system administrator, but you should know the basic steps for how to keep your computer in good shape and running more or less seamlessly. You say you can also have your tech guy do this for you? Sure, but the last time I checked, that resulted in lost productivity and income.
  • Code pages: You need to know what Unicode is, how to make a basic code page conversion of text-based documents, and in general understand what code pages are and why they are relevant for translators.
  • Tags: You'll never need to learn the actual function of tags in formats like HTML, XML, or the many other formats that are based on XML, including the translation data exchange formats (TMX, TBX, and XLIFF). But you do need to be able to distinguish a tag from other text and learn to respect and not touch it. (A lack of respect for tags is one of the quickest ways to turn your present client into a former client!)
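
To make the point about tags a bit more concrete, here is a minimal Python sketch of the kind of check a translation environment tool runs when it verifies that tags were left alone. The pattern and the function names are assumptions made up for this illustration; real tools use proper parsers rather than a simple pattern, which would also trip over a stray "<" or ">" in ordinary text.

```python
import re
from collections import Counter

# Naive pattern for anything that looks like an HTML/XML tag,
# e.g. <b>, </b>, <br/>, <ph id="1"/>. Illustration only, not a real parser.
TAG_PATTERN = re.compile(r"<[^<>]+>")


def extract_tags(segment: str) -> list[str]:
    """Return every tag-like span in the segment, in order of appearance."""
    return TAG_PATTERN.findall(segment)


def tags_preserved(source: str, target: str) -> bool:
    """True if the target contains exactly the same tags as the source.

    Order is ignored because word order (and with it tag order)
    may legitimately change between languages.
    """
    return Counter(extract_tags(source)) == Counter(extract_tags(target))


source = 'Click <b>Save</b> to keep your <a href="help.html">changes</a>.'
good = 'Klicken Sie auf <b>Speichern</b>, um Ihre <a href="help.html">Änderungen</a> zu sichern.'
# Two mistakes: a closing tag turned into an opening tag, and a "translated" file name.
bad = 'Klicken Sie auf <b>Speichern<b>, um Ihre <a href="hilfe.html">Änderungen</a> zu sichern.'

print(tags_preserved(source, good))  # True
print(tags_preserved(source, bad))   # False
```

Everything between the angle brackets stays exactly as it was; only the text around it gets translated.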

So much for the general skills to adequately use technology. Now to what the technology should be:

  • Operating system: I don't care! I personally use Windows and I'm happy with it because I never have to worry about that very question. (So far I've never encountered any client who wants me to use an application that is available only on a Mac.) The truth is, though, that it's becoming more and more irrelevant. You can virtualize Windows on Mac or Linux computers, work in programs that are supported by various operating systems (such as Java-based programs), and, most importantly, more and more translation jobs are moving into a browser-based system, anyway.
  • Office programs: Same answer as for the operating system: I don't care. Yet, it's just a lot easier to have a copy of MS Office so I don't have to worry about conversion issues with files that clients send me.
  • Translation environment tool (aka CAT tool): The first thing you'll need to do is look at a) what kind of materials you're translating and b) what kind of clients you are or will be working for. The kind of material might determine whether it's important to have a translation memory (it might not be so important if you work with highly creative material), and the client might prescribe a certain tool or at least your ability to work in the format of a certain tool. If you look at the review of CafeTran in this newsletter, you'll see that translation environment tools often support the interim formats of other TEnTs.

To come back to the first criterion -- the kind of materials you're translating -- it doesn't really matter what it is; you will still want to manage your terminology. If you're looking at only doing that, you might want to use tools like Lingo or Xbench (and there are many other tools that manage terminology as well). While these tools don't directly interact with your translation process, it's very easy to access the terminology content that they maintain for you and it's also easy to quickly add more.

If you are working in projects where it would be helpful to access previously translated material (which essentially is the case for any and every technical, legal, medical, or other functional translation) and/or you're working with many different file formats and/or you're working in teams with other translators, you will want to use a full-blown TEnT (which will not only provide the translation memory feature but also terminology maintenance, QA features, file conversion functions, and many other tools). You might eventually end up using (and buying) several tools, but you need to make a decision where to start and which tool brings you the furthest.

Don't start with a "cheap" tool just because it's a beginner's tool. If you use a "cheap" or free tool, use it because that's the tool you really want to use. And forget about the word "cheap" anyway, because what you're really looking for is a tool that has a good return on investment. A $10 tool can be a waste of money, whereas a $1,000 tool can be a steal.

I would classify TEnTs into these categories:

    • There are large tools like Trados or memoQ (or others) that are powerful and might give you access to jobs that can only be done with these tools. (These are the kinds of jobs where the translation materials are located on a remote server that can't be accessed with any other tool.) They might also help you market yourself to companies that look for translators for these jobs.
    • Then there are tools that have a slightly geeky approach like OmegaT or Swordfish. These can be very powerful in the right hands, and they provide access to almost any kind of job (except the ones mentioned above).
    • Finally there are the browser/cloud-based tools like XTM or Memsource that give you a great deal of independence regarding the kind of hardware (even tablets!) and operating systems you use. They also can work with a large number of formats (though you might have to get a little creative when it comes to working at the beach café without wifi).

Here's the important thing to remember: you can't really get it wrong. Make sure that the tool has an active and loyal following (most do), and invest in training (either by yourself or through a third party). And don't think that your productivity will skyrocket immediately. In fact, it might never skyrocket, but it will surely increase if you do it right.

You'll find all these points mentioned in much, much greater detail in my Translator's Tool Box, a 400+ page ebook that is the ultimate technical resource for beginning and experienced translation professionals. 


2. Walking the Shire with a Coffee in Hand

This heading refers to Igor Kmitowski, the guy (and in this case, it really is the guy) behind the translation environment tool CafeTran. He loves his hobbits (he internally names his different software versions after Tolkien characters) and he loves his coffee (the product is actually called CafeTran Espresso). Oh, and he also calls himself "semi-autistic" ("over the last two or three years I have become a coder more fluent in the Java language than in other forms of communication").

I can attest to finding some truth in that last statement: neither CafeTran's documentation nor Igor's communication skills in general are particularly good -- it took a bit of patience to coax enough adequate information out of him about his product.

But that does not make the tool itself any less impressive.

I wrote about this tool a little more than five years ago (in edition 108 for all Premium subscribers who have access to the archives). Here are some of the highlights of that review:

  • CafeTran is small, it's cheap, it's easy to learn and maintain, and it's definitely not built for corporate use!
  • It's a tool built for the freelance translator that aims to give as much of a true translation environment as possible.
  • It's written in Java, so it's platform-independent.
  • It supports text, XML, HTML, MS Office XML, and OpenOffice formats.
  • It provides an easy integration of queries to Internet-based resources (sort of like IntelliWebSearch).
  • It has an auto-complete feature of already-typed text (I particularly like this).
  • It offers an extraction feature for frequent terms for glossary-building purposes.
  • All this for a price tag of only 40 euros!

A few of these details have changed (it now costs 80 euros, for instance -- still shockingly inexpensive), though the gist is still the same. The tool itself, though, is much more than it used to be -- so much so, in fact, that I would no longer call it an "easy tool." For the laughable purchase price you get a full-fledged tool that supports many more formats than it used to (it now also supports all MS Office 2007 and higher files, InDesign, FrameMaker, a lot of software development file formats, AutoCAD DXF files (!), and a great number of bilingual formats coming from other TEnTs, including Transit, Trados, Wordfast, memoQ, and Déjà Vu) and offers additional features that are very, very clever.

Let's take a step back, though. Igor told me that "it's a challenge to rival the teams of excellent programmers from SDL, Kilgray or Atril and I made a rule to avoid examining other CATs not to get overwhelmed." I totally get that and it really shows. What we see so often between the other vendors is a mutual catch-up game -- which in many cases works great for us, the consumers. Someone like Igor with a "few hundred customers" and the hope that the tool will "one day become my only source of income" does not really need to worry about the competition but can completely focus on his customer base and their requests. This leads to solutions that are unique to this tool, and it also leads to more and more features.

"More features" is great, of course, but it also can make a tool more and more complex. CafeTran appears to be caught in that Catch-22, but this doesn't mean that it's not workable anymore -- it is just no longer a tool that you can simply sit down with and start using (very much like the "big boys" -- Trados, memoQ, etc.).

When you open the tool you're presented with the Project Manager and a plethora of options that are initially difficult to understand. For instance, you can choose "TMX Memories" as well as "External Databases" for both segments and terms. The difference? The flat text TMX memories require more resources and can at some point be sent to a true database format with fewer resource requirements. And terminology is indeed processed in the same formats as true TM data -- even though it's now also possible to do this as tab-delimited text files. (Terminologists will cringe at the non-concept-based terminology databases.) To me, this looks like a case where the tool started out with one feature and then added a parallel feature because of user requests ("need more memory" or "our glossaries are in text format") without questioning whether the original feature was still needed. I would recommend some pruning.

Still, here are some of the features that really make the tool into a productivity hub:

  • Automatic assembly of segments from the various parts found in records from glossaries, translation memories, and automatically extracted subsegments from translation memories
  • Automatic suggestion based on the first few keystrokes and content in glossaries, TMs, and machine translation engines (the pre-configured possibilities here are Google Translate, Bing Translator, and MyMemory)
  • Automatic extraction of proposed terms for every segment you work on as candidates for the glossary
  • Very easy and intuitive way to enter terms into the glossary (highlight source, highlight target, click button or press key)
  • Context matches that differentiate between having the same context on one side of the segment or on both (cleverly named 101% vs. 102% matches)
  • Possibility to use images as reference material
  • Clipboard-based workflow for unsupported applications and environments
  • Smart handling (i.e., deletion) of unnecessary formatting tags in scanned and otherwise badly formatted documents
  • The translation is done internally via XLIFF, and a fully translated XLIFF file is automatically generated at the end of the project (in addition to the translated file itself)
  • The interface is completely configurable as far as what kind of panes are displayed in what manner
  • QA checks include all the standard checks also found in other tools
  • For files that originated in Trados, it supports bilingual Word docs, TTX files, and SDLXLIFF and SDLPPX files (and, as mentioned above, various other intermediate files from other TEnTs)
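
The 101% vs. 102% idea from the list above is easy to picture in a few lines of Python. This is only a sketch under stated assumptions: the TMEntry record, the match_score function, and the rule "100 for an exact match plus 1 for each side whose neighboring segment also matches" are invented for the illustration and are not taken from CafeTran's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TMEntry:
    """One translation memory record plus the source text of its neighbors (hypothetical)."""
    source: str
    target: str
    prev_source: Optional[str] = None  # segment that preceded this one when it was stored
    next_source: Optional[str] = None  # segment that followed this one when it was stored


def match_score(entry: TMEntry, source: str,
                prev_source: Optional[str], next_source: Optional[str]) -> int:
    """Score an exact match: 100, plus 1 for each side whose context also matches.

    Fuzzy matching is left out; anything that is not an exact match scores 0 here.
    """
    if entry.source != source:
        return 0
    score = 100
    if prev_source is not None and entry.prev_source == prev_source:
        score += 1
    if next_source is not None and entry.next_source == next_source:
        score += 1
    return score


entry = TMEntry(source="Press OK.", target="Drücken Sie OK.",
                prev_source="Select a file.", next_source="Close the dialog.")

print(match_score(entry, "Press OK.", "Select a file.", "Close the dialog."))  # 102
print(match_score(entry, "Press OK.", "Select a file.", "Restart the tool."))  # 101
print(match_score(entry, "Press OK.", None, None))                             # 100
```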

This only scratches the surface. If you've received the impression that this is "just a tool for beginners" because it only costs 80 euros, think again (and accept my apologies for poorly communicating what I tried to say). If you're looking for something new and think this might be a real option, especially because it also runs on Mac (approximately 20% of CafeTran users use it on Mac) or Linux (less than 10%), you really might want to try it out. Plus, Igor has promised as his next great project to spend time on the documentation (right now he's working on enabling the tool to export comments on translation segments).

3. Bee Sting Therapy

Wordbee, the maker of the cloud-based translation environment tool of the same name, has come out with a new service called MT Hive. MT Hive is not a game changer, but there is a self-service element to it that I have rarely seen in our industry.

Essentially, MT Hive is an attempt to customize MT and TMs for translation buyers. Buyers access Wordbee's client portal and run documents through the system, where they are first pretranslated with the content of the customer's own TMs; whatever is left gets pretranslated by a machine translation engine. The buyer can then view the "translated" file and decide whether it should be sent to post-editing or whether it should be left in the (presumably crummy) state it's in. Once the file is sent for post-editing, a workflow kicks in where post-editors, editors, and translators are notified and given a link to the project -- and all that without a project manager lifting a finger.
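
The "TM first, MT for the rest" step can be sketched in Python as follows. None of this is Wordbee's API: the pretranslate function, the dictionary standing in for the customer's TM, and the placeholder_mt function are all hypothetical, and only exact TM matches are considered.

```python
from typing import Callable, Dict, List, Tuple


def pretranslate(segments: List[str], tm: Dict[str, str],
                 mt: Callable[[str], str]) -> List[Tuple[str, str, str]]:
    """Return (source, translation, origin) triples; origin is "TM" or "MT"."""
    results = []
    for segment in segments:
        if segment in tm:
            # Exact match in the customer's own translation memory wins.
            results.append((segment, tm[segment], "TM"))
        else:
            # Everything else falls back to the machine translation engine.
            results.append((segment, mt(segment), "MT"))
    return results


def placeholder_mt(text: str) -> str:
    """Stand-in for a real MT engine call; it only marks the text."""
    return "[MT] " + text


customer_tm = {"Save the file.": "Speichern Sie die Datei."}
document = ["Save the file.", "Close the window."]

for source, translation, origin in pretranslate(document, customer_tm, placeholder_mt):
    print(origin, "|", source, "->", translation)
```

Keeping the origin next to each segment is what lets a buyer (or a post-editor) see at a glance which parts came from trusted TM content and which still need a critical look.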

While Google Translate can be chosen as the MT engine, the emphasis is on Microsoft Translator Hub, which allows users to train MT engines in exchange for the data. The integration with the Hub is so deep that after every project new data can automatically be sent for retraining purposes. Connectors to other MT engines are being developed, including to PangeaMT and tauyou.

When I talked with Wordbee's Stephan Böhmig about it, he strongly emphasized the great customizability of the system, which can be used in the way described above or in many other configurations. So far it has been developed primarily with corporate clients with in-house translation project management in mind, but I think that this or a similar system could, for instance, have a real shot with some translation portal with a vision.


4. For Your Tool Box

Here are a couple of tools that really helped me out during the last couple of weeks:

Unifier is a tool that allows you to quickly convert any number of text-based files from one code page to another. It does cost a little bit, but it was worth it for a recent project where I had to convert several hundred HTML files in the Japanese code page Shift-JIS into Unicode. It even rewrites the code page tags in the files.
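
For those who like to see what such a conversion involves, here is a minimal Python sketch of the same idea: decode Shift-JIS, rewrite the charset declaration, and save as UTF-8. It is not how Unifier works internally; the function names and the folder names in the usage note are assumptions for this illustration, and the simple pattern will rewrite any "charset=..." it finds, including one that happens to appear in the visible text.

```python
import re
from pathlib import Path

# Matches charset declarations such as charset=Shift_JIS or charset="shift_jis".
CHARSET_PATTERN = re.compile(r'(charset\s*=\s*["\']?)[\w-]+', re.IGNORECASE)


def convert_text(raw: bytes, source_encoding: str = "shift_jis") -> bytes:
    """Decode bytes from the source code page and return them as UTF-8.

    The charset declaration inside the file is rewritten to match.
    """
    text = raw.decode(source_encoding)
    text = CHARSET_PATTERN.sub(r"\g<1>utf-8", text)
    return text.encode("utf-8")


def convert_folder(source_folder: str, target_folder: str,
                   source_encoding: str = "shift_jis") -> None:
    """Convert every .html file in source_folder and write it to target_folder."""
    target = Path(target_folder)
    target.mkdir(parents=True, exist_ok=True)
    for path in Path(source_folder).glob("*.html"):
        converted = convert_text(path.read_bytes(), source_encoding)
        (target / path.name).write_bytes(converted)


# Small in-memory demonstration with a Shift-JIS encoded snippet.
sample = ('<meta http-equiv="Content-Type" '
          'content="text/html; charset=Shift_JIS"><p>翻訳</p>').encode("shift_jis")
print(convert_text(sample).decode("utf-8"))
```

A call like convert_folder("japanese_html", "japanese_html_utf8") would then handle a whole folder while leaving the originals untouched.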

I've used DownThemAll, my favorite extension for Firefox, a number of times and I really, really like it. It's a tool that allows you to download in one go any number of files that are listed as links on one page. I had to download all the DGT Translation Memory files the other day, and rather than doing it one by one, it was great to take the dog out to the beach while DownThemAll did its work. (Apropos dog: we just found out that our poor little Luna has lupus. Great: the guy with multiple sclerosis now has a dog with lupus. But the sand and surf are great equalizers -- and great therapy!)

To come back to DownThemAll: it's donationware, so they'll ask you to donate (which I happily did).
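
What a link-harvesting downloader does can be sketched with nothing but Python's standard library. This is not DownThemAll's code; the class and function names, the example URLs, and the ".zip" filter are assumptions for the illustration.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlparse
from urllib.request import urlretrieve


class LinkCollector(HTMLParser):
    """Collects the targets of <a href="..."> links that end in given extensions."""

    def __init__(self, base_url: str, extensions: tuple):
        super().__init__()
        self.base_url = base_url
        self.extensions = extensions
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(self.extensions):
            # Turn relative links into absolute ones.
            self.links.append(urljoin(self.base_url, href))


def collect_links(html: str, base_url: str, extensions: tuple = (".zip",)) -> list:
    """Return all matching links found in the page, in page order."""
    collector = LinkCollector(base_url, extensions)
    collector.feed(html)
    return collector.links


def download_all(links: list, folder: str) -> None:
    """Download every link into folder, one file after the other."""
    target = Path(folder)
    target.mkdir(parents=True, exist_ok=True)
    for url in links:
        filename = Path(urlparse(url).path).name
        urlretrieve(url, str(target / filename))


# Made-up page content; only the link extraction is run here, no download.
page = ('<a href="part1.zip">Part 1</a> <a href="/about.html">About</a> '
        '<a href="http://example.org/files/part2.ZIP">Part 2</a>')
print(collect_links(page, "http://example.com/tm/"))
```

Passing the resulting list to download_all is the "take the dog to the beach" part.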

Lastly, one of my favorite applications is the slightly anarchic Chainsaw to, yes, chop apart files (and later recombine them). Not a good idea for binary files (like MS Word files), but for any kind of text-based file it works great. Why did I have to use it? After downloading all the DGT TM files and extracting the German>English TMX file, it was WAY too large to handle, so I had Chainsaw do what it's good at (it even makes a chainsaw sound!) and make ten files out of one. I did have to repair the header and footer in each of the files afterward, but then I could load them in the tool of my choice.
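
The header-and-footer repair can be avoided if the file is cut at translation unit boundaries in the first place. The following Python sketch shows that idea; it is not what Chainsaw does (Chainsaw cuts anywhere), the split_tmx function is made up for this illustration, and it reads the whole file into memory, which may itself be a problem for truly enormous TMX files.

```python
import re


def split_tmx(tmx_text: str, parts: int) -> list:
    """Split a TMX document into at most `parts` smaller, complete TMX documents.

    Each piece gets the original header (everything up to and including <body>)
    and the original footer (</body> and whatever follows it).
    """
    body_start = tmx_text.index("<body>") + len("<body>")
    body_end = tmx_text.rindex("</body>")
    header = tmx_text[:body_start]
    footer = tmx_text[body_end:]

    # \b keeps <tuv ...> from being mistaken for the start of a <tu> element.
    units = re.findall(r"<tu\b.*?</tu>", tmx_text[body_start:body_end], flags=re.DOTALL)

    chunk_size = max(1, -(-len(units) // parts))  # ceiling division
    return [
        header + "\n" + "\n".join(units[i:i + chunk_size]) + "\n" + footer
        for i in range(0, len(units), chunk_size)
    ]


# Tiny made-up TMX with five translation units (header attributes trimmed for brevity).
units_xml = "\n".join(
    f'<tu><tuv xml:lang="de"><seg>Satz {n}</seg></tuv>'
    f'<tuv xml:lang="en"><seg>Sentence {n}</seg></tuv></tu>'
    for n in range(1, 6)
)
sample = f'<tmx version="1.4"><header srclang="de"/><body>\n{units_xml}\n</body></tmx>'

pieces = split_tmx(sample, 2)
print(len(pieces))                            # 2
print([p.count("<tu>") for p in pieces])      # [3, 2]
```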

5. New Password for the Tool Box Newsletter Archive

As a subscriber to the Premium version of this newsletter you have access to an archive of Premium newsletters going back to 2007.

You can access the archive right here. This month the user name is toolbox and the password is lupus.

New user names and passwords will be announced in future newsletters.

The Last Word on the Tool Box Newsletter

If you would like to promote this newsletter by placing a link on your website, I will in turn mention your website in a future edition of the Tool Box newsletter. Just paste the code you find here into the HTML code of your webpage, and the little icon shown on that page will be displayed on your site with a link to my website.

If you are subscribed to this newsletter with more than one email address, it would be great if you could unsubscribe redundant addresses through the links Constant Contact offers below.

Should you be interested in reprinting one of the articles in this newsletter for promotional purposes, please contact me for information about pricing.

© 2013 International Writers' Group