A computer journal for translation professionals

Issue 15-11-255
(the two hundred fifty-fifth edition)

Contents

1. Spirited

2. New Compensation Models?

3. "The Best Thing That Can Happen to the Language Industry," Take 2 (Premium Edition)

4. The Tech-Savvy Interpreter: WebRTC: The Technology behind the Rising Tide of Remote Interpretation over the Internet

5. Migrating to Linux

6. New Password for the Tool Box Archive

The Last Word on the Tool Box

Thanks!

I can't think of a better edition of the Tool Box Journal for Thanksgiving. (Yes, I know, the Canadian Thanksgiving is long over, and the rest of the world outside the US doesn't even celebrate it. So for those who don't know -- it's the most beautiful of holidays, a time to do exactly what its name implies: give thanks.) This issue's arrival on Thanksgiving weekend is so appropriate because we'll talk about three new translation technologies that will have an immediate and profoundly positive impact on our professional lives. If that's not reason to give thanks, I don't know what is!

I'm also thankful that the 12th edition of the Tool Box ebook has been received so well. I've received a lot of encouraging -- make that "jubilant" -- notes in response, and here is one of my favorites:

"... thanks for all the new material. I must say that the world's most useful publication for independent translators just keeps on getting better and better."

Aww. I love you, too. Very much!

If you haven't had a chance to update your previous version of the book or purchase it for the first time, be sure to do so. You can find a good summary as well as a link to the table of contents and index right here. I can't guarantee that you'll never have problems with your computer again with this book by your side, but I can promise that it'll be easier and more productive to use your computer with this resource aiding you.

Plus, the special USD 40 offer if you've never purchased before is still valid through Monday, November 30 (upgraders pay only USD 25 and fellow ATA members USD 30).

ADVERTISEMENT

New eBook: An introduction to translation memory technology

At the core of SDL Trados Studio is the power of translation memory. Using a well maintained translation memory improves translation quality and consistency whilst saving time and money. It's an invaluable asset to any translation business.

Explore translation memory technology and best practice for translators and translation agencies, by reading the free new eBook -- An introduction to translation memory technology.

Read »

1. Spirited

"lilt": noun \ˈlilt\; 1: a spirited and usually cheerful song or tune; 2: a rhythmical swing, flow, or cadence; 3: a springy buoyant movement.

That's Merriam-Webster's definition for "lilt." And Lilt with a capital "L," an exciting new piece of translation software, might mean all of that and more for us as translation professionals.

Let me start with a disclaimer: I've known Spence Green, the creator of Lilt, for a while now. In fact, shortly before he and his team launched their product, I helped him make the tool fit more naturally into the process of the translator. So I'm not unbiased. But I'm going to stand by my enthusiasm for this particular tool for the following reasons: a) I've done the same for a number of other tools, too. b) Who is ever completely unbiased to start with? And c) the kind of technology behind Lilt is exactly what I'm excited about in translation technology's future.

With that said, here we go: Machine translation and post-editing are inextricably connected for many within the world of translation. No matter how good a machine translation output might be, it cannot be trusted for publication-ready quality without a human post-editor evaluating the accuracy and correcting the translation. There are some exceptions, such as the Microsoft knowledgebase, but even that is post-edited, albeit with the P3 (post-published post-editing) process, a form of end-user post-editing that is strongly advocated by Microsoft's Chris Wendt.

Although it might have gone almost unnoticed in the "MT camp," professional translators' real use of machine translation is increasingly integrated into existing processes. True, there are still the "traditional" post-editors who work primarily on raw machine translation, but as any translation vendor who has tried to hire one can tell you, they're hard to find. Why? Well, it's a process that the typical translator wasn't trained for, and it generally doesn't match the expectation that translators bring to their job. Recognizing both this situation and the existence of valuable data, even in publicly available general MT engines, translation environment tool vendors looked at ways to bring that data into the workflow (aside from just displaying full-segment suggestions from machine translation systems that often aren't particularly helpful).

Here are some examples (you can read about all of these in earlier versions of the Tool Box Journal):

A number of tools, including Wordfast Classic and Anywhere, Trados Studio, Déjà Vu, and CafeTran, use auto-suggest features that propose subsegments of machine translated suggestions (which invariably are more helpful than the whole segment). In some cases, such as with Wordfast and Déjà Vu, these even come from a number of different machine translation engines.
Déjà Vu uses machine translation fragments to "repair" fuzzy translation memory matches.
Star Transit uses a process called "TM-validated MT" in which the communication goes the other way: Content in the translation memory is used to evaluate MT suggestions. A similar process is being developed for OmegaT right now as well.
Lift, Kevin Flanagan's PhD project at Swansea University, used MT to identify subsegment matches in translation memories so that even a TM with very little content can produce valid subsegment suggestions. (Kevin now works for SDL and his technology will surely see the light of day in various SDL products.)
In fact, there are too many other creative and productive uses of MT beyond post-editing to list them all here.

Translators and their community have warmly welcomed these developments, but they all have one limitation in common: The underlying MT is static. This means two things in our context: The phrase table (the database that contains the language data) within the machine translation is not automatically and immediately updated with the translator's choices (though SDL is presently working on a process that will account for that). And the automatically generated MT subsegment suggestions come from the initial MT proposal, which does not adjust itself to whatever the translator might have already entered.

Enter Lilt. Lilt uses Phrasal, an open-source statistical machine translation system developed by the Stanford Natural Language Processing Group. Here's what distinguishes the way Lilt employs Phrasal from other SMT solutions:

Every finalized translation unit is directly and immediately entered into the phrase table and considered in further machine translation.
There is no difference between machine translation and translation memory -- even imported translation memory exchange (TMX) files are entered "only" into the MT engine's phrase table (where they are treated preferentially).
With every word the translator enters while working on the individual translation unit, a new query is sent to the MT engine to adjust its suggestions to whatever has already been entered.

All this is presented in a browser-based translation environment interface that is shockingly simple -- the user documentation consists of fewer than three pages and covers essentially everything. The supported file formats include MS Office files, XLIFF and SDLXLIFF, TXML (Wordfast Pro 3's translation format), text, XML (without any possibility of modifying the extraction process), HTML, and InDesign IDML. Files are organized in language-combination-specific projects (presently EN<>ES, EN>FR, and EN>DE are supported, with more in the immediate pipeline) to which existing TMX resources can be assigned. The import process (which can be started by dragging a file into a drag & drop area or through a file selector) takes a bit longer than with most other translation environment tools since all initial machine translation suggestions are already loaded, but once it's imported, the segment-to-segment processing is done rapidly.

For every segment that you open, you will see the initial MT suggestion and possible alternate terms for the currently suggested word. The currently highlighted word can be entered with a keyboard shortcut (Enter or Tab), or the whole suggested segment can be accepted with Shift + Enter as the keyboard shortcut. Unlike other tools, Lilt does not use an auto-complete popup at the cursor's location to suggest matches; instead, it displays them underneath the target field.

If you enter a word or phrase that differs from the machine translation engine's suggestion, the system takes your entry into consideration and sends a new query to the machine translation, which then suggests how to finish the segment on the basis of what has already been entered. This happens for every word you enter -- and blazingly fast, at that.
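To make the idea of "adjusting to what has already been entered" a bit more tangible, here is a minimal Python sketch. It is emphatically not Lilt's or Phrasal's code: a real system re-runs the decoder for every keystroke, whereas this toy only filters a fixed, made-up list of scored candidates (the `candidates` list and the function name `suggest_completion` are my own assumptions for illustration).

```python
from typing import Optional


def suggest_completion(prefix: str, hypotheses: list[tuple[str, float]]) -> Optional[str]:
    """Return the rest of the best-scoring hypothesis that starts with the typed prefix."""
    typed = prefix.split()
    best_rest, best_score = None, float("-inf")
    for text, score in hypotheses:
        words = text.split()
        # Keep only hypotheses consistent with what the translator has typed so far.
        if words[:len(typed)] == typed and score > best_score:
            best_rest, best_score = " ".join(words[len(typed):]), score
    return best_rest


# Hypothetical scored candidates for one source segment (higher score is better).
candidates = [
    ("the house is big", 0.9),
    ("the home is large", 0.7),
    ("this house is large", 0.5),
]

print(suggest_completion("", candidates))          # the initial full suggestion
print(suggest_completion("the home", candidates))  # the suggestion after the translator deviates
```

With nothing typed, the top-scoring candidate is offered in full; once the translator types "the home," only the candidate that agrees with that choice is left to finish the segment.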

The corpus on which the machine translation engine was trained (consisting of public sources such as the UN's corpus and Opus) is also and simultaneously available for concordance searches. It opens in a left-side panel with a double-click on the term for which a concordance search is to be executed.

The concordance search lists the complete segments of the corpus (truncated segments are displayed in full by clicking on them) that are preceded by automatically extracted terminology.

When searching for a phrase with more than one word, it's possible to enter the remaining words manually into the search bar and receive auto-complete suggestions for existing entries as you type.

All three data sources (the data behind the MT, the concordance results, and the terminology) actually come from one source, but they are displayed in different ways and for different usage cases. And every time you finalize a segment (or import a TMX file), that data is added to the large data repository and treated the same, albeit preferentially. For instance, if there is a repeated segment within the project or a perfect or fuzzy match from an imported TMX file, it's displayed as a "TM match" -- even though, strictly speaking, there is no classical TM, "just" the phrase table within the machine translation engine.

The data you enter into Lilt, by the way, is kept confidential: nothing is shared, and there is not even an option to share at this point. You'll notice how keenly Spence and his team are dialed into this concern when you read the documentation "leaflet" that addresses the privacy concern before anything else.

Overall, Lilt is a translation environment tool of a different kind. While other tools pride themselves on their wealth of features, Lilt is spartan in its interface, its features, and its name. Granted, there will eventually have to be some additional functionality -- I'm thinking, for instance, of quality assurance, bare-bones project management facilities, and additional languages and file formats. But it's surprising to realize how effectively translation work can be done on the basis of the powerful and interactive Big Data backbone that Lilt runs on.

This is the first technological innovation that I'm thankful for (and I guess you could even argue that there is more than one innovation in there).

Here's another innovation that the team around Lilt just released last week: Lilt offers a totally different way of handling inline codes (aka markup or tags).

Be ready to be amazed.

Ready?

Here it goes: There are no tags. Some formatting is displayed in the source (such as bold, italics, underline, colors, etc.), but most other information that's usually treated as tags (hyperlinks or complex formatting) is completely invisible. Even for the formatting that is displayed in the source, there is (at this point) no way to transfer this into the target field. Instead, Lilt does something that many other tools should have done a long time ago: It uses its language resource to decide on its own what needs to be tagged when the file is exported. If you think about it, it's a no-brainer. Once you have enough reference data in place, you can do an alignment of every translation unit, match source and target words and phrases, and therefore transfer formatting.
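For the curious, the alignment idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions (the token indices and the `alignment` pairs are invented, and nothing here is taken from Lilt): formatted source words are mapped to the target words they are aligned with, and a single pair of tags can be placed only if those target words sit next to each other.

```python
def project_format(span: set[int], alignment: list[tuple[int, int]]) -> list[int]:
    """Map formatted source token indices to the target token indices aligned with them."""
    return sorted({t for s, t in alignment if s in span})


def is_contiguous(indices: list[int]) -> bool:
    """True when the indices form one unbroken run, so one tag pair can wrap them."""
    return bool(indices) and indices[-1] - indices[0] + 1 == len(indices)


# Source: Please(0) switch(1) off(2) the(3) device(4)
# Target: Bitte(0) schalten(1) Sie(2) das(3) Gerät(4) aus(5)
# Hypothetical word alignment as (source index, target index) pairs.
alignment = [(0, 0), (1, 1), (2, 5), (3, 3), (4, 4)]

bold_noun = project_format({3, 4}, alignment)  # "the device" -> "das Gerät"
bold_verb = project_format({1, 2}, alignment)  # "switch off" -> "schalten ... aus"

print(bold_noun, is_contiguous(bold_noun))
print(bold_verb, is_contiguous(bold_verb))
```

The noun phrase projects onto two neighboring target words, so the formatting transfers cleanly. The verb lands on two target words that are split apart, which is exactly the kind of case where any automatic transfer needs a fallback.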

Now, this won't always work perfectly (in fact, I frustrated Spence greatly when the first file I tried this with had a separated German verb in the target that should have been highlighted, thus causing the system to fail), but it does much of the time. The documentation says this about the subject:

"When you export each document, Lilt will automatically insert the tags into the target for you. This procedure is usually very accurate, but some mistakes can be introduced. Check the exported document and correct errors as needed."

Sounds good, but there will be an additional safety mechanism introduced at a later point where you can choose to use tags of some kind if you really want (and -- look into my eyes -- do you really want to use tags again?).

Have I mentioned pricing yet? It's free. This will change at some point, but once it does, it won't be too much (and if you did decide not to continue using it, you could get all of your data out via the TMX export). In the meantime, I would really encourage you to at least try Lilt. Obviously you have to have the right language combination, but if you do, you should test it to see what this kind of translation experience is like. The technology Lilt is presenting at this point is not going to be the last word on how translator productivity can be increased with the help of machine translation, but it certainly is an important step that's worth looking at.

And here is another thought: One of the reasons why Lilt is so different is that it's built by an industry outsider who does not assume many features as givens and yet is very interested in learning how professional translators work. Both in his work at Stanford and now at Lilt, Spence has been tracking and communicating with many professionals to see how they operate. I first talked to Spence more than a year ago when he was still finishing up his PhD, and at the time he said that "there's a big disconnect between what's been done with translation technology and what can be done." It's a statement we can all agree with, but it can be hard to see what exactly needs to be done when we're in the middle of it and blinded by our preconceived notions. An outsider is exactly what we need then, especially one who is willing to listen.

ADVERTISEMENT

Learn how the new Language Terminal will help you with your translation jobs: https://youtu.be/BN6PJUKyPmU -- The new Language Terminal is coming soon!

2. New Compensation Models?

If you consider what I just wrote in the article about Lilt, it's clear that the old and used-up paradigm of being paid by the word will no longer work for a good part of the world of translation. Why? Because what we first tried in translation memory's infancy in the 1990s -- when we didn't tell our clients that we'd implemented new ways of reusing content and were able to really jack up our profits heavily for some projects -- is not going to work anymore. We're past that kind of clandestine dealing, both in an ethical sense (I should hope) and in a general 21st-century kind of way where processes are much more transparent.

With TM-based translation it eventually was relatively easy (though painful for some) to share some of the savings with clients (whether LSPs or direct clients). There is no translation environment tool that doesn't allow for a perfect/fuzzy match and repetition analysis, and it was (and is) a matter of negotiation between you and your clients on how to deal with those.

When post-editing of machine translation entered the picture more prominently some five or so years ago, new ways of finding compensation had to be developed. Some used a time-based paradigm, some an assumption that machine translation in general equals the quality of a certain percentage TM match, but probably the most transparent measurement was to calculate the edit distance, i.e., measuring how many edits were made to any one segment, which then could be used in a fuzzy-match-like scheme to come up with a fair compensation.
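For readers who have never seen an edit-distance scheme spelled out, here is a rough Python sketch. The word-level Levenshtein distance is a standard technique; the pay bands in `BANDS` are numbers I made up purely for illustration, and no particular tool or vendor is implied.

```python
def edit_distance(a: list[str], b: list[str]) -> int:
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # delete a word from the MT output
                            curr[j - 1] + 1,      # insert a word the editor added
                            prev[j - 1] + cost))  # keep or substitute a word
        prev = curr
    return prev[-1]


def similarity(mt: str, final: str) -> float:
    """Share of the longer segment left unchanged by post-editing, from 0.0 to 1.0."""
    a, b = mt.split(), final.split()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Hypothetical pay bands: (minimum similarity, share of the full word rate).
BANDS = [(0.95, 0.30), (0.75, 0.60), (0.0, 1.00)]


def rate_share(sim: float) -> float:
    """Look up the share of the full rate owed for a given similarity."""
    return next(share for floor, share in BANDS if sim >= floor)


sim = similarity("the house is very big", "the house is very large")
print(round(sim, 2), rate_share(sim))
```

In this example one word out of five was changed, so the segment would be treated like a fuzzy match in the middle band. Where the bands are drawn is, of course, a matter of negotiation.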

New technology -- particularly the way we use machine translation -- has evolved into an activity that I think is virtually impossible to measure. Machine translation is no longer post-edited but is deeply integrated (and will be forevermore) into our existing processes, and there might in fact be many different machine translation sources that provide resources for us rather than just one. Will it make us more productive? Well, it had better -- otherwise there's no good reason for us to use it in the first place. Will the added productivity be consistent enough to use as a measuring mechanism? I'm absolutely certain that's not the case.

So what to do?

We discussed one solution at the NTIF conference in Iceland last week (a very nice event -- make a note to attend next year's edition of NTIF in Malmö, no matter whether you do Scandinavian languages or not).

My suggestion in Reykjavik was that we completely move away from pricing by the word, line, or page and learn how to quote by project and/or time, which, after all, is something that virtually everyone in the professional world (outside of translation) does. You can imagine what the immediate response was: "My clients will never go for that!"

Well, maybe not. But we were the ones who taught our clients to expect quotes on the basis of word counts. Now that counting words is moving into the realm of the impossible, it can be up to us again to teach our clients that we now charge differently. Comparative pricing for any given project will help our clients understand the ultimate benefit to their bottom line.

I can't wait to throw off the shackles of word counts and operate like a professional who can figure out how much to charge for a project, just like my electrician or lawyer does. And ironically, this will happen (I think) because of advancements in technology. Who would have thought?

3. "The Best Thing That Can Happen to the Language Industry," Take 2 (?) (Premium Edition)

This is what István Lengyel of Kilgray said about Language Terminal a while back. Back then I bemoaned his hyperbole -- which really turned out to be exactly that, hyperbole. Maybe it was one of those remarkable Hungarian-doing-the-American-mindset moments...

Anyway, after about a full year of rework, Language Terminal will be relaunched on December 7, and not only is Kilgray's estimation much more realistic this time around, but the product does look better and more promising (and up ahead is still the third ground-breaking technology in connection with this).

First of all, it has a new look. It's very app-like and "HTML5-ish," with a dashboard when you open it that lists all the different things you can do on that site. (The link is www.languageterminal.com, but be sure to wait until December 7 if you want to actually see what I'm writing about.)

You can enter or track a job, share a TM or termbase, share or browse other memoQ-specific resources, and convert InDesign files.

The features that Kilgray particularly wants to highlight in this new incarnation of Language Terminal are the project management facilities that especially relate to the individual translator. Once you create a profile, you're able to enter a client- and service-specific price list that you can use to quote on jobs (presently only with one non-alterable template). This will appear within memoQ in a new "Financing" pane in which you can link memoQ word counts of future, present, and past projects for reporting purposes and to extract data for invoicing (unfortunately only as raw data, which you then will have to enter into your own invoice system).

It's a system that has promise but is not yet quite where it wants to be (and I think the good folks from Kilgray would agree with this). Naturally there are plans to take this further. I think it would be good if they would partner with an existing vendor for project management and accounting tools for translators, but we'll see how that plays out.

On the mid-term schedule for release are unified logins for Plunet- and XTRF-based workflow systems that language service providers use (so vendors don't have to manage lots of different login data and can collect all work-related data right within Language Terminal) and a "place to meet" for customers and providers. I'm not sure what else to call it, since Gábor, whom I talked to about this, had a very bitter taste in his mouth when saying that this really is not a "marketplace." Of course, "marketplace" has such a bad connotation because it's associated with a race-to-the-bottom when it comes to pricing and commodity-like trading of translation. All of these are connotations that Kilgray understandably wants to avoid. So how about a "market lounge"?

Let's quickly touch on the InDesign feature before we talk about the feature that I think is groundbreaking. I've written about it before and I'm sad that it's not used more than it is (Gábor said it's virtually not used by users of translation environment tools other than memoQ) because it provides really great access to otherwise impenetrable InDesign files (unless you have InDesign installed).

Here is what I wrote about it in the Tool Box ebook:

"Kilgray's Language Terminal has changed the way translators can work with InDesign files. One of the various features of Language Terminal is the ability to upload InDesign IDML files of any version to a server, which converts these files to a memoQ-specific version of XLIFF. This version of XLIFF, with the extension MQXLZ, can be directly processed in memoQ or, with a little workaround, in any other tool that supports XLIFF.

"For this workaround, you will need to realize that the MQXLZ format is a zipped (compressed) format that contains an XLIFF file (with the extension MQXLF) and a "skeleton" file (which contains all the external data, such as images). To retrieve the XLIFF file, change the extension of the MQXLZ to ZIP, right-click on the file and select Open with> Windows (File) Explorer (Don't use a compression utility because that might cause problems in the back conversion to InDesign.) "Once you see the MQXLIFF file, copy it to an external location and rename it to XLF or XLIFF. Now you can process it in any other tool. Once you're finished with the translation, replace the extension of the XLIFF file with MQXLIFF, open the ZIP file again with Windows/File Explorer, and replace the old MQXLIFF file with the newly translated one. Once that is done, close the ZIP file, rename its extension to MQXLZ, and upload it to the Language Terminal again to have it converted back to an InDesign INDD file. Once the Terminal is done with the conversion, you can download a ZIP file that contains the INDD file alongside a PDF with a preview of the translated file."

If you are given an original InDesign INDD file to translate and don't want to bother with InDesign on your computer, here's how to do it, no matter what kind of translation environment tool you're using. Of course, if you're using memoQ, it's easier since you can do all this from within memoQ.
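If you'd rather script the first half of that workaround, the following Python sketch copies the XLIFF file out of an MQXLZ package using only the standard library. Some assumptions on my part: the XLIFF member is recognized by its .mqxliff extension, and the function name `extract_xliff` and any file names are hypothetical. The script only reads the package and never rewrites it, so putting the translated file back is still best done in Windows/File Explorer as described above, given the warning about compression utilities.

```python
import zipfile
from pathlib import Path


def extract_xliff(mqxlz_path: str, out_dir: str) -> list[Path]:
    """Copy every .mqxliff member of an MQXLZ package to out_dir with an .xlf extension."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(mqxlz_path) as archive:  # opened read-only; the package is not modified
        for name in archive.namelist():
            if name.lower().endswith(".mqxliff"):
                target = out / (Path(name).stem + ".xlf")
                target.write_bytes(archive.read(name))
                written.append(target)
    return written
```

Calling something like `extract_xliff("brochure.mqxlz", "to_translate")` would leave a renamed copy of the XLIFF file in the `to_translate` folder, ready to be opened in the tool of your choice, while the original package stays untouched for the trip back.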

One thing missing from the new Language Terminal will be memoQ cloud server. Not that this will be gone altogether -- but it will be available on and through memoQ.com instead. Still available through Language Terminal will be TMs and termbases that you can share for "free" (four TMs in two reverse language pairs and two termbases in five language pairs with up to three users).

("Free" is in quotation marks because there actually is a form of payment or -- as Kilgray puts it -- "We offer this service free of charge, in exchange for mining the content of shared resources for linguistic patterns to enhance our offering. Kilgray will not directly use any content you upload, and we will not share it with any third party." It's very important to keep this in mind depending on what kind of clients you work with.)

The new component -- ta-da! -- is that it's now possible to share the TMs (and at some point the termbases) in real time with users of Trados Studio. Parallel with the relaunch of Language Terminal, Kilgray will also publish a Trados Studio plugin that will allow Trados users who have been given access to a shared TM to connect to it within Trados and to read and/or read and write (according to what the original setting is) to that database.

Wow.

Using the same plugin, the same is possible with a memoQ server TM.

And you are like: What's he so excited about?

See, we've been talking about exchange standards forever, and for the last few years it's felt like one big charade. Yes, TMX, TBX, XLIFF, and TIPP/Linport more or less work to exchange their respective kinds of data, but the reality is that many of us no longer work with data on our desktops that can be exchanged right there; instead, we're connected to servers from which we receive our TM and termbase data in the tool of the client's choice, and there is no way to get to that data except with the original tool. No possibility of interoperability.

This is the first time that two competing commercial tools can talk to each other through server-based processes. It's a live exchange of data, and I think it's a real milestone for all of us. I can only hope that others will follow suit now, especially since Kilgray is also showing how this can be done profitably: For every user who uses the Trados plugin to connect to a memoQ server TM, a WebTrans license is required, and this is typically provided by the server's owner through a floating license (of course, this is not the case for the shared TMs through Language Terminal).

Happy?

I am.

ADVERTISEMENT

Go Beyond the Headlines with "Interpreting the News"

Interpreting is finding its way into world headlines with increasing frequency. Stay abreast of developments affecting your profession and get the analysis and commentary that only InterpretAmerica provides on the Interpreting the News blog.

Subscribe today. It's free and only takes a moment. You'll receive occasional news and commentary about interpreting right in your inbox.

www.interpretamerica.com

4. The Tech-Savvy Interpreter: WebRTC: The Technology behind the Rising Tide of Remote Interpretation over the Internet (Column by Barry Slaughter Olsen)

For generations, scientists and innovators have lamented being limited by the technology available during their lifetime. For centuries, translators were subject to limits inherent to quill, ink, and parchment -- the technology of their time. Then along came the typewriter, followed by the word processor, translation memories, the Internet. Translation continues to evolve with the technologies of the times.

So it goes with interpreting as well. As communication technologies advance, interpreting is being made available in ways that it never was before. There are many technologies used to deliver interpreting services from a distance today. This month I want to focus on one that, while still in its infancy, is having a profound effect on the way interpreting is being provided: WebRTC.

WebRTC stands for "Web Real Time Communication." Here's the "techy" explanation. WebRTC supports "browser-to-browser applications for voice calling, video chat, and P2P [peer to peer] file sharing without the need of either internal or external plugins." (See Wikipedia entry here.)

That sounds suspiciously similar to Skype, and other web conferencing services like GoToMeeting or WebEx. So, what's the big deal? The short answer is that it is all about the "plugins," as in, we won't need them anymore.

The Internet has made the web browser (e.g., Chrome, Safari, Firefox) the computer's most used and recognizable tool. A browser is our gateway to all that cyberspace has to offer. As users began to demand more of their web browsers in the early 2000s, like watching streaming video, playing videogames, and communicating with friends, traditional HTML code just couldn't keep up. So along came plugins -- those pesky add-ons that have to be downloaded, often crash, and need to be updated regularly. What is more, some of them can slow your computer to a crawl and expose you to vulnerabilities that can be exploited by hackers. To make matters worse, many plugins are not compatible across different web browsers or operating systems.

What this has meant for the user experience is more clicks to connect and slower connection times to web meetings or having to update and reboot a computer unexpectedly before being able to connect to a web conference or webinar. All frustrating when you have people waiting for you to connect. With the advent of mobile devices like smart phones and tablets, things became even more complex.

How does WebRTC change all this? Simple. It does away with the need for plugins. As the programmers like to say, the ability to start a phone or video call is "baked in" to your web browser. No plugin or software download needed. You will not need to open Skype to make an Internet-based phone call, or WebEx's or Adobe's "add on" to participate in a webinar. This new capability is simple to use and is currently the backbone of several remote platforms designed to provide interpreting over the Internet.

As I said at the beginning, WebRTC is still in its infancy. It is not supported by all browsers yet. Currently Google Chrome, Mozilla Firefox, and Opera support it. Microsoft's Internet Explorer and Apple's Safari do not. However, Microsoft's new web browser Edge, which is slated to replace Internet Explorer, will. Google is currently tracking some 750 companies that use WebRTC for browser-based applications.

In practical terms, what this means for interpreters is that their computer with broadband connectivity can now easily be converted into an interpreting studio, which is why startups on three different continents are working hard to gain a foothold in the remote interpreting market.

Ultimately, technologies like WebRTC make cross-border communication easier than ever from just about any Internet-enabled device. Interpreters ready to adapt to this new online environment stand to benefit by expanding their service offering to a new segment of the interpreting market. Communication technology continues to advance and interpreting is adapting along with it.

Do you have a question about a specific technology? Or would you like to learn more about a specific interpreting platform, interpreter console or supporting technology? Send us an email at inquiry@interpretamerica.com.

ADVERTISEMENT

The Words You Want. Anywhere, Anytime

Let WordFinder open a new world of opportunities -- get access to millions of words and translations from the best dictionaries, on your computer, via a web browser, on your smartphone or tablet. Stuffed with lots of smart features. WordFinder has what you need as a translator in your everyday work -- anywhere, anytime!

Read more at www.wordfinder.com.

5. Migrating to Linux

Andrew Rennison (you can find him on Twitter right here), one of our colleagues and readers of the Tool Box Journal, has painstakingly recorded his migration from Windows to Linux. Here is his story:

Why did I switch in the first place? It was a combination of many things. I've always been happy with Windows 7, and reinstalling the OS if something went wrong was never really a problem as you could legally download the ISO from Microsoft and install the necessary drivers via a Windows update. I tried Windows 8 but really couldn't get along with switching between the tiles and desktop (Classic Shell's Start menu fixed that, but I didn't really see any other benefits).

And then came Windows 10. A friend of mine upgraded from Windows 7, decided he didn't like it (I don't know why), so he tried to roll back. It didn't work so he had to start from scratch, but the only version of Windows he had was in English and he only speaks German. As Microsoft is pushing Windows 10 hard, they've taken down the legal ISO links to Windows 7 so my friend is no longer able to get a German version.

This, combined with the recurring Microsoft prompt to upgrade to Windows 10, made me give some serious thought to going open source. I know that this doesn't always go smoothly as Linux doesn't always support your hardware, so I decided to err on the side of caution by turning a Windows 7 laptop into a dual-boot system with Linux Mint. I chose Mint because it receives good reviews and is close to Windows in terms of usability as it also has a Start menu.

After playing around with that setup for a while, I realized that switching over wouldn't be a major problem as I still have access to my network-attached storage, while PDF files are no problem and media files can be run with VLC, Clementine and the like.

I basically saw two potential pitfalls: LibreOffice and TEnTs. Don't get me wrong, I like LibreOffice (it reminds me of good old Office 2003), but if a client sends me a Word file that contains any formatting that's slightly out of the ordinary, LibreOffice generally won't play ball and Microsoft Office needs to come to the rescue. While scouring some forums I read about PlayOnLinux, a tool that allows you to run certain Windows software. It basically installs the right version of Wine (a compatibility layer for running Windows programs, often loosely called an emulator) for you in a virtual drive. I decided to give it a go with Office 2007 and it works! (Well, Word, Excel, and PowerPoint work; Outlook and Access don't, but I don't need those two anyway.)

So far, so good. Now to TEnTs. In my 10 years as a translator I've used Wordfast Classic, SDL Trados Studio, and memoQ. I also tried Across, which is free for freelancers, but wasn't too keen on the fact that you're more or less stuck in their environment. Trados is extremely versatile and generally very reliable, but I found it taxing just to set up a project and actually start translating. I'm not a power user, though, and I'm sure practice would make perfect in this regard. My preferred tool to date was definitely memoQ as I found it quick and easy to get going and could also move projects between computers (I work on more than one machine -- more on that later).

I then looked into what I could use on Linux. My online reading presented me with three main options: OmegaT, Swordfish, and Wordfast Pro.

None of my preferred TEnTs supports Linux natively, which is a real shame. I honestly think that providers are missing out here. I realize that there aren't all that many translators running Linux, hence the amount of work involved in maintaining several OS versions is deemed too high. I don't own any Apple products myself, but I understand their popularity due to the fact that they just work (with each other). But TEnT providers simply don't see any need to budge from their Windows pedestal, instead forcing Mac users into dual-boot or virtual systems.

Given my success with Linux Mint, I decided to get a small laptop and install Ubuntu on it to see how I liked using that. This is somewhat of a change from Windows if you go with the "Unity" interface as I did, but it didn't take me very long to find my way around, install updates, and the like.

Next I decided to try out the TEnTs I mentioned above. I started off with OmegaT. This is actually available in the Ubuntu Software Center, so installation was a breeze.

After launching OmegaT, I fed it a short Word file as a source document and imported a TMX translation memory and TXT glossary. This all went off without a hitch, but then I saw the editing pane. On the one hand, I'm perhaps spoiled by the WYSIWYG offerings of memoQ and Trados; on the other hand, I work for an online company where we use a content management system without WYSIWYG functionality, meaning I need to understand and be able to use HTML tags (at least basic ones such as bold, ordered list, and the like), so I think I'm used to handling tags to some extent.

Unfortunately, OmegaT was a minefield of tags. Granted, the source text did include words in bold and italics, but there were tags in the middle of words and dotted around words. It basically made it nigh on impossible to read the sentence at hand. I realize that people like David Turner have come up with great little tools such as CodeZapper to alleviate such problems, but it's another step in the process and one that would be unnecessary if I were to use a different TEnT (I originally translated the file in memoQ, which is how I know the source file isn't at fault).
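
To illustrate why tags in the middle of words make a segment so hard to read, here is a minimal Python sketch. The sample segment and the tag pattern are assumptions made up for illustration; they are not taken from the file described above and should not be read as any tool's documented tag format. The sketch only strips the tags to produce a readable preview and is in no way a substitute for a cleanup tool such as CodeZapper.

```python
import re

# Matches short placeholder tags such as <f0>, </f0>, or <t1/>.
# This pattern is an assumption for illustration only.
TAG = re.compile(r"</?[a-z]\d+/?>")

def readable(segment):
    """Return the segment with placeholder tags removed, for reading only."""
    return TAG.sub("", segment)

# Hypothetical segment with formatting tags splitting two words.
sample = "Click <f0>Sa</f0><f1>ve</f1> to <f2>con</f2>tinue."

plain = readable(sample)              # the sentence a reader wants to see
tag_count = len(TAG.findall(sample))  # how many tags were in the way

print(plain)
print(tag_count)
```

A preview like this is for reading only: the tags carry the formatting, so they still have to be placed in the actual translation.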

Although OmegaT is free, I couldn't warm to it. Next on my list was Swordfish. This runs in Java, which is why it also runs on Linux. I managed to install Swordfish without a hitch, but running the program proved to be a little more difficult. In Windows you click on an .exe file; with Swordfish you click on the .sh file. What I didn't know is that you have to tell Linux to make such files executable. This turned out to be a simple setting in the Ubuntu preferences, but it took me about an hour to find that out as it kept wanting to open the file in a text editor whenever I clicked on it. Once I'd got past that hurdle, I was able to launch the trial version and take the program for a spin. Swordfish had no problems with the source file that proved irksome in OmegaT. There were some tags in there, of course, but nowhere near as many and certainly not in the middle of words.
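
For readers who prefer to script this step, the execute permission on a file can also be set programmatically. The following is a minimal Python sketch, assuming a throwaway file named launcher.sh as a stand-in; the name is hypothetical, and the actual Swordfish launcher script is not named here. It adds execute permission for the file's owner only, which corresponds to `chmod u+x` in the terminal.

```python
import os
import stat
import tempfile

def make_executable(path):
    """Add execute permission for the file's owner (like `chmod u+x`)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR)
    return os.stat(path).st_mode

# Demonstration on a throwaway file standing in for a launcher script.
with tempfile.TemporaryDirectory() as folder:
    launcher = os.path.join(folder, "launcher.sh")  # hypothetical name
    with open(launcher, "w") as handle:
        handle.write("#!/bin/sh\necho started\n")
    os.chmod(launcher, 0o644)            # start out non-executable
    new_mode = make_executable(launcher)
    is_runnable = os.access(launcher, os.X_OK)

print(oct(stat.S_IMODE(new_mode)))
```

Note that marking the file as executable is only half of the story described above: how a file manager reacts to a click on an executable script is a separate desktop setting.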

The next problem I encountered was with glossaries. Creating and importing a database designated as a glossary was absolutely fine, but to run them you need something called WebKitGTK+. After googling the term and visiting the website hosting the code, it looked like I'd need to compile this. At this point I started to think that this wasn't perhaps such a good idea after all. Again, working for a tech company put me in the luxurious position of being able to ask a colleague who runs Linux servers at home and develops Android apps, so he knows a thing or two about Linux. He had a look and said that I may indeed need to build WebKitGTK+, but it may just be as simple as a one-line command in the terminal to get the necessary code installed from the repository. We gave the command a try and it worked like a charm! After that it was plain sailing and I was able to run glossaries and even import the entire dict.cc vocabulary list so I can perform a dictionary search without having to leave Swordfish. To be honest I didn't even proceed to demo Wordfast Pro as I found that Swordfish suits me just fine and has the added bonus that I can take my license with me, deactivating it on one computer and then activating it on another.

Speaking of moving from one computer to another, I run several computers -- I have an Intel NUC hooked up to a 29-inch monitor as my main rig (this still has Windows 7 on it) and a couple of laptops that now run Ubuntu so I can take my work with me if necessary. If you're working on a small laptop screen, it can be a little tiresome switching between programs. Ubuntu gives you the option of several workspaces so you can have your TEnT maximized in one workspace, then with a simple key combination you can switch over to another workspace with your browser or some other program or file open. I believe Windows 10 now offers this functionality as well, but I thought it was worth mentioning. In a previous newsletter Jost mentioned that it's now also possible to install Windows in different languages. That's a welcome change as this was once the preserve of the Ultimate versions of Windows, but I noticed that Linux also offers language switching (you just need to log out and back in again for the change to take effect).

While talking to my work colleague about the situation with TEnT providers in terms of compatibility and pricing, he pointed out that the developer software world used to be in a similar position. The big difference is that the developers got tired of paying large sums for the software they use to, well, build software, so they basically looked at the insides of the software and started building their own versions. According to my colleague, almost all of the software he now uses is either free of charge or costs a fraction of what it used to, simply because developers are in a position to reproduce the software, thus influencing the market.

As translators we don't generally enjoy that luxury and simply want an operating system and TEnT that helps us get the job done reliably and on time. I'm by no means a techie myself, but I'm prepared to invest some of my free time to try and get things to work as they should or as I want them to.

I suppose the point of this whole message is to encourage fellow translators not to shy away from other opportunities. Linux definitely isn't for everyone. While getting Ubuntu up and running, I ended up doing quite a lot in the terminal. This situation seemed like a combination of Windows 7 and Windows 95: the Unity interface seems fairly modern (although there are a lot of Linux users out there who hate it), while the terminal reminds me of my gaming days as a teenager when I had to run DOS commands to get the CD-ROM to work (I never got beyond that, though). The good thing about choosing a common distribution like Ubuntu or Mint is that there are countless forums where you can post any problems you're having and people who know a lot more than you can try and help out. Often you'll find the answer to your problem is already there because someone else has encountered the same thing.

Anyway, I hope reading this wasn't a waste of your time.

Not sure about you, but it certainly wasn't for me! Thanks so much, Andrew.

6. New Password for the Tool Box Archive

As a subscriber to the Premium version of this journal you have access to an archive of Premium journals going back to 2007.

You can access the archive right here. This month the user name is toolbox and the password is frostylawns.

New user names and passwords will be announced in future journals.

The Last Word on the Tool Box Journal

If you would like to promote this journal by placing a link on your website, I will in turn mention your website in a future edition of the Tool Box Journal. Just paste the code you find here into the HTML code of your webpage, and a little icon with a link to my website will be displayed on that page.

If you are subscribed to this journal with more than one email address, it would be great if you could unsubscribe redundant addresses through the links Constant Contact offers below.

Should you be interested in reprinting one of the articles in this journal for promotional purposes, please contact me for information about pricing.