A computer journal for translation professionals

Issue 14-6-237
(the two hundred thirty-seventh edition)

Contents

1. The Translator Must Always Be the Boss! (Premium edition)

2. Excelling Beyond Measure

3. Cross with Across

4. New Password for the Tool Box Archive

The Last Word on the Tool Box Journal

Mobility

So, yes, I finally started using a smart phone a few months ago -- my wife told me that she'd had enough of all the fremdschämen she went through when she imagined me trotting around the globe, speaking about technology, and carrying a pre-Graham-Bell cell phone -- and I now use it very happily when I'm traveling. It's lovely (and frightening) to be in touch with clients and partners through email and social media from, well, anywhere. (If you read German, I would recommend Siggi Armbruster's clever article on the dual nature of availability in the translation world.) And clearly that's a big part of what our job is about: communication. However, the core part of our job -- translation -- is still a long way from being used productively in a mobile environment (I consider hybrid solutions like some keyboarded tablets no more mobile than notebook computers).

It struck me the other day that until we no longer see "Sent from a mobile device. Please excuse any typos" at the bottom of emails, we won't be using our mobile devices in a truly productive fashion.

ADVERTISEMENT

The Words You Want. Anywhere, Anytime

Let the new WordFinder open a new world of opportunities -- get access to millions of words and translations from the best dictionaries, on your computer, via a web browser, on your smartphone or tablet. Stuffed with lots of smart features. The new WordFinder has what you need as a translator in your everyday work - anywhere, anytime!

Read more at www.wordfinder.com

1. The Translator Must Always Be the Boss! (Premium edition)

National Transportation Safety Board chairman Christopher Hart said this about the investigation into the 2013 crash of an Asiana Airlines plane in San Francisco: "... we have learned that pilots must understand and command automation, and not become over-reliant on it. The pilot must always be the boss." When I heard this on the radio this week, a big smile appeared on my face. What a perfect illustration of what I would like to share about translators and machine translation!

Recently I was invited to represent translators as a speaker at the European Association for Machine Translation (EAMT) conference, and I'm happy to report that it was a very rewarding event. One of the points in my presentation was a call to put an end to the assumption that post-editing is the only and most productive way to use machine translation.

You see, I think that we've looked at machine translation from only one angle -- that of post-editing -- and judged its effectiveness and desirability from that single perspective. And it's no wonder that the majority of professional translators haven't embraced post-editing of machine translation (PEMT). In the PEMT process, the proposed machine translation is the driving force; it is the agent (or -- to use the NTSB chairman's words -- "the boss"), and it objectifies the translator or post-editor, who is reduced to a purely reactive role.

One reason I have been pushing the term "translation environment tool" (TEnT) is that it places the translator in the very center of a process supported by the tools she needs to successfully work on and finish her translation. There really is no place for post-editing in that kind of environment. Why? Because post-editing merely reduces the translator to another one of the tools.

So what other ways could there be to use machine translation? I outlined a few of them as predictions for 2012, and now -- two years later -- they seem to be coming to fruition.

One that I've mentioned a number of times before is using MT proposals to repair fuzzy TM matches. This is a very simple process that can prove to be very powerful. The idea is that if your TEnT can recognize which part of a translation unit is causing the match to be fuzzy (i.e., the "offending" part), and it can identify which part in the target unit corresponds to that, the tool can just as well go out to an associated machine translation engine, get a proposal for the term or subsegment, and replace the "offending" target part.

Following is an example where the fuzzy match ("Die Augen der Katze sind braun") was turned into a perfect match by automatically identifying the "offending" part ("braun") and automatically replacing it with the correct term that it pulled from a machine translation engine. (Since in this case there were several MT engines connected to the project, there was actually more than one proposal; this becomes apparent by "gelb" being underlined.)

Chances are you will still have to do some clean-up after that (or you might not be able to use the MT insertion at all), but you can see that this has the potential for a great deal of productivity improvement. In fact, if implemented well, it might even be possible to lower the fuzzy threshold (which most of us probably have at around 85% at the moment) to benefit more from TM content. Tools that presently support this include Déjà Vu X2/3 (that's the tool I used for the example above) and Star Transit NxT, but the makers of Wordfast Classic/Pro, XTM, and Wordbee are also working on implementing this feature. (I think if someone were able and willing to implement the same in a Trados Studio app, it would quickly become one of the favorite downloads.)
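For the programmers among you, the repair logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Déjà Vu or any other tool actually implements it: the English source sentences, the word alignment, and the toy MT lookup are all hypothetical, and the sketch handles only a single substituted word.

```python
from typing import Callable, Optional


def repair_fuzzy_match(
    new_source: str,
    tm_source: str,
    tm_target: str,
    alignment: dict[str, str],
    mt_lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return tm_target with the one "offending" word replaced, or None if no repair is possible."""
    new_words = new_source.split()
    tm_words = tm_source.split()
    # This sketch only handles a single-word substitution.
    if len(new_words) != len(tm_words):
        return None
    diffs = [(old, new) for old, new in zip(tm_words, new_words) if old != new]
    if len(diffs) != 1:
        return None
    old_word, new_word = diffs[0]
    old_target = alignment.get(old_word)  # target word tied to the offending source word
    replacement = mt_lookup(new_word)     # MT proposal for the new source word
    if old_target is None or replacement is None:
        return None
    target_words = tm_target.split()
    # Give up if the offending target word cannot be located unambiguously.
    if target_words.count(old_target) != 1:
        return None
    return " ".join(replacement if w == old_target else w for w in target_words)


# Hypothetical data: the English sentences, alignment, and MT dictionary are assumptions.
toy_mt = {"yellow": "gelb"}
repaired = repair_fuzzy_match(
    new_source="The cat's eyes are yellow",
    tm_source="The cat's eyes are brown",
    tm_target="Die Augen der Katze sind braun",
    alignment={"brown": "braun"},
    mt_lookup=toy_mt.get,
)
print(repaired)  # Die Augen der Katze sind gelb
```

A real tool would need proper subsegment alignment and would have to deal with inflection and word order, which is exactly why some clean-up may still be necessary.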

That's one example of how machine translation can be used more productively and in a more translator-centric way.

Another translator-centric approach, one that I've only recently come to understand as potentially highly productive, is the AutoComplete (or AutoWrite or AutoSuggest, depending on the tool's nomenclature) feature that Wordfast Classic, Déjà Vu X2/3, and Trados Studio are already offering (in the case of the latter, through some apps from SDL's OpenExchange app store).

This is how the process works: Rather than presenting machine translation proposals that need to be post-edited, it's the translator's keystrokes that prompt suggestions from machine translation engines -- very much along the lines of the translator as the agent. If, as in the case of the example below, there is more than one machine translation engine associated with the project, all of them are interactively polled for matches with the keystrokes that have already been entered, and the suggestions are displayed as an AutoComplete tool tip that you can select by pressing the Enter key or with a mouse click.

This is particularly interesting because it shows more than just the complete matches for the whole segment as you translate; it also shows the subsegments, which continue to adjust themselves to what you type. Now, for those among you who read German, you will recognize that none of the suggestions is perfect as a completion for what had already been entered, but the third suggestion comes very close to a possibility, with only the need to change the ending of the verb.
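To make the mechanism a little more concrete, here is a minimal Python sketch of keystroke-driven suggestions. It assumes that the proposals from all connected engines have already been fetched as plain strings; the function name, the German proposals, and the four-word limit for subsegments are hypothetical and not taken from any of the tools mentioned.

```python
def autosuggest(typed: str, proposals: list[str], max_words: int = 4) -> list[str]:
    """Suggest completions for what the translator has typed so far.

    Whole-segment proposals that start with the typed text come first,
    followed by short subsegments that start with the word being typed.
    """
    if not typed.strip():
        return []
    suggestions: list[str] = []
    # 1. Whole proposals that continue exactly what has been typed.
    for proposal in proposals:
        if proposal.lower().startswith(typed.lower()) and proposal not in suggestions:
            suggestions.append(proposal)
    # 2. Subsegments that start with the word currently in progress.
    if typed.endswith(" "):
        return suggestions  # no word in progress yet
    current = typed.split()[-1].lower()
    for proposal in proposals:
        words = proposal.split()
        for start, word in enumerate(words):
            if word.lower().startswith(current):
                chunk = " ".join(words[start:start + max_words])
                if chunk not in suggestions:
                    suggestions.append(chunk)
    return suggestions


# Hypothetical proposals from two different MT engines.
engine_proposals = [
    "Die Augen der Katze sind gelb",
    "Die Katzenaugen sind gelb",
]
print(autosuggest("Die Au", engine_proposals))
# ['Die Augen der Katze sind gelb', 'Augen der Katze sind']
print(autosuggest("Die gelben Augen der Ka", engine_proposals))
# ['Katze sind gelb', 'Katzenaugen sind gelb']
```

The second call shows the translator-as-agent idea: the typed text no longer matches any complete proposal, yet subsegments from both engines are still offered.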

I recently finished a large project where I tested this way of working with MT. I really liked that it saved a lot of time while also avoiding the unfortunate effect of preoccupying the translator's mind with suggestions (an effect that most of you are familiar with from both TM and MT). In fact, I found myself more often than not just choosing a few words and phrases here and there from MT suggestions rather than lengthy segments. Still, I know that I was able to work significantly faster.

The example above is done with Google Translate, Microsoft Bing Translator, and Systran, but at the conference I realized that there is so much more that can be done with such a system.

Here is a case in point: While the European Patent Office (EPO) has chosen to collaborate with Google to build up a version of Google Translate for patent translation, WIPO, the World Intellectual Property Organization, strongly rejected that concept and built its own machine translation engine based on the open-source Moses platform. So far there are only a handful of languages available (English into and out of French, Japanese, Chinese, and German), but it has some features that are really interesting. Especially when you go to its "interactive" edition right here, you can see a couple of the more advanced features.

In the lower half of the window (which appears once you send a text for translation), you can see that when you select any part of the source segment, a translation for that subsegment is entered into the target pane. Of particular interest, the MT engine not only gives you that one match but also a list of other possible matches in the dropdown box right under Proposals. That's how machine translation works -- for any segment or subsegment, a very large number of proposals are generated, but typically only the one with the highest (internal) confidence score is actually used and exposed to the user. The WIPO developers have managed to show us other matches as well.
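The difference between exposing only the top candidate and exposing several can be shown with a small Python sketch. The candidate translations and their confidence scores below are invented for illustration; real engines compute such scores internally and in their own ways.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    score: float  # engine-internal confidence; higher is better (hypothetical values)


def best(candidates: list[Candidate]) -> str:
    """What a typical MT interface exposes: only the top-scoring proposal."""
    return max(candidates, key=lambda c: c.score).text


def n_best(candidates: list[Candidate], n: int = 3) -> list[str]:
    """What an interface with a proposals list exposes: the top n proposals."""
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    return [c.text for c in ranked[:n]]


# Invented candidates for the subsegment "brown".
candidates = [
    Candidate("brünett", 0.12),
    Candidate("braun", 0.81),
    Candidate("bräunlich", 0.07),
]
print(best(candidates))       # braun
print(n_best(candidates, 2))  # ['braun', 'brünett']
```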

It would be interesting for translators who want to use a translator-centric MT process to have all the possibilities -- or at least some of them -- displayed as AutoSuggestions.

Google Translate in its latest incarnation does something similar. While it gives the translation for the whole segment, it also lets you click on individual subsegments to see alternate translations. To some degree, Xingzeng Liu, the developer of the free Trados Studio app Google Translate AutoSuggest, uses this mechanism. After you install the app, it looks for your first keystrokes and suggests the translation of the complete segment from Google Translate if it matches the characters you have entered. If your keystrokes don't match the Google Translate proposal or you choose not to use it, it'll suggest subsegments according to the subsegmentation that Google Translate has already performed. Unfortunately, it doesn't also suggest the alternate translations, which would be very nice.

(If you do use Google Translate as a translation aid and are a Trados user, you should definitely think about using Xingzeng's tool because he has found a way to retrieve the Google Translate matches without forcing you to pay for the API access -- enjoy this while it lasts.)

While we're talking about Trados AutoSuggest apps for machine translation access: CodingBreeze's MT AutoSuggest app is interesting as well. This works with any kind of machine translation engine you might have associated with your project and gives you AutoSuggest proposals for subsegments. It bases the delimitation of those subsegments on things like segment-internal punctuation (including commas, semicolons, etc.) and a certain number of words. It does not yet propose suggestions from more than one machine translation engine at a time, but I talked to the developers and they promised me that they would work on that next.
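Subsegment delimitation along those lines could look like the following Python sketch. The actual rules of the app are not documented here, so the choice of punctuation marks and the five-word limit are assumptions made purely for illustration.

```python
import re


def split_subsegments(segment: str, max_words: int = 5) -> list[str]:
    """Split an MT proposal at segment-internal punctuation, then cap each piece at max_words words."""
    # Assumed delimiters: comma, semicolon, colon.
    pieces = [p.strip() for p in re.split(r"[,;:]", segment) if p.strip()]
    subsegments: list[str] = []
    for piece in pieces:
        words = piece.split()
        # Cut long pieces into consecutive chunks of at most max_words words.
        for start in range(0, len(words), max_words):
            subsegments.append(" ".join(words[start:start + max_words]))
    return subsegments


proposal = "If the lamp is flashing, switch the device off; then wait ten seconds before you restart it."
subsegments = split_subsegments(proposal)
print(subsegments)
# ['If the lamp is flashing', 'switch the device off', 'then wait ten seconds before', 'you restart it.']
```

Each of these subsegments could then be offered as an AutoSuggest entry once the translator's keystrokes match its beginning.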

(Incidentally, if you would like to use the Trados app for the Microsoft Bing Translator and are using Trados Studio 2014, you can find a workaround to install the 2009/2011 version right here.)

So to come back to the purpose of this article: these two translator-centric ways of using MT have already been implemented in some commercial tools, and others will follow soon (laboratory tools like CASMACAT or MateCat are also experimenting with more advanced MT features, but they're not really in the production phase). There is plenty that can be done better, and there are plenty of other, more creative uses of MT, but what's important in all of this is that we need to stop talking about post-editing as if this were the only way to approach MT. I am willing to bet that with a tool that has a well-implemented AutoSuggest feature drawing on one or several well-trained MT engines, professional translators can produce high-quality output in a more productive manner than by post-editing raw MT output from one engine. And they can do that while respecting and utilizing their human translation skills and environment.

ADVERTISEMENT

Post-Editing Machine Translation Certification

New course provided by SDL, for free

SDL has developed a course with associated certification for all translators who wish to extend their skillset and learn about Post-Editing Machine Translation.

"SDL has a strong heritage with Machine Translation spanning almost 15 years. With the SDL Post Editing Certification, we are really excited to be sharing this expertise and experience with the wider industry in the hope of removing some of the mystique and fear around PEMT. Freelancers are increasingly being asked to accept Post-Editing jobs - but are wary of taking the plunge. This easily-digestible program is designed to give folk the skills and confidence to get started, without any financial outlay."

Melissa Kane, Director of Intelligent Machine Translation Solutions, SDL

2. Excelling Beyond Measure

Yes, I've mentioned ASAP Utilities before, but the point is this: Can those of you who have used this super-helpful tool ever since you first read about it in the Tool Box Journal imagine the joy it brings to someone who has never heard of it and will now discover it?

The guy behind ASAP Utilities is pretty good about talking up his tool himself, so I will let him do that job on his website. But if you, like most translators, still have to deal with Excel-based data quite a bit and sometimes wish that you had more flexibility to manipulate the data in almost any way you see fit, you should make use of the more than 300 individual tools that ASAP Utilities offers right within Excel.

Some of my favorite functions include the ability to count characters in individual cells, helpful formatting and selection functions, and filtering options that I did not even know existed.

There is a free version you can use, but after using it for many years I have actually decided to pay up (and I felt really good about myself afterward).

Please note that the tool will not work if you are using a 64-bit version of Excel.

Another Excel tool is Synkronizer, which I have not used myself but which comes with the very reliable recommendation of Renato Reinau (who as a dual citizen of Argentina and Switzerland must feel pretty good about himself in relation to all things football right now...).

Synkronizer allows you to compare and merge two Excel files in very advanced ways, something that could be particularly useful when dealing with different versions of glossaries. You can find more information right here.
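For readers who like to see what "compare and merge" means for glossaries, here is a minimal Python sketch. It has nothing to do with how Synkronizer works internally; it assumes that each glossary version has already been read into a dictionary of source term and target term, and the entries themselves are made up.

```python
def compare_glossaries(old: dict[str, str], new: dict[str, str]) -> dict[str, dict]:
    """Report which entries were added, removed, or changed between two glossary versions."""
    added = {term: new[term] for term in new if term not in old}
    removed = {term: old[term] for term in old if term not in new}
    changed = {
        term: (old[term], new[term])
        for term in old
        if term in new and old[term] != new[term]
    }
    return {"added": added, "removed": removed, "changed": changed}


def merge_glossaries(old: dict[str, str], new: dict[str, str]) -> dict[str, str]:
    """Merge two versions; where both define a term, the newer version wins (an assumed policy)."""
    return {**old, **new}


# Made-up glossary versions (English -> German).
version_1 = {"fuzzy match": "Fuzzy-Match", "segment": "Segment", "target": "Ziel"}
version_2 = {"fuzzy match": "unscharfe Übereinstimmung", "segment": "Segment", "source": "Quelle"}

report = compare_glossaries(version_1, version_2)
print(report["added"])    # {'source': 'Quelle'}
print(report["removed"])  # {'target': 'Ziel'}
print(report["changed"])  # {'fuzzy match': ('Fuzzy-Match', 'unscharfe Übereinstimmung')}
merged = merge_glossaries(version_1, version_2)
```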

3. Cross with Across

Across is a translation environment tool that many love to hate -- more on that later -- but there are actually some -- even translators -- who really like it. While numbers are not always the best indicators, they still have some validity, and here are some that were given to me a couple of months ago by the folks from Across. The free version for freelancers and students had a total of 26,979 users with about 900 new monthly registrations. Those numbers may not really be that meaningful because any and every translator who ever worked with a client who used Across had to download and register it -- and many did so several times -- but here is a number that is more meaningful: worldwide there are 950 Across server installations, 65% of which are at translation buyers and 35% at LSPs.

Now Across is just about to release version 6, and I had a chance to talk with Across's Christian Weih about it. In general I think it would be fair to say that the new version will be focused primarily on the server users with a number of business-friendly features. This is not surprising. Not only are those two groups, translation buyers and LSPs, the ones that actually pay for the product, but Across has also spent a fair amount of effort talking to them with events like the LSP Day they just convened.

Christian did promise, though, that in the next version there will be significantly more emphasis on the translation aspect of the tool, which will include invitations to groups of translators who will help shape some of the translation interface that many feel is not particularly ergonomic and user-friendly. One of the features that will be introduced at that point will be the ability to work directly in the translation grid rather than in a separate area like some other tools also offered in the 1990s.

Back to the new version at hand. The new features include a new engine with a much smaller footprint on your computer -- some of you might yawn when you read that, but those who have used Across, especially some of the very resource-heavy earlier versions, will be delighted.

The "What do you want to do" prompt that is displayed when you open Across (which I personally always found kind of silly) is now replaced with a "Dashboard". This Dashboard is sort of reminiscent of the Windows 8 tile view (the Across tiles are called "dashlets") but potentially more useful. Folks who work in Across for most of the day will be able to place customizable links for all kinds of Across and non-Across resources and applications. For those who use Across in a more limited fashion, it will probably not be too relevant.

For project managers, the "Cockpit" (hey, I'm trying to be kind and not make fun of all these names) will probably be the most relevant new feature. Project management was rather rigid (Christian's own words) in Across, with very few ways to customize; now the Cockpit view, which is the central control console with all the projects listed for the project manager, can be filtered and adjusted to any criteria.

For the business manager, the "Data Cube" should be the most relevant feature. The Data Cube is a business intelligence module that allows you to run a large number of preconfigured and customizable reports in Excel, based on data that is automatically extracted from Across. Really the only translation environment tool that I'm aware of that is able to produce a similar range of reports with such ease is Wordbee. I was quite impressed with that feature, and I also liked the fact that it's done within Excel, which not only is the most familiar environment for "business types" but also makes it very easy to share the reports.

Another element that I find interesting is the web-based review mode that includes a change history, a nice feature for customer edits without having to install the desktop client.

As a side note, I thought it was remarkable that for the last few years, development efforts for Across have taken place first in the web client (which typically is harder to program) and are then ported to the desktop client to ensure as much feature parity as possible.

For some groups within the translation chain, version 6 should be a welcome release. For translators it probably does not make that much of a difference -- hopefully this will be different in the next version.

Before I end, let me make a few comments about a topic that has caused a measure of anger among some in the world of translation.

There is a certain kind of political correctness when it comes to technology in general, and translation technology in particular. The "correctest" political correctness is to say that "the tool that we develop supports open exchange standards because we believe in an open world where users can choose whatever they like." Virtually every tool vendor makes this claim and does more or less support the typical standards, including TMX, TBX, and XLIFF, even though no tool vendor truly wants users to choose whatever tool -- that would be business folly.

Across has never supported exchange standards and has recently found that this is actually a marketing message that might just work for the business segment they are aiming at. There is no other company with the gall to say publicly, "We don't believe in open project exchange formats" -- as Christian Weih did earlier this week (again). Some (including some of you) have really, really strong sentiments about that and are not shy about expressing them, but whenever I hear him say that, I can't help but smile. We are not a monolithic industry. We are like a quilt made up of many patches with very, very different interests. If Across customers ask for a closed system that prevents data being taken out or external data being brought in, or if Across is able to convince them that's in their best interest, let them be. It's not great for translators -- I know that, I am one myself and I have (rarely) worked with Across -- but it might just be the way to get to those clients who will either like it or at some point realize that this is not the most current of concepts and then change technology.

For now, I can't but admire Across's honesty and marketing savvy -- and if other developers are really honest, unless they are developing open source software, I imagine there is an element of respect they have for this "my way or the highway" kind of attitude. Even though they would never admit to it.

ADVERTISEMENT

Save 80% of the clicks when setting up translation projects, and localize images & embedded objects with memoQ 2014

Adding significant functionality to the previous version, memoQ 2014 offers a wealth of productivity boosters for freelance translators, language service providers and enterprise customers alike.

Working with memoQ 2014's new project template and workflow automation feature, both freelance translators and project managers can save 80% of the clicks previously needed for project setup and management.

Attend Kilgray's free webinars to learn more about all the most recent developments and give a try to memoQ 2014 by clicking here.

4. New Password for the Tool Box Archive

As a subscriber to the Premium version of this journal you have access to an archive of Premium journals going back to 2007.

You can access the archive right here. This month the user name is toolbox and the password is lamems.

New user names and passwords will be announced in future journals.

The Last Word on the Tool Box Journal

If you would like to promote this journal by placing a link on your website, I will in turn mention your website in a future edition of the Tool Box Journal. Just paste the code you find here into the HTML code of your webpage, and the little icon shown on that page will be displayed on yours, with a link to my website.

If you are subscribed to this journal with more than one email address, it would be great if you could unsubscribe redundant addresses through the links Constant Contact offers below.

Should you be interested in reprinting one of the articles in this journal for promotional purposes, please contact me for information about pricing.