Preloading Machine Translation
Thread poster: Ian Kahn
Ian Kahn
United Kingdom
German to English
Dec 4, 2019

Hey everybody,

Let's say I'm working on a project with no access to the internet.

Is there any way for OmegaT to "pre-load" machine translations (I use the Google Translate API) that I can then see while I'm translating each line?

Would be pretty helpful and seems pretty simple to do.


 
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
@Ian Dec 4, 2019

Ian Kahn wrote:
Is there any way for OmegaT to "pre-load" machine translations that I can then see while I'm translating each line?


It's simple enough, but it will take some work.

1. Extract all segments from OmegaT.
2. Translate those segments using Google Translate.
3. Align the extracted source text with the machine translated text to create a TMX memory.
4. Put that TMX file somewhere in the project's /tm/ subfolder so that OmegaT picks it up.
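
If the extracted source lines and the machine-translated lines are already in the same order, one segment per line, step 3 may not need a separate alignment tool. Here is a minimal Python sketch of that idea. The function name `build_tmx`, the language codes `de` and `en`, and the header values are all assumptions made for illustration; nothing here is prescribed by OmegaT.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this qualified name as the attribute "xml:lang".
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(sources, targets, src_lang="de", tgt_lang="en"):
    """Pair source and target lines by position and return a TMX document as a string."""
    if len(sources) != len(targets):
        raise ValueError("source and target line counts differ")

    tmx = ET.Element("tmx", {"version": "1.4"})
    # Header values are illustrative placeholders (assumption).
    ET.SubElement(tmx, "header", {
        "creationtool": "mt-preload-sketch",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")

    for src, tgt in zip(sources, targets):
        if not src.strip():
            continue  # skip pairs whose source line is empty
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text

    return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(tmx, encoding="unicode")
```

To use it, read both text files with `splitlines()`, pass the two lists to `build_tmx`, and write the result to a `.tmx` file opened with `encoding="utf-8"`. If the line counts differ, the files are not aligned line by line and a real aligner is the safer route.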

Ways to extract all segments from OmegaT:
- using a script (usually bundled with OmegaT) that writes source and/or target to a file.
- by using Ctrl+F, then selecting "regular expressions" and setting the number of results to 100 000, and then searching for ".".
- if it's all one file, by simply selecting all text in OmegaT and copy/pasting it to a plain text file.

To translate all segments using Google Translate, you can use the AutoIt script mentioned here:
https://www.proz.com/forum/cat_tools_technical_help/308360.html

You're going to have to google for how to align two files. There is e.g. LF Aligner.

I suggest you put the TMX in a subfolder called tm/penalty-10/ so that OmegaT lowers the match percentage of every match from the machine-translated TMX file by 10 percentage points.
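
Placing the file can be scripted too. This is only a sketch: the helper name `install_mt_tmx` is hypothetical, and it assumes the usual project layout with a tm/ folder directly under the project folder.

```python
import shutil
from pathlib import Path


def install_mt_tmx(project_dir, tmx_file, penalty=10):
    """Copy tmx_file into <project_dir>/tm/penalty-<penalty>/ and return the new path."""
    if not 0 <= penalty <= 100:
        raise ValueError("penalty must be between 0 and 100")
    dest_dir = Path(project_dir) / "tm" / f"penalty-{penalty}"
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the subfolder if it is missing
    return Path(shutil.copy2(tmx_file, dest_dir))
```

For example, `install_mt_tmx("my_project", "google_mt.tmx")` would copy the file to my_project/tm/penalty-10/google_mt.tmx (both names are made up here).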

Finally, the DGT fork of OmegaT has some interesting features regarding machine translation, so check it out (it contains pretty much the same features as the official OmegaT, and there is no bad blood between the developers):
http://185.13.37.79/?q=node/31


 
Ian Kahn
United Kingdom
German to English
TOPIC STARTER
Thanks Dec 4, 2019

Samuel Murray wrote:

It's simple enough, but it will take some work.



Thanks so much Samuel! That's really helpful!


 
tcordonniery
France
Call pre-translation with DGT-OmegaT Dec 21, 2019

Samuel Murray wrote:
Finally, the DGT fork of OmegaT has some interesting features regarding machine translation, so check it out (it contains pretty much the same features as the official OmegaT, and there is no bad blood between the developers):
http://185.13.37.79/?q=node/31


Thanks, Samuel, for mentioning DGT-OmegaT.
I have indeed added some features that could help with this problem.

On the command line, I extended the pseudo-translation feature (in standard OmegaT it can only produce an empty translation or a translation identical to the source):
java -jar OmegaT.jar --console-pseudotranslatetmx --pseudotranslatetmx=/where/to/put/file.tmx --pseudotranslatetype=Google2
(technically, Google2 is replaced by org.omegat.core.machinetranslators.net.Google2Translate, which is the class that gets called)

This will generate a TMX which you can put in the mt/ folder, and then you can work offline. The translations will appear in the MT pane, but labelled "Local" rather than "Google".
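
If you want to run that command for several projects, a small wrapper can build the argument list. The following Python sketch uses only the options quoted in the command above; the function name and its defaults are assumptions, and anything else a given DGT-OmegaT build expects on the command line is not shown in the post and is not added here.

```python
def pseudotranslate_command(tmx_out, engine="Google2", jar="OmegaT.jar"):
    """Return the argument list for the DGT-OmegaT command quoted above."""
    return [
        "java", "-jar", jar,
        "--console-pseudotranslatetmx",
        f"--pseudotranslatetmx={tmx_out}",    # where the generated TMX is written
        f"--pseudotranslatetype={engine}",    # short name of the MT engine
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` while you are still online; the resulting TMX then goes into the mt/ folder as described.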

Another possibility is to use the menu "Edit => Search & Pre-translate" from the UI.


 

