SMARTCAT tons of TAGS translating from French to Italian
Thread poster: Simona Arminio
Simona Arminio
Italy
English to Italian
May 10, 2022

Hello everyone,
I need your help: I have started using Smartcat, but I run into a problem when I upload a file in French.
Wherever a letter has an accent (and in French that is almost every word), Smartcat inserts one of its yellow tags. To confirm the segments, I would have to add all those tags to the right-hand column as well: a nightmare!
It is probably my fault, since I'm a newbie, but I would be so grateful if you could help me.
Thanks a lot!


 
Stepan Konev
Russian Federation
English to Russian
TransTools May 10, 2022

What you describe is called tag soup. It happens when a bitmap image file is OCRed with a tool that cannot process it properly. You have to clean up the file before you import it into Smartcat. You can use TransTools (the better choice), CodeZapper, or similar software.

 
Simona Arminio
Italy
English to Italian
Topic starter
Thanks for suggesting TransTools May 11, 2022

Stepan Konev wrote:

What you describe is called tag soup. It happens when a bitmap image file is OCRed with a tool that cannot process it properly. You have to clean up the file before you import it into Smartcat. You can use TransTools (the better choice), CodeZapper, or similar software.


Thanks a lot for your suggestion, Stepan.
I have a Mac, and apparently there is no Mac version of TransTools.


 
esperantisto
Member (2006)
English to Russian
Site localizer
File format? May 13, 2022

What is the file format that you upload?

 
Simona Arminio
Italy
English to Italian
Topic starter
SMARTCAT tons of TAGS translating from French to Italian May 13, 2022

esperantisto wrote:

What is the file format that you upload?


It's a Word file (saved for Mac).
I also tried to clean up the French source file by removing all the accents, but that is a nightmare as well and makes no sense.


 
Stepan Konev
Russian Federation
English to Russian
You can try this: May 13, 2022

Simona Ar wrote:
It's a Word file
Press Ctrl+A to select all, then press Ctrl+D in MS Word to open the font settings window. Go to the 'Advanced' tab and set 'Scale' to 100%, 'Spacing' to Normal and 'Position' to Normal too. This will remove most tags.
This issue happens because some OCR software recognizes each character individually and may add different formatting to different characters. When formatting changes within a segment, your CAT tool adds a tag. That is why you see so many tags.

[Edited at 2022-05-13 14:13 GMT]
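
For anyone who cannot find these options in the Font dialog, the same three settings can in principle be applied with a short Word macro. This is only a sketch under assumptions: the macro name is made up for illustration, it touches only the main body of the active document (not headers, footnotes or text boxes), and it assumes a version of Word with VBA macro support. Try it on a copy of the file first.

```vba
Sub ResetAdvancedFontSettings()
    ' Hypothetical helper, not part of any tool mentioned in this thread.
    ' Applies the three 'Advanced' font settings to the whole main text
    ' of the active document.
    With ActiveDocument.Content.Font
        .Scaling = 100   ' 'Scale' = 100%
        .Spacing = 0     ' 'Spacing' = Normal (0 pt expanded or condensed)
        .Position = 0    ' 'Position' = Normal (0 pt raised or lowered)
    End With
End Sub
```

`Scaling`, `Spacing` and `Position` are the properties of Word's `Font` object that correspond to 'Scale', 'Spacing' and 'Position' in the dialog. Other formatting differences between characters (font face, language, colour) are not touched here and may still produce tags.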


 
Simona Arminio
Italy
English to Italian
Topic starter
SMARTCAT tons of TAGS translating from French to Italian May 13, 2022

Stepan Konev wrote:

Simona Ar wrote:
It's a Word file
Press Ctrl+A to select all, then press Ctrl+D in MS Word to open the font settings window. Go to the 'Advanced' tab and set 'Scale' to 100%, 'Spacing' to Normal and 'Position' to Normal too. This will remove most tags.
This issue happens because some OCR software recognizes each character individually and may add different formatting to different characters. When formatting changes within a segment, your CAT tool adds a tag. That is why you see so many tags.

[Edited at 2022-05-13 14:13 GMT]


Thanks a lot for your suggestion, Stepan!
I'm trying to apply what you describe on my Mac, but I cannot find the settings window where I can set Scale, Spacing and Position.


 
esperantisto
Member (2006)
English to Russian
Site localizer
Some advice May 14, 2022

I. More or less simple:

1. Open your file in Word.
2. Open the formatting dialog with Ctrl+D, do what Stepan advises, and also set a uniform font face for the entire text.
3. Select the entire text, set the language to None and hit F7 to run the spellcheck (it will report completion almost immediately).
4. Remove multiple spaces, non-breaking spaces, soft hyphens.

Also, try to save to RTF and save back to Word.
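
Steps 2 and 4 above could also be scripted, which may help when the dialogs are hard to find. The following is a minimal sketch, not a tested tool: the macro and helper names are invented for illustration, the font name is a placeholder, and only the main body of the active document is processed. Step 3 (language and spellcheck) is left to be done by hand. Run it on a copy of the file.

```vba
Sub LevelFontAndCleanSpaces()
    ' Hypothetical helper; names and the font are placeholders.
    Dim pass As Integer

    ' Step 2: one font face for the entire main text.
    ActiveDocument.Content.Font.Name = "Arial"

    ' Step 4: turn non-breaking spaces into ordinary spaces
    ' and delete optional (soft) hyphens.
    ReplaceInBody "^s", " "
    ReplaceInBody "^-", ""

    ' Step 4: collapse multiple spaces. Each pass replaces pairs of
    ' spaces with one space, so ten passes handle long runs.
    For pass = 1 To 10
        ReplaceInBody "  ", " "
    Next pass
End Sub

Private Sub ReplaceInBody(ByVal findText As String, ByVal replaceText As String)
    ' Plain (non-wildcard) replace-all over the main text story.
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = findText
        .Replacement.Text = replaceText
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchCase = False
        .MatchWildcards = False
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

`^s` and `^-` are Word's Find codes for a non-breaking space and an optional hyphen. Note that French text normally uses non-breaking spaces before some punctuation marks, so replacing them changes the source layout; that only matters if the cleaned file is reused for anything other than translation.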

II. A slightly more convoluted way: try the following macro by Marc Prior, which levels the character formatting of a paragraph (i.e., it gives every character in the paragraph the same formatting as the last character):


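Sub levelformat()
    ' levelformat macro: levels the character formatting of a paragraph
    ' by pasting its text back over itself as unformatted text.

    ' Move up to a paragraph start, then select through to the start of the next paragraph.
    Selection.MoveUp Unit:=wdParagraph, Count:=1
    Selection.MoveDown Unit:=wdParagraph, Count:=1, Extend:=wdExtend
    ' Pull the end of the selection back by one word (intended to leave the paragraph mark unselected).
    Selection.MoveLeft Unit:=wdWord, Count:=1, Extend:=wdExtend
    ' Copy the selection and paste it back in place as plain text.
    Selection.Copy
    Selection.PasteSpecial DataType:=wdPasteText
End Sub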


…and then also do steps 3 and 4 of the simple procedure.
 
esperantisto
Member (2006)
English to Russian
Site localizer
And the simplest way is… May 14, 2022

1. Open the file in LibreOffice.
2. Press Ctrl+A to select the entire text.
3. Press Ctrl+M to clear direct formatting, which resets the text to its paragraph style.

If none of the above helps, share a sample file for further advice.


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
CafeTran Espresso on Mac May 14, 2022

Simona Ar wrote:

I have a Mac and apparently there is no Mac version for TransTools


My suggestion:

Download CafeTran Espresso from www.cafetran.com.

Install it.

Drag the MS Word document onto the Dashboard.

In the Project Configuration dialog box, choose the MS Word OCR file format option.

Open the project.

Do nothing else.

Export the project: you will get a clean version of the MS Word document that you can process in Smartcat.


[Edited at 2022-05-14 08:41 GMT]


 
Simona Arminio
Italy
English to Italian
Topic starter
SMARTCAT tons of TAGS translating from French to Italian May 15, 2022

esperantisto wrote:

I. More or less simple:




Thanks a lot for all your suggestions!
I cannot find all the commands you mention on my Mac.
It looks like the CafeTran solution is working...


 
Simona Arminio
Italy
English to Italian
Topic starter
SMARTCAT tons of TAGS translating from French to Italian May 15, 2022

Hans Lenting wrote:

Simona Ar wrote:

I have a Mac and apparently there is no Mac version for TransTools


My suggestion:

Download CafeTran Espresso from www.cafetran.com.

[Edited at 2022-05-14 08:41 GMT]


Thanks a lot, Hans!
I did what you suggested and it worked!
Honestly, at this point I could also use CafeTran (which I didn't know before) for the translation instead of Smartcat.
I'm trying to find out how much a CafeTran subscription costs. Is there a free version available?
May I ask which one you would recommend, Smartcat or CafeTran?
Thanks again


 
Hans Lenting
Netherlands
Member (2006)
German to Dutch
CafeTran May 15, 2022

Simona Ar wrote:

Can I ask you which one would you suggest between SMARTCAT and CAFETRAN?


CafeTran

https://www.cafetran.com/get-cafetran/

Or via the ProZ Plus package:

That is 172.92 euros.

[Edited at 2022-05-16 07:14 GMT]


 
Simona Arminio
Italy
English to Italian
Topic starter
SMARTCAT tons of TAGS translating from French to Italian May 16, 2022

Hans Lenting wrote:

Simona Ar wrote:

Can I ask you which one would you suggest between SMARTCAT and CAFETRAN?


CafeTran

https://www.cafetran.com/get-cafetran/

Or via the ProZ Plus package:

That is 172.92 euros.

[Edited at 2022-05-16 07:14 GMT]


Great, thanks Hans!


 

