Translation services/strategies/costs

Started by Don Y December 22, 2021
Jeroen Belleman wrote:
> On 2021-12-22 11:33, Don Y wrote:
>> On 12/22/2021 2:50 AM, Jeroen Belleman wrote:
>>> On 2021-12-22 10:17, Martin Brown wrote:
>>>> [...]  In some ways their knowledge of more complex English
>>>> grammar was better than many native English speakers today. Correct
>>>> usage of "I would be obliged if you could" vs "would" - there is a
>>>> hidden insult in the first phrase which has now sort of been lost in
>>>> common English usage.
>>>   [...]
>>>
>>> Would you elaborate on that a little? The nuance escapes me.
>>> Is it that 'could' throws doubt on the other's ability?
>>
>> Yes.  It's the "Can" vs. "May" issue.
>>
>> There are innumerable other "technical screwups" that have crept into
>> the language that folks either are oblivious to or unconcerned with
>> fixing.  (My personal pet peeve is the use of "what" in place of "that":
>> "He's the guy what sold me that lemon of a car!")
>
> Indeed. I still think it's supposed to be "He's the guy /who/ ...".
> That may be British rather than American English. Languages evolve.
>
> Jeroen Belleman
>

"The guy what sold me..." would mark the speaker as illiterate anywhere in the northern US AFAIK, except in parts of the Bronx or Staten Island, northeastern New Jersey, and maybe outer Brooklyn.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
On Wed, 22 Dec 2021 11:34:57 -0500, Phil Hobbs
<pcdhSpamMeSenseless@electrooptical.net> wrote:

>Jeroen Belleman wrote:
>> On 2021-12-22 11:33, Don Y wrote:
>>> On 12/22/2021 2:50 AM, Jeroen Belleman wrote:
>>>> On 2021-12-22 10:17, Martin Brown wrote:
>>>>> [...] In some ways their knowledge of more complex English
>>>>> grammar was better than many native English speakers today. Correct
>>>>> usage of "I would be obliged if you could" vs "would" - there is a
>>>>> hidden insult in the first phrase which has now sort of been lost in
>>>>> common English usage.
>>>> [...]
>>>>
>>>> Would you elaborate on that a little? The nuance escapes me.
>>>> Is it that 'could' throws doubt on the other's ability?
>>>
>>> Yes. It's the "Can" vs. "May" issue.
>>>
>>> There are innumerable other "technical screwups" that have crept into
>>> the language that folks either are oblivious to or unconcerned with
>>> fixing. (My personal pet peeve is the use of "what" in place of "that":
>>> "He's the guy what sold me that lemon of a car!")
>>
>> Indeed. I still think it's supposed to be "He's the guy /who/ ...".
>> That may be British rather than American English. Languages evolve.
>>
>> Jeroen Belleman
>>
>
>"The guy what sold me..." would mark the speaker as illiterate anywhere
>in the northern US AFAIK, except in parts of the Bronx or Staten Island,
>northeastern New Jersey, and maybe outer Brooklyn.
>
>Cheers
>
>Phil Hobbs

"Done sold me" is the past tense.

It's comprehensible. Just a regional variant, likely not correlated to literacy.

--
I yam what I yam - Popeye
On a sunny day (Wed, 22 Dec 2021 09:30:43 -0700) it happened Don Y
<blockedofcourse@foo.invalid> wrote in <spvjrv$tj1$1@dont-email.me>:

>It would be a tough call to determine if American English had evolved more >OR LESS than the original British. I've read that American English is, in >many ways, truer to its British roots than modern British English. > >Pronunciations also evolve, over time. As well as speech patterns. > >E.g., I was taught "the" should be pronounced as "thee" when preceding >a word beginning with a vowel sound: "Thee English", "Thee other guy" >but with a schwa ahead of a consonant: "The next one", "the Frenchman". >This seems to no longer be the norm. > >[You're interested in these sorts of things when you design a >speech synthesizer; the different "wh" sounds, etc.]
A pretty decent text to speech is google translate.

This script, called gst2_en on my system, has a female talk in english:

#!/bin/bash
say() { local IFS=+;/usr/bin/mplayer -ao alsa -really-quiet -noconsolecontrols "http://translate.google.com/translate_tts?ie=UTF-8&client=tw-ob&q=$*&tl=en"; }
say $*

You call it like this (with your text as example):
gst2_en ">E.g., I was taught "the" should be pronounced as "thee" when preceding"

In the script the &tl=en can be changed for the language you want, so &tl=nl for Dutch and &tl=de for German.

If you want the output to go to a mp3 file then use mplayer -dumpstream in that script.

I find the quality better than other things I have tried.

All Linux of course
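The script above joins the words with "+" but does not escape characters such as quotes or "&" in the text. As a rough sketch only, here is how the same request URL could be built with the text percent-encoded. The endpoint and the ie/client/q/tl parameters are taken from the script; the function name build_tts_url is made up for illustration, and the endpoint is an unofficial interface that may change or stop working.

```python
from urllib.parse import urlencode

# Endpoint and parameters copied from the shell script in the post;
# this is an unofficial interface, so treat it as an assumption.
TTS_ENDPOINT = "http://translate.google.com/translate_tts"


def build_tts_url(text: str, lang: str = "en") -> str:
    """Return a TTS request URL with the text safely percent-encoded."""
    params = {"ie": "UTF-8", "client": "tw-ob", "q": text, "tl": lang}
    # urlencode() uses quote_plus by default: spaces become '+', while
    # characters such as '&', '"' and '>' are percent-escaped.
    return TTS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_tts_url('I was taught "the" should be pronounced as "thee"'))
```

The resulting URL could then be handed to mplayer exactly as in the script; changing the lang argument to "nl" or "de" corresponds to editing &tl= there.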
jlarkin@highlandsniptechnology.com wrote:
> On Wed, 22 Dec 2021 11:34:57 -0500, Phil Hobbs
> <pcdhSpamMeSenseless@electrooptical.net> wrote:
>
>> Jeroen Belleman wrote:
>>> On 2021-12-22 11:33, Don Y wrote:
>>>> On 12/22/2021 2:50 AM, Jeroen Belleman wrote:
>>>>> On 2021-12-22 10:17, Martin Brown wrote:
>>>>>> [...] In some ways their knowledge of more complex English
>>>>>> grammar was better than many native English speakers today. Correct
>>>>>> usage of "I would be obliged if you could" vs "would" - there is a
>>>>>> hidden insult in the first phrase which has now sort of been lost in
>>>>>> common English usage.
>>>>> [...]
>>>>>
>>>>> Would you elaborate on that a little? The nuance escapes me.
>>>>> Is it that 'could' throws doubt on the other's ability?
>>>>
>>>> Yes. It's the "Can" vs. "May" issue.
>>>>
>>>> There are innumerable other "technical screwups" that have crept into
>>>> the language that folks either are oblivious to or unconcerned with
>>>> fixing. (My personal pet peeve is the use of "what" in place of "that":
>>>> "He's the guy what sold me that lemon of a car!")
>>>
>>> Indeed. I still think it's supposed to be "He's the guy /who/ ...".
>>> That may be British rather than American English. Languages evolve.
>>>
>>> Jeroen Belleman
>>>
>>
>> "The guy what sold me..." would mark the speaker as illiterate anywhere
>> in the northern US AFAIK, except in parts of the Bronx or Staten Island,
>> northeastern New Jersey, and maybe outer Brooklyn.
>>
>> Cheers
>>
>> Phil Hobbs
>
> "Done sold me" is the past tense.
>
> It's comprehensible. Just a regional variant, likely not correlated to
> literacy.

"Done sold me" is more of a Southernism, no? You don't hear that round here, anyway.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
On Wed, 22 Dec 2021 01:02:44 -0700, Don Y
<blockedofcourse@foo.invalid> wrote:

>I'm looking for folks who've first hand experience having >documents translated into foreign languages. Said documents >to include diagrams (think: callouts, legends), #included >text, etc. > >I've a fair bit of experience with I18N/L10N for software >but the extent of the effort, there, is usually fairly limited. >And, there's less of a need for a cohesive approach as the >interactions are "punctuated" (no pun intended). > >Recommendations for firms to do this? (no, finding multilingual >"friends" to do same is far too unprofessional -- though they may >have value in proofing the results) I suspect there is some >value in having a single firm handle all of the translations >(in the hope that they will create a consistent SET of >translations, even if different individuals are involved for >each) > >Relative effort? (i.e., closer to reading speed or writing speed?)
Writing speed. Fluency in the technical domain, plus native fluency in the target language, are both necessary.
>Time frame? (is this effort-bound or business-bound) > >Cost? (and, "unit of measure"?)
Slow and expensive.
>Finally, how to check the translation for accuracy and "feel" >(i.e., ensuring it is true to the original intent)?
Always need a proof reader and a tech editor in the target language; need not be capable of translation.
>With translations in hand, do you (thereafter) maintain >individual documents? Or, merge them into a conditional >document?
Same as for the original document, but in versions. With luck, the drawings are in common.
>Horror stories of attempts gone horribly wrong (i.e., what to >avoid)?
Only on the receiving end so far. Here, I do have a war story from the 1970s:

I was considering how to interface a typewriter-like printer made by ABB or the like, and so was studying the English-language interface manual, with text and sequence diagrams and the like. It read like perfectly good English, but was incomprehensible.

So, I ignored the text and studied the figures. Whereupon it became clear how that interface worked, and a bit later what was wrong with the text.

Unlike English, all Swedish pronouns are gendered, which gender is grammatical and has little to do with actual gender. Which means that in a Swedish sentence one can carry about twice as many pronoun references without ambiguity as in English.

Well, you guessed it -- what had happened is that Swedish pronouns were all directly 2:1 mapped to the corresponding English pronoun, without recasting the sentences to remove the now massive ambiguities.

And that interface turned out to be too complicated for what it was, so I gave up on that printer.

Another translation issue came up maybe 5 years ago, when we were using an FPGA board from a Danish vendor. Their user manual was maybe 10 pages long, incomprehensible in many places, and a factor of ten too short to adequately describe the board and how to implement your stuff on it. It had been written by harried Danish engineers and perhaps a tech editor in English, their second language.

My advice to the President of the Danish firm was to have his engineers write the first draft in Danish, and hire a tech editor whose native language is English to make the translation and perform the cleanup. The tech writer was allowed to question the engineers until the editor understood, so the editor in effect stood in for the English-speaking customer audience. This was done. I did a full tech-edit scan of the result, and it read very well, and was perfectly clear. Only needed to fix one usage problem. It still was not large enough to fully describe that product, but still this was great progress.

Joe Gwinn
On 12/22/2021 5:09 AM, Martin Brown wrote:
>>>> With translations in hand, do you (thereafter) maintain >>>> individual documents? Or, merge them into a conditional >>>> document?
----^^^^^^^^
>>> The way we did it was have a separate set of documents and resource files >>> for the text components in each language and a language code. I doubt if >>> that part has changed. They get bigger with time. Then you can add new >>> phrases to the end as and when needed. >>> >>> That way additional languages can be added easily if someone else is willing >>> to do the work. >> >> I don't understand. Are you building a translation *dictionary* that you >> apply to create the document(s)? > > Not a dictionary so much as a set of key phrases that will appear in dialogue > boxes in the software - each having a unique numeric token. > > It is also a part of the way the computer assisted translation engines work for > commonly used constructs.
I was referring to how you prepare multilingual *documents*, not executables.

E.g., you will often encounter things like "instruction manuals" that have a section (pages) in English, followed by one in Spanish, followed by... AS IF they were separate documents strung together.

Alternatively, one can create a single document with the different languages embedded within and *conditionally* expose one (or another). In which case, the instruction manual would be generated by:

LANGUAGE=English
Print document
LANGUAGE=Spanish
Print document
LANGUAGE=...

My point is intended to address how you expose the translations to the writer, charged with maintaining ALL of them (though likely with a service bureau's assistance) as *each* is revised. Separate documents seem like they would run the risk of certain languages lagging in their currency.
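To make the "conditional document" idea concrete, here is a minimal sketch, not a description of any particular publishing tool. It assumes a single source in which each block carries all of its translations side by side; the sample sentences, the DOCUMENT structure and the function names are invented for illustration. Rendering picks one language, and a second helper reports blocks whose translation is missing, which is one way to spot a language that has fallen behind.

```python
# Hypothetical single-source document: every block holds its translations
# together, so a revision in one language sits next to the others.
DOCUMENT = [
    {"en": "Remove the cover.", "es": "Retire la cubierta."},
    {"en": "Disconnect the battery.", "es": "Desconecte la batería."},
    {"en": "Replace the fuse."},  # Spanish not yet supplied
]


def render(document, language):
    """Return the text blocks available in the requested language."""
    return [block[language] for block in document if language in block]


def missing(document, language):
    """Return indexes of blocks with no translation for `language`."""
    return [i for i, block in enumerate(document) if language not in block]


if __name__ == "__main__":
    # Equivalent of: LANGUAGE=English, print; LANGUAGE=Spanish, print; ...
    for language in ("en", "es"):
        print(f"--- {language} ---")
        print("\n".join(render(DOCUMENT, language)))
        print("untranslated blocks:", missing(DOCUMENT, language))
```

Whether this beats separate per-language files depends on the toolchain; the point of the sketch is only that a merged source makes stale translations mechanically detectable.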
>>>> Horror stories of attempts gone horribly wrong (i.e., what to >>>> avoid)? >>> >>> Translations by some willing amateur who is not a native speaker of the >>> destination language and which you are unable to check for veracity. Think >>> most Chinglish instruction manuals for cheap hitech gear. >> >> Yes, exactly. But, that would be easy to avoid. >> >> The bigger fear is hiring a service bureau and later discovering they >> were little better than mechanical translators (again, because "you" >> likely can't review the result, directly). > > I think you have to ask around for recommendations in the territory or where > you want to have the work done. Some of our translations were done by the > national distributors (Korean for instance) and seemed to go OK. The big jumps > were doing the first non-English one that included accents and top bit set > characters and then the Japanese one with full DBCS. It got a bit easier after > that. > > There is a lot more support for internationalisation DBCS these days.
Yes, I'm not worried about the "tools" but, rather, the "expertise". I can create a document with 30 different "languages" (UTF-16) but that doesn't mean *I* can tell you if what the document *says* is correct, consistent, etc.

And, the shorter the text snippet, the worse the potential "size change" as it undergoes translation (i.e., a *book* will tend to see a smaller change in overall length than a single word)

So, callouts in figures are risky: Imagine an illustrated volume annotated (in the obvious way) with:

length
width
height

In Armenian:

երկարություն
լայնությունը
բարձրություն

A naive author/illustrator would plan for the english text in placing the legend/callout and be frustrated when the translator provided these lengthier translations.
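A quick way to see how much room a callout might need is to compare label lengths before and after translation. The sketch below uses the three labels from the example above; the function names are invented, and counting code points is only a rough proxy for rendered width, which really depends on the font and script.

```python
# Callout labels from the example: English and the Armenian equivalents.
CALLOUTS = {
    "length": "երկարություն",
    "width": "լայնությունը",
    "height": "բարձրություն",
}


def expansion(source: str, translated: str) -> float:
    """Ratio of translated to source length, counted in code points."""
    return len(translated) / len(source)


def worst_case(callouts: dict) -> float:
    """Largest expansion ratio found across all callouts."""
    return max(expansion(src, dst) for src, dst in callouts.items())


if __name__ == "__main__":
    for src, dst in CALLOUTS.items():
        print(f"{src:8} -> {dst}  x{expansion(src, dst):.1f}")
    print(f"reserve at least x{worst_case(CALLOUTS):.1f} the English width")
```

Running something like this over every callout and every target language, before the illustration is finalized, would give the illustrator a padding figure to plan around rather than a surprise afterwards.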
On 22/12/2021 20:21, Don Y wrote:

> A naive author/illustrator would plan for the english text in > placing the legend/callout and be frustrated when the translator > provided these lengthier translations.
Fixed field lengths always cause trouble. Your best bet is rescale the text to fit the space and pray that it remains legible.

In hitech industries you do have a sporting chance that the skilled technical people will be able to read some English (and may understand a lot more spoken English than they let on during negotiations).

My wife's name contains phonemes that are all but impossible in Japanese and her transliterated name overflowed the bank card field allowed.

Mine survived a little bit better as Ma-chin Buroun (roughly). The only lonely consonant is "n" and some Romaji consonants are not available.

Most real Japanese names will fit in at most 5 characters (many just 4).

My company was taken over whilst we were there by a big UK company that had a similar problem with its name being impossible to represent in Japanese. They rebranded us (long established two easy phonemes company) with their new snazzy Western name which was impossibly dumb in a culture that values long established brands and relationships!

They went bust 5 years later. (I was well out of it by then)

--
Regards,
Martin Brown
On 12/22/2021 1:57 PM, Martin Brown wrote:
> On 22/12/2021 20:21, Don Y wrote: > >> A naive author/illustrator would plan for the english text in >> placing the legend/callout and be frustrated when the translator >> provided these lengthier translations. > > Fixed field lengths always cause trouble. Your best bet is rescale the text to > fit the space and pray that it remains legible.
But in software, you are aware of a "field length" and can consciously "pad" the length available in anticipation of "longer content".

By contrast, when you are illustrating something, you don't tend to think "gee, I'd better locate this callout with a fair bit of empty space surrounding IN CASE some other translation happens to require *more* space". You'd not want to have to *redraw* an illustration just to accommodate a "wider" callout.
> In hitech industries you do have a sporting chance that the skilled technical > people will be able to read some English (and may understand a lot more spoken > English than they let on during negotiations).
Things that are highly technical are usually not a problem; the target audience can comprehend the "important words" (and skip over the fluff). A datasheet being the simplest case of extracting content without having a clue as to what the text proclaims! The bigger problem is things that are more "reasoned" or presenting arguments/rationales/side-effects/etc. These aren't usually as terse and easy to parse to their intent.
> My wife's name contains phonemes that are all but impossible in Japanese and > her transliterated name overflowed the bank card field allowed.
Kalahari? :>

You don't have to veer far from the European languages to find gotchas in translations, odd (mis?)spellings, etc.

"Preservative" will raise eyebrows in French culture ("preservatif")

Polish tends to omit the V, X and Q graphemes. So, my friend Eva's name is Ewa (I earned her friendship by being able to *spell* it when I first met her; "Eva? As in E W A?"). And, of course, all vowels (or so it seems :> )

Ditto the German F vs. V
> Mine survived a little bit better as Ma-chin Buroun (roughly). The only lonely > consonant is "n" and some Romaji consonants are not available. > > Most real Japanese names will fit in at most 5 characters (many just 4). > > My company was taken over whilst we were there by a big UK company that had a > similar problem with its name being impossible to represent in Japanese. They > rebranded us (long established two easy phonemes company) with their new snazzy > Western name which was impossibly dumb in a culture that values long > established brands and relationships! > They went bust 5 years later. (I was well out of it by then)
On 12/22/2021 10:15 AM, Jan Panteltje wrote:
> On a sunny day (Wed, 22 Dec 2021 09:30:43 -0700) it happened Don Y > <blockedofcourse@foo.invalid> wrote in <spvjrv$tj1$1@dont-email.me>: > >> It would be a tough call to determine if American English had evolved more >> OR LESS than the original British. I've read that American English is, in >> many ways, truer to its British roots than modern British English. >> >> Pronunciations also evolve, over time. As well as speech patterns. >> >> E.g., I was taught "the" should be pronounced as "thee" when preceding >> a word beginning with a vowel sound: "Thee English", "Thee other guy" >> but with a schwa ahead of a consonant: "The next one", "the Frenchman". >> This seems to no longer be the norm. >> >> [You're interested in these sorts of things when you design a >> speech synthesizer; the different "wh" sounds, etc.] > > A pretty decent text to speech is google translate. > > This script, called gst2_en on my system, has a female talk in english: > > #!/bin/bash > say() { local IFS=+;/usr/bin/mplayer -ao alsa -really-quiet -noconsolecontrols "http://translate.google.com/translate_tts?ie=UTF-8&client=tw-ob&q=$*&tl=en"; } > say $* > > > > You call it like this (with your text as example): > gst2_en ">E.g., I was taught "the" should be pronounced as "thee" when preceding" > > In the script the &tl=en can be changed for the language you want, so &tl=nl for Dutch and &tl=de for German. > > If you want the output to go to a mp3 file then use mplayer -dumpstream in that script. > > I find the quality better than other things I have tried. > > All Linux of course
There are lots of synthesizers out there -- FOSS as well as commercial. But, those that run on a PC tend to be bloated implementations -- large dictionaries, unit databases, etc. And, require a fair bit of CPU to deliver speech in real-time.

If you're trying to run in a small footprint consuming very little "energy" (think tiny battery), there really isn't much choice -- esp if you want to be able to tweak the voice to suit the listeners' preferences (with unconstrained vocabulary)

And, all suffer from requiring some level of smarts at the application level. Feed it "Blue orange dog cat run" or "Mr Mxyzptlk" or even something as bland as "abcdefghijklmnopqrstuvwxyz" and they yield results that are unfathomable -- without *looking* at the source text to try to suss-out what they are *trying* to say.
On 12/22/2021 1:20 PM, Joe Gwinn wrote:
> On Wed, 22 Dec 2021 01:02:44 -0700, Don Y > <blockedofcourse@foo.invalid> wrote: > >> I'm looking for folks who've first hand experience having >> documents translated into foreign languages. Said documents >> to include diagrams (think: callouts, legends), #included >> text, etc. >> >> I've a fair bit of experience with I18N/L10N for software >> but the extent of the effort, there, is usually fairly limited. >> And, there's less of a need for a cohesive approach as the >> interactions are "punctuated" (no pun intended). >> >> Recommendations for firms to do this? (no, finding multilingual >> "friends" to do same is far too unprofessional -- though they may >> have value in proofing the results) I suspect there is some >> value in having a single firm handle all of the translations >> (in the hope that they will create a consistent SET of >> translations, even if different individuals are involved for >> each) >> >> Relative effort? (i.e., closer to reading speed or writing speed?) > > Writing speed. Fluency in the technical domain, plus native fluency in > the target language, are both necessary.
So, you are assuming there is no learning curve for the material? Or, that the original author is conveniently available (and communicative with translator) to resolve those issues as they manifest?
>> Time frame? (is this effort-bound or business-bound) >> >> Cost? (and, "unit of measure"?) > > Slow and expensive.
But what is the unit of measure? Page? Job? How does it scale? (e.g., if you bundle two 50 page documents together, do you see a better price than if kept separate? Or, vs. a 100pp document?)
>> Finally, how to check the translation for accuracy and "feel" >> (i.e., ensuring it is true to the original intent)? > > Always need a proof reader and a tech editor in the target language; > need not be capable of translation.
So, you have to ensure both the translator and the proofreader comprehend the material (and presentation).
>> With translations in hand, do you (thereafter) maintain >> individual documents? Or, merge them into a conditional >> document? > Same as for the original document, but in versions. With luck, the > drawings are in common.
So, you're suggesting *different* documents (for each translation)?
>> Horror stories of attempts gone horribly wrong (i.e., what to >> avoid)?
> Well, you guessed it -- what had happened is that Swedish pronouns > were all directly 2:1 mapped to the corresponding English pronoun, > without recasting the sentences to remove the now massive ambiguities.
So, this is a failure on the part of the translator(s). And, likely, an "amateurish" one
> My advice to the President of the Danish firm was to have his > engineers write the first draft in Danish, and hire a tech editor > whose native language is English to make the translation and perform > the cleanup. The tech writer was allowed to question the engineers > until the editor understood, so the editor in effect stood in for the > English-speaking customer audience. This was done. I did a full > tech-edit scan of the result, and it read very well, and was perfectly > clear. Only needed to fix one usage problem. It still was not large > enough to fully describe that product, but still this was great > progress.
I had an experience with a Japanese firm where the Japanese (vendor) would simply (apparently!) update their existing documentation to reflect my needs. This didn't instill confidence -- are they really changing the product to meet those tighter specs? Or, just *claiming* to?