sci.electronics.design | Translation services/strategies/costs| page 5

Reply by Martin Brown ● December 24, 2021

On 23/12/2021 10:35, Don Y wrote:
> On 12/23/2021 2:39 AM, Martin Brown wrote:

>> Alexa can't manage for example Tyne & Wear (tine and weir), dialect
>> Chop Gate (chop yat) and Cholmondeley (Chumlee) catch out most 
>> non-native English speakers in fact most non-locals. For that reason 
>> the latter was a location for sensitive military intelligence during 
>> WWII.
> 
> Worcester (WUSS-ter), Billerica (bill-RICK-a), Berlin (BURR-lin, not 
> burr-LIN),
> etc.  Or, words that folks often mispronounce (almond, salmon).
> 
> I can identify folks who are from my home *town* (not "state"!) by their
> speech habits -- highly localized.

I can recognise a fair number of British accents but I have all but lost 
mine from time spent away from my home town at university and overseas.

> A neighbor claimed her firstname to be "Lara" -- though she spelled it
> L-A-U-R-A ("Isn't that Laura??").
> 
> [BTW, I'm still waiting for a pointer to the code you want compiled...]

If you are willing to give it a go I'll email you a copy (about 150k 
main file plus a couple of tiny header file stubs to satisfy includes).

I don't seem to have your email contact details. My own peculiar looking 
reply-to address is valid provided that you do not alter it in any way.

First time around just throw it at the Intel compiler and send me the 
error messages (or if by some happenstance it compiles and links OK the 
output of running it with no parameters - also about 100-200k).

If you have any nice fast series 10 or 11 Intel CPUs I'd be interested 
in the output from running an MSC 2019 compiled executable on them too. 
I'm looking for SSE architectural differences affecting out of order and 
speculative execution (and how they have changed with time).

I thought you were tied up until the year end.
I know how pressured year end shipment deadlines can be. Good luck!

Have a super Christmas!

-- 
Regards,
Martin Brown

Reply by Jan Panteltje ● December 24, 2021

PS most android smartphones have text to speech and support bluetooth earpieces.
Then you could write an application and maybe even use the phone's camera to guide the blind.
If you have a clue about image recognition.

Carrying a raspi and a LiPo battery around is not that hard either; it should run for 10 hours
and fits in one's pocket.

Show us something you did apart from babble.
Else you are - in essence  - just wasting my time.

It is what it is.

Reply by Don Y ● December 24, 2021

On 12/24/2021 4:32 AM, Jan Panteltje wrote:
> On a sunny day (Fri, 24 Dec 2021 03:36:53 -0700) it happened Don Y
> <blockedofcourse@foo.invalid> wrote in <sq47sd$qn2$1@dont-email.me>:
> 
>> On 12/24/2021 1:55 AM, Jan Panteltje wrote:
>>> On a sunny day (Thu, 23 Dec 2021 12:57:23 -0700) it happened Don Y
>>> <blockedofcourse@foo.invalid> wrote in <sq2kbf$s5s$1@dont-email.me>:
>>>
>>>> On 12/23/2021 9:53 AM, Jan Panteltje wrote:
>>>>> On a sunny day (Thu, 23 Dec 2021 08:43:24 -0700) it happened Don Y
>>>>> <blockedofcourse@foo.invalid> wrote in <sq25f9$br9$1@dont-email.me>:
>>>>>
>>>>>> On 12/23/2021 6:16 AM, Jan Panteltje wrote:
>>>>>>> On a sunny day (Wed, 22 Dec 2021 17:21:06 -0700) it happened Don Y
>>>>>>> <blockedofcourse@foo.invalid> wrote in <sq0fdu$gjc$1@dont-email.me>:
>>>>>>>
>>>>>>>> On 12/22/2021 10:15 AM, Jan Panteltje wrote:
>>>>>>>>> On a sunny day (Wed, 22 Dec 2021 09:30:43 -0700) it happened Don Y
>>>>>>>>> <blockedofcourse@foo.invalid> wrote in <spvjrv$tj1$1@dont-email.me>:
>>>>>>>>>
>>>>>>>>>> It would be a tough call to determine if American English had evolved more
>>>>>>>>>> OR LESS than the original British.  I've read that American English is, in
>>>>>>>>>> many ways, truer to its British roots than modern British English.
>>>>>>>>>>
>>>>>>>>>> Pronunciations also evolve, over time.  As well as speech patterns.
>>>>>>>>>>
>>>>>>>>>> E.g., I was taught "the" should be pronounced as "thee" when preceding
>>>>>>>>>> a word beginning with a vowel sound:  "Thee English", "Thee other guy"
>>>>>>>>>> but with a schwa ahead of a consonant:  "The next one", "the Frenchman".
>>>>>>>>>> This seems to no longer be the norm.
>>>>>>>>>>
>>>>>>>>>> [You're interested in these sorts of things when you design a
>>>>>>>>>> speech synthesizer; the different "wh" sounds, etc.]
>>>>>>>>>
>>>>>>>>> A pretty decent text to speech is google translate.
>>>>>>>>>
>>>>>>>>> This script, called gst2_en on my system, has a female talk in english:
>>>>>>>>>
>>>>>>>>> #!/bin/bash
>>>>>>>>> say() { local IFS=+;/usr/bin/mplayer -ao alsa -really-quiet -noconsolecontrols \
>>>>>>>>> "http://translate.google.com/translate_tts?ie=UTF-8&client=tw-ob&q=$*&tl=en"; }
>>>>>>>>> say $*
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> You call it like this (with your text as example):
>>>>>>>>>       gst2_en ">E.g., I was taught "the" should be pronounced as "thee" when preceding"
>>>>>>>>>
>>>>>>>>> In the script the  &tl=en   can be changed for the language you want, so &tl=nl for Dutch and &tl=de for German.
>>>>>>>>>
>>>>>>>>> If you want the output to go to a mp3 file then use  mplayer -dumpstream in that script.
>>>>>>>>>
>>>>>>>>> I find the quality better than other things I have tried.
>>>>>>>>>
>>>>>>>>> All Linux of course
>>>>>>>>
>>>>>>>> There are lots of synthesizers out there -- FOSS as well as commercial.
>>>>>>>> But, those that run on a PC tend to be bloated implementations -- large
>>>>>>>> dictionaries, unit databases, etc.  And, require a fair bit of CPU
>>>>>>>> to deliver speech in real-time.  If you're trying to run in a small
>>>>>>>> footprint consuming very little "energy" (think tiny battery), there
>>>>>>>> really isn't much choice -- esp if you want to be able to tweak the voice
>>>>>>>> to suit the listeners' preferences (with unconstrained vocabulary)
>>>>>>>
>>>>>>> Sure
>>>>>>> But the advantage of this script is that it uses NO resources on the PC / raspi or whatever
>>>>>>
>>>>>> Of course it uses resources!  You need a network stack, the memory to
>>>>>> handle the packets delivered across that connection, the memory to support
>>>>>> the shell, the filesystem from which to load the script and other binaries,
>>>>>> the kernel, etc.
>>>>>
>>>>> Sure
>>>>>
>>>>>> You just assume they cost nothing because they are already present
>>>>>> in your implementation.  Take a *bare* rPi and see how much you have to
>>>>>> add to it to make it speak.  *That* is the resource requirement.
>>>>>
>>>>> Not sure what you mean by a 'bare rPi', but even my old raspi one has all that.
>>>>
>>>> Strip all of the code off of it so you are starting with *hardware*.
>>>> Then, add back what you need to make it speak.
>>>
>>> OK, let me give you some example in this, and why the choice between apples and oranges .
>>> Let's say we have nothing but a PIC 18F14k22 (because I have those).
>>>
>>> To do the internet thing you need a TCP stack and add a Microchip ethernet ENC28J60 chip
>>> Microchip _has_ a TCP stack, but then what fun is it, I wrote one back in the 5 1/4 inch floppy days but
>>> those files got lost, but here is project with an UDP stack I wrote in PIC asm
>>>    http://panteltje.com/panteltje/pic/ethernet_color_pic/
>>>     controls room lighting from anywhere, been working fine 24/7 since 2013
>>> You will need:
>>> 1 PIC18F14K22
>>> 1 ENC28J60
>>>
>>> Now let's see if we can do audio out with that
>>> Sure I have done audio with same PIC:
>>>    http://panteltje.com/panteltje/pic/audio_pic/
>>>     that used PWM somehow, but here I would use a R2R DAC on 8 PIC output pins perhaps.
>>> The B I G question is now "With This chip can I decode the mp3 stream from google translate?"
>>> Perhaps, would have to look at the source of mpg123 (open source C mp2/mp3 decoder) to see if I can do it all with 256 bytes RAM
>>> maybe the buffer size is too small, maybe need a bigger PIC or some external memory is needed.
>>> Never wrote a mp3 decoder so question mark here.
>>> Config system via RS232 in EEPROM (as in projects above) for IP, gateway, MAC etc etc.
>>> So 2 chips (or 3 if memory), some caps, power regulator, wall wart, and now you need to make a board if it is for production.
>>> And you need to write the asm.
>>> And test and debug it
>>> Estimated time: some days.
>>> Cost per hour of qualified person?
>>> A Raspberry Pi that has all connectors plus some, costs 47$ (went up a while ago because of chip shortages I have read)
>>> and a small SDcard.
>>> Runs Linux, is easy to program in minutes, and proven reliable, no board layouts needed, ready in an hour or so.
>>> And can fall back on whatever speech synth. you installed on it if no internet connection for any reason.
>>> The advantage of using google for speech is that THEY will do their best to make the audio as good as possible
>>> and support several languages
>>>
>>> So, in short, the Raspberry way is faster, cheaper, better, proven reliable, available everywhere.
>>> Now show us what YOU did.
>>
>> Put it *in* a bluetooth earpiece and have it run off the battery that's
>> in that earpiece.  Make sure that earpiece is paired with a BT host that
>> ultimately has internet access -- to get to your google service.  And,
>> maintain this connectivity while I walk, drive, ride a bicycle or
>> any other activity -- above or below ground.
>>
>> You're solving the wrong problem with a sledgehammer.
> 
> You should have specified that right away.

I stated:
    [You're interested in these sorts of things when you design a
    speech synthesizer; the different "wh" sounds, etc.]
I didn't realize I had to explain my entire application in order
to defend that statement.

You also need to know braille if you want to transcribe emitted text
into braille "on-the-fly".

And, how to Daltonize visual presentations if you want to rely
on vision.

And, how to render graphic images if you want to present information
graphically.

Do I have to defend each of these statements, as well?

> So again, PIC, bluetooth chip, asm nothing new.
> Some people here can even design it all in one chip.
> But we are talking text to speech, no (or did you change the requirement again)?
> WTF would you get the text from?

A set of applications.  As well as audio from music sources,
annunciators, etc.

> Much simpler to use a normal bluetooth earpiece and a Raspberry Pi talking to it from a fixed place..

That sort of thinking says it's much easier to use a land-line phone
in a fixed place (than to bother with all these silly cell phones)

And, why bother with tablets and laptops when you can sit bolt-upright
in front of your desktop PC?!

>   https://pimylifeup.com/raspberry-pi-bluetooth/
> For other platforms / system bluetooth USB adaptors plenty, I have some for the PC, also bluetooth earpieces.
> Like I said, the raspi can fall back on any other synth. if there is no internet connection

Where is that other synthesizer?  What resources does *it* use?

> AGAIN where is your text coming from?
> You did not show any design or code

My synthesizers are *big*.  There's a lot of code to implement the text
normalization, prosody assignment, waveform synthesis, unit databases,
exception dictionaries, letter-to-sound rules, etc.
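
A minimal sketch of how an exception dictionary can sit in front of the letter-to-sound rules, assuming plain whitespace tokenisation and using the respellings quoted earlier in this thread as the only entries. The names `EXCEPTIONS`, `letter_to_sound` and `pronounce` are hypothetical, and `letter_to_sound` is only a stand-in that echoes the spelling; it is not any of the real rule sets.

```python
# Hypothetical exception dictionary: words the rules would get wrong,
# mapped to the respellings mentioned in this thread.
EXCEPTIONS = {
    "worcester": "WUSS-ter",
    "billerica": "bill-RICK-a",
    "cholmondeley": "Chumlee",
}


def letter_to_sound(word):
    """Stand-in for a real rule set; it just echoes the spelling."""
    return word


def pronounce(word):
    """Consult the exception dictionary first, then fall back to the rules."""
    key = word.lower()
    if key in EXCEPTIONS:
        return EXCEPTIONS[key]
    return letter_to_sound(word)


def pronounce_text(text):
    # Whitespace split only; real text normalization (numbers,
    # abbreviations, punctuation) would have to run before this step.
    return " ".join(pronounce(w) for w in text.split())


print(pronounce_text("Visit Worcester and Billerica"))
```

The size of that dictionary is one of the knobs that trades memory footprint against pronunciation quality.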

I've implemented a formant-based synthesizer modeled after an
enhanced Klatt synthesizer, a diphone synthesizer (but with only
one voice, presently, as sampling speech is tedious and requires
a fair bit of time from the voice model), and an LPC coder.  I've
implemented the NRL LTS rules, Hunnicutt's as well as McIlroy's.
(all of these technologies and documentation are available,
publicly -- but you may have to do some digging)

[The regression tests for the rule sets are each ~50pp.  Rhyme
tests another dozen or so.  etc.]

I've implemented fixed point and floating point versions of each
(as necessary) to address the possibilities of having limited target
capabilities.  So, I can piece together ~20 different synthesizers
with different resource requirements (and capabilities/limitations)
by mixing and matching these modules.
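
To illustrate the fixed point versus floating point tradeoff, here is a sketch of one building block of a formant synthesizer: the two-pole digital resonator y[n] = A*x[n] + B*y[n-1] + C*y[n-2] used in Klatt-style designs. The Q14 format, the function names, and the 500 Hz / 60 Hz / 10 kHz parameters are assumptions chosen for illustration, not taken from any of the implementations described above.

```python
import math

FRAC_BITS = 14            # Q14: B can approach 2.0, which does not fit in Q15
ONE = 1 << FRAC_BITS


def resonator_coeffs(freq_hz, bw_hz, fs_hz):
    """Coefficients for y[n] = A*x[n] + B*y[n-1] + C*y[n-2]."""
    t = 1.0 / fs_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c       # normalizes the gain at DC to 1
    return a, b, c


def resonate_float(x, a, b, c):
    """Floating point reference implementation."""
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = a * s + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def to_q(v):
    """Convert a float to a Q14 integer."""
    return int(round(v * ONE))


def resonate_fixed(x, a, b, c):
    """Integer-only inner loop; on a real target acc needs 32 bits."""
    aq, bq, cq = to_q(a), to_q(b), to_q(c)
    y1 = y2 = 0
    out = []
    for s in x:
        acc = aq * to_q(s) + bq * y1 + cq * y2    # Q28 accumulator
        y = (acc + (ONE >> 1)) >> FRAC_BITS       # round back to Q14
        out.append(y / ONE)                       # to float for comparison only
        y2, y1 = y1, y
    return out


a, b, c = resonator_coeffs(500.0, 60.0, 10000.0)
impulse = [1.0] + [0.0] * 199
err = max(abs(p - q) for p, q in
          zip(resonate_float(impulse, a, b, c), resonate_fixed(impulse, a, b, c)))
print("max float/fixed difference:", err)
```

The integer loop needs only multiplies, adds and a shift, which is why it suits targets without an FPU. The price is coefficient and rounding error that has to be checked per filter.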

Unfortunately, there are no concrete criteria that I can use to
decide which (combination of algorithms) is "optimal".  Judging
the quality of speech output isn't something that lends itself to
a simple metric.  And, different approaches have different
tradeoffs (e.g., it's considerably harder to alter a diphone
voice than one created via LPC or formant synthesis)

[If algorithm 1 pronounces A, B and C well but chokes on D and E,
is that better than algorithm 2 pronouncing C, D and E well and
choking on A and B?  Does it matter if any of them can pronounce
"platypus" correctly?  Will questions ever be uttered?  If not,
then why deal with the associated rising intonation?  Will input
text always be grammatically correct?  Or, will I have to potentially
deal with "nonsense"?]

I didn't see any of google's code presented in YOUR implementation...

> Just babbling?

I don't think you've thought through the problem space.

Close your eyes -- permanently -- as if you were blind, suffered from
macular degeneration, diabetic retinopathy, etc.  (We'll assume you're
still mentally competent and able to suss-out the details of high tech
kit so we can ignore that potential complication...)

Your sole means of getting information from this gizmo is via audio
(i.e., speech and annunciators).

Now, take that brand new gizmo that you've been given and make it
talk to you.  How do you connect to the internet?  How do you
get status messages from it telling you that it *can't* connect?
Are you sure it is powered on?  How do you adjust the volume?
Voice used?  Speaking rate?  How do you even KNOW how to do these
things -- as you can't READ a manual!

(Gee, it sure would be nice if the device could read it *to* you...
but, you'd need to be able to synthesize speech BEFORE you'd
*configured* the device in order to learn HOW to *configure* it!
OTOH, if it could speak "natively", then it can prompt you to
determine if you need assistance:  "Would you like me to read
the manual to you?")

When your box stops talking, how do you know if it is because
it lost the connection to google?  Or, a bad battery?  Or, just high
latency or dropped packets?  Is there a way to coax it to utter
something to prove it is still operational (but, that would
have to be an annunciator and not spoken word)?

How do you ask *it* how much battery run-time is left?  Does
it have to send a message to google so google can sort out
how to speak that information?

Similarly, when charging, how do you ask it how long until
fully recharged?  Maybe a series of coded beeps and bops
and an outrageously good memory to sort out which is which?

OTOH, if you can *always* speak -- even in the absence of
google -- then you can convey all of this information just
like any other application-generated information -- in
spoken utterances:

"Battery at 46% charge.  Estimate 3 hours until depleted."
"Battery at 52% charge.  Estimate charging complete in 2:10."
"Nominal battery life exceeded.  Replace soon"
"Volume level 12 (of 20)"
"High frequency boost +3 (of 5)"
"Low frequency cut -2 (of 5)"
"Voice selected is Tommy"
"Speech rate set to 200 words per minute"
"Received (radio) signal strength low"
"Name resolution error:  google.com"
"Unable to acquire DHCP lease."
"Using IP 192.168.1.105"
"Available networks are MYHOUSE23, VERIZON19 & YourPlace"
"Selected server busy; use foo.bar as alternate"
"For assistance, contact Dr. Jones at x5-2323"
"Access denied"

etc.  But, being able to *always* speak means it has to be
able to do so "cheaply" as you don't know how long it will
have to rely on its own capabilities to communicate with
the user!
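
A minimal sketch of that idea: device state is rendered into utterance text on the device itself and handed to whatever local synthesizer is present, with no network involved. The function names and the `speak` callable are hypothetical, and the wording simply mirrors the example messages above.

```python
def battery_utterance(percent, minutes, charging):
    """Render battery state as a spoken status message."""
    if charging:
        hours, mins = divmod(minutes, 60)
        return (f"Battery at {percent}% charge.  "
                f"Estimate charging complete in {hours}:{mins:02d}.")
    hours = round(minutes / 60)
    unit = "hour" if hours == 1 else "hours"
    return (f"Battery at {percent}% charge.  "
            f"Estimate {hours} {unit} until depleted.")


def volume_utterance(level, maximum=20):
    """Render the current volume setting."""
    return f"Volume level {level} (of {maximum})"


def announce(text, speak):
    """Hand the text to an on-device synthesizer (hypothetical callable)."""
    speak(text)


# Stand-in for the synthesizer: print instead of producing audio.
announce(battery_utterance(46, 180, charging=False), print)
announce(battery_utterance(52, 130, charging=True), print)
announce(volume_utterance(12), print)
```

Because the messages are built from unconstrained values (percentages, times, network names), canned recordings will not do; the synthesizer has to handle arbitrary text.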

Reply by Don Y ● December 24, 2021

On 12/24/2021 5:26 AM, Jan Panteltje wrote:
> 
> PS most android smartphones have text to speech and support bluetooth earpieces.
> Then you could write an application and maybe even use the phone's camera to guide the blind.
> If you have a clue about image recognition.

Go get that android phone.  CLOSE YOUR EYES.  Now, find the application
and download it.  Figure out how to pair your BT earpiece to the
phone (still with your eyes closed).  You suffer from the typical engineering
Dunning-Kruger effect: overestimating your own grasp of the problem
domain.  (Surely you have considered the bootstrap issues I presented
in my previous post -- right?)

I guess you'll write another version that runs under iOS, too?

And, ensure that the application takes over the phone completely
so the (blind) user isn't stuck wondering why your app isn't
talking to him, presently.  Or, how to get BACK to it after
completing a phone call.

And, when there's a new android version released, keep chasing it?

Or, if google stops offering their (free!) service -- assuming, of
course, that you don't mind google knowing everything that is
SAID to you.

Spend a day blind-folded.  Or, in a wheelchair.  Or, with ears plugged.
Then, explain how "simple" it is to address these shortcomings.  And,
how patient you will be with the fools who think they understand the
problem space and toss out half-baked solutions.  EVERYTHING is easy
if you don't actually have to *do* it!

Obviously simple enough that there must be TONS of solutions out there,
right?  (where are they all hiding?)

> Carrying a raspi and a LiPo battery around is not that hard either; it should run for 10 hours
> and fits in one's pocket.

How wonderful of you to decide what users should be willing to
carry -- and how!  This, in *addition* to the android phone?
What if their only *input* modality is a sip-n-puff?  Or,
a braille keypad?  Or, an ASL glove?  Surely the phone is designed
to accommodate all of these... I just haven't found them in
any stores!

> Show us something you did apart from babble.

Yeah, that's the Larkin trick.  Then, when confronted with concrete
evidence, skulk away in ignorance.

> Else you are - in essence  - just wasting my time.

Show me you've even thought about the application -- as your comments
clearly indicate you're just tilting at windmills.  Wasting *my* time.

Have you ever done anything with something bigger than a PIC?  In
anything other than ASM?  Ever designed a system other than a
prebuilt box (e.g., rPi)?  Surely you've written kernel drivers?
Network stacks?  OSs?

Show us.  Something beyond trivial shell scripts.

> It is what it is.

yup.  My sentiments exactly!

Bye!

Reply by David Brown ● December 24, 2021

On 24/12/2021 11:15, Jeroen Belleman wrote:
> On 2021-12-23 15:37, David Brown wrote:
>> On 23/12/2021 12:23, Jeroen Belleman wrote:
>>> On 2021-12-23 10:39, Martin Brown wrote:
>>> [...]
>>>> Cholmondeley (Chumlee) catch out most
>>>> non-native English speakers in fact most non-locals. [...]
>>>
>>> English is well known for its complete disconnect between
>>> pronunciation and spelling, but this is ridiculous.
>>>
>>
>> It is not a "complete disconnect" - not by a long way.  Despite some of
>> the common oddities of spelling in English, and some particularly
>> unusual cases, there are far worse languages.  Look at verb endings in
>> French - many different spellings have different meanings, but are
>> pronounced the same.  Mongolian and Gaelic have a very much bigger
>> separation between the phonetic values of the written spellings and the
>> actual pronunciation.  [...]
> 
> French spelling is pretty regular, in the sense that spelling
> usually unambiguously specifies the pronunciation. The reverse
> is far from true though. 

That is a good way to put it.

> I should know, I live there.
> 

From my school days, I found French was not bad for reading and writing
(despite it seeming like all verbs are irregular), but I could never get
the hang of understanding even simple spoken French.

I am fortunate that the language of my second home - Norwegian - is
quite regular in both directions (although of course regional dialects
vary).

Reply by Jan Panteltje ● December 24, 2021

On a sunny day (Fri, 24 Dec 2021 06:04:01 -0700) it happened Don Y
<blockedofcourse@foo.invalid> wrote in <sq4gga$gs4$2@dont-email.me>:

>Have you ever done anything with something bigger than a PIC?  In
>anything other than ASM?  Ever designed a system other than a
>prebuilt box (e.g., rPi)?  Surely you've written kernel drivers?
>Network stacks?  OSs?

>Show us.  Something beyond trivial shell scripts.

My site is full of it, open source too.


>> It is what it is.
>
>yup.  My sentiments exactly!

You are an asshole, posting only babble here, have nothing to show for it, and cannot read, it seems.

>Bye!

Nothing missed

Post to a babble group, not to a design group.
You have nothing.

Reply by Jan Panteltje ● December 24, 2021

On a sunny day (Fri, 24 Dec 2021 15:55:55 +0100) it happened David Brown
<david.brown@hesbynett.no> wrote in <sq4n1s$phk$1@dont-email.me>:

>On 24/12/2021 11:15, Jeroen Belleman wrote:
>> On 2021-12-23 15:37, David Brown wrote:
>>> On 23/12/2021 12:23, Jeroen Belleman wrote:
>>>> On 2021-12-23 10:39, Martin Brown wrote:
>>>> [...]
>>>>> Cholmondeley (Chumlee) catch out most
>>>>> non-native English speakers in fact most non-locals. [...]
>>>>
>>>> English is well known for its complete disconnect between
>>>> pronunciation and spelling, but this is ridiculous.
>>>>
>>>
>>> It is not a "complete disconnect" - not by a long way.  Despite some of
>>> the common oddities of spelling in English, and some particularly
>>> unusual cases, there are far worse languages.  Look at verb endings in
>>> French - many different spellings have different meanings, but are
>>> pronounced the same.  Mongolian and Gaelic have a very much bigger
>>> separation between the phonetic values of the written spellings and the
>>> actual pronunciation.  [...]
>> 
>> French spelling is pretty regular, in the sense that spelling
>> usually unambiguously specifies the pronunciation. The reverse
>> is far from true though. 
>
>That is a good way to put it.
>
>> I should know, I live there.
>> 
>
>From my school days, I found French was not bad for reading and writing
>(despite it seeming like all verbs are irregular), but I could never get
>the hang of understanding even simple spoken French.
>
>I am fortunate that the language of my second home - Norwegian - is
>quite regular in both directions (although of course regional dialects
>vary).

Here in the Netherlands they started teaching French in kindergarten.
Maybe that is why I have few problems with the language when in France.
They did not start with German and English until high school.

Reply by Arie de Muijnck ● December 24, 2021

On 2021-12-24 16:24, Jan Panteltje wrote:
> Here in the Netherlands they started teaching French in kindergarten.
> Maybe that is why I have few problems with the language when in France.
> They did not start with German and English until high school.

Oh, these memories... French started when I was 11, English at 12, 
German at 13. Had exams in all 4 languages.

	Papa fume une pipe.
	Maman coupe le pain.
	Le soldat sur la mur.
		etc...

But it still really helps on holidays in France  :-}

Arie

Reply by Jan Panteltje ● December 24, 2021

On a sunny day (Fri, 24 Dec 2021 18:14:09 +0100) it happened Arie de Muijnck
<noreply@ademu.com> wrote in <61c5ffe1$0$9511$e4fe514c@usenet.xs4all.nl>:

>On 2021-12-24 16:24, Jan Panteltje wrote:
>> Here in the Netherlands they started teaching French in kindergarten.
>> Maybe that is why I have few problems with the language when in France.
>> They did not start with German and English until high school.
>
>Oh, these memories... French started when I was 11, English at 12, 
>German at 13. Had exams in all 4 languages.
>
>        Papa fume une pipe.
>        Maman coupe le pain.
>        Le soldat sur la mur.
>                etc...
>
>But it still really helps on holidays in France  :-}
>
>Arie
 
 https://www.youtube.com/watch?v=IJvI0WNihyM

Reply by Jan Frank ● December 24, 2021

Arie de Muijnck <noreply@ademu.com> wrote:

> On 2021-12-24 16:24, Jan Panteltje wrote:
>> Here in the Netherlands they started teaching French in kindergarten.
>> Maybe that is why I have few problems with the language when in France.
>> They did not start with German and English until high school.
> 
> Oh, these memories... French started when I was 11, English at 12, 
> German at 13. Had exams in all 4 languages.
> 
>      Papa fume une pipe.
>      Maman coupe le pain.
>      Le soldat sur la mur.
>           etc...
> 
> But it still really helps on holidays in France  :-}
> 
> Arie

I am Canadian and married my wife while I was with NATO in Metz, France. 
Our first child was a boy, and by the time he was 4, he could speak fluent 
French and English. He knew which was which, and never got them confused. 

It never ceased to amaze me how quickly children pick up languages. I 
suppose it is an evolutionary necessity to be able to say they are hungry 
(they are always hungry), and to learn the other things essential to life. 
It is a beautiful thing to watch.
