
Translation services/strategies/costs

Started by Don Y December 22, 2021
On a sunny day (Thu, 23 Dec 2021 08:43:24 -0700) it happened Don Y
<blockedofcourse@foo.invalid> wrote in <sq25f9$br9$1@dont-email.me>:

>On 12/23/2021 6:16 AM, Jan Panteltje wrote:
>> On a sunny day (Wed, 22 Dec 2021 17:21:06 -0700) it happened Don Y
>> <blockedofcourse@foo.invalid> wrote in <sq0fdu$gjc$1@dont-email.me>:
>>
>>> On 12/22/2021 10:15 AM, Jan Panteltje wrote:
>>>> On a sunny day (Wed, 22 Dec 2021 09:30:43 -0700) it happened Don Y
>>>> <blockedofcourse@foo.invalid> wrote in <spvjrv$tj1$1@dont-email.me>:
>>>>
>>>>> It would be a tough call to determine if American English had evolved more
>>>>> OR LESS than the original British.  I've read that American English is, in
>>>>> many ways, truer to its British roots than modern British English.
>>>>>
>>>>> Pronunciations also evolve, over time.  As well as speech patterns.
>>>>>
>>>>> E.g., I was taught "the" should be pronounced as "thee" when preceding
>>>>> a word beginning with a vowel sound: "Thee English", "Thee other guy"
>>>>> but with a schwa ahead of a consonant: "The next one", "the Frenchman".
>>>>> This seems to no longer be the norm.
>>>>>
>>>>> [You're interested in these sorts of things when you design a
>>>>> speech synthesizer; the different "wh" sounds, etc.]
>>>>
>>>> A pretty decent text to speech is google translate.
>>>>
>>>> This script, called gst2_en on my system, has a female talk in english:
>>>>
>>>> #!/bin/bash
>>>> say() { local IFS=+;/usr/bin/mplayer -ao alsa -really-quiet -noconsolecontrols
>>>> "http://translate.google.com/translate_tts?ie=UTF-8&client=tw-ob&q=$*&tl=en"; }
>>>> say $*
>>>>
>>>> You call it like this (with your text as example):
>>>> gst2_en ">E.g., I was taught "the" should be pronounced as "thee" when preceding"
>>>>
>>>> In the script the &tl=en can be changed for the language you want, so &tl=nl for Dutch and &tl=de for German.
>>>>
>>>> If you want the output to go to an mp3 file then use mplayer -dumpstream in that script.
>>>>
>>>> I find the quality better than other things I have tried.
>>>>
>>>> All Linux of course
>>>
>>> There are lots of synthesizers out there -- FOSS as well as commercial.
>>> But, those that run on a PC tend to be bloated implementations -- large
>>> dictionaries, unit databases, etc.  And, require a fair bit of CPU
>>> to deliver speech in real-time.  If you're trying to run in a small
>>> footprint consuming very little "energy" (think tiny battery), there
>>> really isn't much choice -- esp if you want to be able to tweak the voice
>>> to suit the listeners' preferences (with unconstrained vocabulary)
>>
>> Sure
>> But the advantage of this script is that it uses NO resources on the PC / raspi or whatever
>
>Of course it uses resources!  You need a network stack, the memory to
>handle the packets delivered across that connection, the memory to support
>the shell, the filesystem from which to load the script and other binaries,
>the kernel, etc.
Sure
>You just assume they cost nothing because they are already present
>in your implementation.  Take a *bare* rPi and see how much you have to
>add to it to make it speak.  *That* is the resource requirement.
Not sure what you mean by a 'bare rPi', but even my old raspi one has all that.
All extra it needs is an audio amp and speaker.
The rest you can do via ssh. I was just making an animated christmas tree for my laser video projector
on it via ssh; that old pi1 has analog video out that then goes to the analog video in of the i-connect picop
laser projector, works.
Of course it does all that while running my xgpspc server navigation program and a GPS processing and....

panteltje20: ~ # ssh -Y 192.168.178.73
root@192.168.178.73's password:
Linux raspi73 3.6.11+ #371 PREEMPT Thu Feb 7 16:31:35 GMT 2013 armv6l
Last login: Thu Dec 23 17:20:08 2021 from panteltje10
root@raspi73:~# uname -a
Linux raspi73 3.6.11+ #371 PREEMPT Thu Feb 7 16:31:35 GMT 2013 armv6l GNU/Linux
root@raspi73:~#

just keeps working and working and working 24/7

root@raspi73:~# cat /dev/ttyAMA0
$GPRMC,164612.00,V,,,,,,,231221,,,N*7A
$GPVTG,,,,,,,,,N*30
$GPGGA,164612.00,,,,,0,00,99.99,,,,,,*60
$GPGSA,A,1,,,,,,,,,,,,,99.99,99.99,99.99*30
$GPGSV,3,1,12,02,22,113,,03,04,001,,06,24,066,,11,17,106,*7E
$GPGSV,3,2,12,12,81,067,,19,16,041,,22,04,339,,24,41,140,*71
$GPGSV,3,3,12,25,59,259,,29,20,195,,31,10,302,,32,41,281,*76
$GPGL.......

root@raspi73:~# top
top - 17:49:22 up 66 days,  5:21, 11 users,  load average: 1.66, 1.69, 1.86
Tasks:  88 total,   2 running,  86 sleeping,   0 stopped,   0 zombie
%Cpu(s): 61.6 us, 37.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.6 si,  0.0 st
KiB Mem:    448776 total,   283472 used,   165304 free,    40276 buffers
KiB Swap:   102396 total,        0 used,   102396 free,    47700 cached
  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 2625 root      20   0  128m  40m 1568 R  74.3  9.3  67539:59 xgpspc
 2420 root      20   0  5904 1728  880 S   3.1  0.4   2744:00 rxvt
 2423 root      20   0  5904 1732  880 S   3.1  0.4   2744:07 rxvt

Nice stuff, raspberries, IF you know how to use those.
Uptime 66 days, last record was 256 days or so; moved house,
needed to rewire some power here in the new place a few times, now it is on UPS.

Don't complain, write code.
>[No, I don't care if it can also serve up web pages or log errors
>to remote hosts or handle TELNET connections...  I just want it to
>*speak*!  You'll get no "credit" for supporting those other things.]
>
>> but it does need a net connection, but mp3s are small.
>>
>> Here is another one using google translate:
>>
>> #!/bin/bash
>> echo "english text document to audio or to mp3"
>> echo "Usage: gst6_en filename.txt [1]"
>> echo "if second argument present output to mp3 file, one mp3 file per line, else to audio"
>> input=$1
>> lines=1
>> while IFS= read -r line
>> do
>>  echo "line $lines"
>>  if [ "$2" == "" ]
>>  then
>>   /usr/bin/mplayer -ao alsa -really-quiet -noconsolecontrols
>>   "http://translate.google.com/translate_tts?ie=UTF-8&client=tw-ob&q=$line&tl=en";
>>  else
>>   wget -O $1_$lines.mp3 "http://translate.google.com/translate_tts?ie=UTF-8&client=tw-ob&q=$line&tl=en"
>>  fi
>>  let lines=lines+1
>> done < $1
>>
>> So this will speak a whole english text file line by line or, if you call it with an extra argument,
>> make numbered mp3 files from a text file, one per line.
>> You can then play the numbered mp3 files in [any] sequence with a similar script,
>> and even edit and add comments by adding extra lines or deleting lines.
>>
>> Was just a quick hack....
>>
>> OTOH I have had the 'festival' speech synthesizer on the PC for 20 years or so, not that bad either.
>
>Festival is a prime example of that bloat.
I thought I was rather small... :-)
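
For reference, a cleaned-up sketch of the same idea as the two scripts above. It URL-encodes the text with curl instead of relying on IFS=+, so punctuation and quotes survive. It assumes curl and mplayer are installed, a network connection is up, and that the unofficial translate_tts endpoint still accepts these unauthenticated requests:

#!/bin/bash
# say.sh -- speak the command-line arguments via the Google Translate TTS
# endpoint used in the scripts above.  Sketch only: the endpoint is
# unofficial and may rate-limit or change.
lang="${TTS_LANG:-en}"        # override with TTS_LANG=nl, TTS_LANG=de, ...
text="$*"
[ -z "$text" ] && { echo "usage: $0 some text to speak" >&2; exit 1; }

tmp="$(mktemp)" || exit 1
trap 'rm -f "$tmp"' EXIT

# -G turns the --data-urlencode fields into query parameters on the URL
curl -sG "http://translate.google.com/translate_tts" \
     --data-urlencode "ie=UTF-8" \
     --data-urlencode "client=tw-ob" \
     --data-urlencode "tl=$lang" \
     --data-urlencode "q=$text" \
     -o "$tmp" || { echo "download failed" >&2; exit 1; }

mplayer -ao alsa -really-quiet -noconsolecontrols "$tmp"

Called as ./say.sh Hello there, or TTS_LANG=nl ./say.sh goedemorgen for Dutch; keeping the mp3 instead of playing it is just a matter of copying the temp file before it is removed.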
In article <sq1srd$147f$1@gioia.aioe.org>, pNaOnStPeAlMtje@yahoo.com 
says...
> >>> E.g., I was taught "the" should be pronounced as "thee" when preceding
> >>> a word beginning with a vowel sound: "Thee English", "Thee other guy"
> >>> but with a schwa ahead of a consonant: "The next one", "the Frenchman".
> >>> This seems to no longer be the norm.
I'm not aware of any such rule or even pattern. On the other hand, if emphasising that there is a particular other guy you are referring to, the "thee" emphasises his singularity...
On 12/23/2021 12:12 PM, Mike Coon wrote:
> In article <sq1srd$147f$1@gioia.aioe.org>, pNaOnStPeAlMtje@yahoo.com
> says...
>>>>> E.g., I was taught "the" should be pronounced as "thee" when preceding
>>>>> a word beginning with a vowel sound: "Thee English", "Thee other guy"
>>>>> but with a schwa ahead of a consonant: "The next one", "the Frenchman".
>>>>> This seems to no longer be the norm.
>
> I'm not aware of any such rule or even pattern. On the other hand if
> emphasising that there is a particular other guy that you are referring
> to, the "thee" emphasises his singularity...
<https://www.merriam-webster.com/words-at-play/how-do-you-pronounce-the-let-us-count-the-ways>
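
As an aside for the synthesizer angle mentioned earlier in the thread, the vowel/consonant rule is easy to approximate as a text pre-processing step. A rough GNU-sed sketch follows; it keys on the next word's first letter rather than its first sound, so "the hour" and "the university" come out wrong, which is exactly why real TTS front ends work from phonemes rather than spelling:

#!/bin/bash
# mark "the" as "thee" when the next word starts with a vowel letter;
# illustration only, not a real front end
mark_the() {
    sed -E 's/\b[Tt]he ([AEIOUaeiou])/thee \1/g'
}

echo "the English, the other guy, the next one, the Frenchman" | mark_the
# -> thee English, thee other guy, the next one, the Frenchman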
Jeroen Belleman wrote:
> On 2021-12-23 10:39, Martin Brown wrote:
> [...]
>> Cholmondeley (Chumlee) catch out most
>> non-native English speakers in fact most non-locals. [...]
>
> English is well known for its complete disconnect between
> pronunciation and spelling, but this is ridiculous.
>
> Jeroen Belleman
English family names, and place names in England, can be confusing. It's not a
language issue, it's mostly a legacy of the Norman Conquest. For instance

Pontefract (castle) = Pumfrey
Featherstonehaugh (family) = Fanshaw

Over here it's mostly Americanized pronunciations by American families
descended from immigrants, e.g.

Dubois = De Boyce
Daubert = Dowburt

Not very different from "Parris" or "The Hague" or "Pekin" or "Moscow" (the
Idaho one is pronounced "moscoe"), but neither sounds like the Russian
pronunciation. The French call London "Londres".

The current fashion for aping native pronunciations of place names that have
been well known for ages is pretty silly, actually.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
On 12/23/2021 9:53 AM, Jan Panteltje wrote:
> On a sunny day (Thu, 23 Dec 2021 08:43:24 -0700) it happened Don Y
> <blockedofcourse@foo.invalid> wrote in <sq25f9$br9$1@dont-email.me>:
>
> [...]
>
>> You just assume they cost nothing because they are already present
>> in your implementation.  Take a *bare* rPi and see how much you have to
>> add to it to make it speak.  *That* is the resource requirement.
>
> Not sure what you mean by a 'bare rPi', but even my old raspi one has all that.
Strip all of the code off of it so you are starting with *hardware*. Then, add back what you need to make it speak. Otherwise, you're comparing apples to oranges. I can type "Hello, World!" on a sheet of paper. Then, put it on the scanner glass of my Reading Machine and hear it speak those words. Does that mean there is *no* resource usage associated with those utterances? :>
> All extra it needs is an audio amp and speaker.
> [...]
> Don't complain, write code.
On a sunny day (Thu, 23 Dec 2021 12:57:23 -0700) it happened Don Y
<blockedofcourse@foo.invalid> wrote in <sq2kbf$s5s$1@dont-email.me>:

>On 12/23/2021 9:53 AM, Jan Panteltje wrote:
>> On a sunny day (Thu, 23 Dec 2021 08:43:24 -0700) it happened Don Y
>> <blockedofcourse@foo.invalid> wrote in <sq25f9$br9$1@dont-email.me>:
>>
>> [...]
>>
>>> You just assume they cost nothing because they are already present
>>> in your implementation.  Take a *bare* rPi and see how much you have to
>>> add to it to make it speak.  *That* is the resource requirement.
>>
>> Not sure what you mean by a 'bare rPi', but even my old raspi one has all that.
>
>Strip all of the code off of it so you are starting with *hardware*.
>Then, add back what you need to make it speak.
OK, let me give you an example of this, and why it is a choice between apples and oranges.
Let's say we have nothing but a PIC 18F14K22 (because I have those).

To do the internet thing you need a TCP stack, and add a Microchip ENC28J60 ethernet chip.
Microchip _has_ a TCP stack, but then what fun is it; I wrote one back in the 5 1/4 inch floppy days but
those files got lost. Here is a project with a UDP stack I wrote in PIC asm:
 http://panteltje.com/panteltje/pic/ethernet_color_pic/
It controls room lighting from anywhere, been working fine 24/7 since 2013.
You will need:
 1 PIC18F14K22
 1 ENC28J60

Now let's see if we can do audio out with that.
Sure, I have done audio with the same PIC:
 http://panteltje.com/panteltje/pic/audio_pic/
That used PWM; here I would perhaps use an R2R DAC on 8 PIC output pins.
The B I G question now is: "With this chip, can I decode the mp3 stream from google translate?"
Perhaps; I would have to look at the source of mpg123 (open source C mp2/mp3 decoder) to see if I can do it all with 256 bytes of RAM.
Maybe the buffer size is too small, maybe a bigger PIC or some external memory is needed.
Never wrote an mp3 decoder, so question mark here.
Config the system via RS232 into EEPROM (as in the projects above) for IP, gateway, MAC, etc.

So 2 chips (or 3 if memory), some caps, a power regulator, a wall wart, and now you need to make a board if it is for production.
And you need to write the asm.
And test and debug it.
Estimated time: some days. Cost per hour of a qualified person?

A Raspberry Pi that has all connectors plus some costs $47 (went up a while ago because of chip shortages, I have read)
plus a small SD card.
It runs Linux, is easy to program in minutes, and is proven reliable; no board layouts needed, ready in an hour or so.
And it can fall back on whatever speech synth you installed on it if there is no internet connection for any reason.
The advantage of using google for speech is that THEY will do their best to make the audio as good as possible
and support several languages.

So, in short, the Raspberry way is faster, cheaper, better, proven reliable, available everywhere.
Now show us what YOU did.
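
A sketch of that fall-back idea, assuming curl, mplayer and espeak are installed on the Pi (festival's --tts mode would slot in the same way); the five-second timeout is arbitrary:

#!/bin/bash
# speak $* via google translate when the network is up, otherwise via a
# local synthesizer, so the box still talks with no internet connection
text="$*"
lang="en"
tmp="$(mktemp)" || exit 1
trap 'rm -f "$tmp"' EXIT

if curl -sG --max-time 5 "http://translate.google.com/translate_tts" \
        --data-urlencode "ie=UTF-8" --data-urlencode "client=tw-ob" \
        --data-urlencode "tl=$lang" --data-urlencode "q=$text" \
        -o "$tmp" && [ -s "$tmp" ]; then
    mplayer -ao alsa -really-quiet -noconsolecontrols "$tmp"
else
    espeak -v "$lang" "$text"      # offline fallback, no network needed
fi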
On 2021-12-23 15:37, David Brown wrote:
> On 23/12/2021 12:23, Jeroen Belleman wrote:
>> On 2021-12-23 10:39, Martin Brown wrote:
>> [...]
>>> Cholmondeley (Chumlee) catch out most
>>> non-native English speakers in fact most non-locals. [...]
>>
>> English is well known for its complete disconnect between
>> pronunciation and spelling, but this is ridiculous.
>
> It is not a "complete disconnect" - not by a long way. Despite some of
> the common oddities of spelling in English, and some particularly
> unusual cases, there are far worse languages. Look at verb endings in
> French - many different spellings have different meanings, but are
> pronounced the same. Mongolian and Gaelic have a very much bigger
> separation between the phonetic values of the written spellings and the
> actual pronunciation. [...]
French spelling is pretty regular, in the sense that spelling usually unambiguously specifies the pronunciation. The reverse is far from true though. I should know, I live there. Jeroen Belleman
On 12/24/2021 1:55 AM, Jan Panteltje wrote:
> On a sunny day (Thu, 23 Dec 2021 12:57:23 -0700) it happened Don Y
> <blockedofcourse@foo.invalid> wrote in <sq2kbf$s5s$1@dont-email.me>:
>
> [...]
>
>> Strip all of the code off of it so you are starting with *hardware*.
>> Then, add back what you need to make it speak.
>
> OK, let me give you an example of this, and why it is a choice between apples and oranges.
> Let's say we have nothing but a PIC 18F14K22 (because I have those).
>
> [...]
>
> So, in short, the Raspberry way is faster, cheaper, better, proven reliable, available everywhere.
> Now show us what YOU did.
Put it *in* a bluetooth earpiece and have it run off the battery that's
in that earpiece.  Make sure that earpiece is paired with a BT host that
ultimately has internet access -- to get to your google service.  And,
maintain this connectivity while I walk, drive, ride a bicycle or
any other activity -- above or below ground.

You're solving the wrong problem with a sledgehammer.
On 12/24/2021 3:36 AM, Don Y wrote:
> On 12/24/2021 1:55 AM, Jan Panteltje wrote:
>
> [...]
>
>> So, in short, the Raspberry way is faster, cheaper, better, proven reliable,
>> available everywhere.
>> Now show us what YOU did.
>
> Put it *in* a bluetooth earpiece and have it run off the battery that's
> in that earpiece.  Make sure that earpiece is paired with a BT host that
> ultimately has internet access -- to get to your google service.  And,
> maintain this connectivity while I walk, drive, ride a bicycle or
> any other activity -- above or below ground.
>
> You're solving the wrong problem with a sledgehammer.
Ask yourself how your device is going to TELL the user (who lacks eyesight)
that "I'm sorry, I can't contact google.com at the moment".

Or, how you're going to tell your device (or google) to use a voice that is
richer in low frequency components (larger head size).  Or, perhaps a smaller
head size that is more friendly to a young child user.

Or, ask it how much battery time is remaining (remember, you have to be able
to do all of these things while NOT in contact with google).

Or, tell it to speak more rapidly (without altering the pitch of the speech).
Or, slowly.  Or, spell that last word because you couldn't quite sort out
what it was saying.

Or, tell it to try contacting a different BT host if it can't establish a
connection with the nominal BT host.  Or, if that host can't get out to the
internet.  Or...

You haven't thought out the problem space to see why your reliance on an
external service is flawed.
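
For what it's worth, an offline synthesizer already exposes some of those knobs locally. A rough illustration using espeak's rate (-s, words per minute) and pitch (-p) options; the function names are made up for the example, and it says nothing about the battery, BT-host or connectivity questions above:

#!/bin/bash
# local voice control with espeak: works with no network at all
RATE=160    # speaking rate, words per minute
PITCH=40    # lower pitch ~ "larger head", higher ~ "smaller head"

speak()      { espeak -s "$RATE" -p "$PITCH" "$*"; }
speak_fast() { espeak -s $((RATE * 3 / 2)) -p "$PITCH" "$*"; }   # faster, same pitch
spell()      { espeak -s "$RATE" -p "$PITCH" "$(echo "$1" | sed 's/./& /g')"; }  # letter by letter

speak "Battery at twenty percent."
speak_fast "I could not reach google dot com; using the local voice."
spell "Cholmondeley"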
On a sunny day (Fri, 24 Dec 2021 03:36:53 -0700) it happened Don Y
<blockedofcourse@foo.invalid> wrote in <sq47sd$qn2$1@dont-email.me>:

>On 12/24/2021 1:55 AM, Jan Panteltje wrote:
>> On a sunny day (Thu, 23 Dec 2021 12:57:23 -0700) it happened Don Y
>> <blockedofcourse@foo.invalid> wrote in <sq2kbf$s5s$1@dont-email.me>:
>>
>> [...]
>>
>> So, in short, the Raspberry way is faster, cheaper, better, proven reliable, available everywhere.
>> Now show us what YOU did.
>
>Put it *in* a bluetooth earpiece and have it run off the battery that's
>in that earpiece.  Make sure that earpiece is paired with a BT host that
>ultimately has internet access -- to get to your google service.  And,
>maintain this connectivity while I walk, drive, ride a bicycle or
>any other activity -- above or below ground.
>
>You're solving the wrong problem with a sledgehammer.
You should have specified that right away.
So again: PIC, bluetooth chip, asm; nothing new. Some people here can even design it all in one chip.
But we are talking text to speech, no (or did you change the requirement again)?
WTF would you get the text from?

Much simpler to use a normal bluetooth earpiece and a Raspberry Pi talking to it from a fixed place:
 https://pimylifeup.com/raspberry-pi-bluetooth/
For other platforms / systems there are plenty of bluetooth USB adaptors; I have some for the PC, also bluetooth earpieces.
Like I said, the raspi can fall back on any other synth if there is no internet connection.

AGAIN, where is your text coming from?
You did not show any design or code.
Just babbling?
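
For completeness, a minimal sketch of the earpiece routing on a Pi, assuming BlueZ plus PulseAudio as in the pimylifeup guide and an earpiece that has already been paired once; the MAC address is a placeholder, and hello.mp3 stands in for whatever the TTS script produced:

#!/bin/bash
# connect a paired bluetooth earpiece and make it the default audio sink
EARPIECE="XX:XX:XX:XX:XX:XX"    # placeholder -- substitute your earpiece's address

# newer bluetoothctl accepts a command on its command line
bluetoothctl connect "$EARPIECE" || { echo "earpiece not reachable" >&2; exit 1; }

# PulseAudio creates a bluez sink for the connected device; make it the default
sink="$(pactl list short sinks | awk '/bluez/ {print $2; exit}')"
[ -n "$sink" ] && pactl set-default-sink "$sink"

# anything played through pulse now goes to the earpiece, e.g. the TTS output
mplayer -ao pulse -really-quiet hello.mp3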