
Nice MCU for small jobs

Started by Phil Hobbs September 3, 2018
On 09/05/2018 11:05 AM, David Brown wrote:
> On 05/09/18 15:35, Joseph Gwinn wrote: >> On Sep 5, 2018, 698839253X6D445TD@nospam.org wrote >> (in article <pmnoi8$12kf$5@gioia.aioe.org>): >> >>> >>> Sure, linked list is not the only thing I use in C. >>> An other thing, also used in xgpspc, is the worldwide list of ships, >>> basically ALL ships in the world that have radio are in that list: >>> -rw-r--r-- 1 root root 31637292 Oct 3 2016 listv.tx >>> 31,637,292 >>> >>> 31 MB that looks like this: >>> 203013200 OEX2023 APSYRTE AUT PL MTB W25884 2,30 7 V SRI: VHFDSC >>> >>> When a radio message is received from a ship it contains the MMSI (the first >>> number on the previous line), >>> and xgpspc has to look it up in the list to get the ship-name, what it is, >>> etc... >>> In that case we first sort the list by number ONCE, and then do a successive >>> approximation search >>> and have the answer in a few ms for every message. >>> No plush plush needed, >>> >>> Maybe a hundred lines of code.. >> >> This successive approximation approach is well known in computer science, >> where it is called &ldquo;binary search&rdquo; in English. As you note, it is very >> fast. >> >> .<https://en.wikipedia.org/wiki/Binary_search_algorithm> >> >> The only algorithm that is faster is hashing, which is also discussed in the >> above wiki article. The lookup time for hashing is constant, regardless of >> the size of the dataset. >> > > There are many search and sorting algorithms, some of which are faster > than binary search - for some value of "faster", and in some > circumstances. Hashing is one example, but of course its speed depends > on the collision rate, which also depends on the space available. And > lookup speed may be fast in theory (it's just one array index) but slow > in practice (the array is spread over memory that is not in caches) - an > algorithm that is theoretically slower might be faster in practice if it > is more cache-friendly.
IIRC that's part of why linked-list insertion poops out compared to vector resize-and-insert for large data sets: the naive linked list isn't particularly cache-friendly once its nodes are scattered all over memory.
> Then there are parallel algorithms...
>
> Binary search is optimal from a theoretical viewpoint in a simple
> system. It's usually also very good in practice, but it does not always
> work out that way.
On 09/05/2018 11:00 AM, 698839253X6D445TD@nospam.org wrote:
> Joseph Gwinn wrote
>> On Sep 5, 2018, 698839253X6D445TD@nospam.org wrote
>>
>> This successive approximation approach is well known in computer science,
>> where it is called "binary search" in English. As you note, it is very
>> fast.
>>
>> .<https://en.wikipedia.org/wiki/Binary_search_algorithm>
>>
>> The only algorithm that is faster is hashing, which is also discussed in the
>> above wiki article. The lookup time for hashing is constant, regardless of
>> the size of the dataset.
>
> I am not sure hashing is faster, what gives you that idea?
> I have used it, was long time ago, trying to remember...
> You may well be right, but why?
Binary search time-complexity is O(1) best-case and O(log n) average- and worst-case; a hash table is O(1) best- and average-case but O(n) worst-case. A naive binary search implementation can certainly beat a pathological hash table implementation on average; one would hope most hash tables aren't pathological, though, and binary search only averages O(1) in equally pathological use-cases, e.g. when the key you want is almost always sitting at the midpoint.
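For anyone who wants the ship-list lookup from upthread spelled out, here's a minimal C sketch of the same idea: sort the records by MMSI once, then bsearch() on every incoming message. The struct layout and the extra sample records are invented for illustration; only the APSYRTE line comes from the post above, and none of this is the actual xgpspc code.

/* Sort once by MMSI, then binary-search per received message.
 * Struct layout and sample data are placeholders, not xgpspc's format. */
#include <stdio.h>
#include <stdlib.h>

struct ship {
    long mmsi;
    const char *name;
};

static int cmp_mmsi(const void *a, const void *b)
{
    const struct ship *x = a, *y = b;
    return (x->mmsi > y->mmsi) - (x->mmsi < y->mmsi);
}

int main(void)
{
    struct ship list[] = {
        { 203013200, "APSYRTE" },   /* MMSI/name from the post above */
        { 999999991, "SHIP A"  },   /* made-up placeholder records   */
        { 111111112, "SHIP B"  },
    };
    size_t n = sizeof list / sizeof list[0];

    qsort(list, n, sizeof list[0], cmp_mmsi);            /* sort ONCE */

    struct ship key = { 203013200, NULL };               /* MMSI from a message */
    struct ship *hit = bsearch(&key, list, n, sizeof list[0], cmp_mmsi);
    printf("%ld -> %s\n", key.mmsi, hit ? hit->name : "not in list");

    return 0;
}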
On 09/05/2018 11:00 AM, 698839253X6D445TD@nospam.org wrote:
> Joseph Gwinn wrote
>> On Sep 5, 2018, 698839253X6D445TD@nospam.org wrote
>>
>> This successive approximation approach is well known in computer science,
>> where it is called "binary search" in English. As you note, it is very
>> fast.
>>
>> .<https://en.wikipedia.org/wiki/Binary_search_algorithm>
>>
>> The only algorithm that is faster is hashing, which is also discussed in the
>> above wiki article. The lookup time for hashing is constant, regardless of
>> the size of the dataset.
>
> I am not sure hashing is faster, what gives you that idea?
> I have used it, was long time ago, trying to remember...
> You may well be right, but why?
The reason a hash table is faster is more-or-less that you've already done the work of the search up front and stored the answer, encoded in where each entry sits in the table. You trade space for time: "searching" for something is easy if you already know where it is.
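As a toy illustration of that trade-off, here's a little open-addressing hash table in C keyed by MMSI: the hash of the key decides (roughly) where the entry already lives, so a lookup is one hash plus a short probe regardless of how big the data set gets. The table size, hash constant and names are arbitrary choices for the sketch, and it assumes the table never fills up.

#include <stdio.h>
#include <string.h>

#define NSLOTS 1024                      /* power of two, comfortably > data set */

struct slot {
    long mmsi;                           /* 0 means the slot is empty */
    char name[32];
};

static struct slot table[NSLOTS];

static size_t slot_of(long mmsi)
{
    return ((unsigned long)mmsi * 2654435761u) & (NSLOTS - 1);
}

static void put(long mmsi, const char *name)
{
    size_t i = slot_of(mmsi);
    while (table[i].mmsi != 0)           /* linear probing on collision */
        i = (i + 1) & (NSLOTS - 1);
    table[i].mmsi = mmsi;
    strncpy(table[i].name, name, sizeof table[i].name - 1);
}

static const char *get(long mmsi)
{
    size_t i = slot_of(mmsi);
    while (table[i].mmsi != 0) {         /* stop at the first empty slot */
        if (table[i].mmsi == mmsi)
            return table[i].name;
        i = (i + 1) & (NSLOTS - 1);
    }
    return NULL;
}

int main(void)
{
    put(203013200, "APSYRTE");           /* record from the post upthread */
    const char *name = get(203013200);
    printf("%s\n", name ? name : "not found");
    return 0;
}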
On 09/05/2018 09:35 AM, Joseph Gwinn wrote:
> On Sep 5, 2018, 698839253X6D445TD@nospam.org wrote
> (in article <pmnoi8$12kf$5@gioia.aioe.org>):
>
>> Sure, linked list is not the only thing I use in C.
>> An other thing, also used in xgpspc, is the worldwide list of ships,
>> basically ALL ships in the world that have radio are in that list:
>> -rw-r--r-- 1 root root 31637292 Oct 3 2016 listv.tx
>> 31,637,292
>>
>> 31 MB that looks like this:
>> 203013200 OEX2023 APSYRTE AUT PL MTB W25884 2,30 7 V SRI: VHFDSC
>>
>> When a radio message is received from a ship it contains the MMSI (the first
>> number on the previous line),
>> and xgpspc has to look it up in the list to get the ship-name, what it is,
>> etc...
>> In that case we first sort the list by number ONCE, and then do a successive
>> approximation search
>> and have the answer in a few ms for every message.
>> No plush plush needed,
>>
>> Maybe a hundred lines of code..
>
> This successive approximation approach is well known in computer science,
> where it is called "binary search" in English. As you note, it is very
> fast.
>
> .<https://en.wikipedia.org/wiki/Binary_search_algorithm>
>
> The only algorithm that is faster is hashing, which is also discussed in the
> above wiki article. The lookup time for hashing is constant, regardless of
> the size of the dataset.
>
> Joe Gwinn
When I go to the library to find a novel I want here's what I do. I mentally split the library floor space in half and check every book in the first half one by one. This only takes approximately twelve days. If it's not in the first half I....fuuuuuk!!!! :( :(
On 05/09/18 14:35, Joseph Gwinn wrote:
> On Sep 5, 2018, 698839253X6D445TD@nospam.org wrote
> (in article <pmnoi8$12kf$5@gioia.aioe.org>):
>
>> Sure, linked list is not the only thing I use in C.
>> An other thing, also used in xgpspc, is the worldwide list of ships,
>> basically ALL ships in the world that have radio are in that list:
>> -rw-r--r-- 1 root root 31637292 Oct 3 2016 listv.tx
>> 31,637,292
>>
>> 31 MB that looks like this:
>> 203013200 OEX2023 APSYRTE AUT PL MTB W25884 2,30 7 V SRI: VHFDSC
>>
>> When a radio message is received from a ship it contains the MMSI (the first
>> number on the previous line),
>> and xgpspc has to look it up in the list to get the ship-name, what it is,
>> etc...
>> In that case we first sort the list by number ONCE, and then do a successive
>> approximation search
>> and have the answer in a few ms for every message.
>> No plush plush needed,
>>
>> Maybe a hundred lines of code..
>
> This successive approximation approach is well known in computer science,
> where it is called "binary search" in English. As you note, it is very
> fast.
>
> .<https://en.wikipedia.org/wiki/Binary_search_algorithm>
>
> The only algorithm that is faster is hashing, which is also discussed in the
> above wiki article. The lookup time for hashing is constant, regardless of
> the size of the dataset.
One algorithm that seems to combine the best behaviours of these approaches while avoiding the worst-case ones is the skip list.
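For anyone curious, here's a rough C sketch of the idea (integer keys only, no deletion, no error handling): a sorted linked list with extra "express lane" pointers chosen at random, which gives O(log n) expected search and insert without the rebalancing machinery of a tree. MAX_LEVEL and the coin-flip probability are arbitrary choices for the sketch.

/* Minimal skip list: each node gets a random number of levels;
 * higher levels skip over more of the list, so a search descends
 * from the top level down, moving right while the next key is smaller. */
#include <stdio.h>
#include <stdlib.h>

#define MAX_LEVEL 16

struct node {
    long key;
    struct node *next[MAX_LEVEL];
};

static struct node head;          /* sentinel; key field unused */
static int level = 1;             /* highest level currently in use */

static int random_level(void)
{
    int lvl = 1;
    while (lvl < MAX_LEVEL && (rand() & 1))   /* promote with p = 1/2 */
        lvl++;
    return lvl;
}

static void insert(long key)
{
    struct node *update[MAX_LEVEL];
    struct node *x = &head;

    for (int i = level - 1; i >= 0; i--) {    /* find splice point per level */
        while (x->next[i] && x->next[i]->key < key)
            x = x->next[i];
        update[i] = x;
    }

    int lvl = random_level();
    if (lvl > level) {
        for (int i = level; i < lvl; i++)
            update[i] = &head;
        level = lvl;
    }

    struct node *n = calloc(1, sizeof *n);
    n->key = key;
    for (int i = 0; i < lvl; i++) {           /* splice into each level */
        n->next[i] = update[i]->next[i];
        update[i]->next[i] = n;
    }
}

static int contains(long key)
{
    struct node *x = &head;
    for (int i = level - 1; i >= 0; i--)
        while (x->next[i] && x->next[i]->key < key)
            x = x->next[i];
    x = x->next[0];
    return x && x->key == key;
}

int main(void)
{
    long keys[] = { 244660000, 203013200, 538005671 };
    for (size_t i = 0; i < sizeof keys / sizeof keys[0]; i++)
        insert(keys[i]);
    printf("%d %d\n", contains(203013200), contains(123));   /* prints: 1 0 */
    return 0;
}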
bitrex wrote
> Binary search time-complexity is O(1) best-case and O(log n) worst-case,
> a hash table is O(1) best-case and O(n) worst-case.
>
> A naive binary search implementation can certainly beat a pathological
> hash table implementation in the average; one would hope most hash
> tables aren't pathological, though, and binary search is only O(1) in
> the average for pathological use-cases of binary search.
OK, yes, exactly.

Just looked up where I used hashing: I did actually use hashing for message IDs in my newsreader, to stop cross-posted repeats from being listed again and again, and also in my videotext / ceefax program for storing pages. Neither application is really that time critical, and the data sets are small.
On 09/05/2018 01:31 PM, 698839253X6D445TD@nospam.org wrote:
> bitrex wrote
>> Binary search time-complexity is O(1) best-case and O(log n) worst-case,
>> a hash table is O(1) best-case and O(n) worst-case.
>>
>> A naive binary search implementation can certainly beat a pathological
>> hash table implementation in the average; one would hope most hash
>> tables aren't pathological, though, and binary search is only O(1) in
>> the average for pathological use-cases of binary search.
>
> OK, yes, exactly.
>
> Just did look up where I used hashing,
> I did actually use hashing for message IDs in my newsreader,
> to stop cross posted repeats from being listed again and again.
> and also in my videotext / ceefax program for storing pages.
> Both applications are not really that time critical and the data sets are small.
Big O is really only useful for comparisons on very large data sets, where for example your data set is >> CPU cache size on desktop architectures. It says little about what's actually better in practice for small data sets, where the set-up/tear-down overhead of the algorithm and the execution time of things like branches become on the order of the time spent actually carrying out the full search.

For "small" data sets sometimes plain linear search beats everything else hands-down. Which use cases those are, and what the precise definition of "small" is for a particular architecture, are endlessly debated; x86 is so complex it's often anyone's guess, you have to profile to see.

As usual the answer to "what's the best way to..." is "it depends".
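To make the boring option concrete: a linear scan over a small contiguous array is one tight loop with no setup cost and very friendly memory access, which is why it so often wins below some crossover size. Where that crossover sits is exactly the "it depends" part; this is just a sketch, and the only honest way to find the cutoff on a given part is to measure. A common hybrid is to fall back to exactly this below some tuned threshold and only use the fancier structure above it.

#include <stddef.h>

/* Returns the index of key in a[0..n-1], or -1 if it isn't there.
 * One branch per element, contiguous reads, nothing to set up or tear down. */
static int linear_find(const long *a, size_t n, long key)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] == key)
            return (int)i;
    return -1;
}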
On 09/05/2018 02:13 PM, bitrex wrote:
> On 09/05/2018 01:31 PM, 698839253X6D445TD@nospam.org wrote:
>> bitrex wrote
>>> Binary search time-complexity is O(1) best-case and O(log n) worst-case,
>>> a hash table is O(1) best-case and O(n) worst-case.
>>>
>>> A naive binary search implementation can certainly beat a pathological
>>> hash table implementation in the average; one would hope most hash
>>> tables aren't pathological, though, and binary search is only O(1) in
>>> the average for pathological use-cases of binary search.
>>
>> OK, yes, exactly.
>>
>> Just did look up where I used hashing,
>> I did actually use hashing for message IDs in my newsreader,
>> to stop cross posted repeats from being listed again and again.
>> and also in my videotext / ceefax program for storing pages.
>> Both applications are not really that time critical and the data sets
>> are small.
>
> Big O is really only useful for analysis comparisons for very large data
> sets, where for example your data set is >> CPU cache size on desktop
> architectures. what's actually better in practice for small datasets
> where the set-up/tear-down overhead of the algorithm and execution time
> of things like branches becomes on the order of the time you spend
> actually carrying out the full search.
>
> For "small" data sets sometimes just linear search beats everything else
> hands-down. What the specific uses cases it will are and what the
> precise definition of "small" is for a particular architecture endlessly
> debated; x86 is so complex it's often anyone's guess, you have to
> profile to see.
>
> as usual the answer to "what's the best way to..." is "it depends"
For example, if your arch of choice has truly exceptional branch prediction, binary search can be faster even in cases where an on-paper analysis says it shouldn't be, just because a binary search can be written so that it's a lot easier for the branch predictor to suss out the taken branch than in a naive linear search or hash table lookup.
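One well-known way of writing it that way (not something from this thread, just the usual trick) is to make the data-dependent decision an arithmetic select instead of an unpredictable if/else; many compilers turn it into a conditional move. Whether it actually beats the textbook version on a given CPU is, again, something you have to measure. A sketch:

#include <stddef.h>

/* a[] must be sorted ascending; returns the index of key, or -1 if absent.
 * The only data-dependent choice is the multiply-by-0-or-1 on 'base',
 * so the loop itself branches the same way every iteration. */
static long branch_light_search(const long *a, size_t n, long key)
{
    const long *base = a;
    size_t len = n;

    while (len > 1) {
        size_t half = len / 2;
        /* advance the window iff the element just below the midpoint is < key */
        base += (size_t)(base[half - 1] < key) * half;
        len -= half;
    }
    return (n && base[0] == key) ? (long)(base - a) : -1;
}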
bitrex wrote:

> For the stuff they do, e,g, back-ends for Web sites receiving hundreds
> of thousands or millions of hits a day, being up to speed on the latest
> hip stuff in data structures is really important.
I have been doing that for the financial sector for 15 years, with even crazier volumes. Cutting-edge technology, partially invented by myself. My background in lock-free algorithms and low-level programming was the enabler here.

We were also pioneering GPU-based read-mostly databases; throughput was stellar, but latency killed the entire idea. It should have been done on an FPGA, but the management failed to recognize this opportunity. Many brilliant people were involved; perhaps the most advanced stuff I've ever done in my professional career, by several orders of magnitude.

Best regards, Piotr
Tuesday, 4 September 2018 at 03:35:04 UTC+2, Lasse Langwadt Christensen wrote:
> Tuesday, 4 September 2018 at 03:31:47 UTC+2, k...@notreal.com wrote:
>> On Mon, 3 Sep 2018 13:25:05 -0400, Phil Hobbs
>> <pcdhSpamMeSenseless@electrooptical.net> wrote:
>>
>>> LPC804: 15 MHz Cortex M0+, 32kB flash, 4k RAM, 12-bit 2 us ADC with 1
>>> LSB DNL, 10-bit DAC, the usual serial stuff, 20-TSSOP: 67 cents in
>>> reels, dev board $27.
>>
>> <https://www.digikey.com/product-detail/en/stmicroelectronics/STM32F030F4P6TR/497-17333-2-ND/4755941>
>>
>> $0.55
>
> that only has half the flash
I like the STM32F051 32 kB devices; they come in at below 32 cents in volume (ST slogan: 32 bits for 32 cents). Four times faster than the LPC802, double the I/O, much faster ADC, better DAC. Should I continue?

Cheers,
Klaus