On Sun, 17 Oct 2021 00:01:26 +0100, Tom Gardner
<spamjunk@blueyonder.co.uk> wrote:

>On 16/10/21 23:28, Joe Gwinn wrote:
>> On Fri, 15 Oct 2021 22:54:52 +0100, Tom Gardner
>> <spamjunk@blueyonder.co.uk> wrote:
>> 
>>> On 15/10/21 20:36, Joe Gwinn wrote:
>>>>> But if all you are used to doing is clagging together
>>>>> components in a framework, most people don't understand
>>>>> the simple underlying principles - and limitations. And
>>>>> that's when subtle problems occur.
>>>>
>>>> Right.  And it's intrinsic.  When teams go from say ten people max to
>>>> a hundred min, regression to the mean is inevitable.
>>>>
>>>> War story:  Many years ago, before C et al, we had a simple GUI to
>>>> code.  Because the hardware was very small and slow, I forbade the use
>>>> of Fortran FORMAT statements.  We didn't have the memory for the
>>>> FORMAT library, or the time to run it.  FORMAT is in fact a runtime
>>>> interpreter.
>>>>
>>>> Big fight.  Turned out that the programmer in question did not know
>>>> how to generate numbers in ascii format given integer values. Argument
>>>> ended when I gave the programmer a xerox copy of the relevant few
>>>> pages from Knuth's "The Art of Computer Programming".
>>>>
>>>> .<https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming>
>>>
>>> My first vacation job was to implement floating point
>>> arithmetic on a 6800 SBC, in a macro expansion language.
>>>
>>> I figured it all out, but had to be pointed to the CORDIC
>>> algorithm. Fortunately I had done a decent amount of maths
>>> by then, so it was trivial to understand. Still like those
>>> algorithms!
>> 
>> We always ran scaled binary, for speed, back in that day.
>
>Speed wasn't an issue, and it was only a demo-of-concept, so
>overall it was unimportant.
>
>
>> CORDIC is a cute algorithm, and still widely used in signal processing
>> firmware.
>
>I'm not surprised :) Sometimes elegance does win!

I suspect that a big attraction is that it can be done in a pipeline.


>>> I wonder what "your" programmer would have done!
>> 
>> Seppuku is unlikely.  Actually, the usual response was a bespoke table
>> lookup algorithm, likely with linear interpolation, with no more
>> precision than absolutely needed.
>> 
>> My objection was not that he didn't know, but that he tried to bluster
>> his way through.
>
>One of my techniques in many situations is to ask increasingly
>unrealistic questiond. The sooner I get a response like "no" or
>"I don't know" the better - because then I have more trust in a
>"yes" or "I know" answer.
>
>Bullshitters/bozos fail that test; good.

I use that test too.  A parallel form is to ask a person pitching XXX
what XXX is *not* good for.  

Broad experts will have a very clear idea where their technology is
essential (opens new vistas), very useful, reasonable, and most
importantly, should not be used.  True believers will come up blank on
the last parts.  But nothing is good for everything.

Joe Gwinn

On 16/10/21 23:28, Joe Gwinn wrote:
> On Fri, 15 Oct 2021 22:54:52 +0100, Tom Gardner
> <spamjunk@blueyonder.co.uk> wrote:
> 
>> On 15/10/21 20:36, Joe Gwinn wrote:
>>>> But if all you are used to doing is clagging together
>>>> components in a framework, most people don't understand
>>>> the simple underlying principles - and limitations. And
>>>> that's when subtle problems occur.
>>>
>>> Right.  And it's intrinsic.  When teams go from say ten people max to
>>> a hundred min, regression to the mean is inevitable.
>>>
>>> War story:  Many years ago, before C et al, we had a simple GUI to
>>> code.  Because the hardware was very small and slow, I forbade the use
>>> of Fortran FORMAT statements.  We didn't have the memory for the
>>> FORMAT library, or the time to run it.  FORMAT is in fact a runtime
>>> interpreter.
>>>
>>> Big fight.  Turned out that the programmer in question did not know
>>> how to generate numbers in ascii format given integer values. Argument
>>> ended when I gave the programmer a xerox copy of the relevant few
>>> pages from Knuth's "The Art of Computer Programming".
>>>
>>> .<https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming>
>>
>> My first vacation job was to implement floating point
>> arithmetic on a 6800 SBC, in a macro expansion language.
>>
>> I figured it all out, but had to be pointed to the CORDIC
>> algorithm. Fortunately I had done a decent amount of maths
>> by then, so it was trivial to understand. Still like those
>> algorithms!
> 
> We always ran scaled binary, for speed, back in that day.

Speed wasn't an issue, and it was only a demo-of-concept, so
overall it was unimportant.


> CORDIC is a cute algorithm, and still widely used in signal processing
> firmware.

I'm not surprised :) Sometimes elegance does win!


>> I wonder what "your" programmer would have done!
> 
> Seppuku is unlikely.  Actually, the usual response was a bespoke table
> lookup algorithm, likely with linear interpolation, with no more
> precision than absolutely needed.
> 
> My objection was not that he didn't know, but that he tried to bluster
> his way through.

One of my techniques in many situations is to ask increasingly
unrealistic questiond. The sooner I get a response like "no" or
"I don't know" the better - because then I have more trust in a
"yes" or "I know" answer.

Bullshitters/bozos fail that test; good.

On Fri, 15 Oct 2021 22:54:52 +0100, Tom Gardner
<spamjunk@blueyonder.co.uk> wrote:

>On 15/10/21 20:36, Joe Gwinn wrote:
>>> But if all you are used to doing is clagging together
>>> components in a framework, most people don't understand
>>> the simple underlying principles - and limitations. And
>>> that's when subtle problems occur.
>>
>> Right.  And it's intrinsic.  When teams go from say ten people max to
>> a hundred min, regression to the mean is inevitable.
>> 
>> War story:  Many years ago, before C et al, we had a simple GUI to
>> code.  Because the hardware was very small and slow, I forbade the use
>> of Fortran FORMAT statements.  We didn't have the memory for the
>> FORMAT library, or the time to run it.  FORMAT is in fact a runtime
>> interpreter.
>> 
>> Big fight.  Turned out that the programmer in question did not know
>> how to generate numbers in ascii format given integer values. Argument
>> ended when I gave the programmer a xerox copy of the relevant few
>> pages from Knuth's "The Art of Computer Programming".
>> 
>> .<https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming>
>
>My first vacation job was to implement floating point
>arithmetic on a 6800 SBC, in a macro expansion language.
>
>I figured it all out, but had to be pointed to the CORDIC
>algorithm. Fortunately I had done a decent amount of maths
>by then, so it was trivial to understand. Still like those
>algorithms!

We always ran scaled binary, for speed, back in that day.

CORDIC is a cute algorithm, and still widely used in signal processing
firmware.


>I wonder what "your" programmer would have done!

Seppuku is unlikely.  Actually, the usual response was a bespoke table
lookup algorithm, likely with linear interpolation, with no more
precision than absolutely needed.  

My objection was not that he didn't know, but that he tried to bluster
his way through.

Joe Gwinn

On Fri, 15 Oct 2021 15:00:00 -0700 (PDT), Tony Hans
<tonyhands109@gmail.com> wrote:

>Face to face deals but most be a client we have dealt before
>On the menu,
> we have variety of bud strains/ extracts,vapes, carts,coke,addies,xanax bars,vicodin,80mg Oxy,Valium, Tramadol,Ivermectin,Morphine,Modafinil ,back tar,,blues,Cough syrup, Codeine, Lsd and Mdma.
>
>We can meet face to face as well as delivery to your doorstep as pizza boy
>Or Postal service if you are not in my area.
>
>We also do international shipment to Canada, Australia,Greece,Denmark, Italy, UK and many 
>
>other countries in the European Union
>
>We accept payment by bitcoin only and communication by Wickr only
>
>Place your orders
>Wickr: Vendorvuggs1

Does each delivery come with handcuffs?

-- 

If a man will begin with certainties, he shall end with doubts, 
but if he will be content to begin with doubts he shall end in certainties.
Francis Bacon

Face to face deals but most be a client we have dealt before
On the menu,
 we have variety of bud strains/ extracts,vapes, carts,coke,addies,xanax bars,vicodin,80mg Oxy,Valium, Tramadol,Ivermectin,Morphine,Modafinil ,back tar,,blues,Cough syrup, Codeine, Lsd and Mdma.

We can meet face to face as well as delivery to your doorstep as pizza boy
Or Postal service if you are not in my area.

We also do international shipment to Canada, Australia,Greece,Denmark, Italy, UK and many 

other countries in the European Union

We accept payment by bitcoin only and communication by Wickr only

Place your orders
Wickr: Vendorvuggs1

On 15/10/21 20:36, Joe Gwinn wrote:
>> But if all you are used to doing is clagging together
>> components in a framework, most people don't understand
>> the simple underlying principles - and limitations. And
>> that's when subtle problems occur.
> Right.  And it's intrinsic.  When teams go from say ten people max to
> a hundred min, regression to the mean is inevitable.
> 
> War story:  Many years ago, before C et al, we had a simple GUI to
> code.  Because the hardware was very small and slow, I forbade the use
> of Fortran FORMAT statements.  We didn't have the memory for the
> FORMAT library, or the time to run it.  FORMAT is in fact a runtime
> interpreter.
> 
> Big fight.  Turned out that the programmer in question did not know
> how to generate numbers in ascii format given integer values. Argument
> ended when I gave the programmer a xerox copy of the relevant few
> pages from Knuth's "The Art of Computer Programming".
> 
> .<https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming>

My first vacation job was to implement floating point
arithmetic on a 6800 SBC, in a macro expansion language.

I figured it all out, but had to be pointed to the CORDIC
algorithm. Fortunately I had done a decent amount of maths
by then, so it was trivial to understand. Still like those
algorithms!

I wonder what "your" programmer would have done!

On Fri, 15 Oct 2021 00:56:47 +0100, Tom Gardner
<spamjunk@blueyonder.co.uk> wrote:

>On 15/10/21 00:09, Joe Gwinn wrote:
>> On Thu, 14 Oct 2021 17:10:37 +0100, Tom Gardner
>> <spamjunk@blueyonder.co.uk> wrote:
>> 
>>> On 14/10/21 16:26, Joe Gwinn wrote:
>>>> On Wed, 13 Oct 2021 00:57:53 +0100, Tom Gardner
>>>> <spamjunk@blueyonder.co.uk> wrote:
>>>>
>>>>> On 13/10/21 00:14, Joe Gwinn wrote:
>
>>>>> Another was debugging the 68000 SBC and its RTOS, where we
>>>>> had purchased both and both were buggy. Oh the (unproductive)
>>>>> fun we had!
>>>>
>>>> We had lots of problems with 68000 SBCs that didn't work as well, but
>>>> eventually figured out who _not_ to buy from.  Which was most of the
>>>> then vendors, who mostly vanished over time.  One assumes that word
>>>> got around.
>>>
>>> Interesting. From memory (1988!) we had three, and different
>>> ones had a different 1/4 of their memory non-functional.
>> 
>> I also recall lots of problems with correct implementation of such
>> instructions as Test-and-Set, which require the bus and bus interfaces
>> to cooperate when used in a multiprocessor setup.
>> 
>> Also, backplanes that could not handle conflicting atomic operations
>> from different SBCs on the same bus.  The symptom was that the
>> backplane locked up, requiring a power cycle to recover control.
>> 
>> I wrote a short test program for that, called "TASbasher".  One ran an
>> instance in each SBC, with mutually prime numbers of NOPs in the
>> loops, so the two instances could not get comfortable in an
>> alternating cycle.  Vulnerable backplanes would lock up in a second or
>> two.  And the hardware folk ran out of fix-your-buggy-software
>> excuses.
>
>I've never understood the "its /your/ problem" attitude,
>partly because I've usually been involved in all parts
>of the problem and solution - and don't recognise
>artificial boundaries.

All those tribes are under pressure to get to done NOW!  The
boundaries are not artificial, they are tribal.


>>> In the Java world, Doug Lea transliterated useful design patterns
>>> found in the real time community into Java classes. They were
>>> eventually included in the standard Java libraries.
>>>
>>> Most of my architectures seem to be what is sometimes called
>>> the half-async-half-sync pattern:
>>>   - create event (from a task or hardware interrupt)
>>>   - put event in queue, and return pronto
>>>   - loop, sucking event from queue, processing it to completion,
>>>     often creating an event and yielding
>> 
>> An oldie but goodie.  We would timestamp event records at creation,
>> and then process them while respecting the responsiveness needed for
>> whatever kind of event it was, in order if that was needed.
>
>Yup :)
>
>Being able to do that efficiently is useful. In my case
>I created PDFs on the fly, and could later post process
>them into CDFs with useful info such as the 95th percentile.
>That level of detail blew people's minds, but it seemed
>obvious to me.

I often go far deeper, depending.  Sometimes the measurements are to
parts per million, to ensure that the skirts are well enough known.


>It is very helpful to be able to show another
>developer/team/company that you have been able to
>narrow the system's unresponsiveness to their
>component. That's the pre-requisite for improvements.

Yep.


>>> A variation on that is a telecom system where there are
>>> many calls each with their own distinct event flow, where
>>> each call can have multiple outstanding events from different
>>> sources that must be processed in the order of reception.
>>>
>>> In that case
>>>   - for each remote event source, a task sucks on the incoming
>>>     events
>>>   - each event queue for the relevant call, and the relevant
>>>     call is queued in a global "work to be done" queue
>>>   - set of worker threads (~1 per core) take then next call in
>>>     the global queue, take the first event in that call's queue,
>>>     and process it to completion
>> 
>> This sounds like what I described just above.
>
>I triumphantly (re)invented FSMs at school, when programming
>a 39-bit machine in assembler.
>
>I triumphantly (re)invented microprogramming a couple of
>years later when thinking about how I would implement a
>computer.
>
>Such reinvention is to be expected when you start from
>scratch thinking about how you would solve a problem.
>Hopefully the reinvented wheels aren't too elliptical.
>
>But if all you are used to doing is clagging together
>components in a framework, most people don't understand
>the simple underlying principles - and limitations. And
>that's when subtle problems occur.

Right.  And it's intrinsic.  When teams go from say ten people max to
a hundred min, regression to the mean is inevitable.

War story:  Many years ago, before C et al, we had a simple GUI to
code.  Because the hardware was very small and slow, I forbade the use
of Fortran FORMAT statements.  We didn't have the memory for the
FORMAT library, or the time to run it.  FORMAT is in fact a runtime
interpreter.

Big fight.  Turned out that the programmer in question did not know
how to generate numbers in ascii format given integer values. Argument
ended when I gave the programmer a xerox copy of the relevant few
pages from Knuth's "The Art of Computer Programming".

.<https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming>


>>>> One also tested for priority inversion failures.  While speed is also
>>>> measured, this and the circular path are problems no matter the speed
>>>> of the problem or RTOS.
>>>
>>> I manage to structure my systems into three priority levels:
>>>    - hardware interrupt
>>>    - panic and commit seppuku
>>>    - everything else
>>>
>>> My brain is too feeble to cope with anything else.
>> 
>> The old hard-frame periodic RTOSes worked that way, but it doesn't
>> scale all that well.  But nobody had the panic and die option - there
>> was always some form of elasticity, sometimes called a "rubber clock",
>> because customers would not tolerate such fragility.
>
>I've seen pictures of clocks on railway platforms showing
>the time as 14:65, for similar reasons :)

Or, simply broken.


>> Present-day ERT systems have many priority levels, basically for
>> better overall responsiveness, taking into account that while they are
>> all ERT, some are more urgent than others.
>
>I've never worked on such problems, I'm pleased to say :)

But the one with bigger toys wins!


>> A classic example is a weather radar, where a 3D volume scan takes
>> about five minutes to collect all the data.  The problem is that the
>> intensity and kind of weather passing through coverage varies
>> randomly, and there can be too much data to handle.
>> 
>> The data reduction algorithms form a data-driven data-flow machine.
>> The objective is to not lose raw radar data even if the data
>> processing falls behind, especially in the heaviest of weather just
>> short of blowing the radar tower away.
>> 
>> So the highest priority is given to the interface to the radar
>> hardware, and the components of the data-flow machine have a gradient
>> of priorities, such that if one level falls behind, the earlier layer
>> will carry on, and the overall data reduction will eventually catch
>> up. But of course, in heavy enough weather, this becomes impossible.
>> 
>> So, in parallel, the number of free memory blocks is tracked, and if
>> it falls below some threshold, the later processing steps are simply
>> omitted, so some outputs are not produced at all in extremis.  Given
>> that the raw data was collected, anything can be generated after the
>> storm passes, up to when the radar tower went over the moon.
>
>All seems sane.

Yes.  There are many priority architectures, depending on the
objectives of the overall system controlled by the ERT code within.

Joe Gwinn

On 15/10/21 00:09, Joe Gwinn wrote:
> On Thu, 14 Oct 2021 17:10:37 +0100, Tom Gardner
> <spamjunk@blueyonder.co.uk> wrote:
> 
>> On 14/10/21 16:26, Joe Gwinn wrote:
>>> On Wed, 13 Oct 2021 00:57:53 +0100, Tom Gardner
>>> <spamjunk@blueyonder.co.uk> wrote:
>>>
>>>> On 13/10/21 00:14, Joe Gwinn wrote:

>>>> Another was debugging the 68000 SBC and its RTOS, where we
>>>> had purchased both and both were buggy. Oh the (unproductive)
>>>> fun we had!
>>>
>>> We had lots of problems with 68000 SBCs that didn't work as well, but
>>> eventually figured out who _not_ to buy from.  Which was most of the
>>> then vendors, who mostly vanished over time.  One assumes that word
>>> got around.
>>
>> Interesting. From memory (1988!) we had three, and different
>> ones had a different 1/4 of their memory non-functional.
> 
> I also recall lots of problems with correct implementation of such
> instructions as Test-and-Set, which require the bus and bus interfaces
> to cooperate when used in a multiprocessor setup.
> 
> Also, backplanes that could not handle conflicting atomic operations
> from different SBCs on the same bus.  The symptom was that the
> backplane locked up, requiring a power cycle to recover control.
> 
> I wrote a short test program for that, called "TASbasher".  One ran an
> instance in each SBC, with mutually prime numbers of NOPs in the
> loops, so the two instances could not get comfortable in an
> alternating cycle.  Vulnerable backplanes would lock up in a second or
> two.  And the hardware folk ran out of fix-your-buggy-software
> excuses.

I've never understood the "its /your/ problem" attitude,
partly because I've usually been involved in all parts
of the problem and solution - and don't recognise
artificial boundaries.



>> In the Java world, Doug Lea transliterated useful design patterns
>> found in the real time community into Java classes. They were
>> eventually included in the standard Java libraries.
>>
>> Most of my architectures seem to be what is sometimes called
>> the half-async-half-sync pattern:
>>   - create event (from a task or hardware interrupt)
>>   - put event in queue, and return pronto
>>   - loop, sucking event from queue, processing it to completion,
>>     often creating an event and yielding
> 
> An oldie but goodie.  We would timestamp event records at creation,
> and then process them while respecting the responsiveness needed for
> whatever kind of event it was, in order if that was needed.

Yup :)

Being able to do that efficiently is useful. In my case
I created PDFs on the fly, and could later post process
them into CDFs with useful info such as the 95th percentile.
That level of detail blew people's minds, but it seemed
obvious to me.

It is very helpful to be able to show another
developer/team/company that you have been able to
narrow the system's unresponsiveness to their
component. That's the pre-requisite for improvements.



>> A variation on that is a telecom system where there are
>> many calls each with their own distinct event flow, where
>> each call can have multiple outstanding events from different
>> sources that must be processed in the order of reception.
>>
>> In that case
>>   - for each remote event source, a task sucks on the incoming
>>     events
>>   - each event queue for the relevant call, and the relevant
>>     call is queued in a global "work to be done" queue
>>   - set of worker threads (~1 per core) take then next call in
>>     the global queue, take the first event in that call's queue,
>>     and process it to completion
> 
> This sounds like what I described just above.

I triumphantly (re)invented FSMs at school, when programming
a 39-bit machine in assembler.

I triumphantly (re)invented microprogramming a couple of
years later when thinking about how I would implement a
computer.

Such reinvention is to be expected when you start from
scratch thinking about how you would solve a problem.
Hopefully the reinvented wheels aren't too elliptical.

But if all you are used to doing is clagging together
components in a framework, most people don't understand
the simple underlying principles - and limitations. And
that's when subtle problems occur.


>>> One also tested for priority inversion failures.  While speed is also
>>> measured, this and the circular path are problems no matter the speed
>>> of the problem or RTOS.
>>
>> I manage to structure my systems into three priority levels:
>>    - hardware interrupt
>>    - panic and commit seppuku
>>    - everything else
>>
>> My brain is too feeble to cope with anything else.
> 
> The old hard-frame periodic RTOSes worked that way, but it doesn't
> scale all that well.  But nobody had the panic and die option - there
> was always some form of elasticity, sometimes called a "rubber clock",
> because customers would not tolerate such fragility.

I've seen pictures of clocks on railway platforms showing
the time as 14:65, for similar reasons :)


> Present-day ERT systems have many priority levels, basically for
> better overall responsiveness, taking into account that while they are
> all ERT, some are more urgent than others.

I've never worked on such problems, I'm pleased to say :)



> A classic example is a weather radar, where a 3D volume scan takes
> about five minutes to collect all the data.  The problem is that the
> intensity and kind of weather passing through coverage varies
> randomly, and there can be too much data to handle.
> 
> The data reduction algorithms form a data-driven data-flow machine.
> The objective is to not lose raw radar data even if the data
> processing falls behind, especially in the heaviest of weather just
> short of blowing the radar tower away.
> 
> So the highest priority is given to the interface to the radar
> hardware, and the components of the data-flow machine have a gradient
> of priorities, such that if one level falls behind, the earlier layer
> will carry on, and the overall data reduction will eventually catch
> up. But of course, in heavy enough weather, this becomes impossible.
> 
> So, in parallel, the number of free memory blocks is tracked, and if
> it falls below some threshold, the later processing steps are simply
> omitted, so some outputs are not produced at all in extremis.  Given
> that the raw data was collected, anything can be generated after the
> storm passes, up to when the radar tower went over the moon.

All seems sane.

On Thu, 14 Oct 2021 17:10:37 +0100, Tom Gardner
<spamjunk@blueyonder.co.uk> wrote:

>On 14/10/21 16:26, Joe Gwinn wrote:
>> On Wed, 13 Oct 2021 00:57:53 +0100, Tom Gardner
>> <spamjunk@blueyonder.co.uk> wrote:
>> 
>>> On 13/10/21 00:14, Joe Gwinn wrote:
>>>>
>>>>>>> That is all/partially/  hidden by the operation of the L1/L2/L3
>>>>>>> caches in processors, but all the interacting C language
>>>>>>> features (e.g. const, volatile, etc) have to be got right.
>>>>>>> If incorrect, then subtle rare unreproduceable errors will
>>>>>>> occur.
>>>>>> Also true.  It's the programmers' job to understand all this.  Not
>>>>>> that many understand the hardware that deeply, but enough do.
>>>>> Most of them understand (and I use that word loosely) only
>>>>> enough to allow them to copy-and-paste "solutions" from
>>>>> stackexchange. Once it compiles and passes their inadequate
>>>>> unit tests, it works - by definition.
>>>>
>>>> That is certainly true.  There is a wide dynamic range of programmer
>>>> skill.  Back in the day, I was a rarity, being bilingual (hardware and
>>>> software), and it allowed me to solve some pretty wild bugs fairly
>>>> easily, because I had access to a level of  information not commonly
>>>> available to pure software folks.
>>>
>>> I was too, everything from low noise analogue, conventional
>>> digital, "micros", RT software, cellphone modelling and measuring,
>>> and even some CRUD database stuff.
>>>
>>> When I'm feeling mischievous, usually in a pub, I'll tell
>>> people that I don't know where the boundary between hardware
>>> and software actually is.
>>>
>>> They are usually aghast at first. After mentioning microcode,
>>> the way modern ISAs are decomposed into RISC-like micro-ops
>>> inside the processor, FPGAs, emulation, etc, the reactions
>>> are one of two kinds
>>>   - slightly aggressive denial, usually accompanied by looks
>>>     of bewilderment and incomprehension
>>>   - amusement, and delight at the philosophical questions
>>>
>>> Guess which people I trust (technically) more!
>> 
>> Heh.  Did they buy you a beer?
>> 
>> 
>>>> War story.  Something like ten years ago, the C++ tribe was unable to
>>>> figure our why the radar software would go casters-up on startup. This
>>>> was likely a million lines of code at least.  When it fell over, no
>>>> error messages or other information was printed.   This problem
>>>> endured for months.
>>>
>>> Java, and other modern languages, are usually much
>>> better in that respect - aggressive use of exceptions
>>> and full stack traces really help.
>> 
>> In theory, so does C++.  For all the good it did.
>> 
>> And C++ stack traces can be pretty hard to follow.  But in my above
>> example, there was no stack trace to pore over.  That's where the
>> kernel debugger came in.  The kernel knows who is waiting on what, and
>> where in the application code the request was made.
>
>The only time this is likely to happen in a Java application
>is if the JVM is broken. I have seen config statements to
>the effect of "don't HotSpot optimise ByteArrays if using
>JRE 1.4.16". No idea how they noticed and isolated that
>as the bug!

The first three teams didn't figure it out?

>>> Another was debugging the 68000 SBC and its RTOS, where we
>>> had purchased both and both were buggy. Oh the (unproductive)
>>> fun we had!
>> 
>> We had lots of problems with 68000 SBCs that didn't work as well, but
>> eventually figured out who _not_ to buy from.  Which was most of the
>> then vendors, who mostly vanished over time.  One assumes that word
>> got around.
>
>Interesting. From memory (1988!) we had three, and different
>ones had a different 1/4 of their memory non-functional.

I also recall lots of problems with correct implementation of such
instructions as Test-and-Set, which require the bus and bus interfaces
to cooperate when used in a multiprocessor setup.

Also, backplanes that could not handle conflicting atomic operations
from different SBCs on the same bus.  The symptom was that the
backplane locked up, requiring a power cycle to recover control.

I wrote a short test program for that, called "TASbasher".  One ran an
instance in each SBC, with mutually prime numbers of NOPs in the
loops, so the two instances could not get comfortable in an
alternating cycle.  Vulnerable backplanes would lock up in a second or
two.  And the hardware folk ran out of fix-your-buggy-software
excuses.

>> As for RTOSes, we used MTOS, which did work.  
>
>It /might/ have been MTOS (but 1988 etc). I captured bus
>transactions to determine that when one RTOS call was made
>with a parameter, that did not reappear when the relevant
>task awoke. That was really "fun", given that instructions
>seen on the bus were not necessarily executed - the only
>way to tell was to execute the instructions on paper, and
>discount irrelevant prefetches.
>
>The RTOS vendor traced the problem to some assembly
>code in the port to the specific SBC, and fixed it
>speedily.

MTOS had the usual teething problems, but I don't recall that one. But
then it had survived some malicious benchmark tests.

>
>> Again, there were many
>> RTOSes that didn't work.  In many cases, the problem was bugs, and
>> also by design.  A classic design problem was if the inter-task
>> messaging facility could not handle a circular path, where A-> B -> C
>> -> A, and so on.
>> 
>> This meant that the RTOS could handle only synchronous activities,
>> which was crippling in ERT, because the order of arrival of events is
>> necessarily random, and all orders will happen, and a system built on
>> a synchronous RTOS would immediately lock up.  This lead to a very
>> simple but deadly RTOS benchmarking architecture.  This architecture
>> also works on middleware.
>
>It is why "higher level" design patterns are so useful,
>including in embedded systems. The traditional mutex/semaphore
>is necessary, but not sufficient.

Well, at the time design patterns were well in the future.  All they
did was describe and name bits of the ERT lore, which isn't a bad
thing, but was hardly a revelation either.

>In the Java world, Doug Lea transliterated useful design patterns
>found in the real time community into Java classes. They were
>eventually included in the standard Java libraries.
>
>Most of my architectures seem to be what is sometimes called
>the half-async-half-sync pattern:
>  - create event (from a task or hardware interrupt)
>  - put event in queue, and return pronto
>  - loop, sucking event from queue, processing it to completion,
>    often creating an event and yielding

An oldie but goodie.  We would timestamp event records at creation,
and then process them while respecting the responsiveness needed for
whatever kind of event it was, in order if that was needed.

>A variation on that is a telecom system where there are
>many calls each with their own distinct event flow, where
>each call can have multiple outstanding events from different
>sources that must be processed in the order of reception.
>
>In that case
>  - for each remote event source, a task sucks on the incoming
>    events
>  - each event queue for the relevant call, and the relevant
>    call is queued in a global "work to be done" queue
>  - set of worker threads (~1 per core) take then next call in
>    the global queue, take the first event in that call's queue,
>    and process it to completion

This sounds like what I described just above.

>> One also tested for priority inversion failures.  While speed is also
>> measured, this and the circular path are problems no matter the speed
>> of the problem or RTOS.
>
>I manage to structure my systems into three priority levels:
>   - hardware interrupt
>   - panic and commit seppuku
>   - everything else
>
>My brain is too feeble to cope with anything else.

The old hard-frame periodic RTOSes worked that way, but it doesn't
scale all that well.  But nobody had the panic and die option - there
was always some form of elasticity, sometimes called a "rubber clock",
because customers would not tolerate such fragility.

Present-day ERT systems have many priority levels, basically for
better overall responsiveness, taking into account that while they are
all ERT, some are more urgent than others.

A classic example is a weather radar, where a 3D volume scan takes
about five minutes to collect all the data.  The problem is that the
intensity and kind of weather passing through coverage varies
randomly, and there can be too much data to handle.  

The data reduction algorithms form a data-driven data-flow machine.
The objective is to not lose raw radar data even if the data
processing falls behind, especially in the heaviest of weather just
short of blowing the radar tower away.

So the highest priority is given to the interface to the radar
hardware, and the components of the data-flow machine have a gradient
of priorities, such that if one level falls behind, the earlier layer
will carry on, and the overall data reduction will eventually catch
up. But of course, in heavy enough weather, this becomes impossible. 

So, in parallel, the number of free memory blocks is tracked, and if
it falls below some threshold, the later processing steps are simply
omitted, so some outputs are not produced at all in extremis.  Given
that the raw data was collected, anything can be generated after the
storm passes, up to when the radar tower went over the moon.

Joe Gwinn

On 14/10/21 16:26, Joe Gwinn wrote:
> On Wed, 13 Oct 2021 00:57:53 +0100, Tom Gardner
> <spamjunk@blueyonder.co.uk> wrote:
> 
>> On 13/10/21 00:14, Joe Gwinn wrote:
>>>
>>>>>> That is all/partially/  hidden by the operation of the L1/L2/L3
>>>>>> caches in processors, but all the interacting C language
>>>>>> features (e.g. const, volatile, etc) have to be got right.
>>>>>> If incorrect, then subtle rare unreproduceable errors will
>>>>>> occur.
>>>>> Also true.  It's the programmers' job to understand all this.  Not
>>>>> that many understand the hardware that deeply, but enough do.
>>>> Most of them understand (and I use that word loosely) only
>>>> enough to allow them to copy-and-paste "solutions" from
>>>> stackexchange. Once it compiles and passes their inadequate
>>>> unit tests, it works - by definition.
>>>
>>> That is certainly true.  There is a wide dynamic range of programmer
>>> skill.  Back in the day, I was a rarity, being bilingual (hardware and
>>> software), and it allowed me to solve some pretty wild bugs fairly
>>> easily, because I had access to a level of  information not commonly
>>> available to pure software folks.
>>
>> I was too, everything from low noise analogue, conventional
>> digital, "micros", RT software, cellphone modelling and measuring,
>> and even some CRUD database stuff.
>>
>> When I'm feeling mischievous, usually in a pub, I'll tell
>> people that I don't know where the boundary between hardware
>> and software actually is.
>>
>> They are usually aghast at first. After mentioning microcode,
>> the way modern ISAs are decomposed into RISC-like micro-ops
>> inside the processor, FPGAs, emulation, etc, the reactions
>> are one of two kinds
>>   - slightly aggressive denial, usually accompanied by looks
>>     of bewilderment and incomprehension
>>   - amusement, and delight at the philosophical questions
>>
>> Guess which people I trust (technically) more!
> 
> Heh.  Did they buy you a beer?
> 
> 
>>> War story.  Something like ten years ago, the C++ tribe was unable to
>>> figure our why the radar software would go casters-up on startup. This
>>> was likely a million lines of code at least.  When it fell over, no
>>> error messages or other information was printed.   This problem
>>> endured for months.
>>
>> Java, and other modern languages, are usually much
>> better in that respect - aggressive use of exceptions
>> and full stack traces really help.
> 
> In theory, so does C++.  For all the good it did.
> 
> And C++ stack traces can be pretty hard to follow.  But in my above
> example, there was no stack trace to pore over.  That's where the
> kernel debugger came in.  The kernel know who is waiting on what, and
> where in the application code the request was made.

The only time this is likely to happen in a Java application
is if the JVM is broken. I have seen config statements to
the effect of "don't HotSpot optimise ByteArrays if using
JRE 1.4.16". No idea how they noticed and isolated that
as the bug!


>> Another was debugging the 68000 SBC and its RTOS, where we
>> had purchased both and both were buggy. Oh the (unproductive)
>> fun we had!
> 
> We had lots of problems with 68000 SBCs that didn't work as well, but
> eventually figured out who _not_ to buy from.  Which was most of the
> then vendors, who mostly vanished over time.  One assumes that word
> got around.

Interesting. From memory (1988!) we had three, and different
ones had a different 1/4 of their memory non-functional.

> As for RTOSes, we used MTOS, which did work.  

It /might/ have been MTOS (but 1988 etc). I captured bus
transactions to determine that when one RTOS call was made
with a parameter, that did not reappear when the relevant
task awoke. That was really "fun", given that instructions
seen on the bus were not necessarily executed - the only
way to tell was to execute the instructions on paper, and
discount irrelevant prefetches.

The RTOS vendor traced the problem to some assembly
code in the port to the specific SBC, and fixed it
speedily.



> Again, there were many
> RTOSes that didn't work.  In many cases, the problem was bugs, and
> also by design.  A classic design problem was if the inter-task
> messaging facility could not handle a circular path, where A-> B -> C
> -> A, and so on.
> 
> This meant that the RTOS could handle only synchronous activities,
> which was crippling in ERT, because the order of arrival of events is
> necessarily random, and all orders will happen, and a system built on
> a synchronous RTOS would immediately lock up.  This lead to a very
> simple but deadly RTOS benchmarking architecture.  This architecture
> also works on middleware.

It is why "higher level" design patterns are so useful,
including in embedded systems. The traditional mutex/semaphore
is necessary, but not sufficient.

In the Java world Doug Lea transliterated useful design patterns
found in the real time community into Java classes. They were
eventually included in the standard Java libraries.

Most of my architectures seem to be what is sometimes called
the half-async-half-sync pattern:
  - create event (from a task or hardware interrupt)
  - put event in queue, and return pronto
  - loop, sucking event from queue, processing it to completion,
    often creating an event and yielding

A variation on that is a telecom system where there are
many calls each with their own distinct event flow, where
each call can have multiple outstanding events from different
sources that must be processed in the order of reception.

In that case
  - for each remote event source, a task sucks on the incoming
    events
  - each event queue for the relevant call, and the relevant
    call is queued in a global "work to be done" queue
  - set of worker threads (~1 per core) take then next call in
    the global queue, take the first event in that call's queue,
    and process it to completion


> One also tested for priority inversion failures.  While speed is also
> measured, this and the circular path are problems no matter the speed
> of the problem or RTOS.

I manage to structure my systems into three priority levels:
   - hardware interrupt
   - panic and commit seppuku
   - everything else

My brain is too feeble to cope with anything else.