Robust configuration memory

Started by Piotr Wyderski February 18, 2018
Hello,

I need a moderate amount of non-volatile memory (for FPGA
configuration purposes and the like), but can't tolerate
configuration errors due to charge leaks, cosmic radiation
or just the malfunction of the chip. I thought that something
like RAID5 imposed on memory chips/SD cards would be fine.
It would be extremely simple in an FPGA, but it creates
a chicken and egg problem: how can you read the controller's
configuration from flash if the flash itself might be corrupt.

It could also be done easily in hardware, but would require
parallel-output chips, which are not particularly trendy nowadays
and the SPI decoding circuitry would be insanely complex with
simple gates.

I presume I am not the first person to have such a need, so
what would you recommend me? It has to be autonomous only
for (early) reads, the write mode part can materialize later
within the FPGA. Any ideas?

	Best regards, Piotr

On Sunday, February 18, 2018 at 2:20:55 AM UTC-8, Piotr Wyderski wrote:

> I need a moderate amount of non-volatile memory (for FPGA
> configuration purposes and the like), but can't tolerate
> configuration errors due to ...
Flash is cheap; either redundancy (three copies and a vote) or checksums (two copies with two checksums) will warn of an error, and you can mark a block 'bad' and use another. Reading isn't stressful on flash, it's only the erase/rewrite cycles that you gotta worry about, eventually. So, minimize those. Some implementations are available with internal ECC-checking, too. <http://www.onsemi.com/pub/Collateral/CAT24C256-D.PDF>
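The "two copies with two checksums" option can be sketched as follows. This is only an illustration under assumptions that are not in the post: each copy is stored as a `(payload, stored_crc)` pair, the checksum is CRC-32 (Python's `zlib.crc32`), and the helper name `pick_good_copy` is made up.

```python
import zlib

def pick_good_copy(copies):
    """Return the payload of the first copy whose stored CRC-32 matches.

    `copies` is a list of (payload, stored_crc) pairs; this layout is an
    assumption made for the sketch. Returns None if no copy checks out.
    """
    for payload, stored_crc in copies:
        if zlib.crc32(payload) == stored_crc:
            return payload
    return None

image = b"bitstream bytes"
good = (image, zlib.crc32(image))
bad = (b"bitstream bytez", zlib.crc32(image))  # one corrupted byte, original CRC

selected = pick_good_copy([bad, good])  # skips the bad copy, returns `image`
```

A copy that fails its checksum is the one to mark 'bad'; unlike a vote, this needs only two copies, but it needs something that can compute the checksum before the data is used.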
whit3rd wrote:

> Flash is cheap; either redundancy (three copies and a vote)
> or checksums (two copies with two checksums) will warn of an error,
> and you can mark a block 'bad' and use another.
You're right, a single voting engine attached to three copies of the DOUT SPI line will do the job without the need to analyze the protocol. Problem solved, dead simple, thanks!
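A software model of that idea, as a rough sketch: the same 2-of-3 function applied to every bit of three serial streams, with no knowledge of the SPI protocol. The function names and the byte-wise representation of the streams are assumptions for illustration only.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 vote: an output bit is 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)

def vote_streams(s0: bytes, s1: bytes, s2: bytes) -> bytes:
    """Vote three equally long data streams byte by byte."""
    return bytes(majority(x, y, z) for x, y, z in zip(s0, s1, s2))

good = bytes([0xA5, 0x3C, 0xFF])
# Each copy has a different bit flipped; no bit position is wrong twice.
copy0 = bytes([0xA4, 0x3C, 0xFF])
copy1 = bytes([0xA5, 0x3D, 0xFF])
copy2 = bytes([0xA5, 0x3C, 0x7F])

voted = vote_streams(copy0, copy1, copy2)  # equals `good`
```

The vote fails only when two copies are wrong at the same bit position.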
> Reading isn't stressful on flash, it's only the erase/rewrite cycles
> that you gotta worry about, eventually. So, minimize those.
There will not be many write cycles, but the storage is leaky, so it will turn into all ones eventually. With the voting mechanism I can detect it early and re-program the chips with the correct(ed) content.

The interesting discovery now is the apparent lack of majority gates and the circuit is sufficiently complex not to fit within a simple logic circuit in a single package. It seems the best thing to do is to use an 8:1 MUX (say, 74AC241) and etch the LUT on the PCB. 6ns delay doesn't sound bad. Am I missing an even simpler implementation?

	Best regards, Piotr
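The mux-as-LUT idea can be modelled in a few lines. This is a sketch, assuming the three flash outputs drive the select pins of a generic 8:1 mux and the eight data inputs are strapped to GND or VCC; the pin ordering and the `dissenter` helper (which copy lost the vote, i.e. which chip to re-program) are illustrative assumptions, not taken from any datasheet.

```python
# Data inputs D0..D7 of a generic 8:1 mux, strapped to GND (0) or VCC (1).
# The select index is formed from the three flash outputs: (a << 2) | (b << 1) | c.
MAJORITY_LUT = [0, 0, 0, 1, 0, 1, 1, 1]

def mux_vote(a: int, b: int, c: int) -> int:
    """Output of the mux: the majority of the three select bits."""
    return MAJORITY_LUT[(a << 2) | (b << 1) | c]

def dissenter(a: int, b: int, c: int):
    """Index (0, 1 or 2) of the copy that lost the vote, or None if all agree."""
    voted = mux_vote(a, b, c)
    for index, bit in enumerate((a, b, c)):
        if bit != voted:
            return index
    return None
```

For example, `dissenter(1, 0, 1)` returns `1`, flagging the second chip as the one that disagreed on that bit.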
On 18/02/2018 23:24, Piotr Wyderski wrote:
> whit3rd wrote:
>
>> Flash is cheap; either redundancy (three copies and a vote)
>> or checksums (two copies with two checksums) will warn of an error,
>> and you can mark a block 'bad' and use another.
>
> You're right, a single voting engine attached to three copies
> of the DOUT SPI line will do the job without the need to analyze
> the protocol. Problem solved, dead simple, thanks!
>
>> Reading isn't stressful on flash, it's only the erase/rewrite cycles
>> that you gotta worry about, eventually. So, minimize those.
>
> There will not be many write cycles, but the storage is leaky,
> so it will turn into all ones eventually. With the voting mechanism
> I can detect it early and re-program the chips with the correct(ed)
> content.
>
> The interesting discovery now is the apparent lack of majority gates
> and the circuit is sufficiently complex not to fit within a simple
> logic circuit in a single package. It seems the best thing to do is
> to use an 8:1 MUX (say, 74AC241) and etch the LUT on the PCB. 6ns
> delay doesn't sound bad. Am I missing an even simpler implementation?
>
>     Best regards, Piotr

Simpler but not better: Tie them all together and let them fight it out. If you are somewhat merciful you could give each output a series resistor.
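A back-of-the-envelope model of the "fight it out" approach, under strong assumptions that are not in the post: ideal rail-to-rail drivers, equal series resistors, no load current at the common node, and a receiver that switches at exactly half the supply. With those assumptions the node voltage is just the average of the driver outputs.

```python
def node_voltage(outputs, vcc=3.3):
    """Voltage at the common node for ideal drivers and equal series resistors."""
    return vcc * sum(outputs) / len(outputs)

def receiver_sees(outputs, vcc=3.3, threshold_fraction=0.5):
    """Logic level seen by an idealised receiver switching at threshold_fraction * vcc."""
    return 1 if node_voltage(outputs, vcc) > threshold_fraction * vcc else 0

two_high = node_voltage([1, 1, 0])  # 2/3 of VCC
one_high = node_voltage([1, 0, 0])  # 1/3 of VCC
```

The disagreeing cases land at one third and two thirds of the supply. Many CMOS inputs only guarantee a low below roughly 0.3 VCC and a high above roughly 0.7 VCC, so those levels leave little or no noise margin, which is one reason this is simpler but not better.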
>The interesting discovery now is the apparent lack of majority gates
>and the circuit is sufficiently complex not to fit within a simple
>logic circuit in a single package. It seems the best thing to do is
>to use an 8:1 MUX (say, 74AC241) and etch the LUT on the PCB. 6ns
>delay doesn't sound bad. Am I missing an even simpler implementation?

<dim-memory>
There are some "configurable 2-input" gates that can be used for that, I think, if you play around with their truth tables a bit. (They actually have 3 or 4 inputs.)
</dim-memory>

Cheers

Phil Hobbs
On Sun, 18 Feb 2018 11:20:45 +0100, Piotr Wyderski
<peter.pan@neverland.mil> wrote:

>Hello,
>
>I need a moderate amount of non-volatile memory (for FPGA
>configuration purposes and the like), but can't tolerate
>configuration errors due to charge leaks, cosmic radiation
>or just the malfunction of the chip. I thought that something
>like RAID5 imposed on memory chips/SD cards would be fine.
>It would be extremely simple in an FPGA, but it creates
>a chicken and egg problem: how can you read the controller's
>configuration from flash if the flash itself might be corrupt.
>
>It could also be done easily in hardware, but would require
>parallel-output chips, which are not particularly trendy nowadays
>and the SPI decoding circuitry would be insanely complex with
>simple gates.

It should be pretty simple with a CPLD. You could also compare on a bit-by-bit basis, which shouldn't be too difficult, even limited to unit logic.

Put the redundancy in the FPGA. If a bit of the configuration memory is in error, it's unlikely to be wrong in each of the redundant parts.

You could use the FPGA to verify the contents of the flash, after configuration but before (application) enable.

There are all sorts of possibilities but a lot depends on unspecified requirements.

>I presume I am not the first person to have such a need, so
>what would you recommend me? It has to be autonomous only
>for (early) reads, the write mode part can materialize later
>within the FPGA. Any ideas?

Not enough information.
krw@notreal.com wrote:

> It should be pretty simple with a CPLD.
Which, again, is nowadays based on flash. Vicious circle. I don't want its configuration evaporate either.
> You could also compare on a
> bit-by-bit basis, which shouldn't be too difficult, even limited to
> unit logic.
And this is the correct approach: bit-by-bit voting on a serial stream. A single majority gate of still unknown implementation will suffice.
> Put the redundancy in the FPGA. If a bit of the configuration memory
> is in error, it's unlikely to be wrong in each of the redundant parts.
What do you mean? An FPGA won't boot from a corrupt bitstream. Do you mean many FPGAs?
> You could use the FPGA to verify the contents of the flash, after
> configuration but before (application) enable.
But the flash can die just before/during configuration.

	Best regards, Piotr
On Sun, 18 Feb 2018 16:46:30 +0100, Piotr Wyderski
<peter.pan@neverland.mil> wrote:

>krw@notreal.com wrote:
>
>> It should be pretty simple with a CPLD.
>
>Which, again, is nowadays based on flash. Vicious circle.
>I don't want its configuration evaporate either.

OK, what error in the "checker" is going to give you an almost-good flash image for the target device? What I'm suggesting is an "equivalence" gate in the configuration stream.
>> You could also compare on a
>> bit-by-bit basis, which shouldn't be too difficult, even limited to
>> unit logic.
>
>And this is the correct approach: bit-by-bit voting on a serial stream.
>A single majority gate of still unknown implementation will suffice.

Not majority. Just equal, if all you care about is "valid"/"not valid" configuration. If you want TMR, then you need voting, which is a little more difficult. But in this case, you probably need TMR in the application, too.
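The difference between the two schemes can be illustrated with a small model of the "equivalence" check: two copies are XORed bit by bit and any difference sets a sticky flag. This is a sketch with an invented function name; it only says "valid" or "not valid" and cannot tell which copy is wrong, which is what the third copy and the vote add.

```python
def streams_equal(s0: bytes, s1: bytes) -> bool:
    """Compare two configuration streams; True only if every bit matches.

    Models an XOR gate on the two data lines feeding a sticky mismatch latch.
    """
    if len(s0) != len(s1):
        return False
    mismatch = 0
    for x, y in zip(s0, s1):
        mismatch |= x ^ y  # any differing bit stays latched in `mismatch`
    return mismatch == 0

reference = bytes([0xDE, 0xAD, 0xBE, 0xEF])
flipped = bytes([0xDE, 0xAD, 0xBF, 0xEF])  # one bit differs

valid = streams_equal(reference, reference)    # True
invalid = streams_equal(reference, flipped)    # False
```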
>> Put the redundancy in the FPGA. If a bit of the configuration memory
>> is in error, it's unlikely to be wrong in each of the redundant parts.
>
>What do you mean? An FPGA won't boot from a corrupt bitstream.
>Do you mean many FPGAs?

I thought this was your problem. If it doesn't boot, problem solved.
>> You could use the FPGA to verify the contents of the flash, after
>> configuration but before (application) enable.
>
>But the flash can die just before/during configuration.

Sorry, I thought this was the desired outcome.
krw@notreal.com wrote:

>> What do you mean? An FPGA won't boot from a corrupt bitstream.
>> Do you mean many FPGAs?
>
> I thought this was your problem.
OK, but no, the desired effect is to work correctly for as long as possible, even knowing about memory errors. Hence the RAID5 reference. I assume a configured FPGA works correctly, the problem is to provide it with a good bitstream, as I expect the flash chips to be least durable. Putting 3 instead of 1 doesn't increase complexity that much, but can greatly increase reliability.

> If it doesn't boot, problem solved.

In that case, don't most FPGAs already have all the needed checksum circuitry inside?

	Best regards, Piotr
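The "3 instead of 1" claim can be put into numbers with the standard 2-of-3 formula. This is a sketch under assumptions not stated in the thread: the three chips fail independently, each is good with the same probability `r`, and the voter itself never fails. The system then works if all three chips are good or exactly two are, giving R = r^3 + 3 r^2 (1 - r) = 3 r^2 - 2 r^3.

```python
def tmr_reliability(r: float) -> float:
    """Probability that at least 2 of 3 independent copies are good.

    r is the probability that a single copy is good; the voter is assumed perfect.
    """
    return 3 * r**2 - 2 * r**3

single = 0.99
triple = tmr_reliability(single)  # about 0.9997
```

With `r = 0.99` the chance of a bad readout drops from 1 in 100 to about 3 in 10,000. The gain only exists while `r` is above 0.5; below that, three copies are worse than one.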
On 18.2.18 17:36, krw@notreal.com wrote:
> On Sun, 18 Feb 2018 11:20:45 +0100, Piotr Wyderski
> <peter.pan@neverland.mil> wrote:
>
>> Hello,
>>
>> I need a moderate amount of non-volatile memory (for FPGA
>> configuration purposes and the like), but can't tolerate
>> configuration errors due to charge leaks, cosmic radiation
>> or just the malfunction of the chip. I thought that something
>> like RAID5 imposed on memory chips/SD cards would be fine.
>> It would be extremely simple in an FPGA, but it creates
>> a chicken and egg problem: how can you read the controller's
>> configuration from flash if the flash itself might be corrupt.
>>
>> It could also be done easily in hardware, but would require
>> parallel-output chips, which are not particularly trendy nowadays
>> and the SPI decoding circuitry would be insanely complex with
>> simple gates.
>
> It should be pretty simple with a CPLD. You could also compare on a
> bit-by-bit basis, which shouldn't be too difficult, even limited to
> unit logic.
>
> Put the redundancy in the FPGA. If a bit of the configuration memory
> is in error, it's unlikely to be wrong in each of the redundant parts.
>
> You could use the FPGA to verify the contents of the flash, after
> configuration but before (application) enable.
>
> There are all sorts of possibilities but a lot depends on unspecified
> requirements.
>
>> I presume I am not the first person to have such a need, so
>> what would you recommend me? It has to be autonomous only
>> for (early) reads, the write mode part can materialize later
>> within the FPGA. Any ideas?
>
> Not enough information.

Using an FPGA leads to a circular definition: the FPGA has some kind of semi-permanent memory for the configuration.

I see two possibilities:

- Create the voter from discrete gates,
- Use a processor to handle the redundancy, which moves the problem to the processor's code.

I once had a customer with an instrument with calibration data inside an EEPROM. It had the habit of losing the calibration occasionally, so in the next generation, a triple redundancy calibration store was made in the software. During years of use of hundreds of instruments, there was not a single case of redundancy correction. It was proven that the problems with the old generation were from flaky software.

-- 
-TV
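A software triple-redundant store of the kind described in this post might look roughly like the sketch below. It is not the instrument's actual code: the class name, the in-memory `bytearray` copies standing in for EEPROM regions, and the `corrections` counter are all assumptions. The counter is what would let someone say, years later, that the redundancy never had to correct anything.

```python
class TripleStore:
    """Keeps three copies of a value, votes on read, and repairs any outvoted copy."""

    def __init__(self, value: bytes):
        # Three independent copies; in a real device these would be separate
        # EEPROM regions (an assumption made for this sketch).
        self.copies = [bytearray(value) for _ in range(3)]
        self.corrections = 0

    def read(self) -> bytes:
        a, b, c = self.copies
        # Bitwise 2-of-3 vote across the three copies.
        voted = bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))
        for copy in self.copies:
            if bytes(copy) != voted:
                copy[:] = voted        # rewrite the copy that disagreed
                self.corrections += 1  # log that a repair was needed
        return voted

store = TripleStore(b"\x10\x20")
store.copies[1][0] ^= 0x01   # simulate a single-bit upset in one copy
value = store.read()         # returns the original bytes and repairs copy 1
```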