sci.electronics.design | Trion bitstream compression test

Suppose we read in an FPGA bitstream file as 16-bit words, and want to
output a compressed file, to save room in a small flash chip.

If an input word is 0000, output a single 0 bit to the compressed
file.

If a word is nonzero, output a 1 followed by that 16-bit word.

The decompressor will be very simple.

I wrote a little PowerBasic program to test the compression, the input
being an actual Efinix T20 bit stream file.

https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1

Reply by Ricky ●December 11, 20222022-12-11

On Saturday, December 10, 2022 at 11:13:39 PM UTC-4, John Larkin wrote:
> Suppose we read in an FPGA bitstream file as 16-bit words, and want to 
> output a compressed file, to save room in a small flash chip. 
> 
> If an input word is 0000, output a single 0 bit to the compressed 
> file. 
> 
> If a word is nonzero, output a 1 followed by that 16-bit word. 
> 
> The decompressor will be very simple. 
> 
> I wrote a little PowerBasic program to test the compression, the input 
> being an actual Efinix T20 bit stream file. 
> 
> https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1

I don't think it would take much to improve that significantly.  I would think the emphasis would be on keeping the decoder as simple as practical.  The above algorithm would require a lot of shifting, which is often done one bit at a time on MCUs, so a bit slow.  Better to split the encoded stream in two, the "encode" bits indicating a 16 bit zero string, vs. the 16 bit literal; and the stream of 16 bit literals.  The first word in the file would be the length in bits of the "encode" bits, which divided by 16, gives the offset to the start of literal words.  

The encoder would be run on the PC running the FPGA software, and the decoder is a simple algorithm reading one bit at a time from the "encode" bits and building the uncompressed stream one word at a time, which will likely be shipped to the FPGA as generated.  Shifting all the data by arbitrary amounts would not be required.  Lots of CPU cycles saved, and the configuration time is not impacted... as much.  

-- 

Rick C.

- Get 1,000 miles of free Supercharging
- Tesla referral code - https://ts.la/richard11209

Reply by Dimiter_Popoff ●December 11, 20222022-12-11

On 12/11/2022 5:13, John Larkin wrote:
> 
> Suppose we read in an FPGA bitstream file as 16-bit words, and want to
> output a compressed file, to save room in a small flash chip.
> 
> If an input word is 0000, output a single 0 bit to the compressed
> file.
> 
> If a word is nonzero, output a 1 followed by that 16-bit word.
> 
> The decompressor will be very simple.
> 
> I wrote a little PowerBasic program to test the compression, the input
> being an actual Efinix T20 bit stream file.
> 
> https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1
> 

If you get the results you need using it why not.
The more trivial RLE is not too resource hungry and easy to
implement, IIRC (have done it last over a decade ago, our VNC (RFB)
server does it all the time during framebuffer transfers).

Reply by Ricky ●December 11, 20222022-12-11

On Sunday, December 11, 2022 at 1:16:49 AM UTC-4, Dimiter Popoff wrote:
> On 12/11/2022 5:13, John Larkin wrote: 
> > 
> > Suppose we read in an FPGA bitstream file as 16-bit words, and want to 
> > output a compressed file, to save room in a small flash chip. 
> > 
> > If an input word is 0000, output a single 0 bit to the compressed 
> > file. 
> > 
> > If a word is nonzero, output a 1 followed by that 16-bit word. 
> > 
> > The decompressor will be very simple. 
> > 
> > I wrote a little PowerBasic program to test the compression, the input 
> > being an actual Efinix T20 bit stream file. 
> > 
> > https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1 
> >
> If you get the results you need using it why not. 
> The more trivial RLE is not too resource hungry and easy to 
> implement, IIRC (have done it last over a decade ago, our VNC (RFB) 
> server does it all the time during framebuffer transfers).

I don't think we've seen an indication that the test case is a design that uses a large percentage of the FPGA logic.  The compression ratio of a sparse design is not important.  What will need to be considered, is the limit of compression for a worst case.  That will determine the required amount of storage space to be set aside if the design is to be updated, especially in the field.  

The test case compressed by 60%.  It is entirely possible, that with a less sparse design, the bit stream will have very few words with only zeros.  For compression to be viable in a limited size of compressed storage, the compression needs to be effective for most designs.  There will be a significant fraction of zero bits, even in a design that uses all the logic, but they may well not form many zero words.  It could easily turn out that RLE compression would work much better than such a simplistic encoding as simply replacing zero words. 

Trust, but verify.  (test) 

-- 

Rick C.

+ Get 1,000 miles of free Supercharging
+ Tesla referral code - https://ts.la/richard11209

Reply by Martin Brown ●December 11, 20222022-12-11

On 11/12/2022 03:13, John Larkin wrote:
> 
> Suppose we read in an FPGA bitstream file as 16-bit words, and want to
> output a compressed file, to save room in a small flash chip.
> 
> If an input word is 0000, output a single 0 bit to the compressed
> file.
> 
> If a word is nonzero, output a 1 followed by that 16-bit word.
> 
> The decompressor will be very simple.
> 
> I wrote a little PowerBasic program to test the compression, the input
> being an actual Efinix T20 bit stream file.
> 
> https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1

Well you *could* do that but if the sample file you posted a while back 
is any guide to their contents you might want to investigate the run 
length N of consecutive zero bits that gives maximum compression.

My instinct is that it may be somewhere near 120000.
Since there are entire blocks of 15k zero bytes in the test file.

0 bit counts with N = 5,6,7, 8, 16, 32, 64 ... might also be worth 
investigating and stop when it no longer yields additional compression.

Your sample file was 70% 00000000 and 20% with a single bit set.
Its natural word length appears to be the 8 bit byte.

My preference would be to encode it as 8 bit words and use the symbols 
which never occur or occur just once in the entire file to represent the 
following (not necessarily in this order).

15k run of zeroes
00
000
0000
00000000

and <token>,<token> for the token itself

-- 
Regards,
Martin Brown

Reply by John Walliker ●December 11, 20222022-12-11

On Sunday, 11 December 2022 at 04:51:25 UTC, Ricky wrote:
> On Saturday, December 10, 2022 at 11:13:39 PM UTC-4, John Larkin wrote: 
> > Suppose we read in an FPGA bitstream file as 16-bit words, and want to 
> > output a compressed file, to save room in a small flash chip. 
> > 
> > If an input word is 0000, output a single 0 bit to the compressed 
> > file. 
> > 
> > If a word is nonzero, output a 1 followed by that 16-bit word. 
> > 
> > The decompressor will be very simple. 
> > 
> > I wrote a little PowerBasic program to test the compression, the input 
> > being an actual Efinix T20 bit stream file. 
> > 
> > https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1
> I don't think it would take much to improve that significantly. I would think the emphasis would be on keeping the decoder as simple as practical. The above algorithm would require a lot of shifting, which is often done one bit at a time on MCUs, so a bit slow. Better to split the encoded stream in two, the "encode" bits indicating a 16 bit zero string, vs. the 16 bit literal; and the stream of 16 bit literals. The first word in the file would be the length in bits of the "encode" bits, which divided by 16, gives the offset to the start of literal words. 

However, in this case the MCU is an RP2040 which is a dual-core ARM M0+ which has
a 32-bit barrel shifter.  There are plenty of resources for more complex decompression.

> 
> The encoder would be run on the PC running the FPGA software, and the decoder is a simple algorithm reading one bit at a time from the "encode" bits and building the uncompressed stream one word at a time, which will likely be shipped to the FPGA as generated. Shifting all the data by arbitrary amounts would not be required. Lots of CPU cycles saved, and the configuration time is not impacted... as much. 
> 
> -- 
> 
> Rick C. 
> 
> - Get 1,000 miles of free Supercharging 
> - Tesla referral code - https://ts.la/richard11209

Reply by John Larkin ●December 11, 20222022-12-11

On Sun, 11 Dec 2022 03:57:44 -0800 (PST), John Walliker
<jrwalliker@gmail.com> wrote:

>On Sunday, 11 December 2022 at 04:51:25 UTC, Ricky wrote:
>> On Saturday, December 10, 2022 at 11:13:39 PM UTC-4, John Larkin wrote: 
>> > Suppose we read in an FPGA bitstream file as 16-bit words, and want to 
>> > output a compressed file, to save room in a small flash chip. 
>> > 
>> > If an input word is 0000, output a single 0 bit to the compressed 
>> > file. 
>> > 
>> > If a word is nonzero, output a 1 followed by that 16-bit word. 
>> > 
>> > The decompressor will be very simple. 
>> > 
>> > I wrote a little PowerBasic program to test the compression, the input 
>> > being an actual Efinix T20 bit stream file. 
>> > 
>> > https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1
>> I don't think it would take much to improve that significantly. I would think the emphasis would be on keeping the decoder as simple as practical. The above algorithm would require a lot of shifting, which is often done one bit at a time on MCUs, so a bit slow. Better to split the encoded stream in two, the "encode" bits indicating a 16 bit zero string, vs. the 16 bit literal; and the stream of 16 bit literals. The first word in the file would be the length in bits of the "encode" bits, which divided by 16, gives the offset to the start of literal words. 
>
>However, in this case the MCU is an RP2040 which is a dual-core ARM M0+ which has
>a 32-bit barrel shifter.  There are plenty of resources for more complex decompression.

The code to get the next bit from a file is not complex.

Reply by Fred Bloggs ●December 11, 20222022-12-11

On Saturday, December 10, 2022 at 10:13:39 PM UTC-5, John Larkin wrote:
> Suppose we read in an FPGA bitstream file as 16-bit words, and want to 
> output a compressed file, to save room in a small flash chip. 
> 
> If an input word is 0000, output a single 0 bit to the compressed 
> file. 
> 
> If a word is nonzero, output a 1 followed by that 16-bit word. 
> 
> The decompressor will be very simple. 
> 
> I wrote a little PowerBasic program to test the compression, the input 
> being an actual Efinix T20 bit stream file. 
> 
> https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1

40% doesn't seem worth it since worse case will take up your entire memory.

Reply by John Larkin ●December 11, 20222022-12-11

On Sun, 11 Dec 2022 11:05:57 +0000, Martin Brown
<'''newspam'''@nonad.co.uk> wrote:

>On 11/12/2022 03:13, John Larkin wrote:
>> 
>> Suppose we read in an FPGA bitstream file as 16-bit words, and want to
>> output a compressed file, to save room in a small flash chip.
>> 
>> If an input word is 0000, output a single 0 bit to the compressed
>> file.
>> 
>> If a word is nonzero, output a 1 followed by that 16-bit word.
>> 
>> The decompressor will be very simple.
>> 
>> I wrote a little PowerBasic program to test the compression, the input
>> being an actual Efinix T20 bit stream file.
>> 
>> https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1
>
>Well you *could* do that but if the sample file you posted a while back 
>is any guide to their contents you might want to investigate the run 
>length N of consecutive zero bits that gives maximum compression.
>
>My instinct is that it may be somewhere near 120000.
>Since there are entire blocks of 15k zero bytes in the test file.
>
>0 bit counts with N = 5,6,7, 8, 16, 32, 64 ... might also be worth 
>investigating and stop when it no longer yields additional compression.
>
>Your sample file was 70% 00000000 and 20% with a single bit set.
>Its natural word length appears to be the 8 bit byte.
>
>My preference would be to encode it as 8 bit words and use the symbols 
>which never occur or occur just once in the entire file to represent the 
>following (not necessarily in this order).
>
>15k run of zeroes
>00
>000
>0000
>00000000
>
>and <token>,<token> for the token itself

I did consider the chunk size N at 32 bits.

If N=16, the compression gain for data=0 words is 16:1, and the
penalty on top of nonzero words is 6%. That's not bad for a simple
decoder.

0.4 file size ratio is pretty good. I like simple and done. I can
explain this to a smart intern in minutes.

I'll run my little Basic program on some more complex FPGA bit
streams, as soon as we make some. The current project has a serial
flash chip per FPGA so doesn't need compression, but a new Pi-based
product line might benefit from compression. I expect the FPGA designs
to be fairly simple or absurdly simple. We will use the T20 chips
because we could get them and bought a bucket full for $10 each.

Clocked slow and not working hard, no PLLs, the 1.25 volt core power
is milliamps. A tiny linear regulator from +5 will work fine.

My guys say that compared to the big-guy FPGAs, the efinix design
suite looks like garage-level engineering. That's a feature, not a
defect. A lot of the suite is in plain-sight Python and there's no
FlexLM horrors to fight.

Reply by Dimiter_Popoff ●December 11, 20222022-12-11

On 12/11/2022 17:41, John Larkin wrote:
> On Sun, 11 Dec 2022 11:05:57 +0000, Martin Brown
> <'''newspam'''@nonad.co.uk> wrote:
> 
>> On 11/12/2022 03:13, John Larkin wrote:
>>>
>>> Suppose we read in an FPGA bitstream file as 16-bit words, and want to
>>> output a compressed file, to save room in a small flash chip.
>>>
>>> If an input word is 0000, output a single 0 bit to the compressed
>>> file.
>>>
>>> If a word is nonzero, output a 1 followed by that 16-bit word.
>>>
>>> The decompressor will be very simple.
>>>
>>> I wrote a little PowerBasic program to test the compression, the input
>>> being an actual Efinix T20 bit stream file.
>>>
>>> https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1
>>
>> Well you *could* do that but if the sample file you posted a while back
>> is any guide to their contents you might want to investigate the run
>> length N of consecutive zero bits that gives maximum compression.
>>
>> My instinct is that it may be somewhere near 120000.
>> Since there are entire blocks of 15k zero bytes in the test file.
>>
>> 0 bit counts with N = 5,6,7, 8, 16, 32, 64 ... might also be worth
>> investigating and stop when it no longer yields additional compression.
>>
>> Your sample file was 70% 00000000 and 20% with a single bit set.
>> Its natural word length appears to be the 8 bit byte.
>>
>> My preference would be to encode it as 8 bit words and use the symbols
>> which never occur or occur just once in the entire file to represent the
>> following (not necessarily in this order).
>>
>> 15k run of zeroes
>> 00
>> 000
>> 0000
>> 00000000
>>
>> and <token>,<token> for the token itself
> 
> I did consider the chunk size N at 32 bits.
> 
> If N=16, the compression gain for data=0 words is 16:1, and the
> penalty on top of nonzero words is 6%. That's not bad for a simple
> decoder.
> 
> 0.4 file size ratio is pretty good. I like simple and done. I can
> explain this to a smart intern in minutes.
> 
> I'll run my little Basic program on some more complex FPGA bit
> streams, as soon as we make some. The current project has a serial
> flash chip per FPGA so doesn't need compression, but a new Pi-based
> product line might benefit from compression. I expect the FPGA designs
> to be fairly simple or absurdly simple. We will use the T20 chips
> because we could get them and bought a bucket full for $10 each.
> 
> Clocked slow and not working hard, no PLLs, the 1.25 volt core power
> is milliamps. A tiny linear regulator from +5 will work fine.
> 
> My guys say that compared to the big-guy FPGAs, the efinix design
> suite looks like garage-level engineering. That's a feature, not a
> defect. A lot of the suite is in plain-sight Python and there's no
> FlexLM horrors to fight.
> 

I think you worry a bit too much about decompression complexity.
Decompression is much simpler/less computationally intensive than
compression, even if you do something like LZW. Probably you won't
need anything like that of course but if you do it won't be a killer.

Previous12 Next

Trion bitstream compression test

Sign in

You might also like...

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Quick Links

About Electronics-Related.com

Social Networks

The Related Media Group