
Trion bitstream compression test

Started by John Larkin December 10, 2022
On Sun, 11 Dec 2022 07:26:16 -0800 (PST), Fred Bloggs
<bloggs.fredbloggs.fred@gmail.com> wrote:

>On Saturday, December 10, 2022 at 10:13:39 PM UTC-5, John Larkin wrote:
>> Suppose we read in an FPGA bitstream file as 16-bit words, and want to
>> output a compressed file, to save room in a small flash chip.
>>
>> If an input word is 0000, output a single 0 bit to the compressed
>> file.
>>
>> If a word is nonzero, output a 1 followed by that 16-bit word.
>>
>> The decompressor will be very simple.
>>
>> I wrote a little PowerBasic program to test the compression, the input
>> being an actual Efinix T20 bit stream file.
>>
>> https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1
>
>40% doesn't seem worth it since the worst case will take up your entire memory.
A Pi Pico has 2 Mbytes of flash. A non-compressed T20 bit stream (5.4 Mbit for ours) takes about a third of that. Compression to 0.4 of the original gets that down to about 1/8. That sounds worth doing if it's easy, doesn't become a big project, and doesn't need a big decoder.
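For concreteness, here is a minimal C sketch of the decompressor that scheme implies. The MSB-first bit order, the emit_word() sink, and passing the output word count separately are all assumptions; the posts don't specify a container format.

/* Decoder for the scheme above: one flag bit per 16-bit output word,
   0 = zero word, 1 = a literal 16-bit word follows. */
#include <stdint.h>
#include <stddef.h>

static const uint8_t *src;   /* compressed data */
static size_t bitpos;        /* current bit position into src */

static unsigned get_bit(void)
{
    unsigned b = (src[bitpos >> 3] >> (7 - (bitpos & 7))) & 1u;
    bitpos++;
    return b;
}

static uint16_t get_word(void)   /* next 16 bits, MSB first */
{
    uint16_t w = 0;
    for (int i = 0; i < 16; i++)
        w = (uint16_t)((w << 1) | get_bit());
    return w;
}

/* Hypothetical sink: ship one decompressed word to the FPGA. */
extern void emit_word(uint16_t w);

void decompress(const uint8_t *data, size_t nwords)
{
    src = data;
    bitpos = 0;
    for (size_t i = 0; i < nwords; i++)
        emit_word(get_bit() ? get_word() : 0x0000);
}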
On Sunday, December 11, 2022 at 7:57:49 AM UTC-4, John Walliker wrote:
> On Sunday, 11 December 2022 at 04:51:25 UTC, Ricky wrote:
> > On Saturday, December 10, 2022 at 11:13:39 PM UTC-4, John Larkin wrote:
> > > Suppose we read in an FPGA bitstream file as 16-bit words, and want to
> > > output a compressed file, to save room in a small flash chip.
> > >
> > > If an input word is 0000, output a single 0 bit to the compressed
> > > file.
> > >
> > > If a word is nonzero, output a 1 followed by that 16-bit word.
> >
> > I don't think it would take much to improve that significantly. I would
> > think the emphasis would be on keeping the decoder as simple as
> > practical. The above algorithm would require a lot of shifting, which is
> > often done one bit at a time on MCUs, so a bit slow. Better to split the
> > encoded stream in two: the "encode" bits indicating a 16-bit zero word
> > vs. a 16-bit literal, and the stream of 16-bit literals. The first word
> > in the file would be the length in bits of the "encode" bits, which,
> > divided by 16, gives the offset to the start of the literal words.
>
> However, in this case the MCU is an RP2040, which is a dual-core ARM M0+
> with a 32-bit barrel shifter. There are plenty of resources for more
> complex decompression.
>
> > The encoder would be run on the PC running the FPGA software, and the
> > decoder is a simple algorithm reading one bit at a time from the
> > "encode" bits and building the uncompressed stream one word at a time,
> > which will likely be shipped to the FPGA as generated. Shifting all the
> > data by arbitrary amounts would not be required. Lots of CPU cycles
> > saved, and the configuration time is not impacted... as much.
There's no point in discussing what is and what is not a good algorithm
until an appropriate test file is obtained. Long runs of zeros are going
to become much less common in a densely used device. Playing with a
trivial test file is of no real value.

If the set of potential designs is small enough, they can simply be
identified by a single integer. That's real compression. LOL

--
Rick C.
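For reference, a minimal C sketch of the split-stream decoder quoted above. Ricky says "the first word" holds the flag-bit count, but 16 bits can't count the ~340k flag bits a 5.4 Mbit stream needs, so a 32-bit header split across two words is assumed here, along with MSB-first flag packing and a hypothetical emit_word() sink.

/* Split-stream layout (assumed): 32-bit flag-bit count, the flag bits
   packed into 16-bit words, then the literal 16-bit words. */
#include <stdint.h>
#include <stddef.h>

extern void emit_word(uint16_t w);   /* hypothetical sink to the FPGA */

void decompress_split(const uint16_t *file)
{
    /* 32-bit flag count stored as two 16-bit words (assumed layout) */
    uint32_t nbits = ((uint32_t)file[0] << 16) | file[1];
    const uint16_t *flags = &file[2];
    const uint16_t *lit   = &file[2 + (nbits + 15) / 16]; /* flag area, rounded up */

    for (uint32_t i = 0; i < nbits; i++) {
        unsigned f = (flags[i >> 4] >> (15 - (i & 15))) & 1u;
        emit_word(f ? *lit++ : 0x0000);   /* 1 = literal follows, 0 = zero word */
    }
}

As claimed, no arbitrary-distance shifting of the data is needed; the literals are already word-aligned.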
On 11/12/2022 16:59, Ricky wrote:
> On Sunday, December 11, 2022 at 7:57:49 AM UTC-4, John Walliker wrote:
>> On Sunday, 11 December 2022 at 04:51:25 UTC, Ricky wrote:
>>> On Saturday, December 10, 2022 at 11:13:39 PM UTC-4, John Larkin wrote:
>>>> Suppose we read in an FPGA bitstream file as 16-bit words, and
>>>> want to output a compressed file, to save room in a small flash
>>>> chip.
>>>>
>>>> If an input word is 0000, output a single 0 bit to the
>>>> compressed file.
>>>>
>>>> If a word is nonzero, output a 1 followed by that 16-bit word.
>>>
>>> I don't think it would take much to improve that significantly. [...]
>>> Better to split the encoded stream in two: the "encode" bits
>>> indicating a 16-bit zero word vs. a 16-bit literal, and the stream
>>> of 16-bit literals.
>>
>> However, in this case the MCU is an RP2040 which is a dual-core ARM
>> M0+ which has a 32-bit barrel shifter. There are plenty of
>> resources for more complex decompression.
>
> There's no point in discussing what is and what is not a good
> algorithm until an appropriate test file is obtained. Long runs of
> zeros are going to become much less common in a densely used device.
> Playing with a trivial test file is of no real value.
I think you can make a pretty good guess that there will be a lot of
zero bits in most of the FPGA designs that actually do something useful.

The distribution of non-zero bytes in his sample file was consistent
with the probability of any given bit being a 1 being very small, ~2%.
So the relative frequency of bytes containing n one-bits falls off
roughly as

  1 bit    x
  2 bits   x^2
  3 bits   x^3
  4 bits   x^4
  etc.

with x ~ 1/40. By the time you get to 7 bits, there were some possible
states that were not present at all.
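A quick sanity check of that claim, under the (strong) assumption of independent bits with p(1) = 0.02 and taking the 5.4 Mbit file size from upthread: the expected number of bytes with n one-bits collapses toward zero well before n = 7.

/* Binomial model: P(n one-bits in a byte) = C(8,n) * p^n * (1-p)^(8-n),
   and the expected count of such bytes in the file. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const double p = 0.02;               /* probability a bit is 1 */
    const double nbytes = 5.4e6 / 8.0;   /* 5.4 Mbit stream, in bytes */

    for (int n = 0; n <= 8; n++) {
        double c = tgamma(9.0) / (tgamma(n + 1.0) * tgamma(9.0 - n));
        double prob = c * pow(p, n) * pow(1.0 - p, 8 - n);
        printf("%d one-bits: P = %.3g, expected count = %.3g\n",
               n, prob, prob * nbytes);
    }
    return 0;
}

At n = 7 the expected count is on the order of 1e-5 bytes, consistent with some states never appearing at all.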
> If the set of potential designs is small enough, they can simply
> be identified by a single integer. That's real compression. LOL
Given the statistical dominance of zero bits, a simple byte-based rule

  0+x (0 <= x <= 127) means x+1 zero bits (1 to 128)
  1+y means actual value 1+y

should trounce the original suggestion hands down. For encoding, it
probably makes sense to treat it as literally a bit stream, since 1's
are so rare!

--
Regards,
Martin Brown
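A sketch of one reading of that byte-based rule in C. The post's "1+y means actual value 1+y" is ambiguous; treating y as 7 literal payload bits, and the emit_bits() sink, are assumptions.

/* Control bytes with the top bit clear encode a run of (x+1) zero bits;
   bytes with the top bit set carry 7 literal bits in y (assumed). */
#include <stdint.h>
#include <stddef.h>

extern void emit_bits(uint32_t bits, unsigned count);  /* hypothetical bit sink */

void decode_zero_rle(const uint8_t *in, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t b = in[i];
        if (b < 0x80)
            emit_bits(0, b + 1u);          /* 0+x: 1..128 zero bits */
        else
            emit_bits(b & 0x7Fu, 7);       /* 1+y: literal bits */
    }
}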
On Monday, December 12, 2022 at 5:06:49 AM UTC-4, Martin Brown wrote:
> On 11/12/2022 16:59, Ricky wrote:
> > There's no point in discussing what is and what is not a good
> > algorithm until an appropriate test file is obtained. Long runs of
> > zeros are going to become much less common in a densely used device.
> > Playing with a trivial test file is of no real value.
>
> I think you can make a pretty good guess that there will be a lot of
> zero bits in most of the FPGA designs that actually do something useful.
Larkin's program doesn't compress zero bits. It only compresses zero words. That's my point. Until you know more about the bitstream for a densely used design, you can't gauge the compression scheme.
> The distribution of non-zero bytes in his sample file was consistent
> with the probability of any given bit being a 1 being very small, ~2%.
> So the relative frequency of bytes containing n one-bits falls off
> roughly as
>
>   1 bit    x
>   2 bits   x^2
>   3 bits   x^3
>   4 bits   x^4
>   etc.
>
> with x ~ 1/40. By the time you get to 7 bits, there were some possible
> states that were not present at all.
None of this means anything if you are working with a sparsely filled FPGA and you want to allow for a densely used FPGA. Bits are not set randomly in a configuration file, so a statistical approach is unlikely to provide much insight.
> > If the set of potential designs is small enough, they can simply
> > be identified by a single integer. That's real compression. LOL
>
> Given the statistical dominance of zero bits, a simple byte-based rule
>
>   0+x (0 <= x <= 127) means x+1 zero bits (1 to 128)
>   1+y means actual value 1+y
>
> should trounce the original suggestion hands down. For encoding, it
> probably makes sense to treat it as literally a bit stream, since 1's
> are so rare!
It *is* a bit stream. Organizing it into bytes or words has no meaning
to the bitstream. Ones won't be so rare in a fully utilized part.

I'm not saying there won't be useful compression when the design is
close to full, but clearly, if the algorithm is finding only 60% of the
words compressible, that's going to get a *lot* worse as the design
fills up. It's likely the compression curve vs. part utilization will
be a gentle slope at low utilization and drop off rapidly at the higher
numbers. It will also vary between designs, even at the same
utilization. There's only one way to find out. Pretending to know the
properties of the distribution based on a single sample is not remotely
realistic.

--
Rick C.
On Sunday, December 11, 2022 at 04:13:39 UTC+1, John Larkin wrote:
> Suppose we read in an FPGA bitstream file as 16-bit words, and want to
> output a compressed file, to save room in a small flash chip.
>
> If an input word is 0000, output a single 0 bit to the compressed
> file.
>
> If a word is nonzero, output a 1 followed by that 16-bit word.
>
> The decompressor will be very simple.
>
> I wrote a little PowerBasic program to test the compression, the input
> being an actual Efinix T20 bit stream file.
>
> https://www.dropbox.com/s/dkgvdsz2wytvgwm/Trion_Compression.jpg?raw=1
this RLE (https://github.com/MichaelDipperstein/rle) is slightly better ..

checking Efinix.bin
uncompressed size: 5429200
rle size:          2211264
vpackbits size:    2183728
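For context, a minimal decoder for classic Apple PackBits, the family that vpackbits variant belongs to. The linked library's exact on-disk format may differ, so this is illustrative rather than a drop-in decoder for its output.

/* PackBits: a signed control byte n; 0..127 means copy the next n+1
   literal bytes, -1..-127 means repeat the next byte 1-n times,
   -128 is a no-op. */
#include <stdint.h>
#include <stddef.h>

size_t packbits_decode(const uint8_t *in, size_t inlen,
                       uint8_t *out, size_t outcap)
{
    size_t i = 0, o = 0;
    while (i < inlen) {
        int8_t n = (int8_t)in[i++];
        if (n >= 0) {                          /* n+1 literal bytes follow */
            for (int k = 0; k <= n && i < inlen && o < outcap; k++)
                out[o++] = in[i++];
        } else if (n != -128 && i < inlen) {   /* repeat next byte 1-n times */
            uint8_t v = in[i++];
            for (int k = 0; k < 1 - n && o < outcap; k++)
                out[o++] = v;
        }                                      /* n == -128 is a no-op */
    }
    return o;                                  /* bytes written */
}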
On Mon, 12 Dec 2022 13:01:05 -0800 (PST), Lasse Langwadt Christensen
<langwadt@fonz.dk> wrote:

>On Sunday, December 11, 2022 at 04:13:39 UTC+1, John Larkin wrote:
>> Suppose we read in an FPGA bitstream file as 16-bit words, and want to
>> output a compressed file, to save room in a small flash chip.
>> [...]
>
>this RLE (https://github.com/MichaelDipperstein/rle) is slightly better
>..
>checking Efinix.bin
>uncompressed size: 5429200
>rle size: 2211264
>vpackbits size: 2183728
There is a theoretical limit, useful to see what is possible and how
much is being left on the table:

.<https://en.wikipedia.org/wiki/Entropy_%28information_theory%29#Data_compression>

If one dips below the minimum number of bits, lossless compression is
impossible, because multiple data patterns will map onto a single
compressed-data sequence (that is, collide), and so all those data
sequences cannot be uniquely recovered in decompression.

.<https://en.wikipedia.org/wiki/Pigeonhole_principle>

Joe Gwinn
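A rough calculation of where that limit sits for this file: taking Martin Brown's upthread estimate of p(1) ~ 2% and assuming independent bits, the binary entropy gives a floor of about 14% of the original size. Real bitstreams are correlated, so this is only a bound under that model.

/* Shannon bound under an independent-bit model. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double p = 0.02;                                   /* P(bit == 1) */
    double h = -p * log2(p) - (1 - p) * log2(1 - p);   /* bits per bit */
    printf("H = %.4f bits/bit -> ~%.0f%% of original size at best\n",
           h, 100.0 * h);                              /* ~0.14, i.e. ~14% */
    return 0;
}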
On Monday, December 12, 2022 at 2:02:33 PM UTC-8, Joe Gwinn wrote:
> On Mon, 12 Dec 2022 13:01:05 -0800 (PST), Lasse Langwadt Christensen
> <lang...@fonz.dk> wrote:
>
> >On Sunday, December 11, 2022 at 04:13:39 UTC+1, John Larkin wrote:
> >> Suppose we read in an FPGA bitstream file as 16-bit words, and want to
> >> output a compressed file, to save room in a small flash chip.
> >>
> >> If an input word is 0000, output a single 0 bit to the compressed
> >> file.
> There is a theoretical limit, useful to see what is possible and how
> much is being left on the table.
>
> .<https://en.wikipedia.org/wiki/Entropy_%28information_theory%29#Data_compression>
>
> If one dips below the minimum number of bits, lossless compression is
> impossible, because multiple data patterns will map onto a single
> compressed-data sequence (that is, collide)...
And there has to be some framing; you cannot usefully consider the data
stream to be interrupted by the '0' in a compressed file unless you know
that the position of that '0' is a flag, not a bit in the stream.
Practically, the '1' or '0' signalling has to be stored in a position in
ROM just like the bits you send, but must NOT itself be a bit sent from
the store...

So, if there are long strings of 0s, maybe store 8-bit string lengths,
with 255 meaning send 256 bits from the table, 0 meaning send 256 zeros,
and 1 through 254 available for justification up to the next
string-of-zeros, or string-of-ones, or whatever.
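A sketch of one possible reading of that framing idea, with the length codes kept out-of-band from the bit store. Mapping codes 1 through 254 to short zero runs (the "justification" case) is a guess, and both sinks are hypothetical; the post leaves those details open.

/* Replay a stream framed by 8-bit length codes:
   0 -> 256 zero bits, 255 -> 256 literal bits from the table,
   1..254 -> that many zero bits (assumed). */
#include <stdint.h>
#include <stddef.h>

extern void send_zeros(unsigned nbits);                      /* hypothetical sinks */
extern void send_bits(const uint8_t *table, unsigned nbits);

void play_stream(const uint8_t *codes, size_t ncodes, const uint8_t *table)
{
    for (size_t i = 0; i < ncodes; i++) {
        uint8_t c = codes[i];
        if (c == 0) {
            send_zeros(256);               /* 0: 256 zeros */
        } else if (c == 255) {
            send_bits(table, 256);         /* 255: 256 bits from the table */
            table += 32;                   /* advance 256 bits = 32 bytes */
        } else {
            send_zeros(c);                 /* 1..254: short zero run (assumed) */
        }
    }
}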
On 12/12/2022 22:02, Joe Gwinn wrote:
> On Mon, 12 Dec 2022 13:01:05 -0800 (PST), Lasse Langwadt Christensen
> <langwadt@fonz.dk> wrote:
>
>> this RLE (https://github.com/MichaelDipperstein/rle) is slightly better
>> ..
>> checking Efinix.bin
>> uncompressed size: 5429200
>> rle size: 2211264
>> vpackbits size: 2183728
>
> There is a theoretical limit, useful to see what is possible and how
> much is being left on the table.
>
> .<https://en.wikipedia.org/wiki/Entropy_%28information_theory%29#Data_compression>
>
> If one dips below the minimum number of bits, lossless compression is
> impossible, because multiple data patterns will map onto a single
> compressed-data sequence (that is, collide), and so all those data
> sequences cannot be uniquely recovered in decompression.
Lossless compression can be impossible for some already-compressed
formats. The "compressed" file can be longer if a truly incompressible
file meets a less good compressor. Good archivers spot this and just
store a copy.

Bytewise entropy is a crude but effective way to classify files by
their approximate compressibility. You can classify file type this way:
if the frequency of each pattern x is N_x, then

  Sum_x [ N_x ] = N
  p_x = N_x / N
  Entropy = -Sum_x [ p_x log(p_x) ]

You can choose the byte length, although 8 and 16 are favourites. It
isn't helpful for byte length 1, since 0log0 = 1log1 = 0. However, the
proportion of 0's and 1's in the file gives you a very good clue how
you might compress it most easily.

It also doesn't spot spatial correlations, merely the frequency of each
symbol in the alphabet. So from a byte-entropy point of view these
sequences have equal entropy:

  ABCDABCDABCDABCD and ABDCADDBCCBACDAB

If there is a natural word length in the bitstream then computing this
byte-entropy test for each possible byte length 2, 3, 4, ... will find
it (or them - there may be more than one).

--
Regards,
Martin Brown
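A minimal C implementation of that bytewise-entropy test for byte length 8; extending to 16-bit "bytes" just means a 65536-entry histogram.

/* Histogram the byte values of a file, then compute -Sum p*log2(p). */
#include <stdio.h>
#include <math.h>

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: entropy <file>\n"); return 1; }

    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }

    static long hist[256];
    long total = 0;
    int c;
    while ((c = fgetc(f)) != EOF) { hist[c]++; total++; }
    fclose(f);

    double h = 0.0;
    for (int x = 0; x < 256; x++) {
        if (hist[x] == 0) continue;            /* 0*log(0) -> 0 */
        double p = (double)hist[x] / total;
        h -= p * log2(p);
    }
    printf("%.4f bits/byte (8.0 = incompressible by this measure)\n", h);
    return 0;
}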