Suggestion: Support for streamed decompression? #52

Open
simondotm opened this issue Jan 2, 2021 · 5 comments

@simondotm

simondotm commented Jan 2, 2021

Hi there, thanks for this great project.

A while ago, I wrote a custom 8-bit oriented LZ4 compressor/decoder (but nowhere near as optimal as lzsa!), mainly to solve a particular problem of "streamed decompression", where we want to partially decompress data on the fly but without requiring full access to all of the previously decompressed data stream.

This is useful in 8-bit scenarios where, for example, we might be decompressing video or audio data to be consumed byte-by-byte through a small in-memory buffer, and it is neither practical nor desirable to decompress the whole thing in one go due to memory or latency constraints.

In my custom modification to LZ4 I achieved this by just limiting the window size for match offsets (similar to BLOCK_SIZE in lzsa, I suspect) and setting it to a user-provided command-line value (in my use case, anywhere from 256 bytes to 2048 bytes).
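As a rough illustration of the idea, here is a minimal Python sketch of a greedy LZ77-style tokenizer that only searches for matches within the last `window_size` bytes. It is not the LZ4 or lzsa format or algorithm; the `compress` function, its parameters, and the `("lit", ...)` / `("match", offset, length)` token tuples are all hypothetical names made up for this example.

```python
def compress(data: bytes, window_size: int, min_match: int = 3):
    """Greedy tokenizer whose match offsets never exceed window_size.

    Returns a list of hypothetical tokens:
      ("lit", bytes)            literal run
      ("match", offset, length) copy `length` bytes from `offset` bytes back
    """
    tokens = []
    literals = bytearray()
    pos = 0
    while pos < len(data):
        best_len, best_off = 0, 0
        # Only consider candidates inside the last window_size bytes.
        start = max(0, pos - window_size)
        for cand in range(start, pos):
            length = 0
            # Matches may overlap the current position (run-length style).
            while (pos + length < len(data)
                   and data[cand + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, pos - cand
        if best_len >= min_match:
            if literals:
                tokens.append(("lit", bytes(literals)))
                literals = bytearray()
            tokens.append(("match", best_off, best_len))
            pos += best_len
        else:
            literals.append(data[pos])
            pos += 1
    if literals:
        tokens.append(("lit", bytes(literals)))
    return tokens


print(compress(b"abcabcabc", window_size=4))
# [('lit', b'abc'), ('match', 3, 6)]
```

The brute-force search is quadratic and only meant to show where the window limit applies; a real compressor would use a hash chain or suffix structure and an optimal parser.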

In this way, we know the decoder will never need to persist more than WINDOW_SIZE previously decompressed bytes in memory, so all we need is a WINDOW_SIZE memory buffer on the decoder side, and some fairly trivial helper functions to supply decompressed bytes one at a time from the compressed data stream. (I just implemented a simple state machine in my 6502 decoder to keep track of ongoing literal and match runs, so that individual bytes can be fetched on demand.)
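The decoder side can be sketched in the same spirit. The Python generator below plays the role of the state machine: it remembers where it is inside a literal or match run between bytes, and the only history it keeps is a ring buffer of `window_size` bytes. The token layout and the `stream_decode` name are assumptions for illustration, not the lzsa bitstream or the 6502 decoder described above.

```python
def stream_decode(tokens, window_size: int):
    """Yield decompressed bytes one at a time.

    `tokens` is an iterable of hypothetical tokens:
      ("lit", bytes) or ("match", offset, length)
    Only `window_size` bytes of history are kept, in a ring buffer.
    """
    window = bytearray(window_size)  # the decoder's entire history
    head = 0                         # next write position in the ring
    for token in tokens:
        if token[0] == "lit":
            for byte in token[1]:
                window[head] = byte
                head = (head + 1) % window_size
                yield byte
        else:
            _, offset, length = token
            if not 1 <= offset <= window_size:
                raise ValueError("match offset exceeds the window")
            # Copy byte by byte so overlapping matches work.
            for _ in range(length):
                byte = window[(head - offset) % window_size]
                window[head] = byte
                head = (head + 1) % window_size
                yield byte


tokens = [("lit", b"abc"), ("match", 3, 6), ("lit", b"x"), ("match", 1, 3)]
print(bytes(stream_decode(tokens, window_size=4)))
# b'abcabcabcxxxx'
```

A consumer can pull bytes with `next()` as it needs them, which mirrors fetching one sample or pixel at a time. The sketch does not check that a match points at data that has actually been written yet; a robust decoder would.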

Naturally, setting a smaller window size for match offsets will degrade the compression ratio, but we can happily accept that trade-off in exchange for the streamed decompression capability. I still achieved pretty decent ratios even with a tiny 256-byte window.

In summary, do you think it would be feasible for lzsa to support specifying the maximum match offset (window size)?
Thanks!

@emmanuel-marty
Owner

Hey,

Sorry for the delay in replying. Here is a patch to implement an optional window size setting:
max_window.zip

Once you apply this and build the lzsa tool, you can use -w<max_value> to compress with a maximum offset value; e.g. -w256 would never use offsets larger than 256. (Entire parts of the decompressor may then go unused for that particular file.)

Let me know if that works for you, and I'll be happy to merge this optional feature.

Thanks!

@simondotm
Author

simondotm commented Jan 10, 2021

Hi Emmanuel,
That's fantastic, thanks. I'm eager to test it, but unfortunately I have no tooling to compile C/C++ code at the moment, as I develop in Python/Node on Windows most of the time. I'll see if I can find a way to build it, but it may be a while before I can get back to you.
Cheers

@emmanuel-marty
Owner

Oh, no worries, I will build a modified exe for you today and you can let me know if the feature is what you need. Thanks for speaking up :)

@emmanuel-marty
Owner

Here is a build with the -w option:
lzsa_win64_1.3.6_maxwindow.zip

You can use e.g. -w512 for a max offset of 512, or whatever value you need. All the other flags are as usual.

Let me know if that works for your needs, and I'll be happy to merge the changes. If not, let me know and we can work on it further.

Obviously, when you limit the offset like that, you can also consider commenting out the parts of the depacker that become unused, if you will only ever use it with offset-limited data.

Best regards,
Emmanuel

@simondotm
Author

Thanks so much Emmanuel. I will give this a try with some test data sometime this week and let you know how I get on.
