Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change gensym format #215

Merged
merged 1 commit into from
May 28, 2023
Merged

Change gensym format #215

merged 1 commit into from
May 28, 2023

Conversation

gilch
Copy link
Owner

@gilch gilch commented May 28, 2023

Resolves #210.

To avoid excessive prefix length, I've rolled __name__ and template count into the hash, which I've limited to 40 bits (5 bytes). I've been reading up on the birthday paradox. I think this is more than enough hash space for typical gensym usage, and pretty much don't expect collisions ever, but maybe there's some subtle argument for greater length. It would not be too difficult to bump this up to 6 or 8 bytes using the same format, although that would take more characters to encode. The format would not have changed, so it wouldn't require as much of a change in the docs, and I could keep the old examples.

I've encoded the hashes using base32, which is 5 bits per character, so they typically take 8 characters, plus the 5-character _Qz{}z_ overhead needed to mark them as gensyms. Base64 uses non-identifier characters and doesn't save all that many. (Base32 is only 2 characters shorter than hex.) If we turn out to need really long hashes I'd consider higher Unicode to avoid excessive printed gensym length, but this seems complicated. (UUID4 only has 122 bits of space, which base32 could encode in 26 characters. There's just no way we'd need to default to more than that.)

@codecov
Copy link

codecov bot commented May 28, 2023

Codecov Report

Merging #215 (4c205ab) into master (92b210b) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master      #215   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            6         6           
  Lines          702       710    +8     
  Branches       109       109           
=========================================
+ Hits           702       710    +8     
Impacted Files Coverage Δ
src/hissp/reader.py 100.00% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@gilch gilch merged commit 4874223 into master May 28, 2023
@gilch gilch deleted the gensym-upgrade branch May 28, 2023 23:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gensysms aren't good enough
1 participant