
Binary encoding with base-2048 in Python with Rust

ionite34/base2048

Base 2048

When Base 64 is not enough

Allows up to 11 bits of data per Unicode character, as counted by social media and chat platforms such as Twitter and Discord.

Uses a limited charset within the Basic Multilingual Plane.

Based on the Rust crate rust-base2048, and uses a compatible encoding table.

- Charset displayable on most locales and platforms

- No control sequences, punctuation, quotes, or RTL characters

Getting Started

pip install base2048

import base2048

base2048.encode(b'Hello!')
# => 'ϓțƘ໐µ'

base2048.decode('ϓțƘ໐µ')
# => b'Hello!'

Up to 2x fewer counted characters than Base 64

import zlib
import base64

import base2048

string = ('🐍 🦀' * 1000 + '🐕' * 1000).encode()
data = zlib.compress(string)

b64_data = base64.b64encode(data)
# => b'eJztxrEJACAQBLBVHNUFBBvr75zvRvgxBEkRSGqvkbozIiIiIiIiIiIiIiIiIiIiIiJf5wAAAABvNbM+EOk='
len(b64_data)
# => 84

b2048_data = base2048.encode(data)
# => 'ը྿Ԧҩ২ŀΏਬйཬΙāಽႩԷ࿋ႬॴŒǔ०яχσǑňॷβǑňॷβǑňॷβǯၰØØÀձӿօĴ༎'
len(b2048_data)
# => 46

unpacked = zlib.decompress(base2048.decode(b2048_data)).decode()
len(unpacked)
# => 4000
unpacked[2000:2002]
# => '🦀🐍'
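
The counts in this example are consistent with simple bit arithmetic: Base 64 carries 6 bits per character (plus padding), while Base 2048 carries up to 11. The sketch below uses only the standard library and two hypothetical helper names, `base64_length` and `estimated_base2048_length`. The second is an estimate that assumes the output is the bit count divided by 11, rounded up; it is not a formula documented by the library, and the real encoder may differ for some inputs.

```python
import base64


def base64_length(n_bytes: int) -> int:
    """Length of padded Base 64 output: 4 characters per 3-byte group."""
    return 4 * ((n_bytes + 2) // 3)


def estimated_base2048_length(n_bytes: int) -> int:
    """Estimated Base 2048 length, assuming 11 bits per character, rounded up.

    This is an assumption for illustration, not the library's stated behavior.
    """
    return (n_bytes * 8 + 10) // 11


# b'Hello!' is 6 bytes. The compressed payload is taken to be 62 bytes,
# which is what 84 Base 64 characters with one '=' of padding imply.
for n_bytes in (6, 62):
    # Cross-check the Base 64 formula against the standard library.
    assert base64_length(n_bytes) == len(base64.b64encode(bytes(n_bytes)))
    print(n_bytes, base64_length(n_bytes), estimated_base2048_length(n_bytes))
```

Under that assumption, the estimate gives 5 characters for 6 bytes and 46 characters for 62 bytes, matching the lengths shown in the examples. The asymptotic ratio is 11/6, or roughly 1.8x.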

Decode errors report the character position of the failure

----> base2048.decode('༗ǥԢΝĒϧǰ༎ǥ')

DecodeError: Unexpected character 8: ['ǥ'] after termination sequence 7: ['༎']

To catch the error, use either base2048.DecodeError or its base exception, ValueError.

import base2048

try:
    base2048.decode('🤔')
except base2048.DecodeError as e:
    print(e)

License

The code in this project is released under the MIT License.

Related and prior works

JavaScript - base2048

Rust - rust-base2048