Next generation (needs funding and time) #311

Open
tony opened this issue Feb 4, 2024 · 0 comments

tony commented Feb 4, 2024

P.S. I am not in contact with anyone from UNIHAN. Is someone else already working on the same effort? Can this effort be shared in any way?

This project can do much more to unlock the breadth and depth of UNIHAN:

  • Sustainability (from an informatics standpoint)

  • Correctness

    Digging deeper into the database design, more needs to be done to ensure extraction and interrelation are provided in a structured and detailed way.

  • Documentation

  • Typings

  • Speed and performance

  • Potentially

    • Checking and code generation

      Perhaps https://www.unicode.org/reports/tr38/ can be crawled and used to verify correctness and, to an extent, in the future, to generate code.

    • Cross-language compatibility

    • Language-based speedups, e.g. Rust JSON / YAML / CSV parsing. Perhaps the whole core can be a Rust-based package with bindings for other languages.
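
As a rough illustration of the checking idea above, here is a minimal sketch that validates UNIHAN field values against per-field patterns. It assumes the tab-separated `codepoint<TAB>field<TAB>value` layout of the UNIHAN data files. The names `FIELD_PATTERNS`, `parse_lines` and `find_invalid` are hypothetical, and the two patterns are hand-written for illustration; a real checker would presumably derive them from the syntax descriptions in TR38 rather than hard-code them.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple

# Hypothetical, hand-written patterns for illustration only. A real checker
# would derive these from the field syntax published in TR38.
FIELD_PATTERNS = {
    "kTotalStrokes": re.compile(r"[1-9][0-9]{0,2}"),
    "kCantonese": re.compile(r"[a-z]{1,6}[1-6]"),
}


class Record(NamedTuple):
    codepoint: str
    field: str
    value: str


def parse_lines(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from UNIHAN-style lines, skipping blanks and comments."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        codepoint, field, value = line.split("\t", 2)
        yield Record(codepoint, field, value)


def find_invalid(records: Iterable[Record]) -> List[Record]:
    """Return records whose value does not match the pattern for its field."""
    invalid = []
    for record in records:
        pattern = FIELD_PATTERNS.get(record.field)
        if pattern is None:
            continue  # no known pattern for this field, so nothing to check
        # A value may hold several space-separated entries; check each one.
        parts = record.value.split(" ")
        if not all(pattern.fullmatch(part) for part in parts):
            invalid.append(record)
    return invalid


SAMPLE = [
    "# illustrative sample, not real UNIHAN output",
    "U+3400\tkCantonese\tjau1",
    "U+3400\tkTotalStrokes\t5",
    "U+3401\tkTotalStrokes\tfive",
]

bad_records = find_invalid(parse_lines(SAMPLE))
```

With the sample above, only the `kTotalStrokes` entry whose value is `five` would be flagged. Fields without a known pattern are skipped rather than rejected, so the check can be rolled out one field at a time.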

UNIHAN can be made even more accessible to the masses. I am the one who can make it happen, but it would take time and, above all, funding. This would need to be the full focus of my free time outside of work for months, or even longer.
