ids_embed

An embedding model for Ideographic Description Sequences (IDS).

A blog post can be found here

Dependencies

antlr4: used to generate the ANTLR4 parser from the grammar in ids_embed/parse/ids.g4
pytorch. For other dependencies, please refer to requirements.txt

Steps

Make sure assets/kanjivg.eids exists.
Run scripts/prepare.sh, which uses ANTLR to generate the parsing code (it mainly calls antlr4 -Dlanguage=Python3 -visitor -o ./antlr ids.g4).
Train the embedding model to find similar words:

./main.py --runner ids_embedding_runner --config ids_embedding.yml

Note that training is optional. I have also uploaded the trained model file to the repo because it is small anyway (yes, a small model really is enough).
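To make "find similar words" concrete, here is a minimal sketch of a nearest-neighbour lookup over embedding vectors. It assumes cosine similarity as the metric, which this README does not confirm, and the names `cosine_similarity`, `most_similar`, and `toy_embeddings` are hypothetical and not part of the repo. The vectors are made-up toy values, not outputs of the trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, embeddings, top_k=3):
    """Return the top_k keys whose vectors are closest to the query vector."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in embeddings.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical toy vectors keyed by IDS; real embeddings would come from the model.
toy_embeddings = {
    "⿰耳口": [0.9, 0.1, 0.0],
    "⿰口耳": [0.8, 0.2, 0.1],
    "⿱日月": [0.0, 0.1, 0.9],
}

ranked = most_similar([1.0, 0.0, 0.0], toy_embeddings, top_k=2)
print(ranked)
```

With real embeddings, the query vector would be the model's embedding of the IDS under test rather than a hand-written list.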

Use

To use your own IDS, edit configs/ids_embedding.yml and replace the value on line 7, test_ids: "⿱⿰耳口之", with any IDS you like.
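Before putting a custom IDS into the config, it can help to check that it is well formed. The sketch below is a small recursive-descent parser written for illustration only. The repo itself parses IDS with the ANTLR4 grammar in ids.g4, and this code is not derived from that grammar. It assumes only the twelve basic description characters U+2FF0 to U+2FFB (two of them ternary, the rest binary) and treats every other character as a component. The name `parse_ids` is hypothetical.

```python
# Illustrative only: not the repo's ANTLR-generated parser.
# Assumption: only the twelve basic IDCs (U+2FF0..U+2FFB) are operators.
TERNARY_IDCS = {"\u2FF2", "\u2FF3"}  # ⿲ and ⿳ take three operands
BINARY_IDCS = {chr(c) for c in range(0x2FF0, 0x2FFC)} - TERNARY_IDCS


def _parse_at(ids, pos):
    """Parse one node starting at pos; return (node, next_pos)."""
    if pos >= len(ids):
        raise ValueError("IDS ended before all operators received their operands")
    ch = ids[pos]
    if ch in BINARY_IDCS:
        arity = 2
    elif ch in TERNARY_IDCS:
        arity = 3
    else:
        return ch, pos + 1  # a plain component character is a leaf
    pos += 1
    children = []
    for _ in range(arity):
        child, pos = _parse_at(ids, pos)
        children.append(child)
    return (ch, tuple(children)), pos


def parse_ids(ids):
    """Parse an IDS string into a nested (operator, children) tree."""
    node, pos = _parse_at(ids, 0)
    if pos != len(ids):
        raise ValueError(f"unexpected trailing characters at position {pos}")
    return node


tree = parse_ids("⿱⿰耳口之")
print(tree)  # ('⿱', (('⿰', ('耳', '口')), '之'))
```

A malformed sequence such as "⿰耳" (one operand missing) raises `ValueError` instead of returning a tree.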

LuxxxLucy/ids_embedding
