Amharic-Word-Embedding-Word2vec is a pre-trained distributed word representation (word embedding) that is free for Amharic NLP researchers to use.

gashawdemlew/Amharic-Word-Embedding-Word2vec

Amharic-Word-Embedding-Word2vec

Amharic-Word-Embedding-Word2vec provides pre-trained distributed word representations (word embeddings) that are free for Amharic NLP researchers to use. The repository contains code that lets users train their own embeddings on their own datasets and compute the similarity between words or phrases, plus two pre-trained models built with different dataset settings (one trained on stemmed data, one on unstemmed data). It also includes "wordsim100", a collection of Amharic word pairs with human-annotated relatedness scores, collected from potential users and used to evaluate the word embedding models.
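
The training and similarity workflow described above might look roughly like the following. This is a minimal sketch and not the repository's own script: it assumes gensim 4.x, a corpus with one whitespace-tokenized sentence per line, and illustrative hyperparameters. The function names and the file paths in the comments are hypothetical.

```python
def read_corpus(lines):
    """Turn an iterable of text lines into token lists, skipping blank lines."""
    return [line.split() for line in lines if line.strip()]


def train_skipgram(sentences, vector_size=100, window=5, min_count=5):
    """Train a skip-gram Word2Vec model (hyperparameters are illustrative)."""
    # Imported here so the helper above works without gensim installed.
    from gensim.models import Word2Vec  # pip install gensim (4.x API assumed)

    return Word2Vec(
        sentences=sentences,
        vector_size=vector_size,
        window=window,
        min_count=min_count,
        sg=1,  # sg=1 selects skip-gram; sg=0 would select CBOW
    )


def phrase_similarity(model, phrase_a, phrase_b):
    """Compare two phrases via gensim's n_similarity on their token lists."""
    return model.wv.n_similarity(phrase_a.split(), phrase_b.split())


# Example usage (paths and variable names are placeholders):
#
# with open("corpus.txt", encoding="utf-8") as f:
#     model = train_skipgram(read_corpus(f))
# model.wv.similarity(word_a, word_b)       # cosine similarity of two words
# model.wv.most_similar(word_a, topn=10)    # nearest neighbours of a word
# model.save("my_amharic_skipgram.model")   # reload with Word2Vec.load(...)
```

Stemming is not shown here. To reproduce the stemmed setting, the corpus would need to be stemmed before it is passed to `read_corpus`.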

Note:

  - To run the code, you need the gensim Python module.

  - You can also download the embeddings above from my drive. Here are the links:
  
   Amharic word embedding_skipgram with stemmed data: https://drive.google.com/file/d/1f-AAdiu_caxAfEL7Ll8dOkNOiEUjMVC5/view?usp=share_link
  
   Amharic word embedding_skipgram with unstemmed data: https://drive.google.com/file/d/1SFTeMQALxKH3rsER60rXmobgr2D2Msd1/view?usp=share_link
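
Once one of the files above is downloaded, it could be loaded and scored against wordsim100-style pairs along these lines. This is a sketch under stated assumptions, not the evaluation procedure the repository necessarily uses. It assumes the file was written with gensim's `save()`; if it is in word2vec text or binary format, `KeyedVectors.load_word2vec_format` would be the call to try instead. It also assumes the pairs are available as `(word1, word2, human_score)` tuples, and it uses Spearman rank correlation as the agreement measure. All function names are hypothetical.

```python
import math


def load_vectors(path):
    """Load word vectors from a model saved with gensim's save() (assumed format)."""
    from gensim.models import Word2Vec  # pip install gensim

    return Word2Vec.load(path).wv


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the two rank lists.

    Raises ZeroDivisionError if either list is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def evaluate(similarity, pairs):
    """Correlate model similarities with human scores, skipping unknown words.

    `similarity` is any callable (w1, w2) -> float that raises KeyError for
    out-of-vocabulary words, e.g. the `similarity` method of loaded vectors.
    Returns (spearman_rho, number_of_pairs_used).
    """
    human, model = [], []
    for w1, w2, score in pairs:
        try:
            model.append(float(similarity(w1, w2)))
        except KeyError:
            continue  # pair contains a word the model has never seen
        human.append(score)
    return spearman(human, model), len(human)


# Example usage (the path is a placeholder for the downloaded file):
#
# wv = load_vectors("downloaded_model_file")
# rho, used = evaluate(wv.similarity, pairs)
```

Reporting the number of pairs actually used alongside the correlation helps when comparing the stemmed and unstemmed models, since their vocabularies may cover different subsets of the pairs.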
