DSC 214 Topological Data Science Project
Collaborators:
- Jiongli Zhu
- Zihe Liu
- Laurel Li
We analyzed the topological structure of word embeddings for 44 languages. Using the Mapper algorithm, we derived a topological structure for each language and constructed the corresponding persistence diagrams. We compared these diagrams pairwise using the bottleneck distance to measure how similar the languages are. Hierarchical clustering on the resulting distances then gave an overall multilingual structure. Our findings align with the established linguistics literature and offer insight into the topological characteristics of multilingual word embeddings.
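
The comparison and clustering steps can be illustrated with a minimal sketch. It is not the project's code: the three diagrams and the labels `lang_a`, `lang_b`, `lang_c` are made-up toy values, the helper names are hypothetical, and average linkage is an assumption, since the linkage used in the project is not stated here. It uses only the Python standard library and a brute-force bottleneck computation that is practical only for very small diagrams. A real pipeline would more likely rely on a TDA library and SciPy.

```python
from itertools import combinations

def _cost(p, q):
    """L-infinity cost of matching p to q; None stands for the diagonal."""
    if p is None and q is None:
        return 0.0
    if p is None:
        return (q[1] - q[0]) / 2.0
    if q is None:
        return (p[1] - p[0]) / 2.0
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def _has_perfect_matching(cost, t):
    """Augmenting-path bipartite matching that only uses edges with cost <= t."""
    n = len(cost)
    match_right = [-1] * n

    def augment(i, seen):
        for j in range(n):
            if cost[i][j] <= t and j not in seen:
                seen.add(j)
                if match_right[j] == -1 or augment(match_right[j], seen):
                    match_right[j] = i
                    return True
        return False

    return all(augment(i, set()) for i in range(n))

def bottleneck_distance(dgm_a, dgm_b):
    """Smallest t such that all points can be matched with every cost <= t."""
    # Pad each side with one diagonal slot per point of the other diagram.
    left = list(dgm_a) + [None] * len(dgm_b)
    right = list(dgm_b) + [None] * len(dgm_a)
    if not left:
        return 0.0
    cost = [[_cost(p, q) for q in right] for p in left]
    # Try candidate thresholds in increasing order (fine for toy sizes only).
    for t in sorted({c for row in cost for c in row}):
        if _has_perfect_matching(cost, t):
            return t

def average_linkage(names, dist):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, height)."""
    index = {name: i for i, name in enumerate(names)}

    def gap(pair):
        a, b = pair
        total = sum(dist[index[x]][index[y]] for x in a for y in b)
        return total / (len(a) * len(b))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=gap)
        merges.append((a, b, gap((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Toy (birth, death) diagrams; NOT data from the project.
diagrams = {
    "lang_a": [(0.0, 1.0), (0.25, 0.5)],
    "lang_b": [(0.0, 1.25), (0.25, 0.75)],
    "lang_c": [(0.0, 3.0)],
}
names = sorted(diagrams)
dist = [[bottleneck_distance(diagrams[a], diagrams[b]) for b in names] for a in names]
merges = average_linkage(names, dist)
for a, b, height in merges:
    print(a, b, height)
```

The merge history plays the role of a dendrogram: diagrams that are close in bottleneck distance are joined first, at a low height, and more distant ones join later.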