Bengali Dataset

Version: 0.1.0 (pre-release)

Introduction

Bengali Dataset aims to be the largest open-source Bengali dataset for NLP. Solving NLP for Bengali comes with a broad set of challenges, and this dataset is our first step toward addressing them. In the future, the dataset will be integrated with the Hugging Face Datasets library.

Number of Samples

This dataset will contain 1M annotated samples.

Contribute

This dataset is still in the development phase. We need more contributors and developers to reach the initial goal of 1M annotated Bengali samples.

See the how-to-contribute guide (how_to_contribute.md).

Contact the maintainers of the dataset.

Join our Discord community for further discussion.

LivingThings Community

neuropark/bengali-dataset
