@doc-analysis

Document AI (Microsoft Research Asia)

This repo provides a list of Document AI benchmark datasets from Microsoft Research Asia. For more details, please visit http://aka.ms/document-ai

Pinned

  1. TableBank Public

    TableBank: A Benchmark Dataset for Table Detection and Recognition

    988 stars, 139 forks

  2. DocBank Public

    DocBank: A Benchmark Dataset for Document Layout Analysis

    Python, 538 stars, 71 forks

  3. XFUND Public

    XFUND: A Multilingual Form Understanding Benchmark

    174 stars, 17 forks

  4. ReadingBank Public

    ReadingBank: A Benchmark Dataset for Reading Order Detection

    80 stars, 2 forks

Repositories

  • ReadingBank Public

    ReadingBank: A Benchmark Dataset for Reading Order Detection

    80 stars, 2 forks, 6 issues, 0 pull requests. Updated May 14, 2024
  • DocBank Public

    DocBank: A Benchmark Dataset for Document Layout Analysis

    Python, Apache-2.0 license, 538 stars, 71 forks, 26 issues, 0 pull requests. Updated Jul 19, 2023
  • TableBank Public

    TableBank: A Benchmark Dataset for Table Detection and Recognition

    Apache-2.0 license, 988 stars, 139 forks, 28 issues, 0 pull requests. Updated Jul 19, 2023
  • tablebank-page
    HTML, 1 star, 1 fork, 0 issues, 0 pull requests. Updated Jul 19, 2023
  • docbank-page
    HTML, 1 star, 0 forks, 0 issues, 0 pull requests. Updated Jul 19, 2023
  • XFUND Public

    XFUND: A Multilingual Form Understanding Benchmark

    174 stars, 17 forks, 9 issues, 0 pull requests. Updated Jul 15, 2022
  • doc-analysis.github.io
    CSS, 2 stars, 0 forks, 0 issues, 0 pull requests. Updated Sep 28, 2021
  • DocBankLoader Public

    DocBankLoader is a dataset loader for DocBank that can also convert DocBank into the format used by object detection models.

    Python, MIT license, 23 stars, 6 forks, 0 issues, 0 pull requests. Updated Mar 17, 2021