📚 ProjectTextSuite

Welcome to ProjectTextSuite, an open-source library for question-answering (QnA) pipelines built on large language models (LLMs) across various project databases. My suite handles multiple file types such as Word, PDF, and PPTX, and is particularly adept at managing large data tables like XLSX. Its modular design allows each component to be used independently, so it can fit diverse project needs.

This suite consists of six distinct packages, each contributing uniquely to the overall functionality:

  1. TextTableScoop: A robust file-to-text and table-to-csv parser utilizing LibreOffice to safely extract text and tables in CSV format.
  2. VecMetaQ: An efficient server wrapper over a FAISS vector database, offering fast similarity search and simplified metadata storage.
  3. ProjectTextAgent: A Docker-based file observer written in Go, designed to keep your file database constantly updated.
  4. xlsx2pandas: An intelligent XLSX file parser, ensuring smooth data extraction for integration into pandas dataframes.
  5. RelaLLM: A processor for multi-header dataframes with LLM-powered relational database mining.
  6. ProjectTextQnA: A question-answering interface leveraging LLMs for QA and text-to-SQL pipelines, with support for both self-hosted LLMs and external ones such as those available through the OpenAI API.
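
To give a feel for the kind of LibreOffice-based extraction that TextTableScoop is described as performing, here is a minimal sketch in Python. It is not TextTableScoop's actual code or API: the function names `build_convert_command`, `expected_output`, and `convert` are hypothetical, and the only assumption about the environment is that LibreOffice's `soffice` binary is on `PATH`. It relies on LibreOffice's standard headless conversion flags (`--headless`, `--convert-to`, `--outdir`).

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source: Path, out_dir: Path, target_format: str = "csv") -> list[str]:
    """Build a headless LibreOffice command that converts `source` to `target_format`.

    `target_format` may carry an export filter, e.g. "txt:Text" for plain text.
    """
    return [
        "soffice",
        "--headless",
        "--convert-to", target_format,
        "--outdir", str(out_dir),
        str(source),
    ]


def expected_output(source: Path, out_dir: Path, target_format: str = "csv") -> Path:
    """Return the path LibreOffice is expected to write: same stem, new extension."""
    extension = target_format.split(":")[0]  # drop an optional filter name
    return out_dir / f"{source.stem}.{extension}"


def convert(source: Path, out_dir: Path, target_format: str = "csv") -> Path:
    """Run the conversion in a subprocess and return the expected output path."""
    if shutil.which("soffice") is None:
        raise RuntimeError("LibreOffice (soffice) was not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises if LibreOffice exits with an error; the timeout value
    # is an illustrative choice, not a setting taken from the suite.
    subprocess.run(
        build_convert_command(source, out_dir, target_format),
        check=True,
        timeout=120,
    )
    return expected_output(source, out_dir, target_format)
```

Calling `convert(Path("report.xlsx"), Path("out"))` would be expected to produce `out/report.csv`. Note that a plain CSV conversion of a spreadsheet typically exports a single sheet, so multi-sheet workbooks would need extra handling that this sketch does not attempt.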

🚀 Getting Started

Installation is intended to be straightforward, but the steps are still under development. I'm working on making the setup process as seamless as possible.

TODO: Detailed installation instructions.

📘 Usage

The suite is designed for ease of use and broad accessibility. However, I'm still finalizing the comprehensive usage guidelines.

TODO: Complete usage documentation.

🛠️ API Endpoints

Each component of the suite boasts its own set of API endpoints, designed to facilitate a wide range of operations from data parsing to question-answering.

TODO: Document the API endpoints for each component.

🤝 Collaboration & Issues

I warmly welcome collaborators and am open to community feedback. If you're interested in contributing or have found an issue, please visit the issues page and check out my project board for more information and discussions.