Is there an existing issue for the same feature request?
I have checked the existing issues.
Is your feature request related to a problem?
Yes. Images within a PDF are currently processed only with OCR, which extracts any text in the image but fails to capture its context or describe what it shows. This limitation is particularly problematic with the General or Manual chunking methods, which do not provide comprehensive descriptions of visual content.
Describe the feature you'd like
I would like to integrate an Image2Text model to process and describe images within PDFs. This model should be used in conjunction with the General or Manual chunking methods to provide detailed and contextually accurate descriptions of images. This integration would enhance the overall data extracted from PDFs by including rich, descriptive information about visual content.
Describe implementation you've considered
Integration: Modify the existing PDF processing pipeline to incorporate the Image2Text model. When an image is detected during the General or Manual chunking process, the image should be sent to the Image2Text model for analysis.
Output Handling: The descriptive text generated by the Image2Text model should be incorporated into the final extracted data, alongside any text extracted via OCR.
User Interface: Provide options for users to enable or disable Image2Text processing, allowing flexibility based on their specific needs.
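The three points above could be sketched roughly as follows. This is only an illustration of the idea, not the project's actual pipeline: `ImageRegion`, `image_chunk_text`, the `enable_image2text` flag, and the `[Image description]` prefix are all hypothetical names, and the describer is assumed to be any callable that takes image bytes and returns text.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical type: not taken from the project's codebase.
@dataclass
class ImageRegion:
    page: int
    image_bytes: bytes
    ocr_text: str  # text the existing OCR step already extracted


# Assumed interface for an Image2Text model: image bytes in, description out.
Describer = Callable[[bytes], str]


def image_chunk_text(
    region: ImageRegion,
    describer: Optional[Describer] = None,
    enable_image2text: bool = True,
) -> str:
    """Combine OCR text with an optional Image2Text description."""
    parts = []
    ocr_text = region.ocr_text.strip()
    if ocr_text:
        parts.append(ocr_text)

    # The user-facing toggle: skip the model entirely when disabled.
    if enable_image2text and describer is not None:
        try:
            description = describer(region.image_bytes).strip()
        except Exception:
            # Fall back to OCR-only output if the model call fails.
            description = ""
        if description:
            parts.append(f"[Image description] {description}")

    return "\n".join(parts)


# Stand-in for a real model, used only to show the call shape.
def fake_describer(image_bytes: bytes) -> str:
    return "Bar chart comparing three models"


region = ImageRegion(page=3, image_bytes=b"\x89PNG", ocr_text="Accuracy (%)")
print(image_chunk_text(region, fake_describer))
print(image_chunk_text(region, fake_describer, enable_image2text=False))
```

Keeping the OCR text and the description in the same chunk means a retrieval query can match either the literal labels in the figure or the description of what it shows. With the option disabled, the output is the same as the current OCR-only behaviour.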
Documentation, adoption, use case
Use cases:
Technical Manuals: Users processing technical manuals often need detailed descriptions of diagrams and images to fully understand the content.
Research Papers: Researchers can benefit from comprehensive descriptions of charts and figures, which are critical for interpreting data and results.
Educational Materials: Educators and students can gain a better understanding of visual content in educational PDFs, such as textbooks and study guides.
Additional information
No response