Confluence loader "keep_newlines" not always passed to "process_pages" #20086
Labels
🤖:bug
Related to a bug, vulnerability, unexpected error with an existing feature
🔌: chroma
Primarily related to ChromaDB integrations
Ɑ: doc loader
Related to document loader module (not documentation)
Example Code
libs/community/langchain_community/document_loaders/confluence.py
@@ -359,6 +359,7 @@ def _lazy_load(self, **kwargs: Any) -> Iterator[Document]:
content_format,
ocr_languages,
keep_markdown_format,
+ keep_newlines=keep_newlines
)
Error Message and Stack Trace (if applicable)
No response
Description
I use LangChain's Confluence loader to download the content of a specific page of my Confluence instance. While text-splitting/chunking the pages, I noticed that the newlines were missing in non-markdown format. While debugging, I saw that the "keep_newlines" parameter is not forwarded to every call of the "process_pages" function inside libs/community/langchain_community/document_loaders/confluence.py.
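The effect can be reproduced in isolation. The sketch below is not the actual loader code: "process_pages" here is a simplified stand-in with an assumed signature, and "load_buggy" / "load_fixed" are hypothetical names used only to illustrate what happens when one call site omits the flag and silently falls back to the default.

```python
from typing import Iterator, List


def process_pages(pages: List[str], keep_newlines: bool = False) -> Iterator[str]:
    """Simplified stand-in (assumed signature): collapse whitespace unless told otherwise."""
    for page in pages:
        if keep_newlines:
            yield page
        else:
            # Default behaviour: every run of whitespace, newlines included, becomes one space.
            yield " ".join(page.split())


def load_buggy(batches: List[List[str]], keep_newlines: bool = False) -> Iterator[str]:
    """Hypothetical loader where only the first call site forwards the flag."""
    first, *rest = batches
    yield from process_pages(first, keep_newlines=keep_newlines)
    for batch in rest:
        # Bug: keep_newlines is not passed, so the default (False) is used.
        yield from process_pages(batch)


def load_fixed(batches: List[List[str]], keep_newlines: bool = False) -> Iterator[str]:
    """Hypothetical loader where every call site forwards the flag."""
    for batch in batches:
        yield from process_pages(batch, keep_newlines=keep_newlines)


batches = [["line 1\nline 2"], ["line 3\nline 4"]]
buggy = list(load_buggy(batches, keep_newlines=True))
fixed = list(load_fixed(batches, keep_newlines=True))
```

With keep_newlines=True, the second batch loses its newline in "load_buggy" but keeps it in "load_fixed", which matches the symptom described above: only some pages come back without line breaks.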
System Info
langchain=0.1.14
windows 11
python 3.10