Web Sources's "website link mode" does not scrape recursively the entire web site #655

marcofiocco · 2024-08-05T04:58:59Z

Recursively scraping an entire website from a single link would be a very useful feature.

If you cannot implement it, I can implement my own web scraper, but what would be the best way to load all the scraped web pages?
Even Browse mode does not allow selecting whole folders, only multiple individual files.

jexp · 2024-08-21T14:03:58Z

Since this is a mass-processing job, I think it would make sense to use the underlying Python code with LLMGraphTransformer in LangChain.

https://python.langchain.com/v0.1/docs/use_cases/graph/constructing/#llm-graph-transformer
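
For the scraping half of such a job, a minimal sketch of a same-host recursive crawler might look like the following. It uses only the Python standard library; `LinkParser`, `fetch_html`, `crawl_site`, and the `max_pages` limit are hypothetical names chosen for illustration and are not part of this project or of LangChain. It ignores robots.txt, rate limiting, and non-HTML content, which a real crawler would need to handle.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Default fetcher (hypothetical helper): download one page as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_site(start_url, fetch=fetch_html, max_pages=50):
    """Breadth-first crawl restricted to the host of start_url.

    Returns a dict mapping each visited URL to its raw HTML.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before comparing.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

The returned `pages` dict could then be turned into LangChain `Document` objects (for example, with the extracted text as `page_content` and the URL in `metadata`) and passed to `LLMGraphTransformer`, as described in the linked docs. That wiring is left out here because it depends on the chosen LLM and graph setup.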

kartikpersistent added the wontfix (This will not be worked on) label Aug 21, 2024

kartikpersistent assigned jexp Aug 21, 2024
