"community: Fix GithubFileLoader source code", "docs: Fix GithubFileLoader code sample" #19943

Merged (3 commits) on Aug 22, 2024

@nobutoba nobutoba commented Apr 3, 2024

This PR makes two small improvements to the `GithubFileLoader` document loader and its code sample. It addresses the following issues:

  1. The `file_extension` init argument of `GithubFileLoader` currently has no effect on the loader's behavior.
  2. The `GithubFileLoader` sample code in `docs/docs/integrations/document_loaders/github.ipynb` does not work as written, because the default branch of the langchain repository is "master", not "main".

The proposed fixes are:

  1. Remove the `file_extension` argument from `GithubFileLoader`.
  2. In the sample code, specify the branch as `master` (instead of the default `main`) and rename `documents` to `document`.
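
For reference, here is a rough sketch of what a working sample could look like after this change. It is not the exact notebook cell. It assumes the token is stored in an environment variable named `GITHUB_PERSONAL_ACCESS_TOKEN`, and the helper names `is_markdown` and `build_loader` are made up for illustration. Extension filtering is done through `file_filter`, since `file_extension` never had any effect.

```python
import os


def is_markdown(file_path: str) -> bool:
    """Predicate passed as file_filter: keep only Markdown files."""
    return file_path.endswith(".md")


def build_loader():
    """Build a loader for the langchain repo (illustrative helper, not library code)."""
    # Imported lazily so is_markdown can be used without langchain installed.
    from langchain_community.document_loaders import GithubFileLoader

    return GithubFileLoader(
        repo="langchain-ai/langchain",
        # The loader defaults to "main"; per this PR, the repo's default branch is "master".
        branch="master",
        # Assumed environment variable name for the personal access token.
        access_token=os.environ["GITHUB_PERSONAL_ACCESS_TOKEN"],
        github_api_url="https://api.github.com",
        file_filter=is_markdown,
    )
```

Calling `build_loader().load()` needs network access and a valid token, so only the filter predicate can be checked offline.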
@dosubot dosubot bot added the labels size:XS (This PR changes 0-9 lines, ignoring generated files), Ɑ: doc loader (Related to document loader module, not documentation), and 🤖:nit (Small modifications/deletions, fixes, deps or improvements to existing code or docs) on Apr 3, 2024
@isahers1 (Collaborator) commented

@baskaryan lgtm pending checks

@ccurme ccurme added the community Related to langchain-community label Jun 18, 2024
@@ -178,7 +178,6 @@ def url(self) -> str:
 class GithubFileLoader(BaseGitHubLoader, ABC):
     """Load GitHub File"""

-    file_extension: str = ".md"
Collaborator

this is breaking but I can't find any usage of file_extension
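
If some caller does still pass `file_extension`, one possible way to adapt is sketched below. `drop_file_extension` is a hypothetical helper, not part of langchain. It removes the key from the keyword arguments and, when no `file_filter` was given, substitutes an equivalent suffix filter. Unlike the old argument, that filter really does restrict which files are loaded, so behavior may change.

```python
from typing import Any, Dict


def drop_file_extension(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of loader kwargs without the removed file_extension key.

    Hypothetical migration helper. If an extension was given and no
    file_filter is present, add a suffix-based file_filter in its place.
    """
    cleaned = dict(kwargs)  # do not mutate the caller's dict
    extension = cleaned.pop("file_extension", None)
    if extension is not None and "file_filter" not in cleaned:
        suffix: str = extension
        cleaned["file_filter"] = lambda path: path.endswith(suffix)
    return cleaned
```

The returned dict can then be unpacked into the loader constructor, for example `GithubFileLoader(**drop_file_extension(old_kwargs))`.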

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Aug 22, 2024
@ccurme ccurme merged commit 4b63a21 into langchain-ai:master Aug 22, 2024
53 checks passed