GitHub - SausageMuffins/Xtra-Large-Bubble: Investigating the possibilities of echo chambers in Retrieval Augmented Generation (RAG)

Overview

This project seeks to investigate the possibility of biases in generated answers when using Retrieval Augmented Generation (RAG) techniques with Language Models (LLMs).

Language Models Tested:

ChatGPT 3.5 Turbo

Embedding Models Tested:

text-embedding-3-small

Set-up

Follow the steps below to set up the necessary dependencies. LangChain quickstart guide here.

Install Dependencies

pip install -r requirements.txt

Make sure to create your own .env file to store the necessary information for the APIs. The .env file should minimally look like:

API_KEY=insert_api_token_here

Your OpenAPI key can be created here. Make sure to top up the balance to use the api key.

Relevant Files

format_output.py is mainly for formatting and visualizing outputs at different stages of the pipeline (loading, splitting, retrieving etc.). The visualized files are under the folder "visualizing_articles".

Note: The reference articles are news articles with various topics (e.g. Covid-19, Climate Change). The articles will be used to in the main file (xlb) to perform RAG on.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
__pycache__		__pycache__
visualizing_articles		visualizing_articles
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
format_output.py		format_output.py
requirements.txt		requirements.txt
xlb.py		xlb.py
xlb_web_load.py		xlb_web_load.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Overview

Set-up

Relevant Files

About

Releases

Packages

Languages

License

SausageMuffins/Xtra-Large-Bubble

Folders and files

Latest commit

History

Repository files navigation

Overview

Set-up

Relevant Files

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages