This project implements a recommendation system for Facebook Marketplace that combines image and tabular data with image similarity search.
To set up the environment for this project, follow these steps:
- Clone this repository:
git clone https://github.com/PelumiAdeboye/facebook-marketplaces-recommendation-ranking-system
- Install the required Python packages:
pip install -r requirements.txt
- Access your AWS EC2 instance and S3 bucket to download data and files needed for the project.
Data cleaning for the tabular dataset is complete. The following tasks have been executed:
- A Python script, clean_tabular_data.py, has been created to clean the tabular dataset.
- All null values in any column have been removed.
- Prices have been converted into a numerical format by removing pound signs and commas.
- The main category of each product has been extracted, and labels have been assigned.
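The cleaning steps listed above can be sketched in a few lines. This is a minimal illustration using plain dictionaries, not the contents of clean_tabular_data.py: the column names `price` and `category`, the ` / ` separator between category levels, and the helper names are all assumptions made for the example.

```python
def parse_price(raw: str) -> float:
    """Turn a price string such as '£1,250.00' into a float."""
    return float(raw.replace("£", "").replace(",", "").strip())


def main_category(category_path: str, sep: str = " / ") -> str:
    """Keep only the first (top-level) part of a category path.

    The ' / ' separator is an assumption about how categories are stored.
    """
    return category_path.split(sep)[0].strip()


def clean_rows(rows):
    """Drop rows with missing values, then normalise price and category.

    Returns the cleaned rows plus a mapping from category name to integer label.
    """
    cleaned = []
    for row in rows:
        # Treat None and empty strings as null values and skip the row.
        if any(v is None or str(v).strip() == "" for v in row.values()):
            continue
        cleaned.append(
            {
                **row,
                "price": parse_price(row["price"]),
                "category": main_category(row["category"]),
            }
        )

    # Assign a stable integer label to each main category (sorted by name).
    names = sorted({row["category"] for row in cleaned})
    labels = {name: idx for idx, name in enumerate(names)}
    for row in cleaned:
        row["label"] = labels[row["category"]]
    return cleaned, labels
```

Sorting the category names before numbering them keeps the labels reproducible between runs, which matters when the same labels are later used to train and decode a classifier.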
Image data cleaning has been successfully carried out:
- A Python script, clean_images.py, was created to standardize image sizes and channels.
- An image-cleaning pipeline was established to ensure consistency in image size and channels.
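One common way to standardize size and channels is to convert each image to RGB, scale it to fit a square, and centre it on a blank canvas. The sketch below shows that approach with Pillow; the 512-pixel target, the black padding, and the function names are assumptions for illustration and may differ from what clean_images.py does.

```python
def fit_within(size, target):
    """Return (new_size, offset) for scaling `size` into a target x target square.

    The longer side is scaled to `target`; the offset centres the result.
    """
    width, height = size
    scale = target / max(width, height)
    new_size = (max(1, round(width * scale)), max(1, round(height * scale)))
    offset = ((target - new_size[0]) // 2, (target - new_size[1]) // 2)
    return new_size, offset


def clean_image(path, target=512):
    """Load an image, force three RGB channels, and pad it to a square."""
    # Pillow is imported here so fit_within stays usable without it installed.
    from PIL import Image

    with Image.open(path) as img:
        rgb = img.convert("RGB")  # unify greyscale, palette and RGBA inputs
        new_size, offset = fit_within(rgb.size, target)
        resized = rgb.resize(new_size)

    canvas = Image.new("RGB", (target, target))  # black background by default
    canvas.paste(resized, offset)
    return canvas
```

Padding rather than stretching keeps the aspect ratio of the product photo intact, at the cost of some empty border.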
Model training has been completed. The following objectives have been met:
- Machine learning models for tabular and image data have been trained (classify_images.py).
- A dataset (dataset.py) for feeding entries to the model has been created.
- Transfer learning was employed to fine-tune a pre-trained model, ResNet-50, for image classification (ResNet50_CNN.py).
- Model weights and the label encoder/decoder have been saved in image_modelDav.pt.
- A training loop and validation process were implemented (classify_images.py).
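The label encoder/decoder mentioned above is what lets the classifier's integer outputs be turned back into category names. A minimal sketch of such a pair of mappings follows; the function name and the idea of storing both dictionaries alongside the weights in one checkpoint are assumptions, not a description of how image_modelDav.pt is laid out.

```python
def build_label_maps(categories):
    """Build an encoder (name -> index) and decoder (index -> name).

    Names are sorted first so the same data always yields the same indices.
    """
    names = sorted(set(categories))
    encoder = {name: idx for idx, name in enumerate(names)}
    decoder = {idx: name for name, idx in encoder.items()}
    return encoder, decoder


# Hypothetical usage: the two dictionaries could be bundled with the model
# weights in a single checkpoint dictionary before saving, for example
# {"model_state": ..., "encoder": encoder, "decoder": decoder}.
```

At prediction time, the index of the highest-scoring output is passed through the decoder to recover the category name.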
Feature extraction has been accomplished:
- Image embeddings for every image in the training dataset were extracted using a feature extraction model (image_embeddings.py).
- A dictionary was created, mapping image IDs to their respective image embeddings. This dictionary was saved as a JSON file named image_embeddings.json.
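Saving the ID-to-embedding dictionary as JSON requires each embedding to be a plain list of floats, since tensors and arrays are not JSON-serializable. The sketch below shows one way to write and reload such a file; the helper names are hypothetical, and it assumes each embedding is any iterable of numbers (for example the result of calling `.tolist()` on a tensor).

```python
import json


def save_embeddings(embeddings, path):
    """Write a mapping of image ID -> embedding vector to a JSON file."""
    # Convert every vector to a list of Python floats so json can encode it.
    serialisable = {
        image_id: [float(x) for x in vector]
        for image_id, vector in embeddings.items()
    }
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(serialisable, fh)


def load_embeddings(path):
    """Read the mapping back; vectors come back as lists of floats."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

JSON object keys are always strings, so image IDs stored as integers would come back as strings after loading.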
The image similarity search system is in place:
- The saved dictionary of image embeddings was loaded.
- A FAISS model was created with image IDs as the index and the corresponding image embeddings as values (faiss1234.py).
- An API was implemented to perform vector search for similar images using FAISS (api.py).
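To make the search step concrete, the sketch below performs an exhaustive Euclidean-distance search over the embedding dictionary in pure Python. It is a stand-in for the FAISS index, not the code in faiss1234.py: FAISS does this far more efficiently, but an exact L2 search ranks results in the same order. The function name and the choice of L2 distance are assumptions.

```python
import math


def nearest_images(query, embeddings, k=3):
    """Return the k image IDs closest to `query`, with their L2 distances.

    `embeddings` maps image ID -> vector of the same length as `query`.
    """
    scored = []
    for image_id, vector in embeddings.items():
        if len(vector) != len(query):
            raise ValueError(f"dimension mismatch for image {image_id!r}")
        scored.append((math.dist(query, vector), image_id))

    scored.sort()  # smallest distance first; ties broken by image ID
    return [(image_id, distance) for distance, image_id in scored[:k]]
```

An API endpoint built on this idea would compute the embedding of an uploaded image, pass it in as `query`, and return the matching image IDs.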
For detailed information and instructions related to each milestone, refer to the relevant section in the project documentation.
- A list of project dependencies is available in the requirements.txt file.
The project is ready for use, and instructions on how to run and utilize it can be found in the project documentation.