Classification project

Jedha Data Science Bootcamp - Fullstack - week 3

Session dsmft-paris-08

Dealing with imbalanced dataset

Since the notebook uses the Plotly library, whose charts GitHub does not render, you can follow this nbviewer link to view it.

Conversion Rate Challenge

Main goal

Optimizing the customer conversion rate is one of the most important tasks of a data scientist.

To achieve this goal, I aim to build a model that predicts the conversion of a website's customers and to make some recommendations to the marketing team in order to increase revenue.

Description of the challenge

We have data about the users of an anonymous company's website. The project consists of:

Building a model to predict conversion (will the user buy or not?)
Making some recommendations to the Product & Marketing team in order to increase this rate

Imbalanced dataset

Since the target classes are imbalanced, we can use some of the available techniques to address the problem:

oversampling: duplicate examples of the minority class or generate synthetic examples using the imbalanced-learn library (imported as imblearn)
downsampling: reduce the number of examples of the majority class
mixing downsampling and oversampling to benefit from both
class weights: calculate the weights of the classes and incorporate them into the cost function used for model training
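
The techniques listed above can be sketched in a few lines of plain Python. This is only an illustration on made-up toy data, not the code used in the notebook: the function names (random_oversample, random_downsample, balanced_class_weights) and the 97/3 class split are assumptions chosen for the example. The weight formula n_samples / (n_classes * class_count) is the usual "balanced" heuristic. In practice the imbalanced-learn library offers ready-made samplers such as RandomOverSampler, RandomUnderSampler and SMOTE, and many scikit-learn estimators accept class_weight="balanced".

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect the rows belonging to each class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class is as large as the largest one."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_downsample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical toy data: 97 non-converting users (0) and 3 converting users (1).
toy_rows = list(range(100))
toy_labels = [0] * 97 + [1] * 3

over_rows, over_labels = random_oversample(toy_rows, toy_labels)
down_rows, down_labels = random_downsample(toy_rows, toy_labels)
weights = balanced_class_weights(toy_labels)

print(Counter(over_labels))  # both classes now have 97 examples
print(Counter(down_labels))  # both classes now have 3 examples
print(weights)               # the minority class gets the larger weight
```

Resampling should be applied to the training split only, so that the validation and test sets keep the original class distribution. Mixing the two approaches amounts to downsampling the majority class part of the way and oversampling the minority class to meet it.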
