Project for Fundamentals of Data Science 2018/2019, MSc in Computer Science.
Forked from luigiberducci; group composed of luigiberducci, angelodimambro, and me.
Kaggle Score: 0.11440
- 3 new features introduced: total number of bathrooms, number of garage cars multiplied by garage area, total square feet
- removal of multicollinear features
- automatic removal of features receiving a caret importance score of 0 under a Lasso regression model, repeated until the model's RMSE stopped decreasing
- Lasso regression model
- Ridge regression model
- eXtreme Gradient Boosting model
- Support Vector Machines
- Ensemble model (average)
- Stacked regression model (both variants A and B)
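The three engineered features listed above can be sketched as follows. This is a Python/pandas illustration (the original project used R with caret), and the column names are assumptions modeled on the Kaggle "House Prices" dataset; counting half baths as 0.5 is also an assumption.

```python
import pandas as pd

# Toy frame with hypothetical Ames-style column names (assumption).
df = pd.DataFrame({
    "FullBath": [2, 1], "HalfBath": [1, 0],
    "BsmtFullBath": [1, 0], "BsmtHalfBath": [0, 1],
    "GarageCars": [2, 1], "GarageArea": [480.0, 240.0],
    "TotalBsmtSF": [856, 920], "FirstFlrSF": [856, 920], "SecondFlrSF": [854, 0],
})

# 1) total number of bathrooms (half baths weighted 0.5 -- an assumption)
df["TotBathrooms"] = (df["FullBath"] + 0.5 * df["HalfBath"]
                      + df["BsmtFullBath"] + 0.5 * df["BsmtHalfBath"])

# 2) number of garage cars multiplied by garage area
df["GarageCarsArea"] = df["GarageCars"] * df["GarageArea"]

# 3) total square feet (basement + first + second floor)
df["TotalSF"] = df["TotalBsmtSF"] + df["FirstFlrSF"] + df["SecondFlrSF"]
```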
Our ensemble model computes a weighted average of the predictions produced by a set of simple models, using the following weights:
| Model | Weight |
|-------|--------|
| Lasso | 0.5    |
| Ridge | 0.5    |
| XGB   | 3.5    |
| SVM   | 5      |
These weights were optimized via 10-fold CV, minimizing the average RMSE with respect to the weights.
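The weighted average above can be sketched as follows; note that the weights do not sum to 1, so the sum is normalized by the total weight (0.5 + 0.5 + 3.5 + 5 = 9.5). The per-model predictions are illustrative values only.

```python
import numpy as np

weights = {"lasso": 0.5, "ridge": 0.5, "xgb": 3.5, "svm": 5.0}

# toy per-model predictions for two houses (illustrative values only)
preds = {
    "lasso": np.array([12.0, 11.5]),
    "ridge": np.array([12.1, 11.4]),
    "xgb":   np.array([12.2, 11.6]),
    "svm":   np.array([12.0, 11.5]),
}

# weighted average, normalized by the total weight
ensemble = sum(w * preds[m] for m, w in weights.items()) / sum(weights.values())
```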
In our stacked regression models, the predictions of a set of simple models are used to train a meta-model.
Variants:
- Variant A: the meta-model is trained on the average of the out-of-fold predictions produced during the simple models' k-fold training
- Variant B: the meta-model is trained on predictions produced by new instances of the simple models, each trained on the whole training set
Our stacked regression model uses the following recipe:
| Simple models | Meta-model   |
|---------------|--------------|
| Lasso         | Specific XGB |
| Ridge         |              |
| XGB           |              |
| SVM           |              |
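Variant B of the stacking recipe can be sketched as below on synthetic data. This is a Python/scikit-learn illustration of the idea only (the original project used R); `GradientBoostingRegressor` stands in for the XGB meta-model to keep the sketch dependency-light, and all hyperparameters are placeholder assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.svm import SVR
from sklearn.ensemble import GradientBoostingRegressor  # stand-in for XGB

# synthetic regression data (placeholder for the house-prices training set)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=200)

# Variant B: new instances of the simple models, trained on the whole
# training set; their predictions form the meta-model's feature matrix.
base_models = [Lasso(alpha=0.01), Ridge(alpha=1.0), SVR(),
               GradientBoostingRegressor(random_state=0)]
meta_X = np.column_stack([m.fit(X, y).predict(X) for m in base_models])

# the meta-model learns how to combine the simple models' predictions
meta_model = GradientBoostingRegressor(random_state=0).fit(meta_X, y)
stacked_preds = meta_model.predict(meta_X)
```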
Our final predictions are computed in the following way:
`predictions = (2 * ensemble + xgb + svm + stacked_variantA + stacked_variantB) / 6`
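The final blend gives the ensemble double weight, hence the divisor of 6. A minimal numeric sketch, with illustrative per-model values for a single house:

```python
# illustrative predictions for one house (values are placeholders)
ensemble, xgb, svm = 12.10, 12.05, 12.00
stacked_variantA, stacked_variantB = 12.08, 12.02

# ensemble counted twice, five terms total, divided by 6
predictions = (2 * ensemble + xgb + svm + stacked_variantA + stacked_variantB) / 6
```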