House Rocket is a real estate company. Its data scientist helps the CEO answer two questions and builds two tools for understanding the dataset.

m4theus4ndr4de/insights-real-state-negotiation

Real State Negotiation

This is a fictional project for study purposes. The company, the business context and the insights are not real. The dataset used in this project comes from Kaggle and is available there.

1. Description of the Business Problem

House Rocket is a real estate company. It buys houses at a good price and sells them again after some time. The company has a dataset with information about many houses available for purchase. The data scientist at House Rocket should help the CEO answer two questions and create two tools for understanding the dataset.

The questions to be answered:

  1. Which houses should the House Rocket CEO buy, and at what price? The source code can be found here and the dashboard is available here.
  2. When is the best time to sell them, and what would be the selling price? The source code can be found here.

The tools to be created:

  1. An interactive dashboard in which the data can be filtered according to the CEO's requirements and explored further. The dashboard was created with the Python package Streamlit and is available on Streamlit Cloud here.
  2. A set of insights about the dataset, each stated as a hypothesis and marked as true or false.

2. Dataset Attributes

Information about the attributes can be found here.

| Attribute | Description |
| --- | --- |
| id | Unique ID for each home sold |
| date | Date of the home sale |
| price | Price of each home sold |
| bedrooms | Number of bedrooms |
| bathrooms | Number of bathrooms, where .5 accounts for a room with a toilet but no shower |
| sqft_living | Square footage of the apartment's interior living space |
| sqft_lot | Square footage of the land space |
| floors | Number of floors |
| waterfront | A dummy variable for whether the apartment was overlooking the waterfront or not |
| view | An index from 0 to 4 of how good the view of the property was |
| condition | An index from 1 to 5 on the condition of the apartment |
| grade | An index from 1 to 13, where 1-3 falls short of building construction and design, 7 has an average level of construction and design, and 11-13 have a high quality level of construction and design |
| sqft_above | The square footage of the interior housing space that is above ground level |
| sqft_basement | The square footage of the interior housing space that is below ground level |
| yr_built | The year the house was initially built |
| yr_renovated | The year of the house's last renovation |
| zipcode | What zipcode area the house is in |
| lat | Latitude of the house |
| long | Longitude of the house |
| sqft_living15 | The square footage of interior housing living space for the nearest 15 neighbors |
| sqft_lot15 | The square footage of the land lots of the nearest 15 neighbors |

3. Business Premises

The solution was developed under the following premises:

  • The zipcode, condition and grade were the most important variables for deciding which houses should be purchased. Only houses with a condition greater than or equal to three and a grade greater than or equal to seven were classified as houses to be purchased.
  • The season was considered an important variable for finding the best moment to sell a house.
  • The median price was considered a better metric than the mean for evaluating whether a house should be purchased, because the mean can vary considerably if one house in a region is priced much higher than the others.
  • The median price per zipcode was also used to set the selling price. Houses priced below the median are sold at a 30% profit and houses priced above the median at a 10% profit.
  • The price per square foot of the living area was the variable analysed to decide whether to buy a house.
  • Values equal to zero in the column yr_renovated correspond to houses that were never renovated.
  • The price column represents the value at which the house was advertised for sale.
  • The date column represents the first day the house was for sale.
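The premises above can be read as a simple rule. The sketch below is one possible interpretation, not the project's actual code. It assumes that a house is recommended when its condition is at least 3, its grade is at least 7, and its price per square foot of living area is below the median for its zipcode, and that the 30% and 10% profits are applied as a markup on the advertised price. The function names and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import median


def zipcode_medians(houses, value):
    """Median of value(house) for each zipcode."""
    groups = defaultdict(list)
    for h in houses:
        groups[h["zipcode"]].append(value(h))
    return {zipcode: median(values) for zipcode, values in groups.items()}


def price_per_sqft(h):
    return h["price"] / h["sqft_living"]


def recommend(houses):
    """Return one row per house with a buy flag and a suggested selling price."""
    med_ppsf = zipcode_medians(houses, price_per_sqft)
    med_price = zipcode_medians(houses, lambda h: h["price"])
    rows = []
    for h in houses:
        buy = (
            h["condition"] >= 3
            and h["grade"] >= 7
            # Assumption: compare price per sqft with the zipcode median.
            and price_per_sqft(h) < med_ppsf[h["zipcode"]]
        )
        # Assumption: the profit is a markup on the advertised price.
        markup = 0.30 if h["price"] < med_price[h["zipcode"]] else 0.10
        rows.append(
            {"id": h["id"], "buy": buy, "sell_price": round(h["price"] * (1 + markup), 2)}
        )
    return rows


# Made-up sample rows, for illustration only.
houses = [
    {"id": 1, "zipcode": 98001, "price": 200_000, "sqft_living": 2000, "condition": 3, "grade": 7},
    {"id": 2, "zipcode": 98001, "price": 300_000, "sqft_living": 2000, "condition": 4, "grade": 8},
    {"id": 3, "zipcode": 98001, "price": 400_000, "sqft_living": 2000, "condition": 2, "grade": 9},
]
result = recommend(houses)
```

With real data, the same grouping is usually done with a pandas `groupby` on the zipcode column; the standard library is used here only to keep the sketch self-contained.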

4. Solution Strategy

  1. Download the dataset from Kaggle.
  2. Understand the business problem.
  3. Clean, analyse and explore the dataset using data science packages in Python.
  4. Answer the main questions from the business problem.
  5. Develop a dashboard for the CEO using Streamlit and deploy it on Streamlit Cloud.
  6. Create possible insights and analyse them.

5. The Insights

I1: Houses that have a river, lake or sea in front of them are at least 30% more expensive than houses that don't.

True: Houses that have a river, lake or sea in front of them are 212.64% more expensive.
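A check like I1 comes down to comparing the typical price of two groups. The following is a minimal sketch, assuming the comparison is the relative difference between group means (the text does not say whether the mean or the median was used). The function name and the prices are made up for illustration.

```python
from statistics import mean


def pct_more_expensive(group, baseline):
    """Percentage by which the mean of `group` exceeds the mean of `baseline`."""
    return (mean(group) - mean(baseline)) / mean(baseline) * 100


waterfront = [900_000, 1_100_000]   # made-up prices
no_waterfront = [400_000, 600_000]  # made-up prices

diff = pct_more_expensive(waterfront, no_waterfront)
hypothesis_holds = diff >= 30  # I1 claims "at least 30%"
```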

I2: Houses built before 1955 are 50% cheaper.

False: The prices of houses built before and after 1955 are almost the same.

I3: The average price of the houses is 10% greater in the summer than in all other seasons.

False: The average price of the houses is greater in the spring than in the summer.
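Season comparisons such as I3 need a season label for each sale. The text does not state how the seasons were defined, so the sketch below assumes Northern Hemisphere meteorological seasons (spring is March to May, and so on). The helper names and the sample sales are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Assumption: meteorological seasons, Northern Hemisphere.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def season_of(day):
    """Season name for a datetime.date."""
    return SEASON_BY_MONTH[day.month]


def mean_price_by_season(sales):
    """Mean price per season for an iterable of (date, price) pairs."""
    groups = defaultdict(list)
    for day, price in sales:
        groups[season_of(day)].append(price)
    return {season: mean(prices) for season, prices in groups.items()}


# Made-up sales, for illustration only.
sales = [
    (date(2014, 7, 1), 500_000),
    (date(2014, 8, 15), 520_000),
    (date(2015, 4, 10), 560_000),
]
by_season = mean_price_by_season(sales)
```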

I4: The average price increased by 10% from 2014 to 2015.

False: The mean price of the houses is almost the same in the two years considered.

I5: The difference between the lowest and the highest monthly average price is greater than 10% of the maximum value.

False: The average price in April is slightly less than 10% greater than the average price in February.
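The I5 check can be written as a single ratio. The sketch below is an illustration with made-up monthly averages and a hypothetical function name. Note that the finding is expressed relative to February (the minimum), while the hypothesis is stated relative to the maximum; dividing by the maximum gives a smaller figure, so the conclusion does not change.

```python
def monthly_spread_pct(avg_price_by_month):
    """(max - min) / max of the monthly average prices, as a percentage."""
    highest = max(avg_price_by_month.values())
    lowest = min(avg_price_by_month.values())
    return (highest - lowest) / highest * 100


# Made-up monthly averages, for illustration only.
avg_price_by_month = {"Feb": 460_000, "Mar": 480_000, "Apr": 500_000}

spread = monthly_spread_pct(avg_price_by_month)
hypothesis_holds = spread > 10  # I5 claims "greater than 10%"
```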

I6: Houses that were never renovated are at least 20% cheaper.

True: Houses that were never renovated are 30% cheaper than renovated houses.

6. Possible Profit of the Solution

The proposed solution would result in an average profit of 100 K per house purchased and sold.

House Rocket would make a profit of 998 M if all the houses were bought, which would require an investment of 5,134 M.
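The figures above can be related to each other with a little arithmetic. This is only a derived illustration: the return on investment and the implied number of houses are not stated in the text, and the house count is approximate because the per-house profit is a rounded figure.

```python
total_profit_m = 998   # total profit, in millions (from the text)
investment_m = 5_134   # total investment, in millions (from the text)
avg_profit_k = 100     # average profit per house, in thousands (from the text)

# Return on investment, as a percentage of the amount invested.
roi_pct = total_profit_m / investment_m * 100

# Number of houses implied by the total and the (rounded) average profit.
implied_houses = total_profit_m * 1_000 / avg_profit_k

print(f"ROI: {roi_pct:.1f}%")
print(f"Implied houses: {implied_houses:.0f}")
```

The return lies between the 10% and 30% profits named in the premises, as expected for a mix of houses priced above and below the zipcode median.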

7. Conclusion

The questions that motivated this project were answered. By analysing the dataset, it was possible to find out which houses should be bought based on their price, zipcode, condition and grade. The dashboard was created using Streamlit and deployed on Streamlit Cloud. The insights were generated from the Kaggle dataset.

8. Future Work

  • Improve the Streamlit dashboard by adding new features.
  • Analyse the data to find out if houses in bad condition should be bought and renovated.
  • Develop a machine learning model to predict whether a house with known attributes should be bought at a given price.
  • Develop a machine learning model to predict an adequate selling price for a house that was already bought.
