Real estate projects / Machine Learning / Kaggle Competition
Two Sigma Connect: Rental Listing Inquiries
How much interest will a new rental listing on RentHop receive?
This real estate project was an entry in a Kaggle competition. Its purpose was to classify apartment listings in New York City, some areas of Boston, and some areas of California as receiving low, medium, or high interest from RentHop users. Details of the competition may be found here:
https://www.kaggle.com/c/two-sigma-connect-rental-listing-inquiries
"Finding the perfect place to call your new home should be more than browsing through endless listings. RentHop makes apartment search smarter by using data to sort rental listings by quality. But while looking for the perfect apartment is difficult enough, structuring and making sense of all available real estate data programmatically is even harder. Two Sigma and RentHop, a portfolio company of Two Sigma Ventures, invite Kagglers to unleash their creative engines to uncover business value in this unique recruiting competition.
Two Sigma invites you to apply your talents in this recruiting competition featuring rental listing data from RentHop. Kagglers will predict the number of inquiries a new listing receives based on the listing’s creation date and other features. Doing so will help RentHop better handle fraud control, identify potential listing quality issues, and allow owners and agents to better understand renters’ needs and preferences.
Two Sigma has been at the forefront of applying technology and data science to financial forecasts. While their pioneering advances in big data, AI, and machine learning in the financial world have been pushing the industry forward, as with all other scientific progress, they are driven to make continual progress. This challenge is an opportunity for competitors to gain a sneak peek into Two Sigma's data science work outside of finance."
The project performs exploratory data analysis and feature engineering, then applies multiple machine learning methods (XGBoost, Random Forest) to classify the listings.
Comments in the Jupyter Notebook files explain what is happening step by step.
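To give a rough idea of what the feature engineering step can look like, here is a minimal sketch using only the Python standard library. It is not the code from the notebooks. The field names (bathrooms, bedrooms, price, created, description, features, photos) are assumed to match the competition's listing data and should be checked against the actual files, and the derived features and the sample listing are hypothetical illustrations.

```python
from datetime import datetime

# Hypothetical listing record. The field names are assumed to mirror the
# competition data; the values are made up for illustration.
SAMPLE_LISTING = {
    "bathrooms": 1.0,
    "bedrooms": 2,
    "price": 3000,
    "created": "2016-06-24 07:54:24",
    "description": "Sunny two bedroom near the park",
    "features": ["Doorman", "Elevator"],
    "photos": ["a.jpg", "b.jpg", "c.jpg"],
}


def engineer_features(listing):
    """Derive a few simple numeric features from one raw listing dict."""
    created = datetime.strptime(listing["created"], "%Y-%m-%d %H:%M:%S")
    return {
        "price": listing["price"],
        # Add 1 to the bedroom count so studios (0 bedrooms) do not
        # cause a division by zero.
        "price_per_bedroom": listing["price"] / (listing["bedrooms"] + 1),
        "total_rooms": listing["bedrooms"] + listing["bathrooms"],
        # Counts of list-valued and text fields as proxies for listing detail.
        "num_photos": len(listing["photos"]),
        "num_features": len(listing["features"]),
        "description_words": len(listing["description"].split()),
        # Time-of-posting features taken from the creation timestamp.
        "created_hour": created.hour,
        "created_weekday": created.weekday(),  # Monday is 0
    }


if __name__ == "__main__":
    print(engineer_features(SAMPLE_LISTING))
```

Rows produced this way are purely numeric, so they could be passed to a tree-based classifier such as Random Forest or XGBoost to predict the low, medium, or high interest label.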