How to Create a Pricing Model for Renting a Property via Airbnb
Renting out a property via Airbnb has become a popular way to earn investment income.
There are several advantages of renting a property via Airbnb.
- You reach travelers from around the world who plan to visit your country or city.
- It is easy to advertise properties on the Internet.
- You save on agency costs.
- Rental terms are flexible: you don’t have to look for tenants who want long-term stays.
However, it is challenging to advertise a property at a reasonable price, because each property is unique.
A higher listing price means fewer people are willing to rent the property. On the other hand, a lower price hurts the landlord’s investment return.
In this article, let me show you how to build a simple pricing model for property listings in a few steps.
- Collect Data
First, let’s start by collecting a listing dataset from kaggle.com. I use Seattle, WA listing data for my pricing model.
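As a minimal sketch of this step, the downloaded CSV can be loaded with pandas. The file name `listings.csv`, the column names, and the sample rows below are assumptions for illustration; a tiny in-memory sample stands in for the real file so the snippet runs on its own.

```python
import io

import pandas as pd

# Stand-in for the downloaded file (made-up rows). In practice, pass the path
# of the CSV (assumed here to be "listings.csv") to pd.read_csv instead.
sample_csv = io.StringIO(
    "id,neighbourhood,property_type,room_type,accommodates,price\n"
    "1,Ballard,House,Entire home/apt,4,150.0\n"
    "2,Fremont,Apartment,Private room,2,65.0\n"
    "3,Ballard,Apartment,Entire home/apt,3,110.0\n"
)

listings = pd.read_csv(sample_csv)

# A first look at the size of the dataset and the type of each column.
print(listings.shape)
print(listings.dtypes)
```

Checking the shape and column types early helps spot columns that were read as text and need converting before modeling.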
- Choose relevant aspects
Next, analyze the dataset thoroughly and identify the attributes that have an impact on listing prices. In my case, I chose ‘location’, ‘property type’, and ‘room type’ as the 3 main attributes for my pricing model.
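One simple way to check whether an attribute matters is to compare the average price across its categories. The sketch below assumes hypothetical column names (`room_type`, `property_type`, `price`) and uses made-up rows; the helper `mean_price_by` is also just an illustration.

```python
import pandas as pd

# Made-up sample rows; the column names are assumptions about the dataset.
listings = pd.DataFrame({
    "room_type": ["Entire home/apt", "Private room", "Entire home/apt",
                  "Shared room", "Private room"],
    "property_type": ["House", "Apartment", "Apartment", "Apartment", "House"],
    "price": [150.0, 65.0, 110.0, 40.0, 75.0],
})


def mean_price_by(df, column):
    """Return the average price per category of `column`, highest first."""
    return df.groupby(column)["price"].mean().sort_values(ascending=False)


for column in ["room_type", "property_type"]:
    print(mean_price_by(listings, column), end="\n\n")
```

If the averages differ clearly between categories, the attribute is a reasonable candidate for the model.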
- Clean Data
This is the most important task in creating a good pricing model. In my case, I removed the columns that have a large proportion of missing values from the dataset.
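A minimal sketch of dropping sparse columns is shown below. The 50% threshold, the helper name `drop_sparse_columns`, and the sample columns are assumptions; the right cutoff depends on your data.

```python
import numpy as np
import pandas as pd


def drop_sparse_columns(df, max_missing=0.5):
    """Drop columns whose share of missing values exceeds `max_missing`."""
    missing_share = df.isnull().mean()  # fraction of missing values per column
    keep = missing_share[missing_share <= max_missing].index
    return df[keep]


# Made-up sample rows for illustration.
listings = pd.DataFrame({
    "price": [150.0, 65.0, 110.0, 40.0],
    "bathrooms": [2.0, np.nan, 1.0, 1.0],           # 25% missing: kept
    "square_feet": [np.nan, np.nan, np.nan, 800],   # 75% missing: dropped
})

cleaned = drop_sparse_columns(listings)
print(list(cleaned.columns))
```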
If a column has only a few missing values, I use a data imputation technique (refer to Seven Ways to Make up Data: Common Methods to Imputing Missing Data) to deal with them.
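Mean imputation is one common choice for numeric columns; the article does not say which technique was used, so treat the sketch below as an assumption. The sample values are made up.

```python
import numpy as np
import pandas as pd

# Made-up sample rows with a few gaps.
listings = pd.DataFrame({
    "bathrooms": [1.0, np.nan, 2.0, 1.0],
    "bedrooms": [1.0, 2.0, np.nan, 3.0],
})

numeric_cols = ["bathrooms", "bedrooms"]

# Replace each missing value with the mean of its own column.
column_means = listings[numeric_cols].mean()
listings[numeric_cols] = listings[numeric_cols].fillna(column_means)

print(listings)
```

The median or the most frequent value can be swapped in the same way when a column is skewed.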
Missing values in categorical columns can be handled by encoding each column as 0/1 indicator values. For details about categorical data, please refer to Ways To Handle Categorical Data With Implementation.
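One way to get 0/1 values out of a categorical column is `pandas.get_dummies`, sketched below on made-up rows. Using `dummy_na=True` so that missing values get their own indicator column is an assumption about how the missing entries should be treated.

```python
import numpy as np
import pandas as pd

# Made-up sample rows; one listing has no room type recorded.
listings = pd.DataFrame({
    "room_type": ["Entire home/apt", "Private room", np.nan, "Private room"],
    "price": [150.0, 65.0, 110.0, 75.0],
})

encoded = pd.get_dummies(
    listings,
    columns=["room_type"],
    dummy_na=True,  # adds a room_type_nan indicator for missing values
    dtype=int,      # 0/1 integers instead of booleans
)

print(encoded.columns.tolist())
print(encoded)
```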
- Build a model
Once the data is cleaned, I train a linear regression model on it. However, it is up to you to choose a model you feel comfortable and familiar with.
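The sketch below shows what training a linear regression could look like with scikit-learn. The data is synthetic (generated so that price depends linearly on two features plus noise), and the split ratio and random seeds are arbitrary assumptions, not the settings used for the Seattle data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: price is a linear function of the features plus noise.
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "accommodates": rng.integers(1, 9, size=n),
    "bedrooms": rng.integers(1, 5, size=n),
})
y = 30 + 20 * X["accommodates"] + 15 * X["bedrooms"] + rng.normal(0, 10, size=n)

# Hold out 30% of the rows to check the model on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)

# Predict the price for a hypothetical listing.
new_listing = pd.DataFrame({"accommodates": [4], "bedrooms": [2]})
print(model.predict(new_listing))
```

Keeping a held-out test set matters here: a score measured on the training rows alone says little about how the model prices new listings.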
- Result
I built the pricing model with 2 different sets of numeric attributes.
First, I chose ‘accommodates’, ‘bathrooms’, ‘bedrooms’, ‘beds’, ‘host_response_rate’, ‘host_acceptance_rate’, ‘host_has_profile_pic’, ‘host_identity_verified’. The error rate is -706.
Note that the error rate here describes how far the model’s output deviates from the actual values in the data.
Second, I chose only ‘accommodates’, ‘bathrooms’, ‘bedrooms’, ‘beds’. The result looks reasonable: the error rate is 0.56.
Therefore, adding more attributes does not always lead to a better model. The result also depends on the size of the dataset and the relevance of the data used.
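A comparison like this can be sketched as a small helper that fits the same model on different feature sets and scores each on a held-out split. The article does not name its metric, so the use of scikit-learn's `r2_score` here is an assumption, as are the helper name `evaluate`, the synthetic data, and the feature set labels.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; host_response_rate is unrelated to price here.
rng = np.random.default_rng(0)
n = 300
data = pd.DataFrame({
    "accommodates": rng.integers(1, 9, size=n),
    "bedrooms": rng.integers(1, 5, size=n),
    "host_response_rate": rng.uniform(0.5, 1.0, size=n),
})
data["price"] = (
    30 + 20 * data["accommodates"] + 15 * data["bedrooms"]
    + rng.normal(0, 10, size=n)
)


def evaluate(df, features, target="price"):
    """Fit a linear regression on `features` and return its test-set score."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[features], df[target], test_size=0.3, random_state=42
    )
    model = LinearRegression().fit(X_train, y_train)
    return r2_score(y_test, model.predict(X_test))


feature_sets = {
    "core": ["accommodates", "bedrooms"],
    "core + host": ["accommodates", "bedrooms", "host_response_rate"],
}
scores = {name: evaluate(data, cols) for name, cols in feature_sets.items()}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Using the same split for every feature set keeps the comparison fair, since only the inputs change between runs.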
More work is needed to build a good pricing model.
Several considerations
- Software choice: I use Jupyter Notebook. It’s easy to install via https://www.anaconda.com/products/individual
- Size of data: In general, the larger the dataset, the better the model works. Given limited time and resources, however, I suggest using a dataset with no fewer than 3,000 records.
- Relevant knowledge: Building a model requires basic programming and statistics knowledge. I use Python and Pandas, and I recommend that you do the same. The reason is simple: Python and Pandas are widely used for data science modeling projects.
- Iterative process: Do not expect to build a working model on the first attempt. Most of the time, you will need to repeat steps many times, for example adding or removing data attributes, reloading data with different sizes, and changing model parameters.
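The points about data size and iteration can be explored with a short loop that refits the model on samples of different sizes. This is only a sketch on synthetic data; the sample sizes, the 5-fold cross-validation, and the feature names are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data with 3,000 rows.
rng = np.random.default_rng(1)
n = 3000
data = pd.DataFrame({
    "accommodates": rng.integers(1, 9, size=n),
    "bedrooms": rng.integers(1, 5, size=n),
})
data["price"] = (
    30 + 20 * data["accommodates"] + 15 * data["bedrooms"]
    + rng.normal(0, 25, size=n)
)

features = ["accommodates", "bedrooms"]
results = {}
for size in [100, 500, 3000]:
    subset = data.sample(n=size, random_state=0)
    # 5-fold cross-validation; the default score for regressors is R^2.
    cv_scores = cross_val_score(
        LinearRegression(), subset[features], subset["price"], cv=5
    )
    results[size] = (cv_scores.mean(), cv_scores.std())

for size, (mean, std) in results.items():
    print(f"{size:>5} rows: mean score {mean:.3f} (+/- {std:.3f})")
```

Watching how the spread across folds changes with the number of rows gives a rough sense of whether more data is likely to help.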
Pricing Model Link
I saved my pricing model work on GitHub. Feel free to use it and experiment with it.