This is my very first Medium post.
In this first post, let us look at how a simple EDA (exploratory data analysis) can be used to extract useful preliminary insights from data.
For people who are new to the data or analytics field, the first things that come to mind are usually machine learning or complex statistical modeling.
In fact, based on my experience over the past few years, not every analysis needs a fancy statistical model or machine learning technique.
Often, a simple EDA combined with good domain knowledge is enough.
Airbnb mobile app illustration (source: bt.com)

To begin with, it’s good to have an understanding of what EDA is.
According to Wikipedia, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods.
A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.
In short, EDA is usually performed in the early stage of an analysis to give us the main ideas in the data, help us form further hypotheses, and possibly lead to new data collection and experiments.
It is also fine to use EDA alone and present our findings (just like the analysis in this article!).
Data Source and Context

We can get the dataset provided by Airbnb from this Kaggle link.
The dataset consists of 3 tables (listings, calendar, and reviews).

Data description

‘listings’ table: detailed information on each listing (e.g. listing id, url, posted time, listing description, neighborhood description, normal price, location coordinates, etc.)
‘calendar’ table: the availability of each listing on a particular date and its price on that day

Overview:

Airbnb is an online marketplace which provides homestay or tourism services.
It acts as a broker/aggregator for hosts and receives a commission on every booking made.
Seattle is a seaport city on the West Coast of the United States with 730,000 residents as of 2018.
In this article, let’s focus on the ‘vibe of each Seattle neighborhood’.
Seattle (source: wallpaperaccess)

Findings

Capitol Hill is the neighborhood with the highest number of properties listed on Airbnb.
Its listing count is 65% higher than that of Ballard, the neighborhood with the second-highest number of listings.
Capitol Hill, Belltown, and First Hill’s listings are dominated by apartments, while most of Ballard, Minor, and Wallingford’s listings are houses.
Queen Anne has a more balanced mix of houses and apartments.
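
The analysis in this post was done with SQL and Excel, but the same counts could be sketched in Python roughly as follows. The rows are made up for illustration, and the field names `neighbourhood` and `property_type` are assumptions about how the ‘listings’ table might be loaded, not necessarily the exact column names in the Kaggle file.

```python
from collections import Counter, defaultdict

# Made-up sample rows (NOT real data); the real table has one row per listing.
listings = [
    {"neighbourhood": "Capitol Hill", "property_type": "Apartment"},
    {"neighbourhood": "Capitol Hill", "property_type": "Apartment"},
    {"neighbourhood": "Capitol Hill", "property_type": "House"},
    {"neighbourhood": "Ballard", "property_type": "House"},
    {"neighbourhood": "Ballard", "property_type": "House"},
]


def listing_counts(rows):
    """Number of listings per neighborhood."""
    return Counter(row["neighbourhood"] for row in rows)


def property_mix(rows):
    """Share of each property type within every neighborhood (0.0 to 1.0)."""
    by_hood = defaultdict(Counter)
    for row in rows:
        by_hood[row["neighbourhood"]][row["property_type"]] += 1
    mix = {}
    for hood, type_counts in by_hood.items():
        total = sum(type_counts.values())
        mix[hood] = {ptype: n / total for ptype, n in type_counts.items()}
    return mix


def pct_higher(a, b):
    """How many percent a is above b, e.g. to compare two listing counts."""
    return (a - b) / b * 100


counts = listing_counts(listings)
mix = property_mix(listings)
(first, n_first), (second, n_second) = counts.most_common(2)

print(counts.most_common())
print(f"{first} has {pct_higher(n_first, n_second):.0f}% more listings than {second}")
print(mix[first])
```

A neighborhood counts as "dominated" by a property type when that type's share in `mix` is clearly the largest; where the shares are close, the mix is balanced.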
The three neighborhoods with the highest listing prices, Pike Place Market, Belltown, and Central Business District, are concentrated in a relatively small area.
My Tableau, unfortunately, wasn’t able to connect and overlay the plot on the real map 🙁 So let’s just display a snapshot from Gmaps side by side instead :)

There is a ‘Neighborhood Overview’ column in the ‘listings’ table, which holds the overview of the neighborhood given by the lister/property owner.

Neighborhood overview column example

Using the Neighborhood Overview data, the three neighborhoods share more or less the same keywords: ‘Center’, ‘Market’, ‘Downtown’, ‘Tourist’, ‘Museum’, ‘Space’, ‘Walk’.
It is then clear that Pike Place Market, Central Business District, and Belltown are the most popular areas in Seattle, as they are located in the heart of the city, the center of market/trade, and close to tourist attractions.
Thus, property rental prices are higher in these areas.
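
The word clouds for this post were made with wordclouds.com. For readers who prefer code, a rough equivalent is to count the most frequent words in the overview texts. The sketch below uses made-up overview sentences and a tiny stop-word list of my own choosing, so treat it as an illustration of the idea rather than the exact procedure behind the keywords above.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption); a real one would be longer.
STOPWORDS = {"the", "and", "from", "are", "with", "this", "that"}

# Made-up overview texts (NOT taken from the dataset).
overviews = [
    "Steps from the market and downtown museums.",
    "Walk to the market, the waterfront, and downtown.",
    "In the center of downtown, close to the market.",
]


def top_keywords(texts, n=5):
    """Return the n most frequent words, skipping stop words and very short words."""
    counter = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counter.update(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return counter.most_common(n)


print(top_keywords(overviews, n=2))
```

Running this per neighborhood and comparing the top words is one way to check whether several neighborhoods share the same vocabulary.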
By calculating the price difference/distance ratio, we can find neighborhoods that offer cheaper listing prices but are still near the city center.
First Hill, Capitol Hill, and Eastlake have the highest price diff/distance ratios, which makes them the recommended neighborhoods for renting in Seattle.
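
The exact formula behind the ratio is not spelled out here, so the sketch below is only one plausible reading. It assumes that the price difference is a reference price (for example, the average price in the priciest central neighborhoods) minus the neighborhood's average price, and that distance is the straight-line distance to the city center in kilometres. The center coordinates, the reference price, and the two sample neighborhoods are all made up for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def price_distance_ratio(avg_price, lat, lon, reference_price, center):
    """(reference price - neighborhood price) / distance to center.

    Higher means cheaper and closer. Assumes the point is not the center
    itself, since that would divide by zero.
    """
    distance = haversine_km(lat, lon, center[0], center[1])
    return (reference_price - avg_price) / distance


# All values below are illustrative assumptions, not results from the dataset.
CITY_CENTER = (47.61, -122.34)   # rough point in downtown Seattle
REFERENCE_PRICE = 180.0          # hypothetical average price in the center

neighborhoods = {
    "Hood A": {"avg_price": 120.0, "lat": 47.62, "lon": -122.32},
    "Hood B": {"avg_price": 100.0, "lat": 47.68, "lon": -122.38},
}

ranked = sorted(
    (
        (name, price_distance_ratio(d["avg_price"], d["lat"], d["lon"],
                                    REFERENCE_PRICE, CITY_CENTER))
        for name, d in neighborhoods.items()
    ),
    key=lambda item: item[1],
    reverse=True,
)

for name, ratio in ranked:
    print(f"{name}: {ratio:.1f} price units saved per km from the center")
```

In this toy data, Hood B is cheaper in absolute terms but far from the center, so Hood A ends up with the higher ratio. That is the kind of trade-off the ratio is meant to surface.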
The busiest time to visit those neighborhoods is between May and August, while the end and start of the year are the most vacant.
Price doesn’t fluctuate as much as availability.
However, since the busiest time is mid-year, the highest prices also fall in the June-August range.
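
A seasonality check like this can be sketched from the ‘calendar’ table by grouping rows by month. The rows and field names below (`date`, `available`, `price`) are assumptions for illustration. The sketch treats a day as busy when the listing is not available, which is only a proxy for real bookings.

```python
from collections import defaultdict
from datetime import date

# Made-up calendar rows (NOT real data): one row per listing per day.
calendar = [
    {"date": date(2016, 1, 5), "available": True, "price": 100.0},
    {"date": date(2016, 1, 6), "available": True, "price": 100.0},
    {"date": date(2016, 7, 5), "available": False, "price": None},
    {"date": date(2016, 7, 6), "available": True, "price": 120.0},
]


def monthly_summary(rows):
    """Per month: share of available days and average price of available days."""
    total_days = defaultdict(int)
    free_days = defaultdict(int)
    prices = defaultdict(list)
    for row in rows:
        month = row["date"].month
        total_days[month] += 1
        if row["available"]:
            free_days[month] += 1
            if row["price"] is not None:
                prices[month].append(row["price"])
    summary = {}
    for month in sorted(total_days):
        month_prices = prices[month]
        summary[month] = {
            "availability": free_days[month] / total_days[month],
            "avg_price": sum(month_prices) / len(month_prices) if month_prices else None,
        }
    return summary


summary = monthly_summary(calendar)
for month, stats in summary.items():
    print(month, stats)
```

Months with the lowest `availability` are the busiest ones, and comparing the spread of `avg_price` with the spread of `availability` shows which of the two fluctuates more.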
Key Takeaways

1. Capitol Hill has the highest number of listings, followed by Ballard and Belltown. Capitol Hill, Belltown, and First Hill are dominated by apartments, while Ballard, Minor, and Wallingford are dominated by houses.
2. The three neighborhoods with the highest listing prices, Pike Place Market, Belltown, and Central Business District, are concentrated in a relatively small area.
3. Pike Place Market, Central Business District, and Belltown are the most popular areas in Seattle, as they are located in the heart of the city, the center of market/trade, and close to tourist attractions. Thus, property rental prices are higher in these areas.
4. First Hill, Capitol Hill, and Eastlake are the top recommended neighborhoods, as they offer cheaper listing prices but are also near the city center.
5. The busiest time to rent properties in the recommended areas is between May and August, while the end and start of the year are the most vacant.
6. Price doesn’t fluctuate as much as availability. However, since the busiest time is mid-year, the highest prices also fall in the June-August range.

Appendix

Tools:
1. SQL and Excel for data query and processing
2. Tableau and Thinkcell for visualization
3. wordclouds.com for making word clouds

I will always be happy to discuss further ideas about the analysis; please reach me through https://www.linkedin.com/in/muhammad-lutfi-hernandi/.

Cheers