Why is Everyone Going to Iceland?

(Hello again in Icelandic).

In my free time, I enjoy watching YouTube videos on topics ranging from photography and cinematography to soccer and comedy.

I recently started following Johnny Harris’ channel after watching his video on building a dream studio, and I stumbled upon his analysis of why tourism in Iceland has risen so dramatically (link below).

I’ve always wanted to visit Iceland with my photographer friend Pablo because of the country’s beautiful scenery.

In this post, I try to work out why Iceland has seen such a meteoric rise in tourism, using public data sources.

The video itself does a good job of explaining the sudden surge in tourism and gives a few good reasons.

My point here is not to refute his arguments but rather to use data and see what I can learn from it.

I find public data sources, clean the data, visualize it, and make predictions with Facebook Prophet.

Data Sources

For my investigation, I started just as I would have back in college when writing a research paper: with Google searches.

I was querying to find relevant, public datasets that I could download and analyze.

I settled on three main datasets, whose links I have attached here:

1. Visitors to Iceland Through Keflavik Airport: this is the main hub for international transportation.

I didn’t find stats for all visitors from all transportation hubs but I figured this would be a good sample from the entire population.

import pandas as pd

# Load the Keflavik Airport visitor counts, using the year as a datetime index.
airport = pd.read_excel("/Users/stephenhyungilkim/Desktop/Iceland/hola.xlsx", parse_dates=["year"], index_col="year")
airport.head(5)

2. List of facts about Iceland: this came from the World Bank. There were more than 150 different categories, and I reduced them to the 20 that were most relevant and interesting to me.

https://data.worldbank.org/country/iceland

# Load the reduced World Bank indicators, indexed by date.
facts = pd.read_csv("/Users/stephenhyungilkim/Desktop/Iceland/chau.csv", parse_dates=["Date"], index_col="Date")
facts.head(5)

Here is a list of all the reduced columns:

3. Yearly Number of Airline Passengers: foreign visitor arrivals by air and sea to Iceland from 1949 to 2017.

# Foreign visitor arrivals by air and sea to Iceland 1949–2017
visitor = pd.read_excel("/Users/stephenhyungilkim/Desktop/Iceland/foreign-visitors-to-iceland-1949–2017.xls", parse_dates=["Date"])
visitor.head(5)

Data Pre-processing

Facebook Prophet requires data to be in a specific format before it can fit the model and make predictions.

I used dataset 3 to make the Facebook Prophet prediction since it contained the variable I was looking for, number of visitors.

However, the other two datasets (1 and 2) are also interesting for spotting correlations and patterns.

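# Convert the Date column to datetimes and confirm the column types.
visitor['Date'] = pd.DatetimeIndex(visitor['Date'])
visitor.dtypes

This makes sure the dataset is a time series.

# Keep only the date and the visitor count.
visitor = visitor.drop(['percentage change'], axis=1)

I am dropping the percentage change column and leaving only the date and the number of visitors.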

We will predict the number of visitors for future dates, which we specify later.

# Prophet expects the date column to be named 'ds' and the target column 'y'.
visitor = visitor.rename(columns={'Date': 'ds', 'visitors': 'y'})
visitor.head(5)

The columns must be renamed ‘ds’, which is the date, and ‘y’, which is the variable we are predicting.
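As a minimal sketch of the format described above, the snippet below builds a small frame with made-up numbers and reshapes it into the two columns Prophet expects. The helper name to_prophet_frame and the sample values are hypothetical and only for illustration; the column names 'Date', 'visitors', and 'percentage change' mirror the ones used in this post.

```python
import pandas as pd


def to_prophet_frame(df, date_col, value_col):
    """Return a two-column frame named 'ds' and 'y' (hypothetical helper)."""
    out = df[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])  # dates must be real datetimes
    out["y"] = pd.to_numeric(out["y"])     # the target must be numeric
    return out.sort_values("ds").reset_index(drop=True)


# Made-up sample rows, not the real Iceland figures.
sample = pd.DataFrame({
    "Date": ["2003-01-01", "2001-01-01", "2002-01-01"],
    "visitors": [300, 100, 200],
    "percentage change": [50.0, None, 100.0],
})

prophet_ready = to_prophet_frame(sample, "Date", "visitors")
print(prophet_ready)
```

Selecting the two columns explicitly has the same effect as dropping everything else, which can be convenient when a file carries several extra columns.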

We are ready to use Facebook Prophet but before we do, let’s visualize this and the other two datasets.

Data Visualization

Let’s start by visualizing the first dataset.

The US clearly has a lot of interested tourists. I am not sure why; perhaps word of mouth and cheaper flights contribute to this (airline ticket prices were one thing I wanted to analyze, but I couldn’t find public data).

import matplotlib.pyplot as plt

# Plot every column of the airport dataset over time.
airport.plot(figsize=(15,10))
plt.title('Visitors to Iceland Through Keflavik Airport')
plt.ylabel('Visitors')
plt.xlabel('Date')

Let’s look at dataset #2 now.

Since we have so many different variables, I start with a correlation matrix, sorted in descending order.

There are the obvious suspects, like ‘Air transport, passenger carried’ and ‘International tourism, number of arrivals’, which are highly correlated since more passengers means more arrivals. Others, like ‘Bank liquid reserves to bank assets ratio (%)’ and ‘GNI per capita (current LCU)’, are more surprising and provocative.

# Correlate every indicator with tourist arrivals, strongest first.
corr_matrix = facts.corr()
corr_matrix["International tourism, number of arrivals"].sort_values(ascending=False)

I also tried using a histogram, but it was not as useful as I had thought, so I left it out.
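To make the correlation ranking above concrete, here is a small sketch on made-up numbers. The column names and values are hypothetical stand-ins, not World Bank data; the two calls, corr() and sort_values(), are the same ones used on the real dataset.

```python
import pandas as pd

# Made-up indicator values for five years; not World Bank data.
toy = pd.DataFrame({
    "arrivals": [10, 20, 30, 40, 50],
    "passengers": [12, 19, 33, 41, 48],  # rises with arrivals
    "unrelated": [5, 3, 6, 2, 4],         # no clear trend
    "declining": [50, 41, 29, 22, 10],    # falls as arrivals rise
})

# Pearson correlation of every column with 'arrivals', strongest first.
ranking = toy.corr()["arrivals"].sort_values(ascending=False)
print(ranking)
```

Note that a strongly negative correlation sorts to the bottom of this list, so it is worth reading both ends of the ranking rather than only the top.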

Let’s look at dataset #3 now.

# Plot the visitor counts over time on a linear scale.
ax = visitor.set_index('ds').plot(figsize=(15,10))
ax.set_ylabel('Yearly Number of Airline Passengers')
ax.set_xlabel('Date')
#ax.set_yscale('log')
plt.show()

As Johnny Harris said in the video, you can see the meteoric rise in the number of passengers over time.

Since the values span such a wide range, I replotted the graph on a log scale.

Same code, with the log-scale line uncommented:

ax = visitor.set_index('ds').plot(figsize=(15,10))
ax.set_ylabel('Yearly Number of Airline Passengers')
ax.set_xlabel('Date')
ax.set_yscale('log')  # logarithmic y-axis
plt.show()

Prediction Using Facebook Prophet

While I mostly apply machine learning models from the scikit-learn library because it is easy to use, I wanted to try time series forecasting with Facebook Prophet.

I read an article written by Susan Li and followed the steps in the article below:

https://www.digitalocean.com/community/tutorials/a-guide-to-time-series-forecasting-with-prophet-in-python-3

Let’s set the model to a 95% uncertainty interval.

Here is more information regarding Prophet: https://facebook.github.io/prophet/

from prophet import Prophet  # older releases: from fbprophet import Prophet

# set the uncertainty interval to 95% (the Prophet default is 80%)
my_model = Prophet(interval_width=0.95)

Fitting the model:

my_model.fit(visitor)

Setting future dates.

Notice I put ‘A’ for annual in frequency.

# Extend the history by 10 annual periods.
future_dates = my_model.make_future_dataframe(periods=10, freq='A')
future_dates.tail()

Above, tail() shows the last five future dates that were created, for the years 2022 through 2026.
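To see where those years come from, the sketch below reproduces the date arithmetic with plain pandas instead of Prophet. It assumes the last observation is dated 2017-01-01 (a guess based on the 1949–2017 range, since the parsed dates are not shown here) and uses pd.offsets.YearEnd, the year-end offset that the 'A' alias stands for (newer pandas versions spell the alias 'YE'). The helper name annual_future_dates is hypothetical; it approximates, rather than calls, Prophet's own make_future_dataframe.

```python
import pandas as pd


def annual_future_dates(last_date, periods):
    """Year-end dates strictly after last_date (hypothetical helper)."""
    last_date = pd.Timestamp(last_date)
    # Generate one extra candidate in case last_date itself is a year end.
    dates = pd.date_range(start=last_date, periods=periods + 1,
                          freq=pd.offsets.YearEnd())
    return dates[dates > last_date][:periods]


# Assumed last observation; the real value depends on how the file is parsed.
future = annual_future_dates("2017-01-01", periods=10)
print(future[-5:].year.tolist())
```

Under that assumption, the ten generated dates run from the end of 2017 to the end of 2026, so the last five fall in 2022 through 2026.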

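# Predict over the history plus the future dates, then show the forecast and its bounds.
forecast = my_model.predict(future_dates)
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail()

We get the predicted values (‘yhat’) for our target variable.

‘yhat_lower’ is the lower bound and ‘yhat_upper’ is the upper bound of the uncertainty interval.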
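One way to read those bounds is sketched below on a made-up frame shaped like Prophet's output. All the numbers, the observed series, and the extra column names width and inside are hypothetical and only for illustration; they are not results from the model in this post.

```python
import pandas as pd

# Made-up forecast rows shaped like Prophet's output; not real predictions.
forecast_like = pd.DataFrame({
    "ds": pd.to_datetime(["2024-12-31", "2025-12-31", "2026-12-31"]),
    "yhat": [100.0, 110.0, 120.0],
    "yhat_lower": [90.0, 95.0, 100.0],
    "yhat_upper": [110.0, 125.0, 140.0],
})

# Width of the uncertainty interval for each forecast date.
forecast_like["width"] = forecast_like["yhat_upper"] - forecast_like["yhat_lower"]

# Hypothetical observed values, checked against each row's interval.
observed = pd.Series([105.0, 130.0, 101.0])
forecast_like["inside"] = observed.between(
    forecast_like["yhat_lower"], forecast_like["yhat_upper"]
)
print(forecast_like[["ds", "width", "inside"]])
```

A wider interval further into the future signals less certainty, and comparing later observations against the bounds is a simple sanity check on a forecast.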

my_model.plot(forecast, uncertainty=True)

Above, we have a forecast plot.

Prophet also allows us to break our forecast into its components:

Final Words

While traveling to Iceland might remain a fantasy, since I exhausted my holiday allowance early in the year, it has been interesting to use public data sources to better understand why Iceland has become such a popular tourist destination lately.

I would have loved to cross-reference dataset #3 with a historical weather dataset.

I tried using OpenWeatherMap’s API, which is easy to use, but you must pay for its full functionality.

Iceland is definitely on my bucket list!
