Using Geotab’s Open Datasets: Visualizing Results Using Python and Colab Notebooks

Well, we’ll make use of shapefiles, specifically from an open-source library called “us.” Since we are using a Google Colab Notebook, we are not going to download any of the files to our local storage; instead, we are going to access the data via a zip file that the library provides a URL to:

url = us.states.NY.shapefile_urls('county')
print('Downloading shapefile.')
!wget $url
z = zipfile.ZipFile('/content/tl_2010_36_county10.zip')
print("Done")
filenames = [y for y in sorted(z.namelist()) for ending in ['dbf', 'prj', 'shp', 'shx'] if y.endswith(ending)]
print(filenames)
!unzip /content/tl_2010_36_county10.zip
dbf, prj, shp, shx = [filename for filename in filenames]
NY = gpd.read_file(shp)
print("Shape of the dataframe: {}".format(NY.shape))
print("Projection of dataframe: {}".format(NY.crs))
NY.head()

The resulting table from our preceding code.

Awesome, so we now have our shapefile data for each County of New York State.
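The list comprehension that picks out the four shapefile components can be tried on its own, without downloading anything. The sketch below builds a small in-memory archive and filters it by extension. The member names are made up for illustration (they only mimic the naming of the real archive), and `shapefile_members` is a hypothetical helper name, not part of any library.

```python
import io
import zipfile

# Extensions of the shapefile components the article unpacks.
SHAPEFILE_ENDINGS = ('.dbf', '.prj', '.shp', '.shx')


def shapefile_members(archive):
    """Return the sorted member names that end in a shapefile extension."""
    # str.endswith accepts a tuple, so one pass over the names is enough.
    return [name for name in sorted(archive.namelist())
            if name.endswith(SHAPEFILE_ENDINGS)]


# Build a throwaway archive in memory; the names are illustrative only.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    for name in ['demo_county10.shp.xml', 'demo_county10.shx',
                 'demo_county10.dbf', 'demo_county10.shp',
                 'demo_county10.prj']:
        archive.writestr(name, b'')

with zipfile.ZipFile(buffer) as archive:
    dbf, prj, shp, shx = shapefile_members(archive)

print(dbf, prj, shp, shx)
```

Sorting first is what makes the four-way unpacking safe: the names come back in the order dbf, prj, shp, shx, and the `.shp.xml` metadata file is skipped because it does not end in any of the four extensions.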

Now we need to combine both dataframes so that the shapefile data is associated with the AverageIdleTime data for each corresponding County.

To do this, I made a helper method:

def countyToIndex(county):
    for i in range(len(df['County'])):
        if county == df['County'][i]:
            return i
        else:
            print("The index " + str(i) + " does not correspond to the County " + county)

countyToIndex() takes a County Name (as a string) as input, and it returns the corresponding index to the row which contains the specified County in Geotab’s dataframe “df”.
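countyToIndex() scans the column from the top on every call and prints a message for each row that does not match. A minimal sketch of an alternative, assuming the county names are available as a plain sequence, is to build the name-to-index lookup once and reuse it. The names `build_county_index` and `sample_counties` are hypothetical, and the sample values are not Geotab data.

```python
def build_county_index(county_names):
    """Map each county name to the position of its first occurrence."""
    index_by_name = {}
    for position, name in enumerate(county_names):
        # setdefault keeps the first position if a name appears twice,
        # matching a scan that returns on the first hit.
        index_by_name.setdefault(name, position)
    return index_by_name


# Stand-in for df['County']; these values are illustrative only.
sample_counties = ['Albany', 'Kings', 'Queens', 'Kings']
county_index = build_county_index(sample_counties)

print(county_index.get('Queens'))   # 2
print(county_index.get('Suffolk'))  # None, since the name is absent
```

Using `.get()` makes a missing county explicit as `None` rather than leaving it to a printed message.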

Making use of our helper method, let’s make our final dataframe:

NY['AverageIdleTime'] = np.zeros(len(NY))
for i in range(len(NY)):
    # .loc writes into NY directly; chained indexing such as NY['AverageIdleTime'][i] may assign to a temporary copy
    NY.loc[i, 'AverageIdleTime'] = df['AverageIdleTime'][countyToIndex(NY['NAME10'][i])]
NY.head()

If we scroll to the right, we can see the newly added column “AverageIdleTime” which corresponds to each County.
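The row-by-row loop can also be expressed as a single lookup in pandas. The sketch below is one possible alternative, not the approach used in this article: it uses two small stand-in tables with made-up values, and it assumes each county name appears only once in the idle-time table. Since a GeoDataFrame is a pandas DataFrame subclass, the same `map` call should apply to NY as well.

```python
import pandas as pd

# Stand-ins for the two tables; the values are made up for illustration.
shapes = pd.DataFrame({'NAME10': ['Albany', 'Kings', 'Queens']})
idle = pd.DataFrame({'County': ['Queens', 'Albany'],
                     'AverageIdleTime': [4.5, 2.0]})

# Build a County -> AverageIdleTime lookup, then map it onto NAME10.
# Assumption: County values are unique, otherwise map() cannot align them.
idle_by_county = idle.set_index('County')['AverageIdleTime']
shapes['AverageIdleTime'] = shapes['NAME10'].map(idle_by_county)

print(shapes)
```

Counties with no match (Kings in this toy data) end up as NaN instead of raising an error, which makes gaps in the data easy to spot before plotting.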

Finally, let’s visualize our data using Matplotlib:

fig, ax = plt.subplots(1, 1)
NY.plot(column='AverageIdleTime', ax=ax, legend=True, cmap='Reds')

Here we can see NY State segmented by County, with the shade of red indicating the amount of average idle time.

There you have it! Try playing around with what metrics you’re measuring. For example, what about road impediments? Temperature?

Geotab provides many datasets which are perfect for beginners and seasoned data scientists alike.

Feel free to examine their website for further in-depth examples of how you can use this data and why it matters.

Use your imagination and always consider realistic use-cases.

In the end, your graphs can be pretty, but if they offer no valuable insights, who cares?
