Databricks, AWS, and SafeGraph Team Up For Easier Analysis of Consumer Behavior

The columns related_same_month_brand and related_same_day_brand report an index of how frequently visitors to a POI also visit other brands (relative to the average visitor rate to that brand).

Here we look at what other brands are frequently visited by customers of Starbucks.

The larger the index, the more frequently Starbucks customers visit that brand.
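As a rough illustration of how such an index column might be ranked, the sketch below parses a few related_same_month_brand values and orders brands by their mean index across POIs. It assumes each value is a JSON object mapping brand name to index. The sample rows, the function name top_cross_shopping_brands, and the choice of a simple mean are illustrative assumptions, not the notebook's actual code.

```python
import json
from collections import defaultdict

def top_cross_shopping_brands(rows, n=5):
    """Rank brands by their mean index across a list of JSON-encoded POI rows."""
    totals = defaultdict(float)
    for raw in rows:
        # Assumed format: '{"Brand name": index, ...}'
        for brand, index in json.loads(raw).items():
            totals[brand] += index
    # A brand absent from a POI's map counts as 0 for that POI.
    means = {brand: total / len(rows) for brand, total in totals.items()}
    # Highest index first; ties broken alphabetically for a stable order.
    return sorted(means.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Hypothetical values for two POIs of the same brand.
sample_rows = [
    '{"Brand A": 40, "Brand B": 10}',
    '{"Brand A": 20, "Brand C": 30}',
]
print(top_cross_shopping_brands(sample_rows, n=2))
```

On a real dataset the same aggregation would more likely be done in Spark; plain Python is used here only to keep the sketch self-contained.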

Although Starbucks is a national chain, cross-brand shopping is highly influenced by local geography.

Here we show the top 5 cross-shopping brands for Starbucks customers in California, New York, and Texas.

Only McDonald’s is in the Top 5 of all 3 states.
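A comparison like this one can be sketched as a set intersection over per-state top-N lists. The example below assumes the cross-shopping index has already been aggregated into one brand-to-index map per state. The state maps, brand names, and the function name brands_in_every_top_n are hypothetical placeholders, not values from the SafeGraph data.

```python
def brands_in_every_top_n(index_by_state, n=5):
    """Return the brands that appear in the top-n list of every state."""
    top_sets = []
    for brand_index in index_by_state.values():
        # Highest index first; ties broken alphabetically.
        ranked = sorted(brand_index, key=lambda b: (-brand_index[b], b))
        top_sets.append(set(ranked[:n]))
    return set.intersection(*top_sets) if top_sets else set()

# Hypothetical aggregated indices, not real SafeGraph values.
index_by_state = {
    "CA": {"Brand A": 50, "Brand B": 40, "Brand C": 10},
    "NY": {"Brand A": 30, "Brand C": 25, "Brand D": 5},
    "TX": {"Brand B": 45, "Brand A": 35, "Brand D": 20},
}
print(brands_in_every_top_n(index_by_state, n=2))
```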

Analyzing a Brand’s Customer Demographics

You can use SafeGraph data from AWS Data Exchange in Databricks to analyze the customer demographics of individual POIs or brands.

For a deep dive on the methodology, along with a more complete statistical analysis, feel free to read this workbook.

Here we analyze Starbucks customer demographics along the Race demographic dimension using data available from SafeGraph in AWS Data Exchange.

This analysis could be repeated for any demographic information tracked by the Census, and reported at the census block group level.

That includes Ethnicity, Educational Attainment, Household Income, and much, much more.

To do this analysis we will use: Census data (from Open Census Data); SafeGraph Patterns data, specifically the visitor_home_cbgs column; and SafeGraph Panel Overview data.

The y-axis shows the % of total visitors for each demographic segment.

The baseline demographics of the United States are shown as a reference.
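The calculation behind a chart like this can be sketched as a visitor-weighted average of each home CBG's census composition, compared against a population-wide baseline. The example assumes visitor_home_cbgs is a JSON object mapping CBG id to visitor count, and that census counts are available per CBG and segment. The CBG ids, segment names, counts, and function names below are hypothetical, and the sketch omits the sampling-bias correction discussed in the article.

```python
import json

def visitor_demographic_shares(visitor_home_cbgs_json, census):
    """Percent of visitors per segment, weighting each home CBG's
    census composition by the number of visitors from that CBG."""
    visitors = json.loads(visitor_home_cbgs_json)
    weighted = {}
    total_visitors = 0
    for cbg, count in visitors.items():
        segments = census.get(cbg)
        cbg_total = sum(segments.values()) if segments else 0
        if cbg_total == 0:
            continue  # skip CBGs with no usable census data
        total_visitors += count
        for segment, pop in segments.items():
            weighted[segment] = weighted.get(segment, 0.0) + count * pop / cbg_total
    return {s: 100 * w / total_visitors for s, w in weighted.items()}

def baseline_shares(census):
    """Percent of the total census population in each segment."""
    totals = {}
    for segments in census.values():
        for segment, pop in segments.items():
            totals[segment] = totals.get(segment, 0) + pop
    grand_total = sum(totals.values())
    return {s: 100 * t / grand_total for s, t in totals.items()}

# Hypothetical census counts and visitor counts.
census = {
    "cbg_1": {"segment_x": 80, "segment_y": 20},
    "cbg_2": {"segment_x": 50, "segment_y": 50},
}
visitor_home_cbgs = '{"cbg_1": 30, "cbg_2": 10}'

print(visitor_demographic_shares(visitor_home_cbgs, census))
print(baseline_shares(census))
```

Note that this estimates the demographics of visitors' home areas, not of the visitors themselves, which is why the article speaks of "home CBGs" throughout.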

SafeGraph Patterns shows interesting differences between the census area demographics of Starbucks customers compared to the overall USA population

SafeGraph Patterns data shows that on average, the home census block groups (CBGs) of Starbucks customers are 78.4% White, whereas the USA population is only 73.3% White.

In other words, the home census areas of Starbucks customers are a larger fraction White than the US population.

The home CBGs of Starbucks customers are a larger fraction Asian, compared to the USA population.

The home CBGs of Starbucks customers are a smaller fraction Black or African American compared to the overall USA average.

  Importantly, these differences are not due to geographic sampling bias in the SafeGraph dataset.

  It is true that the SafeGraph dataset has some small geographic biases.

For a full report see “What about bias in the SafeGraph dataset?”.

However, we are able to measure and correct the small effects of sampling bias in the SafeGraph dataset as part of the cbg_adjust_factor calculation.

If the differences observed were due solely to geographic sampling bias in the SafeGraph dataset, then they would disappear after the correction.

The differences that remain cannot be attributed to sampling bias.
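To make the idea of the correction concrete, here is a minimal sketch of one common form of reweighting: each CBG's visitor count is scaled by the ratio of its share of the census population to its share of panel devices. This formula is an assumption for illustration only. The actual cbg_adjust_factor calculation is defined in the workbook and may differ, and all names and numbers below are hypothetical.

```python
def cbg_adjust_factors(census_population, devices_residing):
    """Assumed form: (CBG share of population) / (CBG share of panel devices).
    A factor above 1 means the CBG is under-represented in the panel."""
    total_pop = sum(census_population.values())
    total_dev = sum(devices_residing.values())
    factors = {}
    for cbg, pop in census_population.items():
        dev = devices_residing.get(cbg, 0)
        if dev == 0:
            continue  # no panel devices, so no factor can be computed
        factors[cbg] = (pop / total_pop) / (dev / total_dev)
    return factors

def adjust_visitors(visitors, factors):
    """Scale raw visitor counts by each CBG's adjustment factor."""
    return {cbg: n * factors[cbg] for cbg, n in visitors.items() if cbg in factors}

# Hypothetical panel: cbg_1 is over-sampled relative to cbg_2.
census_population = {"cbg_1": 1000, "cbg_2": 1000}
devices_residing = {"cbg_1": 150, "cbg_2": 50}
raw_visitors = {"cbg_1": 30, "cbg_2": 10}

factors = cbg_adjust_factors(census_population, devices_residing)
print(adjust_visitors(raw_visitors, factors))
```

In this toy case the raw counts differ only because of how the panel was sampled, so they even out after adjustment. A difference that survives the adjustment is what the article attributes to something other than sampling bias.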

For a thorough discussion on this methodology, see A Workbook to Analyze Demographic Profiles from SafeGraph Patterns Data.

Summary

Reading SafeGraph data from AWS Data Exchange into Databricks is quick and easy.

Combining these technologies and datasets enables you to answer powerful and precise questions about consumer behavior.

Thanks for reading!

Want to get more SafeGraph data? There are over 20 datasets available for free or for purchase in AWS Data Exchange.

Check them out! And you can download CSVs for data on over 6MM points-of-interest at the SafeGraph Data Bar.

Use coupon code SafeGraphAWSDatabricksNotebook for $200 of free data.

Questions on this notebook? Drop us a line at datastories+aws@safegraph.com.
