User Segmentation and Power User Analysis in SQL

What actions do you look for?Power Users Bring You the Most ValueAs a consultant working with different clients, I run into various terms for power users — super users, influencers, whales, alpha users, etc.

It doesn’t matter what you call them, but it’s important to understand which user actions you have to look for in order to define this powerful category.

The idea behind it is simple: your power users bring you the most value.

How you define the value depends on your company size, sector, location, or product/service.

Since we are speaking of user growth here, you have to aim for increased user acquisition.

Therefore, the power users would share/promote your content and bring you the most new users.

A common approach for defining the power user in marketing is to follow this process flow/funnel:Action: A baseline user action event.

The action can be a purchase, an order, a signup, or even a simple “view event” if it fits your product model.

Share: Another user action, but dedicated to sharing your product or your content on social media or via other channels you offer.

Recruits: Users who are recruited by a Share action.

Usually these are new signups who come from the Share by a parent user.

A parent user is often called a recruiter, and the new clients they bring are called recruits.

Influence score: Defined complex metric that shows the proportion of recruited users for each Share action event.

Getting Power User Data in SQLRegardless of how you collect and store data, you have to end up with a clean set of aggregated Action, Share, and Recruits events in order to calculate the Influence Score.

Often data comes from various channels, like Salesforce or Google Analytics, and some of these events may already be stored in RDBMS.

Therefore, if the data is stored in different tables, you can create a temporary table or a view in a database for easier data acquisition.

Below I will demonstrate querying the power user category using PostgreSQL and RedShift database (Amazon RedShift data warehouse).

Step 1: Get clean Action, Share, and Recruits data for the right period of time (I am using 6 months of data):As you can see, the first subquery returns Recruits data along with recruiter id.

The second subquery gives Shares as well as share channels and share timeline, which I aggregate as days_shared.

The third subquery simply fetches Action data for the same period of time.

Make sure you query these three buckets for the same period of time.

Step 2: Assign an Influence Score and connect all subqueries together:First, I connect Action, Shares, and Recruits subqueries and calculate a few interim metrics such asRecruit Score: Simply total number of recruits for your product/featureShare Score: Returns the share data for your product/featureDays Shared Score: Additional metric describing sharing timelineChannel Shared Score: Additional metric displaying popular sharing channelsAfter that, I assign the Influence Score for each user ID using the PostgreSQL window NTILE function, which is calculated based on the described scores below:NTILE splits data into calculated groups and help us rank the data from high to low.

The highest score will be returned as 1, and the lowest as 100.

Once again, we receive a proportion of users who have shared the most and recruited the most new users to the platform.

The final step is to fetch users who have the highest Influence Score (starting with 1).

These will be our power users.

As the final step, you can filter this user category to include only specific returning users, or to include only one product feature you want to explore or test.

Using the same approach, you can locate other user categories besides power users.

You can define passive or churning user categories as well, and then compare them to find opportunities for bringing them back.

Quantitative user profile analysis opens a lot of potential for crafting your product strategy.

.

. More details

Leave a Reply