Learning Analytics

Sambit Das, May 25

Wow.

This is a fascinating topic to delve into.

Using the eight-step methodology for workforce analytics proposed by Nigel Guenole, Jonathan Ferrar and Sheri Feinzig in their book “The Power of People”, I will explore 2 case studies in this article.

Step 1 : Frame Business Questions

ColesX is a fictitious Australian supermarket, retail and consumer services chain headquartered in Melbourne.

New employees are expected to attend a two-day seminar to learn about the company and its policies on workplace health and safety, workplace bullying and harassment, the buddy program, IT, data privacy regulations and so on.

At the end of the seminar, they are tested to measure their knowledge of the company and its policies.

The traditional new starter training has been an in-person lecture with Q&A.

Management has decided to experiment with a new e-learning based training for new employees, which could take up to 1 day to complete.

If this experiment works, it could save the company thousands of dollars over several years.

However, senior management is keen to determine whether the new e-learning training program is more effective than the 2 day face-to-face method before implementing it globally.

ColesX has a lot of employees who work on the checkout counters.

The company periodically benchmarks the scan rates of its employees against the target scan rate.

Scan rate is the average number of items an employee scans per minute over the week.

The management recently decided to introduce a special training program to share best practices in improving scan-rates.

This training was optional for employees to attend in the first stage.

ColesX is very aware that a slight increase in the average scan rate of its employees will give a significant boost to monthly sales.

The organization is keen to study the impact of the new training on scan rate performance.

Based on these events, we frame our business questions for ColesX as:

1. How effective is the 1 day new starter program in comparison to the traditional 2 day face-to-face program?

2. How effective is the new special scan rate improvement training program, and what factors are impacting scan rate performance?

Step 2 : Setting up Hypotheses

To test the difference between the two training methods, managers selected a group of 15 newly hired employees to take the 2 day seminar (Method A) and a second group of 12 new employees to take the 1 day e-learning (Method B).

The test scores were collected from all the new hires for each of the training methods and will form part of further analysis.

To test the effectiveness of the new special scan rate training program, scan rates were measured for all employees before and after the training was provided.

Hypotheses are informed, testable explanations or predictions that address the business questions and support the data gathering and analysis.

For our case study, the following hypotheses have been identified:

The average test scores of new starter training methods A and B DO NOT differ.

The scan rates of employees before and after the special training program DO NOT differ.

The analytics work carried out below will attempt to reject the above hypotheses.

If we succeed, and the difference points in the right direction, we will have evidence that the change is working.
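Stated a little more formally, the two null hypotheses and their alternatives can be written as below. The symbols are my own shorthand and an assumption, not notation from the book: \mu_A and \mu_B are the mean test scores under Methods A and B, and \mu_{\text{before}} and \mu_{\text{after}} are the mean scan rates before and after the special training.

```latex
\begin{align*}
H_0^{(1)} &: \mu_A = \mu_B
  & H_1^{(1)} &: \mu_A \neq \mu_B \\
H_0^{(2)} &: \mu_{\text{after}} = \mu_{\text{before}}
  & H_1^{(2)} &: \mu_{\text{after}} \neq \mu_{\text{before}}
\end{align*}
```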

At this stage a business owner should review and sign off the business questions and hypotheses so that the data science team can start the analytical work.

Step 3 : Gathering Data

For our new starter training case study, the test scores from training methods A (2 day seminar) and B (1 day e-learning) have been collected for every new hire in both groups.

For our scan rate training program case study, several data points have been captured.

The data items consist of worker ID, edu (number of years of education undertaken), disability (0 means no disability and 1 means some form of associated disability), gender, ScanPminute (average number of items scanned per minute this week), Training (0 = employee did not undertake training, 1 = employee undertook training) and ScanPminuteTime2 (average number of items scanned per minute in the week after training).

The age fields relate to specific age categories of employees (such as Age1 = 16–19 years, Age2 = 20–24 years, and so on).

Step 4 : Conduct Analysis

New starter program effectiveness analysis

The analysis tests the average scores in the dataset to see whether they differ by training method.

This is where statistics comes into play, with the introduction of the independent samples t-test.

Independent: because the two groups of test scores, one per training method, have no dependency on each other. New employees were split into separate groups.

T-test: used to compare the means of different groups and determine whether they differ statistically.

However, the standard t-test assumes equality of variances across the groups, which means the variation in test scores within the sample groups for training method A and training method B is roughly the same.

Let's test the equality of variances in the dataset using Levene's test.

Based on the significance result (p-value greater than 0.5, far above the conventional 0.05 cutoff), we cannot reject equality of variances, so we can assume that the variance of test scores does not differ meaningfully between the 2 sample groups.
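For readers who want to see what Levene's test actually computes, here is a minimal sketch in plain Python. It is not the R code from my repository; the function name levene_w and the scores are made up for illustration (only the group sizes of 15 and 12 come from the case study). It uses absolute deviations from each group mean and compares the statistic against the approximate 5% critical value of the F distribution with (1, 25) degrees of freedom.

```python
from statistics import mean

def levene_w(*groups):
    """Levene's W statistic based on absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every score from its own group mean.
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    group_means = [mean(zs) for zs in deviations]
    grand_mean = mean([z for zs in deviations for z in zs])
    # Spread of the group-level mean deviations around the grand mean.
    between = sum(len(zs) * (m - grand_mean) ** 2
                  for zs, m in zip(deviations, group_means))
    # Spread of the individual deviations inside each group.
    within = sum((z - m) ** 2
                 for zs, m in zip(deviations, group_means) for z in zs)
    return (n_total - k) / (k - 1) * between / within

# HYPOTHETICAL scores, for illustration only (not the ColesX dataset).
method_a = [44, 52, 47, 50, 41, 49, 53, 46, 48, 45, 51, 43, 50, 47, 54]
method_b = [55, 60, 52, 58, 57, 53, 61, 54, 59, 56, 50, 57]

F_CRIT_1_25 = 4.24  # approximate 5% critical value of F(1, 25)
w = levene_w(method_a, method_b)
print(f"W = {w:.3f}; equal variances plausible: {w < F_CRIT_1_25}")
```

A W below the critical value corresponds to a p-value above 0.05, which is the situation described above where equal variances can be assumed.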

Now let's run the all-important t-test on our first hypothesis: there is no significant difference in test scores between the two training methods.

Wow: the mean test scores in Group A and Group B differ by at least 8 points.

Training method B (1 day e-learning) has a significantly higher mean score of 56.


The t-test result also has a significant p-value of less than 0.05, which means the hypothesis that the average test scores of both training programs are the same can be rejected.

In other words, the 1 day e-learning training method is proving more effective than the 2 day seminar, as measured by test scores.
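The independent samples t-test used here can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not my R code: the helper name pooled_t and the scores are hypothetical, and instead of an exact p-value the sketch compares the statistic with the approximate two-sided 5% critical value of the t distribution with 25 degrees of freedom (15 + 12 - 2).

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Independent samples t statistic assuming equal variances in both groups."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    std_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_error, df

# HYPOTHETICAL scores, for illustration only (not the ColesX dataset).
method_a = [44, 52, 47, 50, 41, 49, 53, 46, 48, 45, 51, 43, 50, 47, 54]
method_b = [55, 60, 52, 58, 57, 53, 61, 54, 59, 56, 50, 57]

T_CRIT_25 = 2.060  # approximate two-sided 5% critical value for 25 df
t_stat, df = pooled_t(method_a, method_b)
print(f"t = {t_stat:.3f} on {df} df; reject H0: {abs(t_stat) > T_CRIT_25}")
```

The sign of the statistic shows the direction of the difference: with the groups passed in this order, a negative value means Method B scored higher on average.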


New Scan Rate Training Effectiveness Analysis

Here the analysis looks at which factors are driving the new scan rate measurements taken after the training was conducted.

We are trying to determine whether the variable Training (whether the employee has been on the training course) has any impact on the new scan rates.

We can model this change with a multiple linear regression.

The regression model uses the NewScanrate scores (the scan rates measured after training) as the dependent variable and all the other variables in the dataset as the independent predictor variables.
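To make the idea concrete, here is a small sketch of fitting a multiple linear regression by ordinary least squares in plain Python. It is only an illustration: the function ols, the three predictors chosen and every data row are hypothetical, and the sketch estimates coefficients only. It does not compute the significance tests that a full statistics package reports.

```python
def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    X = [[1.0] + [float(v) for v in r] for r in rows]  # prepend intercept column
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]

# HYPOTHETICAL rows: (prior scan rate, training 0/1, female 0/1).
predictors = [(18, 0, 0), (20, 0, 1), (17, 1, 0), (22, 1, 1),
              (19, 0, 0), (21, 1, 1), (16, 1, 0), (23, 0, 1)]
# HYPOTHETICAL scan rates measured after the training period.
new_scan_rate = [18.5, 21.0, 21.5, 23.0, 19.0, 22.5, 20.0, 23.5]

coefficients = ols(predictors, new_scan_rate)
for name, value in zip(["intercept", "prior rate", "training", "female"], coefficients):
    print(f"{name:>10}: {value:.3f}")
```

The coefficient on training estimates how much the post-training scan rate changes for attendees, holding the other predictors fixed.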

Let’s see what the analysis shows.

The multiple linear regression produces some interesting results (see the * mark in the rightmost column against each variable to identify the significant factors driving scan rates):

Females show progressively better scan rates over time.

Employees in Age1 (16–19 years) and Age6 (40–44 years) show a progressive increase in scan rate performance over time.

As expected, employees with higher existing scan rates produce higher scan rates after the training event.

Most importantly, the Training field is highly significant, suggesting that attendance at the new scan rate training program is increasing the scan rates.

Step 5 : Reveal Insights

Cool.

So our analysis has been successful in rejecting both hypotheses.

Based on the analysis, it can be concluded that:

The 1 day e-learning new starter training is more effective than the 2 day face-to-face training.

The new scan rate training is effective in boosting the scan rate performance of employees after training.

But wait: on the second insight, our study showed that female employees already showed better scan rate performance over time.

To determine the effectiveness of the new training on scan rates, let's examine the interaction between gender and training and see how scan rates improved post training.

We do so by developing an interaction plot of the scan rates measured after training against the interaction variables of Training (taken: yes or no) and gender.

What we find is that the scan rates of male employees improved significantly, from around 18 to 22, for those who took the training.

The scan rates for females stayed nearly the same with or without the training.
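An interaction plot is simply a picture of the mean outcome in each gender-by-training cell. The sketch below computes those cell means and the training effect per gender in plain Python. The records are made up to mirror the pattern described above and are not the real dataset, and the helper names cell_means and training_effect are my own for this illustration.

```python
from collections import defaultdict
from statistics import mean

def cell_means(records):
    """Mean post-training scan rate for each (gender, trained) combination."""
    cells = defaultdict(list)
    for gender, trained, rate in records:
        cells[(gender, trained)].append(rate)
    return {key: mean(values) for key, values in cells.items()}

def training_effect(means, gender):
    """Difference in mean scan rate between trained and untrained employees."""
    return means[(gender, 1)] - means[(gender, 0)]

# HYPOTHETICAL records: (gender, trained 0/1, scan rate after the training period).
records = [
    ("M", 0, 17.5), ("M", 0, 18.5), ("M", 0, 18.0),
    ("M", 1, 21.5), ("M", 1, 22.5), ("M", 1, 22.0),
    ("F", 0, 22.0), ("F", 0, 21.5), ("F", 0, 22.5),
    ("F", 1, 22.5), ("F", 1, 22.0), ("F", 1, 22.0),
]

means = cell_means(records)
male_gain = training_effect(means, "M")
female_gain = training_effect(means, "F")
print(f"male gain: {male_gain:.2f}, female gain: {female_gain:.2f}")
# A non-zero gap between the two gains is what non-parallel lines on the plot show.
print(f"interaction (male gain - female gain): {male_gain - female_gain:.2f}")
```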

Hence a third insight has been discovered here.

The introduction of the new scan rate training benefits male employees more than female employees (who are already performing better).

Step 6 : Determine Recommendations

Based on our findings, some very focused recommendations can be developed, such as:

Replace the 2 day face-to-face new starter training with the 1 day e-learning.

This move will save thousands of dollars for the company while making the process more effective.

A cost model can be developed here to compare the cost of the 2 day training with the 1 day training for future tracking and evaluation.

Roll out the new scan rate performance training for all employees.

Track attendance at the sessions, especially among male employees, and encourage them to take up this training at the stores.

Introduce adoption schemes such as bonus programs to drive more participation in the training, and increase the target scan rates to drive better overall performance.

Step 7 : Get your Point Across

This section deals with turning recommendations into actions within a company.

This is where a strong project sponsor champions the analytics cause and makes sure that the recommendations are implemented and communicated properly.

Change management, training and communications all form part of driving the adoption of the recommendations within the company.

Step 8 : Implement and Evaluate

Periodically, let's say every 6 months during the implementation period, the test scores of the 1 day new starter training program, cost savings, scan rate performance and sales performance are all reviewed against the KPIs set at the start to measure the ongoing effectiveness of the program.

The financial benefits measurement forms a key part of the evaluation of such a project.

Reporting savings from an analytics project is key to developing the business case for more analytics based projects within the organization.

References

Predictive HR Analytics: Martin R Edwards and Kirsten Edwards.

The Power of People: Nigel Guenole, Jonathan Ferrar and Sheri Feinzig.

The dataset for the 2nd case study is taken from the Predictive HR Analytics book by Martin R Edwards and Kirsten Edwards.

The R code developed by me and the datasets are in my GitHub account: https://github.com/Sambit78/People-Analytics-Project/tree/master/06%20-%20Training%20Analytics
