To answer this question, we look at the so-called Gini coefficient. This is primarily used in economics to reflect equality…
What’s up with traffic on the roads of Bangalore?
In this article, let us dive deeper into the data on traffic of Bangalore, with the objective to gain insights…
What’s in a BGG rating?
Certainly the rating measures how people decide to rate the game, but what can we conclude from that? This article is…
G-computation in Causal Inference
A counterfactual method for causal inference. Yao Yang, Jun 9. In my previous post ‘Targeted Maximum Likelihood (TMLE) for Causal…
Training a Convolutional Neural Network from scratch
Time to get into it. We’ll pick back up where my introduction to CNNs left off. We were using a…
A simple example of using Spark in Databricks with Python and PySpark.
German Gensetskiy, May 28. Apache Spark is an open-source distributed general-purpose…
Python in Finance
Let’s move ahead to understand and explore this data further. Exploratory Data Analysis on Stock Pricing Data: With the data in our…
A Very Simple Data Integration Project: Rebrickable LEGO Datasets
Emanuele Sicurella, May 20. I perfectly remember the first time I got an injection. I…
Which factors influence Airbnb pricing in Boston?
Tobias Gorgs, May 5. (Photo: Boston skyline, by Zoltan Kovacs on Unsplash.) Airbnb is the flagship of the sharing economy…
Decoding College Admissions
Jorge Castañón, Ph.D., Apr 8. (Photo by Pang Yuhao on Unsplash.) Most of us would agree that college admission can…
Employee Turnover: a Risk Segmenting Investigation
This was different from my initial train of thought that the employees were potentially overworked. Given this finding, I come…
Averages are Meaningless*
Using the average — the representative value. Count soldiers in a representative company, multiply by number of companies and you have the…
Exploring chromatic storytelling in movies with R (Part I)
This is basically the reason why video sources are so heavy, by the way. Converting Uma Thurman from frames to…
Generating critical scenarios using Anomaly Detection
Well, there are several methods to perform anomaly detection, such as density-based anomaly detection, clustering-based anomaly detection and…
Which is a better investment: real estate vs stocks
And how many people are telling you to invest in it now that it’s 416,870% higher? As investors we tend to…
Implementing Moving Averages in Python
Source: Unsplash. The most commonly used Moving Averages (MAs) are the simple and exponential moving average. Simple Moving Average (SMA) takes the…
How I created a dashboard for faster searching on Airbnb — The Power of Tableau Dashboard
Ellie Wang, Mar 25. If you are planning your…
Introduction to Exploratory Data Analysis (EDA)
Could engine size possibly predict the price of the car? A great way to visualize this relationship would be to use…
Feature Engineering Time
Let’s just describe direction as a 2D vector (x, y), not a single scalar that is the “heading”. Yep, let’s…
Hypothesis testing in the Northwind dataset using ANOVA
Locating the most profitable customers. Laura Lewis, Mar 9. Project aim: As part of a project on the…
Sentiment Analysis of Anthem Game Launch in Python
Langdetect by Mimino66's detect function is all we need to identify the language of our tweets. We can load in…
Average User Fallacy
{np.corrcoef(corr_data)}")
>>> Correlation of randomly generated data – [[ 1. -0.02200527] [-0.02200527 1. ]]
Correlation of correlated data –…
Building a GradientBoostingRegressor to predict NBA player salaries
A brief guide to model validation/tuning and how to explain your model outputs. Steven Liu, Feb 4. Stephen…
How to go from bias to buyer
No, I was not. Imagine a distribution of apartments ranging from underpriced to overpriced. The underpriced apartments sell quickly and…