Ten years ago I wrote about how cosine makes a decent approximation to the normal (Gaussian) probability density. It turns…
normal
Statistics for Data Science: What is Normal Distribution?
Introduction to the Normal Distribution: Have you heard of the bell curve? It tends to be among the most discussed…
Truncated distributions vs clipped distributions
In the previous post, I looked at truncated probability distributions. A truncated normal distribution, for example, lives on some interval…
Basics of Independent Component Analysis
From a visual perspective, it feels pretty clear that there are two populations with two linear trends. The two groups…
Ever Wondered Why Normal Distribution Is So Important?
What is the logic behind it? The idea revolves around the theorem that when you repeat an experiment a large number of…
Understanding Gaussian Classifier
This leads to a multivariate normal distribution, the equation of which is given below: Σ is a covariance matrix. Function symbol…
Why correlation might tell us nothing about outliers
Gevorg Yeghikyan, Jun 6. Introduction: We often hear claims à la “there is a high correlation between…
Hypothesis Testing Glossary for the Weary Reader
From “alpha” to “z-score”. Steven Rosa, Jan 26. Why So Weary? When I try to read about statistics I…
A stitch delayed — a modest fix for the biggest small problem in data science
(by which I mean if we just wrote down the two numbers — let’s say a mean of 1 kg and a…
Explaining probability plots
In this article I would like to explain the concept of probability plots: what they are, how to implement…
Averages are Meaningless*
Using the average — the representative value. Count soldiers in a representative company, multiply by number of companies and you have the…
Normal approximation to Laplace distribution?
Both distributions are symmetric about their means, so it’s natural to pick the means to be the same. So without…
Ridesharing my way — Uber
$4.3 per mile! The sample sizes for San Francisco, Chicago, and Baltimore aren’t large and hence I will not delve…
Predicting Russian Trolls Using Reddit Comments
Using Machine Learning to Predict Russian Trolls. Brandon Punturo, Jan 8. Introduction: Russia has…
How to Improve Your Network Performance by Using Curriculum Learning
Viviane, Jan 7. The idea of curriculum learning has already been proposed by…
Data science concepts you need to know! Part 1
There are a number of approaches here; I will present two of the more common methods. Let’s take the following data…
How not to be afraid of Vim anymore
Figure out the command to make them happen! Delete the next 3 lines (including current line). Copy current word, cursor is at…
Data Retention: Handling Data with Many Missing Values and Less Than 1000 Observations
Asel Mendis, Nov 5. The data used in the current…