Bagging is an ensemble algorithm that fits multiple models on different subsets of a training dataset, then combines the predictions…

# random

## Convergence rate of random walk on integers mod p

There’s a theorem that says you need about p² steps. We’ll give the precise statement of the theorem shortly, but…

## A Gentle Introduction to Bayesian Belief Networks

Probabilistic models can define relationships between variables and be used to calculate probabilities. For example, fully conditional models may require…

## Discrete Probability Distributions for Machine Learning

The probability for a discrete random variable can be summarized with a discrete probability distribution. Discrete probability distributions are used…

## A Gentle Introduction to Probability Distributions

Probability can be used for more than calculating the likelihood of one event; it can summarize the likelihood of all…

## Proving that a choice was made in good faith

This is something I’ve helped companies with. It may be impossible to prove that a choice was not deliberate, but…

## Detecting a short period in an RNG

The last couple posts have been looking at the Cliff random number generator. I introduce the generator here and look…

## Monte Carlo Simulation in R with focus on Option Pricing

Ojasvin Sood, Jun 25. In this blog, I will cover the basics of Monte…

## Random Forest regression model Advanced Topics (+ Python code snippet using Sklearn)

Georgios Drakos, Jun 6. In my previous article, I presented the Random…

## 5 Probability Distributions Every Data Scientist Should Know

Here’s your reward. Source: pixabay. Now that you know what a probability distribution is, let’s learn about some of the…

## Probability Distributions Every Data Scientist Should Know

Here’s your reward. Source: pixabay. Now that you know what a probability distribution is, let’s learn about some of the…

## Random Forest Regression model explained in depth

Georgios Drakos, Jun 3. In my previous article, I presented the Decision Tree Regressor algorithm. If you…

## Understanding Random Forest

Photo by Skitterphoto from Pexels. How the Algorithm Works and Why it Is So Effective. Tony Yiu, Jun 12. A big part of machine…

## Building Intuition for Random Forests

Random Forest, a group of decision trees, is a powerful machine learning algorithm. Rishi Sidhu, May 2. Photo by Vladislav Babienko on Unsplash. It…

## A truly horrible random number generator

I needed a bad random number generator for an illustration, and chose RANDU, possibly the worst random number generator that…

## Seeding Viral Growth: An Application of Graph Embedding

Simulating different seeding techniques to maximize information diffusion in a network. Shaw Lu, Apr 15. Are Influencers…

## Can You Tell Random and Non-Random Apart?

We need some harder evidence. This is where the easy method I alluded to earlier comes into play. If you…

## Probabilistic Graphical Models: Bayesian Networks

The venue, cuisine, distance from home, pricing etc. In general, we can write a custom program to answer our query…

## Google Adiantum and the ChaCha RNG

The ChaCha cryptographic random number generator is in the news thanks to Google’s Adiantum project. I’ll discuss what’s going on,…

## Real-Time Streaming and Anomaly detection Pipeline on AWS

Sharmistha Chatterjee, Feb 25. Streaming Data is data that is generated continuously by thousands of data…

## How I Built a Simple Command Line App in Ruby with ActiveRecord

puts "SUCCESS" else puts "ERROR: EMPTY DATA" end else puts "ERROR: INVALID DATA" end rescue RestClient::ExceptionWithResponse => e err =…

## Explaining Feature Importance by example of a Random Forest

Image source: https://unsplash.com/photos/BPbIWva9Bgo. Eryk Lewinson, Feb 11. In many (business) cases it is equally important to…

## The importance of context in data sets: A short experiment

Using four forecasting methods in the same time series to show…

## Statistics is the Grammar of Data Science — Part 4/5

Statistics refresher to kick start your Data Science journey. Semi Koen, Feb 10. This is the 4th article…

## How to label text for sentiment analysis — good practises

If you haven’t, here’s a great chance of discovering how hard the task is. I am sure that if you…
