He or she found a flaw, something that doesn’t smell right: your discoveries don’t match their understanding of the domain. After all, they…
Predicting Animal Shelter Outcomes
A guide to handling categorical variables in supervised machine learning. Rebecca Vickery, Feb 18. I have been working…
Missing value visualization with tidyverse in R
A short practical guide on how to find and visualize missing data with ggplot2, dplyr, tidyr. Jens…
A Complete Guide to an Interactive Geographical Map using Python
Our World in Data has an extensive collection of interactive data visualizations on aspects dedicated to the global changes in…
3 steps to a clean dataset with Pandas
Only drop it if you’re quite sure it won’t be helpful. Here’s how to do all of those things in Pandas: (2)…
A Gentle Introduction to Exploratory Data Analysis
Or better, what assumptions are you trying to prove wrong? You could spend all day debating these. But best to start…
6 Different Ways to Compensate for Missing Values In a Dataset (Data Imputation with examples)
Popular strategies…
6 Different Ways to Compensate for Missing Values (Data Imputation with examples)
Popular strategies to statistically impute…
Predicting Micronutrients using Neural Networks and Random Forest (Part 1)
and from where? Dataset: Luckily, one of the greatest creations in history, the internet, is filled with many open datasets that we…
Use Unsupervised Machine Learning To Find Potential Buyers of Your Products
multi_level = []
for column in df_copy.columns:
    if feat_info.loc[column].type == 'categorical' and len(df_copy[column].unique()) > 2:
        multi_level.append(column)
for col in multi_level:…
XGBoost is not black magic
If the percentage of missing values in a sample increases, the performance of the built-in strategy could worsen a lot. Ok,…
Getting Data ready for modelling: Feature engineering, Feature Selection, Dimension Reduction (Part 1)
Encoding: So, What and Why is Encoding? Most algorithms we use work with numerical values, whereas more often than not categorical…
Customer Segmentation Report for Arvato Financial Solutions
Elena Ivanova, Dec 3. Capstone Project for Udacity Data Scientist NanoDegree. Introduction: In this project supervised and unsupervised…