Overview: DBSCAN clustering is an underrated yet super useful clustering algorithm for unsupervised learning problems. Learn how DBSCAN clustering works,…

# points

## Square waves and cobwebs

This is a follow-up to yesterday’s post. In that post we looked at iterates of the function f(x) = exp(…

## Surface of revolution with minimum area

Suppose you’re given two points (x1, y1) and (x2, y2) with y1 and y2 positive. Find the smooth positive curve…

## Sinc approximation

If a function is smooth and has thin tails, it can be well approximated by sinc functions. These approximations are…

## Probability that a cubic has two turning points

Most cubic polynomials with real coefficients have two turning points, a local maximum and a local minimum. But how do…

## Lobatto integration

A basic idea in numerical integration is that if a method integrates polynomials exactly, it should do well on polynomial-like…

## Logistic trajectories

This post is a follow-on to the post on how to make the logistic bifurcation diagram below. That post plotting…

## Advantages of redundant coordinates

Barycentric coordinates make some things much simpler. For example, the coordinates of the three vertices are (1, 0, 0), (0,…

## Predicting environmental carcinogens with logistic regression, knn, gradient boosting and molecular fingerprinting

Balancing imbalanced data, exploring accuracy metrics, and an introduction…

## Here’s how you can accelerate your Data Science on GPU

George Seif, Jul 3. Data Scientists need computing power. Whether you’re processing a big…

## “Where today?” — Planning my Singapore trip with clusters

Outlining travel plans with R, k-medoids and Google Maps. Juan De Dios Santos, Jul 1. Planning is not easy.…

## Linear Algebra. Points matching with SVD in 3D space

Andrey Nikishaev, Jun 30. Problem: We need to find the best rotation & translation params between two…

## Suicide in the 21st Century (Part 2)

If you didn’t catch part 1 you can find it below: Suicide in the 21st Century (Part 1). Suicide is not contagious,…

## An overview of different unsupervised learning techniques

Well, there are several ways like cross-validation, information criteria, the information theoretic jump method, the silhouette method, and the G-means…

## Mobility Data, Feature Engineering and Hierarchical Clustering

One concept that beautifully captures the level of randomness in a sequence of events is found in the domain of…

## When Machine Learning Solutions Are Not Possible!

Five Scenarios Every Data Scientist Should Consider before Proposing Machine Learning Solutions. Rasoul Banaeeyan, Jun…

## Best clustering algorithms for anomaly detection

How to use it?” Now we have the clusters… How can we detect anomalies in the test data? The approach I’ve followed to classify…

## SVM: Feature Selection and Kernels

(Source: https://towardsdatascience.com/support-vector-machine-vs-logistic-regression-94cc2975433f) Pier Paolo Ippolito, Jun 2. A Support Vector Machine (SVM) is a supervised machine learning algorithm that…

## Modern Astrophysics At The Forefront of Data Science

So, what can we do with this data? Figure 3. Simplified diagram showing exoplanet transiting in front of the host star…

## An Easy Introduction to SQL for Data Scientists

Inserting data into our table can be done using a command called INSERT followed by the table name and a…

## Using machine learning to understand customers behavior

You were using Euclidean distance. It is the square root of the sum of squared differences between corresponding elements of…

## Outlier Detection and Treatment: A Beginner's Guide

Swetha Lakshmanan, May 8. One of the most important steps in data pre-processing is outlier detection…

## Extracting and Analyzing 1000 Basketball Games using Pandas and Chartify

We will narrow our scope to some specific fields for this project: GameId: This is not crucial for analysis but database-wise…

## K-Means Clustering in SAS

Dhilip Subramanian, May 1. What is Clustering? “Clustering is the process of dividing the datasets into groups, consisting of similar data-points”.…

## Getting started with Visualizations in Python

First things first, don’t even think of relating it with Bar graphs. Histograms are very different from Bar graphs, in the…
