Overview: DBSCAN clustering is an underrated yet super useful clustering algorithm for unsupervised learning problems. Learn how DBSCAN clustering works,…
points
Square waves and cobwebs
This is a follow-up to yesterday’s post. In that post we looked at iterates of the function f(x) = exp(…
Surface of revolution with minimum area
Suppose you’re given two points (x1, y1) and (x2, y2) with y1 and y2 positive. Find the smooth positive curve…
Sinc approximation
If a function is smooth and has thin tails, it can be well approximated by sinc functions. These approximations are…
Probability that a cubic has two turning points
Most cubic polynomials with real coefficients have two turning points, a local maximum and a local minimum. But how do…
Lobatto integration
A basic idea in numerical integration is that if a method integrates polynomials exactly, it should do well on polynomial-like…
Logistic trajectories
This post is a follow-on to the post on how to make the logistic bifurcation diagram below. That post plotting…
Advantages of redundant coordinates
Barycentric coordinates make some things much simpler. For example, the coordinates of the three vertices are (1, 0, 0), (0,…
Predicting environmental carcinogens with logistic regression, knn, gradient boosting and molecular fingerprinting
Balancing imbalanced data, exploring accuracy metrics, and an introduction…
Here’s how you can accelerate your Data Science on GPU
George Seif, Jul 3. Data Scientists need computing power. Whether you’re processing a big…
“Where today?” — Planning my Singapore trip with clusters
Outlining travel plans with R, k-medoids and Google Maps. Juan De Dios Santos, Jul 1. Planning is not easy.…
Linear Algebra. Points matching with SVD in 3D space
Andrey Nikishaev, Jun 30. Problem: We need to find the best rotation & translation params between two…
Suicide in the 21st Century (Part 2)
If you didn’t catch part 1 you can find it below: Suicide in the 21st Century (Part 1). Suicide is not contagious,…
An overview of different unsupervised learning techniques
Well, there are several ways like cross-validation, information criteria, the information theoretic jump method, the silhouette method, and the G-means…
Mobility Data, Feature Engineering and Hierarchical Clustering
One concept that beautifully captures the level of randomness in a sequence of events is found in the domain of…
When Machine Learning Solutions Are Not Possible!
Five Scenarios Every Data Scientist Should Consider before Proposing Machine Learning Solutions. Rasoul Banaeeyan, Jun…
Best clustering algorithms for anomaly detection
How to use it?” Now we have the clusters… How can we detect anomalies in the test data? The approach I’ve followed to classify…
SVM: Feature Selection and Kernels
(Source: https://towardsdatascience.com/support-vector-machine-vs-logistic-regression-94cc2975433f) Pier Paolo Ippolito, Jun 2. A Support Vector Machine (SVM) is a supervised machine learning algorithm that…
Modern Astrophysics At The Forefront of Data Science
So, what can we do with this data? Figure 3. Simplified diagram showing exoplanet transiting in front of the host star…
An Easy Introduction to SQL for Data Scientists
Inserting data into our table can be done using a command called INSERT followed by the table name and a…
Using machine learning to understand customers behavior
You were using Euclidean distance. It is the square root of the sum of squared differences between corresponding elements of…
Outlier Detection and Treatment: A Beginner's Guide
Swetha Lakshmanan, May 8. One of the most important steps in data pre-processing is outlier detection…
Extracting and Analyzing 1000 Basketball Games using Pandas and Chartify
We will narrow our scope to some specific fields for this project: GameId: This is not crucial for analysis but database-wise…
K-Means Clustering in SAS
Dhilip Subramanian, May 1. What is Clustering? “Clustering is the process of dividing the datasets into groups, consisting of similar data-points”.…
Getting started with Visualizations in Python
First things first, don’t even think of relating it with Bar graphs. Histograms are very different from Bar graphs, in the…