# 5 Powerful Scikit-Learn Examples

Herein lies just enough information to make you deadly: 5 models you can learn and apply to become a competent practitioner of Machine Learning.

Luke Posey · Apr 24

*A toolbox full of great machine learning tools*

Below are 5 models.

For each model we observe the model's prediction that an Iris is or isn't a Virginica, and then perform classification. It's fascinating to observe how these models differ.

We will not spend much time going into detailed math for each example. Our goal is to highlight some useful models you can add to your toolbox and to point out some of their differences. We will discuss pros and cons in a subsequent article. No point in ranting, let's jump right in.

Note: Full code for all examples is here.

## 1. Logistic Regression

To explain Logistic Regression we can start by explaining the Sigmoid function.

Sigmoid is at the core of Logistic Regression. Fundamentally, it takes any real-valued input and outputs a value between 0 and 1. Our output P(x) is the probability that our dependent variable falls into a particular class.

In the following output we get the probability that a certain set of observations is or isn’t a Virginica.

We can see that as our petal width feature increases the probability of being a Virginica increases.
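As a minimal sketch of the function itself (the helper name and sample inputs here are my own illustration, not from the original code):

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued input to a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# as the input grows, the output approaches 1 -- mirroring how a larger
# petal width pushes the predicted Virginica probability toward 1
print(sigmoid(-4))  # ~0.018
print(sigmoid(0))   # 0.5
print(sigmoid(4))   # ~0.982
```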

```python
from sklearn.linear_model import LogisticRegression
import numpy as np

log_reg = LogisticRegression(penalty="l2")
log_reg.fit(X, y)  # X: petal width, y: is-Virginica labels

X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = log_reg.predict_proba(X_new)
```

*Virginica Probability*

We can extend Logistic Regression to multiple classes, and it turns out to be very powerful.

In this example we can see a very low classification error amongst the three classes.

```python
softmax_reg = LogisticRegression(multi_class="multinomial", solver="lbfgs", C=5)
softmax_reg.fit(X, y)
pred = softmax_reg.predict(X_test)
```

*Classifying with Logistic Regression*

## 2. Support Vector Machines

Support Vector Machines work by attempting to pass a hyperplane through the dataset, capable of separating the classes.

This can be done on various dimensions.

Check out this article if you’re interested in diving deep into the various details.
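To make the hyperplane idea concrete, here is a toy sketch (the points and the hyperplane are made up by hand, not learned by an SVM — training actually searches for the maximum-margin hyperplane):

```python
import numpy as np

# toy 2D points: class 0 clustered low, class 1 clustered high
X_toy = np.array([[1.0, 1.0], [1.5, 0.5], [4.0, 4.0], [4.5, 3.5]])
y_toy = np.array([0, 0, 1, 1])

# a hyperplane in 2D is a line: w . x + b = 0; this one separates the clusters
w = np.array([1.0, 1.0])
b = -5.0

# classify each point by which side of the hyperplane it falls on
pred_toy = (X_toy @ w + b > 0).astype(int)
print(pred_toy)  # [0 0 1 1]
```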

```python
from sklearn import svm

# probability=True is needed for predict_proba to work on an SVC
clf = svm.SVC(gamma='scale', decision_function_shape='ovo', probability=True)
clf.fit(X, y)

X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = clf.predict_proba(X_new)
```

*SVM prediction probability of Virginica using Petal Width*

```python
clf = svm.SVC(gamma='scale', decision_function_shape='ovo')
clf.fit(X, y)
pred = clf.predict(X_test)
```

*Classifying with our SVC*

## 3. Naive Bayes

Perhaps the simplest of all the models discussed in this article, we now arrive at Naive Bayes.

Naive Bayes is great because it needs only a small amount of data to estimate its parameters.

Naive Bayes applies Bayes’ theorem and is called naive because of the assumption of conditional independence between each feature.
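The "naive" step is easy to show by hand. With conditional independence, P(class | f1, f2) is proportional to P(class) · P(f1 | class) · P(f2 | class). The priors and likelihoods below are made-up numbers for illustration:

```python
# made-up prior and per-feature likelihoods for a flower with
# wide petals and long sepals
prior = {"virginica": 0.5, "other": 0.5}
likelihood = {
    "virginica": {"wide_petal": 0.9, "long_sepal": 0.8},
    "other":     {"wide_petal": 0.2, "long_sepal": 0.3},
}

# multiply the prior by each feature's likelihood (the independence assumption)
scores = {
    c: prior[c] * likelihood[c]["wide_petal"] * likelihood[c]["long_sepal"]
    for c in prior
}
# normalize so the posteriors sum to 1
total = sum(scores.values())
posterior = {c: s / total for c, s in scores.items()}
print(posterior)  # virginica wins with ~0.92
```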

In this example I apply Gaussian Naive Bayes:

```python
from sklearn.naive_bayes import GaussianNB

clf = GaussianNB()
clf.fit(X, y)

X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = clf.predict_proba(X_new)
```

*Naive Bayes prediction probability of Virginica using Petal Width*

```python
clf = GaussianNB()
clf.fit(X, y)
pred = clf.predict(X)
```

*Classifying with Naive Bayes*

## 4. Random Forest

Random Forest is a popular ensemble model used quite frequently.

You can see ensemble models popping up all over the place, especially in Kaggle competitions.

Random forest works by fitting decision tree classifiers on subsamples of the dataset.

It then averages the trees' predictions to achieve better accuracy while controlling overfitting.
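The averaging step can be sketched in a few lines (the per-tree probabilities below are hypothetical, standing in for trees fit on different bootstrap subsamples):

```python
import numpy as np

# hypothetical Virginica probabilities from 5 trees for one flower
tree_probs = np.array([0.9, 0.7, 0.8, 0.95, 0.6])

# the forest's prediction is the average of its trees' predictions
forest_prob = tree_probs.mean()
print(round(forest_prob, 2))  # 0.79

predicted_class = int(forest_prob > 0.5)
print(predicted_class)  # 1 -> Virginica
```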

We set n_estimators to 100, which sets the number of trees in the forest to 100; max_depth sets the maximum depth of each tree.

```python
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_estimators=100, max_depth=2, random_state=0)
clf.fit(X, y)

X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = clf.predict_proba(X_new)
```

*Random Forest prediction probability of Virginica using Petal Width*

```python
clf = RandomForestClassifier(n_estimators=100, max_depth=2, random_state=0)
clf.fit(X, y)
pred = clf.predict(X)
```

*Classifying with Random Forest*

## 5. AdaBoost

Another popular ensemble model: AdaBoost works by fitting many classifiers on the dataset, with increased weights for incorrectly classified instances.

AdaBoost training favors the features that increase the classification power of the model. This acts as a form of dimensionality reduction, which is a plus as long as classification performance is preserved.

```python
from sklearn.ensemble import AdaBoostClassifier

# the constructor call was lost from the original text; a default
# AdaBoostClassifier() is assumed here, following the earlier examples
clf = AdaBoostClassifier()
clf.fit(X, y)

X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = clf.predict_proba(X_new)

pred = clf.predict(X)
```

*Classifying with AdaBoost*