A Comprehensive Introductory Guide to Supervised Learning for the Non-Mathematician

That is to say, the prediction is made based on the value x between the parentheses, meaning that hθ(x(i)) symbolizes the prediction made by the hypothesis on example x(i). The funky Greek symbols you see (θ — these guys) are theta values, more professionally known as parameters. When we feed our hypothesis a training example, each feature is multiplied by its respective theta value. The x axis shows the value of that feature for each training example.

So, when you see J(θ), recall that you are dealing with the cost function (sometimes called the loss function), or the output thereby associated. In machine learning, the letter m is the number of training examples. It is also important to see that the cost, shown on the y-axis, changes with regard to the theta values.

[Figure: Another cost function, but for only one parameter]

Another variation of the loss function is pictured above. It is measuring a hypothesis containing only one parameter and is therefore a two-dimensional visualization. For every theta value, a spatial dimension is added to the function.

Please don't run away. Just like everything we've already covered, I'll be explaining this concept in a simple and organized fashion. Here's what a derivative is in one sentence: an expression representing the relationship between a function and the slope of that function at any point, given the independent variable.

The derivative of x² is 2x. This means that at any value of x on the function x², the slope at that point is 2 * x. Knowing that, we can understand that derivatives represent instantaneous rate of change, as they reveal what the slope is at a single point on a function, rather than between two separate points. When we evaluate a derivative, by plugging the independent variable into the derivative equation, its output is the slope at that point on the function. For example, with the derivative being 2x, the slope of the tangent at x = 1 is 2 * 1 = 2.

Three things to note about derivatives:

1. They have a magnitude (large, representing a steep slope, or small, representing a gradual slope).
2. They have a direction (positive or negative).
3. Derivatives point in the direction of steepest change at the evaluated point on the function.

Keep this in mind, as it is important for the next section.

Now how about gradients? The gradient of a function, just like a derivative, represents the slope at any given point on the function. The information presented here is as follows:

- The slope in the x plane at this coordinate is +3
- The slope in the y plane at this coordinate is -4
- The slope in the z plane at this coordinate is +5

[Figure: Partial derivative visualization]

Above is a partial derivative on a 3-dimensional function. The most important thing to take away is that a gradient represents the slope on a function. A partial derivative tells us the slope with respect to one variable (or dimension) of said function.

Gradient Descent: How Supervised M.L.

If you understood the fundamental concept for each of the previous topics, making sense of gradient descent should come with relative ease. That being said, let's jump into it. The objective of gradient descent is to minimize the cost function.
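To make these ideas concrete, here is a minimal sketch of gradient descent for a one-parameter hypothesis, assuming the usual squared-error cost for J(θ). The toy data, the learning rate, and the function names below are illustrative choices, not something taken from the article itself.

```python
# A minimal gradient descent sketch for a one-parameter hypothesis
# h_theta(x) = theta * x, assuming the squared-error cost
# J(theta) = (1 / (2m)) * sum((h_theta(x_i) - y_i)^2).
# The data and learning rate are made up for illustration.

xs = [1.0, 2.0, 3.0, 4.0]   # feature values for m = 4 training examples
ys = [2.0, 4.0, 6.0, 8.0]   # targets generated by y = 2x, so theta should approach 2
m = len(xs)

theta = 0.0           # start with an arbitrary parameter value
learning_rate = 0.05  # how big a step to take downhill each iteration

def predict(theta, x):
    """The hypothesis h_theta(x): the feature times its theta value."""
    return theta * x

def cost(theta):
    """J(theta): half the average squared gap between predictions and targets."""
    return sum((predict(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

for step in range(100):
    # The derivative of J with respect to theta is the slope of the cost
    # curve at the current theta: its sign gives the direction, its size
    # gives the steepness.
    gradient = sum((predict(theta, x) - y) * x for x, y in zip(xs, ys)) / m

    # Step *against* the gradient, i.e. downhill on the cost function.
    theta -= learning_rate * gradient

print(theta, cost(theta))  # theta ends up close to 2, and the cost close to 0
```

Notice that the loop never needs to see the whole cost curve: at every step it only asks for the slope at the current theta, which is exactly what the derivative (or, with many parameters, the gradient) provides.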
