Uncertainty involves making decisions with incomplete information, and this is the way we generally operate in the world.

Handling uncertainty is typically described using everyday words like chance, luck, and risk.

Probability is a field of mathematics that gives us the language and tools to quantify the uncertainty of events and reason in a principled manner.

In this post, you will discover a gentle introduction to probability.

After reading this post, you will know:

Let’s get started.

What Is Probability?

Photo by Emma Jane Hogbin Westby, some rights reserved.

This tutorial is divided into four parts; they are:

Uncertainty refers to imperfect or incomplete information.

Much of mathematics is focused on certainty and logic.

Much of programming is this way too, where we develop software with the assumption that it will execute deterministically.

Yet, under the covers, the computer hardware is subject to noise and errors that are being checked and corrected all of the time.

Certainty with perfect and complete information is unusual.

It is the place of games and contrived examples.

Almost everything we do or are interested in involves information that sits somewhere on a continuum between certainty and uncertainty or wrongness.

The world is messy and imperfect and we must make decisions and operate in the face of this uncertainty.

For example, we often talk about luck, chance, odds, likelihood, and risk.

These are words that we use to interpret and negotiate uncertainty in the world.

When making inferences and reasoning in an uncertain world, we need principled, formal methods to express and solve problems.

Probability provides the language and tools to handle uncertainty.

Probability is a measure that quantifies the likelihood that an event will occur.

For example, we can quantify the probability of a fire in a neighborhood, a flood in a region, or the purchase of a product.

The probability of an event can be calculated directly by counting the occurrences of the event and dividing that count by the total number of possible occurrences.

The assigned probability is a fractional value and is always in the range between 0 and 1, inclusive, where 0 indicates that the event is impossible and 1 indicates that it is certain.

Together, the probabilities of all possible outcomes sum to one.

If all possible outcomes are equally likely, the probability of each is 1 divided by the total number of possible outcomes.

For example, each of the numbers 1 to 6 is equally likely from the roll of a fair die, therefore each has a probability of 1/6, or about 0.167, of occurring.
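The counting rule can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the helper name `probability` and the "roll an even number" event are assumptions made for the example, and the rule only applies when all outcomes are equally likely.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Naive probability: favourable outcomes / total equally likely outcomes."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))

# Sample space for one roll of a fair six-sided die.
die = [1, 2, 3, 4, 5, 6]

# Probability of each individual face.
p_each = {face: probability({face}, die) for face in die}
print(p_each[3], float(p_each[3]))

# The probabilities of all outcomes sum to one.
print(sum(p_each.values()))

# Hypothetical event for illustration: rolling an even number.
p_even = probability({2, 4, 6}, die)
print(p_even)
```

Using `Fraction` keeps the arithmetic exact, so the sum over all six faces is exactly one rather than a floating-point approximation.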

Probability is often written as a lowercase “p” and may be stated as a percentage by multiplying the value by 100.

For example, a probability of 0.3 can be stated as 30% (given 0.3 * 100).

A probability of 50% for an event, often spoken of as a “50-50 chance,” means that it is likely to happen half of the time.

The probability of an event, like a flood, is often denoted as a function (e.g. the probability function) with an uppercase “P.”

For example: P(flood).

It is also sometimes written as a function of lowercase “p” or “Pr.”

For example: p(flood) or Pr(flood).

The complement of an event, that is, the event not occurring, has a probability of one minus the probability of the event.

For example: P(not flood) = 1 - P(flood).

The probability, or likelihood, of an event is also commonly referred to as the odds of the event or the chance of the event.

These all generally refer to the same notion, although odds often has its own notation of wins to losses, written as w:l; e.g. 1:3 for 1 win to 3 losses, which is a 1 in 4 or 25% probability of a win.
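The complement and the odds notation can be made concrete with a small sketch. Under the wins-to-losses reading, odds of w:l correspond to a probability of w / (w + l). The function names and the flood probability of 3/10 are assumptions chosen only for illustration.

```python
from fractions import Fraction

def complement(p):
    """Probability that the event does not occur."""
    return 1 - p

def odds_to_probability(wins, losses):
    """Convert odds of wins:losses into the probability of a win."""
    return Fraction(wins, wins + losses)

def probability_to_odds(p):
    """Convert an exact probability (a Fraction) into (wins, losses) odds."""
    p = Fraction(p)
    return p.numerator, p.denominator - p.numerator

p_flood = Fraction(3, 10)  # assumed value, for illustration only
print(complement(p_flood))                  # probability of no flood
print(odds_to_probability(1, 3))            # odds of 1:3 as a probability
print(probability_to_odds(Fraction(1, 4)))  # and back to (wins, losses)
```

Pass a `Fraction` rather than a float to `probability_to_odds`, since a float such as 0.3 is not stored exactly and would produce unwieldy odds.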

We have described naive probability, although probability theory allows us to be more general.

More generally, probability is an extension of logic that can be used to quantify, manage, and harness uncertainty.

As a field of study, it is often referred to as probability theory to differentiate it from the likelihood of a specific event.

Probability can be seen as the extension of logic to deal with uncertainty.

[…] Probability theory provides a set of formal rules for determining the likelihood of a proposition being true given the likelihood of other propositions.

— Page 56 Deep Learning, 2016.

Probability theory has three important concepts: the event (A), the sample space (S), and the probability function (F).

The likelihood of an event (A) being drawn from the sample space (S) is determined by the probability function (F).

The shape or distribution of all events in the sample space is called the probability distribution.

In many domains, the distribution of probabilities over events has a familiar shape, such as uniform if all events are equally likely, or Gaussian if the likelihoods of the events form a normal, or bell, shape.
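To get a feel for these two shapes, the sketch below draws samples from a uniform and a Gaussian distribution using Python's standard `random` module and prints a crude text histogram of each. The sample size, the seed, the rounding of Gaussian samples into integer bins, and the `show` helper are all assumptions made for the illustration.

```python
import random
from collections import Counter

random.seed(1)  # fixed seed so the run is repeatable
n = 10_000

# Uniform: every face of a fair die is equally likely.
uniform_counts = Counter(random.randint(1, 6) for _ in range(n))

# Gaussian: values cluster around the mean (0) with standard deviation 1.
# Each sample is rounded to the nearest integer to form bins.
gauss_counts = Counter(round(random.gauss(0, 1)) for _ in range(n))

def show(counts, scale=100):
    """Print a crude text histogram, one '#' per `scale` samples."""
    for value in sorted(counts):
        print(f"{value:>3} {'#' * (counts[value] // scale)}")

show(uniform_counts)
print()
show(gauss_counts)
```

The uniform bars should come out at roughly the same length, while the Gaussian bars should peak at 0 and fall away on either side.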

Probability forms the foundation of many applied fields of mathematics, including statistics, and is an important foundation of many higher-level fields of study, including physics, biology, and computer science.

There are two main ways of interpreting or thinking about probability.

Perhaps the simpler approach is to consider probability as the actual likelihood of an event, called frequentist probability.

Another approach is to consider probability a notion of how strongly it is believed the event will occur, called Bayesian probability.

It is not that one approach is correct and the other is incorrect; instead, they are complementary and both interpretations provide different and useful techniques.

The frequentist approach to probability is objective.

Events are observed and counted, and their frequencies provide the basis for directly calculating a probability, hence the name “frequentist.”

Probability theory was originally developed to analyze the frequencies of events.

— Page 55 Deep Learning, 2016.

Methods from frequentist probability include p-values and confidence intervals used in statistical inference and maximum likelihood estimation for parameter estimation.
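The frequentist idea of counting observed events can be sketched with a simulated coin. The observed fraction of heads is the relative-frequency estimate of the probability, and for a simple yes/no event like this it is also the maximum likelihood estimate. The true probability of 0.5, the seed, and the function name are assumptions for the example.

```python
import random

random.seed(7)  # fixed seed so the run is repeatable

def relative_frequency(trials, p_true=0.5):
    """Simulate coin flips and return the observed fraction of heads."""
    heads = sum(random.random() < p_true for _ in range(trials))
    return heads / trials

# The estimate tends to settle near the true value as trials increase.
for trials in (10, 100, 1_000, 100_000):
    print(trials, relative_frequency(trials))
```

With only a handful of trials the estimate can be far from 0.5, which is why frequentist estimates are usually reported together with a measure of uncertainty such as a confidence interval.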

The Bayesian approach to probability is subjective.

Probabilities are assigned to events based on evidence and personal belief and are centered around Bayes’ theorem, hence the name “Bayesian.”

This allows probabilities to be assigned to very infrequent events and events that have not been observed before, unlike frequentist probability.

One big advantage of the Bayesian interpretation is that it can be used to model our uncertainty about events that do not have long term frequencies.

— Page 27, Machine Learning: A Probabilistic Perspective, 2012.

Methods from Bayesian probability include Bayes factors and credible intervals for inference, and Bayes estimators and maximum a posteriori estimation for parameter estimation.
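As a rough sketch of how a belief is combined with evidence, the example below uses a Beta prior for the probability of a yes/no event and updates it with observed counts. The Beta-Binomial model, the Beta(2, 2) prior, the observation counts, and the function names are all assumptions chosen for illustration; they are not taken from the post.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Return the Beta posterior parameters after observing the data."""
    return alpha + successes, beta + failures

def posterior_mean(alpha, beta):
    """Mean of Beta(alpha, beta): the Bayes estimator under squared error."""
    return alpha / (alpha + beta)

def map_estimate(alpha, beta):
    """Mode of Beta(alpha, beta); the formula needs alpha > 1 and beta > 1."""
    if alpha <= 1 or beta <= 1:
        raise ValueError("mode formula needs alpha > 1 and beta > 1")
    return (alpha - 1) / (alpha + beta - 2)

# Assumed prior belief: Beta(2, 2), weakly centred on 0.5.
prior = (2, 2)

# Hypothetical data: the event occurred once in 10 observations.
posterior = beta_binomial_update(*prior, successes=1, failures=9)
print(posterior)
print(posterior_mean(*posterior))
print(map_estimate(*posterior))

# An event never observed in 10 trials still gets a non-zero probability.
unseen = beta_binomial_update(*prior, successes=0, failures=10)
print(posterior_mean(*unseen))
```

The last line shows the contrast with simple counting: the observed frequency of the unseen event is 0/10, yet the posterior mean stays above zero because the prior belief contributes to the estimate.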


In this post, you discovered a gentle introduction to probability.

Specifically, you learned:

Do you have any questions?

Ask your questions in the comments below and I will do my best to answer.
