Finding Bayesian Legos — Part 2

I really dislike picking up Legos.

Can’t you find a better way?”“Tell me more about how you go about picking up after your kids.

”Joe explains to me that he usually picks up the Legos after his children are tucked away in bed.

Joe likes to unwind with a little reading or television each night.

In between chapters of the book, or during television commercials, he will make a quick sweep picking up the Legos.

The first sweep seems to get most of the Legos, with each pass picking up part of what was missed from the prior sweep.

He keeps doing this until he calls it a night and heads of to bed.

“Joe, I think we can work with this.

”I ask Joe to do a minor bit of work next time he picks up Legos.

I ask him to count how many he picks up in each pass and record it.

He reluctantly agrees, but anything to help him gain confidence that he will not step on Legos when he gets up the next morning and heads to the coffee maker.

import numpy as npimport matplotlib.

pyplot as pltplt.




seed(42) # It's nice to replicate even virtual experimentsimport pymc3 as pmimport scipy.

stats as statsprint('Running on PyMC3 v{}'.


__version__))> Running on PyMC3 v3.

6Joe reports backA few days later I get an email from Joe: he has done as asked.

Joe reports that he made six sweeps with the following number of Legos being picked up in each sweep:Removed[1:]> array([135, 48, 11, 4, 3, 0])I invite Joe over for dinner and some Bayesian analysis that evening.

Dinner with PyMCAfter finishing up our dinner of kale, avocado, and black bean burrito bowls, Joe and I cozy around the big screen for the analysis.

I explain to Joe that each time he makes a sweep of the room, it’s like every Lego has some probability of being picked up.

We don’t know how many Legos there really are and that is what we are trying to determine, with probabilities.

Joe can then compare this to the total number he has picked up to see if he wants to make another pass.

Because I’ve had a few days to work this out, I sketch up the model and then go back and explain it:In this case, p indicates that probability that Joe will pick up a single Lego.

The probability p is modeled as a Beta distribution (in part 1 we covered why we used a Beta distribution).

The number of Legos remaining is N_i, after i sweeps by Joe to pick up Legos; Joe makes a total of n sweeps.

For each sweep i Joe removed Ri Legos.

What we are trying to determine is N₀, the number of Legos originally left behind by Joe’s offspring.

R₀ being set to zero handles the boundary condition where Joe picks up zero Legos on the zeroth sweep.

To complete the model, we need some prior distributions for N₀, α, β.

As in the first episode, we pick parameters such that the distributions prior to modeling are weakly informative:The parameters for the prior probability distribution is the same as in the first episode.

I ask Joe for the absolute maximum Legos he things his kid might leave behind.

Joe estimates around two-thousand, so I make the maximum, N_max three-thousand to be on the safe side, and use a uniform distribution of prior probabilities between 0 and N_max inclusive to model the initial number of Legos left behind.

Now all that remains is to make the model in PyMC and review the results to see if they might help Joe.

Model setup and fittingThe transfer from the model into PyMC code is a really direct translation and fitting with the default No-U-Turn sampler.

After some fidgeting around, the number of tuning samples was increased over the default and the target acceptance rate is set a bit higher than normal; this is done to deal with some of the problematic posteriors in this problem.

Nicely, the PyMC library does a good job helping users tune these parameters by doing model fitting checks and highlighting likely problematic issues.

I want Joe to have an approximation of how many Legos are remaining after each time he does a sweep.

This will help Joe make a better decision of when he wants to stop because I know just how very much Joe wants to decrease the amount of energy he puts into Lego gathering.

With that, we will start with the minimal information, just the number removed in the first sweep.

Also, a deterministic calculation of the estimated number of remaining Legos is included, mostly to make it easy for Joe to decide how many Legos might be remaining.

Even though n_remaining_est is deterministic, it is a random variable because the two input variables to n_remaining_est are random variables.

sweep_model = pm.

Model()with sweep_model: n_0_est = pm.

DiscreteUniform('n_0_est', lower=0, upper=3000) alpha_sweep = pm.

HalfCauchy('alpha_sweep', beta = 2.

) beta_sweep = pm.

HalfCauchy('beta_sweep', beta = 2.

) p_sweep = pm.

Beta('p_sweep', alpha=alpha_sweep, beta=beta_sweep) removed_sweep_1 = pm.

Binomial('removed_sweep_1', p=p_sweep, n=n_0_est, observed=Removed[1] ) n_remaining_est = pm.

Deterministic('n_remaining_est', n_0_est – removed_sweep_1)with sweep_model: trace = pm.

sample(50000, tune=7500, random_seed=123, progressbar=True, nuts_kwargs={"target_accept":0.

95})Model evaluationAfter completing the model fitting, Joe and I take a quick look at the trace plots.

We are focused on the initial number of Legos and the estimated number remaining after the first sweep.

Unfortunately, after just one sweep there is not much narrowing down of the number of remaining Legos compared to before the sweep.

This makes sense from mathematical intuition: there are two parameters to estimate, the probability of picking up a Lego and the initial number of Legos, but only one known variable, the number of Legos picked up in the first pass.

It is, at best, difficult to solve for two unknown variables with only one known equation.


traceplot(trace, varnames=['n_0_est', 'n_remaining_est'], combined=True);pm.

summary(trace, varnames=['n_0_est', 'n_remaining_est']).

round(2)Second sweepAt this point, Joe’s not sure if the work has been worth the effort.

I encourage Joe that we should continue, for his feet if nothing else.

We soldier on and create a model based on the first and second observed sweep data Joe provided.

The model creation code is starting to look a little repetitive, but for the sake of quick experimentation, I do some ‘copy-paste’ of code and run the model, thinking if it works I will refactor it later.

In this model, we take the estimated number of Legos remaining after sweep one subtracted from the estimated initial number of Legos, and use that as the input for the number of Legos remaining for the second sweep.

We are modeling repeated, sequential binomial trials.

sweep_model = pm.

Model()with sweep_model: n_0_est = pm.

DiscreteUniform('n_0_est', lower=0, upper=3000) alpha_sweep = pm.

HalfCauchy('alpha_sweep', beta = 2.

) beta_sweep = pm.

HalfCauchy('beta_sweep', beta = 2.

) p_sweep = pm.

Beta('p_sweep', alpha=alpha_sweep, beta=beta_sweep) removed_sweep_1 = pm.

Binomial('removed_sweep_1', p=p_sweep, n=n_0_est, observed=Removed[1] ) n_1_est = n_0_est – removed_sweep_1 removed_sweep_2 = pm.

Binomial('removed_sweep_2', p=p_sweep, n=n_1_est, observed=Removed[2] ) n_remaining_est = pm.

Deterministic('n_remaining_est', n_0_est – removed_sweep_1 – removed_sweep_2)with sweep_model: trace = pm.

sample(50000, tuning=7500, random_seed=123, progressbar=True, nuts_kwargs={"target_accept":0.

95})“Joe, we are getting somewhere now.

”A quick look at the model results and we see that the range of estimated Lego’s remaining has been narrowed to between 9 and 62 with 95% confidence.

At this point, Joe is getting encouraged that counting all of those Legos might have been worth it.

He wants to keep going, but I have some other plans.

It’s getting a bit late, and I decide it is time to refactor this code.

Joe heads home for the night and we promise to get together tomorrow.

With a bit cleaner code I hope to be able to move much faster.


traceplot(trace, varnames=['n_0_est', 'n_remaining_est'], combined=True);pm.

summary(trace, varnames=['n_0_est', 'n_remaining_est']).

round(2)Cleaner code through mathJoe comes by the following evening with takeaway in hand.

I grab my laptop and share with Joe what has transpired overnight.

“This is really a multinomial problem.

”Multinomial distributions model the count that some event out of k possible events happens some number of times n.

It is a generalization of the binomial distribution.

Think of throwing a 6-sided die repeatedly (6 possible events, 1 through 6, one for each side of the die) and counting how many times it lands with each side up.

The multinomial model allows for each event to have different probabilities of happening.

Continuing with the 6-side die analogy, that would like if the die somehow had different probabilities of each side landing upwards.

That is the process the multinomial distribution models.

Joe looks at me a little puzzled, so I go on.

In our scenario, each Lego lands in one of the n sweeps or have yet to be picked up.

Each sweep, plus the event of not being picked up, is like one side of the die.

The probability of a Lego getting picked up in the first sweep is unknown and modeled as p.

The probability of a Lego getting picked up in the second sweep is the probability of the Lego not being picked up in the first sweep and the probability of it getting picked up in the second sweep.

The probability of a Lego remaining is just whatever probability remains after summing up all the probabilities of a Lego getting picked across all the sweeps.

Here R is a vector of the count for each of the probabilities given (that is, it is the estimated number of Legos picked up in each sweep), N₀ is the number of trials (or in our case, the estimate of the number of Legos initially left laying), p_i is the probability of a Lego being picked up in sweep i, p_remains is the probability of a Lego not being picked up, and n_sweeps is the total number of sweeps performed.

While PyMC3 has a multinomial distribution it does not have this kind of “incomplete Multinomial distribution”, where the size of the multinomial population is unknown and to be estimated.

There have been several papers on this subject of estimating the multinomial population size from the observed draws; my favorite is by Sanathanan.

I’m committed to taking a Bayesian approach and using PyMC3.

After a bit of experimenting, I determine the workable solution is to generalize the repeated binomial approach tried earlier.

I wrote up a function which encapsulates the model of removal, calling the new model a RemovalMultinomial.

While I'm sure there are better ways to do this, Joe and I are pretty ready to get this done.

# Done quickly to make code a bit easier to deal with# Doesn't do much / any error checking of the inputs, assumes observed informationdef RemovalMultinomial(basename, n, p, observed): remaining = n results = [] for idx, p_idx in enumerate(p): i = idx+1 name = basename + '_{}'.

format(i) removed = pm.

Binomial(name, n=remaining, p=p_idx, observed=observed[idx]) results.

append(removed) remaining = remaining – removed remaining = pm.

Deterministic( basename+'_remaining', remaining) results.

append(remaining) return resultsWith this new function, the setup for the model is much shorter, parameterized just by the number of sweeps.

We will do the model with 3 sweeps, one more than last time.

This helps Joe and I know how much information we are getting sweep to sweep.

Also, it might help Joe determine when he can stop ‘early’ in his passes based on his own model of risk of having a density of Legos remaining on the floor, poised to cause him pain if he were to step on them.

Right now Joe keeps doing sweeps for Legos until he makes a pass and doesn’t see any Legos to pick up.

n_sweeps = 3sweep_model = pm.

Model()with sweep_model: n_0_est = pm.

DiscreteUniform('n_0_est', lower=0, upper=3000) alpha_sweep = pm.

HalfCauchy('alpha_sweep', beta = 2.

) # Weakly regularizing prior beta_sweep = pm.

HalfCauchy('beta_sweep', beta = 2.

) # Weakly regularizing prior p_sweep = pm.

Beta('p_sweep', alpha=alpha_sweep, beta=beta_sweep) p_multinomial = [ pm.


format(str(i+1)), p_sweep) for i in range(0,n_sweeps) ] removed = RemovalMultinomial('rr', n=n_0_est, p=p_multinomial, observed=Removed[1:] )with sweep_model: trace = pm.

sample(50000, tuning=7500, random_seed=123, progressbar=True, nuts_kwargs={"target_accept":0.

95})“This is looking pretty good!”, says JoeJoe and I are getting pretty excited.

With this last round, after three sweeps of picking up Legos, there is now a 95% confidence interval that there are between zero (!) and thirteen Legos remaining.

Joe is pretty happy with these results and feels like it can help him determine if he wants to continue making passes or call it a night.


plot_posterior(trace, varnames=['n_0_est', 'rr_remaining']);pm.

summary(trace, varnames=['n_0_est', 'rr_remaining']).

round(2)One final modelIn this last model, we want to test how good Joe’s current heuristic is for stopping.

Joe typically stops looking for Legos when he makes a pass and he finds no more Legos to pick up.

Now that we have maths behind us we can quantify, at least in based on these results and the assumptions of the model, the quality of the stopping strategy employed by Joe.

We create a model of all six passes Joes performed and use that to estimate the number of Legos remaining.

The code is not shown here but is available in the notebook linked at the end of the article.


figure(figsize=(4,3), dpi=300)plt.

hist(trace['rr_remaining'], density=True, bins=range(0,5), align='left')plt.

xlabel('Estimated Legos Remaining')plt.



title('Probability Density Function of Estimated Legos Remaining')plt.


show();Some models are usefulAt this point, Joe and I are pretty happy with the model results.

It looks like, at least in this example, there is about a one-in-five (20%) chance that there is at least one Lego remaining.

Joe says he can take this and use it if I’ll clean it up and stick it in a mobile app.

Joe is a very funny person!.I agree to check this code into GitHub and show Joe how to use it.

All is good, Joe goes home.

And then a few days later the call comes, “Hank, I’m still finding Legos the day after, and a lot more a lot more often than I think the model predicts.

”Full Jupyter notebook of this post is available.


. More details

Leave a Reply