There’s a very interesting consequence of the Likelihood Principle, which is most commonly encountered through likelihood ratio tests. The principle says that two likelihood functions that are proportional by a constant carry the same information about the parameter vector.

It so happens that the binomial and negative binomial distributions yield proportional likelihoods. Voilà: math. But really, code is math.
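To spell out the math the code encodes: with seven heads and forty tails,

Binomial (seven heads in forty-seven flips):
    L(p) = choose(47, 7) * p^7 * (1 - p)^40
Negative binomial (forty tails before the seventh head):
    L(p) = choose(46, 6) * p^7 * (1 - p)^40

The two differ only by the constant factor choose(47, 7) / choose(46, 6) = 47/7, which does not involve p.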

First I create the likelihood of each distribution. It’s not a typo: their bodies differ by only one letter and two numbers.

negative_binomial_likelihood <- function(p) {
  prod(dnbinom(40, 7, p))
}
binomial_likelihood <- function(p) {
  prod(dbinom(7, 47, p))
}
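As a quick sanity check (a sketch of my own, not part of the analysis below), we can confirm that the ratio of the two likelihoods really is constant across p:

```r
# Evaluate both densities over a grid of p; the same counts as above:
# 7 successes, 40 failures, 47 trials in total.
ps <- seq(0.01, 0.99, 0.01)
ratio <- dbinom(7, 47, ps) / dnbinom(40, 7, ps)
# The ratio should be the constant choose(47, 7) / choose(46, 6) = 47/7
all(abs(ratio - 47 / 7) < 1e-9)  # TRUE
```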

Now, we can create a nice span of parameter values over which to calculate the likelihood. Probability-esque, but technically plausibility.

For each candidate p, we evaluate the density of the observed data under each distribution.

library(dplyr); library(ggplot2)
tibble(p = seq(0.01, 0.99, 0.01)) %>%
  rowwise() %>%
  mutate(nb_likelihood = negative_binomial_likelihood(p),
         binomial_likelihood = binomial_likelihood(p)) %>%
  ungroup() %>% # important!
  # calculate the relative likelihood of the negative binomial
  mutate(nb_maximum_likelihood = max(nb_likelihood),
         `Negative binomial relative likelihood` = nb_likelihood / nb_maximum_likelihood) %>%
  # calculate the relative likelihood of the binomial
  mutate(binomial_maximum_likelihood = max(binomial_likelihood),
         `Binomial relative likelihood` = binomial_likelihood / binomial_maximum_likelihood) %>%
  select(p, `Binomial relative likelihood`, `Negative binomial relative likelihood`) %>%
  reshape2::melt("p") %>%
  as_tibble() %>%
  ggplot(aes(x = p, y = value)) +
  geom_line() +
  ylab("Relative likelihood") +
  xlab("Probability of success") +
  theme_bw(18) +
  theme(axis.text = element_text(colour = "black")) +
  facet_wrap(~ variable, ncol = 1)

They’re the same!

Why?

Two stories.

The first story goes: I flip a coin until I see the seventh head. It takes me forty-seven flips to get there.

The second story goes: I flip a coin forty-seven times and see seven heads.
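The two stories are two different sampling schemes, which a short simulation can make concrete (p = 0.15 here is an arbitrary choice of mine):

```r
set.seed(1)
p <- 0.15
# Story one: flip until the seventh head; rnbinom returns the number of tails
tails <- rnbinom(1, size = 7, prob = p)
flips_needed <- tails + 7
# Story two: flip forty-seven times and count the heads
heads <- rbinom(1, size = 47, prob = p)
```

Under the first scheme the number of flips is random; under the second, the number of heads is. The likelihoods coincide even though the sampling distributions don’t.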

Does it matter how I flipped the coin if you saw seven heads and forty tails?

• Bayesian: no.
• Frequentist: yes.
• Information theoretic: no.
• Minimum description length: not certain yet, but almost surely no.
• Fiducial: not yet sure.
• Structural inference: not sure yet.
• MaxEnt: worthy.