The Generalized Pareto distribution (“GPD”) is a so-called power law distribution. It is practically useful for modeling distributional tail shapes, particularly the heavy tails commonly found in finance and engineering problems, that is, problems that involve risk.
In my use case at hand it’s very important to model the tails of the distribution(s) carefully, and the GPD is great for modeling heavy tails.
With a positive shape parameter, the distribution can also be derived as an exponential-gamma mixture: an exponential distribution whose rate parameter is itself gamma-distributed. It’s very interesting to see a rich generalized model spring out of the simple exponential distribution. I’m beginning to see how distributions can be used as information extraction devices by way of entropy. If one wishes to do a lot of automatic modeling, thinking of models as information extraction devices seems like a great way to scale.
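To make the mixture claim concrete, here is a short derivation sketch. The symbols are my own notation rather than anything fixed by the text: α and β are the gamma shape and rate, ξ and σ are the GPD shape and scale, and the GPD location is assumed to be zero.

```latex
\begin{aligned}
X \mid \lambda &\sim \operatorname{Exp}(\lambda), \qquad
\lambda \sim \operatorname{Gamma}(\alpha, \beta) \\
P(X > x) &= \int_0^\infty e^{-\lambda x}\,
            \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,
            \lambda^{\alpha - 1} e^{-\beta \lambda}\, d\lambda
          = \left(\frac{\beta}{\beta + x}\right)^{\alpha}
          = \left(1 + \frac{x}{\beta}\right)^{-\alpha} \\
\text{GPD: } P(X > x) &= \left(1 + \frac{\xi x}{\sigma}\right)^{-1/\xi}
  \quad\Longrightarrow\quad
  \xi = \frac{1}{\alpha}, \qquad \sigma = \frac{\beta}{\alpha}
\end{aligned}
```

Under these assumptions, gamma parameters of shape = 2 and rate = 1 correspond to a GPD with ξ = 0.5 and σ = 0.5.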
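library(magrittr)  # %>% pipe
library(purrr)     # map()
library(gPdtest)   # gpd.fit(), rgp()

set.seed(1)
N <- 100

# Draw one rate per observation from a gamma distribution ...
lambda <- rgamma(N, shape = 2, rate = 1)
# ... then draw a single exponential variate for each rate
lambda %>%
  map(~ rexp(1, .x)) %>%
  unlist() -> x

# Empirical CDF of the mixture sample
ecdf(x) %>%
  plot(main = "Exponential-Gamma mixture vs. Generalized Pareto distribution")

# Fit a GPD to the sample, simulate from the fit, and overlay its ECDF in red
pareto_fit <- gpd.fit(x, method = "amle")
rgp(N,
    shape = pareto_fit[[1]],
    scale = pareto_fit[[2]]) %>%
  ecdf() %>%
  lines(col = "red")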
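As a cross-check that does not depend on a fitted model, the simulated mixture can also be compared against a closed-form GPD CDF. This is a minimal sketch in base R, not part of the original analysis. It relies on the standard result that an exponential with Gamma(shape = α, rate = β) distributed rate has survival function (1 + x/β)^(-α), which is a GPD with shape 1/α and scale β/α at location zero. The helper `gpd_cdf` and the larger sample size `n` are my own assumptions for illustration.

```r
set.seed(1)

# Hypothetical helper (not from gPdtest): GPD CDF with location 0,
# F(x) = 1 - (1 + shape * x / scale)^(-1 / shape), for shape > 0 and x >= 0.
gpd_cdf <- function(x, shape, scale) {
  1 - (1 + shape * x / scale)^(-1 / shape)
}

alpha <- 2    # gamma shape, as in the simulation above
beta  <- 1    # gamma rate, as in the simulation above
n     <- 1e5  # assumption: larger than N = 100 to reduce sampling noise

# rexp() is vectorized over its rate argument, so this draws one
# exponential variate per gamma-distributed rate.
lambda <- rgamma(n, shape = alpha, rate = beta)
x <- rexp(n, rate = lambda)

# GPD parameters implied by the mixture
shape <- 1 / alpha
scale <- beta / alpha

# Kolmogorov-Smirnov statistic: the largest gap between the empirical CDF
# of the sample and the closed-form CDF.
ks_stat <- unname(ks.test(x, gpd_cdf, shape = shape, scale = scale)$statistic)
ks_stat
```

A small statistic here is consistent with the mixture following the implied GPD. With only N = 100 draws, as in the plot above, expect visibly more noise in both the sample and the fitted parameters.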