What Is purrr's pmap?
pmap() applies a function to multiple input vectors simultaneously.
It generalizes map() (1 input) and map2() (2 inputs) to an
arbitrary number of inputs. The key difference: pmap takes a
list of equal-length vectors, not separate arguments.
| Function | Inputs | Syntax |
|---|---|---|
| map(.x, .f) | 1 vector | map(1:3, sqrt) |
| map2(.x, .y, .f) | 2 vectors | map2(1:3, 4:6, `+`) |
| pmap(.l, .f) | N vectors (list) | pmap(list(x, y, z), f) |
| pmap_dfr(.l, .f) | N vectors → tibble | Same as pmap, row-binds results |
Use pmap_dfr() when your function returns a one-row
tibble; it row-binds all results into a single data frame. Use pmap_chr(),
pmap_dbl(), or pmap_int() for type-safe atomic vector output.
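To make the difference between the plain and typed variants concrete, here is a minimal sketch. The `inputs` list and its element names (`base`, `exponent`, `offset`) are made up for illustration and do not come from any real data set.

```r
library(purrr)

# Hypothetical inputs, chosen only for illustration
inputs <- list(
  base     = c(2, 3, 4),
  exponent = c(2, 2, 3),
  offset   = c(1, 0, -1)
)

# pmap() always returns a list, one element per position
as_list <- pmap(inputs, function(base, exponent, offset) base^exponent + offset)

# pmap_dbl() returns a double vector and fails if a result is not a single number
as_dbl <- pmap_dbl(inputs, function(base, exponent, offset) base^exponent + offset)

# pmap_chr() returns a character vector
as_chr <- pmap_chr(inputs, function(base, exponent, offset) {
  paste0(base, "^", exponent, " + ", offset)
})
```

The typed variants are handy inside pipelines because a function that returns the wrong type fails immediately rather than several steps later.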
Background
This post is part of a series led by the fearless Isabella R. Ghement. The series uses the #purrrResolution hashtag, under which Twitter statisticians and programmers teach themselves and others one new purrr function per week!
Great programmers seek leverage. One common path to leverage is by making the
language more terse and contextual to the problem at hand. Some call this "bending the language to
the problem" or "elevating the language to the problem." Functional programming tools like
dplyr use this concept to provide the analyst tremendous speed and quality
improvements. Code and analysis can be done faster with fewer defects.
These sentiments motivated my adoption of the purrr library, and in return I've received a
tremendous productivity boost, similar in magnitude to the one I received from dplyr.
From map to map2
I've been using map like:
library(purrr)
map(seq_len(3), function(x) { x })
## [[1]]
## [1] 1
##
## [[2]]
## [1] 2
##
## [[3]]
## [1] 3
Recently I noticed cases where I've wanted to simultaneously map over two inputs:
map2(seq_len(3),                     # first input, can be named with .x
     c("thing", "items", "objects"), # second input, can be named with .y
     function(x, y) { paste(x, y) })
## [[1]]
## [1] "1 thing"
##
## [[2]]
## [1] "2 items"
##
## [[3]]
## [1] "3 objects"
Enter pmap: Map Over N Inputs
A more general function exists for mapping as many inputs as one wants!
pmap(list(
  x = seq_len(3),
  y = c("thing", "items", "objects"),
  z = c(".", "?", "!")
),
.f = function(x, y, z) { paste(x, paste0(y, z)) })
## [[1]]
## [1] "1 thing."
##
## [[2]]
## [1] "2 items?"
##
## [[3]]
## [1] "3 objects!"
The difference in interface is important. Both map and map2 take their input
vectors as direct arguments (.x = and .y =), while pmap takes a single
list of inputs, like .l = list(x = , y = ).
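Two consequences of the list interface are worth a quick sketch. First, when the list is named, the names are matched to the argument names of .f, so the order of the list elements does not matter. Second, a data frame is itself a list of equal-length columns, so it can be passed to pmap directly and is processed row by row. The objects `args` and `df` below are illustrative names, not part of the original examples.

```r
library(purrr)

# List names are matched to the argument names of .f, so order does not matter
args <- list(
  z = c(".", "?", "!"),
  x = seq_len(3),
  y = c("thing", "items", "objects")
)
by_name <- pmap(args, function(x, y, z) paste(x, paste0(y, z)))

# A data frame is a list of equal-length columns, so pmap walks it row by row
df <- data.frame(
  x = seq_len(3),
  y = c("thing", "items", "objects"),
  z = c(".", "?", "!"),
  stringsAsFactors = FALSE
)
by_row <- pmap_chr(df, function(x, y, z) paste(x, paste0(y, z)))
```

Both calls produce the same three strings as the earlier pmap example; the second simply returns them as a character vector instead of a list.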
Gotcha: Arguments Must Be Vectors
Every element of the list must be a vector, and the vectors must share the same length (only length-1 vectors are recycled). For example, passing a function (a "closure") will fail:
pmap(list(
  x = seq_len(3),
  y = c("thing", "items", "objects"),
  # a function is not a vector, so pmap rejects this element
  capitalize = function(x) { paste0(toupper(substr(x, 1, 1)), substr(x, 2, nchar(x))) }
),
.f = function(x, y, capitalize) { paste(x, capitalize(y)) })
## Error: Don't know how to coerce closure to list
My expectation was that the closure would be valid and recycled for the length of the longest vector argument, but that's not the case!
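For contrast, purrr does recycle length-1 vectors, and there is a simple way around the closure problem. The sketch below shows both; it is one possible workaround rather than the only one, and the `capitalize` helper is a hypothetical function written for this example.

```r
library(purrr)

# Hypothetical helper: upper-case the first letter of a string
capitalize <- function(s) {
  paste0(toupper(substr(s, 1, 1)), substr(s, 2, nchar(s)))
}

# Length-1 vectors are recycled: z = "!" is reused for every element
recycled <- pmap_chr(
  list(x = seq_len(3), y = c("thing", "items", "objects"), z = "!"),
  function(x, y, z) paste(x, paste0(y, z))
)

# Workaround for the closure: leave the function out of .l and let .f
# find it through ordinary lexical scoping
capitalized <- pmap_chr(
  list(x = seq_len(3), y = c("thing", "items", "objects"), z = c(".", "?", "!")),
  function(x, y, z) paste(x, capitalize(paste0(y, z)))
)
```

Keeping the helper outside the list also keeps .l purely about data, which makes the length requirement easier to reason about.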
pmap generalizes map and map2 to an arbitrary number of inputs. Its key
interface difference is that it takes a list of equal-length vectors rather
than separate vector arguments. For more information, see Jenny Bryan's purrr tutorial.