purrr's pmap Function for Mapping Functions to Data

Christopher Peters · R Programming

What Is purrr's pmap?

pmap() applies a function to multiple input vectors in parallel, element by element. It generalizes map() (1 input) and map2() (2 inputs) to an arbitrary number of inputs. The key difference: pmap() takes a single list of equal-length vectors (its .l argument), not separate arguments.

purrr Mapping Functions Compared
| Function | Inputs | Syntax |
| --- | --- | --- |
| map(.x, .f) | 1 vector | map(1:3, sqrt) |
| map2(.x, .y, .f) | 2 vectors | map2(1:3, 4:6, `+`) |
| pmap(.l, .f) | N vectors (list) | pmap(list(x, y, z), f) |
| pmap_dfr(.l, .f) | N vectors → tibble | Same as pmap, row-binds results |
Quick tip: Use pmap_dfr() when your function returns a one-row tibble; it row-binds all results into a single data frame. (In purrr 1.0.0 and later, pmap_dfr() is superseded in favor of pmap() followed by list_rbind().) Use pmap_chr(), pmap_dbl(), or pmap_int() for type-safe atomic vector output.
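As a minimal sketch of the typed variants mentioned in the tip, the example below sums three numeric vectors element by element. The input names a, b, and c are made up for illustration.

```r
library(purrr)

# Three equal-length inputs, bundled into one list (hypothetical names)
inputs <- list(a = 1:3, b = 4:6, c = 7:9)

# pmap() always returns a list ...
as_list <- pmap(inputs, function(a, b, c) a + b + c)

# ... while pmap_dbl() returns a double vector and errors if
# any result cannot be stored as a single number
totals <- pmap_dbl(inputs, function(a, b, c) a + b + c)
totals
## [1] 12 15 18
```

The typed variants are useful when the result feeds straight into vectorized code, since no unlist() step is needed afterwards.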

Background

This post is part of a series led by the fearless Isabella R. Ghement. In this series we use the #purrrResolution hashtag, under which Twitter statisticians and programmers teach themselves and others one new purrr function per week!

Great programmers seek leverage. One common path to leverage is making the language more terse and contextual to the problem at hand. Some call this "bending the language to the problem" or "elevating the language to the problem." Functional programming tools like dplyr use this concept to give the analyst tremendous speed and quality improvements: code and analysis get done faster, with fewer defects.

These sentiments are what motivated my adoption of the purrr library, and in return I've received a tremendous productivity boost, similar in magnitude to the one I got from dplyr.

From map to map2

I've been using map like:

library(purrr)

map(seq_len(3), function(x) { x })
## [[1]]
## [1] 1
##
## [[2]]
## [1] 2
##
## [[3]]
## [1] 3

Recently I noticed cases where I've wanted to simultaneously map over two inputs:

map2(seq_len(3),  # first input, can be named with .x
     c("thing", "items", "objects"),  # second input, can be named with .y
     function(x, y) { paste(x, y) })
## [[1]]
## [1] "1 thing"
##
## [[2]]
## [1] "2 items"
##
## [[3]]
## [1] "3 objects"

Enter pmap: Map Over N Inputs

A more general function exists for mapping over as many inputs as one wants!

pmap(list(
  x = seq_len(3),
  y = c("thing", "items", "objects"),
  z = c(".", "?", "!")
),
.f = function(x, y, z) { paste(x, paste0(y, z)) })
## [[1]]
## [1] "1 thing."
##
## [[2]]
## [1] "2 items?"
##
## [[3]]
## [1] "3 objects!"

The difference in interface syntax is important. Both map and map2 take their vector arguments directly, as .x = and .y =, while pmap takes a single list of arguments, as .l = list(x = , y = ). When the list is named, its names are matched to the argument names of .f.
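Because a data frame is itself a list of equal-length columns, it can be handed to pmap directly. The sketch below assumes a small illustrative data frame, df, whose column names match the arguments of the function, and also shows that a named list is matched by name while an unnamed list is matched by position.

```r
library(purrr)

# Hypothetical data frame; each column becomes one input to .f
df <- data.frame(
  x = seq_len(3),
  y = c("thing", "items", "objects"),
  z = c(".", "?", "!"),
  stringsAsFactors = FALSE
)

# Named inputs are matched by name, so the argument order of .f is free
by_name <- pmap_chr(df, function(z, x, y) paste(x, paste0(y, z)))

# Unnamed inputs are matched by position instead
by_position <- pmap_chr(
  list(seq_len(3), c("thing", "items", "objects"), c(".", "?", "!")),
  function(count, noun, mark) paste(count, paste0(noun, mark))
)

by_name[[2]]
## [1] "2 items?"
```

Naming the list elements is usually the safer choice, since reordering the inputs then cannot silently change the result.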

Gotcha: Arguments Must Be Vectors

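Every element of the list must be a vector, and the vectors must all have the same length (length-1 elements are recycled). A function is not a vector, so passing one (a "closure") as a list element will fail: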

pmap(list(
  x = seq_len(3),
  y = c("thing", "items", "objects"),
  # a function is not a vector, so pmap rejects this element
  capitalize = function(s) { paste0(toupper(substr(s, 1, 1)), substr(s, 2, nchar(s))) }
),
.f = function(x, y, capitalize) { paste(x, capitalize(y)) })
## Error: Don't know how to coerce closure to list
## (exact wording depends on the purrr version)

My expectation was that the closure would be valid and recycled for the length of the longest vector argument, but that's not the case!
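Two possible workarounds are sketched below; neither comes from the original series. The first simply defines the helper outside and calls it from inside .f. The second wraps the function in list(), which makes it a length-1 list that pmap can recycle. The capitalize helper here is an illustrative version that upper-cases the first letter.

```r
library(purrr)

# Illustrative helper: upper-case the first letter of a string
capitalize <- function(s) {
  paste0(toupper(substr(s, 1, 1)), substr(s, 2, nchar(s)))
}

inputs <- list(
  x = seq_len(3),
  y = c("thing", "items", "objects"),
  z = c(".", "?", "!")
)

# Option 1: call the helper from inside .f (ordinary lexical scoping)
res1 <- pmap(inputs, function(x, y, z) paste(x, capitalize(paste0(y, z))))

# Option 2: wrap the function in list() so it is a length-1 vector,
# which pmap recycles to the length of the other inputs
res2 <- pmap(
  c(inputs, list(f = list(capitalize))),
  function(x, y, z, f) paste(x, f(paste0(y, z)))
)

res1[[1]]
## [1] "1 Thing."
```

Option 1 is usually the simpler choice; option 2 is mainly useful when the function itself should vary per element, in which case the list would hold one function per row.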

Key Takeaway: pmap generalizes map and map2 to an arbitrary number of inputs. Its key interface difference is that it takes a list of equal-length vectors rather than separate vector arguments. For more information, see Jenny Bryan's purrr tutorial.
