Either

R is a highly interactive language. Ad hoc analyses are typically hand-crafted one trip through the read-eval-print loop at a time. When analyses grow up into jobs, they need stronger assurances of correctness. “Either” is a useful tool for getting started with error handling.

The concept is a parent function that takes a child function as an argument and returns a disjoint union: either the child function’s return value or an error.

The usual R function for this is tryCatch. Its first argument is an expression. If the expression succeeds, its result is returned. tryCatch also allows arbitrary conditions to be handled, one of which is simpleError, R’s base error type.

For example,

str(
  tryCatch({
    "It worked!"
  }, error = function(e) {
    "It failed!"
  })
)

##  chr "It worked!"

str(
  tryCatch({
    stop("It worked!")
  }, error = function(e) {
    "It failed!"
  })
)

##  chr "It failed!"

tryCatch offers a high degree of functionality through its condition handling system, but it returns a single value: either the expression’s result or whatever the handler returns. This means that it’s not type-stable, and as the developer you then have to reason about handling many types.
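
Here is a small sketch of that type instability. risky_sqrt() is a made-up helper for illustration, not something from a package; its handler returns the error message, so the caller gets a number on success and a string on failure.

```r
# Hypothetical helper: returns a number on success, but the error handler
# returns the error message, so the return type depends on the outcome.
risky_sqrt <- function(x) {
  tryCatch({
    if (!is.numeric(x)) stop("x must be numeric")
    sqrt(x)
  }, error = function(e) {
    conditionMessage(e)
  })
}

class(risky_sqrt(4))    # "numeric"
class(risky_sqrt("4"))  # "character"
```

Any code downstream of risky_sqrt() has to check which of the two types it received before doing anything with the value.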

Behold “Either” or, as it is known in the purrr library, safely(). safely() wraps a function so that calling it always returns a length-two list with elements result and error. One of result or error will always be NULL. This lets the developer strictly partition errors and results, and I expect it will be quite useful.

library(purrr)

## 
## Attaching package: 'purrr'

## The following objects are masked from 'package:dplyr':
## 
##     contains, order_by

str(
  safely(function() { "It worked!"} )()
)

## List of 2
##  $ result: chr "It worked!"
##  $ error : NULL

str(
  safely(function() { stop("It worked!") } )()
)

## List of 2
##  $ result: NULL
##  $ error :List of 2
##   ..$ message: chr "It worked!"
##   ..$ call   : language .f(...)
##   ..- attr(*, "class")= chr [1:3] "simpleError" "error" "condition"
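
To make the “parent function that takes a child function” idea concrete, here is a rough base R sketch of such a wrapper built on tryCatch. The name either() is hypothetical and this is not how purrr implements safely(); it only mimics the result/error shape shown above.

```r
# `either()` is a hypothetical name used for illustration only.
# It takes a function `f` and returns a new function that always
# yields a two-element list: `result` and `error`.
either <- function(f) {
  function(...) {
    tryCatch(
      # Success: keep the value, leave the error slot empty.
      list(result = f(...), error = NULL),
      # Failure: leave the result slot empty, keep the condition object.
      error = function(e) list(result = NULL, error = e)
    )
  }
}

safe_log <- either(log)

str(safe_log(10))
str(safe_log("ten"))
```

The point of the design is that the caller always gets the same shape back and only has to test whether error is NULL.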

The example below uses safely() to catch a stale database connection and then refresh that connection. If the connection is already fresh, we don’t want to needlessly refresh because that’ll temporarily create an unused database connection. If such a function were in a loop, it might overload the database with too many connections.

library(purrr)

set.seed(1234)
check_database <- function() {
  if(rbinom(1, 1, 0.5) == 1) {
    "database connection works!"
  } else {
    stop("database connection is stale. :-(")
  }
}

check_database()

## Error in check_database(): database connection is stale. :-(

check_database()

## [1] "database connection works!"

safely_check_database <- safely(check_database)

connect_to_database <- function() {
  if(!is.null(safely_check_database()$error)) {
    "connect to database"
  } else {
    "database already connected"
  }
}

set.seed(1234)
connect_to_database()

## [1] "connect to database"

connect_to_database()

## [1] "database already connected"
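
The “refresh only when stale” idea could be pushed one step further into a retry loop. The sketch below is an assumption-laden illustration: make_check() and ensure_connection() are hypothetical helpers, and the “refresh” is just a counter standing in for a real reconnect. The check is made deterministic (it fails a fixed number of times) so the behaviour is repeatable.

```r
library(purrr)

# Hypothetical stand-in for a real connection check: fails on the first
# `n_failures` calls, then succeeds.
make_check <- function(n_failures) {
  calls <- 0
  function() {
    calls <<- calls + 1
    if (calls <= n_failures) stop("database connection is stale. :-(")
    "database connection works!"
  }
}

# Refresh only when the check reports an error; give up after `max_tries`.
ensure_connection <- function(check, max_tries = 3) {
  safe_check <- safely(check)
  refreshes <- 0
  for (i in seq_len(max_tries)) {
    outcome <- safe_check()
    if (is.null(outcome$error)) {
      return(list(status = outcome$result, refreshes = refreshes))
    }
    refreshes <- refreshes + 1  # a real job would reconnect here
  }
  stop("could not establish a fresh connection after ", max_tries, " tries")
}

fresh <- ensure_connection(make_check(0))  # already fresh
stale <- ensure_connection(make_check(2))  # stale for the first two checks
str(fresh)
str(stale)
```

Because the loop looks only at outcome$error, a connection that is already fresh costs zero refreshes, which is the behaviour the database example is after.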

I’m just beginning to scratch the surface of condition handling and tools like Either (purrr::safely()), but their ability to crush bug counts is very clear.

Questions for the future:

Is this superior to tryCatch? If so, how and why? Type safety?
How does safely() fit into condition handling broadly?
How does safely() fit into a scheme with restarts?

@statwonk

May 12, 2016
