1 of 7

skimr

2 of 7

goal

Desired end state:

A frictionless way to work with summary statistics, iteratively and interactively, as part of a pipeline, following the principle of least surprise

3 of 7

problem

Existing summary functions are:

  • Lacking
  • Generic
  • Not returning easy-to-operate-on data structures
  • Not pipeable
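
A small illustration of the data-structure point, using only base R: summary() on a data frame returns a table of pre-formatted strings, so the numbers have to be parsed back out before they can be reused.

```r
# Base R only: inspect what summary() hands back for a data frame.
s <- summary(mtcars)

class(s)    # a "table" object, not a data frame
typeof(s)   # the cells are character strings

# Each cell holds a label and a value glued together as text,
# so it cannot be filtered or computed on directly.
s[1, 1]
```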

4 of 7

diagnosis

Why things are in this state:

  • "Summary" is an overloaded term
  • Historically, summary functions have not returned an object you can compute on
  • The output is not tidy
  • Other historical reasons

5 of 7

plan 1/2

mtcars %>% skim()

mtcars %>% skim() %>% filter(var == "cyl")

Explicitly:

  • Creates a skim object
  • Handles factors, numerics, etc.
  • Reports counts of observations (N) and missing values

The skim object is a data structure that can also be printed in more human-readable ways
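
To make the idea concrete, here is a rough base-R sketch of such a structure. It is not skimr's implementation: skim_sketch() and the stat, type and value column names are hypothetical, chosen only to show one long, tidy row per variable and statistic.

```r
# Hypothetical sketch, not skimr's code: summarise every column of a
# data frame into one long data frame (one row per variable and statistic).
skim_sketch <- function(df) {
  rows <- lapply(names(df), function(nm) {
    x <- df[[nm]]
    # Counts reported for every column type
    stats <- c(n = length(x), missing = sum(is.na(x)))
    if (is.numeric(x)) {
      stats <- c(stats, mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE))
    } else if (is.factor(x)) {
      stats <- c(stats, n_levels = nlevels(x))
    }
    data.frame(
      var = nm,
      type = class(x)[1],
      stat = names(stats),
      value = unname(stats),
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

skimmed <- skim_sketch(mtcars)

# The result is an ordinary data frame, so it can be subset like one.
subset(skimmed, var == "cyl")
```

Because the result is a plain data frame, the same filter could equally be written as a step in a pipeline.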

6 of 7

plan 2/2

Imagine a human-readable version like this, with clearer labels, that works nicely with kable()
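
One possible shape for that output, sketched under assumptions: the knitr package is installed, and the wide summary table and its column labels are made up here for illustration rather than taken from skimr.

```r
# Assumes knitr is installed. The summary table is built by hand with
# base R purely to have something to print.
wide <- data.frame(
  variable = names(mtcars),
  n = vapply(mtcars, length, integer(1)),
  missing = vapply(mtcars, function(x) sum(is.na(x)), integer(1)),
  mean = vapply(mtcars, mean, numeric(1)),
  sd = vapply(mtcars, sd, numeric(1)),
  row.names = NULL
)

# kable() renders the data frame as a labelled table.
tbl <- knitr::kable(
  wide,
  digits = 2,
  col.names = c("Variable", "N", "Missing", "Mean", "SD")
)
tbl
```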

7 of 7

feedback

Would you use this?

Does this interface seem sane?

Is there anything we’re missing?

Is there a feature you would add?