1 of 19

Tabulars.jl

Metaprogramming for accessing many types of data

Pkg.clone("https://github.com/andyferris/Tabulars.jl")

Andy Ferris

2 of 19

inspiration

3 of 19

The problem with data

4 of 19

The problem with data

5 of 19

Help!

Generic data is messy. Here we focus on “tabular” data.

Easier if there were a single interface for tables, like there is for AbstractArray.

Even easier if there were a way to “ingest” data from many formats.

6 of 19

7 of 19

8 of 19

transpose

9 of 19

10 of 19

11 of 19

12 of 19

What interface should a table have?

A table is a 2D container. It is a mapping from row index + column index to values, t[row, col] = value.

We may also be interested in a 1D data series (e.g. a row), or a 3D data cube...

Not opinionated about internal data layout - can support Dicts of Vectors, or Vectors of structs, or tuples of Dicts, or Matrices, or…

Common indexing interface.
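A minimal sketch of what such a common indexing interface could look like, written in plain Julia 1.x without the package. The function name `cell` is hypothetical and is not part of Tabulars.jl; it only illustrates that one `(row, col)` lookup can sit on top of several storage layouts.

```julia
# Hypothetical helper (not the Tabulars.jl API): fetch the value at (row, col)
# regardless of how the underlying data is laid out.

# A Matrix already supports two-index lookup.
cell(data::AbstractMatrix, row, col) = data[row, col]

# Column-oriented storage: a Dict mapping column name => Vector of values.
cell(data::AbstractDict, row, col) = data[col][row]

# Row-oriented storage: a Vector of records (NamedTuples stand in for structs).
cell(data::AbstractVector, row, col) = data[row][col]

m    = [1 2; 3 4]
cols = Dict(:a => [1, 3], :b => [2, 4])
rows = [(a = 1, b = 2), (a = 3, b = 4)]

# All three layouts answer the same question the same way.
results = (cell(m, 2, 1), cell(cols, 2, :a), cell(rows, 2, :a))
```

The caller never needs to know whether the data is stored by row or by column; only the method selected by dispatch changes.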

13 of 19

Row indices aren’t just integers!

(rows or columns could be categorical)
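As a small illustration of non-integer indices, here is a sketch in plain Julia 1.x where both the row and the column are selected by label. The data and the name `stock` are made up for the example, and nothing here is the package's own API.

```julia
# Illustrative data only: rows are keyed by a categorical label (a Symbol),
# and so are the columns.
stock = Dict(
    :apple  => Dict(:colour => "red",    :count => 3),
    :banana => Dict(:colour => "yellow", :count => 7),
)

# "Row, then column" lookup with categorical indices on both axes.
banana_count = stock[:banana][:count]
apple_colour = stock[:apple][:colour]
```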

14 of 19

Pkg.clone("https://github.com/andyferris/Tabulars.jl")

Series, Table, Tabular{N}

Tabular{N} is an N-dimensional mapping from indices to elements.

Series = Tabular{1}; Table = Tabular{2}

It simply wraps whatever data you have - currently arbitrarily nested AbstractArray, Associative, tuple of Pairs (for heterogeneous types), or arbitrary Julia structs.

Can apply indexing in any order (e.g. transposed tables), supports full range of getindex, view and setindex! (WIP).

Must index a Tabular{N} with exactly N indices.
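The wrapper idea can be sketched in a few lines of plain Julia 1.x. The names `MiniTabular`, `MiniSeries` and `MiniTable` are hypothetical stand-ins chosen to avoid suggesting this is the package's source; the real Tabulars.jl types may be defined quite differently. The sketch only covers `getindex` over nested containers, with one index consumed per level of nesting.

```julia
# Hypothetical stand-in for the wrapper described above (not Tabulars.jl code).
# N is the number of indices; D is the type of the wrapped data.
struct MiniTabular{N, D}
    data::D
end

# Allow MiniTabular{2}(data) without spelling out the data type.
MiniTabular{N}(data::D) where {N, D} = MiniTabular{N, D}(data)

const MiniSeries = MiniTabular{1}
const MiniTable  = MiniTabular{2}

# Exactly N indices are accepted; each one peels off a level of nesting.
function Base.getindex(t::MiniTabular{N}, inds::Vararg{Any, N}) where {N}
    x = t.data
    for i in inds
        x = x[i]
    end
    return x
end

s = MiniSeries([10, 20, 30])
t = MiniTable([Dict(:name => "Andy", :talks => 2),
               Dict(:name => "Sam",  :talks => 1)])

second = s[2]          # one index for a 1D series
name   = t[1, :name]   # two indices for a 2D table
```

Because the only `getindex` method takes exactly `N` indices, calling `t[1]` on a `MiniTable` has no matching method, which mirrors the rule that a `Tabular{N}` must be indexed with exactly `N` indices.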

15 of 19

Data of many different types

(i1 => v1, i2 => v2, i3 => v3, ...)

If the index carries run-time data, then the compiler cannot determine the return type at compile time. On the other hand, if the index is a singleton instance, then the compiler can determine which element is being requested at compile time.

(l"name" => "Andy", l"number of JuliaCons" => 2)

where:

l"name" = Label{:name}()

See also upcoming NamedTuple (PR #22194)
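A hedged sketch of how singleton labels let dispatch pick an element out of a tuple of Pairs, written for Julia 1.x. `Label` and the `l"..."` string macro follow the slide; the function `lookup` is a hypothetical name introduced for illustration, and the package's actual implementation may differ.

```julia
# A singleton type: all the information lives in the type parameter.
struct Label{name} end

# String macro so that l"name" expands to Label{:name}().
macro l_str(s)
    return :(Label{$(QuoteNode(Symbol(s)))}())
end

# Hypothetical lookup over a tuple of Pairs (not the package API).
# If the first Pair's key type matches the requested label, return its value.
lookup(t::Tuple{Pair{K, V}, Vararg{Any}}, ::K) where {K <: Label, V} = first(t).second

# Otherwise drop the first Pair and keep searching.
lookup(t::Tuple, key::Label) = lookup(Base.tail(t), key)

# Nothing left to search: the label is not present.
lookup(::Tuple{}, key::Label) = throw(KeyError(key))

record = (l"name" => "Andy", l"number of JuliaCons" => 2)

who   = lookup(record, l"name")
count = lookup(record, l"number of JuliaCons")
```

Since each label is its own type, the choice of method at every step depends only on types, so the matching element can be resolved without inspecting run-time values. A `Dict{String, Any}` keyed by run-time strings cannot offer that, because the value type behind a given key is not known until the lookup happens.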

16 of 19

table.csv:

17 of 19

table.json:

18 of 19

Some metaprogramming...

19 of 19

Some more metaprogramming...