Tabulars.jl
Metaprogramming for accessing many types of data
Pkg.clone("https://github.com/andyferris/Tabulars.jl")
Andy Ferris
inspiration
The problem with data
Help!
Generic data is messy. Here we focus on “tabular” data.
Easier if there were a single interface for tables, like there is for AbstractArray.
Even easier if there were a way to “ingest” data from many formats.
transpose
What interface should a table have?
A table is a 2D container. It is a mapping from row index + column index to values, t[row, col] = value.
We may also be interested in a 1D data series (e.g. a row) or a 3D data cube.
Not opinionated about internal data layout: can support Dicts of Vectors, Vectors of structs, tuples of Dicts, Matrices, and so on.
Common indexing interface.
Row indices aren’t just integers!
(Rows or columns could be categorical.)
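The idea of one indexing interface over different layouts can be sketched as follows. This is a minimal illustration, not the Tabulars.jl API: the wrapper types ColumnMajor and RowMajor and the sample data are hypothetical, chosen only to show t[row, col] working on a Dict of columns and on a Dict of rows, with categorical (string) row indices.

```julia
# Hypothetical wrappers (not the Tabulars.jl API) that record which nesting
# level of the wrapped data holds the row index.
struct ColumnMajor{T}
    data::T
end

struct RowMajor{T}
    data::T
end

# Same call shape t[row, col] regardless of the internal layout.
Base.getindex(t::ColumnMajor, row, col) = t.data[col][row]
Base.getindex(t::RowMajor, row, col) = t.data[row][col]

# Columns on the outside, rows keyed by a categorical label.
by_column = ColumnMajor(Dict(:height => Dict("alice" => 1.70, "bob" => 1.82),
                             :age    => Dict("alice" => 31, "bob" => 45)))

# Rows on the outside, each row a NamedTuple.
by_row = RowMajor(Dict("alice" => (height = 1.70, age = 31),
                       "bob"   => (height = 1.82, age = 45)))

# Both layouts answer the same question with the same syntax.
same_cell = by_column["bob", :age] == by_row["bob", :age]
```

The caller never needs to know which container is outermost; only the wrapper does.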
Series, Table, Tabular{N}
Tabular{N} is an N-dimensional mapping from indices to elements.
Series = Tabular{1}; Table = Tabular{2}
It simply wraps whatever data you have: currently arbitrarily nested AbstractArray, Associative, tuple of Pairs (for heterogeneous types), or arbitrary Julia structs.
Can apply indexing in any order (e.g. transposed tables), supports full range of getindex, view and setindex! (WIP).
Must index a Tabular{N} with exactly N indices.
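A rough sketch of how such a wrapper could enforce "exactly N indices" is given here. MyTabular, MySeries, MyTable and the helper nested are hypothetical stand-ins written for this illustration; the real Tabular{N} may be implemented quite differently, and reordering of indices (e.g. transposed tables) is left out.

```julia
# Hypothetical stand-in for Tabular{N}: N is the number of nesting levels
# that are treated as indexable dimensions.
struct MyTabular{N, T}
    data::T
end
MyTabular{N}(data::T) where {N, T} = MyTabular{N, T}(data)

const MySeries = MyTabular{1}
const MyTable  = MyTabular{2}

# Peel off one index per nesting level of the wrapped data.
nested(data) = data
nested(data, i, rest...) = nested(data[i], rest...)

# Only a call with exactly N indices matches this method.
Base.getindex(t::MyTabular{N}, inds::Vararg{Any, N}) where {N} = nested(t.data, inds...)

table  = MyTable(Dict(:a => [1, 2, 3], :b => [4, 5, 6]))
series = MySeries([10, 20, 30])

cell = table[:b, 2]    # two indices for a 2D table
item = series[3]       # one index for a 1D series
```

In this sketch, calling table[:b] with a single index finds no matching method, so the mismatch is reported as a MethodError instead of silently returning a column.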
Data of many different types
(i1 => v1, i2 => v2, i3 => v3, ...)
If the index is run-time data, the compiler cannot determine the return type at compile time. On the other hand, if the index is a singleton instance, the compiler can determine at compile time which element is being requested.
(l"name" => "Andy", l"number of JuliaCons" => 2)
where:
l"name" = Label{:name}()
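One way the label mechanism could work is sketched below. The definitions of Label, the l"..." string macro and the lookup function are assumptions made for illustration and may not match the package's own code. The point is that each label is a distinct singleton type, so method dispatch can pick the matching Pair out of a heterogeneous tuple based on types alone.

```julia
# Hypothetical re-creation of the Label idea: the name lives in the type.
struct Label{name} end

# String macro so that l"name" expands to Label{:name}().
macro l_str(s)
    :(Label{$(QuoteNode(Symbol(s)))}())
end

# A heterogeneous "row" stored as a tuple of Pairs.
row = (l"name" => "Andy", l"number of JuliaCons" => 2)

# Walk the tuple; the first method matches only when the leading Pair's
# key type equals the requested label type.
lookup(t::Tuple{Pair{L, V}, Vararg{Any}}, ::L) where {L <: Label, V} = first(t).second
lookup(t::Tuple, l::Label) = lookup(Base.tail(t), l)
lookup(::Tuple{}, l::Label) = throw(KeyError(l))

who   = lookup(row, l"name")
count = lookup(row, l"number of JuliaCons")
```

Because the requested label is encoded in the argument's type, the element being selected is known from the method signature, which is what gives the compiler a chance to infer a concrete return type for each label.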
See also upcoming NamedTuple (PR #22194)
table.csv:
table.json:
Some metaprogramming...
Some more metaprogramming...