Ideas for array shape typing in Python

Author: shoyer@google.com

Date: 2017-11-20

Please post comments to the relevant issue on NumPy’s GitHub page.

Overview
Motivation
Examples
  Core library code
  Application code
New typing concepts
  Literal values
  Dimension variables
  Integer types
  Variadic generics
  Type arithmetic
Syntax
  Colons
  Ellipsis
Dimensionality versus shape
Special cases
  Creating arrays
  Aggregation and indexing
  Combining arrays
  Reshaping

Overview

There are many levels at which we can imagine implementing NumPy type checking. In approximate order of complexity, I believe we should support checking of:

  • Data types (ndarray.dtype), e.g., np.float32 or np.floating
  • Dimensionality (ndarray.ndim), e.g., scalar, vector or matrix
  • Dimension sizes (ndarray.shape), e.g., a vector of length 3
  • Dimension identity, e.g., any pair of vectors, as long as they have the same length
  • Broadcasting, i.e., operations between arrays with broadcast compatible shapes

Typing for data types is less interesting, as it could be a relatively straightforward use of the generics already supported by Python’s typing module.[a][b] This document focuses on what support for typing array dimensions and shapes could look like, in its full glory.
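As a quick illustration, here is a minimal sketch of dtype-only generics written with the existing typing module. The NDArray class and cast_to_float32 function here are hypothetical illustrations, not NumPy API:

from typing import Generic, TypeVar

import numpy as np

DType = TypeVar('DType', bound=np.generic)

class NDArray(Generic[DType]):
  """Hypothetical stand-in for a dtype-generic numpy.ndarray."""

def cast_to_float32(x: NDArray[np.float64]) -> NDArray[np.float32]:
  ...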

Motivation

I’m interested in type checking support for catching the most common classes of user error in numerical Python code. These errors often result from confusion about shapes, and more specifically:

  1. Confusion about broadcasting. Broadcasting has well defined semantics, but its rules for automatic looping over arrays with different dimensionalities are somewhat arbitrary. There is no particular reason why adding a vector to a matrix should loop over the rows of the matrix rather than the columns (see the snippet after this list).
  2. Confusion about dimension identity. What exactly does array.sum(axis=1) mean, and which (logical) dimensions are left on the result? Does the single dimension of a vector correspond to the rows or the columns of a matrix? It’s possible to add explicit dimension names with additional layers on top of NumPy such as xarray, but this adds complexity and overhead that often isn’t worth it when working with mostly low-dimensional data.
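For example, broadcasting aligns a vector with the last axis of a matrix, so it gets added to every row; aligning it with the first axis instead fails outright:

import numpy as np

matrix = np.zeros((2, 3))
(matrix + np.ones(3)).shape  # (2, 3): the vector is added to each row
matrix + np.ones(2)  # ValueError: operands could not be broadcast together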

I expect type checking could catch many examples of both kinds of errors, but even type annotations alone would help by providing a clear way to document the way that functions work.

I’m also interested in making it easy to implement useful type annotations for most of NumPy’s core functionality. In particular, broadcasting and dimension identity support would be necessary for typing NumPy’s universal (ufuncs) and generalized universal (gufuncs) functions. 

Examples

Core library code

As a concrete example, here is a proposal for typing a (simplified) version of np.matmul (the @ operator):

I = DimensionVar('I')
J = DimensionVar('J')
K = DimensionVar('K')

def matmul(a: NDArray[..., I, J],
           b: NDArray[..., J, K]) -> NDArray[..., I, K]: ...

Several features are worth noting:

  • DimensionVar indicates a new form of generics like Python’s typing.TypeVar: arguments on this function (and its outputs) that use the same DimensionVar must have matching dimensions. The difference from TypeVar is that DimensionVar can take on an integer value instead of a type.
  • The literal ... (Ellipsis) in NDArray[..., I, J][c][d][e] is not an omission: it indicates that this function follows broadcasting rules, allowing for any number of leading dimensions on all of its arguments. The presence of ... in the result indicates that its shape can be computed by broadcasting the function inputs written with ...[f][g]
  • NDArray is a generic name for a multi-dimensional array like numpy.ndarray. For now, I am intentionally neglecting the data-types part of the type specification (but see “Syntax” below for some options).

This concise syntax makes it easy to type the majority of functions that manipulate arrays with predictable shapes, including the __call__ method of all NumPy universal functions.[h][i]
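To make this concrete, here is how a checker might treat calls under that signature. None of this machinery exists today; the integer shapes in the annotations are purely illustrative:

x: NDArray[2, 3]
y: NDArray[3, 4]
z: NDArray[10, 2, 3]

matmul(x, y)  # OK: I=2, J=3, K=4; result is NDArray[2, 4]
matmul(z, y)  # OK: ... absorbs the leading 10; result is NDArray[10, 2, 4]
matmul(y, x)  # error: J is bound to 4 by the first argument but 3 by the second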

Application code

For application code where correctness is paramount, we would encourage users to define DimensionVar and NDArray type aliases that clearly express intent. For example, for working with images:

Batch = DimensionVar('Batch')
Row = DimensionVar('Row')
Column = DimensionVar('Column')
Channel = DimensionVar('Channel', size=3)

RGBImage = NDArray[Batch, Row, Column, Channel]
GrayscaleImage = NDArray[Batch, Row, Column]

def rgb_to_grayscale(array: RGBImage) -> GrayscaleImage:
  return numpy.mean(array, axis=RGBImage.axis(Channel))

This provides a form of self-documentation not evident in the original code. The axis method on the ndarray type aliases returns the integer axis position at runtime: RGBImage.axis(Channel) == 3.
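A minimal runtime sketch of how such an axis method could work, assuming hypothetical DimensionVar and alias classes along these lines:

class DimensionVar:
  def __init__(self, name, size=None):
    self.name = name
    self.size = size

class NDArrayAlias:
  def __init__(self, *dims):
    self.dims = dims

  def axis(self, dim):
    # Position of a dimension variable within this alias.
    return self.dims.index(dim)

Batch, Row, Column = DimensionVar('Batch'), DimensionVar('Row'), DimensionVar('Column')
Channel = DimensionVar('Channel', size=3)
RGBImage = NDArrayAlias(Batch, Row, Column, Channel)
assert RGBImage.axis(Channel) == 3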

New typing concepts

Literal values

It is hard to imagine how shape typing could work well without support for recognizing literal values in types. Examples include[j][k][l]:

numpy.sum(x, axis=0)  # remove first dimension from x
numpy.zeros((N, M))  # should recognize result is two dimensional
numpy.reshape(x, (-1, 10))  # also two dimensional, but last dimension has size 10
numpy.sum(x, axis=0, keepdims=True)  # sum, but dimensions are not removed
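One could imagine a Literal[...] type form for this (a construct later standardized as typing.Literal in PEP 586). A stub-style sketch, with hypothetical overloads and the NDArray generic from above:

from typing import overload

N = DimensionVar('N')
M = DimensionVar('M')

@overload
def sum(x: NDArray[N, M], axis: Literal[0]) -> NDArray[M]: ...
@overload
def sum(x: NDArray[N, M], axis: Literal[1]) -> NDArray[N]: ...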

Dimension variables

Dimension variables with different identities should be treated as different for the purposes of type checking. Thus each of the following examples should be a typing error, even though technically they would succeed if the dimension sizes N and M match exactly:

N = DimensionVar('N')
M = DimensionVar('M')

def addition_error(matrix: NDArray[N, M], vector: NDArray[N]) -> NDArray[N, M]:  [m][n]
  return matrix + vector  # broadcasting error

def sum_error(matrix: NDArray[N, M]) -> NDArray[N]:
  return numpy.sum(matrix, axis=0)  # bad-return-type

def transpose_error(matrix: NDArray[N, M]) -> NDArray[N, M]:
  return numpy.transpose(matrix)  # bad-return-type  [o][p][q][r][s]

We should also consider supporting constraints on DimensionVar, at least to indicate fixed sizes DimensionVar('X', size=3) and possibly also non-zero sizes DimensionVar('Y', nonzero=True).

In some cases, it should not be required that exact lengths match for a particular dimension. This might be expressed with another keyword argument on DimensionVar, e.g.,

V = DimensionVar('V', fixed_size=False)

def vector_concat(arrays: List[NDArray[V]]) -> NDArray[V]:
  return numpy.concatenate(arrays)

We could potentially use colons as syntax for unnamed dimensions with unknown lengths, e.g., vector_concat(arrays: List[NDArray[:]]) -> NDArray[:].

In annotations, numeric literals might also be accepted:

def vector_magnitude(a: NDArray[3]) -> NDArray[()]: ...

These would be an alias for fully constrained dimension variables, e.g., 3 -> DimensionVar('_3', size=3).

Integer types

One way to handle most of the desired functionality of DimensionVar (aside from constraints) would be to simply allow integers as types in type variables and generics, e.g., TypeVar('X', bound=int) and NDArray[3, X].

Variadic generics

To support indexing with multiple dimensions, we will need support for variadic generics, e.g., so we can support both NDArray[A] and NDArray[A, B].
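Python eventually gained exactly this feature via PEP 646 (TypeVarTuple, Python 3.11), whose motivating use case was array shapes. A minimal sketch using that later feature (the Array class is a hypothetical stand-in):

from typing import Generic, TypeVar, TypeVarTuple, Unpack

DType = TypeVar('DType')
Shape = TypeVarTuple('Shape')

class Array(Generic[DType, Unpack[Shape]]):
  ...

# Accepts and returns an array of any dimensionality.
def add_one(x: Array[float, Unpack[Shape]]) -> Array[float, Unpack[Shape]]:
  ...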

Type arithmetic

In some cases, arithmetic on integer types would be valuable, e.g., to express an operation that converts an array with D dimensions into an array with D+1 dimensions, or concatenating vectors of size N and M into a vector of size N+M.
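Hypothetically, such signatures might be written as follows; no type checker supports this syntax, and Ndim is the generic introduced under “Syntax” below:

N = DimensionVar('N')
M = DimensionVar('M')

def concat_vectors(a: NDArray[N], b: NDArray[M]) -> NDArray[N + M]: ...

def append_dim(array: NDArray[Ndim[D]]) -> NDArray[Ndim[D + 1]]: ...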

Although very elegant, I’m not sure I would prioritize this feature. There are relatively few functions that manipulate shapes in a way that requires arithmetic. Simply matching existing sizes is enough to catch most bugs, and there is a long list of ways in which functions manipulate shapes, many of which are not even expressible with simple arithmetic.

Syntax

So far, we have focused on shapes and haven’t yet considered how to include data types as part of array types. However, this will clearly be necessary, to serve the same uses as the generic types for the built-in List.

Arrays like numpy.ndarray should probably be generic types with two type variables corresponding to data-type and shape, e.g., NDArray[DataType, ShapeType]:

  • DataType should be a valid scalar type. For NumPy, these would be subclasses of np.generic (rather than np.dtype instances), including non-instantiable classes that exist purely for organization such as np.floating and real types such as np.float64.
  • ShapeType should take on one of two possible generic types:
      • Ndim[integer] indicates an array with a fixed number of dimensions. (This might be called Rank instead.)
      • Shape[X, Y, Z] indicates an array with dimensions X, Y, and Z.

The full form with two type variables looks quite verbose, e.g., NDArray[float64, Shape[X]]. However, in practice we expect to define type aliases, either in NumPy or in user code, that allow commonly used types to be written more succinctly, e.g., for shapes:

V = TypeVar('V', variadic=True)
Float64Array = NDArray[float64, Shape[V]]
# allows writing types like Float64Array[X, Y]

And for dtypes:

D = TypeVar('D')
Array1D = NDArray[D, Ndim[1]]
# allows writing types like Array1D[float64]

Ideally, Python’s typing module would provide a generic base class that could be extended by multi-dimensional array libraries such as NumPy, TensorFlow, pytorch, etc.[t][u][v][w][x] This could include:

  1. A base class for multi-dimensional arrays, e.g., NDArray. Objects inheriting from this class would indicate to type-checkers that they follow the typing rules of multi-dimensional arrays for shapes. It should be possible to indicate both data-type generic (e.g., numpy.ndarray) and specific (e.g., IntegerNDArray) types.
  2. Abstract base classes for primitive types, e.g., float32 or int64. At the least, this should support all of the primitives supported by the array module.
  3. A generic type for objects that support the buffer protocol. For example, this would allow for typing Cython memoryview operations without requiring a NumPy import.

Colons

It is natural to use slicing syntax to indicate anonymous placeholder dimensions, e.g., NDArray[:, :] to indicate a 2D array. This would allow them to be easily mixed and matched with named dimensions, e.g., NDArray[:, X].

This syntax looks like the typing syntax already used for Cython’s memoryviews, e.g., np.float32[:, :]. And although we don’t propose supporting this for NumPy scalar objects themselves, one could easily support this using type aliases of the form suggested above.
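Slice syntax inside subscripts already parses at runtime, so supporting it only requires the class to interpret slice objects. A minimal hypothetical sketch:

class NDArray:
  def __class_getitem__(cls, item):
    # NDArray[:, X] arrives here as the tuple (slice(None), X).
    dims = item if isinstance(item, tuple) else (item,)
    anonymous = [dim == slice(None) for dim in dims]
    ...
    return cls

Matrix2D = NDArray[:, :]  # a 2D array with two anonymous dimensions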

Ellipsis

The literal ... is an alias for Python’s builtin Ellipsis. Like its role in multi-dimensional array indexing, Ellipsis indicates “zero or more full slices :” and can appear at most once in a shape type.

It is an open question whether ellipsis should be allowed to appear anywhere other than at the start of shapes. This would be convenient for some functions:

X = DimensionVar('X')

def remove_leading_axis(array: NDArray[X, ...]) -> NDArray[...]: ...

If ... appears at the start of a shape signature, the integer returned by the axis method could instead count from the end as a negative number: NDArray[..., Channel].axis(Channel) == -1.

It might be useful to support multiple ellipses with names, e.g., as found in datashape. This would probably be accomplished by creating explicit type variable objects rather than reusing the built-in Ellipsis. Note that in datashape, these named ellipsis objects follow different broadcasting rules: objects that reuse the same ellipsis are required to have exactly matching shapes.

Dimensionality versus shape

In most cases, specifying full array shapes in types is desirable. However, there are some cases where it might be difficult to specify full output shapes, and using ... to indicate broadcasting would not be appropriate either. For these cases, we might want to support indicating array dimensionality without shapes. (See examples below, under “Combining arrays.”)

This presents a minor complication for type checkers: Ndim[3] should be recognized as equivalent to Shape[:, :, :].

Special cases

Ideally, static type checking should be able to recognize each of these:

  • Creating arrays: np.array, np.linspace, np.zeros
  • Aggregation and indexing: numpy.mean(x, axis=1) or x[0, :]
  • Combining: numpy.stack, numpy.concatenate
  • Reshaping: numpy.reshape, numpy.expand_dims

The array manipulations page in the NumPy docs lists many of the core array manipulation routines. Note that although these functions are widely used, it’s much less common to write new functions of these types. This might allow some shortcuts to be made.

Creating arrays

For painless typing, types should be inferred for typical array creation routines. NumPy arrays are typically created either by coercing an argument (e.g., a Python list) or with one of several direct array creation routines.

For coercion, np.array could support type inference by adding overloads based on the argument type, e.g., for List[float],

np.array([0.5, 1.5, 2.5])  # type: NDArray[np.float64, Shape[:]]
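In stub form, such overloads might look like the following sketch, covering just two nesting depths (the Shape[:] spelling is the colon syntax from above):

@overload
def array(object: List[float]) -> NDArray[np.float64, Shape[:]]: ...
@overload
def array(object: List[List[float]]) -> NDArray[np.float64, Shape[:, :]]: ...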

In contrast, dedicated array creation routines typically indicate output dtypes and shapes with arguments, e.g.,

np.zeros((5, 10), dtype=np.float64)  # type: NDArray[np.float64, Shape[5, 10]]
np.linspace(0, 1, num=10)  # type: NDArray[np.float64, Shape[10]]

Aggregation and indexing

Indexing for arrays could potentially be special cased in type checkers, rather than adding dedicated syntax for writing new indexing operations.

However, removing or changing dimensions (used for aggregation and indexing) with an axis argument is common enough that some sort of syntax might be worthwhile to allow users to define their own functions.

Maybe something like:

D = TypeVar('D', bound=int)
S = ShapeVar('S')

def mean(array: NDArray[S], axis: D) -> NDArray[S.drop(D)]: ...

Note that syntax like S.drop(D) is novel for Python typing: there are no existing type annotations of this form.

The alternative is writing a long set of overloads for all “reasonable” integer values, e.g.,

@overload
def mean(array: NDArray[N, ...], axis: 0) -> NDArray[...]: ...

@overload
def mean(array: NDArray[N, M, ...], axis: 1) -> NDArray[N, ...]: ...

...

@overload
def mean(array: NDArray[..., N], axis: -1) -> NDArray[...]: ...

Combining arrays

Ideally, arguments to concatenate or stack could be statically checked, e.g., to verify that all dimension sizes are the same except along the indicated dimensions. It’s not yet clear to me what syntax for supporting these functions with arbitrary dimensional input could look like.

Perhaps:

def stack(arrays: Iterable[NDArray[S]], axis: D) -> NDArray[S.insert(D)]: ...

def concatenate(
    arrays: Iterable[NDArray[S]], axis: D) -> NDArray[S.replace(D, UNKNOWN)]: ...

At the very least, we should make it possible to check/annotate dimensionality for the result of these computations, e.g.,

def stack(arrays: Iterable[NDArray[Ndim[D]]]) -> NDArray[Ndim[D+1]]: ...

def concatenate(arrays: Iterable[NDArray[Ndim[D]]]) -> NDArray[Ndim[D]]: ...

Reshaping

reshape would need support for converting literal tuples into shape types.

expand_dims would need a simple way to indicate inserting a new dimension of size 1, analogous to dropping dimensions in aggregations and indexing.
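A sketch of both, reusing the hypothetical ShapeVar machinery from the aggregation example above (S.insert is as speculative as S.drop, and the reshape overload covers only the two-dimensional case):

S = ShapeVar('S')
D = TypeVar('D', bound=int)
N, M = DimensionVar('N'), DimensionVar('M')

def reshape(array: NDArray, shape: Tuple[N, M]) -> NDArray[N, M]: ...

def expand_dims(array: NDArray[S], axis: D) -> NDArray[S.insert(D, 1)]: ...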

[a]Type inference for dtypes might be a bit trickier.

[b]Yes, we should prove this out. But I'm OK if we require explicit dtype arguments for literals, e.g., np.array([1, 2, 3], dtype=np.float64).

[c]How would something like np.sum be documented? (NDArray[..., I, ...]) -> NDArray[..., ...] could work but doesn't indicate that the length of the ...'s depends on the axis argument and the rank. (NDArray[:*axis, I, ...]) -> NDArray[:*axis, ...] might (but that's probably a syntax error). And then there's negative axis to worry about.

[d]Maybe some VariableDims(axis) object is needed?

[e]Yeah, I don't have a really satisfactory answer for how to type reductions yet. There are some ideas below under "Aggregation and indexing".

[f]So the rule is that all `...` on inputs are broadcast against each other, and then the result fills in all `...` on the outputs?

[g]Yes, exactly.

[h]But it doesn't cover ufunc methods .reduce (like the axis=1 example from your motivation section!), or .outer, right?

[i]Indeed, it doesn't. reduce() will be harder -- I wrote a bit about this later.

[j]This is also important for dtype inference:

np.zeros(N, dtype=float)
np.zeros(N, dtype=np.float64)
np.zeros(N, dtype="d")
np.sum(arr, dtype=float)

[k]Yes, in general. But I'm pretty sure we could make at least np.float64 work with TypeVar and Type from typing.

[l]See https://gist.github.com/shoyer/0c2fa6c4535209df966210930849de58, which type checks with mypy

[m]It would be nice to at least have the option to just use strings, i.e., NDArray['M', 'N'] here, as the overhead of defining dimension vars in advance might make people shy away from using this whole thing. We can always allow DimensionVars for when more fanciness is needed. Also unclear to me what the "scope" of a DimensionVar is. What does it mean to define two functions that share the same DimensionVar? Is it any different from defining the DimensionVar separately for each function? Suppose a function accepts two RGBImages... how should one distinguish between two equal-sized RGBImages vs. two arbitrary RGBImages?

[n]The scope for DimensionVar should match that for TypeVar. They should basically be a version of TypeVar allowing integer values as types. I would be reluctant to introduce any sugar beyond what is already appropriate for TypeVar.

As for scope, my understanding is that nothing inherently links type variables between different functions, though instance methods of a generic class do reuse the same type variable used to define the class.

> What does it mean to define two functions that share the same DimensionVar? Is it any different from defining the DimensionVar separately for each function?

No: separate DimensionVars on separate functions are equivalent to reusing the same DimensionVar.

> Suppose a function accepts two RGBImages... how should one distinguish between two equal sized RGBImages vs two arbitrary RGBImages?

To require the same sizes, you would reuse the type variables, e.g.,

def f(x: RGBImage, y: RGBImage): ...

To allow different sizes, you would need to create new type variables / generic types, e.g.,

def f(x: RGBImage, y: RGBImage2): ...

where RGBImage2 and the DimensionVars that it uses are defined like RGBImage and its DimensionVars.

[o]My favorite numpy function is einsum.  That might be a good extreme to attempt to include in the type system.  Or it might be too complex

[p]I have a really hard time seeing how we could possibly type einsum, unless we relied on a custom type-checking extension with code specific to the np.einsum function.

[q]Sadly numpy.transpose is another operation whose type can't be written using the type system so far... (it takes an array with shape S and returns an array with shape reversed(S).)

[r]Yes, that's true in full generality. Though we could write explicit overloads to constrain the type for low dimensional inputs (e.g., 0d, 1d, 2d), which would cover the majority of uses.

[s]Perhaps NDArray[...] -> NDArray[...[::-1]] could work.

[t]The idea of putting this into 'typing' makes me nervous, given the release cycle etc. (This is going to be quite complicated given that the type checkers will also need to be iterating on whatever we come up with, but at least they're not on the CPython release cycle, plus we might be working with plugins or something at first.)

[u]I agree with Nathaniel, typing has been limited by being in the standard library when wishing to modify APIs. Also, it would be quite strange to me for numpy-centric type concepts to be in the standard library without numpy itself being in the standard library.

[v]My ideal would be if we could implement this as a plugin in mypy; even more ideal would be if the plugin system were codified as part of PEP 484 so plugins could be shared with pytype as well.

+shoyer@gmail.com looks like you already commented on https://github.com/python/mypy/issues/3540 so I'm guessing you're well versed in the plugin system already :D.

[w]I'm interested in codifying how you *write* shape typing in a standard way, so that any type checking library (pytype, mypy, PyCharm) can build in the capability of checking any ndarray type (NumPy, TensorFlow, pytorch, etc.).

[x]Agreed - even without any type checking, it would be very useful just to have a standard way of documenting functions that accept and return arrays.