1 of 13

What did the GraphBLAS get wrong?

John R. Gilbert

UC Santa Barbara

HPEC GraphBLAS BoF

September 20, 2022

2 of 13

Ben to John:

As the grandfather of GraphBLAS, would you speak on �“What the GraphBLAS Got Wrong” at the BoF?

John to Ben:

What, you’re asking me to pick on my grandkids in public?

3 of 13

Ben to John:

As the grandfather of GraphBLAS, would you speak on �“What the GraphBLAS Got Wrong” at the BoF?

John to Ben:

What, you’re asking me to pick on my grandkids in public?

Oh, all right ….

4 of 13

I asked a few of my best friends ….

Thanks to David Bader, Tim Davis, Joe Eaton, Oded Green, Jeremy Kepner, Jim Kitchen, Andrew Lumsdaine, Gabor Szarnyas, Erik Welch, Albert-Jan Yzelman

I won’t tell you now everything they said.

(I promised Ben to be provocative and brief.)

But, I’ll post all the responses somewhere later (NOW HERE).

Feel free to chime in on the discussion!

5 of 13

Too limited in scope

Semiring linear algebra is powerful, but it doesn’t do everything.

Visitor primitives, map-reduce primitives, traversal primitives, path primitives.

“GraphBLAS are not rich enough to represent the graphs and computations used by real analysts.”

“Graph pattern matching is poorly supported.”

“Depth-first search can't be implemented efficiently.”

From a 2010 funding proposal for “Graph BLAS” (not related to the GraphBLAS Forum):

  • "Algebraic primitives map less well to priority-directed traversal algorithms. We propose to include at least DfsVisitor, BfsVisitor, and UcsVisitor among our standard primitives."

  • "An important part of our research will be to continually evaluate the completeness of our chosen primitive set."

6 of 13

Too limited in scope

Semiring linear algebra is powerful, but it doesn’t do everything.

Visitor primitives, map-reduce primitives, traversal primitives, path primitives.

“GraphBLAS are not rich enough to represent the graphs and computations used by real analysts.”

“Graph pattern matching is poorly supported.”

“Depth-first search can't be implemented efficiently.”

From a 2010 funding proposal for “Graph BLAS” (not related to the GraphBLAS Forum):

  • “Algebraic primitives map less well to priority-directed traversal algorithms. We propose to include at least DfsVisitor, BfsVisitor, and UcsVisitor among our standard primitives.”

  • “An important part of our research will be to continually evaluate the completeness of our chosen primitive set.”

7 of 13

Too complicated to implement

  • “It is a large API that is challenging to implement, which significantly limits its portability.”

  • E.g., currently only one highly performant implementation

  • E.g., slow pace of GPU version of SuiteSparse:GraphBLAS (despite lots of smart people working on it)

  • Will we have timely implementations for emerging architectures (Lucata, Cerebras, PIUMA, …) and for distributed memory?

8 of 13

Interoperability isn’t a high enough priority

GraphBLAS alone isn’t enough for solving all problems.

  • Too hard to mix GraphBLAS calls with calls to other libraries.

  • Too hard to use GraphBLAS with user data structures and code �in existing packages.

  • GraphBLAS tends to assume it’s in overall control (like Matlab).

  • Badly hurts uptake in broad user communities:�

“As a community we assumed that with GraphBLAS tools, we would have numerous developers join the community and develop algorithms. �That hasn’t proven to be right.”

9 of 13

GraphBLAS needs a robust standard graph library

  • “LAGraph is a good starting point, but seems to have stalled in recent years. If any company (Intel, Nvidia, etc.) is serious about GraphBLAS, they should fund the development of this core algorithm library, along with Python-based, notebook-friendly implementations that allow easy visualization.”

  • Linear algebra is the “magical middleware” in scientific computing �not just because of a math spec and standard primitives. There’s also LAPACK, ScaLAPACK, SuperLU; MATLAB, Scipy/Jupyter; and much more.

10 of 13

Connection with numerical linear algebra should have been a primary consideration from day one

  • Numerical matrix computation has been using large-scale graph algorithms forever (partitioning, symbolic factorization, coarsening).
  • Lots of graph algorithms use numerical linear algebra (spectral methods, Laplacian max flow / min cut, …).
  • The numerical software community is large, experienced, influential.
  • We should have made it a goal for multigrid solver libraries to adopt GraphBLAS for their internals.
  • One small success story: MATLAB’s sparse matrix arithmetic.

11 of 13

Parallelism should have been a primary consideration from day one

  • “Not considering parallelization from the get-go, or not thoroughly enough--not in the API design, nor in costing.”

  • “GraphBLAS is missing the concept of distributed computation, which really limits it to a single node today.”

12 of 13

No introspection, hints, performance tools

  • Query: type, memory of a matrix; monoid of a semiring; size of a type; types for operator args; name/version of both the current GraphBLAS library and the one that created a serialized blob.

  • Ask if a matrix/vector/scalar would be changed by GrB_wait.

  • Implementation-defined hints to the library to specify formats or algorithms.

  • E.g, no good way to express push-pull optimization to let the implementation handle it: compute semiring1(AT @ x) or semiring2(x @ A) �(where semiring1 commutes to semiring2)

13 of 13

Responses to the survey, by category

  • Detailed C API issues: 32
  • Missing particular operations or primitives: 13
  • Missing categories of capabilities: 9
  • Language choice or support: 2
  • User can’t query / choose representations and algorithms: 9
  • Lack of performance definitions�and tools: 5
  • Obstacles to operator fusion, parallelism, performance: 4
  • Integration with numerical computing (linear algebra, statistics, ...): 4
  • Lack of connection to or uptake in broad user communities: 5
  • Too hard to interoperate with other graph libraries: 9
  • Capabilities too limited to be broadly useful: 4
  • Lack of breadth in implementation community: 2
  • Complicated to implement / use: 5
  • Lack of tutorials, examples, docs: 2
  • Overall approach is wrong: 1
  • The name (“Sounds like an illness”): 1