1 of 28

A Visual Functional Programming Language for Scientific Computing

Pete Vilter

CS Senior Thesis, University of Chicago

Advisor: Borja Sotomayor

Unofficial Advisors: Ravi Chugh, Mike Wilde

https://github.com/vilterp/thesis

2 of 28

3 of 28

TOC

  • Why?
  • What
    • Demo
    • Graph syntax / Semantics
    • Codegen
    • Progress Monitoring
  • Future Work
    • Swift
    • Typechecking
    • Files
  • Architecture / Implementation

4 of 28

Why a new visual PL?

  • Scientific workflows are functions:
    • data in, data out; not specifying interactivity
    • Relatively simple
  • A graphical repr is good for:
    • Easy editing (no syntax errors)
    • Easy progress monitoring
  • Existing solutions are limited
    • List processing, subroutines awkward
    • Contribution: visual syntax based on FP
  • Non-goal: support super-complex code
    • Textual code is more suitable past certain complexity threshold, because more dense

5 of 28

Stepping stone to:

  • Compilation to Swift:
  • distributed parallel execution on cluster
  • Currently generates Python code which runs on local machine

6 of 28

Language Model

Editor

Graph functions

Python

functions

Runs

7 of 28

Language Model: Editing Python

8 of 28

Run it!

  • And then aspects:
    • Graph "syntax"
    • Code generation
    • Progress monitoring

9 of 28

Graph syntax

  • Analogy:
    • valid token sequence can be invalid syntax
    • some edges in graph can be invalid
  • No cycles
  • No edges out of lambdas
  • Function usages as value vs. application are mutually exclusive
    • Design decision: function and application nodes are not separate

10 of 28

Code generation

  • Generate single Python file from whole module
  • Graphs: topological sort on application nodes
  • Each node becomes a function call
  • All functions take and return dictionaries of values

11 of 28

Generated code:

def main():

ap0_names = log_call(names, "ap0_names", {})

ap0_names_names = ap0_names["names"]

def lambda0(lambda0_ap1_add_excl_str):

lambda0_ap1_add_excl = log_call(add_excl, "ap1_add_excl", {"str":\� lambda0_ap1_add_excl_str})

lambda0_ap1_add_excl_added = lambda0_ap1_add_excl["added"]

lambda0_ap2_add_hi = log_call(add_hi, "ap2_add_hi", {"str": \

lambda0_ap1_add_excl_added})

lambda0_ap2_add_hi_added = lambda0_ap2_add_hi["added"]

return {"lambda0_ap2_add_hi_added": lambda0_ap2_add_hi_added}

ap3_map = log_call(Map, "ap3_map", {"function": lambda0, "list": ap0_names_names})

ap3_map_mappedList = ap3_map["mappedList"]

ap4_space_join = log_call(space_join, "ap4_space_join", {"list":\

ap3_map_mappedList})

ap4_space_join_joinedList = ap4_space_join["joinedList"]

return {"ap4_space_join_joinedList": ap4_space_join_joinedList}

12 of 28

Example: Word count

Spark:

text_file.flatMap(lambda line: line.split())

.map(lambda word: (word, 1))

.reduceByKey(lambda a, b: a+b)

13 of 28

Progress Monitoring

  • Server sends updates on function call & return
  • Client builds call tree������

main

names

Map

space_join

<lambda>

add_excl

add_hi

<lambda>

add_excl

add_hi

14 of 28

Progress Monitoring: More

  • Supports parallel execution (call tree w/ mult. stacks)
  • Argument & return�values sent to editor�as soon as are known
  • (can be inspected�at runtime)��
  • Generated code:
    • log_call(func, ap_site, args)

names

Map

space_join

<lambda>

add_excl

add_hi

<lambda>

add_excl

add_hi

main

15 of 28

Progress Viz: Future Work

  • Inspect nodes up call tree
  • Progress bar for list HOFs
    • Map, Filter, Fold
  • Machine-level stats
    • CPU
    • Memory
    • Disk
    • open shell
    • etc������

names

Map

space_join

<lambda>

add_excl

add_hi

<lambda>

add_excl

add_hi

main

16 of 28

Future work: files

  • Upload
  • Insert as values onto canvas
  • Note: functions cannot be run without all args present
  • Use directories as lists
  • Get files from network

17 of 28

Future work: Swift

  • Wrap binaries
  • Parallel execution
  • Distributed execution
  • Supercomputer and cloud platforms

18 of 28

Future work: typechecking

  • Hindley-Milner for Graphs

String → String

List String

ab

List a

List b

(ab) → List a → List b

19 of 28

Future work: typechecking

  • Hindley-Milner for Graphs

List String

String → b

List b

20 of 28

Future work: typechecking

  • Hindley-Milner for Graphs

String

List String

List String

String → String

21 of 28

Future work: typechecking

  • Prevent invalid edges

22 of 28

Future work: typechecking

  • Prevent invalid edges

23 of 28

Future work: typechecking

  • Mark edges rendered invalid by external changes

ages

ages

24 of 28

Implementation: Basics

  • Editor: 100% Elm
    • Paradigm: Functional Reactive Programming
    • Disciplined state management invaluable
    • Typechecking invaluable
  • Metrics
    • ~1300 lines elm-diagrams library
    • ~2700 lines project-specific
    • + Elm stdlib (data structures, wrappers of browser APIs)
    • Compiles to JavaScript
      • 21k lines, much of which is Elm stdlib
  • Backend: node.js server
    • Serves code
    • Executes python subprocess, streams results back to editor

25 of 28

Implementation: Time

  • Vast majority: Graph Editor
  • Why? No graphics lib with abstractions for
    • relative layout
    • absolute layout
    • absolute layout derived from relative
  • Made my own

26 of 28

Diagrams

relative

← alignment →

absolute, derived from relative

absolute

27 of 28

Diagrams

Canvas

SVG

HTML

Diagrams

Absolute

Good

Good

Janky

Good

Relative

DIY

DIY

Good

Good

Alignment

DIY

DIY

Confusing

Good

Abs. Derived from Relative

DIY

DIY

Not possible in FRP (layout is side effect)

Good

28 of 28

Diagrams

  • Functional combinators for relative & absolute positioning, inspired by Haskell lib (same name)
  • Query display trees for:
    • Coordinates of laid-out subtree
    • What subtree mouse is over
  • Rough graph editor code metrics

Filename

Role

LOC

Actions.elm

Reactions to mouse input

120

Model.elm

State data structures & algos

197

Styles.elm

What color are things?

54

View.elm

Layout; attaching actions

285

Total (not counting some outside G.E.)

656