1 of 10

Lucas Durand, TD Securities

Understanding People with Network Graphs in Python

2 of 10

Agenda

  1. Introduction/About Your Topic
  2. Creating the Graph
  3. Clustering Attributed Graphs
  4. A Peer Finder App
  5. Conclusion

3 of 10

Intro/About Your Topic

Who is this?

  • Name: Lucas Durand
  • Pronouns: he/him/his
  • Background:
    • Theoretical Physics
    • Quant Developer
    • Data Science / Software Engineering
  • Instrument of choice: Sax
  • Language of choice: Python

4 of 10

Intro/About Your Topic

Problem Statement: Where are my people?

You are a new employee at a very big company. It's your first day and you have all the tools you need for success:

  • a laptop
  • a link to a wiki page called Onboarding Instructions
  • a homepage that opens to the Internal Social Tool, where you can see (a bit of) The Org Chart

Social Tool Gives Us:

But It's Not Enough!

  • Same Manager shows us our First Team, a good starting point! Exploring beyond this is tedious
  • Regional teams align differently than project/platform teams
  • Agile Pods align by job family or capability, not project
  • Niche roles are often alone on a team (e.g. embedded data analysts)

What's "The Org Chart"+?

5 of 10

Intro/About Your Topic

What are Network Graphs?

Graph: a mathematical structure composed of a set of objects in which pairs of objects are related in some way.

Node: an object in the graph, also known as a vertex or point. Nodes contain attributes.

Edge: a connection between two nodes. Edges can be directed or undirected and can have a weight.
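
As a minimal sketch of these three definitions, assuming nothing beyond the standard library: the `Graph` class, its method names, and the sample people are hypothetical and only illustrate nodes with attributes and directed, weighted edges.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """A tiny directed graph: nodes carry attributes, edges carry a weight."""
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: dict = field(default_factory=dict)  # (source, target) -> weight

    def add_node(self, node_id, **attributes):
        # Create the node if needed, then merge in any new attributes.
        self.nodes.setdefault(node_id, {}).update(attributes)

    def add_edge(self, source, target, weight=1.0):
        # Adding an edge implicitly creates any missing endpoint.
        self.add_node(source)
        self.add_node(target)
        self.edges[(source, target)] = weight

    def neighbors(self, node_id):
        """Nodes reachable by following one outgoing edge."""
        return [t for (s, t) in self.edges if s == node_id]


g = Graph()
g.add_node("ana", role="manager")
g.add_node("ben", role="analyst")
g.add_edge("ana", "ben", weight=1.0)  # directed: ana -> ben

print(g.nodes["ben"])      # attributes stored on the node
print(g.neighbors("ana"))  # follows the edge direction
print(g.neighbors("ben"))  # ben has no outgoing edges
```

An undirected graph could be modelled the same way by storing each edge in both directions.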

6 of 10

Creating the Graph

Nodes, Edges, Visualize!

  • Given a good data source of people at TD and their reporting structure, we create a Simple Directed Graph
    • Nodes are people
    • Edges are "manages", representing the org reporting structure
  • This is a pretty good Org Chart!
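
A rough sketch of this step, assuming the data source can be reduced to (employee, manager) pairs. The records, names, and helper functions below are hypothetical, and the slides do not say which library or schema was actually used.

```python
from collections import defaultdict

# Hypothetical records: (employee, manager); None marks the top of the org.
reporting = [
    ("dana", None),
    ("eli", "dana"),
    ("fay", "dana"),
    ("gus", "eli"),
]


def build_org_graph(records):
    """Return (nodes, manages) where manages[m] lists m's direct reports."""
    nodes = set()
    manages = defaultdict(list)
    for employee, manager in records:
        nodes.add(employee)
        if manager is not None:
            nodes.add(manager)
            manages[manager].append(employee)  # edge: manager -> employee
    return nodes, dict(manages)


def render(manages, root, depth=0):
    """Yield one indented line per person, walking the 'manages' edges."""
    yield "  " * depth + root
    for report in manages.get(root, []):
        yield from render(manages, report, depth + 1)


nodes, manages = build_org_graph(reporting)

# The root of the chart is whoever is not managed by anyone.
managed = {e for reports in manages.values() for e in reports}
roots = sorted(nodes - managed)

chart = "\n".join(render(manages, roots[0]))
print(chart)
```

Printing the graph as an indented tree is only a stand-in for the visualization step; a plotting or graph library would normally draw it.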

7 of 10

Clustering Attributed Graphs

Attribute-Nodes, Attribute-Edges, a Distance Metric

  • To find "peers" in a graph we usually follow the edges to determine the shortest path between nodes.
  • We only have edges for reporting structure, so how do we incorporate people's attributes (e.g. "uses Python" or "worked on Project X")?
  • There are a few ways to do this; the prevailing wisdom is to add "Attribute Nodes" and connect each person to the attributes they have
  • Now we can use an embedding method to fit our graph into a space and calculate distance!
  • Reference: Bothorel, C., Cruz, J., Magnani, M., & Micenková, B. (2015). Clustering attributed graphs: Models, measures and methods. Network Science, 3(3), 408-444. doi:10.1017/nws.2015.9
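
To illustrate the attribute-node idea, here is a small sketch under stated assumptions: the people, attributes, and function names are hypothetical, edges are treated as undirected for path finding, and a plain breadth-first search stands in for whatever shortest-path routine the real tool uses.

```python
from collections import defaultdict, deque

# Hypothetical data: two separate reporting lines and a few attributes.
manages = [("dana", "eli"), ("dana", "fay"), ("hal", "ivy")]
attributes = {
    "eli": ["uses Python"],
    "ivy": ["uses Python", "worked on Project X"],
    "fay": ["worked on Project X"],
}


def build_adjacency(manages, attributes):
    """Undirected adjacency over person nodes and attribute nodes."""
    adjacency = defaultdict(set)

    def connect(a, b):
        adjacency[a].add(b)
        adjacency[b].add(a)

    for manager, report in manages:
        connect(("person", manager), ("person", report))
    # Each attribute becomes its own node, linked to everyone who has it.
    for person, attrs in attributes.items():
        for attr in attrs:
            connect(("person", person), ("attribute", attr))
    return adjacency


def shortest_path_length(adjacency, start, goal):
    """Breadth-first search hop count, or None if goal is unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


eli, ivy = ("person", "eli"), ("person", "ivy")

reporting_only = build_adjacency(manages, {})
with_attributes = build_adjacency(manages, attributes)

print(shortest_path_length(reporting_only, eli, ivy))   # no path at all
print(shortest_path_length(with_attributes, eli, ivy))  # linked via an attribute
```

With reporting edges alone, the two people sit in disconnected trees; the shared attribute node gives them a short path.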

8 of 10

A Peer Finder App

Expose our tool to the masses

  • We employ a force-directed layout to treat edges like springs pulling nodes together
  • Then find the distance between each pair of nodes in the 2D embedding space
  • Visualize your closest peers and see how you're connected!
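
The slides do not show the layout code, so the following is only a sketch: a simplified Fruchterman-Reingold style spring layout written with the standard library, followed by a Euclidean nearest-peer lookup. All function names, parameters, and sample people are hypothetical. In practice a graph library's layout routine (for example NetworkX's `spring_layout`) would likely replace the hand-written loop.

```python
import math
import random


def spring_layout(nodes, edges, iterations=200, seed=0):
    """Simplified force-directed layout: returns {node: (x, y)}."""
    rng = random.Random(seed)  # seeded so the layout is repeatable
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # preferred edge length
    temperature = 0.1                # caps how far a node moves per step
    cooling = temperature / (iterations + 1)

    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Every pair of nodes repels, which keeps unrelated people apart.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                push = k * k / dist
                force[a][0] += dx / dist * push
                force[a][1] += dy / dist * push
                force[b][0] -= dx / dist * push
                force[b][1] -= dy / dist * push
        # Each edge acts like a spring pulling its endpoints together.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            pull = dist * dist / k
            force[a][0] -= dx / dist * pull
            force[a][1] -= dy / dist * pull
            force[b][0] += dx / dist * pull
            force[b][1] += dy / dist * pull
        # Move each node along its net force, limited by the temperature.
        for n in nodes:
            size = max(math.hypot(force[n][0], force[n][1]), 1e-9)
            step = min(size, temperature)
            pos[n][0] += force[n][0] / size * step
            pos[n][1] += force[n][1] / size * step
        temperature -= cooling
    return {n: (x, y) for n, (x, y) in pos.items()}


def distance(pos, a, b):
    """Euclidean distance between two nodes in the 2D layout."""
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])


def closest_peers(pos, person, count=3):
    """The `count` nodes nearest to `person`, nearest first."""
    others = [n for n in pos if n != person]
    return sorted(others, key=lambda n: distance(pos, person, n))[:count]


# Hypothetical graph: two tight groups joined by a single edge.
people = ["ana", "ben", "cat", "dev", "eve", "fox"]
links = [("ana", "ben"), ("ben", "cat"), ("ana", "cat"),
         ("cat", "dev"),
         ("dev", "eve"), ("eve", "fox"), ("dev", "fox")]

layout = spring_layout(people, links)
peers = closest_peers(layout, "ana", count=2)
print(peers)
```

The all-pairs repulsion loop is quadratic in the number of nodes, so an org-sized graph would need a library implementation or an approximation.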

9 of 10

Appendix

Graphs are messy

  • Our nice polished graph appears to have more dimensionality than it really does, because we apply force-directed calculations twice: once to filter the data, then once on the remaining points
  • In reality the spring layout looks like this (on the right)
  • This makes a lot more sense given the medium dimensionality of the data: people are pulled in one of X directions with equal force

10 of 10

Appendix

More Graphs