1 of 32

WPI Greenboard

Noah Cyr, Weaver Goldman, Surbhi Kapoor, Willem van Oosterum

Final Presentation

CS 542

2 of 32

Meet the Team

Willem van Oosterum

Data Lead

Weaver Goldman

Frontend Lead & DevOps Lead

Surbhi Kapoor

Product Lead

Noah Cyr

Backend Lead

3 of 32

Project Overview

4 of 32

Application Capabilities

  • Display anonymized leaderboards of students and majors with top carbon emissions
    • Be able to filter students by major
    • Visualize as a bar chart
  • Display a timeline of emissions and provide statistics on the packages from that timeframe
  • Display detailed package history and emissions breakdown for a selected student
  • Display and visualize emission data for different service types
    • Filter students by major and class year
    • Visualize as a bar chart
  • A feature that allows users to add packages to their profile via its tracking number and other information
  • Utilize package carrier APIs when available to provide additional information about a package

5 of 32

Target Users

  • WPI Students - Primary users of the application
    • View personal package details & carbon footprint
    • Compare habits on anonymized leaderboard
    • Receive recommendations for reducing emissions
  • WPI Faculty
    • View aggregated data across majors
    • Use insights in class discussions, research, or sustainability initiatives
  • WPI Mailroom Workers
    • View mailroom trends
  • WPI Administrators
    • See global emission trends
    • Implement policies to encourage reduced emissions

6 of 32

Development Approach

  • We took an Agile-esque approach to the design and development of our system
  • Weekly meetings where’d we work as a team and assigned tasks
  • Branches for different purposes. Ended up having 21 PRs
    • We often reviewed each other’s work

7 of 32

Tracking API Details

  • Two functional and free APIs - FedEx and UPS
    • Get real weight and distance information
    • Get carrier service type to determine emission factor (based on mode of transport)
    • Geocode or use cached location information
    • Use emission factor (from research) values to determine final emission
  • Built out synthetic data script for Amazon
  • All others were minority number of packages with carriers that did not provide open APIs

8 of 32

Emissions Calculations

9 of 32

Database Details

Data Sources:

  • Data Sourcing
    • Primary data is sourced from QTrak
    • Transaction reports are extracted in weekly batches.
    • Each record captures a package event: Stored, Routed, or Delivered.
  • Anonymization & Privacy
    • Remove all Personally Identifiable Information (PII)
    • Hash student box numbers for anonymous tracking
  • Carrier APIs
    • Query APIs to add external data:
    • Origin, dimensions, weight, transit history

10 of 32

Entity Relationship Diagram

11 of 32

UML Diagram

12 of 32

Final Architecture

  • Front-end
    • UI
    • Framework: Streamlit
  • Back-end
    • Database communication
    • Additional computations
    • Framework: FastAPI
  • Database
    • PostgreSQL

13 of 32

Leaderboard Workflow

14 of 32

Query Analysis

Our most popular query:

  • Get Student Leaderboard

Join options considered:

Computed on a few days of data

  • Unclustered Index Join
    • 476 I/Os
  • Hash Join
    • 27 I/Os

We see reasonable agreement with the engine’s chosen plan

15 of 32

Query Timing

  • Student Leaderboard
    • Runtime: 0.0147s
  • Major Leaderboard
    • Runtime: 0.0075s
  • Full package history for a specific person
    • Runtime: 0.0196s
  • Individual carbon emissions timeline
    • Runtime: 0.0017s

  • Carbon emissions timeline for a major
    • Runtime: 0.0097s
  • Carbon emissions timeline for all people
    • Runtime: 0.0281s
  • Display emissions calculations
    • Runtime: 0.0096s
  • Statistics on shipping service types
    • Runtime: 0.0172s

16 of 32

Alternative Emissions Format Trigger

We automatically compute equivalent trees planted and equivalent miles driven for each package.

17 of 32

Person Carrier Stats View

  • Computes carrier (UPS, Amazon, …) statistics for each person

  • Gracefully handles null values

18 of 32

Leaderboard Page

19 of 32

Leaderboard Page (Bar Chart View)

20 of 32

Single Person View

21 of 32

Timeline View

22 of 32

Add Package - Tracking# Lookup

23 of 32

Add Package - Emission Calculator

24 of 32

Service Types (Table View)

25 of 32

Service Types (Bar Chart View and Controls)

26 of 32

Demo

27 of 32

Add Package Demo

28 of 32

Service Types Page Demo

29 of 32

Issues

  • Incorrect ERD (pointed out during the proposal - now fixed)
  • Iterating with Streamlit was slow until adjusting the Docker setup
  • Limited access to USPS and Amazon data: restricted completeness of package coverage
  • Overall data inconsistencies between the different carriers
  • Timeline view can be misleading because most people do not have packages delivered everyday

30 of 32

Lessons Learned

  • Real-world implementation is far more complex than theory
    • Turning classroom ideas into working features required deeper problem-solving
  • Full-stack development significantly expanded our technical skills
    • Gained experience with PostgreSQL, Docker, FastAPI, SQLModel, and Streamlit
  • Choosing the right tools enabled fast iteration and efficient progress
    • Selecting frameworks aligned with our experience made development more efficient
  • Collaboration strengthened our problem-solving and system-level thinking

31 of 32

Future Work

  • Leaderboard Query Caching
    • Cron jobs
  • Enhanced Data Sources
    • More data sources
    • Automated data pipelines
  • Authentication
    • WPI SSO

32 of 32

Thanks!

Noah Cyr, Weaver Goldman, Surbhi Kapoor, Willem van Oosterum

CS 542