1 of 10

CityScale Hack Team Members:

Thomas Marconi, Andrew Mahorner, & Schmidt Joseph

Scale Your Needs

2 of 10

Motivation & Problem Statement

We have data. What do we do with it?

Problem Case 2 (City and Tract-level Affordability Indexes)

What’s the problem?

Disparate, complex dashboarding tools leave users who are not data literate without actionable insights
Policymakers, real estate investors, and nonprofits have no easy way to view census-tract data or interact with it to explore possible future scenarios

Why did we choose this?

Democratizing data insights lets key stakeholders, such as policymakers, understand their city’s needs
Data-driven solutions are key to making a positive impact in the lives of real people

3 of 10

Solution Summary

One-stop shop for policymakers to understand their city’s at-risk areas

Purpose: Generates a PCA-based Housing Affordability & Risk Index across tracts in 25 major U.S. cities
Features:

Interactive heatmap dashboard
Integrated filters for targeted analysis
Chatbot for natural language data analysis

Impact & Use Cases:

Policymakers: Direct funding, rental aid, and development near transit
Real Estate Planners: Identify locations for highest social and economic impact
Nonprofits: Focus outreach where communities need it most

4 of 10

5 of 10

Data Description and APIs

Description/key stats summarizing data

How much (depth & width)?

Used a curated feature set

Reduced from 178 total features to ~40

Data pipelines/transformations description

Structured the data into a SQLite database

Retrieve a feature for a tract
Retrieve all features for tract
Retrieve all tracts for a city

Added tract geometry from U.S. Census Bureau (USCB) shapefiles
Assumptions:

Data is coherent
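
The three retrieval operations listed for the SQLite database can be sketched as below. This is a minimal illustration, not the team's actual schema: the table name `tract_features`, its long-format columns (`tract_id`, `city`, `feature`, `value`), and the function names are all assumptions made for the example.

```python
import sqlite3

# Hypothetical long-format schema: one row per (tract, feature) pair.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tract_features (
    tract_id TEXT NOT NULL,
    city     TEXT NOT NULL,
    feature  TEXT NOT NULL,
    value    REAL,
    PRIMARY KEY (tract_id, feature)
);
"""


def get_feature(conn, tract_id, feature):
    """Retrieve a single feature value for a tract (None if absent)."""
    row = conn.execute(
        "SELECT value FROM tract_features WHERE tract_id = ? AND feature = ?",
        (tract_id, feature),
    ).fetchone()
    return None if row is None else row[0]


def get_all_features(conn, tract_id):
    """Retrieve all features for a tract as a {feature: value} dict."""
    rows = conn.execute(
        "SELECT feature, value FROM tract_features WHERE tract_id = ?",
        (tract_id,),
    ).fetchall()
    return dict(rows)


def get_tracts_for_city(conn, city):
    """Retrieve the IDs of all tracts stored for a city."""
    rows = conn.execute(
        "SELECT DISTINCT tract_id FROM tract_features "
        "WHERE city = ? ORDER BY tract_id",
        (city,),
    ).fetchall()
    return [r[0] for r in rows]
```

A long-format table keeps the schema stable as the curated feature set changes; a wide table with one column per feature would work equally well for a fixed set of ~40 features.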

CityScale’s Architecture

6 of 10

Method Description

Primary Technique: Unsupervised Dimensionality Reduction (PCA)

Transforms 20+ housing metrics into a single affordability risk score (0-100)

Why PCA?

No labeled training data needed
Handles correlated features naturally

Steps to generate the PCA-based index:

1. Fit PCA on the entire data set
2. Label the resulting principal components (PCs)
3. Standardize (z-score)
4. Create composite score
5. Redistribute on 0-100 scale
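
The five steps above can be sketched with scikit-learn as follows. This is one plausible reading, not the team's exact pipeline. Assumptions made for the example: the input features are standardized before PCA, the component scores are z-scored, the composite weights each component by its explained-variance ratio, and the 0-100 redistribution is a min-max rescale. The function name `pca_risk_index` and the `signs` parameter are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def pca_risk_index(X, n_components=3, signs=None):
    """Return a 0-100 composite score per row of X (tracts x features).

    signs: optional array of +1/-1 per component, chosen by hand after
    labeling the PCs so that "higher" means "more risk" (assumption:
    PCA component signs are arbitrary, so orientation is a manual step).
    """
    # Assumption: features are standardized so no metric dominates by scale.
    X_std = StandardScaler().fit_transform(X)

    # Step 1: fit PCA on the entire data set.
    pca = PCA(n_components=n_components)
    scores = pca.fit_transform(X_std)

    # Step 2 (labeling) is manual; here it only supplies the orientation.
    if signs is not None:
        scores = scores * np.asarray(signs, dtype=float)

    # Step 3: z-score each component's scores.
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0)

    # Step 4: composite score, weighted by explained variance (assumption).
    weights = pca.explained_variance_ratio_
    weights = weights / weights.sum()
    composite = z @ weights

    # Step 5: min-max rescale to 0-100 (assumption; a percentile rank
    # would be another reasonable reading of "redistribute").
    lo, hi = composite.min(), composite.max()
    return 100.0 * (composite - lo) / (hi - lo)
```

Min-max scaling is sensitive to outlier tracts, so a percentile-based redistribution may spread scores more evenly across the 0-100 range.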

Tech Stack:

scikit-learn, pandas/NumPy, Next.js, FastAPI, Leaflet, SQLite

7 of 10

Performance Results

High-risk tracts aligned with eviction rates

Validation:

Compared CityScale’s tract-level risk scores with eviction rates from Eviction Lab (Princeton)

Method:

Classified results into Low, Moderate, and High risk buckets and analyzed via a confusion matrix

Findings:

As real-world risk (rows) increases, predicted scores (columns) trend higher, showing strong directional alignment

Note:

Model outputs are more conservative, incorporating broader socioeconomic factors beyond eviction rates

Confusion Matrix
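
The bucket-and-compare validation described on this slide can be sketched as follows. The tercile cutoffs used to form the Low, Moderate, and High buckets are an assumption for illustration, since the slide does not state the thresholds; the function names are hypothetical. Rows are the eviction-rate buckets and columns are the risk-score buckets, matching the orientation described in the findings.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

LABELS = ["Low", "Moderate", "High"]


def to_buckets(values):
    """Split values into Low/Moderate/High by terciles (assumed cutoffs)."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.quantile(values, [1 / 3, 2 / 3])
    # digitize returns 0 below lo, 1 in [lo, hi), 2 at or above hi.
    codes = np.digitize(values, [lo, hi])
    return np.array(LABELS)[codes]


def risk_confusion(eviction_rates, risk_scores):
    """Rows: eviction-rate bucket (real-world); columns: risk-score bucket."""
    return confusion_matrix(
        to_buckets(eviction_rates),
        to_buckets(risk_scores),
        labels=LABELS,
    )
```

Mass concentrated on or near the diagonal indicates that tracts with higher eviction rates also receive higher risk scores; mass below the diagonal would correspond to scores that are more conservative than the observed rates.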

8 of 10

Our Challenges

Lack of Domain Knowledge

Reliance on mentors and SMEs

Feature Set too complex

Trimming down the feature set to create a model that accurately predicted outcomes

Missing tract coordinates/shapes

Early Chatbot hallucinations

9 of 10

Demo

10 of 10

Implications

Bridging the Problem → Solution

Problems:

Disparate data dashboards
No easy way to understand our data and drive policy
What do our people need?

Solution:

Interactive heatmap-based dashboard with filters and a natural-language analytics layer

Implications

One-stop shop for understanding what is happening in the city, for users with or without data literacy
Data-driven decisions that help real people

Future Steps

Fine-tune the housing index model with the help of PhD-level SMEs
Improve the chatbot to refine its responses and produce more valuable insights about the data
Deploy to cloud infrastructure to allow for widespread use

Scale to users of all backgrounds; for example, a college graduate identifying the best areas to live to advance their career goals

Partner with city staff, survey residents in selected tracts, and visit neighborhood sites to gather firsthand findings that refine the data set
Build an app version tailored to homeowners and renters so they can see which areas of their city are affordable