1 of 47

Follow the Money and Data: Mapping NYC’s Unregulated Surveillance Economy to Expand Local Government Solutions

Sponsor: Surveillance Resistance Lab

Student Team: Chaofan Zheng (cs2758), Xueying Xiao (xx781), Vivian Zhao (zz4099), Shanshan Xie (sx2230)

2 of 47

Contents

  1. Context
  2. Data & Methodology
  3. Analysis of Filtered Datasets
  4. Transparency
  5. Combining Analysis
  6. Conclusion

3 of 47

Context

Background

Literature Review

SPEX & POST Act

Project Scope

1

4 of 47

  • The NYPD is by far the biggest and most expensive police department in the world
  • In 2020, the NYPD’s budget ranked third among all agencies in NYC. (Citizens Budget Commission, 2020)

1.1 Background

5 of 47

The NYPD’s release of information regarding the use of previously secret technologies falls short of the transparency and disclosure goals outlined in the Public Oversight of Surveillance Technology (POST) Act.

No clear information about how much is spent on surveillance technology

1.1 Background

6 of 47

1.2 Literature Review

What the NYPD has reported is a far cry from full transparency and meaningful disclosure. Despite the clear purpose of the POST Act in making once-secret technologies public, the information released by the NYPD seems designed to prevent New Yorkers from learning any new information about how the department operates. (Sisitzky, 2021)

NYPD contracts are heavily redacted, making it difficult to understand how any single tool functions, let alone how they can work together to create a surveillance dragnet over people in New York. (Fussell, 2021)

NYPD’s lack of transparency serves as structural protection from any meaningful fiscal oversight. That allows the NYPD’s budget to continue to balloon and the NYPD’s impunity and power to grow unchecked. (Communities United for Police Reform, 2022)

7 of 47

1.3 SPEX & POST Act

Public Oversight of Surveillance Technology Act

In 2020, the Public Oversight of Surveillance Technology Act (the POST Act) was passed, requiring the NYPD to disclose impact and use policies for its surveillance technology. The final impact and use policies for these surveillance technologies were published and submitted to the Mayor and City Council Speaker on April 11, 2021.

Special Expense Fund

The "Special Expense Fund" (SPEX) agreement, established in 2007, allowed the NYPD to withhold information on surveillance tools, circumventing the need for public and City Council approval. Contracts later obtained through advocacy revealed that the NYPD had allocated at least $277 million for surveillance expenditures since 2007 under the "Special Expense Fund." The Comptroller’s Office ended its participation in SPEX in 2020.

8 of 47

1.4 Project Scope

CheckBook NYC – NYPD

Budget

Contract

Spending

Revenue

Amounts related to surveillance and data-driven technology

Transparency

Surveillance technology allows the government to gather more information with less effort.

Our team will use financial data to map and analyze:

Missing values

Anonymized Payee

Spending on Contracts

SPEX & POST Act

Amounts for 4 datasets

Purchase & Vendor Lists

9 of 47

Data & Methodology

Data Collection

Data Processing & Labeling

Data Analysis

Summary

2

10 of 47

  • CheckBook NYC 2010-2022 Fiscal Year Ending June 30th
  • Retrieved on December 15th, 2022

Notes:

  1. Removed the “Payroll” and “Capital” categories from the Spending dataset
  2. Excluded FY 2023 since it is not yet complete
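The two cleaning notes above can be sketched as a simple filter. This is only an illustration: the field names `fiscal_year`, `spending_category`, and `check_amount` are hypothetical stand-ins, and the actual CheckBook NYC export headers may differ.

```python
# Hypothetical records; field names are assumptions, not CheckBook NYC's real headers.
records = [
    {"fiscal_year": 2021, "spending_category": "Contracts", "check_amount": 1200.0},
    {"fiscal_year": 2021, "spending_category": "Payroll", "check_amount": 5000.0},
    {"fiscal_year": 2022, "spending_category": "Capital", "check_amount": 900.0},
    {"fiscal_year": 2023, "spending_category": "Others", "check_amount": 75.0},
]

EXCLUDED_CATEGORIES = {"Payroll", "Capital"}
FIRST_YEAR, LAST_YEAR = 2010, 2022  # FY 2023 is dropped because it is incomplete


def clean_spending(rows):
    """Keep rows in FY 2010-2022 whose category is neither Payroll nor Capital."""
    return [
        row for row in rows
        if FIRST_YEAR <= row["fiscal_year"] <= LAST_YEAR
        and row["spending_category"] not in EXCLUDED_CATEGORIES
    ]


cleaned = clean_spending(records)
print(len(records), "->", len(cleaned))
```

Comparing the record counts before and after the filter mirrors the "original" versus "cleaned" counts reported for each dataset.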

Dataset | Original count | Cleaned count
NYPD Revenue | 165,238 | 158,173
NYPD Budget | 131,781 | 117,910
NYPD Contract | 17,809 | 17,807
NYPD Spending | 1,045,557 | 735,301
NYPD N/A (subset of Spending) | 22,420 | 20,145

2.1 Data Collection

11 of 47

Since most of the original records do not specify what the money is for, we need to determine whether each amount is related to surveillance or technology.

2.2 Data Processing & Labeling

No known

Possible/Unknown

Strong

An Example of Summary Table:

Summary Table:

Show all the unique values together with cumulative counts and amounts
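A summary table of this kind could be built roughly as follows. This is a minimal sketch, not the team's actual code: the column name `expense_category` and the sample rows are made up for illustration, and blank values are grouped under a "(missing)" key by assumption.

```python
from collections import defaultdict

# Made-up rows for illustration only.
rows = [
    {"expense_category": "Data processing equipment", "amount": 500.0},
    {"expense_category": "Office supplies", "amount": 40.0},
    {"expense_category": "Data processing equipment", "amount": 250.0},
    {"expense_category": "", "amount": 10.0},  # a record with a missing value
]


def summarize(rows, column):
    """Return (value, {"count", "amount"}) pairs, largest total amount first."""
    totals = defaultdict(lambda: {"count": 0, "amount": 0.0})
    for row in rows:
        key = row[column] or "(missing)"
        totals[key]["count"] += 1
        totals[key]["amount"] += row["amount"]
    return sorted(totals.items(), key=lambda item: item[1]["amount"], reverse=True)


summary = summarize(rows, "expense_category")
for value, stats in summary:
    print(f"{value:30} {stats['count']:>5} {stats['amount']:>12,.2f}")
```

Sorting by total amount puts the values that matter most for labeling at the top of the table.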

12 of 47

Whole dataset

Possible/Strong

Strong

2.3 Data Analysis

No known

Possible/Unknown

Strong

For each value, whether it has

No known

Possible/Unknown

Strong

relationship with surveillance/technologies
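One way to assign the three labels to each unique value is a keyword lookup. The sketch below is an assumption about how such labeling might be automated, not a description of the team's procedure; the keyword lists are hypothetical, and treating blank values as "Possible/Unknown" is also an assumption.

```python
# Hypothetical keyword lists; a real list would come from reviewing the summary tables.
STRONG_KEYWORDS = ("surveillance", "camera", "recognition")
POSSIBLE_KEYWORDS = ("software", "data", "equipment")


def label_value(text):
    """Label a column value as 'Strong', 'Possible/Unknown', or 'No known'."""
    if not text or not text.strip():
        return "Possible/Unknown"  # assumption: a blank value cannot be ruled out
    lowered = text.lower()
    if any(keyword in lowered for keyword in STRONG_KEYWORDS):
        return "Strong"
    if any(keyword in lowered for keyword in POSSIBLE_KEYWORDS):
        return "Possible/Unknown"
    return "No known"


for value in ("SURVEILLANCE CAMERA MAINTENANCE", "Data processing equipment", "Office furniture", ""):
    print(repr(value), "->", label_value(value))
```

Keyword matches are only a first pass; ambiguous values would still need manual review.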

13 of 47

  • Data summary table
  • Using specific columns to identify records related to technology or surveillance

Labeling

2.4 Summary

  • Spending & Contract datasets
  • Use filtered datasets to identify records providing technological products or services
  • Keywords to show what products/services are providing

Vendor & purchase lists

  • Missing values
  • Preliminary analysis for each dataset
  • Using columns in common to merge the datasets
  • Collective analysis combining datasets
  • Post Act & SPEX

Analysis

14 of 47

Analysis of Each Dataset

Budget Analysis

Contract Analysis

Spending Analysis

Revenue Analysis

3

15 of 47

3.1 Budget

  • The NYPD budget has been increasing over the last 12 years
  • The budget has increased by almost 50% compared to 2010

16 of 47

3.1 Budget

  • Filtered by expense category, the NYPD budget has still been increasing over the last 12 years
  • The selected budget accounts for about 8.3% of the overall budget
  • The sharp drop in 2017 could be related to SPEX, and the drop in 2020 could have been caused by the POST Act

17 of 47

3.2 Contract

  • The current contract amount is always greater than or equal to the original contract amount.
  • NYPD contract amounts have trended upward over the last 12 years

18 of 47

3.2 Contract

  • The NYPD original contract amount, grouped by contract start date, has been increasing over the last 12 years
  • The drop in 2020 could be related to SPEX
  • The sharp increase in 2021 could have been caused by the POST Act

19 of 47

3.2 Contract

20 of 47

3.3 Spending

Overall

Findings:

Total expenses do not fluctuate much from year to year, but there is a seasonal fluctuation within the year

Chart: the change in total monthly expenses over time
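A chart like this rests on aggregating payments by month. The sketch below shows one possible way to do it and to check for a seasonal pattern by averaging each calendar month across years; the sample payments are invented and the tuple layout is an assumption.

```python
from collections import defaultdict
from datetime import date

# Invented (issue date, check amount) pairs standing in for Spending records.
payments = [
    (date(2021, 6, 3), 400.0),
    (date(2021, 6, 28), 600.0),
    (date(2021, 7, 15), 250.0),
    (date(2022, 6, 9), 900.0),
]


def monthly_totals(payments):
    """Sum check amounts per (year, month), returned in chronological order."""
    totals = defaultdict(float)
    for issue_date, amount in payments:
        totals[(issue_date.year, issue_date.month)] += amount
    return dict(sorted(totals.items()))


def average_by_calendar_month(monthly):
    """Average the monthly totals across years to expose a seasonal pattern."""
    buckets = defaultdict(list)
    for (_, month), total in monthly.items():
        buckets[month].append(total)
    return {month: sum(values) / len(values) for month, values in sorted(buckets.items())}


monthly = monthly_totals(payments)
seasonal = average_by_calendar_month(monthly)
print(monthly)
print(seasonal)
```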

21 of 47

3.3 Spending

Purchases

Related or not | Classification | No. of records | Amount ($)
Strongly related | Data system & crime information | 13,558 | 2.02E+08
Strongly related | Surveillance cameras & recognition system | 4,406 | 5.32E+08
Strongly related | Robot | 4,715 | 3.09E+08
Strongly related | Other strongly related | 52,127 | -
Not strongly related | Not strongly related data | 886,046 | -
Missing data | Missing data | 107,384 | -

Goal: to find out what the NYPD bought and how much it spent.

  1. Divided the data into two categories: strongly related to surveillance and data technology, or not
  2. Subdivided the strongly related data into four categories

Findings among three categories

  • Data system & crime information has the highest number of records
  • The NYPD spends the most on monitoring (surveillance cameras & recognition systems), reaching about $530 million in ten years.
  • About 50 thousand strongly related records still do not fit the categories we have proposed.

22 of 47

3.4 Revenue

  • ‘Adopted’ is the initial budget.
  • ‘Modified’ is the budget as revised after a period of time. Essentially, both are budgets.
  • ‘Recognized’ is the actual revenue.

Two simple findings:

  • Large discrepancies between budgeted and actual revenues
  • Budgeted revenue starts to decline after 2019.

23 of 47

3.4 Revenue

Federal Grants

  • The valid data in this graph starts from 2010 as the data before 2010 is not complete.

Findings:

  • The annual revenue from the federal government is about 50% of the total actual revenue.
  • Although the proportion is high, it does not seem to be highly correlated with changes in the trend of total revenue.
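The share figures in this section are ratios of one revenue category to total recognized revenue in each fiscal year. A minimal sketch of that calculation, with invented numbers and hypothetical field names (`fiscal_year`, `category`, `recognized`), might look like this:

```python
from collections import defaultdict

# Invented revenue rows; the amounts are not real NYPD figures.
revenue = [
    {"fiscal_year": 2020, "category": "Federal Grants", "recognized": 50.0},
    {"fiscal_year": 2020, "category": "Other", "recognized": 50.0},
    {"fiscal_year": 2021, "category": "Federal Grants", "recognized": 30.0},
    {"fiscal_year": 2021, "category": "Other", "recognized": 90.0},
]


def category_share(rows, category):
    """Return {fiscal_year: percent of recognized revenue coming from `category`}."""
    totals = defaultdict(float)
    matched = defaultdict(float)
    for row in rows:
        totals[row["fiscal_year"]] += row["recognized"]
        if row["category"] == category:
            matched[row["fiscal_year"]] += row["recognized"]
    # Years with zero total revenue are skipped to avoid dividing by zero.
    return {
        year: 100 * matched[year] / total
        for year, total in sorted(totals.items())
        if total
    }


shares = category_share(revenue, "Federal Grants")
print(shares)
```

The same function can be reused for the asset forfeiture, grant, DHS, and DOJ shares by changing the category argument.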

24 of 47

3.4 Revenue

Asset Forfeiture

Findings:

  • ‘Asset forfeiture’ accounted for more than 10% of total revenue in the period 2016-2020.
  • After 2019, its share plummets from 16 percent to 3 percent in 2022.
  • This follows the same trend as total revenue

25 of 47

3.4 Revenue

Non-Governmental & Private Grants

Findings:

  • The proportions of both categories are stable from 2015 to 2020. The subsequent three years begin to show an upward trend.

  • For the past 8 years, Non-Governmental has been stable at around 5%, and Private has been stable at around 1%

26 of 47

DHS: Department of Homeland Security; DOJ: Department of Justice

Findings:

  • The share of DHS-related revenue shows an overall downward trend, falling from 17% to 5%.
  • The share of DOJ-related revenue is relatively stable overall, remaining at around 1%, but in three years it was higher, reaching over 4%.

3.4 Revenue

DHS & DOJ

27 of 47

Transparency

Missing Values

Anonymized Payees

Overall Gaps in Data

4

28 of 47

Transparency

Completeness

Significance

4.1 Missing Values

For budget, spending, contract, and revenue data

No missing values in the highlighted columns as chosen

All money-related categories are zeros

Dataset | Column | Description
Budget | Budget Code | Budget code identifier
Budget | Expense Category | An identification code
Contract | Contract ID | Contract number
Contract | Expense Category | An identification code
Contract | Purpose | Contract purpose
Contract | Vendor | Vendor identifier
Spending | Budget Code | Budget code identifier
Spending | Expense Category | An identification code
Spending | Purpose | Contract purpose
Spending | Payee Name | Vendor name
Revenue | Revenue Source | Identification code

29 of 47

4.1 Missing Values

For budget data

Completeness

Significance

  • No missing values in Expense Category
  • Only 2010 has missing values in Budget Category; the other 11 years have none

73%

30 of 47

4.1 Missing Values

For contract data

Completeness

  • Focuses only on missing values in Expense Category, Purpose, and Contract Type
  • Money-related columns are not the focus here, so significance does not need to be considered

31 of 47

4.1 Missing Values

For spending data

Completeness

Significance

  • No records have zero values in all money-related columns (Check Amount)
  • Budget Code, Contract Purpose, Expense Category, Payee Name, and Check Amount each have around 3% missing values, varying from year to year
  • Associated Prime Vendor is mostly missing: around 90% of its values are absent
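Completeness figures like these can be computed as the percentage of blank values per column per fiscal year. The following is a sketch under assumed field names (`fiscal_year`, `payee_name`), with `None` and empty strings both counted as missing; the sample rows are invented.

```python
from collections import defaultdict

# Invented rows; None and "" both count as missing in this sketch.
rows = [
    {"fiscal_year": 2020, "payee_name": "VENDOR A"},
    {"fiscal_year": 2020, "payee_name": ""},
    {"fiscal_year": 2021, "payee_name": "VENDOR B"},
    {"fiscal_year": 2021, "payee_name": None},
    {"fiscal_year": 2021, "payee_name": "VENDOR C"},
    {"fiscal_year": 2021, "payee_name": "VENDOR D"},
]


def missing_rate(rows, column):
    """Return {fiscal_year: percent of rows where `column` is blank or absent}."""
    total = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        year = row["fiscal_year"]
        total[year] += 1
        value = row.get(column)
        if value is None or str(value).strip() == "":
            missing[year] += 1
    return {year: 100 * missing[year] / total[year] for year in sorted(total)}


rates = missing_rate(rows, "payee_name")
print(rates)
```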

32 of 47

4.1 Missing Values

For revenue data

Completeness

Significance

33 of 47

4.2 Anonymized Payees

  • A subset of the Spending dataset
  • Records with “N/A, Privacy/Security” as the value of the Payee column
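Extracting this subset amounts to filtering on the payee value and then averaging per fiscal year. The sketch below assumes hypothetical field names (`fiscal_year`, `payee_name`, `check_amount`) and uses invented rows, including a made-up vendor name.

```python
ANONYMIZED_PAYEE = "N/A, Privacy/Security"

# Invented rows; "ACME CAMERAS INC" is a made-up vendor.
spending = [
    {"fiscal_year": 2020, "payee_name": "N/A, Privacy/Security", "check_amount": 300.0},
    {"fiscal_year": 2020, "payee_name": "ACME CAMERAS INC", "check_amount": 800.0},
    {"fiscal_year": 2021, "payee_name": "N/A, Privacy/Security", "check_amount": 100.0},
]


def anonymized_subset(rows):
    """Keep only records whose payee is the anonymized placeholder."""
    return [row for row in rows if row["payee_name"].strip() == ANONYMIZED_PAYEE]


def yearly_average(rows):
    """Return (mean amount per fiscal year, mean record count per fiscal year)."""
    years = {row["fiscal_year"] for row in rows}
    if not years:
        return 0.0, 0.0
    total = sum(row["check_amount"] for row in rows)
    return total / len(years), len(rows) / len(years)


subset = anonymized_subset(spending)
mean_amount, mean_count = yearly_average(subset)
print(len(subset), mean_amount, mean_count)
```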

N/A, Privacy/Security dataset

Yearly Average: Amount $4,788,754; Counts 1,527

Column | Value | Amounts | Counts
Department | Administration | $36.14 m | 8,280
Spending Category | Others | $62.25 m | 19,833
Expense Category | Financial assistance to college students | $11.86 m | 4,437
Expense Category | Leasing of miscellaneous equipment | $11.12 m | 852
Expense Category | Professional service-other | $7.78 m | 3,044
Budget Codes | NYPD police cadet corps | $9.88 m | 3,750
Budget Codes | Federal asset forfeiture | $7.39 m | 263
Budget Codes | Health service division | $1.87 m | 1,417

34 of 47

4.2 Anonymized Payees

Category | Amounts | % Amounts | Counts | % Counts
Strong | $1.59 m | 32.85 | 324 | 29.00
Strong & Possible | $3.53 m | 74.91 | 1,192 | 78.21
All N/A | $4.79 m | 100.00 | 1,527 | 100.00

Yearly Mean:

No known

Possible/Unknown

Strong

35 of 47

4.2 Anonymized Payees

SPEX & POST Act

SPEX

POST Act

36 of 47

4.3 Overall Gaps in Data

  • Limited information for cross-checking

  • Insufficient details of each transaction

  • Vague description of each record

  • The challenge of linking contract and spending information

  • Unknown growth of contract amount over the years in contract data

37 of 47

Combining Analysis

Overview

Analysis

5

38 of 47

5.1 Overview

  • Contract amounts only
  • A contract can last for years

Contract dataset

  • All money going out

Spending dataset

  • Use “Contract ID” column to identify the contract
  • Figure out the spending amount on contracts related to surveillance / technologies

Combined dataset

List of purchases based on the Contract Purpose

Vendor lists based on the correlation labels

Money spent on contracts based on the correlation labels
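The combination step can be pictured as a lookup from each spending record to its contract's label through the shared Contract ID. This is an illustrative sketch only: the IDs, labels, and field names (`contract_id`, `check_amount`) are invented, and spending records without a matching contract are simply skipped here.

```python
from collections import defaultdict

# Invented contract labels, as would come from the labeled Contract dataset.
contract_labels = {"CT1": "Strong", "CT2": "No known"}

# Invented spending rows; the last one has no Contract ID.
spending = [
    {"contract_id": "CT1", "check_amount": 100.0},
    {"contract_id": "CT1", "check_amount": 50.0},
    {"contract_id": "CT2", "check_amount": 20.0},
    {"contract_id": "", "check_amount": 999.0},
]


def spending_by_label(spending, contract_labels):
    """Sum check amounts per contract label, skipping rows with no matching contract."""
    totals = defaultdict(float)
    for row in spending:
        label = contract_labels.get(row["contract_id"])
        if label is None:
            continue  # spending not tied to a known contract
        totals[label] += row["check_amount"]
    return dict(totals)


by_label = spending_by_label(spending, contract_labels)
print(by_label)
```

Because a contract can last for years, summing the matched payments gives money actually spent on a contract rather than its stated amount.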

39 of 47

5.2 Analysis

Purchase List

Category | Spending: Counts | Spending: % Counts | Contract: Counts | Contract: % Counts
Related to Technologies | 142,165 | 19.33 | 3,754 | 21.08
Missing Values | 391,304 | 53.22 | 11,739 | 65.92

40 of 47

5.2 Analysis

Vendor Lists

41 of 47

Category | Amounts | Counts
Strong | $0.95 b | 1,752
Strong & Possible | $1.52 b | 14,254
All | $2.03 b | 17,507

5.2 Analysis

42 of 47

Conclusion

Amounts

SPEX & POST Act

Gaps and Limitations

Recommendations

6

43 of 47

6.1 Amounts

Estimated amounts based on “Strong” correlation label:

Budget: Increasing from $250 million to $450 million

Contract: Fluctuating; approximately $1 million yearly average

Spending: Seasonal trend; $8.7 billion yearly average

Spending on Contracts: Increasing trend; $80 million yearly average

Revenue:

  • Annual revenue from DHS as a percentage of total revenue declines over the period, from 15% in 2010 to 5% in 2022.
  • Revenue from DOJ as a percentage of total revenue fluctuates relatively little from year to year, mostly within a 1% to 2% range.

Conclusion 1: Based on our methods, the estimated amount of money related to surveillance or data-driven technologies is considerable.

44 of 47

6.2 SPEX & POST Act

Events

Termination of SPEX

Implementation of POST Act

Budget

Contract

X

Spending

Spending on Contracts

X

The impact of the termination of SPEX and the implementation of the POST Act on each dataset is shown as follows:

Conclusion 2: The implementation of the POST Act is effective in disclosing details of spending on surveillance or data-driven technologies.

45 of 47

6.3 Gaps and Limitations

A lack of detail in records

  • Many missing values prevent us from knowing what a spending, revenue, or contract record is for. Hence, it is difficult to decide what is related to technology and should therefore be included.
  • Current information is too broad and vague for the audience to understand the records. As a result, although we tested our labeling manually, it may not be precise, which could affect our results.

Limited information for cross-checking

  • The four datasets can hardly be connected to each other since they have few columns in common. This means we can hardly cross-check across all datasets. Therefore, some records may still be left out.

46 of 47

6.4 Recommendations

Improving the budget modeling

  • There is a significant discrepancy between the budgeted amounts and the actual amounts. The gap indicates an inaccuracy in the modeling and a failure to provide reliable budget projections. The large deviation of actual amounts from budgeted ones can exacerbate the trust issue.

Including more descriptions for each record, e.g. keywords

  • The descriptions are helpful, yet the majority of them are missing. Beyond the description, providing keywords would be helpful.
  • For example, when they are purchasing drones, keywords including “drone”, “surveillance”, “monitor”, “camera”, and “technology” can be presented, so that the audience has a clear idea of what the record is for.

Adding more details to compare datasets

  • As mentioned above, the datasets are not detailed enough for cross-checking. It is difficult to compare the spending on contracts and the budget amounts.

47 of 47

References

Fussell, S. (2021, August 10). The NYPD had a secret fund for surveillance tools. Wired. Retrieved April 25, 2023, from https://www.wired.com/story/nypd-secret-fund-surveillance-tools/

Inside the NYPD's surveillance machine. Ban the Scan. (n.d.). Retrieved April 25, 2023, from https://banthescan.amnesty.org/decode/

The NYPD published its arsenal of surveillance tech. Here's what we learned. New York Civil Liberties Union. (2021, April 14). Retrieved April 25, 2023, from https://www.nyclu.org/en/news/nypd-published-its-arsenal-surveillance-tech-heres-what-we-learned

The Public Oversight of Surveillance Technology (POST) Act: A resource page. Brennan Center for Justice. (n.d.). Retrieved April 25, 2023, from https://www.brennancenter.org/our-work/research-reports/public-oversight-surveillance-technology-post-act-resource-page