1) Grudin: Why CSCW Applications Fail (1988)
Grudin argues that CSCW systems fail for three main reasons:
- There is a disparity between who must do extra work to support the system and who benefits from it. Grudin's example is automatic meeting scheduling, which benefits managers whose secretaries maintain their electronic calendars, while their reports, who have no secretaries, must do the actual work of keeping calendars up to date.
- Design intuitions that work well for single-user systems break down for multi-user systems. Designs often serve people similar to those who built the system, not those who must put in the extra work for the system to function.
- CSCW systems are very challenging to evaluate.
2) Horvitz: Principles of Mixed-Initiative User Interfaces (1999)
Motivation:
- In 1999, HCI research was making rapid progress along two diverging paths: building autonomous agents that automate services, and supporting users in directly manipulating interfaces. Horvitz recognizes the value of both approaches and presents a combined set of design principles, termed the mixed-initiative approach, in which intelligent services and users collaborate efficiently to achieve the user's goals.
The paper presents the design of LookOut, a feature for Microsoft Outlook that automatically parses the content of email messages, grounds dates and times relative to the message's send date, and creates calendar events in Outlook that the user can manipulate. The design of LookOut is presented as adhering to a set of 12 principles for mixed-initiative design proposed at the outset of the paper. The paper also has an interesting discussion of handling uncertainty and of weighing the expected costs and benefits of taking autonomous action in different situations.
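To make the cost-benefit reasoning concrete, here is a minimal sketch of the expected-utility test behind mixed-initiative action. The utility values are illustrative assumptions of mine, not LookOut's actual parameters; only the decision structure (choosing among inaction, dialog, and autonomous action based on the inferred probability of the user's goal) follows the paper's framing.

```python
# Minimal sketch of expected-utility action selection (utility values are
# made-up assumptions; only the decision structure follows Horvitz's framing).
def best_action(p_goal: float) -> str:
    """Pick the option with the highest expected utility, given the inferred
    probability p_goal that the user actually wants the service."""
    u_act_goal, u_act_nogoal = 1.0, 0.0    # acting when wanted / when unwanted
    u_ask_goal, u_ask_nogoal = 0.8, 0.7    # dialog costs a little either way
    u_skip_goal, u_skip_nogoal = 0.2, 1.0  # doing nothing misses the goal

    eu = {
        "act autonomously": p_goal * u_act_goal + (1 - p_goal) * u_act_nogoal,
        "ask the user":     p_goal * u_ask_goal + (1 - p_goal) * u_ask_nogoal,
        "do nothing":       p_goal * u_skip_goal + (1 - p_goal) * u_skip_nogoal,
    }
    return max(eu, key=eu.get)

for p in (0.1, 0.5, 0.9):
    print(p, "->", best_action(p))  # low p -> do nothing; mid -> ask; high -> act
```

With these utilities, low goal probabilities favor inaction, intermediate ones favor asking, and high ones favor autonomous action, which matches the qualitative behavior the paper describes for LookOut.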
3) Ackerman: The Intellectual Challenge of CSCW: The Gap Between Social Requirements and Technical Feasibility (2000)
The paper reviews the existing state of CSCW research, argues that the main problem is the social-technical gap, or the gap between social requirements and technical feasibility, and presents solutions for what the field must do moving forward.
Existing CSCW research: human activity is nuanced, flexible, and contextualized; CSCW systems need to support that same nuance and flexibility.
CSCW system / HCI problem: a computational system that enables and facilitates collaborative work among multiple individuals or groups. The field studies how people manage and interact with these computational systems.
Gap: no existing technical mechanism can fully support the nuanced, everyday social handling of personal information, so systems end up restricting the problem scope relative to what is socially appropriate.
Ackerman uses P3P, the Platform for Privacy Preferences Project, to illustrate this. P3P lets services declare their data-sharing practices and users configure privacy preferences; data flows when the two match. But Ackerman argues that no technical solution can accurately capture what is socially appropriate in a world where exceptions are the norm, and hence the problem scope needs to be changed (see the sketch below).
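A minimal sketch of why such rigid matching falls short (hypothetical code, not the real P3P specification): a boolean rule has no room for the contextual exceptions that Ackerman argues are the norm in social behavior.

```python
# Hypothetical, rigid P3P-style preference matching (not the real P3P spec).
USER_PREFS = {"share_email": False, "share_location": False}

def p3p_style_match(requested_items: set) -> bool:
    """Allow the exchange only if every requested item is permitted."""
    return all(USER_PREFS.get(item, False) for item in requested_items)

# A user might happily share location with a trusted service just this once:
# socially routine, but inexpressible in an all-or-nothing rule.
print(p3p_style_match({"share_location"}))  # False, regardless of context
```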
In summary, there are 3 main issues with CSCW systems:
- Systems are not nuanced
- Systems don’t allow for ambiguity
- Systems are not socially flexible
Arguments against the significance of the gap: technology will change, and users will change. But after 25 years, technological solutions remain elusive, and forcing users to change runs against the central premise of HCI.
Solutions: CSCW/HCI researchers should make the gap central to their work. Ackerman argues that CSCW needs to be reconceptualized as a science of the artificial.
Here are some key highlights:
- Simon, in his book The Sciences of the Artificial, states that science is the analysis of the natural, whereas engineering is the synthesis of the artificial. The new sciences of the artificial include computer science and AI.
- Like AI, CSCW could be a renewal and reconstruction of Simon’s viewpoint.
Concluding quote:
“HCI and CSCW systems need to have at their core a fundamental understanding of how people really work and live in groups, organizations, communities, and other forms of collective life. Otherwise, we will produce unusable systems, badly mechanizing and distorting collaboration and other social activity”
4) Wobbrock and Kientz: Research Contributions in HCI (2016)
7 types of HCI contributions:
- Empirical:
Definition: new knowledge created through qualitative/quantitative observation and data gathering
Methods: interviews, experiments, surveys, ethnographies, diaries, logs
Significance: findings highlight new knowledge
Evaluation: importance of findings and soundness of methods
- Artifacts:
Definition: Arise from generative design
Methods: New systems, architectures, tools, toolkits, sketches, mockups
Significance: compel us to imagine new futures
Evaluation: accompanying empirical studies or quantitative evaluations; how well designs negotiate trade-offs and keep competing priorities in balance.
- Methodological:
Definition: Create new knowledge by informing how you do research
Methods: new techniques for design, analysis, and measurement
Significance: improve research practices
Evaluation: utility, reproducibility, reliability, and validity
- Theoretical:
Definition: Create new knowledge by defining qual/quant theories that have descriptive and/or predictive power
Methods: consist of new or improved concepts, definitions, models, principles, frameworks
Significance: theoretical contributions inform what we do, why we do it, and what we expect from it
Evaluation: novelty, soundness, and power to describe, predict, and explain. Validation through empirical work.
- Survey:
Definition: review and synthesize work done on a research topic
Methods: review prior work on a research topic which has some maturity
Significance: exposing trends and gaps
Evaluation: evaluated based on how well they organize what is currently known about a topic and reveal opportunities for further research
- Dataset:
Definition: provides a new and useful corpus, often accompanied by an analysis of its characteristics, for the benefit of the research community
Methods: synthesis of a new corpus from a variety of sources (web scraping, crowdsourcing, etc.)
Significance: enable evaluations of new algorithms, systems, or methods against shared repositories
Evaluation: extent to which they supply the research community with a useful and representative corpus against which to test and measure.
- Opinion:
Definition: seek to change the minds of readers through persuasion
Methods: draw upon many of the above contribution types to make their case
Significance: goal is to persuade, compel reflection, discussion, and debate, not just inform.
Evaluation: strength of their argument. Strong arguments credibly use supporting evidence and fairly consider opposing perspectives
5) Rosenblat and Stark: Algorithmic Labor and Information Asymmetries: A Case Study of Uber’s Drivers (2016)
Summary:
The study examines how Uber drivers experience labor under a regime of automated, algorithmic management. Through a qualitative study combining analysis of online forum data (1350 items) and driver interviews (7 participants), the paper highlights:
(i) the information and power asymmetries that Uber maintains through soft control and gamification, which are vital to running its business, and
(ii) drivers' critiques of Uber's algorithms and of its advertising and corporate communications.
Highlights:
- Lee, Kusbit, Metsky, and Dabbish coined the term "algorithmic management" (2015)
- Online Forums: UberDrive, UberOps, UberCool → collected 1350 archival items
- “Uber communicates that some services have prices and some services do not, but the power for determining these distinctions resides with Uber alone”
- “drivers perceive that Uber favors the passenger in adjudications, and even report having to gather their own data to prevent wages from being retracted”
- “the ambiguity and resistance surrounding “surge pricing” surfaced as the most obvious intersection of data collection and information asymmetry in everyday driver experience”
- “gamic elements of behavioral engagement tools, such as surge pricing, the conflation of realtime and predictive demand, and blind passenger acceptance, illustrate the multifaceted ways that Uber influences the relationship between supply and demand”
6) Alkhatib, Bernstein, and Levi: Examining Crowd Work and Gig Work Through The Historical Lens of Piecework (2017)
The paper characterizes on-demand work (crowd work like AMT, gig work like Uber driving) using the historical analogy of piecework, which has been well studied and has many parallels to on-demand work.
Characteristics of piecework:
- Paid workers for quantity of work and not time e.g. Uber/AMT mostly pay per task
- Gave workers a sense of independence e.g. Uber drivers have some independence
- Structured tasks so that people with narrow education could contribute easily, e.g. AMT assumes no professional training
Research Questions:
- What are the complexity limits of on-demand work?
- There is currently no crowdsourced solution for idea generation, no cross-domain high-quality crowd-powered author for writing tasks, and no general solution for sensemaking
- By augmenting human intellect, computing has shifted the complexity of work that can be done with minimal training
- Conclusion: it is not clear that on-demand work can solve much more complex tasks than piecework could.
- how far can work be decomposed into smaller microtasks?
- On-demand work has been decomposed into smaller tasks, e.g. on AMT
- Platforms have begun to modularize expertise to bring it into crowd work, e.g. software design
- Measurement has been used to optimize worker behavior
- A fourth stage is yet to happen: the rise of systems that enable workers with very narrow expertise to do work, e.g. using online courses as proof of qualification to complete a microtask.
- what will work and the place of work look like for workers?
- Worker relationships are likely to be inhibited by the decentralized design of on-demand work
- Historically, piecework was managed by a foreman; in on-demand work that role has been replaced by algorithms, which are cold and unforgiving agents
- The history of piecework suggests worker relationships might improve if more human management styles were used, e.g. instead of a bare algorithm, building tools and dashboards that empower workers.
7) Webb: The Impact of AI on the Labor Market (2020)
The paper presents a methodology to measure the exposure of occupations to three technologies: software, robots, and AI. The methodology computes an exposure score as the overlap between the task descriptions of an occupation and the tasks the technology can perform, using patent titles as a proxy for the technology's abilities.
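A crude sketch of the overlap idea (my illustration; Webb actually extracts verb-object pairs with NLP tools rather than raw tokens): score an occupation's exposure as the share of its task-description tokens that also appear in a technology's patent titles.

```python
# Crude token-overlap version of Webb-style exposure scoring (illustrative;
# the paper uses verb-object pairs from parses, not raw word tokens).
from collections import Counter

def tokens(texts):
    """Bag of lowercase word tokens across all texts."""
    return Counter(w for t in texts for w in t.lower().split())

def exposure(occupation_tasks, patent_titles):
    """Fraction of occupation-task tokens that also occur in patent titles."""
    task_toks, patent_toks = tokens(occupation_tasks), tokens(patent_titles)
    overlap = sum(c for w, c in task_toks.items() if w in patent_toks)
    return overlap / max(sum(task_toks.values()), 1)

# Toy, hypothetical data:
tasks = ["diagnose diseases from medical images", "schedule patient visits"]
patents = ["method for diagnosing diseases from medical images using ai"]
print(round(exposure(tasks, patents), 2))  # 0.5
```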
Findings:
Nature of the occupation:
- Robots have mostly automated occupations with manual, "muscle" tasks
- Software: the most-exposed occupations involve "routine information processing"; the least-exposed require manual labor
- AI: high-skilled occupations are the most exposed.
Nature of individuals
- Robots: workers with less than a high-school education, low-wage workers, men under 30
- Software: middle-wage workers, more men (women more often perform complex interpersonal-interaction tasks)
- AI: workers with master's degrees, older workers
Employment and Wages:
- Robots: moving from the 25th to the 75th percentile of exposure: a 9-18% decline in employment and an 8-14% decline in wages
- Software: moving from the 25th to the 75th percentile of exposure: a 7-11% decline in employment and a 2-6% decline in wages
- AI: a reduction in 90:10 percentile wage inequality
Other references on impact of AI on Labor:
- Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce; https://arxiv.org/abs/2506.06576
- What Happens When AI Sets Wages https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5404966
- AI-Generated “Workslop” Is Destroying Productivity by Kate Niederhoffer, Gabriella Rosen Kellerman, Angela Lee, Alex Liebscher, Kristina Rapuano and Jeffrey T. Hancock https://hbr.org/2025/09/ai-generated-workslop-is-destroying-productivity
- A Research Agenda for the Economics of Transformative AI; https://www.nber.org/papers/w34256; Erik Brynjolfsson, Anton Korinek & Ajay K. Agrawal
- Autor, David. Applying AI to Rebuild Middle Class Jobs. No. w32140. National Bureau of Economic Research, 2024.
- Acemoglu, Daron. “The Simple Macroeconomics of AI.” (2024).
- Agrawal, Ajay, Joshua S. Gans, and Avi Goldfarb. "Artificial intelligence: the ambiguous labor market impact of automating prediction." Journal of Economic Perspectives 33.2 (2019): 31-50.
- https://www.brookings.edu/articles/the-effects-of-ai-on-firms-and-workers/
- Humlum, Anders, and Emilie Vestergaard. "Large Language Models, Small Labor Market Effects." University of Chicago, Becker Friedman Institute for Economics Working Paper 2025-56 (2025).
- https://economicgraph.linkedin.com/research/ai-skills-resources?src=li-in&veh=CGIxEGxSMxConvoxGLOBALxENxFY25Q4xAISkillsMomentx1456xEngagementxv1&mcid=7320449884845723648
- Anthropic Economic Index (see references in that paper)
- https://hai.stanford.edu/news/assessing-the-real-impact-of-automation-on-jobs
- https://www.nytimes.com/2025/05/25/business/amazon-ai-coders.html
- Derek Thompson, https://www.theatlantic.com/economy/archive/2025/04/job-market-youth/682641/
- The labor market for young people never fully recovered from the coronavirus pandemic, or even, arguably, from the Great Recession
- A second theory points to a deeper, more structural shift: college doesn't confer the same labor advantages it did 15 years ago; the lifetime-earnings gap between college grads and high-school graduates has stopped widening
- The third theory is that the relatively weak labor market for college grads could be an early sign that artificial intelligence is starting to transform the economy
8) Zhang: Algorithmic Management Reimagined For Workers and By Workers: Centering Worker Well-Being in Gig Work (2022)
The study explores rideshare workers' concerns about their well-being and imagines new platforms through the lens of algorithmic imaginaries.
The specific research questions include:
- How do gig work’s algorithmic management and platform design affect worker well-being?
- What do gig workers desire to see in technology designs that support their well-being and work preferences?
The paper argues that algorithmic imaginaries, i.e. ways of thinking about what algorithms are and how they should function, are a better lens than mental models. The authors conduct focus groups to understand worker concerns and then conduct follow-up participatory design sessions asking participants to envision new platform design features.
Highlights:
- RQ1: drivers were presented with 5 sets of ride requests and asked which they preferred and why.
- RQ2: 5 prompts and 3 intervention types
- Prompts:
- Ride matching, quests, driver-rider incidents, platform support for defending drivers against accusations, and driver self-assessment of work performance and well-being.
- Examined the prompts through the lens of 3 possible interventions in the gig economy: collective information sharing, third-party applications, new platform design features
- 4 sets of problems and corresponding solutions in the findings section:
- Lack of well-being support
- Problematic gamification and differential incentives
- Information asymmetry and opacity
- Individualized work
- The thematic analysis maps in the main paper and appendix are done really well and can inspire our own work.
9) Dubal: On Algorithmic Wage Discrimination (2023)
Definitions:
- algorithmic wage discrimination
- Labor market in which people who are doing broadly similar work, with the same skill, for the same company, at the same time, may receive different hourly pay, calculated with ever-changing formulas using granular data on location, individual behavior, demand, supply, and other factors, determined through an obscure, complex system that makes it nearly impossible for workers to predict or understand their constantly changing, and frequently declining, compensation.
- 2 outcomes:
- Different workers can earn vastly different amounts for substantially similar work, making payment unequal
- The same worker can earn vastly different amounts at different moments, making wages highly unpredictable; wages can be so low as to fall well below what legislatures have determined to be the lowest minimum hourly compensation
- A consumer-side analogue: individual consumers are charged as much as a firm determines they are willing to pay
Harms:
- Workers find that, in contrast to more standard wage dynamics, being directed by and paid through an app involves opacity, deception, and manipulation.
- Those who are most economically dependent on income from on-demand work frequently describe their experience of algorithmic wage discrimination through the lens of gambling
- Because the on-demand workforces that are remunerated through algorithmic wage discrimination are primarily made up of immigrants and racial minority workers, these harmful economic impacts are also necessarily racialized.
- Someone traveling from a wealthy neighborhood to another tony spot might be asked to pay more than another person heading to a poorer part of town, even if demand, traffic and distance are the same
- Riders who start in non-white, low-income areas have to wait extended periods of time for a ride, while in other instances firms price-gouged consumers who were fleeing disaster
- Stark, J., & Diakopoulos, N. "Uber seems to offer better service in areas with more white people. That raises some tough questions." The Washington Post (2016); "Caldor Fire Evacuees Report Tahoe Ride-Hail Price Gouging of More Than $1,500," KQED, accessed October 25, 2022, https://www.kqed.org/news/11887558/caldor-fire-evacuees-report-tahoe-ride-hail-price-gouging-of-more-than-1500
- different forms of perceived calculative unfairness among drivers, rooted both in the variability of their pay and the differences in their pay. Experienced drivers generally report having to work longer hours to earn the amount that they earned early in their career.
- “I was promised 80% of the fares [when I started], and within two months there was no relationship between what the passenger was paying and what I was earning. So, I had started making about $200 a day and within two months it was $150. And after a while, I was having a hard time even making a $100!”
- workers who labored for longer hours complained that they earned less per hour than workers who worked shorter hours.
- Among those who drive roughly similar routes and hours, some make more than others
- Many explained that they were on group texts with other drivers who would “call out” fake surges. After being added to one of these text threads, I received text messages that alerted drivers to avoid certain areas (e.g., “I’m in the Marina. It’s dead. Fake surge.”)
Legality:
- So long as this practice does not run afoul of minimum wage or anti-discrimination laws, nothing in the laws of work makes this form of digitalized variable pay illegal
- in two states—California and Washington—the non-payment for non-engaged time has been explicitly legalized, thus leaving workers’ hourly wages and their determination to the whim of the hiring entities
- CA: 120% of the minimum wage for the area in which they are working—but only for “engaged time”
- WA: $1.17 per mile and $0.34 per minute, with a minimum pay of $3.00 per trip (worked example below)
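A worked example of the Washington formula as stated above (a sketch only; it ignores the engaged-time caveats and uses exactly the quoted rates: $1.17/mile, $0.34/minute, $3.00 floor per trip):

```python
# Washington per-trip minimum from the rates quoted above (sketch only).
def wa_minimum_trip_pay(miles: float, minutes: float) -> float:
    return max(1.17 * miles + 0.34 * minutes, 3.00)

print(wa_minimum_trip_pay(0.8, 4))  # 0.94 + 1.36 = 2.30 -> $3.00 floor applies
print(wa_minimum_trip_pay(10, 25))  # 11.70 + 8.50 = $20.20
```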
- History:
- three levers that Uber uses to influence driver behavior: base fares, geographic surges, and quests
- Until 2022, drivers in California were paid a base fare rooted in what appeared to be an objective calculation: time and mileage
- In fall 2022, Uber replaced the time-and-mileage calculation with a system called "Upfront Pricing." Drivers are presented with a base fare (the upfront price) but do not know how it is calculated. California drivers have argued that upfront pricing has lowered their overall earnings.
- Because base fares are generally quite low, drivers rely heavily on surges and quests (alongside other “offers” or wage manipulators) to increase their earnings.
Empirical Evidence of Hourly Pay Calculation:
- Hyman's research, paid for by Uber and later touted by Uber CEO Dara Khosrowshahi, found that a typical Uber driver in Seattle made about $23 an hour, with 92% of workers earning above the local minimum wage (not accounting for vehicle costs)
- Two labor economists, James Parrott and Michael Reich, commissioned by the city of Seattle, arrived at a very different number: $9.74 per hour, with the majority of drivers earning far less than the minimum wage (accounting for vehicle costs)
- RDU: $6.22 per hour (after accounting for expenses and lost benefits) https://nationalequityatlas.org/prop22-paystudy
Transparency:
- Many workers are not sure how much money they made—or in some cases, lost.
- Uber calculates riders’ propensity for paying a higher price for a particular route at a certain time of day.
- withholding fare and destination data from drivers when presenting them with rides, imposing other non-price restraints on drivers, such as minimum acceptance rates, and utilizing non-linear compensation systems based on hidden algorithms rather than transparent per-mile, per-minute, or per-trip pay
- Drivers are presented with a base fare—or the upfront pricing—but they do not know how it is calculated. California drivers have argued that upfront pricing has lowered their overall earnings.
- Even within a particular locale, the surge rate is highly variable between drivers.
- According to Ben, an active driver and organizer with Rideshare Drivers United, “Everyone has different levels of surge at any given time. If the median surge is 2.5, someone else might have 5.0. We don’t know what this is based on. It’s not transparent.”
- Given the information asymmetry that exists between the worker and the firm, this variability generates a great deal of suspicion about the algorithms that determine their pay
- "There is no way to know why the app is making these decisions for me." "I was putting the work in the way I was supposed to, but the app was punishing me because it was cheaper to give it to someone else; literally feels like you're being punished by some unknown spiteful God"
- workers have sought to make transparent both the data and algorithms that determine their pay (including those that determine work allocation)
Algorithm:
- Prop 22: drivers have received a base fare rooted in what Uber calls “Upfront Pricing”—an amount based on a black-box algorithmic determination. In addition to this base fare, Uber drivers rely upon any number of offers, bonuses, surges, quests, and other “wage manipulators” from which to raise their base fare, which in most cases is untenably low by itself
Mitigations:
- two novel data cooperative projects, the Driver’s Seat Coop (in the U.S.) and WeClock (in Europe), have been launched. These cooperative efforts, which counter-collect data collected by on-demand firms using a separate app, reflect the belief that if workers can collectively pool and exert ownership and control over their data, then, they will be able to better understand their work experiences and “control their destiny at work.”
- The Driver’s Seat Coop, run by longtime labor organizer Hays Witt and supported, in part, by the Ford Foundation, is a cooperative of ride-hail and delivery workers who share in profits from their data collection. The cooperative has sold the pooled data to cities and transportation agencies who, in turn, desire to use the data to address governance issues. Driver’s Seat Coop relies on a third-party service called Argyle to connect to the on-demand labor platforms and import their earnings data and activities
- Issues: can companies use this data collected in collaboration with Driver’s Seat Coop to create and sell data derivatives that trap workers into certain wage brackets based on their income history? Can they (do they) use this data to target workers for predatory pay day loans or to deny other kinds of credit?
- a statutory or regulatory non-waivable ban on the practice of algorithmic wage discrimination, including, but not limited to, a ban on remuneration through digitalized piece pay.
Gig Work vs Organized Work: Why is AWD unacceptable in gig work? (TRUS)
- Lack of transparency (people might prefer algorithms if they were transparent; Kahneman made this "pro-algorithm" argument in his recent book Noise)
- No recourse mechanisms, unlike in established organizational practice
- A difference between the criteria used in accepted organizational practice (performance reviews, human judgment) and those used in algorithmic management (behavioral and economic inferences, tight integration of incentives with market signals like customer demand)
- Highly unpredictable and dynamic wages, which (1) can fall below minimum wage after accounting for all time and expenses and (2) may be based on demographics (which is illegal)
- Rideshare companies have changed the status quo of equal pay for equal work, the implicit assumption in any organizational setting, where the burden is on the employer to justify a difference in wages
Price vs Wage Discrimination: Why is the former okay?
Consumers Hate ‘Price Discrimination,’ but They Sure Love a Discount - The New York Times
"The most important factor… is that shoppers understand the rules that merchants have created. Problems arise when there's an 'informational imbalance.'"
https://www.cnn.com/2024/04/05/business/walmart-shoppers-class-action-settlement/index.html
Zephyr Teachout - Algorithmic Personalized Wages
10) Sweeney: Discrimination in Online Ad Delivery (2013)
Ads served through Google AdSense that suggest an arrest record appear more frequently when the search string contains names associated with Black people than names associated with White people, regardless of whether the advertiser actually has an arrest record for that person.
This is not illegal in itself, since Title VII would apply only if one could prove that an employer used the ads about a person's arrest in a hiring decision. Furthermore, the advertiser and the ad may be protected free speech under the First Amendment.
This is one of the first works to examine bias in ad delivery and Google Image Search, setting a precedent for several later studies.
11) Ali: Discrimination through Optimization (2019)
The paper presents empirical evidence of skew in ad delivery optimization on Facebook along gender and racial lines, which could potentially violate Title VII.
Methodology:
- Run a pair of ads at the same time, with the same budget, same audience, but different creative.
- Measure the fraction of delivered audience according to gender and race.
- Gender statistics: automatically provided by Facebook measurements
- Race statistics: cannot be automatically inferred. Instead, the authors use custom audiences, i.e. lists of people (names, emails, phone numbers) built from North Carolina voter records, which include ground-truth race. Facebook provides a breakdown of ad delivery by DMA (designated market area), so the authors construct audiences such that everyone in a given DMA is of the same race; any ad impression in that DMA can then be attributed to people of that race.
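Given delivery breakdowns like these, skew between two ads run against the same audience can be tested directly. A minimal sketch (my illustration, not the paper's code): compare the fraction of impressions delivered to one group across the two ads with a two-proportion z-test.

```python
# Two-proportion z-test for delivery skew between two ads (illustrative).
from math import sqrt
from statistics import NormalDist

def delivery_skew(men_a, total_a, men_b, total_b):
    """Return both male-delivery fractions and the p-value of their difference."""
    p_a, p_b = men_a / total_a, men_b / total_b
    pooled = (men_a + men_b) / (total_a + total_b)
    se = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_a, p_b, p_value

# Hypothetical impression counts for a bodybuilding ad vs. a cosmetics ad:
print(delivery_skew(men_a=850, total_a=1000, men_b=120, total_b=1000))
```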
Finding 1:
Skewed ad delivery occurs due to market effects alone, even when targeting the same audience with varying budgets.
Methodology:
Identical ads targeting the same audience but with varying budgets were run on Facebook.
Significance:
The audience that saw the ads ranged from over 55% men for low-budget ads to under 45% men for high-budget ads, demonstrating market effects alone can skew ad delivery across protected classes.
Finding 2:
Skewed ad delivery occurs due to the ad creative content (headline, text, and image).
Methodology:
Ads targeting the same audience but containing creatives stereotypically of interest to different genders and races were used (e.g., bodybuilding for men, cosmetics for women, hip-hop for Black users, country music for white users).
Significance:
Despite identical targeting and bids, ad delivery was heavily skewed based solely on the creative, with some ads delivering to over 80% men, over 90% women, over 85% Black users, or over 80% white users.
Finding 3:
The ad image alone significantly impacts ad delivery.
Methodology:
Experiments swapping different headlines, text, and images were run, including cases where the image contradicted the other creative components' stereotypical interests.
Significance:
Differences in delivery were significantly affected by just the image. E.g., an ad with male-stereotypical text/headline but a female-stereotypical image delivered primarily to women.
Finding 4:
Facebook likely automatically classifies ad images, skewing delivery from the ad run's start.
Methodology:
Ads with nearly transparent male/female-stereotypical images (visually imperceptible to viewers but retaining image data) were created.
Significance:
Statistically significant delivery differences based on the transparent images indicate Facebook's automated image classification and relevance estimation contribute to skewed delivery from the outset.
Finding 5:
Real employment and housing ads experience significantly skewed delivery.
Methodology:
Employment and housing ads were created and run while measuring delivery to users of different races and genders when optimizing for clicks.
Significance:
Despite identical targeting, ads for different job types and housing delivered to vastly different audiences based solely on the creative. E.g., lumber jobs: 72% white, 90% male; taxi jobs: 75% Black.
12) BHN (Barocas, Hardt, and Narayanan): Fairness and Machine Learning (2023)
Chapter 2: Legitimacy
Question: is it morally acceptable to use ML in a specific scenario at all, e.g. banning users from social media, automated essay scoring, criminal risk prediction? This is distinct from the other notion of fairness, the relative treatment of groups.
ML should not be framed as a replacement for individual human decision making: in high-stakes scenarios like hiring, credit, and housing, decisions are typically made by a bureaucracy, not an individual.
Bureaucracies incorporate procedural protections such as:
- Decisions are made transparently
- Decisions are made based on the right and relevant information
- Opportunity for recourse and appeal
Bureaucracies protect against arbitrary decisions, i.e. those that are inconsistent or lack well-justified reasoning. This is built on the principle that people are entitled to similar decisions unless there are reasons otherwise; arbitrary decisions show a lack of respect for the people subject to them.
3 types of automation:
- Automate pre-existing decision-making rules with software (no ML), e.g. applying for a driver's license online
- Replicate human decisions, e.g. automatically grading an essay
- Issue: being too similar to humans, or too different from them
- Learn rules to predict a target, aka predictive optimization; this raises concerns about the reasons behind decisions
PO concerns:
- Mismatch between the goal and the prediction target
- Failure to consider relevant information
- May seize upon spurious correlations (e.g. sneaker color and speed: if fast runners happen to like blue sneakers and slow runners red ones, a coach could use sneaker color to pick the team)
- Lack of agency and recourse
PO concerns: Mismatch between goal and prediction target example
Goal: where to deploy police to lessen crime
Target: arrest data
Mismatch:
- Many crimes are never observed and don't result in any arrest, but the data consist only of arrests
- Even with complete crime data, accurately predicting crime helps generate more arrests rather than lessening crime
Conclusion:
To establish legitimacy, decision makers must affirmatively justify their scheme by: demonstrating the target's relation to agreed-upon stakeholder goals; validating the deployed system's accuracy; allowing recourse methods; and addressing other outlined dimensions. While procedural protections around automated systems can achieve justification, decision makers avoid implementing them, as it undercuts automation's intended cost savings.
Chapter 3: Classification
No fairness through unawareness: being blind to the sensitive attribute, i.e. removing it from the input, cannot ensure fair classification. Especially in large feature spaces, there will exist a redundant encoding of the sensitive attribute across many other features.
3 Statistical Non-Discrimination Criteria (A = sensitive attribute, Y = target, R = score, Ŷ = predicted label):
- INDEPENDENCE (acceptance rate): R ⊥ A
e.g. demographic parity: P{Ŷ = 1 | A = a} = P{Ŷ = 1 | A = b}
Note: group-specific thresholds may not satisfy independence.
- SEPARATION (error rates): R ⊥ A | Y
P{Ŷ = 1 | Y = 1, A = a} = P{Ŷ = 1 | Y = 1, A = b} (true positive rate parity)
P{Ŷ = 1 | Y = 0, A = a} = P{Ŷ = 1 | Y = 0, A = b} (false positive rate parity)
E.g. a lender could use a more lenient risk threshold for one group to lower its error rate.
- SUFFICIENCY (predictive values): Y ⊥ A | R, i.e. parity of positive/negative predictive value
P{Y = 1 | R = r, A = a} = P{Y = 1 | R = r, A = b}
P{Y = 1 | R = r} = r (calibration)
P{Y = 1 | R = r, A = a} = r (calibration by group, which implies sufficiency)
Non-discrimination criteria can be enforced by pre-processing the data, by constraints during training, or by post-processing a classifier's outputs, as in the sketch below.
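A minimal sketch of checking the three criteria on binary predictions, following the chapter's definitions (my code, with toy data):

```python
# Per-group statistics behind the three criteria (illustrative sketch).
import numpy as np

def rate(mask, event):
    """P(event | mask), guarding against empty groups."""
    return event[mask].mean() if mask.any() else float("nan")

def criteria(yhat, y, a):
    """Acceptance rate, TPR/FPR, and PPV per group; parity across groups
    corresponds to independence, separation, and sufficiency respectively."""
    out = {}
    for g in np.unique(a):
        m = a == g
        out[g] = {
            "acceptance (independence)": rate(m, yhat == 1),
            "TPR (separation)": rate(m & (y == 1), yhat == 1),
            "FPR (separation)": rate(m & (y == 0), yhat == 1),
            "PPV (sufficiency)": rate(m & (yhat == 1), y == 1),
        }
    return out

rng = np.random.default_rng(0)
a = rng.integers(0, 2, 1000)                           # two groups
y = rng.integers(0, 2, 1000)                           # true outcomes
yhat = (rng.random(1000) < 0.4 + 0.2 * a).astype(int)  # group-skewed predictions
for g, stats in criteria(yhat, y, a).items():
    print(g, {k: round(float(v), 2) for k, v in stats.items()})
```

In this toy data the acceptance rates differ by construction (independence is violated), while PPV stays near the base rate for both groups.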
Note: ProPublica implicitly adopted equality of false positive rates as a fairness criterion in their article on COMPAS scores (Black defendants had “twice the false positive rate” of White defendants). Northpointe, the maker of the COMPAS software, emphasized the importance of calibration by group in their rebuttal to ProPublica’s article.
Chapter 4: Relative notions of fairness
Question: why might we be concerned about uneven allocation of opportunities across specific groups and society overall?
Note: Race and gender have been the historical basis for organizing in most societies, not just idiosyncratic traits employers use to discriminate.
6 reasons why discrimination is morally wrong:
- Relevance: race/gender has no relevance to the outcome of the decision
- Generalization: treats people within groups as overly uniform
- Prejudice: treats some groups as inferior to others
- Disrespect: demeans people of specific groups
- Immutability: treats people differently based on criteria they have no control over
- Compounding injustice: decisions should not compound harms from past injustices
Equality of opportunity: 3 Views
- Narrow: similar people should be treated similarly, based on their current level of similarity
e.g. education admission through a standardized exam (meritocracy)
- Middle: treat seemingly dissimilar people similarly, discounting dissimilarity that results from past injustice beyond their control
e.g. affirmative action; the Texas Top 10% law: the top 10% of each high school class is guaranteed admission
Some institutions must bear the cost even if there may be no guaranteed reward
- Broad: society should be organized such that people of similar ability should be able to attain similar outcomes
e.g. equalize quality of education accessible to rich and poor (not admissions)
Tensions: at what point in life are we ultimately responsible for how we compare with others?
Randomization and Thresholding: we must recognize that precisely controlled and purposeful randomness is not the same as arbitrariness or capriciousness.
Randomized allocation is appropriate when three conditions hold (e.g. affordable housing, immigration visas):
- a resource to be allocated is indivisible,
- there are fewer units of it than claimants, and
- there is nothing that entitles one claimant to the resource any more or any less than other claimants.
Base rates (error-rate parity somewhat realizes the middle view):
One thing we can do even without the features is to look at differences in base rates (i.e., the rates at which different groups achieve desired outcomes, such as loan repayment or job success). If the base rates are significantly different, and if we assume that individual differences in ability and ambition cancel out at the level of groups, it suggests that people's qualifications may differ due to circumstances beyond the individual.
In fact, if base rates are so different that we expect large disparities in error rates that cannot be mitigated by interventions like data collection, then it suggests that the use of predictive decision making is itself problematic, and perhaps we should scrap the system or apply more fundamental interventions
Chapter 7: Broader view of discrimination
Social Scientists organize discrimination into 3 levels:
- Structural: the way society is organized e.g. laws
- Predictive systems preserve structural advantages and disadvantages
- A 2019 study found a healthcare risk prediction system exhibited racial bias, assigning lower risk scores to equally at-risk Black patients versus White patients due to the model predicting costs instead of needs, reflecting less spending on Black patients with the same conditions.
- Online job ads reinforce structural discrimination e.g. truck drivers
- Propagate algorithmic monoculture i.e. homogeneity in decision making
- Shift power to those at the top of the bureaucracy - e.g. policy makers, ML experts
- Interventions:
- Rather than individual consent, allow collective consent by geographical communities, which can collectively accept or reject tools
- Organizational: at the level of organizations, e.g. universities
- Interpersonal: attitudes and beliefs of individuals
13) Kraut, Robert E., and Paul Resnick. Building Successful Online Communities: Evidence-Based Social Design (2012)
Chapter 4: Regulating Behavior in Online Communities
Part 1: How to lessen impact of bad behavior:
- Hide/degrade inappropriate posts
- Revert bad wiki edits
- Remove or discount manipulated ratings/links
Part 2: How to limit bad behavior:
- Throttles/quotas: Limit the rate of posting/editing to prevent abuse (see the sketch after this list)
- Charging currency: Require earned credits/currency to post, limiting repetitive bad behavior
- Gags/bans: Temporarily or permanently restrict accounts from participating
- Registration barriers: Make it harder to create new accounts to bypass restrictions
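A minimal sketch of the throttle idea above (hypothetical code, not from the book): a per-user quota over a rolling time window.

```python
# Per-user posting throttle: fixed quota per rolling window (illustrative).
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 3600   # one-hour rolling window
MAX_POSTS = 5           # quota per window
_history = defaultdict(deque)

def allow_post(user_id, now=None):
    """Return True and record the post if the user is under quota."""
    now = time.time() if now is None else now
    posts = _history[user_id]
    while posts and now - posts[0] > WINDOW_SECONDS:
        posts.popleft()             # drop posts that fell out of the window
    if len(posts) >= MAX_POSTS:
        return False                # over quota: reject (or queue/degrade)
    posts.append(now)
    return True

print([allow_post("alice", now=t) for t in range(6)])  # five True, then False
```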
Part 3: How to encourage voluntary compliance:
- Clear policies & fair enforcement: Well-defined, openly enforced rules increase legitimacy
- Graduated sanctions: Start with light sanctions, escalating for repeat offenses
- Ignore trolls: Deprive attention-seekers of the reaction they want
- Allow saving face: Frame disciplinary actions in a way that avoids embarrassment
- Community norm-setting: Involve the community in defining and endorsing behavioral guidelines
Recommendations:
- Start with softer approaches:
- Clarify community guidelines
- Gentle reminders for minor violations
- Move off-topic discussions to appropriate spaces
- Use tangible remedies for persistent offenses:
- Hiding/removing problematic content
- Temporarily restricting accounts
- Permanent bans for unremitting abuses
- Build legitimacy and compliance:
- Community participates in setting norms/rules
- Graduated series of sanctions
- Emphasize voluntary cooperation over punitive measures
Chapter 6: Starting New Online Communities
The chapter emphasizes the importance of carving out a useful niche, defending it from competitors, and reaching critical mass to ensure community success.
Design decisions for carving out a niche involve:
- Interactions: select, sort, highlight, notifications
- Structure: size and scope
Design decisions for reaching critical mass involve:
- External communication and integration: sharing IDs and profiles
- Create rewards for retention and recruitment
- Advertising and communicating in the right way
14) Van den Hoven, Jeroen. The Cambridge Handbook of Information and Computer Ethics (2010)
Chapter 4: The use of normative theories in computer ethics.
Moral dilemmas: there are two choices and a person cannot take both; how should they proceed? E.g. the trolley problem: a train is chugging along; if you do nothing, it will kill 5 people; if you pull a lever, you will kill 1 person.
Utilitarian ethics: limit the damage; pull the lever and kill 1 person
Deontological ethics: do nothing, as a matter of principle
- Moral dilemmas are often the result of hundreds of prior design decisions and choices
- Computer ethicists should probe beyond the status quo and see how the problem came into being and what are the design decisions that have led up to it.
- We cannot resolve trolley problems satisfactorily once they have already presented themselves to us; we need instead to prevent them from occurring in the first place.
Criticisms of LLMs in HCI work:
- What’s the purpose of scale? What’s the point if we learn the same thing from 1 person vs 1 million people?
- LLM-based work won't show surprises
Committee Feedback:
- What should advertisers actually do, assuming that they will use an image? What societal considerations should they look at and assess?
- I need to be more precise and take a stance in these hard societal questions. Someone will be unhappy, but that’s okay. My responses sounded too good.
- Build up a foundation in CSS and econometrics/stats classes
- Look up power analysis (see the sketch at the end of these notes)
- Find the syllabus of a relevant course and read up. But don't limit yourself to reading; plan your next research project to use those analyses.
- Deactivations - 2 kinds:
- Deactivations due to algorithmic action e.g. rideshare
- Loss of Jobs: law firms will hire half the number of associates next year because LLMs can do a very good first draft
- Use of LLMs in policy: for public comments, there's a law that every comment made by a human must be accounted for (?), so validity becomes very, very important: ensure that the LLM is accurately representing the voices of people.
- Think closely about the validity of LLMs for specific tasks
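For the power-analysis note above, a minimal sketch (my addition, using statsmodels): the sample size per group needed for a two-sample t-test to detect a medium effect.

```python
# Sample size per group for a two-sample t-test (illustrative assumptions:
# Cohen's d = 0.5, alpha = 0.05, desired power = 0.8).
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n_per_group))  # about 64 participants per group
```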