AI Bias Risk Assessment Template: v0.1
Name: [Project or Component Name]
Contributors: [Names]
AREA | CALCULATED RISK OF BIAS
DATA BIAS | MEDIUM
ALGORITHMIC BIAS | LOW
USER INTERACTION BIAS | HIGH
SAMPLING/SELECTION BIAS | MEDIUM
Question | Answer (select the most appropriate answer; all items must hold true in each selection) | Additional info (provide any additional information relevant to the question)
DATA BIAS | Impact Responses | Probability of Bias | Definitions and Examples | Notes
Which best describes your consideration of cultural bias? | I have discovered unwanted cultural bias but implemented bias mitigation. | Somewhat Likely. | Cultural bias arises from a lack of diversity in gender, race, economic status, age, tribe, education, religion, language, and geographic location. Examples: In both the Open Images and ImageNet datasets, the US and Britain are the top locations represented. Datasets like IJB-A and Adience contain mainly light-skinned subjects, which can bias the analysis of dark-skinned groups.
Which best describes your consideration of historical bias? | I do not know or have not considered historical bias. | Likely. | Historical bias arises when past issues in the world seep into data. Example: 5% of Fortune 500 CEOs are women, which would cause search results to be biased toward male CEOs. Although this reflects reality, you should consider whether your algorithm should reflect that reality.
Which best describes your consideration of aggregation bias? | I do not know or have not considered aggregation bias. | Likely. | Aggregation bias arises when false conclusions are drawn for a subgroup based on observing other subgroups, or more generally when false assumptions about a population affect the model’s outcome. Example: You might have data showing that one particular state has a lower-than-average per-capita income. However, you can’t say for sure that every county in that state has a lower-than-average income, and you definitely can’t say that every person in the state has a low income.
Which best describes your consideration of temporal bias? | I do not know or have not considered temporal bias. | Likely. | Temporal bias arises when populations and/or behaviors differ over time. Example: On Twitter, people talking about a particular topic start using a hashtag to capture attention, then continue the discussion about the event without using the hashtag.
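The aggregation bias example in the table above (a state average does not describe every county) can be checked with a few lines of code. This is a minimal sketch: the county names, income figures, and the `national_average` benchmark are all invented for illustration, and the state figure is an unweighted mean of counties rather than a true population-weighted per-capita income.

```python
from statistics import mean

# Hypothetical per-capita incomes (in thousands) for the counties of one state.
# Every name and number here is made up for illustration.
county_income = {
    "County A": 31.0,
    "County B": 29.5,
    "County C": 58.0,
    "County D": 33.5,
}
national_average = 40.0  # assumed benchmark, not real data

# Simplification: unweighted mean across counties.
state_average = mean(county_income.values())
state_is_below = state_average < national_average

# Counties that contradict the state-level conclusion.
counties_above = [
    name for name, income in county_income.items() if income > national_average
]

print(f"State average: {state_average}, below national average: {state_is_below}")
print(f"Counties above the national average: {counties_above}")
```

A check like this is one way to confirm whether a conclusion drawn at the aggregate level also holds for each subgroup before it is used as a model assumption.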
ALGORITHMIC BIAS | Impact Responses | Probability of Bias | Definitions and Examples | Notes
Which best describes your consideration of algorithmic bias? | I do not know or have not considered algorithmic bias. | Likely. | Algorithmic bias arises when the bias is not present in the input data and is added purely by the algorithm. Example: After uploading an image, a system asks the user for tags to associate with the image, but also provides a list of recommended tags. Users will generally choose recommended tags rather than providing their own. In this case, the algorithm reduces the chance of capturing human-labeled data.
Which best describes your consideration of popularity bias? | I do not know or have not considered popularity bias. | Likely. | Popularity bias arises because items that are more popular tend to be exposed more. However, popularity metrics are subject to manipulation, so an item's prominence may reflect bias rather than quality. Example: Fake reviews and social bots in search engines or recommendation systems, where popular objects are presented more to the public.
Which best describes your consideration of evaluation bias? | I do not know or have not considered evaluation bias. | Likely. | Evaluation bias arises during model evaluation because of the use of inappropriate and disproportionate benchmarks. Example: The Adience and IJB-A benchmarks were used to evaluate facial recognition systems that were biased with respect to skin color and gender.
Which best describes your consideration of emergent bias? | I do not know or have not considered emergent bias. | Likely. | Emergent bias arises in algorithms that use analysis of data to feed and present other content that matches the set of ideas the user has already seen. Example: Tay, a Twitter bot created by Microsoft, assimilated some of the Internet's worst tendencies, including misogyny and racism, after learning to engage in casual and playful conversations with users.
Which best describes your consideration of ranking or position bias? | I do not know or have not considered ranking or position bias. | Likely. | Ranking or position bias arises when items are ordered or ranked such that top-ranked items attract more attention and get more clicks. This category of bias greatly affects learning algorithms in search engines and crowdsourcing applications.
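One way to surface the evaluation bias described in the table above is to report accuracy per subgroup instead of a single aggregate, so that a benchmark dominated by one group cannot hide poor performance on another. The sketch below assumes hypothetical evaluation records of the form `(group, true_label, predicted_label)`; the group names, labels, and the helper `accuracy_by_group` are illustrative and not part of this template.

```python
from collections import defaultdict

# Hypothetical evaluation records: (group, true_label, predicted_label).
# group_a dominates the benchmark; group_b is underrepresented.
records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 0, 0),
]


def accuracy_by_group(rows):
    """Return the fraction of correct predictions for each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in rows:
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# The aggregate looks healthy because group_a makes up most of the benchmark.
overall = sum(truth == pred for _, truth, pred in records) / len(records)
per_group = accuracy_by_group(records)

print(f"Overall accuracy: {overall}")
print(f"Accuracy by group: {per_group}")
```

In this made-up data the overall accuracy is high while the smaller group is served far worse, which is the pattern a disproportionate benchmark can conceal.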
USER INTERACTION BIAS | Impact Responses | Probability of Bias | Definitions and Examples | Notes
Which best describes your consideration of social bias? | I do not know or have not considered social bias. | Likely. | Social bias happens when other people’s actions or content coming from them affect our judgment. Example: We want to rate or review an item with a low score, but when influenced by other high ratings, we change our score, thinking that perhaps we are being too harsh.
Which best describes your consideration of presentation bias? | I do not know or have not considered presentation bias. | Likely. | Presentation bias arises when there are differences in how items are presented for user feedback. Fonts, colors, and media types should therefore be consistent across items presented for feedback. Example: If some items look more appealing than others, they could attract more attention and get more clicks.
Which best describes your consideration of observer bias? | I have not discovered any observer bias or determined it is acceptable. | Likely. | Observer or confirmation bias is the effect of seeing what you expect to see or want to see in the data. Example: Researchers may subconsciously project their expectations onto the research. They may unintentionally influence participants during interviews and surveys, or cherry-pick participants or statistics that will favor their research.
Which best describes your consideration of linking bias? | I do not know or have not considered linking bias. | Likely. | Linking bias arises when network attributes obtained from user connections, activities, or interactions differ and misrepresent the true behavior of the users. Example: Social networks can be biased toward low-degree nodes when only considering the links in the network and not considering the content and behavior of users in the network.
Which best describes your consideration of behavioral bias? | I do not know or have not considered behavioral bias. | Likely. | Behavioral bias arises from different user behavior across platforms, contexts, or different datasets. Example: Differences in emoji representations among platforms can result in different reactions and behavior from people and sometimes even lead to communication errors.
Which best describes your consideration of cause-effect bias? | I have discovered cause-effect bias but implemented bias mitigation. | Likely. | Cause-effect bias arises as a result of the fallacy that correlation implies causation. Example: A data analyst in a company wants to analyze how successful a new loyalty program is. The analyst sees that customers who signed up for the loyalty program are spending more money in the company’s e-commerce store than those who did not. The analyst jumps to the conclusion that the loyalty program is successful. However, it might be that only more committed or loyal customers, who might have planned to spend more money anyway, are interested in the loyalty program.
Which best describes your consideration of production bias? | I do not know or have not considered production bias. | Likely. | Production bias arises from structural, lexical, semantic, and syntactic differences in the content generated by users. Example: Differences in the use of language across different gender and age groups. Such differences can also be seen across and within countries and populations.
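The loyalty program example given for cause-effect bias in the table above can be reproduced with a small constructed dataset. In this sketch, spend depends only on whether a customer is already committed, so the program has no real effect by construction; the customer counts, spend values, and the helper `avg_spend` are all hypothetical.

```python
from statistics import mean

# Hypothetical customers: (is_committed, joined_program, monthly_spend).
# Spend depends only on commitment, so the program has zero effect by design.
customers = (
    [(True, True, 100.0)] * 8      # committed customers who joined
    + [(True, False, 100.0)] * 2   # committed customers who did not join
    + [(False, True, 40.0)] * 2    # casual customers who joined
    + [(False, False, 40.0)] * 8   # casual customers who did not join
)


def avg_spend(rows, joined, committed=None):
    """Average spend for a membership status, optionally within one commitment level."""
    return mean(
        spend
        for is_committed, has_joined, spend in rows
        if has_joined == joined and (committed is None or is_committed == committed)
    )


# Naive comparison: members versus non-members, ignoring commitment.
naive_gap = avg_spend(customers, True) - avg_spend(customers, False)

# Stratified comparison: members versus non-members within each commitment level.
stratified_gaps = {
    committed: avg_spend(customers, True, committed) - avg_spend(customers, False, committed)
    for committed in (True, False)
}

print(f"Naive spend gap: {naive_gap}")
print(f"Gap within each commitment level: {stratified_gaps}")
```

The naive gap is large because committed customers are more likely to join, while the gap disappears once customers are compared within the same commitment level. Stratifying on a suspected confounder is only one possible check and assumes the confounder is observed.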
SAMPLING/SELECTION BIAS | Impact Responses | Probability of Bias | Definitions and Examples | Notes
Which best describes your data sources and consideration of activity bias? | I do not know or have not considered the sources of data. | Likely. | Data that come from a single or selective set of sources are unlikely to reflect the population as a whole. Activity bias arises because only a few users generally contribute to the total set of data. Examples: 7% of users produce 50% of the posts on Facebook. 4% of users produce 50% of the reviews on Amazon.
Which best describes your sampling method? | A probability sampling method was used. | Likely. | Probability sampling helps produce representative samples by eliminating voluntary response bias and guarding against undercoverage bias. All probability sampling methods rely on random sampling.
Which best describes your sample size? | I do not know how to describe the sample size. | Likely. | Increasing the sample size tends to reduce the sampling error as it makes the sample statistic less variable. However, a large sample size cannot correct for the methodological problems that produce survey bias. Example: The Literary Digest survey had a sample of over 2 million completed surveys. However, the large sample size could not overcome the problems of undercoverage and nonresponse bias in the sample.
Which best describes your consideration of measurement bias? | I have discovered unwanted measurement bias and am unable to mitigate it. | Likely. | Measurement bias arises from the way we choose, utilize, and measure a particular feature. Example: Prior arrests and friend/family arrests are used as mismeasured proxy variables for measuring level of "riskiness" or "crime". Certain communities are controlled and policed more frequently and thus have higher arrest rates; however, one should not conclude that people from these communities are more dangerous.
Which best describes your consideration of omitted variable bias? | I do not know or have not considered omitted variable bias. | Likely. | Omitted variable bias occurs when one or more important variables are left out of the model. Example: A model designed to predict the annual percentage rate at which customers will stop subscribing to a service fails to account for the appearance of a competitor with lower prices.
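The sample size row above notes that a large sample cannot correct for undercoverage. The simulation below is a rough sketch of that point under invented numbers: a hypothetical population of 100,000 people in which a reachable "listed" subgroup holds different views from everyone else. A large sample drawn only from the listed subgroup is compared with a much smaller simple random sample of the whole population.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1 = supports a proposal, 0 = does not.
# "listed" people are reachable by the survey; "unlisted" people are not.
listed = [1] * 12_000 + [0] * 18_000      # 30,000 people, 40% support
unlisted = [1] * 42_000 + [0] * 28_000    # 70,000 people, 60% support
population = listed + unlisted
true_support = sum(population) / len(population)

# Large sample with undercoverage: drawn only from the listed subgroup.
big_biased = random.sample(listed, 20_000)
# Small probability sample: simple random sample of the whole population.
small_random = random.sample(population, 1_000)

big_biased_estimate = sum(big_biased) / len(big_biased)
small_random_estimate = sum(small_random) / len(small_random)

big_error = abs(big_biased_estimate - true_support)
small_error = abs(small_random_estimate - true_support)

print(f"True support: {true_support}")
print(f"Large undercovered sample: {big_biased_estimate:.3f} (error {big_error:.3f})")
print(f"Small random sample: {small_random_estimate:.3f} (error {small_error:.3f})")
```

Under these assumptions the smaller random sample lands much closer to the true value than the sample that is twenty times larger, because the larger sample's error comes from who was excluded rather than from how many were asked.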