Fields: Name (first & last, or pseudonym); Funding source; Country; City; Main research topic; Additional research topics; Link to most relevant research write-up; Link to public profile; I'd like to be approached for...; Notes.

Name: Shoshannah Tekofsky
Funding source: FTX FF
Country: Netherlands
City: Utrecht
Main research topic: Skilling up, with a specific focus on mapping out the problem space of alignment
Link to most relevant research write-up: https://www.lesswrong.com/s/aK4hyeJyM2sYNS4Yc
Link to public profile: https://www.lesswrong.com/users/shoshannah-tekofsky
I'd like to be approached for: Research collaboration, Writing collaboration, Writing reviews, Sparring partner (chat), Local AIS events, Online AIS events, Co-organizing AIS events

Name: Tamsin Leake
Funding source: Attempting to get LTFF funding
Country: France
City: Nantes
Main research topic: Alignment to formal goals
Additional research topics: Anthropics, metaethics, embedded agency, various other agent foundations topics
Link to most relevant research write-up: https://carado.moe/rough-sketch-formal-aligned-ai.html
Link to public profile: https://carado.moe
I'd like to be approached for: Research collaboration, Writing collaboration, Sparring partner (chat), Sparring partner (video call), Local AIS events, Online AIS events

Name: Nicholas Kross
Funding source: None; open to funding, and will likely apply in the near future
Country: United States
City: Rochester, NY
Main research topic: Upskilling (especially on prerequisite/useful maths), hopefully followed by theory/paradigm work
Additional research topics: I like johnswentworth's work, although I haven't had the chance to dive deeply into it yet
Link to most relevant research write-up: N/A
Link to public profile: https://www.lesswrong.com/users/nicholaskross
I'd like to be approached for: Research collaboration, Writing collaboration, Writing reviews, Sparring partner (chat), Local AIS events, Online AIS events, Being taught more about AIS

Name: Arun Jose
Funding source: LTFF
Country: India
City: Trivandrum
Main research topic: High-level interpretability: trying to identify the presence or nature of certain high-level mechanistic structures or properties in a model (think the internalized objective of an optimizer, or myopia)
Additional research topics: Simulator theory (https://www.lesswrong.com/tag/simulator-theory); some agent foundations work (particularly as relevant to high-level interpretability, such as mechanistic models of agency); tentatively, modularity in neural networks; generally upskilling in depth on PyTorch (especially with transformers) and low-level interpretability (most existing interpretability work), and in breadth on topics like linear algebra and complex systems, to have more frames for thinking about these problems
Link to most relevant research write-up: https://www.lesswrong.com/posts/JqnkeqaPseTgxLgEL/conditioning-generative-models-for-alignment
Link to public profile: https://www.jozdien.com/
I'd like to be approached for: Research collaboration, Writing collaboration, Writing reviews, Sparring partner (chat), Sparring partner (video call), Local AIS events, Online AIS events

Name: Roman Leventov
Funding source: Self-funded
Country: Indonesia
City: Bali
Main research topic: Scale-free, physics-based ethics; collective intelligence; Active Inference; agency
Additional research topics: Economic and social impacts of the development of TAI
Link to most relevant research write-up: https://www.lesswrong.com/posts/oSPhmfnMGgGrpe7ib/properties-of-current-ais-and-some-predictions-of-the
Link to public profile: https://www.lesswrong.com/users/roman-leventov
I'd like to be approached for: Research collaboration, Writing collaboration, Writing reviews, Sparring partner (chat), Sparring partner (video call), Local AIS events, Online AIS events

Name: Leon Lang
Funding source: SERI MATS
Country: Netherlands
City: Amsterdam
Main research topic: Upskilling and Shard Theory
Additional research topics: Equivariant Deep Learning, Multivariate Information Theory
Link to most relevant research write-up: https://openreview.net/forum?id=ajOrOhQOsYx
Link to public profile: https://twitter.com/Lang__Leon
I'd like to be approached for: Research collaboration, Writing collaboration, Writing reviews
Notes: I'm doing a PhD unrelated to AI safety. At the start of 2023, I intend to take a half-year break from it to focus on AI safety upskilling, starting with SERI MATS.

Name: Marius Hobbhahn
Funding source: SERI MATS, LTFF, Emergent Ventures
Country: Germany (but might move soon)
City: Tübingen
Main research topic: Currently switching to interpretability
Additional research topics: Have done work in AI forecasting
Link to most relevant research write-up: https://www.lesswrong.com/posts/bumgqvRjTadFFkoAd/science-of-deep-learning-a-technical-agenda
Link to public profile: https://www.lesswrong.com/users/marius-hobbhahn, https://www.mariushobbhahn.com/
I'd like to be approached for: Research collaboration, Writing collaboration, Writing reviews, Online AIS events

Name: Koen Holtman
Funding source: Self-funded until now, but starting to look for external funding
Country: The Netherlands
City: Eindhoven
Main research topic: Research into AI and AGI safety topics with the intent to influence AI policy/regulation
Additional research topics: Agent foundations for aligned agents, corrigibility, formal methods (these used to be my main topics, now secondary)
Link to most relevant research write-up: https://arxiv.org/abs/2112.10190
Link to public profile: https://nl.linkedin.com/in/koen-holtman-2312844
I'd like to be approached for: Research collaboration, Sparring partner (video call), Local AIS events, Online AIS events

Name: Nathan Helm-Burger
Country: USA
City: Nevada City, CA
Main research topic: Interpretable brain-inspired architectures
Additional research topics: Safety & generality benchmarks
Link to public profile: https://www.lesswrong.com/users/nathan-helm-burger
I'd like to be approached for: Research collaboration, Writing collaboration, Writing reviews, Local AIS events, Online AIS events

Name: lovetheusers
Funding source: SERI MATS
Country: United States
City: Detroit
Main research topic: Prosaic alignment of large language models
Additional research topics: Alignment from unlabeled instructions, discovering cooperative intent, coprotection, mutual empowerment
Link to public profile: https://twitter.com/lovetheusers
I'd like to be approached for: Research collaboration, Writing collaboration, Sparring partner (chat), Online AIS events

Name: Rupert McCallum
Funding source: FTX Foundation
Country: Netherlands
City: Eindhoven
Main research topic: Applying mathematical logic to Agent Foundations problems
I'd like to be approached for: Research collaboration, Local AIS events

Name: Catalin Mitelut
Country: Switzerland
City: Zurich
Main research topic: Agency and agency loss in AI-human interactions
Additional research topics: Bayesian inference and mechanistic interpretability in LLMs
Link to public profile: https://www.lesswrong.com/users/catubc
I'd like to be approached for: Research collaboration, Writing collaboration, Local AIS events, Online AIS events
Notes: Agent and agency foundations; limits of IRL; mechanistic interpretability of agency; neuroscience of agency.

Name: Alex Mennen
Country: United States
City: Berkeley, CA
Main research topic: Inferring large sparse NNs approximated by smaller NNs (though I haven't gotten anywhere interesting on this and might give up on it)
Additional research topics: Upskilling in ML/prosaic alignment; investigating the relationship between the infinite-compute case of proof-based open-source game theory and the large finite-compute case (though there are many topics I might be interested in working on)
Link to public profile: https://www.lesswrong.com/users/alexmennen
I'd like to be approached for: Research collaboration, Sparring partner (chat), Local AIS events

Name: Gunnar Zarncke
Funding source: LTFF
Country: Germany
City: Hamburg
Main research topic: Implementing brain-like AGI; work on the aintelope project
Additional research topics: Showing that a relatively simple set of instincts can shape complex and specifically pro-social behaviors in simulated agents
Link to most relevant research write-up: https://www.lesswrong.com/posts/c2tEfqEMi6jcJ4kdg/brain-like-agi-project-aintelope
Link to public profile: https://www.lesswrong.com/users/gunnar_zarncke
I'd like to be approached for: Research collaboration, Sparring partner (video call), Collaborators implementing agents in Python

Name: Ze Shen
Funding source: None
Country: Malaysia
City: Miri
Main research topic: Part-time upskilling for now
Link to most relevant research write-up: https://www.lesswrong.com/posts/dYHiMeSdLrrX3cy4a/embedding-safety-in-ml-development
Link to public profile: https://www.lesswrong.com/users/zeshen
I'd like to be approached for: Research collaboration, Writing collaboration, Writing reviews, Sparring partner (chat), Online AIS events

Name: Antonio Borrero
Funding source: None
Country: Spain
City: Cádiz
Main research topic: Neuroscience and BCI for AI alignment
Link to public profile: https://www.lesswrong.com/users/antb
I'd like to be approached for: Research collaboration, Sparring partner (chat), Local AIS events, Online AIS events
Notes: Fairly little experience in AI alignment

Name: Johannes C. Mayer
Funding source: LTFF
Country: Germany, currently in the UK
City: London
Main research topic: Understanding the properties AGIs are likely to have; e.g., better understanding world models, since any AGI we build will have one
Link to public profile: https://www.lesswrong.com/users/johannes-c-mayer
I'd like to be approached for: Research collaboration, Writing collaboration, Sparring partner (chat), Sparring partner (video call), Local AIS events, Online AIS events

Name: Gergely Szucs
Funding source: SERI MATS
Country: USA
City: San Francisco
Main research topic: Infra-Bayesianism and embedded agency
Additional research topics: Mathematical foundations of agency
Link to most relevant research write-up: https://www.lesswrong.com/posts/cYJqGWuBwymLdFpLT/non-unitary-quantum-logic-seri-mats-research-sprint
Link to public profile: https://www.linkedin.com/in/yegreg/
I'd like to be approached for: Research collaboration, Writing collaboration, Writing reviews

Name: Miguelito de Guzman
Funding source: None
Country: Philippines
City: George Town
Main research topic: Corrigibility, inner and outer alignment
Additional research topics: GPT
Link to most relevant research write-up: https://www.lesswrong.com/posts/pu6D2EdJiz2mmhxfB/gpt-2-shuts-down-itself-386-times-post-fine-tuning-with
Link to public profile: https://www.lesswrong.com/users/whitehatstoic
I'd like to be approached for: Research collaboration, Research funding

Name: Anna Katariina Wisakanto
Funding source: Self-funded
Country: Finland/UK
City: Helsinki/Cambridge
Main research topic: Formalisation of a general theory of alignment by bridging the gap between AI ethics and alignment with a pragmatic approach; complex systems for AI safety; user dependency on general-purpose AI
Additional research topics: Foundations of alignment, tool alignment, meta technical alignment, interconnectedness and cascades, norm convergence, "AI interface assistants to ascertain GPAI is a useful tool for all users", bidirectional theory of alignment, emergent "agentic" system behavior, LLM social alignment, multipolar AI safety
Link to most relevant research write-up: http://dx.doi.org/10.13140/RG.2.2.27582.97600
Link to public profile: http://lcfi.ac.uk/people/anna-wisakanto/
I'd like to be approached for: Research collaboration, Writing collaboration, Sparring partner (chat), Sparring partner (video call), Local AIS events, Online AIS events