Mailing list: Link to listserv
More information: http://harp.ri.cmu.edu/reading-group/
Meeting time: Wednesdays, 1-2pm (Spring 2024)
Meeting formats
Presentation: Deliver a prepared presentation of a paper (one paper can be presented by more than one person).
Speed read: Speed-read the paper for 30 minutes and use the remaining time for discussion.
Date | Format | Presenter/Contact Name | Presenter/Contact Email | Topic | Paper Name/Citation | Paper Link (email the organizers directly if not available)
January 17 | Speed read | collective | collective | | IIFL: Implicit Interactive Fleet Learning from Heterogeneous Human Supervisors | https://proceedings.mlr.press/v229/datta23a/datta23a.pdf
January 24 | Speed read | collective | collective | | Implicit Behavioral Cloning | https://arxiv.org/abs/2109.00137
January 31 | Speed read | | | | When Should We Prefer Offline Reinforcement Learning over Behavioral Cloning | https://openreview.net/pdf?id=AP1MKT37rJ
February 7
February 14 | Speed read | Pranay | | | Diffusion World Model | https://arxiv.org/pdf/2402.03570.pdf
February 21 | Presentation | Pranay | | RLHF | Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
February 28 | Speed read | Pranay | | RLHF | Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
March 6 (Spring Break)
March 13
March 20 | Speed read | Pranay | | LLM | Are emergent properties of LLM a Mirage
March 27 | Speed read | | | | The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications | https://ojs.aaai.org/index.php/AAAI/article/view/25733
April 3
April 10
April 17
April 24 | Talk | Tiffany Min | | LLM / Human motion | Situated Instruction Following | Will go on arXiv soon
May 1
May 8 | Role playing - Direct Preference Optimization
May 15
Papers People Want to Read
Add your name to support the motion
Brohan, A., Brown, N., Carbajal, J., Chebotar, Y., Chen, X., Choromanski, K., ... & Zitkovich, B. (2023). RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control. arXiv preprint arXiv:2307.15818. | Suresh
AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents | Suresh
Implicit Behavioral Cloning
Analyzing the Variety Loss in the Context of Probabilistic Trajectory Prediction | Abhijat
Self-Rewarding Language Models | Pranay
Diffusion World Model | Pranay
Data Distributional Properties Drive Emergent In-Context Learning in Transformers | Pranay
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Pranay
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback | Pranay
Direct Preference Optimization | Pranay
Are emergent properties of LLM a Mirage | Pranay
Unfamiliar Finetuning Examples Control How Language Models Hallucinate | Pranay
The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications | Henny
The Pitfalls of Next-Token Prediction | Abhijat