TA | Title | Abstract/Summary

TA: Ashish Singh
Title: Friend or Foe? Analysis of training on out-of-distribution data with a fixed test set
Summary: Given a fixed dataset for image classification (e.g., CUB), explore whether augmenting the training dataset with additional samples improves model performance. What types of samples help? What types do not? How would you design a data filtering/augmentation pipeline?

TA: Ashish Singh
Title: Mini-Spotify: neural-network-based music retrieval
Summary: Build a simple neural-network-based music/song retrieval system where, given a music query, the system retrieves song audio/metadata. Key questions: How would you design the query? How would you encode the music query? Fixed- or dynamic-length queries?

TA: Ashish Singh
Title: Is Charizard a dragon? Zero-shot categorization/attribute discovery for artwork
Summary: Given an artwork, construct an ensemble of models to provide visual attributes in a zero-shot manner, i.e., the models should not be trained on the artwork or similar artworks. Evaluate the models' performance with a small-scale human study, and evaluate the generalizability of the models across different domain settings (for example, different types of artwork could serve as different domains).

TA: Ashish Singh
Title: ML Reproducibility Challenge (general idea; talk to me to scope a suitable project)
Summary: Select your favourite paper and reproduce its results. Analyze how easy or hard it was to reproduce the results, what changes were required, and under what settings things work or don't work.

TA: Ashish Singh
Title: What makes Amherst look like Amherst?
Summary: Given a large set of images taken at different places in Amherst, what are the key images/image patches that define Amherst? Reference: http://graphics.cs.cmu.edu/projects/whatMakesParis/

TA: Eddie
Title: Representation learning
Summary: Train a model that takes an image and produces a representation vector that is useful for downstream tasks.

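One common way to learn such a representation is with a contrastive objective, where embeddings of two augmented views of the same image are pulled together. A minimal NumPy sketch of a SimCLR-style InfoNCE loss (one-directional variant; function and parameter names are illustrative, not from the project description):

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """SimCLR-style contrastive loss: each z1[i] should match z2[i]
    among all candidates in the batch. z1, z2: (n, d) embeddings of two views."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)   # unit-normalize
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                      # cosine similarities
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                    # cross-entropy, identity targets
```

Correctly paired views should score a lower loss than mismatched ones, which is a quick sanity check for any implementation.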
TA: Eddie
Title: Conditional flow matching
Summary: Train a continuous normalizing flow that makes use of conditional inputs for guided generation. We'll provide the training and sampling algorithms, so students will need to come up with a good architecture to use the conditional data.

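For intuition on what the model's architecture must consume, here is a minimal NumPy sketch of one conditional flow matching training step, assuming the linear (rectified-flow-style) interpolation path; the actual training and sampling algorithms will be provided by the course staff, and all names here are illustrative:

```python
import numpy as np

def cfm_training_step(x1, cond, model, rng):
    """One conditional flow matching regression step.
    x1: (n, d) batch of data samples; cond: per-sample conditioning vectors."""
    n, d = x1.shape
    t = rng.uniform(size=(n, 1))          # random times in [0, 1]
    x0 = rng.standard_normal((n, d))      # noise endpoints
    xt = (1.0 - t) * x0 + t * x1          # point on the straight path
    v_target = x1 - x0                    # velocity of the straight path
    v_pred = model(xt, t, cond)           # network takes (x_t, t, condition)
    return np.mean((v_pred - v_target) ** 2)
```

The key architectural question the project poses is how `model` fuses `cond` with `xt` and `t` (e.g., concatenation, FiLM, or cross-attention).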
TA: Eddie
Title: Consistency regularization for generative models
Summary: Use data augmentation to improve your generative model's performance.

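One common instantiation of this idea (CR-GAN-style, used here as an assumed example rather than the project's prescribed method) penalizes a discriminator for scoring an image and its augmented copy differently:

```python
import numpy as np

def consistency_loss(discriminator, images, augment, rng):
    """CR-GAN-style penalty: discriminator outputs should agree on
    an image and a semantics-preserving augmentation of it."""
    d_real = discriminator(images)
    d_aug = discriminator(augment(images, rng))
    return np.mean((d_real - d_aug) ** 2)

def random_flip(images, rng):
    """Flip each image in an (n, h, w) batch horizontally with probability 0.5."""
    flip = rng.random(images.shape[0]) < 0.5
    out = images.copy()
    out[flip] = out[flip, :, ::-1]
    return out
```

This term is added to the usual adversarial loss with a weighting coefficient; choosing augmentations that preserve image semantics is the crux of the project.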
TA: Eddie
Title: Generative model distillation
Summary: Train a neural network to generate samples like a pretrained ODE-based generative model does, but without needing to solve an ODE.

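A minimal sketch of the training signal, assuming the simplest distillation recipe (regress a one-step student onto the teacher's ODE solution, obtained here by Euler integration; all names are illustrative):

```python
import numpy as np

def distillation_loss(student, teacher_vf, z, steps=32):
    """Match a one-step student generator to the teacher's ODE solution.
    teacher_vf(x, t): the pretrained model's vector field; z: (n, d) noise."""
    x = z.copy()
    dt = 1.0 / steps
    for i in range(steps):                 # teacher: integrate the ODE from t=0 to 1
        x = x + dt * teacher_vf(x, i * dt)
    return np.mean((student(z) - x) ** 2)  # student: a single forward pass
```

At sampling time only `student(z)` is evaluated, which is the whole point: no ODE solver in the loop.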
TA: Max
Title: Audio classification with CNNs
Summary: What's the best way to classify music genre? The GTZAN dataset provides song samples and spectrogram images with corresponding genre labels.

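GTZAN already ships spectrogram images, but if you want to recompute them from the raw audio (e.g., to control FFT resolution), a minimal NumPy sketch of a log-magnitude spectrogram suitable as CNN input (parameter values are illustrative defaults):

```python
import numpy as np

def log_spectrogram(audio, n_fft=512, hop=256):
    """Log-magnitude STFT spectrogram, shaped (freq_bins, frames) for a CNN."""
    window = np.hanning(n_fft)
    frames = []
    for start in range(0, len(audio) - n_fft + 1, hop):
        frame = audio[start:start + n_fft] * window   # windowed segment
        frames.append(np.abs(np.fft.rfft(frame)))     # magnitude spectrum
    spec = np.stack(frames, axis=1)                   # (n_fft//2 + 1, n_frames)
    return np.log1p(spec)                             # compress dynamic range
```

In practice, mel-scaling the frequency axis (as librosa does) usually helps for genre classification, since it matches human pitch perception.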
TA: Max
Title: Adversarial patch defense
Summary: Adversarial patches, like those in the APRICOT dataset, can be used to fool object detectors. Can you come up with a good defense that is resistant to these patches?

TA: Oindrila
Title: Object localization based on textual description
Summary: Apply a prior visual grounding method in a real-world scenario, such as segmenting a described product in pictures of aisles in grocery stores.

TA: Oindrila
Title: Emotion-based style transfer
Summary: Adjust the style transfer based on the emotion detected in the content image. For example, use a gloomy painting style for sad faces and vibrant styles for happy ones.

TA: Oindrila
Title: Text augmentation using LLMs
Summary: Improve performance on text classification by using LLMs to generate augmented text for under-represented categories.

TA: Junyan
Title: LLM architecture comparison
Summary: Compare the performance and inference speed of mainstream LLM architectures (e.g., encoder-decoder, decoder-only, and RWKV), and understand these architectures' pros and cons.

TA: Junyan
Title: Mini diffusion model for MNIST
Summary: Train a mini diffusion model to generate MNIST digits. Compete on how lightweight the model can be given a minimum quality requirement.

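The training objective is the same regardless of how small the network is, so the lightweight-model competition is purely about the denoiser's architecture. A minimal NumPy sketch of one DDPM-style training step, assuming the standard linear beta schedule and epsilon-prediction loss (all names are illustrative):

```python
import numpy as np

def ddpm_training_step(x0, model, rng, T=1000):
    """Noise clean images x0 via q(x_t | x_0), then regress the noise."""
    betas = np.linspace(1e-4, 0.02, T)           # linear noise schedule
    alpha_bar = np.cumprod(1.0 - betas)          # cumulative signal fraction
    n = x0.shape[0]
    t = rng.integers(0, T, size=n)               # random timestep per image
    ab = alpha_bar[t].reshape(n, *([1] * (x0.ndim - 1)))
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps   # closed-form forward process
    return np.mean((model(xt, t) - eps) ** 2)         # simple epsilon loss
```

For MNIST, even a small U-Net or a few convolutional blocks for `model` can produce legible digits, which is what makes the size/quality trade-off interesting.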
TA: Junyan
Title: The evolution of cats-vs-dogs classification
Summary: Cat-vs-dog classification is a famous and interesting task for machine learning beginners. The techniques have evolved from SVMs to CNNs to ViTs to multimodal LLMs, and the challenges have shifted from feature engineering to model design to prompt engineering. It is interesting to understand this shift, gain hands-on experience implementing each approach, test their performance, and plot the trends in accuracy, inference cost, and model size to understand the development.

TA: Ke
Title: Unsupervised/weakly supervised segmentation for medical imaging sequences
Summary: Utilize domain knowledge such as shape, size, and location information available from medical articles or diagrams, plus the fact that the region of interest changes continuously within an imaging sequence, to develop a segmentation model for medical imaging datasets with no annotations available.

TA: Ke
Title: Supervised segmentation with augmentation for medical imaging sequences
Summary: Use available augmentation methods such as cropping, rotation, and translation to improve the performance of a supervised segmentation model on medical imaging datasets. In addition, think of innovative ways to augment the available dataset to further improve segmentation performance.

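One detail that trips people up in segmentation augmentation is that every geometric transform must be applied identically to the image and its mask, or the labels drift out of alignment. A minimal NumPy sketch of a paired augmentation (transform choices and parameters are illustrative):

```python
import numpy as np

def augment_pair(image, mask, rng):
    """Apply the same random rotation/flip/crop to a 2-D image and its mask,
    so segmentation labels stay pixel-aligned."""
    k = rng.integers(0, 4)
    image, mask = np.rot90(image, k), np.rot90(mask, k)   # random 90-degree rotation
    if rng.random() < 0.5:                                # random horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    h, w = image.shape
    ch, cw = int(0.9 * h), int(0.9 * w)                   # random 90% crop
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    return (image[top:top + ch, left:left + cw],
            mask[top:top + ch, left:left + cw])
```

Intensity augmentations (brightness, contrast, noise), by contrast, should be applied to the image only.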
TA: Ke
Title: Representation learning for detecting cardiac diseases with cardiac MRI
Summary: When dealing with a difficult detection problem, it is usually a good idea to find a better representation for the classification task. To improve model performance in detecting heart diseases such as bicuspid aortic valve malformation and mitral regurgitation, develop a model that learns a better representation for the classifier to improve its performance. Note: you should have access to a cardiac imaging dataset with disease annotations, or similar datasets.