[DRAFT]

See tweet: https://twitter.com/_jasonwei/status/1681004467573911553

Q: How did your journey in AI begin?

A: Although the work you know me for is probably from 2022, I've been working on AI since 2017. When I started my undergrad at Dartmouth, I originally wanted to be a banker on Wall Street, which was the American Dream of my parents’ generation (and what most of my friends from my hometown went on to do). But I had a hard time getting a finance internship my freshman year (2017), and so I ended up working with an AI startup that I got connected with through a friend of my mom.

My first interaction with AI that summer was Michael Nielsen's Neural Networks and Deep Learning. I read about backprop, and the idea of learning an arbitrary mapping from input and output data of any type appealed to me. I still remember feeling an order of magnitude more passionate about this than about other subjects I had learned in school.

The cool thing to do in the late 2010s was to publish AI research, and so I tried to do that in college. I did the common path of taking a machine learning class and asking to do research with the professor who taught the class. I worked for two years on deep learning for medical image analysis. My work was decent but not amazing. Dartmouth didn't have a vibrant AI environment and it was hard to find a community. So I'm really grateful for the few people there who were actually interested in AI—Sam Greydanus was someone who I looked up to.

As graduation neared, I targeted both PhD programs and software engineering jobs. I had one good paper accepted to a major conference, and I thought I was going to get into all the PhD programs. I got rejected from all of them except USC. My junior year summer, I interned at Blend and DoorDash, but I wasn't a stellar software engineer and wasn't as passionate about software.

I almost went to do a PhD at USC, but changed my mind when I got an offer to do the Google AI Residency, an 18-month research program for people without AI PhDs to do research at Google. The AI Residency was probably the single biggest jumpstart to my career so far. I probably got in on the merits of a somewhat popular paper that I had written on data augmentation in NLP. The impact of that paper was a surprise to me: I naively applied an intuition that I used in medical image analysis to NLP, and didn't expect it to become so popular.

So after graduating in 2020, I spent a little more than two years working at Google on large language model research. Google was an amazing place to do research, and many prolific researchers built up their reputations from work they did at Google. The AI Residency was particularly effective because the researchers who did well could stay on, and most residents worked really hard to earn that conversion to a permanent role.

There are at least two lessons in my story. Both are cliche but I'll say them anyways:

  1. A lot of things that I thought were failures at the time actually turned out to be good for me. Had I been more successful at getting finance or software engineering internships, I would've missed out on a career in AI that I currently love.
  2. Luck played a big role, but I also created a lot of the opportunities to get lucky. For example, I had absolutely no competitive advantage in writing a well-cited NLP paper, but I created the opportunity for the paper to become popular by writing a blog post about it in easily accessible language and making the code available online.

Questions about me

Q. What is your honest work/life balance? From undergrad till now in industry.

Most of my friends would say that I work pretty hard, but I also know people who work harder than me. I usually work in bursts and turn up the heat when I need to.

In college, I spent a lot of time on research, probably at least twenty hours a week.

I spent some time on other activities, like my fraternity, weightlifting, and then fencing club.

It helped that I never got into video games, but I had a ~1 hour per day YouTube addiction throughout college.

I probably socialized an average amount for Dartmouth students.

Other than that I was probably working. I'd put myself in the top 20% work ethic at Dartmouth.

When I was an AI resident at Google, I worked quite hard. There was a lot of pressure to produce work and stay on in a permanent role. There were a few weeks where I hit 40+ hours of concentrated coding; I didn't do much else those weeks other than exercise, eat, sleep, and a little bit of leisure time.

But other than that, I made the time to weightlift and play tennis with friends.

When I moved to the Bay Area, I made a habit of generally just being at the office working unless I had specific social obligations. I rarely work from home.

Q: What research direction should I work on?

Obviously there's no single right answer for this. My personal opinion is that it's important to work on research that you enjoy, because you'll do a better job at it in the long run. If you don't know what research you enjoy, read broadly for a few weeks or ask others what directions they're excited about, and then pick something and start working on it.

Sometimes you'll have to make a tradeoff between what you want to work on and other factors. For example, you might have an opportunity to work with an amazing professor on something that's not your first choice topic. I think it's fine to do stuff like this if you're going to learn a lot from it, or if it helps you get to where you want to be. But it's important to remember the reasons you do things, and be transparent with yourself about what you want.

One research direction that I'd blanket-suggest people consider is alignment. A few reasons why I consider alignment a good direction:

  • Aligning intelligent AI with human values is certainly going to be very important.
  • Alignment is a relatively nascent field, so the people who work on it earlier will have an outsized opportunity for impact.
  • Alignment can be interdisciplinary; it touches on a lot of other topics such as ethics (which some people might enjoy).
  • There is less supply than demand for alignment researchers, so it can be easier to get a job.

A few reasons why more people aren't working on alignment are that it's not traditional, there aren't great benchmarks, it's not super accessible, and that the goals of alignment aren't agreed on by everyone. But I don't think these reasons are blockers; I've found that people working in alignment are really open to talking to people who are alignment-curious.

Q: Where/how did you learn most of the stuff you needed to conduct effective research? Is it better to spend more time learning, or to jump straight into research if you have interesting ideas?

There are three sources of learning in research, and you want to maximize all three:

  1. You read stuff: other research papers, blog posts, twitter.
  2. Other people tell you stuff: your advisor gives you feedback, reviewers review your paper.
  3. You try stuff: you run an experiment, it works or doesn't work, you dig deeper into why.

At the beginning, it's good to read stuff, because there are much higher costs to the second two (other people's time; your time to run experiments). However, you'll want to start getting the second and third types of learning as soon as possible. The reason is that learning happens fastest when you're at the edge of your abilities, and since (2) and (3) are customized for you, they will accelerate you faster.

Q: What would you say is the most important trait you need for research?

My view is that research, like most other skills, can be learned through practice (see this blog post). Yes, maybe Terence Tao has some god-gifted talent and few of us will reach that level. But I believe most people have the ability to become a high-caliber researcher. So my short answer to this question is probably "grit", since grit enables practice.

A criminally underrated skill in research is learning from feedback. You'd be surprised at how much people ignore feedback; I've found that most people who ask me for advice don't listen to it. One thing I try to do is take very seriously the feedback that others give, especially if they're at the top of their game. Feedback is like a gradient: it tells you which direction to move in to become a better researcher. When I had an advisor, I would ask them every week what I could do better, and then try to do it. We're lucky to have a culture of quick feedback loops in research; not every field is like this.

A caveat is that it's also good to know when to ignore feedback. Your advisor or boss might not always be in the right mental state to give feedback, and they haven't thought about your problem as much as you have. But you should still think carefully about what they say anyways.

Another underrated skill is being willing to do gruntwork, especially looking at data. In 2019, I trained one of the first neural networks for lung cancer classification, and I labeled most of the data myself by doing an initial pass and then asking pathologists to review my classifications. It took me 40+ hours, but in the end I could classify a particular type of lung cancer probably as well as practicing pathologists. It took a lot of time, but the intuitions I gained from doing that data labeling were used in three subsequent papers. So it was worth it.

A final underrated skill in research is being a good communicator. Communicating well makes you much more trustworthy to work with. For example, here are three pet peeves that I have, and that I know at least some others share:

  • A lot of people like to say stuff like "I'll have this done by tomorrow", and then it's not done until a few weeks later. I try to avoid saying stuff like this, unless it's really important and I can actually finish it (not just work on it) by tomorrow. I want people to be able to trust my word when they need an important task done.
  • In initial project meetings, people like to express a lot of interest in being involved, but then aren't willing to put time into the project proportional to the interest they expressed. Again, I want my interest to provide signal for others, so I try to say stuff like "I'm not committing to working on this project, but <X> idea seems really interesting to me."
  • People often say stuff like "<X> doesn't work" without the right amount of specificity. I try to instead say statements like "<X> didn't work when I tried it using formulation F on model M using dataset D", which decreases the mental load on the other person trying to guess which level of specificity I'm talking at.

Q: [From a Dartmouth undergrad] How much does self-learning (e.g. taking these more recent courses that you mention, etc.) play into the whole research process? Do you wish you went to a school that offered more in terms of academics and coursework? If so, do you wish there were more options or that the courses were more rigorous?

I think I understand how you feel. In the beginning I felt pretty bad about the opportunities in AI at Dartmouth. My sophomore fall I looked into applying as a transfer to Stanford, but decided not to since it wasn't worth the effort given the low transfer acceptance rate.

As a college student I cared about courses a lot, and I found the AI courses at Dartmouth to be pretty thin. Looking back, it's clear that while coursework is good for sparking interest, it's not enough to prepare you for a full-time AI role. So success does not require going somewhere with good coursework.

What I think I learned the most from was doing research, and a good thing about Dartmouth is that professors try to make the time to mentor you or give feedback on your work, even if they aren't an expert in the exact thing you want to work on. During my time at Dartmouth, I had worked with or taken a class with nearly every professor who did anything deep-learning related:

  • I wrote multiple papers with Saeed Hassanpour.
  • I wrote multiple papers with Soroush Vosoughi.
  • I did a senior thesis with Lorenzo Torresani.
  • I wrote an NLP paper with Thalia Wheatley, a psychologist.
  • I did a reading course with Eugene Santos, which culminated in a paper on NLP for social science, which I even got my Middle East studies professor to give feedback on.

(Not that counting papers matters much, I just say it to operationalize the extent to which I sought out opportunities.)

Another tough thing about doing AI at Dartmouth is that social pressures will incentivize you to try out other things instead. Most of my friends went into software engineering, consulting, or investment banking. So it was hard to find others who could relate to the struggles I was having. I found solace in a few PhD student friends that I made my senior year.

To-do / not answered yet

Q: Do effective ideas (at least in the areas you work on) tend to be more deep mathematically, or more broadly creative? If mathematical depth is important, what is the best way to get to this level of understanding during undergrad? Is it more important to develop good mathematical intuition, or is it worth the time to formally study these mathematical topics in-depth?

It’s hard to argue that more math is bad, but here’s what I think (and it’s just my opinion):

  • In the history of deep learning, two simple things have stood the test of time and almost always work: bigger models and more data. Neither of these are mathematically deep, or even particularly creative.

So I don’t recommend getting very deep in mathematics for several reasons:

Right now, there are opportunities and unexplored ideas in the space, which means that the opportunity cost of time is high. As a result, the comparative value of investing time in math is much less.

Even if you wanted to spend time in a longer-term skill, I think there are ones that are much higher leverage than being good at math. Here are a few:

  1. Much of today’s AI landscape is about being good at engineering, so developing good software engineering skills is high leverage.
  2. A bottleneck today is GPUs, and learning about hardware and how to use GPUs efficiently probably has a higher payoff than learning math.
  3. Investing in being a good communicator will make you much easier to work with, help you think more clearly, and be more organized.

Given that most ideas do not come from some mathematical motivation (although some do), being well-versed at math might bias you in the wrong direction when you’re looking for engineering solutions (the same way that being good at linguistics doesn’t help you with large language models).

Q: How essential is a PhD?

PhDs are helpful if you want to be a professor.

Having a PhD can help you get an AI job at a company. However, if you don’t have a PhD, getting a PhD is not the fastest way of getting an AI job at a company.

However, if you’re choosing between doing an AI PhD or doing something else (e.g., quant finance), I’d still recommend doing the PhD.

Q: How do I get a job in AI research?

A: This is a pretty loaded question and the advice will depend heavily on your background. But here's my try at a "zero to hero" answer anyways.

0. If you don't know how to code, learn to code in Python first.

1. Take a deep learning course to understand the fundamentals of deep learning (backprop, deep neural networks, transformers, etc.). I think Andrew Ng's Deep Learning Specialization is a good place to start. (For a taste of what these fundamentals look like in code, see the sketch after these steps.)

2. Take a more recent course to understand the current status of the field. Language models are a popular direction these days, and I'd recommend working on them. The field moves super fast, and these courses are a structured way to learn and figure out what all the jargon people use actually means.

3. After you're somewhat literate in the field, there are two big questions: who to work with and what to work on. To-do(jasonwei)
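To make step 1 concrete, here's a minimal sketch (my own illustration, not from any particular course) of the core deep learning loop: a small feedforward network trained with backprop in PyTorch on toy data.

  import torch
  import torch.nn as nn

  # Toy data: learn y = sum of the inputs, from random examples.
  X = torch.randn(512, 10)
  y = X.sum(dim=1, keepdim=True)

  # A small feedforward network.
  model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
  optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
  loss_fn = nn.MSELoss()

  for step in range(200):
      optimizer.zero_grad()        # clear gradients from the previous step
      loss = loss_fn(model(X), y)  # forward pass
      loss.backward()              # backprop: compute gradients
      optimizer.step()             # gradient step on the parameters
      if step % 50 == 0:
          print(step, loss.item())

Once this loop feels natural (forward pass, loss, backward pass, parameter update), the material in a course will map onto code much more easily.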

Q: How do you find collaborators? Or is it worth it to pursue ideas by oneself?

I don't recommend pursuing ideas by yourself unless you're an experienced researcher. Finding collaborators is important for getting feedback on ideas and finding the right direction. I mostly work with people who are in close proximity at my company. Sometimes I change companies to work with particular people. It can be a good idea to go out of your way to work with particularly strong people at your company or in academia. Academia can be particularly collaborative; for example, I reached out to Ryan Cotterell as an undergrad, and we worked on a paper together, which ended up getting published. I learned a lot from that.

Q: Why did you prefer AI research over doing AI-based engineering roles? (like ML infrastructure for example)

I initially started doing AI research because that's what was needed to get into a PhD program. I guess I was ambitious and liked pushing the edge of innovation, and so I had a slight bent towards research. Nowadays, though, the line between AI research and AI-based engineering is a lot blurrier, since AI is having enormous product impact. A lot of people are transitioning from AI research to engineering or product roles now.

Q: How can a graduate student come up with a good idea independently? Is there a feasible operational process (instead of just "reading papers and deep thinking")? What is the thought process behind coming up with the idea of CoT? Some research habits in your daily life (such as where to record notes on papers and where to organize ideas)?

One great way of finding a research direction is to ask another good researcher what they think are the good directions; most will be more than happy to share with you. The hard part for many people is putting their ego aside and working on what someone else thinks is exciting.

Another way of finding a research direction that I like is to be involved in another field and see if certain ways of thinking can be transferred to the current field. For example, my "Easy Data Augmentation" paper for NLP was inspired by my work in medical image analysis, where it was common to do lots of data augmentation. My inspiration for chain-of-thought prompting actually came from meditation: I had first written a paper about how language models could generate a "stream of consciousness", and then applied that to math word problems.

At the beginning of my residency, I used to read papers and take notes. This was helpful for getting to know the field and finding references to related work, but it didn't really give me any great new ideas. I haven't taken notes on papers in a long time.

Q: What’s your daily life while doing research? A lot of discussion with people, or like a lot of individual thinking? Are you using ChatGPT to help you with those questions?

Nowadays, research is increasingly collaborative. I spend a lot of time talking with collaborators (e.g., Hyung Won Chung) about how to prioritize the work that we do and how to coordinate with other people on the team. For academic researchers or PhD students, there can be as little as one meeting a week with an advisor, with the rest of the time devoted to heads-down coding (which I actually kinda miss).

I use ChatGPT most frequently to help with coding. It's not super useful for really detailed research questions that require a lot of contextual knowledge about my personal situation or about OpenAI. But maybe one day it can be!

Q: You once tweeted that “AGI is probably coming in a few years” (https://twitter.com/agikoala/status/1644431320720855040). More concretely speaking, what does “AGI” encompass here and what key milestones will have to be achieved before AGI?

When I talk about AGI, I basically mean an AI that can perform at the level of your favorite co-worker at your tech company. The path to AGI is not super clear, but I think two milestones will be the ability to reliably execute actions in the world, and long-term planning.

Q: You once said that you were surprised how the Danes valued healthy work-life balance and quality family time more than the mindless careerism and overworking found in the US. Have you adopted some of these Scandinavian/European values to your life?

I spent 4 months at the Technical University of Denmark in 2018 and loved the country. I was intrigued by how Denmark seemed to be a perfect society in terms of happiness, low crime, et cetera. However, I found the contentment there uninspiring. I realized that my goal is to push the edges of humanity, and I'm a pretty ambitious person, so I would say that I actually haven't adopted many Scandinavian values into my life.

General questions about research

Q: What role does socialization play in research (attending conferences, reaching out to authors of papers you find interesting, paper collaborations). Do you have any tips for being more involved in the community as a newcomer?

Socialization plays a very big role in research (for better or for worse). At Google, people used to say that the lesson of the Senior/Staff RS promotion was learning that research is a highly social endeavor. One reason socialization is important is that you'll get more exposure for your work: doing great work doesn't count if no one knows about it. I wrote more about this in [reference this blog post].

Q: What are common job trajectories for people in this space? What roles do people initially have and how do they evolve over time?

One common trajectory in AI research is to do a PhD, join a big company as a research scientist for a few years, and then work towards being a manager or research lead. Other trajectories include working at or co-founding startups. However, many people also join these roles without PhDs, for example if they did a residency or did a lot of related work during undergrad.

For ML engineers, there's probably a more diverse range of backgrounds, including people with or without PhDs.

Q: What’s your opinion on LLMs’ future? Would it become a “god-like” AGI that excels at everything? Or would there be a generalist vs. specialist LLMs? I guess my question is, do you see opportunities for LLMs in vertical spaces, like medicine or law.

Language models are a very special technology due to their generality. The question of generalist vs. specialist LLMs is an interesting one, and I think that whenever there is data that can be found on the internet or that many companies will be able to train on, a general approach (i.e., scaling data and compute without domain-specific interventions) will win out. If you have really special data and a lot of it, then a specialist approach could potentially be a winner, but my guess is that this will just be a small portion of the market.

Q: There are so many different models nowadays, and neural networks seem to be a huge thing. But there are still a lot of people who like doing things their own way, like GPs or Bayesian methods. What’s your opinion on that (on Bayesian approaches)? Are neural networks more of an easy black-box toy compared to other complex mathematical models? Some say learning is trying to find a good data representation using your model, and neural networks seem to be a great option for learning text, images, and a lot of different things. What do you think their disadvantages are, and how can they be improved in the future?

Neural networks and the bitter lesson of scale have held true for a while, and I think most people should work on that. However, it's good to have the community invested in a diverse range of approaches, and some people should actually be working on these alternative algorithmic approaches. From a resource-allocation perspective, perhaps it's net most productive for people who have the ability to innovate on alternative approaches (e.g., those particularly talented at math) to do that, and for others to work on mainstream research and applications.

Q: For many ML research intern / scientist positions in industry where one uses mostly pytorch for experiments, why do the interviews still test on classical algorithms from LeetCode?

Not all research positions in industry test software engineering; some just do research interviews. But I also think it's reasonable for them to do so, because basic software engineering is important for smooth collaboration as part of a team at a tech company. I guess I agree that testing for LeetCode isn't super important if the role is pure research and doesn't involve a lot of collaborators.

Q: What are some smaller companies doing some really interesting/impactful work in the AI/ML space?

Other than the big three (OAI, Anthropic, GDM), here are some other companies I've heard interesting things about: Inflection, Character, Midjourney, Reka, Perplexity.ai, Runway, Elicit.

Q: Could you please describe more precisely what you mean by alignment? What is the goal of it? What qualifications and skills are needed for this?

Alignment is a general term for the field of research on getting language models to do what humans want them to do. I'm not an expert in alignment and I'm learning more about it myself!

Q: You mentioned that communication skill is important; how did you develop your communication skills? A lot of researchers may be introverted. Any advice for them? (Thanks for providing this Q&A opportunity for shy people.)

Q: How to become better in academic / research communication?

Developing good communication skills is pretty general and there are better people out there to give advice than me. Here are a few personal things I do:

When I talk to people, I like to spend extra time making sure they have all the context for the problem and setup. I like to reiterate overarching goals and motivations at the beginning of meetings.

I try to hedge a lot and be precise about statements that I make. A lot of conflicts in the workplace come from bad or inaccurate communication.

If I'm leading a meeting, I usually make an agenda with some notes. Writing down ideas helps me gain clarity.

Q: When someone with an AI PhD starts a startup focused on a consumer-facing (2C) product and their life is no longer research-oriented, is it usual for such a person to still insist on doing research, or do most of them stop doing AI research?

Many people in AI are transitioning from open-ended research to more product-oriented research. At the end of the day, most companies have to make money, and devoting headcount to longer-term research isn't always the best use of it. If you want to do open-ended research, a company that focuses on product may not be the best fit.

Q: Is Vim still useful for AI research in 2023?

Vim is useful and you should learn it.

UNDERGRADUATES AND HIGH SCHOOLERS

I got a lot of questions from undergrads, and even high schoolers. The best advice varies from person to person, but my view is that there are basically three ways to get into an AI job from undergrad: (1) applying for AI jobs directly (including residencies), (2) doing a PhD and then applying for AI jobs, and (3) applying for a software engineering role and then transitioning to AI. I'll give separate advice for each.

Path 1: applying for AI jobs directly.

This is the most direct path. I'd only recommend trying to do this if you have extensive prior experience and a strong profile to do so. Typically the people who are successful at this have published multiple research papers, worked on highly-influential open-source projects, or completed multiple research internships. So if you're early in undergrad, you can aim for doing some of the above. Hopefully by the time you're looking for jobs, you'll know people in the field to get interviews, and pass the interviews based on your own merits.

One easier way to break into AI jobs directly is through residency programs. These are typically 6-18 month research programs for people without a PhD in AI, and if you do well, the company will often keep you on permanently. I did mine at Google, but Google doesn't offer this program anymore. OpenAI does, along with some other companies (Meta?). Residencies tend to be competitive to break into since a lot of people apply.

Path 2: applying for PhD and then applying for AI jobs.

Do some google searches

Here is my general advice for getting into a PhD program:

If you have access to working with a strong professor, do that and then work really hard. Their recommendation letter will matter a lot for the PhD.

If you cannot find an opportunity to work with a strong professor, try to email their PhD students and work with them. Work really hard at that.

If you absolutely can't find a strong professor or their PhD student, try to find a professor at your school who can help advise you on research. Be extremely prolific with them.

When approaching someone to work with them, it's good to do your research beforehand. Talk to people and develop some opinions before you reach out. People don't owe you their time. Whether someone wants to talk to you will be a function of how much you've prepared and whether your "ask" is relevant to them.

Path 3: applying for a software engineering job and then transitioning to AI.

For those who aren't particularly passionate about Paths 1 or 2, this may be the best choice. What I mean by this is first applying for a regular non-AI job at a company such as Google, Meta, or Amazon, then starting to work on side projects that involve AI, gaining experience, and perhaps eventually transferring to the AI team. It's important to do this at a company that does have an AI team, but the good news is that many companies nowadays have AI teams.

The benefit of doing this is that you'll probably make good money and get relevant experience as a software engineer. But since you have a full-time job in something else, you'll have to work extra-hard to do AI on the side while others are enjoying their new-grad lives. It could also be challenging to maintain the original passion for AI once you've gotten used to the cushy big-tech life.

Q: I heard that as an undergraduate who wants to apply to a top-tier PhD program, connections between you/your letter writers and the target school are the “most” critical thing in the application package. But I'm an undergrad who didn’t go to a prestigious university (say, one that ranks ~100 in the US), and no professor here has established connections with professors in the research community. When I cold email professors to see if there is an opportunity to collaborate with them on a research project or do an internship, all I get is either that they don’t have bandwidth, or no reply. So it seems extremely hard for an undergrad to “break into” the research community they’re interested in, especially when there is no mentor/advisor at their own university, and it seems to create a vicious cycle in which undergrads in this situation find it almost impossible to get into a top-tier PhD program at all. Is there any advice for someone in this kind of situation?

Most professors are extremely busy and also rejected me when I was emailing them. PhD programs are competitive these days: I went to an Ivy League college and published multiple papers, but still got rejected from everywhere the first two times I applied to top programs. Generally, I think people in your type of scenario will have to work harder to get to the same spot as, say, Stanford undergrads. But the good news is that if you can do that successfully, you'll have more skills to be successful once you get the job than others.

Q: As an undergraduate student in India, what would be the most basic requirements to apply for research internships abroad? I recently wrote and published a paper at a workshop under the guidance of a professor at my college and am currently working on more, but I am still unsure how to proceed when looking for and approaching professors and research labs for internships.

Research internships are pretty competitive to get these days. The chances of getting a research internship will be a lot higher as a PhD student with multiple publications.

Here are two actionable pieces of advice:

Try emailing PhD students of the top professors and ask to work with them. Spend a lot of time writing the best email you possibly can. Prove to them that you'll be worth working with by reading their papers and understanding their subject areas. Maybe even run a few experiments with some interesting results.

Try reaching out to groups such as ML Collective or EleutherAI; they will be more welcoming and could be a good place to start.

Q: I’m currently very close to finishing my undergrad. Unfortunately, I don’t have much research experience to apply straight for a PhD. Do you think that going for a master’s with a concentration in AI/ML is a good way to ease into the field, and then, if opportunities arise, go for a PhD or a research-based job?

Barring financial considerations, doing a master's can be a good way to buy time to focus on studying AI.

Q: I’m an undergraduate student and my grades are pretty messed up. I’m very passionate about working in AI, and the ultimate goal, I think, is doing AI research. Is there any possibility I can engage in AI research? I’m asking because with bad grades I will probably not get into graduate school.

I wouldn't get too caught up on bad grades. With a good research record, grades don't matter as much.

Q: Are there any volunteer or non-paid research roles in AI/ML that are entry-level, to start building experience or knowledge?

There are some great communities to get started in AI! One that I'd recommend is ML Collective.

Q: As a high school student, I’ve been working on some AI research myself and feel that I want to expand my work into more meaningful and deep areas! Apart from going to college, what should I do next? When I go, what should I major in or focus on if I want to eventually get into the AI scene? And do you know anywhere that would hire a high school student, if only for an internship, for AI research?

Computer science is the most standard major and a good idea in my opinion. It can be hard to get an internship as a high school student, but writing papers or doing open source projects can be a good way of increasing the probability.

Q: I am an undergrad at a good university, but ML research is not the strong point of our CS department. I still managed to work on some ML research topics and get 2 papers into top conferences (and am working on more). However, given how competitive the top programs are, I still think I am not a top candidate for them. Do you have any recommendations?

I had a similar profile to you, and also felt the same way. Here are some things that I did:

I hedged by also applying for software engineering jobs, and was ready to do that if no PhD programs accepted me.

I was OK with not going to a top-5 AI program (USC). The students and advisor there were still pretty good, and it would be a good opportunity for me to learn and apply again for the top places after my PhD.

Q: I see you published many papers during your undergrad, helping you land the Google residency program. As a fresh new grad without the support of academia anymore, what low-hanging fruit do you recommend people grab to jumpstart them to the next big step (residency, employment, master’s/PhD)?

I'd probably recommend Path 3 for this, but would need more context about your goals.

TRANSITIONING FROM OTHER FIELDS

If you don't currently work in computer science, I have a few recommendations:

Spend enough time getting familiar with programming, especially in python. The barrier to entry is pretty low but the fundamentals are worth half a year to a few years of your time.

Take a deep learning course. I personally took and liked Andrew Ng's "Neural Networks and Deep Learning" course.

Get up to date on the state-of-the-art in AI. You can do this by browsing twitter, attending conferences, and reading recent papers.

Then start doing research or working on projects.

Q: What is a basic roadmap that would be valuable to follow for learning the fundamentals of Machine Learning from a CS Undergrad position?

Q: Currently working on online courses like Deeplearning.ai for mathematics in AI/ML. Do you have any other recommendations or books to start getting entry level knowledge? I have basic course-work experience with Python already but want to start working on some projects.

Q: How can a mid-career professional begin research without any experience? Any directions?

Q: If you were to do your master’s thesis, on what AI-related field would it be? I work on distributed systems research as a graduate student, and I am trying to apply distributed systems to AI or the inverse.

Q: I got my master’s degree as a computer scientist from a European university. I have been working on designing software for more than 10 years. I’m currently considering retraining as a machine learning engineer. I've been brushing up on my math for some time and now I'm going to invest some time in Python. What would you recommend as next steps? Is this transition even possible? How much does a machine learning position differ from a typical SE position?

Q: What recommendations do you have for those looking to enter AI roles (e.g., research scientist or research engineer) with a PhD in a technical field outside CS (i.e., no publications in top AI conferences)?

Q: What recommendations do you have for those looking to transition from applied roles (data scientist, applied scientist, machine learning engineer, etc.) to research roles in AI (research scientist, research engineer)? Is such a transition possible without publications in top AI conferences?

Q: I’m a PhD candidate in another field, and I’m interested in AI. How can I make that transition, and is it possible to get an AI post-doc or tenure-track position? Is there a website for this kind of info?

People not currently in computer science

Q: I had undiagnosed ADHD as an undergrad but still managed to graduate in CS from an ~ok school, though I don’t have any research experience. I did manage to land a job at JPMorgan, where I’ve been for the past few years as a software engineer, but I’m thinking of trying to move into AI research. What advice would you give on how to go about it? Where would I begin (especially with how to get experience with AI research)? My grades weren’t stellar and I don’t have any research experience, so I’m not sure if a top AI master’s/PhD is possible, though I did take a decent amount of AI classes in undergrad.

Q: Are publications at top research venues required to start a career in ML/AI?

For very specific research scientist roles, publications are needed. Publications are generally helpful, but for the majority of jobs, impactful project experience can be more important than papers at the top conferences (thousands of people publish at the top conferences every year).

Q: Say that you recently graduated from a PhD program in ML and can’t land a research scientist job, only research engineer, and your goal is to be a research scientist at an interesting company. While the difference between the two roles might not be too relevant at certain companies, at many of them there is a clear distinction. What would you suggest? a) Settle for research engineer, and try to move to a research scientist position asap. b) Research scientist at a less relevant/exciting company. c) Stay in academia and level up (post-doc?) until it’s possible to land an RS job.

I think it depends on how much you want to be a research scientist. If you really want to be a research scientist, then don't settle for anything less. These days, many of the top AI researchers have transitioned to doing engineering. So my personal opinion is that being a research engineer at a place like OpenAI or Anthropic is a pretty good gig, could be better than being a research scientist, and it's not worth making big decisions based on that distinction.

Q: What books/courses would you recommend for someone who is not familiar with AI but wants to gain knowledge about it?

I don't know that I would recommend any books, but I liked Andrew Ng's "Neural Networks and Deep Learning" course.

Q. My biggest struggle with self-learning AI/ML has been analysis paralysis given the number of resources across the internet. Coming from a top-200 school with not the best AI/ML curriculum and zero concentration tracks, I find myself quite lost in terms of finding the right resources and testing my knowledge. How do you test the quality of resources, and what pointers would be helpful as I attempt to do the same?

Reading papers that are popular can be a good idea; you can rely on other people to filter for quality. I like this list from Yi Tay: https://www.yitay.net/blog/2022-best-nlp-papers

OAI

Q: Is OpenAI hiring for the OpenAI Residency in 2023? There do not appear to be Residency positions listed on the Careers page, but the November 2021 blog post is still featured on the Careers page.

Q: Does OpenAI hire people right out of undergraduate? If so what do they look for and should I try connecting with multiple people at OpenAI for referrals?

OTHER

Q: What advice would you give to someone who is interested in doing AI on African languages?

Q: I just graduated undergrad and I have been working on my first deep learning research project for more than a year now. I tried a lot of different things and none have worked well. I feel like I am losing motivation. Any advice?

I'm not particularly qualified to talk about multilingual NLP, but I wrote one paper, and what I learned was that a multilingual benchmark can be a solid contribution nowadays. Some things that make up a good benchmark include:

  • The benchmark metrics should be easy to understand.
  • The benchmark should be easy to download and run (see the sketch after this list).
  • It's good to run some popular baselines on the benchmark.
  • Do some PR around the benchmark on twitter, and help other people use it when they encounter difficulties.
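To make "easy to download and run" concrete, here's a minimal sketch of what a benchmark's evaluation entry point could look like (the dataset and names here are hypothetical, just to illustrate the shape): one function, one easy-to-understand metric, and any model plugged in as a plain text-to-text function.

  from typing import Callable

  def evaluate(predict: Callable[[str], str], examples: list[dict]) -> float:
      # One easy-to-read metric: exact-match accuracy in [0, 1].
      correct = sum(predict(ex["input"]) == ex["target"] for ex in examples)
      return correct / len(examples)

  # Usage: any model is just a text -> text function.
  examples = [{"input": "2+2=", "target": "4"}, {"input": "3+3=", "target": "6"}]
  print(evaluate(lambda prompt: "4", examples))  # 0.5

The lower the friction to run something like this, the more likely people are to actually use the benchmark.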

Q: You mentioned that a bigger payoff would be to learn hardware like GPUs and infra. Can you share resources for those concepts, if one wants to start from scratch? Currently, I know Python and have done courses on deep learning along with some projects.

Q: Is there any advice or direction you can give for doing research in game AI? Very interested in working in projects similar to AlphaStar or AlphaGo but don’t have a PhD yet. Similar to the question above, I’ll be going for a Masters first.

I'm not qualified to give advice on game AI.

Q: Can you provide details on the potential structure of the underlying database that supports ChatGPT? I am intrigued by its possible tree-like structure. As a commonly implemented solution, what would be the best method to store conversational datasets? Additionally, as an AI researcher, what types of datasets would be most beneficial for training conversational AI models?

Q: What makes it hard for OpenAI to collect large amounts of scalable human feedback data on human values, expectations, etc.? So many people around the world use ChatGPT. Is it a lack of quality data (hence using Scale... but this should be filter-able)? Is it a lack of diversity (but I’m sure ChatGPT must still get some small usage from underrepresented countries…)? What would help make it easier?

Q: How much freedom did you have in choosing what you research at google ai residency? What about OpenAI? What kind of people can try new ideas at places like OpenAI?

Q: What are some of your favorite papers? Whether that is they had an idea you really liked or were extremely well written.

Q: Is OpenAI offering internships?

A: Not at the moment, but we offer full-time roles and residencies!

Q: I’m particularly interested in working with ML in an engineering way rather than a research way (working with pre-existing research rather than doing research on new things), so does OpenAI offer a full-time software engineering role that doesn’t require a graduate degree?

Yes definitely! Most (probably all?) roles at OpenAI don’t require a graduate degree.
