HODP Projects List
Title | Description | Type | Assignee | Category | Difficulty | Subtasks | Due Date | Tags
End of year scheduler/flights | | Active Member | Vinay Kanumuri | Academics | Moderate | 6 | 2019-11-23
Foreign gifts and contract report analysis | | Active Member | Jason | Finances | Moderate | 4 | 2019-11-11 | HPR
Crimson Article Coreference Graph | Wouldn’t it be cool to see who is associated with whom at Harvard? How many degrees removed are you from Dean Khurana? This project would entail scraping Crimson articles for people’s names and adding them to some sort of graph visualization, with edges between people whose names appear in the same article. We have no idea what this looks like, but people will definitely be interested! This would be a great project for someone interested in building visualizations and data collection. | Active Member | Maddy Nakada | Extracurriculars | Moderate | 1 | 2019-11-18
Running Map | This is mostly finished, just want to track in Asana | Active Member | Maddy Nakada | Campus | Easy | 1 | 2019-11-18
Faculty Directory | Create a directory of all Harvard faculty, including easy searching of faculty by name, department, tenure, etc. Would be useful especially if we wanted to survey Harvard professors in the future. Ideally there is an automated way to scrape and update the data going forward, but if that is not possible, then a one-time manual aggregation might also be worthwhile.

Useful links: https://www.seas.harvard.edu/faculty-research/people/ladder

Active Member | William Zhang | People | Moderate | 1 | 2019-12-09
Opportunity at Harvard | Analyze Raj Chetty's datasets about mobility at different higher education institutions. Current status: it is being edited by HPR and responses are required. | Active Member | dfriedman@college.harvard.edu | People | Moderate | 2 | 2019-11-18 | HPR
Scrape HCS Email list infos | You can pretty easily find all of the HCS email lists. We could write a script to scrape all of these to see which lists have the most members, etc. This might be a good proxy for how many members are in different clubs, stuff like that:

Active Member | Eric Sun | Extracurriculars | Moderate | 5 | 2019-11-18
Phi Beta Kappa | We have PBK nominees over the past few years by their year and their concentration. There's definitely some cool stuff that we can do with this.

Active Member | Claire Fridkin | Academics | Easy | 1 | 2019-11-18
Harvard survey group analysis | What is the composition of our survey group? What do they feel strongly/not strongly about? | Active Member | Jenny | *Politics* | Easy | 2 | 2019-11-18
Update Crime Map | Revamp crime map to be written in Python and to store data in Firebase. | Active Member | Ashley Wong | Campus | Difficult | 0
Set up a donation system for individual contributions | | Active Member | sethbilliau | Meta | Difficult | 0 | 2020-01-18 | WinterBreak
Mange Data | Related: see HUDS menu item analysis below | Fast-Track | James Roney | Food | Difficult | 5 | 2019-11-22
Financial aid over time | Examines financial aid at Harvard over time and how it has changed | Bootcamp | cdarms@college.harvard.edu | Finances | Moderate | 4
Harvard tuition over time | Normalize over inflation, look at possible reasons for changes (health seems to be related to the presidential term), light comparison with comparable universities. | Bootcamp | Benji Kan | Finances | Easy | 4
Harvard 990 Tax Returns | Harvard has to file a 990 tax return every year. We should try to scrape these and get some relevant information (the XML files shouldn't be too hard to pull data from). Can find them here: https://projects.propublica.org/nonprofits/organizations/42103580 | Bootcamp | Nicole Chen | Finances | Moderate | 4
Candidate-Issue Visualization | Visualize how candidates align under issue positions and figure out a visual way to compare student positions on political issues with candidates' positions, weighted by how much students support said candidates. Goal would be to figure out if we're actually supporting candidates who align with us on the big issues. | Bootcamp | John Tucker | *Politics* | Moderate | 4 | | Politics
Harvard Faculty Political Donations | Donors to political campaigns are public information on the FEC website, and many Harvard faculty members presumably donate to political campaigns. It would be interesting to find data on which candidates and political parties Harvard faculty members tend to donate to. | Bootcamp | mbergam@college.harvard.edu | Finances | Moderate | 4 | | HPR
Alumni Demographics | | Bootcamp | Bobby McCarthy | People | Difficult | 2
Athlete Hometowns | | Bootcamp | Richard Zhu | Campus | Moderate | 4
OCS Career Fair Data | Something like the Crimson's senior class visualizations but long-term and across alumni. Does the alumni association keep this kind of data? Would they let us look at things like career paths, geography, etc.?

*(should this bootcamp group be combined with alumni demographics?)*

RELATED: alumni clubs idea below

Bootcamp | Moritz Pail | People | Difficult | 5
Upperclassmen house altruism | I'm guessing graduating class participation to class fund? Or donations that houses receive? Either way, can get data from houses themselves. | Bootcamp | Gabriel Cederberg | Finances | Moderate | 5
UC voter turnout over time | | Bootcamp | Henry Austin | Campus | Easy | 4
Quality of Harvard Advising | Inequalities in the subject areas of tutors. What are the processes for hiring tutors in different houses? Which houses have resident tutors vs. which ones have non-resident tutors? What is the difference for freshman advising? | Bootcamp | Emma Freeman | Academics | Moderate | 7
Analyze the crime data | We have about 2 years' worth of crime data that is currently mapped but is in no way analyzed. It would be pretty cool if we could analyze the data and draw some conclusions.


Also check out http://hodp.org/catalog/?q=crime for reference
Bootcamp | Kevin Huang | Campus | Easy | 4
Harvard Google Trends | Analyze the Google search trends of Harvard in relation to other schools and over time. | Bootcamp | Asher Noel | Campus | Moderate | 4
Chilled Water Usage | | Bootcamp | Alyssa C | Food | Moderate | 5 | 2019-11-18 | HPR
Sustainability and Waste | How much HUDS food is wasted? Data can come from:
Sustainability Rep program
Family meals that packages up HUDS food

I'm sure one of these will support us on this analysis

Scraping Twitter/News feeds for Harvard | | Unclaimed | | Campus | Difficult | 1
Analyze Course Enrollments over time | Take a look to see if there are any interesting trends in course enrollment data over time, especially in the gen-ed data. Do certain departments grow more quickly?


Do more enrollees lead to more concentrators? (Historical concentrator data can be found under each concentration in the handbook: https://handbook.fas.harvard.edu/book/export/html/403021)

What do people graduate as? See https://oir.harvard.edu/fact-book/degrees-awarded-summary
Ted Talk sentiment analysis | Would be nice to have some sort of Harvard relation. Maybe speakers that come to/are from Harvard. | Unclaimed | | People | Moderate | 1
Harvard Employment Report | We should do a survey of students to see what they are looking for during the summer, whether they have found anything, and try to correlate concentrations with jobs as well as see which concentrations have the most and the earliest success. | Unclaimed | | People | Moderate | 1 | | Survey
ELO Ranking of Ivy League Basketball Teams | | Unclaimed | | Extracurriculars | Moderate | 1
Harvard Confessions: Part 2 | Bootcamper Lucas Chu is doing this. | Unclaimed | | People | Easy | 0
Data on Harvard Tenure | Recently, there has been both student outcry and legal action regarding Harvard's decisions to extend or not extend tenure to certain professors. There have also been ongoing controversies about the diversity of tenured faculty in various departments. Examine the number of publications and the influence of articles published by professors when they got tenure. We could also try to get Harvard archive data on when professors got tenure and compare across things like age, number of publications at time of tenure, etc. | Unclaimed | | People | Moderate | 1
Harvard IM Sports Analysis | | Unclaimed | | Extracurriculars | Moderate | 1
Mixed grad undergrad enrollment/cross enrollment over time | https://registrar.fas.harvard.edu/faculty-staff/courses/enrollment/archived-course-enrollment-reports

Which depts have the most grad students cross enrolling? What is the change over time?
Commencement speeches | Sentiment and word-frequency analysis. Do commencement speeches address different topics over time? Start with the president's speeches over the years (since we do not have to account for the speaker's background), then other speakers (can also compare speakers' speeches with the president's). | Unclaimed | | People | Moderate | 0
Pulse comparison | We have data tables for the Pulse survey conducted earlier in Fall 2019: https://pulse.harvard.edu/files/pulse/files/pilot_pulse_survey_ib_data_tables.pdf
Harvard has already published basic analyses and charts (https://pulse.harvard.edu/results), but this data could be useful for reference/comparison.
Distance to class/Room Return Frequency | Create and send a survey asking students how far they are from their classes and approximately when, for how long, and how often they return to their room (noting, of course, where the room is). Could also survey where students spend time during the day and combine with the libraries' data on student usage (they sample the number of people in libraries every so often) to create some visualization map as well. Almost the inverse of the social space data, since that's how much people hang out outside of their room. | Unclaimed | | Campus | Moderate | 0
Club/Course deadline response vs. Overall distribution | Try to get data from various large courses/clubs on how long before deadlines people submit. Compare the distribution with something everyone has to fill out, like the check-in, to focus the article and tie it together. | Unclaimed | | Academics | Difficult | 0
Course enrollment vs. Faculty diversity | Is there any correlation between the race/gender/status/school of faculty and enrollment? Should normalize over departmental popularity.

We have course enrollment and faculty diversity data (https://faculty.harvard.edu/data) but will require some matching script
UC Agendas | We have this pretty cool dataset that includes all of the UC agendas from the past year. We should totally do some text analysis of the types of bills that are being submitted and what the time allocation of UC meetings is like. | Unclaimed | | Campus | Easy | 0
LaundryView Data | We have this massive dataset gathered over the course of three months where every hour, in every laundry room, we collected data on how many laundry machines were in use. We could look into things like when freshmen do laundry vs. when upperclassmen do laundry, when the best times to do laundry are, etc.

Freshman IMs | Different ways to rank freshman dorms. Wait until intramural manager has more data/or per person data. | Unclaimed | | Extracurriculars | Moderate | 0
Lamont Library Location Data | Lamont and other libraries collect detailed data on the per-hour locations of students. We're in contact with the administrators in charge of compiling and maintaining that data. We could try to determine which areas are most and least popular depending on the time (weeknights, weekends, reading period, finals, etc.) and help generate recommendations for less-popular substitutes for very crowded areas. | Unclaimed | | Campus | Moderate | 0
Athletes' hometowns | Where do athletes come from? Any correlations? (Ex. cold sport athletes come from cold places?) | Unclaimed | | Extracurriculars | Moderate | 0
HPOP | Overall issue analysis. What do people feel strongly/not strongly about? Data in csv. | Unclaimed | | *Politics* | Easy | 0
Course grading breakdown vs. popularity/Q scores | Scraping syllabus explorer PDFs, and potentially confirming whether or not we can publish our scraped Q data if we aggregate it enough | Unclaimed | | Academics | Difficult | 0
How many club emails do you get?/Comping survey | Fun survey (or get data from mailing lists). On that note, if we're going to do a survey, comping analysis. Compare number of clubs comped vs. number completed vs. tenure. Average comp history.

HUDS Menu item analysis | Which foods show up the most often? Any correlation over day of week/month?

Data collected, but not very clean
Final Exams Analysis | Once we see what people take, can analyze conflicts beyond conflicts within departments here: https://registrar.fas.harvard.edu/files/fas-registrar/files/final_exam_schedule.pdf. Data available, taken from PDF; see Vinay and Daniel's data collection. | Unclaimed | | Academics | Easy | 0
Reworking Search | We’d like to rework the search on our website so that we don’t have to show all of our data on a single page. This might include helping to migrate our data over to a Postgres or Firestore database and building an endpoint to access the data. This is a great project for someone interested in getting lots of hands-on experience with databases. | Unclaimed | | Meta | Moderate | 0
Winthrop Controversy / Climate Review Data | How can we use data to analyze the Winthrop controversy?

Survey of the students to see how strongly they feel about the issue. Demographics of who's taking the survey, correlations, etc.

Maybe could look at tutors and what they specialize in?
Add Dataset Details to Catalog | We’d like to present a data set summary for all of our data sets either on the catalog page or on the data set page (see example here). This would build off of the data set preview that Festus built, and would allow users to get a sense of what’s in a data set before they download it. | Unclaimed | | Meta | Easy | 0
Moving Web Apps to Flask | We have a number of web apps, including the HUDS scraper and the crime map, that we need to move over to the new Flask website. This is a great project for someone who wants to increase their familiarity with Flask and get experience with web apps. | Unclaimed | | Meta | Moderate | 0
Update Data Links | A lot of data links are missing or not up-to-date | Unclaimed | | Meta | Easy | 0
Harvard time | How are we affected by Harvard time? Will need to extract from many PDFs somehow... perhaps with PDF scraping apps. We can look at new schedule vs. old schedule course density (especially at lunch time), correlations and conflicts within departments and among large courses, etc. https://registrar.fas.harvard.edu/courses-exams/courses-instruction/previous-courses-instruction | Unclaimed | | Academics | Moderate | 0
Course title/description word cloud + enrollment | Can do a word cloud to visualize whether certain key words in titles lead to more enrollees ("data", for example). Will have to normalize over dept popularity, and it may be most interesting to analyze when courses change names.

What gets you into an Ivy League | | Unclaimed | | People | Moderate | 0 | | HotButton
Effective Means of Info Distribution | What's the most effective way to distribute information? Would require sending out some announcement (could be a HODP pub if you talk to board about it) via multiple channels with maybe slightly different sign-up links in each announcement, then comparing total responsiveness and responsiveness over time, conversion from opening the link to signing up for whatever it is, etc. Mainly analyzing HODP's distribution itself. | Unclaimed | | Extracurriculars | Easy | 0
Alumni Clubs and SIGS | Harvard alums have a bunch of clubs and social groups. It would be pretty cool to figure out which locations have the highest density of alumni associations, which types of clubs are most prevalent, etc.

Roombook usage | We could totally scrape the roombook site to see which rooms are the most commonly booked and what times are the most common. We can't see what activities are going on in each, but this data seems pretty readily available | Unclaimed | | Extracurriculars | Difficult | 0
Create individual members page | We currently allow members to edit their profile, but this only changes it on the "People Page". Members should have their own individual pages that people can click on to see their contributions and any social media links | Unclaimed | | Meta | Easy | 0
Divestment Data | Use data and statistical methods to offer a new perspective on the Harvard divestment issue. How much of Harvard's endowment is actually invested in fossil fuels and private prisons? Are fossil fuels and private prisons a profitable investment? How effective have divestment campaigns been in the past? This encompasses a lot of outside data, but will be highly influential with respect to a major Harvard issue. More thoughts: we could talk to the divestment groups/collab with them in some capacity | Unclaimed | | Finances | Difficult | 0 | | HotButton
Sponsored Program Returns | Sponsored program revenue (https://finance.harvard.edu/annual-report), actual expenditure (https://osp.finance.harvard.edu/annual-report)

Need to extract data from PDFs, but should be pretty clear
Social Spaces Map | Size, amenities, access - potentially need to go to the places manually, but being able to access them should be fine | Unclaimed | | Campus | Moderate | 0
House Floor Plans | Just need to compile by hand (though some houses may have their own info compiled in spreadsheets, worth asking)... GIS data (https://guides.library.harvard.edu/gsd/gis#s-lg-box-7271210) likely not detailed enough for rooms, though total area of house is possible. What percent have n+1 (can do analysis based on # people and # rooms; might not even have to compile by hand to do this)? Average size of rooms? If we have plans/data from before renovation, can compare quality of housing over years. | Unclaimed | | Campus | Difficult | 0
HUDS Grill Waiting Time (Mange) | Need to collect data from the Mange app or directly from HUDS | Unclaimed | | Food | Moderate | 0
Harvard Square Restaurant Ratings | https://www.yelp.com/fusion: would involve price/rating/popularity; probably need some scraping plus some student survey. How often do people go out to eat?

Can do a map with price/rating/popularity (no. of ratings)

New SEAS Campus Data | (Have to wait at least a year for this) Ask facilities for usage and efficiency data; how will that change once the new SEAS campus opens? https://oir.harvard.edu/fact-book/infrastructure | Unclaimed | | Campus | Easy | 0
Art Museum Data | Art museum website has public data. MetaLab did beat us to visualizations (and we might be able to ask them for data in aggregated form so we don't need to scrape or ask the museums), but we may provide some interesting analyses, like what artists do we usually collect? Which part of our collections has the most value? | Unclaimed | | Campus | Moderate | 0
Student Eating Preferences | What food do students like to eat the most? Question is how we get the data (can take a survey and ask people to rate dishes, like the Independent Dietary Habits survey)

Green reps, ask sustainability groups
Red, green, yellow stickers: how environmentally friendly is HUDS and what do people like? Useful for clubs too: what to order, pizza or nachos? Soft drink popularity? Jefe's or Felipe's? Can analyze dish frequency vs. popularity, popularity correlated with day of week/time of month HUDS serves
HUDS Survey Satisfaction Data | I'd love to integrate some of the HUDS data that we already have with the HUDS Satisfaction survey. I'm not entirely sure what they do with that data or how to get it but it'd be good to get the ball rolling on that | Unclaimed | | Food | Easy | 0
Centralized Course Website Directory | Many courses have their own websites (ex. CS50) and it would be cool to get all of them in one place. This project would likely be a lot of manually searching for websites, but at the end you’ll have a great data set that you could scrape over and possibly extract some meaningful data from. | Unclaimed | | Academics | Moderate | 0 | | WinterBreak
Create a prediction market for HODP | I think it would be a lot of fun for us to create a betting market for events at Harvard where students can use fake money to bet against each other. We could use a leaderboard system or offer some chance to win a gift card based on how much money people have. Something like that. | Unclaimed | | Meta | Difficult | 0 | | WinterBreak
Projects page | Right now our Asana Sync code syncs from Asana to spreadsheet. We should sync it to our site instead. | Unclaimed | | Meta | Difficult | 0 | | WinterBreak
HODP Wiki | Harvard doesn’t really have a central repository to learn about stuff. I think it would be really cool to make one | Unclaimed | | Meta | Difficult | 0 | | WinterBreak
Experiment with Google Forms vs. Qualtrics | | Unclaimed | sethbilliau | Meta | Easy | 0
Hot takes survey | | Unclaimed | Melissa Kwan | Campus | Easy | 0
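
Sketch for the Crimson Article Coreference Graph idea: the entry describes a graph with edges between people whose names appear in the same article, plus a "degrees removed" question. A minimal Python sketch of that idea might look like the following. It assumes the names have already been extracted from each article (the scraping and name-extraction steps are not shown), and the function names and sample data are hypothetical placeholders, not anything HODP has built.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_cooccurrence_graph(articles):
    """Map each name to the set of names it shares at least one article with."""
    graph = defaultdict(set)
    for names in articles:
        unique = set(names)
        for name in unique:
            graph[name]  # people who appear alone still get a node
        for a, b in combinations(sorted(unique), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degrees_of_separation(graph, start, target):
    """Breadth-first search; returns the hop count, or None if unconnected."""
    if start not in graph or target not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        if person == target:
            return dist
        for neighbor in graph[person]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Hypothetical input: one list of already-extracted names per article.
articles = [
    ["Person A", "Person B"],
    ["Person B", "Person C"],
    ["Person D"],
]
graph = build_cooccurrence_graph(articles)
```

With this sample data, degrees_of_separation(graph, "Person A", "Person C") goes through Person B, while Person D is unreachable from everyone else. A real version would also need to deal with name variants (nicknames, titles), which this sketch ignores.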