HODP Projects List
1
Title | Description | Type | Assignee | Category | Difficulty | Subtasks | Due Date | Tags
2
Update Crime MapRevamp crime map to be written in Python and to store data in Firebase.Active MemberAshley WongCampusDifficult0
3
End of year scheduler/flightsActive MemberVinay KanumuriAcademicsModerate22019-11-01
4
Foreign gifts and contract report analysisActive MemberJasonFinancesModerate42019-10-05HPR
5
Harvard IM Sports AnalysisActive Memberandreazhang@college.harvard.eduExtracurricularsModerate12019-11-01
6
Crimson Article Coreference GraphWouldn’t it be cool to see who is associated with whom at Harvard? How many degrees removed are you from Dean Khurana? This project would entail scraping Crimson articles for people’s names and adding them to some sort of graph visualization with edges between people whose names appear in the same article. We have no idea what this looks like, but people will definitely be interested! This would be a great project for someone interested in building visualizations and data collection.Active MemberMaddy NakadaExtracurricularsModerate12019-10-05
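One possible starting point for the graph described in this row, as a minimal sketch rather than a settled design: it assumes name extraction has already happened and that each article arrives as a plain list of names. The function names build_cooccurrence_graph and degrees_of_separation are hypothetical, and the scraping and name-matching steps are not shown.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_cooccurrence_graph(articles):
    """Build an undirected graph from lists of names, one list per article.

    Returns a dict mapping name -> {neighbor: number of shared articles}.
    Assumes names are already extracted and normalized (hypothetical input).
    """
    graph = defaultdict(lambda: defaultdict(int))
    for names in articles:
        # Every pair of distinct names in the same article gets an edge.
        for a, b in combinations(sorted(set(names)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def degrees_of_separation(graph, start, target):
    """Breadth-first search; returns the number of edges on the shortest
    path from start to target, or None if they are not connected."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        for neighbor in graph.get(person, {}):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None
```

The edge weights (shared article counts) could later drive edge thickness in whatever visualization is chosen.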
7
Running MapThis is mostly finished, just want to track in AsanaActive MemberMaddy NakadaCampusEasy12019-10-19
8
Candidate Alignment QuizFinished with an HTML version. Need to incorporate into main HODP siteActive MemberBridger Gordon*Politics*Easy32019-10-11
9
Faculty DirectoryCreate a directory of all Harvard faculty, including easy search of faculty by name, department, tenure, etc. Would be useful especially if we wanted to survey Harvard professors in the future. Ideally there is an automated way to scrape and update the data for the future, but if that is not possible, then a one-time manual aggregation might also be worthwhile.

Useful links: https://www.seas.harvard.edu/faculty-research/people/ladder

http://facultyfinder.harvard.edu/search
Active MemberWilliam ZhangPeopleModerate22019-10-05
10
Opportunity at HarvardAnalyze Raj Chetty's datasets about mobility at different higher education institutions. Current status is that it is being edited by HPR and responses are requiredActive Memberdfriedman@college.harvard.eduPeopleModerate12019-10-05HPR
11
LaundryView DataWe have this massive dataset gathered over the course of three months where every hour, in every laundry room, we collected data on how many laundry machines were in use. We could look into when freshmen do laundry vs. when upperclassmen do laundry, when the best times to do laundry are, etc.

https://docs.google.com/spreadsheets/d/1Eaa9h2IEthJdRVVAqfWzsGoXyg7FL47HJDKGz-FOinY/edit?usp=sharing
Active MemberJeffrey HeCampusEasy62019-10-19
12
Scrape HCS Email list infosYou can pretty easily find all of the HCS email lists. We could write a script to scrape all of these to see which lists have the most members, etc. This might be a good proxy for how many members are in different clubs, stuff like that:

https://www.google.com/search?q=site:https://lists.hcs.harvard.edu/mailman/listinfo/&oq=site:https://lists.hcs.harvard.edu/mailman/listinfo/&aqs=chrome..69i57j69i58j69i61.117j0j1&sourceid=chrome&ie=UTF-8
Active MemberEric SunExtracurricularsModerate42019-10-19
13
Phi Beta KappaWe have PBK nominees over the past few years by their year and their concentration. There's definitely some cool stuff that we can do with this.

https://docs.google.com/spreadsheets/d/1VUG2WBq_D2HRpCGlJjsmS1n2glRU2P6uFrrbMSJuxa0/edit?usp=sharing
Active MemberClaire FridkinAcademicsEasy12019-10-15
14
Harvard survey group analysisWhat is the composition of our survey group? What do they feel strongly/not strongly about?Jenny*Politics*Easy22019-10-05
15
Scraping Twitter/News feeds for HarvardFast-TrackEric ZhangCampusDifficult22019-11-01
16
HUDS Nutritional DataRelated: see HUDS menu item analysis belowFast-TrackJames RoneyFoodDifficult22019-11-01
17
ELO Ranking of Ivy League Basketball TeamsFast-TrackDennis LinExtracurricularsModerate12019-11-01
18
Alumni Post-grad dataSomething like the Crimson's senior class visualizations but long-term and across alumni. Does the alumni association keep this kind of data? Would they let us look at things like career paths, geography, etc. 

*(should this bootcamp group be combined with alumni demographics?)*

RELATED: alumni clubs idea below
BootcampMoritz PailPeopleDifficult4
19
Upperclassmen house altruismI'm guessing graduating class participation to class fund? Or donations that houses receive? Either way, can get data from houses themselves. BootcampFinancesModerate0
20
Harvard tuition over timeNormalize over inflation, look at possible reasons for changes (health seems to be related to the presidential term), light comparison with comparable universities. BootcampBenji KanFinancesEasy1
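The inflation normalization mentioned in this row could be sketched as below. This assumes a price index such as CPI is available per year; the function name to_real_dollars is hypothetical and the numbers in the usage lines are made-up placeholders, not actual tuition or CPI figures.

```python
def to_real_dollars(nominal_by_year, index_by_year, base_year):
    """Convert nominal amounts to base-year dollars.

    real = nominal * index[base_year] / index[year]
    Both arguments are dicts keyed by year (hypothetical input shape).
    """
    base = index_by_year[base_year]
    return {
        year: amount * base / index_by_year[year]
        for year, amount in nominal_by_year.items()
    }


# Placeholder values only, for illustration.
example_tuition = {2000: 100.0, 2010: 150.0}
example_index = {2000: 100.0, 2010: 125.0}
real = to_real_dollars(example_tuition, example_index, base_year=2010)
```

Expressing every year in the same base-year dollars makes it possible to separate real price changes from general inflation before comparing with other universities.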
21
Ted Talk sentiment analysisWould be nice to have some sort of Harvard relation. Maybe speakers that come to/are from Harvard. BootcampEvelyn CaiPeopleModerate1
22
Commencement speechesSentiment and word-frequency analysis. Do commencement speeches address different topics over time? Start with president's speech over the years (since we do not have to account for speaker's background) then speakers (can also compare speaker's speeches with president's)BootcampPeopleModerate0
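For the word-frequency half of this row, a rough sketch using only the standard library might look like the following. It assumes the speeches are already collected as plain text keyed by year; word_frequencies and track_term are hypothetical names, and sentiment analysis is not covered here.

```python
import re
from collections import Counter


def word_frequencies(text, stopwords=frozenset()):
    """Return each word's share of all (non-stopword) words in the text."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stopwords]
    total = len(words)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(words).items()}


def track_term(speeches_by_year, term, stopwords=frozenset()):
    """Relative frequency of one term in each year's speech, in year order."""
    return {
        year: word_frequencies(text, stopwords).get(term, 0.0)
        for year, text in sorted(speeches_by_year.items())
    }
```

Using relative rather than raw counts keeps long and short speeches comparable across years.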
23
UC Election DataThe UC recently conducted their election. Let's try to get some analysis about it out there. BootcampHenry Austin PeopleEasy2
24
Harvard 990 Tax ReturnsHarvard has to file a 990 tax return every year. We should try to scrape these and get some relevant information (XMLs shouldn't be so hard to take from). Can find them here: https://projects.propublica.org/nonprofits/organizations/42103580BootcampNicole ChenFinancesModerate1
25
Lamont Library Location DataLamont and other libraries collect detailed data on the per-hour locations of students. We're in contact with the administrators in charge of compiling and maintaining that data. We could try to determine which areas are most and least popular depending on the time (weeknights, weekends, reading period, finals, etc.) and help generate recommendations for less-popular substitutions for very crowded areas.BootcampRichard ZhuCampusModerate1
26
Athlete HometownsBootcampRichard Zhu1
27
Quality of Harvard AdvisingInequalities in the subject areas of tutors. What are the processes for hiring tutors in different houses. Which houses are resident tutors vs which ones are non-resident tutors. What is the difference for freshmen advising?BootcampEmma FreemanAcademicsModerate4
28
Analyze Course Enrollments over timeTake a look to see if there are any interesting trends in course enrollment data over time. Especially looking at the data of gen-ed. Do certain departments grow more quickly?

https://registrar.fas.harvard.edu/faculty-staff/courses/enrollment/archived-course-enrollment-reports

Do more enrollees lead to more concentrators (historical concentrator data can be found under each concentration in handbook): https://handbook.fas.harvard.edu/book/export/html/403021.

What people graduate as: https://oir.harvard.edu/fact-book/degrees-awarded-summary
Connor McRobertAcademicsEasy1
29
Mixed grad undergrad enrollment/cross enrollment over timehttps://registrar.fas.harvard.edu/faculty-staff/courses/enrollment/archived-course-enrollment-reports

Which depts have the most grad students cross enrolling? What is the change over time?
AcademicsEasy0
30
Course enrollment vs. Faculty diversityIs there any correlation between the race/gender/status/school of faculty vs. enrollment? Should normalize over departmental popularity

We have course enrollment and faculty diversity data (https://faculty.harvard.edu/data) but will require some matching script
BootcampAcademicsModerate0
31
Harvard Employment ReportWe should do a survey of students to see what they are looking for during the summer, whether they have found anything, and try to correlate concentrations with jobs as well as see which concentrations have the most and the earliest successBootcampelizabethling@college.harvard.eduPeopleModerate1Survey
32
Data on Harvard TenureRecently, there has been both student outcry and legal action regarding Harvard's decisions to extend or not extend tenure to certain professors. There have also been ongoing controversies about the diversity of tenured faculty in various departments. Examine the number of publications and the influence of articles published by professors when they got tenure. We could also try to get Harvard archive data on when professors got tenure and compare across things like age, number of publications at time of tenure, etc. BootcampGeena KimPeopleModerate1
33
Harvard Faculty Political DonationsDonor information for political campaigns is public on the FEC website, and many Harvard faculty members presumably donate to political campaigns. It would be interesting to find the data on which candidates and political parties Harvard faculty members tend to donate to.BootcampFinancesModerate0
34
Distance to class/Room Return FrequencyCreate and send a survey asking students how far they are from their classes and approximately when, for how long, and how often they return to their room (noting, of course, where the room is). Could also survey where students spend time during the day and combine with the libraries' data on student usage (they sample the number of people in libraries every so often) to create some visualization map as well. Almost the inverse of social space data, since that's how much people hang out outside of their room. BootcampCampusModerate0
35
Candidate-Issue VisualizationVisualize how candidates align under issue positions and figure out a visual way to compare student positions on political issues with candidates' positions, weighted by how much students support said candidates. Goal would be to figure out if we're actually supporting candidates who align with us on the big issues. Bootcamp*Politics*Moderate0Politics
36
Sustainability and wasteHow much HUDS food is wasted. Data can come from:
PBHA
Sustainability Rep program
Family meals that packages up HUDS food

I'm sure one of these will support us on this analysis

Bootcampalyssachen@college.harvard.eduFoodModerate02019-11-05
37
Harvard Confessions: Part 2Bootcamper Lucas Chu doing this.BootcampPeopleEasy0
38
Club/Course deadline response vs. Overall distributionTry to get data from various large courses/clubs of how long before deadlines people submit. Compare distribution with something everyone has to fill out, like the check-in, to focus article and to tie it together. BootcampAcademicsDifficult0
39
Analyze the crime dataWe have about 2 years' worth of crime data that is currently mapped but in no way analyzed. It would be pretty cool if we could analyze the data and draw some conclusions.

https://docs.google.com/spreadsheets/d/1AMbEglG18BDz4-mQgTfAl4-jiT2Th_tKyIjwBEMDWF8/edit#gid=0

Also check out http://hodp.org/catalog/?q=crime for reference
BootcampKevin HuangCampusEasy1
40
Freshman IMsDifferent ways to rank Freshman dorms. Wait until intramural manager has more data/or per person data. UnclaimedExtracurricularsModerate0
41
Athletes' hometownsWhere do athletes come from? Any correlations? (Ex. cold sport athletes come from cold places?)UnclaimedExtracurricularsModerate0
42
HPOPOverall issue analysis. What do people feel strongly/not strongly about? Data in csv.Unclaimed*Politics*Easy0
43
Course grading breakdown vs. popularity/Q scoresScraping syllabus explorer pdfs, and potentially confirming whether or not we can publish our scraped q data if we aggregate it enoughUnclaimedAcademicsDifficult0
44
How many club emails do you get?/Comping surveyFun survey (or get data from mailing lists). On that note, if we're going to do a survey, comping analysis. Compare number of clubs comped/vs. number completed/vs. Tenure. average comp history

UnclaimedExtracurricularsModerate0
45
HUDS Menu item analysisWhich foods show up the most often? Any correlation over day of week/month?

Data collected, but not very clean
UnclaimedFoodModerate0
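A first pass at the two questions in this row (which items appear most often, and whether appearances cluster by day of week) could be sketched as follows. It assumes the collected data can be cleaned into (ISO date string, item name) pairs; that shape and the function names item_counts and counts_by_weekday are assumptions, not the format of the existing dataset.

```python
from collections import Counter, defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def item_counts(menu_rows):
    """Count how many times each item appears across all rows."""
    return Counter(item for _, item in menu_rows)


def counts_by_weekday(menu_rows):
    """Map item -> Counter of weekday names on which it was served."""
    table = defaultdict(Counter)
    for iso_day, item in menu_rows:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        weekday = WEEKDAYS[date.fromisoformat(iso_day).weekday()]
        table[item][weekday] += 1
    return table
```

Calling item_counts(rows).most_common(10) would give the ten most frequent items once the rows are cleaned.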
46
Final Exams AnalysisOnce we see what people take, can analyze conflicts beyond conflicts within departments here: https://registrar.fas.harvard.edu/files/fas-registrar/files/final_exam_schedule.pdf. data available, taken from PDF, see Vinay and Daniel's data collection. UnclaimedAcademicsEasy0
47
Reworking SearchWe’d like to rework the search on our website so that we don’t have to show all of our data on a single page. This might include helping to migrate our data over to a Postgres or Firestore database and building an endpoint to access the data. This is a great project for someone interested in getting lots of hands-on experience with databases.UnclaimedMetaModerate0
48
Winthrop Controversy / Climate Review DataHow can we use data to analyze the Winthrop controversy?

Survey of the students to see how strongly they feel about the issue. Demographics of who's taking the survey, correlations, etc.

Maybe could look at tutors and what they specialize in?
UnclaimedPeopleEasy0
49
Add Dataset Details to CatalogWe’d like to present a data set summary for all of our data sets either on the catalog page or on the data set page (see example here). This would build off of the data set preview that Festus built, and would allow users to get a sense of what’s in a data set before they download it.UnclaimedMetaEasy0
50
Moving Web Apps to FlaskWe have a number of web apps, including the HUDS scraper and the crime map, that we need to move over to the new Flask website. This is a great project for someone who wants to increase their familiarity with Flask and get experience with web apps.UnclaimedMetaModerate0
51
Update Data LinksA lot of data links are missing or not up-to-dateUnclaimedMetaEasy0
52
Harvard timeHow are we affected by Harvard time? Will need to extract from many pdfs somehow... perhaps with PDF scraping apps. We can look at new schedule vs. old schedule course density (especially at lunch time), correlations and conflicts within departments and among large courses, etc. https://registrar.fas.harvard.edu/courses-exams/courses-instruction/previous-courses-instructionUnclaimedAcademicsModerate0
53
Course title/description word cloud + enrollmentCan do a word cloud to visualize whether certain keywords in titles lead to more enrollees ("data" for example). Will have to normalize over dept popularity, and it may be most interesting to analyze when courses change names.

https://registrar.fas.harvard.edu/faculty-staff/courses/enrollment/archived-course-enrollment-reports
AcademicsEasy0
54
What gets you into an Ivy LeagueUnclaimedPeopleModerate0HotButton
55
Effective Means of Info DistributionWhat's the most effective way to distribute information? Would require sending out some announcement (could be a HODP pub if you talk to board about it) via multiple channels with maybe slightly different sign-up links in each announcement, then comparing total responsiveness and responsiveness over time, conversion from opening the link to signing up for whatever it is, etc. Mainly analyzing HODP's distribution itself. UnclaimedExtracurricularsEasy0
56
Third Democratic Debate ArticleUnclaimed*Politics*Easy0
57
Alumni Clubs and SIGSHarvard Alums have a bunch of clubs and social groups. It would be pretty cool to figure out which locations have the highest density of alumni associations, which types of clubs are most prevalent, etc.

https://alumni.harvard.edu/community/clubs-sigs
UnclaimedExtracurricularsModerate0
58
Roombook usageWe could totally scrape the roombook site to see which rooms are the most commonly booked and what times are the most common. We can't see what activities are going on in each, but this data seems pretty readily availableUnclaimedExtracurricularsDifficult0
59
Create individual members pageWe currently allow members to edit their profile, but this only changes it on the "People Page". Members should have their own individual pages that people can click on to see their contributions and any social media linksUnclaimedMetaEasy0
60
Divestment DataUse data and statistical methods to offer a new perspective on the Harvard divestment issue. How much of Harvard's endowment is actually invested in fossil fuels and private prisons? Are fossil fuels and private prisons a profitable investment? How effective have divestment campaigns been in the past? This encompasses a lot of outside data, but will be highly influential with respect to a major Harvard issue. More thoughts: we could talk to the divestment groups/collab with them in some capacityUnclaimedFinancesDifficult0HotButton
61
Sponsored Program Returnssponsored program revenue (https://finance.harvard.edu/annual-report), actual expenditure (https://osp.finance.harvard.edu/annual-report)

Needs to extract data from pdfs, but should be pretty clear
FinancesModerate0
62
Social Spaces MapSize, amenities, access - potentially need to go to the places manually, but being able to access them should be fineUnclaimedCampusModerate0
63
House Floor Plans Just Need To Compile by hand (though some houses may have their own info compiled in spreadsheets, worth asking)...... GIS data (https://guides.library.harvard.edu/gsd/gis#s-lg-box-7271210) likely not detailed enough for rooms, though total area of house is possible. what percent have n+1 (can do analysis based on # people and # rooms-might not even have to compile by hand to do this)? Average size of rooms? If we have plans/data from before renovation, can compare quality of housing over years.UnclaimedCampusDifficult0
64
HUDS Grill Waiting Time (Mange) Needs to collect data from mange app or directly from HUDSUnclaimedFoodModerate0
65
Harvard Square Restaurant Ratingshttps://www.yelp.com/fusion - would involve price/rating/popularity, probably need some scraping plus some student survey. How often do people go out to eat?

Can do a map with price/rating/popularity (no. of ratings)


UnclaimedFoodModerate0
66
New SEAS Campus Data(Have to wait at least a year for this) Ask facilities for usage and efficiency data. How will that change once the new SEAS campus opens? https://oir.harvard.edu/fact-book/infrastructureUnclaimedCampusEasy0
67
Art Museum DataThe art museum website has public data. MetaLab did beat us to visualizations (and we might be able to ask them for data in aggregated form so we don't need to scrape or ask the museums), but we may provide some interesting analyses, like which artists do we usually collect? Which part of our collections has the most value?CampusModerate0
68
Student Eating PreferencesWhat food do students like to eat the most? Question is how we get the data: (can take survey and ask people to rate dishes, like the Independent Dietary Habits survey)

Green reps, ask sustainability groups
Red, green, yellow stickers, how environmentally friendly is HUDS and what do people like? Useful for clubs too, what to order: pizza or nachos? Soft drink popularity? Jefe's or Felipe's? Can analyze dish frequency vs. popularity, popularity correlated with day of week/time of month HUDS serves
UnclaimedFoodEasy0
69
HUDS Survey Satisfaction DataI'd love to integrate some of the HUDS data that we already have with the HUDS Satisfaction survey. I'm not entirely sure what they do with that data or how to get it but it'd be good to get the ball rolling on thatUnclaimedFoodEasy0
70
Centralized Course Website DirectoryMany courses have their own websites (ex. CS50) and it would be cool to get all of them in one place. This project would likely be a lot of manually searching for websites, but at the end you’ll have a great data set that you could scrape over and possibly extract some meaningful data from.UnclaimedAcademicsModerate0