Learning Objectives -- Practical Data Science - Fall 2013
By the end of this course, you will (as time permits):
- Be familiar with common tools for programming, development, and data management, as used for data access, data processing, visualization, and machine learning:
  - the Unix command line and utilities
  - the Python programming language
  - basic database querying
- Understand what data is: objects, relationships, & information
- Know how to represent (store and retrieve) data in a variety of common formats
- Use grep and regular expressions to process data: clean noise from raw data and extract exactly what your task needs
- Interact with databases to query for relevant information, store data, and provide a storage point for model results
- Deal with big data: use Hadoop to mine massive amounts of information
- Use web APIs to query for diverse information, and possibly set up a simple API to act as an endpoint for a system you’ve developed
- Find correlations between (attributes of) objects
- Visualize data for exploratory and confirmatory analysis
- Build models to make predictions from data and to categorize objects
- Transform raw data into features that are useful for predictive models
- Evaluate predictive models: how well do models predict a phenomenon?
  - produce quantitative evaluations
  - visualize model evaluations to assess business value / usefulness
- Understand controlled experiments in the wild:
  - deploying models / data systems
  - comparing and understanding treatments
- Understand some of the main applications of data science:
  - recommender systems
  - applications to online advertising
  - others, depending on case studies and guest speakers