PyLadies Industry Sharing - Data Scientist
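潘玫樺 (Mei-Hua Pan)
Undergraduate Program of the College of Electrical Engineering and Computer Science, Tsing Hua University
Graduate Institute of Computer Science, Tsing Hua University
Currently a data scientist at [company logo]
Let's bring beauty into everyday life through good design, together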
credit: xkcd.com/1838/
Garbage In, Garbage Out
numpy
scipy
pandas
sqlalchemy
scikit-learn
click
Apache Airflow
Apache Superset
pyspark
pyhive
Data Preparation
Model Development
other ML modules...
Model Deployment
Monitor
poetry
dramatiq
numpy
scipy
pandas
sqlalchemy
pyspark
pyhive
Data Preparation
pyspark
pyhive + sqlalchemy (ORM)
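A minimal sketch of the ORM side of this pairing. PyHive registers a `hive://` dialect for SQLAlchemy, so the same ORM query can be pointed at Hive by changing the engine URL. To stay runnable without a Hive cluster, the sketch uses an in-memory SQLite engine as a stand-in; the `Order` table, its columns, and the helper names are hypothetical and not taken from the talk.

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Order(Base):
    """Hypothetical table, used only for illustration."""

    __tablename__ = "orders"
    id = Column(Integer, primary_key=True)
    country = Column(String)
    amount = Column(Integer)


def make_demo_engine():
    """Build an in-memory SQLite engine with a few made-up rows.

    With PyHive installed, a Hive engine would instead be created with
    something like create_engine("hive://<host>:10000/default").
    """
    engine = create_engine("sqlite://")
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        session.add_all(
            [
                Order(id=1, country="TW", amount=100),
                Order(id=2, country="JP", amount=80),
                Order(id=3, country="TW", amount=250),
            ]
        )
        session.commit()
    return engine


def orders_for(engine, country):
    """Return (id, amount) pairs for one country, ordered by id."""
    stmt = select(Order).where(Order.country == country).order_by(Order.id)
    with Session(engine) as session:
        return [(o.id, o.amount) for o in session.execute(stmt).scalars()]
```

Because the query is written against the ORM model rather than as a raw SQL string, `orders_for` does not need to change when the engine is swapped. The sketch assumes SQLAlchemy 1.4 or later.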
numpy
scipy
pandas
scikit-learn
Model Development
other ML modules...
pipenv
pipenv + Jupyter Notebook
# install the current project in editable mode
pipenv install -e .

# register the current environment as a Jupyter kernel
pipenv shell
python -m ipykernel install --user --name=my-virtualenv-name

# check whether the kernel was installed successfully
jupyter kernelspec list
numpy + scipy
pandas
scikit-learn
click
Apache Airflow
Model Deployment
pipenv
Airflow
Task Duration
Click
Usage:
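A minimal sketch of what a click command for a batch job might look like. The command name `run_job` and its `--date` and `--dry-run` options are hypothetical, chosen only to show how click turns decorators into a command-line interface with generated `Usage:` help text. `invoke` uses click's own `CliRunner` test helper to call the command in-process.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("--date", required=True, help="Partition date to process, e.g. 2019-01-01.")
@click.option("--dry-run", is_flag=True, help="Print the plan without running it.")
def run_job(date, dry_run):
    """Hypothetical batch job entry point."""
    mode = "dry run" if dry_run else "run"
    click.echo(f"{mode}: scoring data for {date}")


def invoke(args):
    """Call the command in-process and return (exit_code, output)."""
    result = CliRunner().invoke(run_job, args)
    return result.exit_code, result.output
```

To use this as a script, add the usual `if __name__ == "__main__": run_job()` guard at the bottom of the file. Running it with `--help` prints the generated usage text, and omitting the required `--date` option makes click exit with a usage error.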
Airflow + click + pipenv
Apache Airflow
Apache Superset
Monitor
Superset:
Simple Demo
Q & A
Thank you, everyone
mei-hua.pan@pinkoi.com