BDA-602 - Machine Learning Engineering
Dr. Julien Pierret
Lecture 2
Python Libraries - Demo
Set up pip
Make a PR
scipy
SciPy (pronounced “Sigh Pie”) is a Python-based ecosystem of open-source software for mathematics, science, and engineering.
NumPy ✔️
SciPy library ➖
Matplotlib ❌
IPython ➖
SymPy ❌
pandas ✔️
In Summary (1 of 2)
We will use pip-compile along with requirements.in for dependency management
argparse will help us pass arguments to our programs
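A minimal sketch of passing arguments with argparse. The script purpose and the argument names (csv_path, --test-size, --seed) are hypothetical and chosen only for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Create the command-line parser for a hypothetical training script."""
    parser = argparse.ArgumentParser(description="Train a model on a CSV file.")
    # Positional argument: required, identified by its position.
    parser.add_argument("csv_path", help="path to the input CSV file")
    # Optional arguments: identified by flag, with a type and a default.
    parser.add_argument("--test-size", type=float, default=0.2,
                        help="fraction of rows held out for testing")
    parser.add_argument("--seed", type=int, default=42, help="random seed")
    return parser


def main(argv=None) -> argparse.Namespace:
    # Passing argv explicitly keeps the parser easy to test;
    # argv=None makes argparse fall back to sys.argv[1:].
    args = build_parser().parse_args(argv)
    print(f"csv_path={args.csv_path} test_size={args.test_size} seed={args.seed}")
    return args


# Simulate: python train.py data.csv --test-size 0.3
args = main(["data.csv", "--test-size", "0.3"])
```

Note that argparse converts the flag --test-size into the attribute test_size, and that --seed keeps its default because it was not supplied.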
Pandas
Load small datasets
Prepare datasets for analysis
NumPy
Manipulate arrays
Build new features
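A small sketch of the pandas and NumPy roles above, assuming a tiny made-up dataset held in memory instead of a real file. The column names and the derived features (bmi, log_weight) are hypothetical.

```python
import io

import numpy as np
import pandas as pd

# A tiny in-memory CSV stands in for a file on disk (made-up data).
CSV_TEXT = """name,height_cm,weight_kg
a,150,50
b,160,
c,180,81
"""

# Load a small dataset.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Prepare it for analysis: fill the missing weight with the column median.
df["weight_kg"] = df["weight_kg"].fillna(df["weight_kg"].median())

# Build new features with vectorized NumPy operations on the underlying arrays.
height_m = df["height_cm"].to_numpy() / 100.0
weight = df["weight_kg"].to_numpy()
df["bmi"] = weight / height_m**2
df["log_weight"] = np.log(weight)

print(df)
```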
scikit-learn
Build repeatable transformations
Train ML models
Build reusable pipelines
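A rough sketch of a reusable scikit-learn pipeline, assuming synthetic data and an arbitrary choice of scaler and model. Neither the data nor the model choice comes from the lecture.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (made up for illustration): two features, binary label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The pipeline bundles the transformation and the model, so the scaler is
# fit on the training data only and reapplied identically at prediction time.
pipeline = Pipeline(
    steps=[
        ("scale", StandardScaler()),
        ("model", LogisticRegression()),
    ]
)
pipeline.fit(X_train, y_train)

accuracy = pipeline.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the fitted pipeline is a single object, the same transformations are repeated on any new data passed to pipeline.predict.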
In Summary (2 of 2)
Plotly
Inspect candidate predictors
Visualize our data / results
Difference with mean of response
Our "go-to" for visualizing relationships
Jupyter Notebooks
Are garbage 💩🚽, 🗑️
Homework - Tutorials 📓
Bash
Ubuntu Bash (1 hour)
Bash scripting
NumPy
Quickstart
Pandas
Quickstart (10 minutes)
Official Tutorials
scikit-learn
Official Tutorials
argparse
Official Tutorial
Homework - References 📚
Bash
Bash Reference Manual
The Linux Command Line
NumPy
Tons of resources
API
Pandas
Pandas User Guide
API
scikit-learn
User guide
API
argparse
API
plotly
API
Homework - Cheatsheets
Bash
- There are other good cheatsheets here
NumPy
Pandas