# X4DS
This repo is a collection of Python & Julia ports of code from the following excellent R books:
| | Python Stack | Julia Stack |
| --- | --- | --- |
| Language Version | v3.9 | v1.7 |
| Data Processing | | |
| Visualization | | |
| Machine Learning | | |
| Probabilistic Programming | | |
## Code Styles

### 2.1. Basics

- prefer `enumerate()` over `range(len())`

```python
xs = range(3)

# good
for ind, x in enumerate(xs):
    print(f'{ind}: {x}')

# bad
for i in range(len(xs)):
    print(f'{i}: {xs[i]}')
```
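`enumerate()` also takes an optional `start` offset, which avoids manual index arithmetic when numbering from 1 — a small sketch in the same good/bad style:

```python
xs = ['a', 'b', 'c']

# good: enumerate's `start` parameter handles the offset
for ind, x in enumerate(xs, start=1):
    print(f'{ind}: {x}')

# bad: manual `i + 1` arithmetic
for i in range(len(xs)):
    print(f'{i + 1}: {xs[i]}')
```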
### 2.2. Matplotlib (including seaborn)

- prefer the `Axes` object over the `Figure` object
- use `constrained_layout=True` when drawing subplots

```python
# good
_, axes = plt.subplots(1, 2, constrained_layout=True)
axes[0].plot(x1, y1)
axes[1].hist(x2, y2)

# bad
plt.subplot(121)
plt.plot(x1, y1)
plt.subplot(122)
plt.hist(x2, y2)
```
- prefer `axes.flatten()` over `plt.subplot()` in cases where subplots' data is iterable
- prefer `zip()` or `enumerate()` over `range()` for iterable objects

```python
# good
_, axes = plt.subplots(2, 2, figsize=[12, 8], constrained_layout=True)
for ax, x, y in zip(axes.flatten(), xs, ys):
    ax.plot(x, y)

# bad
for i in range(4):
    ax = plt.subplot(2, 2, i + 1)
    ax.plot(x[i], y[i])
```
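When the panel index itself is needed (e.g. for titles), `enumerate()` pairs with `axes.flatten()` the same way — a minimal sketch with made-up data:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; assumed for a headless run
import matplotlib.pyplot as plt

# made-up data: four panels, three points each
xs = [[0, 1, 2]] * 4
ys = [[0, 1, 4], [0, 2, 4], [0, 3, 4], [0, 4, 4]]

_, axes = plt.subplots(2, 2, constrained_layout=True)
for ind, ax in enumerate(axes.flatten()):
    ax.plot(xs[ind], ys[ind])
    ax.set(title=f'panel {ind}')  # the index is available for labeling
```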
- prefer the `set()` method over the `set_*()` methods

```python
# good
ax.set(xlabel='x', ylabel='y')

# bad
ax.set_xlabel('x')
ax.set_ylabel('y')
```
- prefer `despine()` over `ax.spines[*].set_visible()`

```python
# good
sns.despine()

# bad
ax.spines["top"].set_visible(False)
ax.spines["bottom"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.spines["left"].set_visible(False)
```
### 2.3. Pandas

- prefer `df['col']` over `df.col`

```python
# good
movies['duration']

# bad
movies.duration
```
- prefer `df.query()` over `df[]` or `df.loc[]` in simple selection

```python
# good
movies.query('duration >= 200')

# bad
movies[movies['duration'] >= 200]
movies.loc[movies['duration'] >= 200, :]
```
- prefer `df.loc[]` and `df.iloc[]` over `df[]` in multiple selection

```python
# good
movies.loc[movies['duration'] >= 200, 'genre']
movies.iloc[0:2, :]

# bad
movies[movies['duration'] >= 200].genre
movies[0:2]
```
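`df.query()` can also reference local variables with the `@` prefix, which keeps threshold values out of the query string — a small sketch (the `movies` frame here is a made-up stand-in):

```python
import pandas as pd

# made-up stand-in for the movies DataFrame
movies = pd.DataFrame({'duration': [120, 210, 250], 'genre': ['a', 'b', 'c']})

min_duration = 200
# `@` pulls in the local variable instead of hard-coding the value
long_movies = movies.query('duration >= @min_duration')
```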
## LaTeX Styles

### Multiple Lines

Reduce the use of `\begin{array}...\end{array}`:

- equations: `\begin{aligned}...\end{aligned}`
$$
\begin{aligned}
y_1 = x^2 + 2*x \\
y_2 = x^3 + x
\end{aligned}
$$
- equations with conditions: `\begin{cases}...\end{cases}`
$$
\begin{cases}
y = x^2 + 2*x & x > 0 \\
y = x^3 + x & x \le 0
\end{cases}
$$
- matrix: `\begin{matrix}...\end{matrix}`
$$
\begin{vmatrix}
a + a' & b + b' \\ c & d
\end{vmatrix} = \begin{vmatrix}
a & b \\ c & d
\end{vmatrix} + \begin{vmatrix}
a' & b' \\ c & d
\end{vmatrix}
$$
### Brackets

- prefer `\Bigg...\Bigg` over `\left...\right`
$$
A\Bigg[v_1\ v_2\ \cdots\ v_r\Bigg]
$$
- prefer `\underset{}{}` over `\underset{}`
$$
\underset{\theta}{\mathrm{argmax}}\ p(x_i|\theta)
$$
### Expressions

- prefer `^{\top}` over `^T` for transpose

$$
\mathbf{A}^{\top}
$$
- prefer `\to` over `\rightarrow` for limits

$$
\lim_{n \to \infty}
$$
- prefer `\underset{}{}` over `\limits_`

$$
\underset{w}{\mathrm{argmin}}\ (wx + b)
$$
### Fonts

- prefer `\mathrm` over `\mathop` or `\operatorname`

$$
\theta_{\mathrm{MLE}} = \underset{\theta}{\mathrm{argmax}}\ \sum_{i=1}^{N} \log p(x_i|\theta)
$$
## ISLR
- Chapter 02 - Statistical Learning
- Chapter 03 - Linear Regression
- Chapter 04 - Classification
- Chapter 05 - Resampling Methods
- Chapter 06 - Linear Model Selection and Regularization
- Chapter 07 - Moving Beyond Linearity
- Chapter 08 - Tree-Based Methods
- Chapter 09 - Support Vector Machines
- Chapter 10 - Unsupervised Learning