TabNet for fastai

Overview

TabNet for fastai

This is an adaptation of TabNet (Attention-based network for tabular data) for fastai (>=2.0) library. The original paper https://arxiv.org/pdf/1908.07442.pdf.

Install

pip install fast_tabnet

How to use

model = TabNetModel(emb_szs, n_cont, out_sz, embed_p=0., y_range=None, n_d=8, n_a=8, n_steps=3, gamma=1.5, n_independent=2, n_shared=2, epsilon=1e-15, virtual_batch_size=128, momentum=0.02)

Parameters emb_szs, n_cont, out_sz, embed_p, y_range are the same as for fastai TabularModel.

  • n_d : int Dimension of the prediction layer (usually between 4 and 64)
  • n_a : int Dimension of the attention layer (usually between 4 and 64)
  • n_steps: int Number of sucessive steps in the newtork (usually betwenn 3 and 10)
  • gamma : float Float above 1, scaling factor for attention updates (usually betwenn 1.0 to 2.0)
  • momentum : float Float value between 0 and 1 which will be used for momentum in all batch norm
  • n_independent : int Number of independent GLU layer in each GLU block (default 2)
  • n_shared : int Number of independent GLU layer in each GLU block (default 2)
  • epsilon: float Avoid log(0), this should be kept very low

Example

Below is an example from fastai library, but the model in use is TabNet

from fastai.basics import *
from fastai.tabular.all import *
from fast_tabnet.core import *
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')
df_main,df_test = df.iloc[:-1000].copy(),df.iloc[-1000:].copy()
df_main.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
age workclass fnlwgt education education-num marital-status occupation relationship race sex capital-gain capital-loss hours-per-week native-country salary
0 49 Private 101320 Assoc-acdm 12.0 Married-civ-spouse NaN Wife White Female 0 1902 40 United-States >=50k
1 44 Private 236746 Masters 14.0 Divorced Exec-managerial Not-in-family White Male 10520 0 45 United-States >=50k
2 38 Private 96185 HS-grad NaN Divorced NaN Unmarried Black Female 0 0 32 United-States <50k
3 38 Self-emp-inc 112847 Prof-school 15.0 Married-civ-spouse Prof-specialty Husband Asian-Pac-Islander Male 0 0 40 United-States >=50k
4 42 Self-emp-not-inc 82297 7th-8th NaN Married-civ-spouse Other-service Wife Black Female 0 0 50 United-States <50k
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 
             'relationship', 'race', 'native-country', 'sex']
cont_names = ['age', 'fnlwgt', 'education-num']
procs = [Categorify, FillMissing, Normalize]
splits = RandomSplitter()(range_of(df_main))
to = TabularPandas(df_main, procs, cat_names, cont_names, y_names="salary", 
                   y_block = CategoryBlock(), splits=splits)
dls = to.dataloaders(bs=32)
dls.valid.show_batch()
workclass education marital-status occupation relationship race native-country sex education-num_na age fnlwgt education-num salary
0 Private HS-grad Married-civ-spouse Other-service Wife White United-States Female False 39.000000 196673.000115 9.0 <50k
1 Private HS-grad Married-civ-spouse Craft-repair Husband White United-States Male False 32.000000 198067.999771 9.0 <50k
2 State-gov HS-grad Never-married Adm-clerical Own-child White United-States Female False 18.999999 176633.999931 9.0 <50k
3 Private Some-college Married-civ-spouse Prof-specialty Husband White United-States Male False 67.999999 107626.998490 10.0 <50k
4 Private Masters Never-married Exec-managerial Not-in-family Black United-States Male False 29.000000 214925.000260 14.0 <50k
5 Private HS-grad Married-civ-spouse Priv-house-serv Wife White United-States Female False 22.000000 200109.000126 9.0 <50k
6 Private Some-college Never-married Sales Own-child White United-States Female False 18.000000 60980.998429 10.0 <50k
7 Private Some-college Separated Adm-clerical Not-in-family White United-States Female False 28.000000 334367.998199 10.0 <50k
8 Private 11th Married-civ-spouse Transport-moving Husband White United-States Male False 49.000000 123584.001097 7.0 <50k
9 Private Masters Never-married Prof-specialty Not-in-family White United-States Female False 26.000000 397316.999922 14.0 <50k
to_tst = to.new(df_test)
to_tst.process()
to_tst.all_cols.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
workclass education marital-status occupation relationship race native-country sex education-num_na age fnlwgt education-num salary
31561 5 2 5 9 3 3 40 2 1 -1.505833 -0.559418 -1.202170 0
31562 2 12 5 2 5 3 40 1 1 -1.432653 0.421241 -0.418032 0
31563 5 7 3 4 1 5 40 2 1 -0.115406 0.132868 -1.986307 0
31564 8 12 3 9 1 5 40 2 1 1.494561 0.749805 -0.418032 0
31565 1 12 1 1 5 3 40 2 1 -0.481308 7.529798 -0.418032 0
emb_szs = get_emb_sz(to)

That's the use of the model

model = TabNetModel(emb_szs, len(to.cont_names), dls.c, n_d=8, n_a=8, n_steps=5, mask_type='entmax');
learn = Learner(dls, model, CrossEntropyLossFlat(), opt_func=Adam, lr=3e-2, metrics=[accuracy])
learn.lr_find()
SuggestedLRs(lr_min=0.2754228591918945, lr_steep=1.9054607491852948e-06)

png

learn.fit_one_cycle(5)
epoch train_loss valid_loss accuracy time
0 0.446274 0.414451 0.817015 00:30
1 0.380002 0.393030 0.818916 00:30
2 0.371149 0.359802 0.832066 00:30
3 0.349027 0.352255 0.835868 00:30
4 0.355339 0.349360 0.836819 00:30

Tabnet interpretability

# feature importance for 2k rows
dl = learn.dls.test_dl(df.iloc[:2000], bs=256)
feature_importances = tabnet_feature_importances(learn.model, dl)
# per sample interpretation
dl = learn.dls.test_dl(df.iloc[:20], bs=20)
res_explain, res_masks = tabnet_explain(learn.model, dl)
plt.xticks(rotation='vertical')
plt.bar(dl.x_names, feature_importances, color='g')
plt.show()

png

def plot_explain(masks, lbls, figsize=(12,12)):
    "Plots masks with `lbls` (`dls.x_names`)"
    fig = plt.figure(figsize=figsize)
    ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
    plt.yticks(np.arange(0, len(masks), 1.0))
    plt.xticks(np.arange(0, len(masks[0]), 1.0))
    ax.set_xticklabels(lbls, rotation=90)
    plt.ylabel('Sample Number')
    plt.xlabel('Variable')
    plt.imshow(masks)
plot_explain(res_explain, dl.x_names)

png

Hyperparameter search with Bayesian Optimization

If your dataset isn't huge you can tune hyperparameters for tabular models with Bayesian Optimization. You can optimize directly your metric using this approach if the metric is sensitive enough (in our example it is not and we use validation loss instead). Also, you should create the second validation set, because you will use the first as a training set for Bayesian Optimization.

You may need to install the optimizer pip install bayesian-optimization

from functools import lru_cache
# The function we'll optimize
@lru_cache(1000)
def get_accuracy(n_d:Int, n_a:Int, n_steps:Int):
    model = TabNetModel(emb_szs, len(to.cont_names), dls.c, n_d=n_d, n_a=n_a, n_steps=n_steps, gamma=1.5)
    learn = Learner(dls, model, CrossEntropyLossFlat(), opt_func=opt_func, lr=3e-2, metrics=[accuracy])
    learn.fit_one_cycle(5)
    return float(learn.validate(dl=learn.dls.valid)[1])

This implementation of Bayesian Optimization doesn't work naturally with descreet values. That's why we use wrapper with lru_cache.

def fit_accuracy(pow_n_d, pow_n_a, pow_n_steps):
    n_d, n_a, n_steps = map(lambda x: 2**int(x), (pow_n_d, pow_n_a, pow_n_steps))
    return get_accuracy(n_d, n_a, n_steps)
from bayes_opt import BayesianOptimization

# Bounded region of parameter space
pbounds = {'pow_n_d': (0, 8), 'pow_n_a': (0, 8), 'pow_n_steps': (0, 4)}

optimizer = BayesianOptimization(
    f=fit_accuracy,
    pbounds=pbounds,
)
optimizer.maximize(
    init_points=15,
    n_iter=100,
)
|   iter    |  target   |  pow_n_a  |  pow_n_d  | pow_n_... |
-------------------------------------------------------------
epoch train_loss valid_loss accuracy time
0 0.404888 0.432834 0.793885 00:10
1 0.367979 0.384840 0.818600 00:09
2 0.366444 0.372005 0.819708 00:09
3 0.362771 0.366949 0.823511 00:10
4 0.353682 0.367132 0.823511 00:10
| �[0m 1       �[0m | �[0m 0.8235  �[0m | �[0m 0.9408  �[0m | �[0m 1.898   �[0m | �[0m 1.652   �[0m |
epoch train_loss valid_loss accuracy time
0 0.393301 0.449742 0.810836 00:08
1 0.379140 0.413773 0.815589 00:07
2 0.355790 0.388907 0.822560 00:07
3 0.349984 0.362671 0.828739 00:07
4 0.348000 0.360150 0.827313 00:07
| �[95m 2       �[0m | �[95m 0.8273  �[0m | �[95m 4.262   �[0m | �[95m 5.604   �[0m | �[95m 0.2437  �[0m |
epoch train_loss valid_loss accuracy time
0 0.451572 0.434189 0.781210 00:12
1 0.423763 0.413420 0.805450 00:12
2 0.398922 0.408688 0.814164 00:12
3 0.390981 0.392398 0.808935 00:12
4 0.376418 0.382250 0.817174 00:12
| �[0m 3       �[0m | �[0m 0.8172  �[0m | �[0m 7.233   �[0m | �[0m 6.471   �[0m | �[0m 2.508   �[0m |
epoch train_loss valid_loss accuracy time
0 0.403187 0.413986 0.798162 00:07
1 0.398544 0.390102 0.820184 00:07
2 0.390569 0.389703 0.825253 00:07
3 0.375426 0.385706 0.826996 00:07
4 0.370446 0.383366 0.831115 00:06
| �[95m 4       �[0m | �[95m 0.8311  �[0m | �[95m 5.935   �[0m | �[95m 1.241   �[0m | �[95m 0.3809  �[0m |
epoch train_loss valid_loss accuracy time
0 0.464145 0.458641 0.751267 00:18
1 0.424691 0.436968 0.788023 00:18
2 0.431576 0.436581 0.775824 00:18
3 0.432143 0.437062 0.759506 00:18
4 0.429915 0.438332 0.758555 00:18
| �[0m 5       �[0m | �[0m 0.7586  �[0m | �[0m 2.554   �[0m | �[0m 0.4992  �[0m | �[0m 3.111   �[0m |
epoch train_loss valid_loss accuracy time
0 0.470359 0.475826 0.748891 00:12
1 0.411564 0.409433 0.797053 00:12
2 0.392718 0.397363 0.809727 00:12
3 0.387564 0.380033 0.814322 00:12
4 0.374153 0.378258 0.818916 00:12
| �[0m 6       �[0m | �[0m 0.8189  �[0m | �[0m 4.592   �[0m | �[0m 2.138   �[0m | �[0m 2.824   �[0m |
epoch train_loss valid_loss accuracy time
0 0.547042 0.588752 0.754119 00:18
1 0.491731 0.469795 0.771863 00:18
2 0.454340 0.433961 0.775190 00:18
3 0.424386 0.432385 0.782953 00:18
4 0.397645 0.406420 0.805767 00:19
| �[0m 7       �[0m | �[0m 0.8058  �[0m | �[0m 6.186   �[0m | �[0m 7.016   �[0m | �[0m 3.316   �[0m |
epoch train_loss valid_loss accuracy time
0 0.485245 0.487635 0.751109 00:18
1 0.450832 0.446423 0.750317 00:18
2 0.448203 0.449419 0.755228 00:18
3 0.430258 0.443562 0.744297 00:18
4 0.429821 0.437173 0.761565 00:18
| �[0m 8       �[0m | �[0m 0.7616  �[0m | �[0m 2.018   �[0m | �[0m 1.316   �[0m | �[0m 3.675   �[0m |
epoch train_loss valid_loss accuracy time
0 0.458425 0.455733 0.751584 00:12
1 0.439781 0.467807 0.751109 00:12
2 0.420331 0.432216 0.775190 00:12
3 0.421012 0.421412 0.782319 00:12
4 0.401828 0.413434 0.801014 00:12
| �[0m 9       �[0m | �[0m 0.801   �[0m | �[0m 2.051   �[0m | �[0m 1.958   �[0m | �[0m 2.332   �[0m |
epoch train_loss valid_loss accuracy time
0 0.546997 0.506728 0.761407 00:18
1 0.489712 0.439324 0.799588 00:18
2 0.448558 0.448419 0.786122 00:18
3 0.436869 0.435375 0.801648 00:18
4 0.417128 0.421093 0.798321 00:18
| �[0m 10      �[0m | �[0m 0.7983  �[0m | �[0m 5.203   �[0m | �[0m 7.719   �[0m | �[0m 3.407   �[0m |
epoch train_loss valid_loss accuracy time
0 0.380781 0.463409 0.786439 00:07
1 0.359212 0.461147 0.798321 00:07
2 0.351414 0.368950 0.822719 00:07
3 0.347257 0.367056 0.829373 00:07
4 0.337212 0.362375 0.830799 00:07
| �[0m 11      �[0m | �[0m 0.8308  �[0m | �[0m 6.048   �[0m | �[0m 4.376   �[0m | �[0m 0.08141 �[0m |
epoch train_loss valid_loss accuracy time
0 0.430772 0.430897 0.767744 00:12
1 0.402611 0.432137 0.764259 00:12
2 0.407579 0.409651 0.812104 00:12
3 0.374988 0.391822 0.816698 00:12
4 0.378011 0.389278 0.816065 00:12
| �[0m 12      �[0m | �[0m 0.8161  �[0m | �[0m 7.083   �[0m | �[0m 1.385   �[0m | �[0m 2.806   �[0m |
epoch train_loss valid_loss accuracy time
0 0.402018 0.412051 0.812262 00:09
1 0.372804 0.464937 0.811629 00:09
2 0.368274 0.384675 0.820184 00:09
3 0.364502 0.371920 0.820659 00:09
4 0.348998 0.369445 0.823828 00:09
| �[0m 13      �[0m | �[0m 0.8238  �[0m | �[0m 4.812   �[0m | �[0m 3.785   �[0m | �[0m 1.396   �[0m |
| �[0m 14      �[0m | �[0m 0.8172  �[0m | �[0m 7.672   �[0m | �[0m 6.719   �[0m | �[0m 2.72    �[0m |
epoch train_loss valid_loss accuracy time
0 0.476033 0.442598 0.803549 00:12
1 0.405236 0.414015 0.788973 00:11
2 0.406291 0.451269 0.789449 00:11
3 0.391013 0.393243 0.816065 00:12
4 0.374160 0.377635 0.821451 00:12
| �[0m 15      �[0m | �[0m 0.8215  �[0m | �[0m 6.464   �[0m | �[0m 7.954   �[0m | �[0m 2.647   �[0m |
epoch train_loss valid_loss accuracy time
0 0.390142 0.390678 0.810995 00:06
1 0.381717 0.382202 0.813055 00:06
2 0.368564 0.378705 0.823828 00:06
3 0.358858 0.368329 0.823511 00:07
4 0.353392 0.363913 0.825887 00:06
| �[0m 16      �[0m | �[0m 0.8259  �[0m | �[0m 0.1229  �[0m | �[0m 7.83    �[0m | �[0m 0.3708  �[0m |
epoch train_loss valid_loss accuracy time
0 0.381215 0.422651 0.800697 00:06
1 0.377345 0.380863 0.815906 00:06
2 0.366631 0.370579 0.822877 00:06
3 0.362745 0.366619 0.823352 00:07
4 0.356861 0.364835 0.825887 00:07
| �[0m 17      �[0m | �[0m 0.8259  �[0m | �[0m 0.03098 �[0m | �[0m 3.326   �[0m | �[0m 0.007025�[0m |
epoch train_loss valid_loss accuracy time
0 0.404604 0.443035 0.824461 00:07
1 0.361872 0.388880 0.823669 00:06
2 0.375164 0.369968 0.825095 00:06
3 0.352091 0.363823 0.827947 00:06
4 0.335458 0.362544 0.829373 00:07
| �[0m 18      �[0m | �[0m 0.8294  �[0m | �[0m 7.81    �[0m | �[0m 7.976   �[0m | �[0m 0.0194  �[0m |
epoch train_loss valid_loss accuracy time
0 0.679292 0.677299 0.248891 00:05
1 0.675403 0.678406 0.248891 00:05
2 0.673259 0.673374 0.248891 00:06
3 0.674996 0.673514 0.248891 00:07
4 0.668813 0.673671 0.248891 00:07
| �[0m 19      �[0m | �[0m 0.2489  �[0m | �[0m 0.4499  �[0m | �[0m 0.138   �[0m | �[0m 0.001101�[0m |
epoch train_loss valid_loss accuracy time
0 0.524201 0.528132 0.729880 00:30
1 0.419377 0.403198 0.812104 00:31
2 0.399398 0.418890 0.812421 00:31
3 0.381651 0.391744 0.819075 00:31
4 0.368742 0.377904 0.822085 00:31
| �[0m 20      �[0m | �[0m 0.8221  �[0m | �[0m 0.0     �[0m | �[0m 6.575   �[0m | �[0m 4.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.681083 0.682397 0.248891 00:05
1 0.672935 0.679371 0.248891 00:06
2 0.675200 0.673466 0.248891 00:06
3 0.674251 0.673356 0.248891 00:06
4 0.668861 0.673186 0.248891 00:06
| �[0m 21      �[0m | �[0m 0.2489  �[0m | �[0m 8.0     �[0m | �[0m 0.0     �[0m | �[0m 0.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.407246 0.432203 0.801331 00:10
1 0.385086 0.399513 0.811312 00:11
2 0.377365 0.384121 0.816065 00:12
3 0.366855 0.371010 0.823194 00:12
4 0.361931 0.368933 0.825095 00:12
| �[0m 22      �[0m | �[0m 0.8251  �[0m | �[0m 0.0     �[0m | �[0m 4.502   �[0m | �[0m 2.193   �[0m |
epoch train_loss valid_loss accuracy time
0 0.493623 0.476921 0.766001 00:30
1 0.441126 0.443774 0.776774 00:31
2 0.424523 0.437125 0.783904 00:31
3 0.402457 0.408628 0.795944 00:31
4 0.439420 0.431756 0.788973 00:32
| �[0m 23      �[0m | �[0m 0.789   �[0m | �[0m 8.0     �[0m | �[0m 3.702   �[0m | �[0m 4.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.515615 0.513919 0.751109 00:31
1 0.462674 0.495322 0.751584 00:31
2 0.465430 0.483685 0.751267 00:31
3 0.481308 0.495375 0.755070 00:31
4 0.481324 0.491275 0.754911 00:32
| �[0m 24      �[0m | �[0m 0.7549  �[0m | �[0m 6.009   �[0m | �[0m 0.0     �[0m | �[0m 4.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.422837 0.403953 0.819392 00:06
1 0.380753 0.367345 0.826838 00:06
2 0.353045 0.365174 0.830006 00:07
3 0.348628 0.364282 0.826362 00:07
4 0.343561 0.361509 0.829214 00:07
| �[0m 25      �[0m | �[0m 0.8292  �[0m | �[0m 3.522   �[0m | �[0m 8.0     �[0m | �[0m 0.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.807766 1.307279 0.481622 00:31
1 0.513308 0.499470 0.783587 00:32
2 0.445906 0.492620 0.798004 00:31
3 0.385094 0.399986 0.807509 00:32
4 0.387228 0.384739 0.817015 00:31
| �[0m 26      �[0m | �[0m 0.817   �[0m | �[0m 0.0     �[0m | �[0m 8.0     �[0m | �[0m 4.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.442076 0.491338 0.755387 00:31
1 0.441078 0.443674 0.760773 00:31
2 0.417575 0.418758 0.792142 00:31
3 0.410825 0.417581 0.788498 00:34
4 0.403407 0.410941 0.798321 00:46
| �[0m 27      �[0m | �[0m 0.7983  �[0m | �[0m 0.0     �[0m | �[0m 0.0     �[0m | �[0m 4.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.407006 0.419679 0.792142 00:08
1 0.390913 0.392631 0.810520 00:08
2 0.365560 0.394330 0.817491 00:08
3 0.378459 0.387244 0.820659 00:08
4 0.375275 0.385417 0.828897 00:08
| �[0m 28      �[0m | �[0m 0.8289  �[0m | �[0m 3.379   �[0m | �[0m 2.848   �[0m | �[0m 0.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.430604 0.469592 0.781210 00:45
1 0.423074 0.429704 0.797529 00:45
2 0.400120 0.393398 0.810995 00:45
3 0.382361 0.390651 0.816065 00:46
4 0.389520 0.401878 0.807193 00:46
| �[0m 29      �[0m | �[0m 0.8072  �[0m | �[0m 0.0     �[0m | �[0m 2.588   �[0m | �[0m 4.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.396348 0.397454 0.806717 00:08
1 0.383342 0.386023 0.819550 00:07
2 0.369493 0.374401 0.820025 00:07
3 0.356015 0.366535 0.826204 00:08
4 0.341073 0.365241 0.826204 00:08
| �[0m 30      �[0m | �[0m 0.8262  �[0m | �[0m 1.217   �[0m | �[0m 5.622   �[0m | �[0m 0.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.809077 0.480591 0.782795 00:45
1 0.571318 0.497731 0.739068 00:45
2 0.514562 0.461726 0.781527 00:45
3 0.439822 0.451722 0.787231 00:44
4 0.419881 0.422125 0.801648 00:45
| �[0m 31      �[0m | �[0m 0.8016  �[0m | �[0m 8.0     �[0m | �[0m 8.0     �[0m | �[0m 4.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.410521 0.435045 0.810044 00:08
1 0.363098 0.378001 0.821926 00:08
2 0.359525 0.364477 0.827788 00:08
3 0.354005 0.366507 0.821610 00:08
4 0.347293 0.362657 0.829373 00:08
| �[0m 32      �[0m | �[0m 0.8294  �[0m | �[0m 5.864   �[0m | �[0m 8.0     �[0m | �[0m 0.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.498376 0.436696 0.794043 00:16
1 0.411699 0.435537 0.801331 00:16
2 0.385327 0.396916 0.820184 00:16
3 0.382020 0.389856 0.813371 00:16
4 0.373869 0.377804 0.820817 00:15
| �[0m 33      �[0m | �[0m 0.8208  �[0m | �[0m 1.776   �[0m | �[0m 8.0     �[0m | �[0m 2.212   �[0m |
epoch train_loss valid_loss accuracy time
0 0.404653 0.440106 0.772180 00:11
1 0.377931 0.393715 0.817332 00:11
2 0.373221 0.379273 0.826838 00:11
3 0.359682 0.362844 0.828422 00:11
4 0.340384 0.363072 0.828897 00:11
| �[0m 34      �[0m | �[0m 0.8289  �[0m | �[0m 5.777   �[0m | �[0m 2.2     �[0m | �[0m 1.31    �[0m |
epoch train_loss valid_loss accuracy time
0 0.520308 0.503207 0.749208 00:45
1 0.472501 0.451469 0.780418 00:45
2 0.454686 0.429175 0.784854 00:45
3 0.400800 0.413727 0.795469 00:44
4 0.405604 0.409770 0.801648 00:45
| �[0m 35      �[0m | �[0m 0.8016  �[0m | �[0m 2.748   �[0m | �[0m 5.915   �[0m | �[0m 4.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.504537 0.501541 0.750317 00:45
1 0.465937 0.477715 0.773289 00:45
2 0.435364 0.481415 0.766635 00:45
3 0.425434 0.442198 0.772814 00:45
4 0.425779 0.458947 0.771863 00:45
| �[0m 36      �[0m | �[0m 0.7719  �[0m | �[0m 6.251   �[0m | �[0m 2.532   �[0m | �[0m 4.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.420782 0.420721 0.791350 00:10
1 0.403576 0.408376 0.800222 00:10
2 0.390236 0.393624 0.820342 00:11
3 0.377777 0.389657 0.821610 00:11
4 0.382809 0.386011 0.820976 00:11
| �[0m 37      �[0m | �[0m 0.821   �[0m | �[0m 5.093   �[0m | �[0m 0.172   �[0m | �[0m 1.64    �[0m |
epoch train_loss valid_loss accuracy time
0 0.393575 0.397811 0.812262 00:08
1 0.378272 0.381915 0.815748 00:08
2 0.364799 0.369214 0.824620 00:08
3 0.355757 0.364554 0.826996 00:08
4 0.342090 0.362723 0.824303 00:08
| �[0m 38      �[0m | �[0m 0.8243  �[0m | �[0m 8.0     �[0m | �[0m 5.799   �[0m | �[0m 0.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.393693 0.396980 0.822085 00:11
1 0.361231 0.393146 0.813847 00:11
2 0.345645 0.379510 0.823986 00:11
3 0.349778 0.367077 0.826679 00:11
4 0.342390 0.362027 0.827788 00:11
| �[0m 39      �[0m | �[0m 0.8278  �[0m | �[0m 1.62    �[0m | �[0m 3.832   �[0m | �[0m 1.151   �[0m |
epoch train_loss valid_loss accuracy time
0 0.832737 0.491002 0.771546 00:43
1 0.627948 0.553552 0.764734 00:43
2 0.498901 0.467162 0.791984 00:46
3 0.431196 0.444576 0.785646 00:43
4 0.399745 0.427060 0.796578 00:42
| �[0m 40      �[0m | �[0m 0.7966  �[0m | �[0m 2.198   �[0m | �[0m 8.0     �[0m | �[0m 4.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.511301 0.514401 0.751267 00:43
1 0.447332 0.445157 0.751109 00:43
2 0.451125 0.438327 0.750951 00:42
3 0.445883 0.443266 0.751267 00:42
4 0.444816 0.438459 0.764100 00:42
| �[0m 41      �[0m | �[0m 0.7641  �[0m | �[0m 8.0     �[0m | �[0m 1.03    �[0m | �[0m 4.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.408504 0.413275 0.797212 00:15
1 0.392707 0.399085 0.805767 00:15
2 0.379938 0.395550 0.817807 00:15
3 0.375288 0.383186 0.820817 00:15
4 0.360417 0.375098 0.823194 00:16
| �[0m 42      �[0m | �[0m 0.8232  �[0m | �[0m 0.0     �[0m | �[0m 2.504   �[0m | �[0m 2.135   �[0m |
epoch train_loss valid_loss accuracy time
0 0.399371 0.415196 0.801014 00:07
1 0.367804 0.392020 0.810995 00:06
2 0.362288 0.385124 0.820659 00:07
3 0.344728 0.371339 0.823669 00:07
4 0.345769 0.362059 0.829373 00:07
| �[0m 43      �[0m | �[0m 0.8294  �[0m | �[0m 0.0     �[0m | �[0m 5.441   �[0m | �[0m 0.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.397157 0.431003 0.803866 00:06
1 0.394964 0.396448 0.810361 00:06
2 0.378584 0.387943 0.820659 00:07
3 0.371601 0.386186 0.818283 00:07
4 0.369759 0.384339 0.827630 00:07
| �[0m 44      �[0m | �[0m 0.8276  �[0m | �[0m 4.636   �[0m | �[0m 1.476   �[0m | �[0m 0.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.408654 0.426806 0.791191 00:12
1 0.394184 0.406586 0.786439 00:12
2 0.369625 0.372680 0.822560 00:12
3 0.349444 0.368142 0.823828 00:12
4 0.351684 0.363406 0.826204 00:12
| �[0m 45      �[0m | �[0m 0.8262  �[0m | �[0m 0.0     �[0m | �[0m 7.071   �[0m | �[0m 2.071   �[0m |
epoch train_loss valid_loss accuracy time
0 0.400293 0.416098 0.811629 00:08
1 0.377387 0.433395 0.807034 00:08
2 0.368131 0.395448 0.796420 00:08
3 0.367750 0.376879 0.817174 00:08
4 0.362124 0.371432 0.821134 00:08
| �[0m 46      �[0m | �[0m 0.8211  �[0m | �[0m 4.26    �[0m | �[0m 6.934   �[0m | �[0m 1.79    �[0m |
epoch train_loss valid_loss accuracy time
0 0.404579 0.437443 0.814797 00:07
1 0.375342 0.380416 0.824937 00:07
2 0.365835 0.377617 0.812738 00:07
3 0.354619 0.364503 0.827471 00:07
4 0.340603 0.363488 0.827947 00:07
| �[0m 47      �[0m | �[0m 0.8279  �[0m | �[0m 6.579   �[0m | �[0m 6.485   �[0m | �[0m 0.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.384890 0.440342 0.812579 00:08
1 0.371483 0.387200 0.813847 00:09
2 0.365951 0.378071 0.818283 00:09
3 0.362965 0.369994 0.821610 00:09
4 0.356483 0.365151 0.826521 00:09
| �[0m 48      �[0m | �[0m 0.8265  �[0m | �[0m 8.0     �[0m | �[0m 4.293   �[0m | �[0m 1.74    �[0m |
epoch train_loss valid_loss accuracy time
0 0.386308 0.389250 0.815431 00:08
1 0.368402 0.389338 0.814956 00:09
2 0.362211 0.377196 0.824778 00:09
3 0.356135 0.362951 0.829531 00:09
4 0.341577 0.362476 0.830799 00:09
| �[0m 49      �[0m | �[0m 0.8308  �[0m | �[0m 7.909   �[0m | �[0m 7.827   �[0m | �[0m 1.323   �[0m |
epoch train_loss valid_loss accuracy time
0 0.426044 0.422882 0.791984 00:08
1 0.375756 0.381810 0.817491 00:09
2 0.363932 0.375904 0.818916 00:09
3 0.349442 0.365052 0.823986 00:09
4 0.344509 0.363027 0.830323 00:09
| �[0m 50      �[0m | �[0m 0.8303  �[0m | �[0m 4.946   �[0m | �[0m 1.246   �[0m | �[0m 1.589   �[0m |
epoch train_loss valid_loss accuracy time
0 0.388522 0.431909 0.820976 00:07
1 0.372532 0.448644 0.751109 00:07
2 0.358823 0.373322 0.823669 00:07
3 0.352838 0.362424 0.831591 00:07
4 0.352949 0.361356 0.831432 00:07
| �[95m 51      �[0m | �[95m 0.8314  �[0m | �[95m 5.664   �[0m | �[95m 2.626   �[0m | �[95m 0.003048�[0m |
epoch train_loss valid_loss accuracy time
0 0.389195 0.390032 0.817332 00:06
1 0.369993 0.382199 0.819708 00:07
2 0.362801 0.373282 0.826521 00:06
3 0.359760 0.363597 0.824303 00:06
4 0.344525 0.362097 0.828897 00:07
| �[0m 52      �[0m | �[0m 0.8289  �[0m | �[0m 1.287   �[0m | �[0m 3.505   �[0m | �[0m 0.06804 �[0m |
epoch train_loss valid_loss accuracy time
0 0.394940 0.403165 0.814639 00:06
1 0.371518 0.452118 0.806876 00:06
2 0.364734 0.377214 0.824461 00:06
3 0.347968 0.365335 0.823511 00:07
4 0.345476 0.363670 0.827155 00:07
| �[0m 53      �[0m | �[0m 0.8272  �[0m | �[0m 1.606   �[0m | �[0m 7.998   �[0m | �[0m 0.2009  �[0m |
epoch train_loss valid_loss accuracy time
0 0.500322 0.519261 0.757129 00:10
1 0.413270 0.423630 0.801965 00:11
2 0.380234 0.395588 0.813371 00:12
3 0.361677 0.378123 0.817174 00:12
4 0.374629 0.373772 0.820025 00:12
| �[0m 54      �[0m | �[0m 0.82    �[0m | �[0m 4.579   �[0m | �[0m 5.017   �[0m | �[0m 2.928   �[0m |
| �[0m 55      �[0m | �[0m 0.8259  �[0m | �[0m 0.02565 �[0m | �[0m 3.699   �[0m | �[0m 0.9808  �[0m |
epoch train_loss valid_loss accuracy time
0 0.452787 0.443697 0.768695 00:11
1 0.428332 0.415454 0.800697 00:11
2 0.396522 0.402850 0.807668 00:12
3 0.424802 0.414648 0.783587 00:12
4 0.385055 0.392359 0.801489 00:12
| �[0m 56      �[0m | �[0m 0.8015  �[0m | �[0m 1.927   �[0m | �[0m 5.92    �[0m | �[0m 2.53    �[0m |
epoch train_loss valid_loss accuracy time
0 0.435597 0.438222 0.810836 00:19
1 0.399920 0.531189 0.770754 00:19
2 0.403408 0.409382 0.804816 00:18
3 0.363519 0.383823 0.815906 00:19
4 0.360030 0.377621 0.819708 00:19
| �[0m 57      �[0m | �[0m 0.8197  �[0m | �[0m 0.7796  �[0m | �[0m 4.576   �[0m | �[0m 3.952   �[0m |
epoch train_loss valid_loss accuracy time
0 0.388445 0.420243 0.800539 00:07
1 0.372912 0.369659 0.827630 00:07
2 0.354443 0.366757 0.828105 00:07
3 0.352468 0.366038 0.822560 00:07
4 0.347822 0.362001 0.829690 00:07
| �[0m 58      �[0m | �[0m 0.8297  �[0m | �[0m 3.525   �[0m | �[0m 4.198   �[0m | �[0m 0.02314 �[0m |
epoch train_loss valid_loss accuracy time
0 0.444627 0.431739 0.787072 00:12
1 0.392637 0.412985 0.799747 00:12
2 0.369733 0.396133 0.802440 00:12
3 0.365821 0.373095 0.820342 00:12
4 0.370486 0.371560 0.819392 00:12
| �[0m 59      �[0m | �[0m 0.8194  �[0m | �[0m 6.711   �[0m | �[0m 3.848   �[0m | �[0m 2.395   �[0m |
epoch train_loss valid_loss accuracy time
0 0.396831 0.389045 0.809569 00:07
1 0.371171 0.375065 0.818600 00:07
2 0.350309 0.371795 0.824620 00:07
3 0.359700 0.363041 0.828739 00:07
4 0.345735 0.361556 0.830799 00:07
| �[0m 60      �[0m | �[0m 0.8308  �[0m | �[0m 4.914   �[0m | �[0m 7.944   �[0m | �[0m 0.9998  �[0m |
epoch train_loss valid_loss accuracy time
0 0.422853 0.412691 0.804341 00:09
1 0.375209 0.394692 0.817174 00:09
2 0.365574 0.380376 0.820184 00:08
3 0.359143 0.363607 0.831115 00:08
4 0.347991 0.361650 0.827947 00:08
| �[0m 61      �[0m | �[0m 0.8279  �[0m | �[0m 7.962   �[0m | �[0m 6.151   �[0m | �[0m 1.119   �[0m |
epoch train_loss valid_loss accuracy time
0 0.388147 0.405885 0.810678 00:07
1 0.367743 0.391867 0.807826 00:07
2 0.366964 0.362980 0.828739 00:07
3 0.363402 0.363396 0.829531 00:07
4 0.351094 0.362245 0.829214 00:07
| �[0m 62      �[0m | �[0m 0.8292  �[0m | �[0m 2.583   �[0m | �[0m 6.996   �[0m | �[0m 0.008348�[0m |
epoch train_loss valid_loss accuracy time
0 0.432724 0.390516 0.815114 00:11
1 0.407100 0.401564 0.811153 00:11
2 0.359414 0.384463 0.820342 00:11
3 0.358061 0.371844 0.826362 00:12
4 0.345357 0.362986 0.831115 00:12
| �[0m 63      �[0m | �[0m 0.8311  �[0m | �[0m 0.2249  �[0m | �[0m 8.0     �[0m | �[0m 2.823   �[0m |
epoch train_loss valid_loss accuracy time
0 0.418990 0.420463 0.808618 00:07
1 0.389830 0.398110 0.816223 00:08
2 0.382975 0.387620 0.814956 00:06
3 0.384093 0.379607 0.819392 00:06
4 0.358019 0.371140 0.823828 00:08
| �[0m 64      �[0m | �[0m 0.8238  �[0m | �[0m 5.764   �[0m | �[0m 5.509   �[0m | �[0m 1.482   �[0m |
epoch train_loss valid_loss accuracy time
0 0.385147 0.399680 0.806242 00:09
1 0.376032 0.381131 0.822560 00:09
2 0.363870 0.378227 0.822402 00:09
3 0.351089 0.368790 0.826838 00:09
4 0.340404 0.361807 0.829214 00:09
| �[0m 65      �[0m | �[0m 0.8292  �[0m | �[0m 1.048   �[0m | �[0m 2.939   �[0m | �[0m 1.922   �[0m |
epoch train_loss valid_loss accuracy time
0 0.526628 0.507236 0.746673 00:17
1 0.460229 0.455675 0.765684 00:19
2 0.417427 0.421368 0.785963 00:19
3 0.462800 0.458844 0.773923 00:19
4 0.449479 0.456627 0.783587 00:19
| �[0m 66      �[0m | �[0m 0.7836  �[0m | �[0m 3.68    �[0m | �[0m 3.977   �[0m | �[0m 3.919   �[0m |
epoch train_loss valid_loss accuracy time
0 0.621678 0.559080 0.749049 00:10
1 0.457104 0.473610 0.758397 00:11
2 0.416287 0.416622 0.764575 00:13
3 0.388107 0.403844 0.811945 00:13
4 0.384231 0.396397 0.813055 00:13
| �[0m 67      �[0m | �[0m 0.8131  �[0m | �[0m 5.907   �[0m | �[0m 0.9452  �[0m | �[0m 2.168   �[0m |
epoch train_loss valid_loss accuracy time
0 0.410105 0.416171 0.808618 00:11
1 0.381669 0.400109 0.809094 00:13
2 0.377539 0.403879 0.803074 00:13
3 0.374653 0.389122 0.808618 00:13
4 0.366356 0.380526 0.814005 00:13
| �[0m 68      �[0m | �[0m 0.814   �[0m | �[0m 7.981   �[0m | �[0m 2.796   �[0m | �[0m 2.78    �[0m |
epoch train_loss valid_loss accuracy time
0 0.380539 0.420856 0.815589 00:07
1 0.368232 0.424276 0.803866 00:05
2 0.358194 0.378501 0.827155 00:06
3 0.353427 0.362224 0.829690 00:07
4 0.344443 0.361554 0.826838 00:07
| �[0m 69      �[0m | �[0m 0.8268  �[0m | �[0m 7.223   �[0m | �[0m 3.762   �[0m | �[0m 0.6961  �[0m |
epoch train_loss valid_loss accuracy time
0 0.445924 0.488581 0.752693 00:17
1 0.410709 0.400962 0.813688 00:18
2 0.373518 0.393235 0.820184 00:18
3 0.364160 0.378920 0.820817 00:17
4 0.357551 0.371629 0.825412 00:17
| �[0m 70      �[0m | �[0m 0.8254  �[0m | �[0m 0.009375�[0m | �[0m 5.081   �[0m | �[0m 3.79    �[0m |
epoch train_loss valid_loss accuracy time
0 0.410323 0.442670 0.814797 00:08
1 0.379279 0.405500 0.807034 00:08
2 0.387576 0.392448 0.819708 00:09
3 0.371622 0.389167 0.823035 00:09
4 0.374182 0.386964 0.825095 00:09
| �[0m 71      �[0m | �[0m 0.8251  �[0m | �[0m 3.293   �[0m | �[0m 2.76    �[0m | �[0m 1.061   �[0m |
| �[0m 72      �[0m | �[0m 0.8308  �[0m | �[0m 4.589   �[0m | �[0m 7.13    �[0m | �[0m 0.003179�[0m |
epoch train_loss valid_loss accuracy time
0 0.391509 0.399966 0.806400 00:06
1 0.366694 0.405719 0.823828 00:07
2 0.359751 0.375496 0.822877 00:07
3 0.347678 0.361711 0.830799 00:07
4 0.336896 0.361922 0.828580 00:07
| �[0m 73      �[0m | �[0m 0.8286  �[0m | �[0m 7.118   �[0m | �[0m 5.204   �[0m | �[0m 0.5939  �[0m |
epoch train_loss valid_loss accuracy time
0 0.435485 0.414883 0.808143 00:08
1 0.373591 0.417138 0.814005 00:09
2 0.369590 0.375724 0.820184 00:09
3 0.370829 0.368655 0.829531 00:09
4 0.346463 0.366307 0.825412 00:09
| �[0m 74      �[0m | �[0m 0.8254  �[0m | �[0m 6.837   �[0m | �[0m 7.988   �[0m | �[0m 1.055   �[0m |
| �[0m 75      �[0m | �[0m 0.8259  �[0m | �[0m 0.6629  �[0m | �[0m 7.012   �[0m | �[0m 0.03222 �[0m |
| �[0m 76      �[0m | �[0m 0.8311  �[0m | �[0m 5.177   �[0m | �[0m 1.457   �[0m | �[0m 0.5857  �[0m |
epoch train_loss valid_loss accuracy time
0 0.455741 0.530008 0.794994 00:11
1 0.421961 0.423317 0.805292 00:11
2 0.405799 0.405729 0.807351 00:12
3 0.383895 0.395092 0.816857 00:12
4 0.378882 0.386044 0.818758 00:12
| �[0m 77      �[0m | �[0m 0.8188  �[0m | �[0m 3.308   �[0m | �[0m 4.533   �[0m | �[0m 2.048   �[0m |
epoch train_loss valid_loss accuracy time
0 0.412261 0.410950 0.798954 00:08
1 0.382314 0.408471 0.803390 00:09
2 0.347488 0.387500 0.815273 00:09
3 0.343408 0.372050 0.821451 00:09
4 0.344963 0.366158 0.822719 00:09
| �[0m 78      �[0m | �[0m 0.8227  �[0m | �[0m 0.4036  �[0m | �[0m 7.997   �[0m | �[0m 1.439   �[0m |
epoch train_loss valid_loss accuracy time
0 0.412856 0.410016 0.799113 00:08
1 0.410852 0.416405 0.788498 00:08
2 0.373897 0.384385 0.824303 00:09
3 0.353164 0.366129 0.822719 00:09
4 0.353253 0.362269 0.826362 00:09
| �[0m 79      �[0m | �[0m 0.8264  �[0m | �[0m 3.438   �[0m | �[0m 7.982   �[0m | �[0m 1.829   �[0m |
epoch train_loss valid_loss accuracy time
0 0.419316 0.408936 0.798162 00:08
1 0.393826 0.390526 0.820184 00:08
2 0.372879 0.374823 0.822719 00:08
3 0.358019 0.370913 0.820342 00:08
4 0.346020 0.362252 0.829690 00:08
| �[0m 80      �[0m | �[0m 0.8297  �[0m | �[0m 6.88    �[0m | �[0m 2.404   �[0m | �[0m 1.666   �[0m |
epoch train_loss valid_loss accuracy time
0 0.433481 0.437320 0.790082 00:18
1 0.415280 0.402946 0.814164 00:18
2 0.365575 0.376285 0.822877 00:18
3 0.363206 0.371865 0.820501 00:18
4 0.356401 0.370252 0.823828 00:18
| �[0m 81      �[0m | �[0m 0.8238  �[0m | �[0m 0.03221 �[0m | �[0m 1.306   �[0m | �[0m 3.909   �[0m |
epoch train_loss valid_loss accuracy time
0 0.393150 0.420964 0.783745 00:07
1 0.375065 0.371380 0.823986 00:07
2 0.362952 0.387037 0.813688 00:07
3 0.347245 0.370225 0.824937 00:07
4 0.348406 0.361420 0.830640 00:07
| �[0m 82      �[0m | �[0m 0.8306  �[0m | �[0m 1.575   �[0m | �[0m 2.689   �[0m | �[0m 0.8684  �[0m |
epoch train_loss valid_loss accuracy time
0 0.395530 0.397430 0.818600 00:06
1 0.358679 0.396773 0.818283 00:07
2 0.349305 0.372877 0.823828 00:07
3 0.347346 0.363006 0.828422 00:07
4 0.335652 0.362567 0.830957 00:07
| �[0m 83      �[0m | �[0m 0.831   �[0m | �[0m 2.765   �[0m | �[0m 5.439   �[0m | �[0m 0.04047 �[0m |
epoch train_loss valid_loss accuracy time
0 0.380831 0.386072 0.814322 00:05
1 0.369778 0.392521 0.810520 00:07
2 0.368286 0.383131 0.816857 00:07
3 0.356585 0.367839 0.821768 00:07
4 0.344722 0.366639 0.825253 00:07
| �[0m 84      �[0m | �[0m 0.8253  �[0m | �[0m 0.1961  �[0m | �[0m 4.123   �[0m | �[0m 0.02039 �[0m |
epoch train_loss valid_loss accuracy time
0 0.502605 0.463703 0.772180 00:11
1 0.407231 0.404386 0.804499 00:13
2 0.406849 0.411254 0.817491 00:12
3 0.379910 0.389118 0.817174 00:12
4 0.370964 0.379835 0.821293 00:12
| �[0m 85      �[0m | �[0m 0.8213  �[0m | �[0m 7.937   �[0m | �[0m 7.939   �[0m | �[0m 2.895   �[0m |
epoch train_loss valid_loss accuracy time
0 0.402574 0.431237 0.786755 00:11
1 0.380494 0.392253 0.817966 00:11
2 0.363284 0.393907 0.815748 00:11
3 0.355680 0.368488 0.822560 00:12
4 0.362978 0.367755 0.823511 00:12
| �[0m 86      �[0m | �[0m 0.8235  �[0m | �[0m 0.06921 �[0m | �[0m 5.7     �[0m | �[0m 2.778   �[0m |
epoch train_loss valid_loss accuracy time
0 0.448572 0.451255 0.789766 00:10
1 0.417093 0.411838 0.808143 00:11
2 0.400799 0.400185 0.816223 00:12
3 0.378641 0.385082 0.820342 00:12
4 0.365278 0.380320 0.818441 00:11
| �[0m 87      �[0m | �[0m 0.8184  �[0m | �[0m 7.965   �[0m | �[0m 5.261   �[0m | �[0m 2.661   �[0m |
epoch train_loss valid_loss accuracy time
0 0.399453 0.429288 0.783112 00:07
1 0.368987 0.376985 0.825729 00:07
2 0.358226 0.372103 0.830165 00:07
3 0.348940 0.362069 0.831115 00:07
4 0.341437 0.361470 0.830482 00:07
| �[0m 88      �[0m | �[0m 0.8305  �[0m | �[0m 2.792   �[0m | �[0m 7.917   �[0m | �[0m 0.761   �[0m |
| �[0m 89      �[0m | �[0m 0.8294  �[0m | �[0m 7.995   �[0m | �[0m 7.186   �[0m | �[0m 0.1199  �[0m |
epoch train_loss valid_loss accuracy time
0 0.390914 0.403780 0.799905 00:07
1 0.375241 0.400406 0.821610 00:07
2 0.359710 0.373233 0.826046 00:07
3 0.351108 0.367255 0.823986 00:07
4 0.348230 0.362827 0.830482 00:07
| �[0m 90      �[0m | �[0m 0.8305  �[0m | �[0m 2.526   �[0m | �[0m 3.741   �[0m | �[0m 0.1186  �[0m |
| �[0m 91      �[0m | �[0m 0.8294  �[0m | �[0m 0.03285 �[0m | �[0m 5.742   �[0m | �[0m 0.9747  �[0m |
epoch train_loss valid_loss accuracy time
0 0.384485 0.413711 0.816857 00:06
1 0.358383 0.382125 0.813055 00:06
2 0.349632 0.380644 0.825570 00:07
3 0.350005 0.363857 0.833492 00:07
4 0.341342 0.362533 0.829848 00:07
| �[0m 92      �[0m | �[0m 0.8298  �[0m | �[0m 4.112e-0�[0m | �[0m 8.0     �[0m | �[0m 0.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.461806 0.454314 0.752218 00:11
1 0.429429 0.452721 0.768061 00:13
2 0.408792 0.422315 0.785013 00:12
3 0.400290 0.411134 0.805292 00:12
4 0.396974 0.409337 0.804658 00:12
| �[0m 93      �[0m | �[0m 0.8047  �[0m | �[0m 4.563   �[0m | �[0m 0.6868  �[0m | �[0m 2.461   �[0m |
epoch train_loss valid_loss accuracy time
0 0.464920 0.440214 0.772655 00:11
1 0.401611 0.400240 0.805767 00:11
2 0.391905 0.414512 0.798638 00:13
3 0.391561 0.407271 0.805608 00:13
4 0.387596 0.397586 0.814639 00:13
| �[0m 94      �[0m | �[0m 0.8146  �[0m | �[0m 4.697   �[0m | �[0m 3.412   �[0m | �[0m 2.514   �[0m |
epoch train_loss valid_loss accuracy time
0 0.417917 0.423588 0.755070 00:05
1 0.390158 0.396916 0.817649 00:07
2 0.374562 0.389764 0.828422 00:07
3 0.368317 0.385268 0.822719 00:07
4 0.373367 0.385092 0.827471 00:07
| �[0m 95      �[0m | �[0m 0.8275  �[0m | �[0m 5.04    �[0m | �[0m 0.4492  �[0m | �[0m 0.5899  �[0m |
epoch train_loss valid_loss accuracy time
0 0.368991 0.388251 0.822402 00:07
1 0.363111 0.399670 0.814797 00:08
2 0.370390 0.377726 0.818441 00:10
3 0.353493 0.368603 0.823828 00:10
4 0.355493 0.368281 0.823669 00:09
| �[0m 96      �[0m | �[0m 0.8237  �[0m | �[0m 0.6025  �[0m | �[0m 2.712   �[0m | �[0m 1.166   �[0m |
epoch train_loss valid_loss accuracy time
0 0.426161 0.417567 0.793726 00:07
1 0.397633 0.392852 0.818124 00:08
2 0.371073 0.371678 0.821768 00:09
3 0.351716 0.361607 0.831749 00:10
4 0.347291 0.359881 0.832066 00:09
| �[95m 97      �[0m | �[95m 0.8321  �[0m | �[95m 6.389   �[0m | �[95m 3.648   �[0m | �[95m 1.016   �[0m |
epoch train_loss valid_loss accuracy time
0 0.408937 0.440755 0.785805 00:12
1 0.369800 0.379622 0.819867 00:12
2 0.356048 0.370119 0.827630 00:11
3 0.354271 0.366091 0.826362 00:11
4 0.359324 0.366020 0.827313 00:12
| �[0m 98      �[0m | �[0m 0.8273  �[0m | �[0m 0.5927  �[0m | �[0m 1.715   �[0m | �[0m 2.847   �[0m |
epoch train_loss valid_loss accuracy time
0 0.443831 0.469775 0.753802 00:11
1 0.411636 0.423362 0.787864 00:11
2 0.411669 0.421874 0.797529 00:13
3 0.392242 0.395377 0.816382 00:12
4 0.383947 0.390270 0.818441 00:12
| �[0m 99      �[0m | �[0m 0.8184  �[0m | �[0m 6.602   �[0m | �[0m 2.266   �[0m | �[0m 2.535   �[0m |
epoch train_loss valid_loss accuracy time
0 0.413688 0.469193 0.811153 00:08
1 0.383908 0.401520 0.815589 00:08
2 0.363908 0.375178 0.824937 00:08
3 0.360694 0.366536 0.828739 00:08
4 0.341851 0.362886 0.830165 00:08
| �[0m 100     �[0m | �[0m 0.8302  �[0m | �[0m 3.099   �[0m | �[0m 6.058   �[0m | �[0m 1.058   �[0m |
epoch train_loss valid_loss accuracy time
0 0.404324 0.402310 0.807668 00:09
1 0.383628 0.428126 0.780577 00:09
2 0.360917 0.375198 0.826996 00:09
3 0.349922 0.367114 0.828264 00:09
4 0.338715 0.363762 0.830165 00:09
| �[0m 101     �[0m | �[0m 0.8302  �[0m | �[0m 4.268   �[0m | �[0m 5.28    �[0m | �[0m 1.397   �[0m |
epoch train_loss valid_loss accuracy time
0 0.381317 0.410546 0.815431 00:06
1 0.355632 0.424577 0.802440 00:06
2 0.361193 0.377675 0.818124 00:07
3 0.355751 0.363205 0.827313 00:07
4 0.338802 0.362403 0.827471 00:07
| �[0m 102     �[0m | �[0m 0.8275  �[0m | �[0m 4.735   �[0m | �[0m 3.398   �[0m | �[0m 0.01904 �[0m |
epoch train_loss valid_loss accuracy time
0 0.381150 0.398499 0.813213 00:07
1 0.370997 0.413040 0.798162 00:07
2 0.364154 0.369066 0.823352 00:07
3 0.354434 0.362925 0.825095 00:07
4 0.348736 0.362125 0.824461 00:07
| �[0m 103     �[0m | �[0m 0.8245  �[0m | �[0m 0.0     �[0m | �[0m 6.371   �[0m | �[0m 9.721e-0�[0m |
epoch train_loss valid_loss accuracy time
0 0.388378 0.412839 0.810361 00:06
1 0.377533 0.380432 0.822560 00:06
2 0.356242 0.372250 0.824937 00:07
3 0.346274 0.364236 0.830323 00:07
4 0.349273 0.362597 0.830482 00:07
| �[0m 104     �[0m | �[0m 0.8305  �[0m | �[0m 1.683   �[0m | �[0m 6.381   �[0m | �[0m 0.8564  �[0m |
epoch train_loss valid_loss accuracy time
0 0.410286 0.413506 0.801648 00:07
1 0.386787 0.397328 0.812262 00:07
2 0.365268 0.390419 0.825570 00:07
3 0.368566 0.386955 0.828264 00:07
4 0.362751 0.383124 0.830482 00:07
| �[0m 105     �[0m | �[0m 0.8305  �[0m | �[0m 6.387   �[0m | �[0m 2.333   �[0m | �[0m 0.4877  �[0m |
epoch train_loss valid_loss accuracy time
0 0.432347 0.450028 0.780894 00:17
1 0.414766 0.402565 0.809569 00:17
2 0.382495 0.382281 0.812421 00:18
3 0.366852 0.373158 0.822085 00:18
4 0.353692 0.368471 0.824461 00:18
| �[0m 106     �[0m | �[0m 0.8245  �[0m | �[0m 0.9452  �[0m | �[0m 6.902   �[0m | �[0m 3.991   �[0m |
epoch train_loss valid_loss accuracy time
0 0.397604 0.407966 0.814797 00:06
1 0.386755 0.382698 0.800063 00:07
2 0.361672 0.373610 0.823194 00:07
3 0.345655 0.363278 0.829848 00:07
4 0.349931 0.362171 0.830957 00:07
| �[0m 107     �[0m | �[0m 0.831   �[0m | �[0m 5.475   �[0m | �[0m 5.721   �[0m | �[0m 0.02659 �[0m |
| �[0m 108     �[0m | �[0m 0.8308  �[0m | �[0m 4.597   �[0m | �[0m 7.994   �[0m | �[0m 0.1318  �[0m |
epoch train_loss valid_loss accuracy time
0 0.391720 0.395161 0.819550 00:06
1 0.372859 0.387859 0.810203 00:07
2 0.367385 0.376952 0.812262 00:07
3 0.346810 0.365312 0.827155 00:07
4 0.341221 0.363528 0.829056 00:07
| �[0m 109     �[0m | �[0m 0.8291  �[0m | �[0m 2.391   �[0m | �[0m 4.915   �[0m | �[0m 0.8695  �[0m |
epoch train_loss valid_loss accuracy time
0 0.391635 0.402138 0.804658 00:08
1 0.402425 0.468013 0.803866 00:09
2 0.368443 0.372927 0.823669 00:09
3 0.361579 0.364913 0.826679 00:09
4 0.347989 0.363416 0.828580 00:09
| �[0m 110     �[0m | �[0m 0.8286  �[0m | �[0m 5.711   �[0m | �[0m 7.031   �[0m | �[0m 1.157   �[0m |
epoch train_loss valid_loss accuracy time
0 0.393225 0.414461 0.812738 00:07
1 0.377641 0.381910 0.820659 00:07
2 0.363979 0.375042 0.826838 00:07
3 0.349867 0.363547 0.830165 00:07
4 0.350035 0.364254 0.830006 00:07
| �[0m 111     �[0m | �[0m 0.83    �[0m | �[0m 3.867   �[0m | �[0m 7.351   �[0m | �[0m 0.7373  �[0m |
| �[0m 112     �[0m | �[0m 0.8275  �[0m | �[0m 5.568   �[0m | �[0m 0.8565  �[0m | �[0m 0.9522  �[0m |
| �[0m 113     �[0m | �[0m 0.8311  �[0m | �[0m 5.553   �[0m | �[0m 1.45    �[0m | �[0m 0.0     �[0m |
epoch train_loss valid_loss accuracy time
0 0.404215 0.421293 0.817807 00:10
1 0.380753 0.388533 0.803074 00:09
2 0.362475 0.381199 0.820342 00:09
3 0.353416 0.368431 0.824620 00:09
4 0.335854 0.365448 0.825570 00:09
| �[0m 114     �[0m | �[0m 0.8256  �[0m | �[0m 3.949   �[0m | �[0m 1.125   �[0m | �[0m 1.109   �[0m |
epoch train_loss valid_loss accuracy time
0 0.397351 0.482881 0.766160 00:08
1 0.373190 0.372999 0.819075 00:08
2 0.354075 0.368455 0.826046 00:08
3 0.350617 0.362793 0.830323 00:08
4 0.347381 0.361980 0.831274 00:08
| �[0m 115     �[0m | �[0m 0.8313  �[0m | �[0m 7.258   �[0m | �[0m 4.724   �[0m | �[0m 1.639   �[0m |
=============================================================
optimizer.max['target']
0.8428390622138977
{key: 2**int(value)
  for key, value in optimizer.max['params'].items()}
{'pow_n_a': 2, 'pow_n_d': 64, 'pow_n_steps': 1}

Out of memory dataset

If your dataset is so big it doesn't fit in memory, you can load a chunk of it each epoch.

df = pd.read_csv(path/'adult.csv')
df_main,df_valid = df.iloc[:-1000].copy(),df.iloc[-1000:].copy()
# choose size that fit in memory
dataset_size = 1000
# load chunk with your own code
def load_chunk():
    return df_main.sample(dataset_size).copy()
df_small = load_chunk()
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 
             'relationship', 'race', 'native-country', 'sex']
cont_names = ['age', 'fnlwgt', 'education-num']
procs = [Categorify, FillMissing, Normalize]
splits = RandomSplitter()(range_of(df_small))
to = TabularPandas(df_small, procs, cat_names, cont_names, y_names="salary", y_block = CategoryBlock(), 
                   splits=None, do_setup=True)
# save the validation set
to_valid = to.new(df_valid)
to_valid.process()
val_dl = TabDataLoader(to_valid.train)
len(to.train)
1000
class ReloadCallback(Callback):
    def begin_epoch(self): 
        df_small = load_chunk()
        to_new = to.new(df_small)
        to_new.process()
        trn_dl = TabDataLoader(to_new.train)
        self.learn.dls = DataLoaders(trn_dl, val_dl).cuda()
dls = to.dataloaders()
emb_szs = get_emb_sz(to)
model = TabNetModel(emb_szs, len(to.cont_names), dls.c, n_d=8, n_a=32, n_steps=1); 
opt_func = partial(Adam, wd=0.01, eps=1e-5)
learn = Learner(dls, model, CrossEntropyLossFlat(), opt_func=opt_func, lr=3e-2, metrics=[accuracy])
learn.add_cb(ReloadCallback());
learn.fit_one_cycle(10)
epoch train_loss valid_loss accuracy time
0 0.587740 0.550544 0.756000 00:01
1 0.545411 0.515772 0.782000 00:01
2 0.484289 0.468586 0.813000 00:01
3 0.447111 0.435774 0.817000 00:01
4 0.449050 0.394715 0.819000 00:01
5 0.428863 0.382005 0.835000 00:01
6 0.382100 0.404258 0.826000 00:01
7 0.383915 0.376179 0.833000 00:01
8 0.389460 0.367857 0.834000 00:01
9 0.376486 0.367577 0.834000 00:01
Owner
Mikhail Grankin
Mikhail Grankin
Python binding for Khiva library.

Khiva-Python Build Documentation Build Linux and Mac OS Build Windows Code Coverage README This is the Khiva Python binding, it allows the usage of Kh

Shapelets 46 Oct 16, 2022
A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB, or simply to separate onnx files to any size you want.

sne4onnx A very simple tool for situations where optimization with onnx-simplifier would exceed the Protocol Buffers upper file size limit of 2GB, or

Katsuya Hyodo 10 Aug 30, 2022
GLIP: Grounded Language-Image Pre-training

GLIP: Grounded Language-Image Pre-training Updates 12/06/2021: GLIP paper on arxiv https://arxiv.org/abs/2112.03857. Code and Model are under internal

Microsoft 862 Jan 01, 2023
Instance-based label smoothing for improving deep neural networks generalization and calibration

Instance-based Label Smoothing for Neural Networks Pytorch Implementation of the algorithm. This repository includes a new proposed method for instanc

Mohamed Maher 1 Aug 13, 2022
"Domain Adaptive Semantic Segmentation without Source Data" (ACM MM 2021)

LDBE Pytorch implementation for two papers (the paper will be released soon): "Domain Adaptive Semantic Segmentation without Source Data", ACM MM2021.

benfour 16 Sep 28, 2022
A basic duplicate image detection service using perceptual image hash functions and nearest neighbor search, implemented using faiss, fastapi, and imagehash

Duplicate Image Detection Getting Started Install dependencies pip install -r requirements.txt Run service python main.py Testing Test with pytest How

Matthew Podolak 21 Nov 11, 2022
Face Library is an open source package for accurate and real-time face detection and recognition

Face Library Face Library is an open source package for accurate and real-time face detection and recognition. The package is built over OpenCV and us

52 Nov 09, 2022
Real time sign language recognition

The proposed work aims at converting american sign language gestures into English that can be understood by everyone in real time.

Mohit Kaushik 6 Jun 13, 2022
This program generates a random 12 digit/character password (upper and lowercase) and stores it in a file along with your username and app/website.

PasswordGeneratorAndVault This program generates a random 12 digit/character password (upper and lowercase) and stores it in a file along with your us

Chris 1 Feb 26, 2022
Build fully-functioning computer vision models with PyTorch

Detecto is a Python package that allows you to build fully-functioning computer vision and object detection models with just 5 lines of code. Inferenc

Alan Bi 576 Dec 29, 2022
根据midi文件演奏“风物之诗琴”的脚本 "Windsong Lyre" auto play

Genshin-lyre-auto-play 简体中文 | English 简介 根据midi文件演奏“风物之诗琴”的脚本。由Python驱动,在此承诺, ⚠️ 项目内绝不含任何能够引起安全问题的代码。 前排提示:所有键盘在动但是原神没反应的都是因为没有管理员权限,双击run.bat或者以管理员模式

御坂17032号 386 Jan 01, 2023
Automated image registration. Registrationimation was too much of a mouthful.

alignimation Automated image registration. Registrationimation was too much of a mouthful. This repo contains the code used for my blog post Alignimat

Ethan Rosenthal 9 Oct 13, 2022
Generative Art Using Neural Visual Grammars and Dual Encoders

Generative Art Using Neural Visual Grammars and Dual Encoders Arnheim 1 The original algorithm from the paper Generative Art Using Neural Visual Gramm

DeepMind 231 Jan 05, 2023
Photographic Image Synthesis with Cascaded Refinement Networks - Pytorch Implementation

Photographic Image Synthesis with Cascaded Refinement Networks-Pytorch (https://arxiv.org/abs/1707.09405) This is a Pytorch implementation of cascaded

Soumya Tripathy 63 Mar 27, 2022
natural image generation using ConvNets

The Eyescream Project Generating Natural Images using Neural Networks. For our research summary on this work, please read the Arxiv paper: http://arxi

Meta Archive 601 Nov 23, 2022
High performance distributed framework for training deep learning recommendation models based on PyTorch.

High performance distributed framework for training deep learning recommendation models based on PyTorch.

340 Dec 30, 2022
Basit bir burç modülü.

Bu modulu burclar hakkinda gundelik bir sekilde bilgi alin diye yaptim ve sizler icin kullanima sunuyorum. Modulun kullanimi asiri basit: Ornek Kullan

Special 17 Jun 08, 2022
JFB: Jacobian-Free Backpropagation for Implicit Models

JFB: Jacobian-Free Backpropagation for Implicit Models

Typal Research 28 Dec 11, 2022
【CVPR 2021, Variational Inference Framework, PyTorch】 From Rain Generation to Rain Removal

From Rain Generation to Rain Removal (CVPR2021) Hong Wang, Zongsheng Yue, Qi Xie, Qian Zhao, Yefeng Zheng, and Deyu Meng [PDF&&Supplementary Material]

Hong Wang 48 Nov 23, 2022
Training PSPNet in Tensorflow. Reproduce the performance from the paper.

Training Reproduce of PSPNet. (Updated 2021/04/09. Authors of PSPNet have provided a Pytorch implementation for PSPNet and their new work with support

Li Xuhong 126 Jul 13, 2022