General process of machine learning training and parameter optimization (discussion)

2022-07-06 02:13:00 【Min fan】

Abstract: In practical machine learning applications, we need not only to train a model but also to control some of its input parameters. This article describes the general process and is for reference only.

1. Training machine learning models

For an input with $m$ features and an output that serves as a decision indicator, a machine learning model can be viewed as a mapping
$f: \mathbb{R}^m \to \mathbb{R} \tag{1}$
where $\mathbb{R}$ is the set of real numbers. If each feature has its own value range, the model can instead be expressed as
$f: \prod_{i=1}^m \mathbf{V}_i \to \mathbb{R} \tag{2}$
where $\mathbf{V}_i$ is the value range of the $i$-th feature.
For simplicity, only the model corresponding to (1) is discussed below.
Given a feature matrix of $n$ instances, $\mathbf{X} = [\mathbf{x}_1, \dots, \mathbf{x}_n]^{\mathrm{T}} \in \mathbb{R}^{n \times m}$, and the corresponding label vector $\mathbf{Y} \in \mathbb{R}^n$, the optimization objective of machine learning can generally be expressed as
$\min_f \mathcal{L}(f(\mathbf{X}), \mathbf{Y}) + R(f) \tag{3}$
where $f(\mathbf{X}) = [f(\mathbf{x}_1), \dots, f(\mathbf{x}_n)]^{\mathrm{T}}$ is the vector of predicted labels, $\mathcal{L}$ is the loss function, and $R(f)$ is a regularization term on the parameters of $f$. If the objective is convex, gradient descent can be used to find a globally optimal solution efficiently. As for the regularization term:

If $f$ is a linear model, the regularizer can be the 1-norm, the 2-norm, the nuclear norm, and so on. Its purpose is to prevent overfitting.
If $f$ is a neural network, techniques such as dropout can be used to prevent overfitting.
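
As an illustration of objective (3), here is a minimal sketch. It assumes a linear model $f(\mathbf{x}) = \mathbf{w}^{\mathrm{T}}\mathbf{x} + b$, a mean squared error loss, and a squared 2-norm regularizer $R(f) = \lambda \|\mathbf{w}\|_2^2$. The function name `fit_ridge`, the synthetic data, and the values of `lam`, `lr`, and `steps` are illustrative assumptions and are not part of the original discussion.

```python
import random

def fit_ridge(X, Y, lam=0.1, lr=0.05, steps=2000):
    """Minimize (1/n) * sum_i (w.x_i + b - y_i)^2 + lam * ||w||_2^2 by gradient descent."""
    n, m = len(X), len(X[0])
    w, b = [0.0] * m, 0.0
    for _ in range(steps):
        # Residuals f(x_i) - y_i under the current parameters.
        res = [sum(wj * xj for wj, xj in zip(w, x)) + b - y for x, y in zip(X, Y)]
        # Gradient of the loss term plus gradient of the 2-norm regularizer.
        grad_w = [2.0 / n * sum(r * x[j] for r, x in zip(res, X)) + 2.0 * lam * w[j]
                  for j in range(m)]
        grad_b = 2.0 / n * sum(res)
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Synthetic data drawn from y = 2*x1 - 3*x2 + 1 plus small noise (illustrative only).
rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
Y = [2 * x[0] - 3 * x[1] + 1 + rng.gauss(0, 0.01) for x in X]

w, b = fit_ridge(X, Y, lam=0.001)
```

Because this objective is convex, plain gradient descent with a small enough step size converges to the global minimizer. A larger `lam` shrinks `w` toward zero, trading a slightly worse fit on the training data for less overfitting.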

2. Parameter optimization method

In some practical problems, some of the input features are objective (given), while others are controllable. Without loss of generality, let the first $m_1$ features be objective and the last $m_2$ features be controllable (so we also call them parameters), with $m_1 + m_2 = m$. Suppose a reliable machine learning model $f$ has been trained on a large amount of data, and we want to maximize the decision indicator. Given the objective feature vector $\mathbf{x}_b \in \mathbb{R}^{m_1}$, the objective of parameter optimization is
$\argmax_{\mathbf{x}_u \in \mathbb{R}^{m_2}} f(\mathbf{x}_b \| \mathbf{x}_u)\tag{4}$
where $\|$ denotes vector concatenation.

If $f$ is concave in the controllable features (that is, $-f$ is convex in $\mathbf{x}_u$), the optimal parameters can be obtained by gradient-based methods such as gradient ascent.
If $f$ is not concave in the controllable features, bio-inspired algorithms (such as genetic algorithms or particle swarm optimization) can be used to optimize the parameters.
If the controllable features take enumerable values and the cardinality of their domain is not large, the optimal parameters can be obtained directly by exhaustive search. Example: with 5 controllable features, each having 10 possible values, the optimal parameter vector is selected from $10^5$ parameter combinations, which takes only a few seconds to compute.
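
The exhaustive case can be sketched as follows. This is a toy example under stated assumptions: `f` is a hypothetical stand-in for a trained model (a real application would call the trained model's prediction function instead), and the names `best_parameters`, `x_b`, and `domains` are chosen only for illustration.

```python
import itertools

def f(x):
    """Hypothetical stand-in for a trained model: maps a feature vector to a score."""
    x_b, x_u = x[:2], x[2:]          # 2 objective features, 5 controllable ones
    target = (3, 7, 1, 4, 8)         # the toy score peaks when x_u equals this vector
    return sum(x_b) - sum((u - t) ** 2 for u, t in zip(x_u, target))

def best_parameters(model, x_b, domains):
    """Evaluate model(x_b || x_u) for every x_u in the Cartesian product of the domains."""
    return max(itertools.product(*domains),
               key=lambda x_u: model(tuple(x_b) + x_u))

x_b = (0.5, 1.5)                     # given objective feature vector
domains = [range(10)] * 5            # 5 controllable features, 10 possible values each
x_u_star = best_parameters(f, x_b, domains)
```

The search evaluates all $10^5$ combinations once, so its cost grows exponentially with the number of controllable features. It makes no assumption about the shape of $f$, which is why it remains usable when gradient-based methods are not.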

Copyright notice
This article was written by [Min fan]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/187/202207060149083353.html
