General process of machine learning training and parameter optimization (discussion)
2022-07-06 02:13:00 【Min fan】
Abstract: In practical machine learning applications, it is not enough to train a model; the controllable input parameters must also be tuned. This post describes the general process, for reference only.
1. Training machine learning models
For an input with $m$ features and an output that is a single decision indicator, a machine learning model can be built as
$$f: \mathbb{R}^m \to \mathbb{R} \tag{1}$$
where $\mathbb{R}$ is the set of real numbers. If different features have their own value ranges, the model can be expressed as
$$f: \prod_{i=1}^m \mathbf{V}_i \to \mathbb{R} \tag{2}$$
where $\mathbf{V}_i$ is the value range of the $i$-th feature.
For simplicity, only the model corresponding to Eq. (1) is discussed below.
Given a feature matrix of $n$ instances, $\mathbf{X} = [\mathbf{x}_1, \dots, \mathbf{x}_n]^{\mathrm{T}} \in \mathbb{R}^{n \times m}$, and the corresponding label vector $\mathbf{Y} \in \mathbb{R}^n$, the optimization objective of machine learning can generally be expressed as
$$\min_f \mathcal{L}(f(\mathbf{X}), \mathbf{Y}) + R(f) \tag{3}$$
where $f(\mathbf{X}) = [f(\mathbf{x}_1), \dots, f(\mathbf{x}_n)]$ is the vector of predicted labels and $R(f)$ is the regularization term on the parameters of $f$. If the objective is a convex function, gradient descent can be used to find the optimal solution efficiently. For the regularization term:
- If $f$ is a linear model, the regularizer can be the $\ell_1$ norm, the $\ell_2$ norm, the nuclear norm, and so on. Its role is to prevent overfitting.
- If $f$ is a neural network, techniques such as dropout can be used to prevent overfitting.
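As a minimal sketch of objective (3), the code below fits a linear model $f(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x}$ by gradient descent with an $\ell_2$ regularizer. The squared loss, the regularization weight `lam`, the learning rate `lr`, the step count, and the toy data are all illustrative assumptions rather than choices made in this post.

```python
# Sketch: minimize (1/n) * sum_i (w.x_i - y_i)^2 + lam * ||w||_2^2 by gradient descent.
# The loss, lam, lr, steps, and the toy data are illustrative assumptions.

def predict(w, x):
    """Linear model f(x) = w . x."""
    return sum(wj * xj for wj, xj in zip(w, x))

def objective(w, X, Y, lam):
    """Loss L(f(X), Y) plus regularizer R(f), as in Eq. (3)."""
    n = len(X)
    loss = sum((predict(w, x) - y) ** 2 for x, y in zip(X, Y)) / n
    reg = lam * sum(wj * wj for wj in w)
    return loss + reg

def gradient(w, X, Y, lam):
    """Gradient of the objective with respect to w."""
    n = len(X)
    grad = [2.0 * lam * wj for wj in w]       # gradient of the l2 regularizer
    for x, y in zip(X, Y):
        err = predict(w, x) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / n     # gradient of the squared loss
    return grad

def fit(X, Y, lam=0.01, lr=0.1, steps=500):
    """Plain gradient descent starting from w = 0."""
    w = [0.0] * len(X[0])
    for _ in range(steps):
        g = gradient(w, X, Y, lam)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    return w

# Toy data generated by y = 2*x1 - 1*x2 (n = 4 instances, m = 2 features).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
Y = [2.0, -1.0, 1.0, 3.0]
w = fit(X, Y)
print(w)  # close to [2, -1], shrunk slightly toward 0 by the regularizer
```

Because this objective is convex in $\mathbf{w}$, gradient descent with a sufficiently small step size approaches the global minimum; the regularizer pulls the learned weights slightly toward zero relative to the generating values.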
2. Parameter optimization method
In some practical problems, some of the input features are objective (given) and others are controllable. Without loss of generality, let the first $m_1$ features be objective and the last $m_2$ features be controllable (so we also call them parameters), with $m_1 + m_2 = m$. Suppose a reliable machine learning model $f$ has been trained on a large amount of data, and we want to maximize the decision indicator. Given the objective feature vector $\mathbf{x}_b \in \mathbb{R}^{m_1}$, the parameter optimization problem is
$$\argmax_{\mathbf{x}_u \in \mathbb{R}^{m_2}} f(\mathbf{x}_b \| \mathbf{x}_u) \tag{4}$$
where $\|$ denotes vector concatenation.
- If $f$ is concave in the controllable features (equivalently, $-f$ is convex, since the indicator is being maximized), the optimal parameters can be obtained by gradient ascent or similar methods.
- If $f$ is not concave in the controllable features, bio-inspired algorithms (for example, genetic algorithms or particle swarm optimization) can be used to optimize the parameters.
- If the controllable features are discrete and the cardinality of their domain is small, the optimal parameters can be obtained directly by exhaustive search. Example: with 5 controllable features, each taking 10 possible values, the optimal parameter vector is selected from $10^5$ parameter combinations, which takes only a few seconds to compute.
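The exhaustive case can be sketched as follows, assuming 2 objective features and 5 controllable features with 10 integer levels each. The function `f` here is a hypothetical stand-in for a trained model, and `exhaustive_argmax`, `x_b`, and `domains` are illustrative names, not part of any library.

```python
import itertools

def f(x):
    """Hypothetical stand-in for a trained model f: R^m -> R, with m = 2 + 5."""
    x_b, x_u = x[:2], x[2:]
    # Assumed behavior: the best level of controllable feature j is x_b[0] + j,
    # and x_b[1] acts as a constant offset on the decision indicator.
    return x_b[1] - sum((u - (x_b[0] + j)) ** 2 for j, u in enumerate(x_u))

def exhaustive_argmax(f, x_b, domains):
    """Evaluate f(x_b || x_u) for every x_u in the Cartesian product of domains."""
    best_x_u, best_value = None, float("-inf")
    for x_u in itertools.product(*domains):
        value = f(list(x_b) + list(x_u))  # vector concatenation x_b || x_u
        if value > best_value:
            best_x_u, best_value = x_u, value
    return best_x_u, best_value

x_b = [3, 1]                    # objective features (m_1 = 2)
domains = [range(10)] * 5       # 5 controllable features, 10 levels each
best_x_u, best_value = exhaustive_argmax(f, x_b, domains)
print(best_x_u, best_value)     # (3, 4, 5, 6, 7) 1
```

The loop makes exactly $10^5$ model evaluations, so the cost is dominated by the time of a single prediction; with a real model, batching the candidate vectors into one prediction call would usually be faster.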