当前位置:网站首页>General process of machine learning training and parameter optimization (discussion)
General process of machine learning training and parameter optimization (discussion)
2022-07-06 02:13:00 【Min fan】
Abstract : In practical machine learning applications , Not only model training , Also control the input parameters . This paper describes the general process , For reference only .
1. Training machine learning models
For an input of m m m Features , Output as a decision indicator , Machine learning models can be built
f : R m → R (1) f: \mathbb{R}^m \to \mathbb{R} \tag{1} f:Rm→R(1)
among R \mathbb{R} R Is a set of real numbers . If different features have their own value range , Then the machine learning model can be expressed as
f : ∏ i = 1 m V i → R (2) f: \prod_{i=1}^m \mathbf{V}_i \to \mathbb{R} \tag{2} f:i=1∏mVi→R(2)
among V i \mathbf{V}_i Vi It's No i i i Value range of features .
Simplicity , Only... Is discussed below (1) Model corresponding to formula .
Given to contain n n n Characteristic matrix of instances X = [ x 1 , … , x n ] T ∈ R n × m \mathbf{X} = [\mathbf{x}_1, \dots, \mathbf{x}_n]^{\mathrm{T}} \in \mathbb{R}^{n \times m} X=[x1,…,xn]T∈Rn×m And the corresponding label vector Y ∈ R n \mathbf{Y} \in \mathbb{R}^n Y∈Rn, The optimization objective of machine learning can generally be expressed as
min f L ( f ( X ) , Y ) + R ( f ) (3) \min_f \mathcal{L}(f(\mathbf{X}), \mathbf{Y}) + R(f) \tag{3} fminL(f(X),Y)+R(f)(3)
among f ( X ) = [ f ( x 1 ) , … , f ( x n ) ] f(\mathbf{X}) = [f(\mathbf{x}_1), \dots, f(\mathbf{x}_n)] f(X)=[f(x1),…,f(xn)] Vector for predicted tags , R ( f ) R(f) R(f) by f f f Regular term of parameter in . If the optimization goal is a convex function , Then the gradient descent method can be used to quickly find the optimal solution . For regular terms :
- If f f f For a linear model , The regular can be 1 norm 、2 norm 、 Kernel norm, etc . Its function is to prevent over fitting .
- If f f f For a neural network model , You can use the dropout And other technologies to prevent over fitting .
2. Parameter optimization method
For some practical problems , Some of the input characteristics are objective , Some are controllable . No loss of generality , Before order m 1 m_1 m1 The first feature is objective , after m 2 m_2 m2 Three features are controllable ( So we also call it parameter ), m 1 + m 2 = m m_1 + m_2 = m m1+m2=m. Suppose a reliable machine learning model has been trained through a large amount of data f f f, And we expect to maximize the decision indicators . Given the objective eigenvector x b ∈ R m 1 \mathbf{x}_b \in \mathbb{R}^{m_1} xb∈Rm1, The objective function of parameter optimization is
arg max x u ∈ R m 2 f ( x b ∥ x u ) (4) \argmax_{\mathbf{x_u} \in \mathbb{R}^{m_2}} f(\mathbf{x}_b \| \mathbf{x}_u)\tag{4} xu∈Rm2argmaxf(xb∥xu)(4)
among ∥ \| ∥ Indicates the vector splicing operation .
- If f f f Each controllable feature is a convex function , Then the optimal parameters can be obtained by gradient descent and other methods .
- If f f f The controllable features are not a convex function , Then some bionic algorithms can be used to optimize the parameters .
- If the controllable features are enumerated, the cardinality of the definition domain is not large , Then the optimal parameters can be obtained directly by the exhaustive method . example : Controllable features include 5 individual , Everyone with a 10 Possible values , From the 1 0 5 10^5 105 The optimal parameter vector is obtained from three parameter combinations , It only takes a few seconds to calculate .
边栏推荐
- RDD partition rules of spark
- 【机器人手眼标定】eye in hand
- 0211 embedded C language learning
- This time, thoroughly understand the deep copy
- Redis-Key的操作
- Global and Chinese markets hitting traffic doors 2022-2028: Research Report on technology, participants, trends, market size and share
- Minecraft 1.18.1、1.18.2模组开发 22.狙击枪(Sniper Rifle)
- Xshell 7 Student Edition
- 数据工程系列精讲(第四讲): Data-centric AI 之样本工程
- Computer graduation design PHP college classroom application management system
猜你喜欢
国家级非遗传承人高清旺《四大美人》皮影数字藏品惊艳亮相!
Prepare for the autumn face-to-face test questions
How does redis implement multiple zones?
Online reservation system of sports venues based on PHP
2022 PMP project management examination agile knowledge points (8)
Jisuanke - t2063_ Missile interception
vs code保存时 出现两次格式化
Redis-列表
leetcode3、實現 strStr()
Concept of storage engine
随机推荐
GBase 8c数据库升级报错
A basic lintcode MySQL database problem
leetcode-2. Palindrome judgment
安装php-zbarcode扩展时报错,不知道有没有哪位大神帮我解决一下呀 php 环境用的7.3
[ssrf-01] principle and utilization examples of server-side Request Forgery vulnerability
剑指 Offer 12. 矩阵中的路径
It's wrong to install PHP zbarcode extension. I don't know if any God can help me solve it. 7.3 for PHP environment
Minecraft 1.16.5 生化8 模组 2.0版本 故事书+更多枪械
【clickhouse】ClickHouse Practice in EOI
Initialize MySQL database when docker container starts
Global and Chinese market of wheelchair climbing machines 2022-2028: Research Report on technology, participants, trends, market size and share
The ECU of 21 Audi q5l 45tfsi brushes is upgraded to master special adjustment, and the horsepower is safely and stably increased to 305 horsepower
Redis-字符串类型
VIM usage guide
Shutter doctor: Xcode installation is incomplete
剑指 Offer 38. 字符串的排列
0211 embedded C language learning
Have a look at this generation
[Clickhouse] Clickhouse based massive data interactive OLAP analysis scenario practice
国家级非遗传承人高清旺《四大美人》皮影数字藏品惊艳亮相!