Mathematical Ideas in AI
2022-07-31 02:10:00 【IT is extremely KeBang】
The mathematics used in AI falls mainly into three areas: linear algebra, calculus, and probability theory. Below we describe where each of them appears in the overall AI workflow. In the final analysis, deep learning is about learning to fit a function that maps inputs to outputs, which is essentially a mathematical modeling problem.
Hypothesis space:
The hypothesis space is also known as the function space. Choosing which kind of function to fit requires some prior knowledge. We usually treat house price prediction as a regression problem, because the output is a continuous real value. Whether it will rain tomorrow has only two possible outcomes, so it is a classification problem. For problems like these it is easy to choose a function. For complex problems such as image pattern recognition, ordinary regression cannot do the job (in the machine learning era, this required extracting features by hand and feeding them into a machine learning algorithm), so a neural network is needed. A neural network is largely a black box: we cannot easily see what happens inside, but it relies on several hidden layers. Once the network architecture is fixed, the hypothesis space is the set of functions that the network can represent. The parameters are initialized from some probability distribution, and training then searches this space for a good function. Dropout is a special case: similar in spirit to bagging, it randomly disables units during training, so each step effectively trains a different sub-network. This makes the model more complex, but that complexity effectively prevents overfitting.
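To make the Dropout idea concrete, here is a minimal sketch in plain Python. It assumes the common "inverted dropout" variant, in which surviving activations are rescaled by 1 / (1 - drop_prob); the article does not say which variant it has in mind, and the function name `dropout` and the sample values are made up for illustration.

```python
import random

def dropout(values, drop_prob, rng):
    """Inverted dropout (an assumed variant): zero each value with
    probability drop_prob and scale the survivors by 1 / (1 - drop_prob)."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_prob = 1.0 - drop_prob
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)                   # fixed seed so runs are repeatable
activations = [1.0, 2.0, 3.0, 4.0]       # hypothetical hidden-layer outputs
dropped = dropout(activations, 0.5, rng)
print(dropped)                           # each entry is either 0.0 or twice the input
```

Every call draws a different random mask, so each training step sees a different thinned network. That is where the resemblance to bagging comes from.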
Objective function:
Since this is a fitting problem, we need a tool for evaluating how good the fit is. That is the role of the objective function: it measures the gap between the output of the fitted function and the true label, and training keeps shrinking that gap by adjusting the network parameters. This is an optimization problem in mathematics. The derivative of a function is the rate of change of the dependent variable as the independent variable changes. A point where the derivative is 0 is a stationary point; in the ideal case it is the minimum, and the parameters at that point are the result we want. That ideal case is a convex function, for which any local minimum is also the global minimum. Real problems, however, are often not convex optimization problems: a point where the derivative is 0 may be only a local extremum or a saddle point. For such problems we use gradient descent: initialize a starting point, then at each step move in the direction of the negative gradient, which is the direction in which the function decreases fastest. This method may still find only a local optimum; Adam and other optimizers were proposed to improve the solution process.
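The difference between the convex and non-convex cases can be seen in a small sketch. The two functions, the learning rates, and the starting points below are arbitrary choices for illustration, not something taken from the article.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=200):
    """Plain gradient descent: repeatedly step against the gradient."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

# Convex example: f(x) = (x - 3)^2 with f'(x) = 2 (x - 3).
convex_min = gradient_descent(lambda x: 2.0 * (x - 3.0), start=-5.0)

# Non-convex example: g(x) = x^4 - 3 x^2 + x has two separate minima.
def g(x):
    return x**4 - 3 * x**2 + x

def g_grad(x):
    return 4 * x**3 - 6 * x + 1

# Same algorithm, different starting points, different answers.
left = gradient_descent(g_grad, start=-2.0, learning_rate=0.01, steps=2000)
right = gradient_descent(g_grad, start=2.0, learning_rate=0.01, steps=2000)

print(convex_min)            # close to 3
print(left, g(left))         # minimum on the left, lower loss
print(right, g(right))       # minimum on the right, higher loss (local only)
```

On the convex function the starting point does not matter. On the non-convex one, starting from the right leaves the iterate stuck in the shallower minimum, which is the local-optimum behavior described above.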
The optimization process depends on a property of the loss function: it must be differentiable. Linear regression typically uses the mean squared error as its loss function, while classification typically uses cross entropy. The function fitted by deep learning is very complex; we can think of it as a deeply nested composite function. How do we differentiate a composite function? The answer is the chain rule. When the input and output are both scalars the derivation is simple, but in CV the input is a matrix, in NLP the input is a vector, and the output may be a vector or a matrix, while the loss is a scalar. In these cases the rules of matrix calculus are used to find the extremum of the loss function.
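As a scalar illustration of the chain rule, consider a single neuron with a sigmoid activation and a squared-error loss. This toy model and its numbers are assumptions chosen for the example. The analytic gradient is the product of three local derivatives, and a finite-difference estimate serves as a sanity check.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Composite function: L = (sigmoid(w * x + b) - y)^2."""
    return (sigmoid(w * x + b) - y) ** 2

def loss_grad_w(w, b, x, y):
    """Chain rule: dL/dw = dL/da * da/dz * dz/dw."""
    a = sigmoid(w * x + b)
    dL_da = 2.0 * (a - y)      # derivative of the squared error
    da_dz = a * (1.0 - a)      # derivative of the sigmoid
    dz_dw = x                  # derivative of w * x + b with respect to w
    return dL_da * da_dz * dz_dw

w, b, x, y = 0.5, -0.2, 1.5, 1.0   # hypothetical parameter, input and label
analytic = loss_grad_w(w, b, x, y)

# Central finite difference as an independent check of the derivation.
eps = 1e-6
numeric = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)

print(analytic, numeric)
```

Backpropagation applies this same product of local derivatives layer by layer, with matrices and vectors in place of scalars.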
Matrix operations can then be handled with the properties and standard tools of matrices, for example the matrix inverse, eigenvalue decomposition, singular value decomposition, and the determinant.
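One standard case where a matrix derivative and the matrix inverse meet is least-squares linear regression. The symbols below are introduced only for this illustration: $X$ is the data matrix, $y$ the label vector, and $w$ the weight vector. The derivation assumes $X^\top X$ is invertible.

```latex
\begin{aligned}
L(w) &= \lVert Xw - y \rVert_2^2 \\
\nabla_w L(w) &= 2\,X^\top (Xw - y) \\
\nabla_w L(w) = 0 \;&\Longrightarrow\; X^\top X\, w = X^\top y
\;\Longrightarrow\; w^\ast = (X^\top X)^{-1} X^\top y
\end{aligned}
```

Because this loss is convex, the stationary point $w^\ast$ is the global minimum. When $X^\top X$ is singular or ill-conditioned, the singular value decomposition of $X$ is the usual way to obtain a solution.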