Mathematical Ideas in AI
2022-07-31 02:10:00 【IT is extremely KeBang】
The mathematics behind AI draws mainly on three areas: linear algebra, calculus, and probability theory. Below we describe where this mathematics appears across the AI workflow. In the final analysis, deep learning is about learning to fit a function that maps inputs to outputs, which is essentially a mathematical modeling problem.
Hypothesis space:
The hypothesis space, also known as the function space, is the set of candidate functions, and choosing among them requires some prior knowledge. We usually treat house price prediction as a regression problem because the output is a continuous real value. Whether it will rain tomorrow has only two possible outcomes, so it is a classification problem. For such problems it is easy to choose a function, but for complex problems such as image pattern recognition, ordinary regression cannot complete the task (in the era of classical machine learning, this required manually extracting features and feeding them into the learning algorithm), and a neural network is needed. The network is a black box: we do not know exactly what happens inside it, but it needs several hidden layers. Here the function space means that once the network architecture is fixed, the network parameters are initialized according to some probability distribution, and training then searches this space for a good function. Dropout is a special case. Similar in spirit to Bagging, it makes the model more complex, because each training step effectively uses a different sub-network, but this complexity is effective at preventing overfitting.
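As a rough illustration of the Dropout idea mentioned above, here is a minimal sketch of "inverted" dropout on a plain list of activations. The function name `dropout`, its parameters, and the rescaling by the keep probability are assumptions chosen for illustration; the article does not specify which variant it has in mind.

```python
import random


def dropout(activations, drop_prob, training=True, rng=random):
    """Inverted dropout (illustrative sketch, not a framework API).

    During training each activation is zeroed with probability `drop_prob`
    and the survivors are scaled by 1 / (1 - drop_prob) so the expected
    value is unchanged. At inference time the input passes through as-is.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


# Each training call samples a different random sub-network.
hidden = [1.0, 2.0, 3.0, 4.0]
print(dropout(hidden, drop_prob=0.5, rng=random.Random(0)))
print(dropout(hidden, drop_prob=0.5, training=False))
```

Because a fresh mask is drawn on every call, training behaves somewhat like averaging over many thinned networks, which is the loose connection to Bagging.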
Objective function:
Since this is a fitting problem, we need a tool for evaluating how good the fit is. This is the role of the objective function: it measures the gap between the output of the fitted function and the true label, and training continually narrows that gap by changing the network parameters. This is an optimization problem in the mathematical sense. The derivative of a function is the rate of change of the dependent variable with respect to the independent variable. Ideally, the point where the derivative is 0 is the extreme point, and the parameters at that point are the result we are looking for. This ideal case holds for convex functions, where any such point is a global minimum, but real problems are often not convex optimization problems: a point where the derivative is 0 may be only a local extremum or a saddle point. For this kind of problem we use gradient descent: initialize a starting point, then at each step move in the direction of the negative gradient, which is the direction in which the function decreases fastest. This method may still find only a local optimum. Adam and similar methods were proposed to improve the solution process.
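To make the local-optimum point concrete, here is a minimal sketch of plain gradient descent on a one-dimensional non-convex function. The function f(x) = x^4 - 3x^2 + x, the learning rate, the step count, and the two starting points are all assumptions chosen for illustration.

```python
def f(x):
    """A non-convex example objective with two separate minima."""
    return x**4 - 3 * x**2 + x


def grad_f(x):
    """Derivative of f, worked out by hand."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(start, learning_rate=0.01, steps=1000):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad_f(x)
    return x


# Same algorithm, two different starting points.
x_right = gradient_descent(2.0)
x_left = gradient_descent(-2.0)
print("start  2.0 ->", x_right, "f =", f(x_right))
print("start -2.0 ->", x_left, "f =", f(x_left))
```

With these settings the two runs should settle in different valleys, and the one reached from the right has the higher objective value. Both end points have a derivative of essentially 0, so the derivative alone cannot tell the local minimum from the global one.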
The optimization process relies on a property of the loss function: it must be differentiable. Linear regression typically uses mean squared error as the loss function, while classification typically uses cross-entropy. The function fitted by deep learning is very complex, and we can think of it as a deeply nested composite function. How do we differentiate a composite function? The answer is the chain rule. When the input and output are both scalars, the differentiation is simple, but in CV the input is a matrix, in NLP the input is a vector, and the output may be a vector or a matrix, while the loss is a scalar. In this case, the rules of matrix calculus can be used to find the extrema of the loss function.
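The chain rule can be sketched on the smallest possible composite function: a single sigmoid unit with a squared-error loss. The model, the parameter values, and the helper names below are assumptions made for illustration. The analytic gradient is compared against a finite-difference estimate as a sanity check.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, target):
    """Squared error of a one-unit model: (sigmoid(w*x + b) - target)^2."""
    y = sigmoid(w * x + b)
    return (y - target) ** 2


def loss_grad_w(w, b, x, target):
    """dL/dw by the chain rule: dL/dy * dy/dz * dz/dw, with z = w*x + b."""
    y = sigmoid(w * x + b)
    dL_dy = 2.0 * (y - target)   # derivative of the squared error
    dy_dz = y * (1.0 - y)        # derivative of the sigmoid
    dz_dw = x                    # derivative of the linear part
    return dL_dy * dy_dz * dz_dw


def numeric_grad_w(w, b, x, target, eps=1e-6):
    """Central finite difference, used only to check the analytic result."""
    return (loss(w + eps, b, x, target) - loss(w - eps, b, x, target)) / (2 * eps)


w, b, x, target = 0.5, -0.2, 1.5, 1.0
analytic = loss_grad_w(w, b, x, target)
numeric = numeric_grad_w(w, b, x, target)
print(analytic, numeric)
```

Backpropagation in a deep network applies the same product of local derivatives layer by layer, with the scalar factors replaced by vectors and matrices.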
Where matrix operations are involved, the problem can be solved using properties of the matrix, for example the matrix inverse, eigenvalue decomposition, singular value decomposition, and the determinant.
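For a feel of these matrix properties, here is a small sketch restricted to 2x2 matrices, written in plain Python so it is self-contained. The helper names and the example matrix are assumptions for illustration. In practice a numerical library would be used instead, for example NumPy's `numpy.linalg.det`, `numpy.linalg.inv`, `numpy.linalg.eig`, and `numpy.linalg.svd`.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


def inverse2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = det2(m)
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]


def eigenvalues_symmetric2(m):
    """Eigenvalues of a symmetric 2x2 matrix, smallest first."""
    (a, b), (c, d) = m
    if b != c:
        raise ValueError("this sketch only handles symmetric matrices")
    half_trace = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return half_trace - radius, half_trace + radius


def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


A = [[2.0, 1.0], [1.0, 2.0]]
A_inv = inverse2(A)
lam_small, lam_large = eigenvalues_symmetric2(A)
print("det:", det2(A))
print("inverse:", A_inv)
print("eigenvalues:", lam_small, lam_large)
print("A @ A_inv:", matmul2(A, A_inv))
```

The example also shows how these quantities are linked: the determinant equals the product of the eigenvalues, and a matrix is invertible exactly when none of its eigenvalues is 0.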