Machine learning process and method
2022-07-03 02:12:00 【Jieyou tree hole network】
Problem modeling |
---|
Feature Engineering |
Model selection |
Model fusion |
Online model deployment |
Problem modeling |
---|
1. Define the evaluation metrics clearly |
2. Select the sample set |
3. Cross-validation |
- What are the evaluation metrics?
  - Precision (P), recall (R), PR curve (PRC), ROC and AUC, loss, mAP
- How to select a sample subset?
  - Stratified sampling
- How to cross-validate?
  - k-fold
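The three questions above can be sketched together in a few lines. This is a minimal illustration using only the Python standard library, not a definitive implementation: the function names `stratified_k_fold` and `precision_recall` and the toy labels are assumptions made for the example. It deals the indices of each class round-robin across k folds, so every fold keeps roughly the original class ratio, and it computes precision and recall for one positive class.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Split sample indices into k folds with similar class ratios in each fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the indices of this class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (hypothetical): 8 negative and 4 positive samples.
labels = [0] * 8 + [1] * 4
folds = stratified_k_fold(labels, k=4)
for i, test_idx in enumerate(folds):
    # Each fold serves once as the validation set; the rest is training data.
    train_idx = [j for f in folds if f is not test_idx for j in f]
    positives = sum(labels[j] for j in test_idx)
    print(f"fold {i}: {len(train_idx)} train, {len(test_idx)} test, {positives} positive")
```

In practice a library routine would normally replace the hand-written split; the sketch only shows what "stratified" and "k-fold" mean when combined.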
Feature Engineering |
---|
1. EDA |
2. Data cleaning (missing values, outliers), normalization, and discretization of continuous data (binning) |
3. Feature selection |
- Feature engineering starts with EDA (exploratory data analysis) of the data
  - Learn the general information about the dataset: the ratio of missing values and the feature types
  - Box plots, histograms, stem-and-leaf plots, correlation-matrix heat maps, PCA dimensionality reduction, and so on
- After EDA, once the data is understood, clean and filter it
  - Missing values
  - Outliers
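As a rough sketch of the two cleaning checks above, the snippet below computes the missing-value ratio of one column and flags outliers. The source does not say how outliers should be detected; the 1.5 × IQR box-plot rule used here is an assumption, as are the function names and the sample column `ages`.

```python
import statistics


def missing_ratio(values):
    """Fraction of entries that are missing (represented here as None)."""
    return sum(v is None for v in values) / len(values)


def iqr_outliers(values, factor=1.5):
    """Return present values outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    present = [v for v in values if v is not None]
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - factor * spread, q3 + factor * spread
    return [v for v in present if v < low or v > high]


# Hypothetical column with one missing entry and one implausible value.
ages = [10, 12, 11, 13, 12, 11, 300, None, 12]
print(missing_ratio(ages))
print(iqr_outliers(ages))
```

Whether a flagged value is dropped, capped, or kept is a separate decision that depends on the data.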
- Distinguish the feature types:
  - Numerical features
    - Continuous features
      - Normalization
      - Discretization
        - Binning
    - Discrete features
  - Categorical features
    - Encoding
      - One-hot encoding
      - Count-rank encoding
        - Effective for both linear and nonlinear models
        - Not sensitive to outliers
        - Different feature values do not collide, because their ranks do not collide
      - Natural-number encoding
      - Hierarchical encoding
        - Postal codes, ID numbers
  - Time features
    - Specific date
    - Hour, minute, second
    - Day of the week
    - Whether it is a weekend, month end, quarter end, business hours, a holiday, etc.
    - Time interval since the last …
  - Spatial features
    - GPS coordinates
    - Country ID, city ID, administrative-region ID, street ID, etc.
    - Spatial distance
  - Text features
    - Normalization
      - Unify letters and digits into the form used by a single language
    - Corpus construction
      - Document: the description of the item
    - Text cleaning
      - Remove spaces, punctuation, etc.
    - Word segmentation (tokenization)
      - Part-of-speech tagging
        - Verb
        - Noun
        - Adjective
      - Lemmatization
        - Restores a word to a form that expresses its meaning
      - 3-gram model
        - Convert the text into overlapping groups, each made of three consecutive words:
          - ABCDE => (ABC, BCD, CDE)
      - Skip-gram model
      - Bag-of-words model
        - Represent each document as a vector
        - Each component of the vector is the term frequency (TF) of a word in the document
      - TF-IDF
        - First build a vocabulary (dictionary) from all the documents
        - For each document, form a vector indexed by the dictionary keys; each component is tf * idf
        - tf is a local parameter: the frequency of the word in document d
        - idf is a global parameter: log(total number of documents / number of documents containing the word)
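The 3-gram and TF-IDF definitions above can be made concrete with a short sketch. It follows the formulas as stated: tf is taken as the raw count of a word in a document (one possible reading of "frequency") and idf is log(total documents / documents containing the word), with no smoothing. The function names and the three toy documents are assumptions for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n=3):
    """Return every run of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(documents):
    """Return (vocabulary, vectors); each vector holds tf * idf per vocabulary word."""
    # Global step: build the dictionary and the document frequency of each word.
    vocabulary = sorted({word for doc in documents for word in doc})
    doc_freq = Counter(word for doc in documents for word in set(doc))
    n_docs = len(documents)
    idf = {word: math.log(n_docs / doc_freq[word]) for word in vocabulary}
    # Local step: count words per document and weight them by idf.
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append([counts[word] * idf[word] for word in vocabulary])
    return vocabulary, vectors


print(ngrams(list("ABCDE")))  # [('A', 'B', 'C'), ('B', 'C', 'D'), ('C', 'D', 'E')]

# Hypothetical tokenized item descriptions.
docs = [["red", "shoe"], ["blue", "shoe"], ["red", "red", "hat"]]
vocabulary, vectors = tf_idf(docs)
for doc, vector in zip(docs, vectors):
    print(doc, [round(x, 3) for x in vector])
```

With this unsmoothed idf, a word that occurs in every document gets weight 0; library implementations often add smoothing terms, so their numbers will differ from this sketch.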
- Feature selection: choose a well-matched combination of features
  - How to measure the quality of a feature?
    - The distance or similarity between the target label and the feature
    - Distance and similarity measures
      - L-p norm distance (length)
      - Cosine similarity (cos) (angle)
      - Pearson correlation coefficient (comprehensive)
      - Jaccard similarity (size of the intersection / size of the union)
      - Jaccard distance = 1 - Jaccard similarity
    - Fisher score
    - Mutual information: KL(p(x,y) || p(x)p(y))
    - Hypothesis testing
    - CFS (correlation-based feature selection)
  - Three kinds of feature selection methods (classified by whether feature selection interacts with the machine learning algorithm)
    - Filter method
      - Full feature set => feature selection => machine learning algorithm => model performance
      - Feature selection and the machine learning algorithm do not interact; they are independent, which makes the method simple and effective.
      - Univariate filtering
        - Consider relevance only: rank the features by relevance and filter out the least relevant ones
      - Multivariate filtering
        - Consider not only relevance but also consistency?
          - CFS (correlation-based feature selection), which also accounts for the correlation between features
          - MBF
          - FCBF
    - Wrapper method
      - Full feature set => | feature selection <=> machine learning algorithm | => model performance
      - For each candidate feature subset, run the learning algorithm and use a validation set to pick the best subset, so that the selection reflects how well the features suit the algorithm.
    - Embedded method
      - Full feature set => | feature selection <=> machine learning algorithm + model performance |
      - Fuse feature selection directly into the machine learning algorithm and evaluate performance at the same time
      - Cross-validation is required
      - Decision trees, random forests, gradient boosted trees, SVM, Lasso
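To make the univariate filter method concrete, here is a minimal sketch that scores each feature by the absolute Pearson correlation with the target, ranks the features, and keeps the top ones, independently of any learning algorithm. A Jaccard helper is included because the outline gives its formula. The function names, the `keep` parameter, and the toy feature columns are assumptions for illustration; any of the other measures listed above could serve as the score instead.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def univariate_filter(features, target, keep):
    """Rank features by |Pearson correlation| with the target and keep the top ones."""
    scores = {name: abs(pearson(column, target)) for name, column in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores


# Hypothetical feature columns and target.
features = {
    "size": [1, 2, 3, 4, 5],
    "noise": [3, 1, 4, 1, 5],
    "constant": [7, 7, 7, 7, 7],
}
target = [2, 4, 6, 8, 10]
selected, scores = univariate_filter(features, target, keep=2)
print(selected)  # ['size', 'noise']
print(1 - jaccard_similarity({"a", "b", "c"}, {"b", "c", "d"}))  # Jaccard distance: 0.5
```

A wrapper method would instead retrain the model for each candidate subset and compare validation scores, which costs more but takes the algorithm into account.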