Anomaly detection method based on SVM
2020-11-06 01:14:00 【Artificial intelligence meets pioneer】
Author: Mahbubul Alam | Translator: VK | Source: Towards Data Science
Introduction to one-class support vector machines
Whether you are an expert or a novice in machine learning, you have probably heard of the support vector machine (SVM), a supervised machine learning algorithm that is frequently cited and used for classification problems.
A support vector machine uses hyperplanes in a multidimensional space to separate one class of observations from another. Naturally, SVMs are also used to solve multi-class classification problems.
However, SVMs are increasingly applied to one-class problems, where all of the data belongs to a single class. In this case, the algorithm is trained to learn what is “normal”, so that when a new data point is presented it can decide whether the point fits that pattern. If it does not, the new point is flagged as an outlier or anomaly. To learn more about one-class support vector machines, see Roemer Vlasveld's detailed article: http://rvlasveld.github.io/blog/2013/07/12/introduction-to-one-class-support-vector-machines/
One last thing to mention: if you are familiar with the sklearn library, you will notice that it has an algorithm designed specifically for what is called “novelty detection”. It works in a similar way to the anomaly detection with a one-class SVM that I describe here. In my view, it is only the context that determines whether the task is called novelty detection, outlier detection, or something else.
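To make the novelty-detection idea concrete, here is a minimal sketch rather than the author's own code. It assumes a synthetic cluster of “normal” training points; the names `X_train`, `X_new`, and `novelty_model`, as well as the `gamma` and `nu` values, are illustrative choices. The model is fitted on normal data only and then asked about two unseen points.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Assumption: synthetic "normal" data, 200 points clustered around the origin
rng = np.random.RandomState(42)
X_train = 0.3 * rng.randn(200, 2)

# Fit on normal observations only; no labels are required
novelty_model = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(X_train)

# Two unseen points: one inside the training cluster, one far away from it
X_new = np.array([[0.0, 0.1], [4.0, 4.0]])

# predict returns 1 for points that look normal and -1 for points flagged as novel
labels = novelty_model.predict(X_new)
print(labels)
```

The only difference from the outlier-detection demo in this article is where `predict` is applied: here it is called on new data, whereas the demo calls it on the same data the model was fitted on.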
Below is a simple demonstration of a one-class SVM in the Python programming language. Note that I use the terms outlier and anomaly interchangeably.
Step 1: Import libraries
For this demonstration we need a few core libraries: pandas and numpy for data wrangling, sklearn for model building, and matplotlib for visualization.
# Import libraries
import pandas as pd
from sklearn.svm import OneClassSVM
import matplotlib.pyplot as plt
from numpy import where
Step 2: Prepare the data
I use the well-known Iris dataset from an online source, so you can follow along without worrying about where to get the data.
# Import data
data = pd.read_csv("https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv")
# Select the input features
df = data[["sepal_length", "sepal_width"]]
Step 3: Model
Unlike the hyperparameter tuning in other classification algorithms, a one-class SVM uses nu as a hyperparameter that defines what fraction of the data should be classified as outliers. nu=0.03 means the algorithm will designate roughly 3% of the data as outliers. (Strictly speaking, nu is an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors, so the flagged share is approximate.)
# Model parameters
model = OneClassSVM(kernel = 'rbf', gamma = 0.001, nu = 0.03).fit(df)
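The effect of nu is easiest to see by trying a few values. The sketch below is an illustration under stated assumptions: it loads Iris from `sklearn.datasets.load_iris` instead of the CSV URL so that it runs offline (the first two columns are sepal length and sepal width), and the names `X` and `flagged_fraction` and the particular nu values are chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

# Assumption: scikit-learn's bundled Iris data stands in for the CSV download;
# columns 0 and 1 are sepal length and sepal width
X = load_iris().data[:, :2]

# Fit one model per nu value and record the share of points labelled -1
flagged_fraction = {}
for nu in (0.03, 0.1, 0.5):
    y = OneClassSVM(kernel="rbf", gamma=0.001, nu=nu).fit_predict(X)
    flagged_fraction[nu] = float((y == -1).mean())

for nu, frac in flagged_fraction.items():
    print(f"nu={nu}: {frac:.1%} of training points flagged")
```

The flagged share should track nu only approximately, which is why nu is better read as a budget for outliers than as an exact quota.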
Step 4: Predict
The predictions take the value 1 or -1, where -1 marks a point that the algorithm has identified as an outlier.
# Predict
y_pred = model.predict(df)
y_pred
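A raw array of 1 and -1 values is hard to read, so it can help to summarise it. The following sketch is one possible approach, again assuming the bundled Iris data in place of the CSV. It counts each label with `np.unique` and uses `decision_function`, which returns a signed score per point (lower means more anomalous), to rank the points. The names `summary`, `scores`, and `most_anomalous` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

# Assumption: bundled Iris data, sepal length and sepal width only
X = load_iris().data[:, :2]
model = OneClassSVM(kernel="rbf", gamma=0.001, nu=0.03).fit(X)
y_pred = model.predict(X)

# Count how many points received each label (1 = inlier, -1 = outlier)
values, counts = np.unique(y_pred, return_counts=True)
summary = {int(v): int(c) for v, c in zip(values, counts)}
print(summary)

# Signed distance to the learned boundary; the lowest scores are the most anomalous
scores = model.decision_function(X)
most_anomalous = np.argsort(scores)[:5]
print(most_anomalous, scores[most_anomalous])
```

Ranking by score is useful when the hard -1 label is too coarse, for example when only the top few suspicious records can be reviewed by hand.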
Step 5: Filter anomalies
# Positions of the rows predicted as outliers (where returns a tuple, so take its first element)
outlier_index = where(y_pred == -1)[0]
# Select the outlier rows
outlier_values = df.iloc[outlier_index]
outlier_values
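The same filtering can also be written with a boolean mask, which avoids the intermediate index array. This is a small sketch on hypothetical toy values (the four rows and the `y_pred` array below are made up for illustration, not taken from the Iris data), showing that both approaches select the same rows.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data and predictions, for illustration only
df = pd.DataFrame({
    "sepal_length": [5.1, 7.9, 4.3, 6.0],
    "sepal_width": [3.5, 3.8, 3.0, 2.9],
})
y_pred = np.array([1, -1, -1, 1])

# Approach used in the article: positional indices from where, then iloc
by_position = df.iloc[np.where(y_pred == -1)[0]]

# Alternative: index the DataFrame directly with a boolean mask
by_mask = df[y_pred == -1]
print(by_mask)
```

Either form works; the mask version is shorter, while the index version is handy when the positions themselves are needed later.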
Step 6: Visualize anomalies
# Plot all points, then overlay the outliers in red
plt.scatter(df["sepal_length"], df["sepal_width"])
plt.scatter(outlier_values["sepal_length"], outlier_values["sepal_width"], c = "r")
The red data points are the outliers.
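To see why particular points end up red, it can help to draw the learned boundary as well. The sketch below is an optional extension and not part of the original demo: it evaluates `decision_function` on a grid and draws the zero-level contour. It assumes the bundled Iris data in place of the CSV, and the grid resolution and padding are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

# Assumption: bundled Iris data, sepal length and sepal width only
X = load_iris().data[:, :2]
model = OneClassSVM(kernel="rbf", gamma=0.001, nu=0.03).fit(X)
y_pred = model.predict(X)

# Evaluate the decision function on a grid that covers the data with some padding
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min() - 0.5, X[:, 0].max() + 0.5, 200),
    np.linspace(X[:, 1].min() - 0.5, X[:, 1].max() + 0.5, 200),
)
zz = model.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

fig, ax = plt.subplots()
ax.contour(xx, yy, zz, levels=[0], colors="k")  # boundary where the score is 0
ax.scatter(X[y_pred == 1, 0], X[y_pred == 1, 1], label="inlier")
ax.scatter(X[y_pred == -1, 0], X[y_pred == -1, 1], c="r", label="outlier")
ax.set_xlabel("sepal_length")
ax.set_ylabel("sepal_width")
ax.legend()
# Call fig.savefig(...) or plt.show() here to view the figure
```

Points outside the black contour have a negative score and are the ones labelled -1.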
Summary
In this article, I gave a brief introduction to the one-class support vector machine (one-class SVM), a machine learning algorithm for fraud, outlier, and anomaly detection.
I showed some simple steps to build intuition, but of course a real implementation requires more experimentation to find out what works, and what does not, in a particular context and industry.
Original article: https://towardsdatascience.com/support-vector-machine-svm-for-anomaly-detection-73a8d676c331