Anomaly detection method based on SVM
2020-11-06 01:14:00 【Artificial intelligence meets pioneer】
Author: Mahbubul Alam | Compiled by: VK | Source: Towards Data Science
Introduction to one-class support vector machines
Whether you are an expert or a novice in machine learning, you have probably heard of the support vector machine (SVM), a supervised machine learning algorithm that is frequently cited and used for classification problems.
An SVM uses a hyperplane in multidimensional space to separate one class of observations from another. Naturally, SVMs are also used to solve multi-class classification problems.
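To make the separating hyperplane concrete, here is a minimal sketch on synthetic data. The two clusters, the random seed, and the variable names are assumptions chosen for illustration and are not part of the original article. For a linear kernel, sklearn exposes the hyperplane w . x + b = 0 through the coef_ and intercept_ attributes.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
# Two well-separated synthetic clusters (illustrative data only)
class_a = rng.normal(loc=-3.0, scale=0.5, size=(50, 2))
class_b = rng.normal(loc=3.0, scale=0.5, size=(50, 2))
X = np.vstack([class_a, class_b])
y = np.array([0] * 50 + [1] * 50)

# Fit a linear SVM classifier on the labelled data
clf = SVC(kernel="linear").fit(X, y)

# The separating hyperplane is w . x + b = 0
w = clf.coef_[0]
b = clf.intercept_[0]
train_accuracy = clf.score(X, y)
print("w =", w, "b =", b, "training accuracy =", train_accuracy)
```

Points on one side of the hyperplane are assigned to one class and points on the other side to the other class.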
However, SVMs are increasingly applied to one-class problems, where all the data belongs to a single class. In this case, the algorithm is trained to learn what is "normal", so that when a new observation is presented, it can decide whether that observation is normal. If it is not, the new observation is flagged as an outlier or anomaly. To learn more about one-class SVMs, see Roemer Vlasveld's long article: http://rvlasveld.github.io/blog/2013/07/12/introduction-to-one-class-support-vector-machines/
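The idea of learning "normal" from a single class and then judging new observations can be sketched as follows. This is a minimal illustration under assumptions: the training data is synthetic, and the gamma and nu values are arbitrary choices, not recommendations from the original article.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(42)
# Training data drawn from a single "normal" class (synthetic, for illustration)
X_train = 0.3 * rng.randn(200, 2)

# Learn the region occupied by the normal data
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(X_train)

# Two new observations: one near the training cloud, one far from it
X_new = np.array([[0.0, 0.1], [4.0, 4.0]])
labels = model.predict(X_new)  # 1 = looks normal, -1 = flagged as anomaly
print(labels)
```

The point far from the training cloud receives the label -1, because the RBF kernel gives it almost no similarity to anything the model saw during training.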
One last thing to mention: if you are familiar with the sklearn library, you will notice that it has an algorithm designed for what is called "novelty detection". It works in a similar way to the anomaly detection with a one-class SVM that I describe here. In my view, it is only the context that determines whether we call it novelty detection, outlier detection, or something else.
Below is a simple demonstration of a one-class SVM in Python. Note that I use the terms "outlier" and "anomaly" interchangeably.
Step 1: Import libraries
For this demonstration we need a few core libraries: pandas and numpy for data wrangling, sklearn for model building, and matplotlib for visualization.
# Import libraries
import pandas as pd
from sklearn.svm import OneClassSVM
import matplotlib.pyplot as plt
from numpy import where
Step 2: Prepare the data
I am using the well-known Iris dataset from an online source, so you can practice without worrying about where to get the data.
# Import the data
data = pd.read_csv("https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv")
# Select the input features
df = data[["sepal_length", "sepal_width"]]
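If the URL is not reachable, a similar two-column frame can be built from the copy of Iris that ships with sklearn. This is an offline alternative and an assumption on my part, not what the original article does. The column names are renamed here to match the CSV, and the values are not guaranteed to be identical to the online file.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the copy of the Iris dataset bundled with sklearn (no network needed)
iris = load_iris()

# Columns 0 and 1 are sepal length and sepal width (in cm)
df = pd.DataFrame(iris.data[:, :2], columns=["sepal_length", "sepal_width"])
print(df.shape)
```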
Step 3: Model
Unlike the hyperparameter tuning in other classification algorithms, a one-class SVM uses nu as a hyperparameter that controls what proportion of the data is treated as outliers. In sklearn, nu is an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors, so nu=0.03 means the algorithm will flag roughly 3% of the data as outliers.
# Fit the model
model = OneClassSVM(kernel = 'rbf', gamma = 0.001, nu = 0.03).fit(df)
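To get a feel for how nu changes the result, one option is to refit the model with a few values and compare the share of rows flagged. The sketch below is an illustration under assumptions: it uses the Iris copy bundled with sklearn instead of the CSV so that it runs offline, and the nu values other than 0.03 are arbitrary. Because nu is a bound rather than an exact target, the flagged share will not necessarily equal nu.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

# Bundled Iris data, restricted to the two sepal features
iris = load_iris()
df = pd.DataFrame(iris.data[:, :2], columns=["sepal_length", "sepal_width"])

# Refit with several nu values and record the share of rows labelled -1
flagged_fraction = {}
for nu in (0.01, 0.03, 0.10):
    model = OneClassSVM(kernel="rbf", gamma=0.001, nu=nu).fit(df)
    flagged_fraction[nu] = float((model.predict(df) == -1).mean())

for nu, fraction in flagged_fraction.items():
    print(f"nu={nu:.2f} -> {fraction:.1%} of rows flagged")
```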
Step 4: Predict
The predictions take the value 1 or -1, where -1 marks an outlier detected by the algorithm.
# Predict
y_pred = model.predict(df)
y_pred
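Beyond the raw labels, it can help to count how many rows fall into each group and to rank rows by how anomalous they look. The sketch below assumes the bundled Iris copy as a stand-in for the CSV and uses decision_function, which sklearn documents as a signed distance that is positive for inliers and negative for outliers.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.svm import OneClassSVM

# Bundled Iris data, restricted to the two sepal features
iris = load_iris()
df = pd.DataFrame(iris.data[:, :2], columns=["sepal_length", "sepal_width"])
model = OneClassSVM(kernel="rbf", gamma=0.001, nu=0.03).fit(df)

y_pred = model.predict(df)
# Count how many rows received each label (1 = inlier, -1 = outlier)
values, counts = np.unique(y_pred, return_counts=True)
label_counts = dict(zip(values.tolist(), counts.tolist()))
print(label_counts)

# Signed distance to the learned boundary; negative values fall outside it
scores = model.decision_function(df)
most_anomalous = int(np.argmin(scores))
print("most anomalous row:", most_anomalous, "score:", scores[most_anomalous])
```

Sorting by the score gives a ranking, which is often more useful in practice than a hard 1 / -1 split.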
Step 5: Filter the anomalies
# Row positions of the outliers (where returns a tuple, so take its first element)
outlier_index = where(y_pred == -1)[0]
# Select the outlier rows
outlier_values = df.iloc[outlier_index]
outlier_values
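Step 6: Visualize the anomalies
# Plot all points, then overlay the outliers in red
plt.scatter(df["sepal_length"], df["sepal_width"])
plt.scatter(outlier_values["sepal_length"], outlier_values["sepal_width"], c = "r")
plt.xlabel("sepal_length")
plt.ylabel("sepal_width")
plt.show()
The red data points are the outliers.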
Summary
In this article I gave a brief introduction to the one-class support vector machine (One-Class SVM), a machine learning algorithm for fraud, anomaly, and outlier detection.
I showed some simple steps to build intuition, but of course a real-world implementation requires much more experimentation to find out what works and what does not in a particular context and industry.
Link to the original text :https://towardsdatascience.com/support-vector-machine-svm-for-anomaly-detection-73a8d676c331