Static analysis of malicious code
2022-06-13 00:34:00 【P1n9】
I. Traditional detection methods
1. Signature-based detection of malicious code
Signature values are extracted and selected as follows:
(1) Specific substrings: strings extracted from a virus that are characteristic of it, such as specific messages or signature information. For example, the Stoned (Marijuana) virus displays the message "Your PC is now stoned".
(2) Infection markers: a marker that a virus uses to avoid infecting the same target twice, such as "sUMsDos" for Black Friday.
(3) Byte sequences: starting at a specific location in the virus code, take a contiguous byte string of no more than 64 bytes that contains no spaces (ASCII value 32). The extracted signature must not also occur in normal software, otherwise it will cause false positives. Signatures can be extracted in two ways:
a) Manual extraction
- An antivirus engineer analyzes the virus sample and identifies the signature by hand.
b) Automatic extraction
- A software system automatically extracts data with certain characteristics within a specific range and length.
- The drawback is that, if handled carelessly, this can be exploited by attackers to cause false positives (legitimate files "killed" by mistake). For example, antivirus software also scans and removes Windows core files; if the automatic sampling makes mistakes, an attacker can craft adversarial samples so that normal programs or files are wrongly removed.
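The signature rules above can be sketched roughly as follows. This is a minimal illustration, not how any particular antivirus engine works: the signature names and byte patterns in `SIGNATURES` are made up for the example, and `validate_signature` and `scan` are hypothetical helper names.

```python
from typing import Dict, List, Tuple

# Hypothetical signature database (assumption for illustration only):
# name -> byte pattern, each at most 64 bytes and containing no space byte.
SIGNATURES: Dict[str, bytes] = {
    "demo-marker-a": b"DEMO_INFECTION_MARKER",
    "demo-marker-b": bytes.fromhex("deadbeef4142"),
}


def validate_signature(pattern: bytes, max_len: int = 64) -> bool:
    """Check the constraints from rule (3): non-empty, <= 64 bytes, no 0x20."""
    return 0 < len(pattern) <= max_len and b" " not in pattern


def scan(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[Tuple[str, int]]:
    """Return (signature name, offset) for the first hit of each valid signature."""
    hits = []
    for name, pattern in signatures.items():
        if not validate_signature(pattern):
            continue  # skip patterns that break the extraction rules
        offset = data.find(pattern)
        if offset != -1:
            hits.append((name, offset))
    return hits
```

A real engine would also check each candidate signature against a corpus of benign software before accepting it, which is the false-positive concern raised above.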
2. Checksum (fingerprint) based detection of malicious code
- Automatic system monitoring
This technique is widely used in antivirus software. The checksums are kept resident in memory, and whenever an application starts, its current checksum is automatically compared with the previously saved one. If they differ, the data has been tampered with and a warning is shown.
- Dedicated checking tools
Compute the checksum of the file while it is in a known-good state, write that value into the file being checked or into the checking tool, and compare against it later. MD5Checker is one example.
- Self-checking
Some applications check themselves. QQ, for example, reports that it has been tampered with if its data is modified. The application embeds a checksum self-check routine and stores the checksum of its known-good state in the file itself; at startup it compares the current checksum with the stored one.
Note: the checksum does not have to cover the entire file; it may cover only the file header or part of the file content.
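The compare-against-a-saved-checksum idea might look like the sketch below. It uses SHA-256 from Python's `hashlib` rather than MD5, which is a substitution made here and not something the text prescribes; `checksum` and `is_tampered` are hypothetical names, and the optional `length` argument illustrates the note about checking only the file header.

```python
import hashlib
from typing import Optional


def checksum(data: bytes, length: Optional[int] = None) -> str:
    """SHA-256 of the whole buffer, or of only its first `length` bytes."""
    region = data if length is None else data[:length]
    return hashlib.sha256(region).hexdigest()


def is_tampered(data: bytes, saved: str, length: Optional[int] = None) -> bool:
    """Compare the current checksum with the one saved in a known-good state."""
    return checksum(data, length) != saved
```

For a file on disk, the bytes could be read with `pathlib.Path(path).read_bytes()` before calling `checksum`.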
3. Look for malicious strings
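One common way to do this is to pull printable ASCII runs out of the binary, similar in spirit to the Unix `strings` tool, and compare them with a watch list. The sketch below assumes a minimum length of 4 characters, and the `SUSPICIOUS` keywords are placeholders chosen for the example, not a real indicator list.

```python
import re
from typing import Iterable, List

# Hypothetical watch list (assumption): substrings worth a closer look.
SUSPICIOUS = ("cmd.exe", "powershell", "http://")


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of at least `min_len` printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def flag_strings(strings: Iterable[str], keywords: Iterable[str] = SUSPICIOUS) -> List[str]:
    """Keep only the strings that contain one of the keywords (case-insensitive)."""
    keywords = tuple(keywords)
    return [s for s in strings if any(k in s.lower() for k in keywords)]
```

This only finds single-byte ASCII strings; UTF-16 strings, which are common in Windows binaries, would need a second pattern.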
4. Analyze PE file headers and sections


5. Analyze linked libraries and functions, i.e. the import table
Method 1: hash the file (DLL) names and function names in the IAT of the PE header into the range 0 to 255. If a function appears in the file, the corresponding position is set to 1. Because each function always maps to the same position, the result is a fixed-length array of 256 elements.
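A rough sketch of Method 1 follows. The text does not say which hash function is used, so SHA-256 is an assumption here; `bucket` and `import_vector` are hypothetical names, and the import list is supplied by hand instead of being parsed from a real PE file.

```python
import hashlib
from typing import Iterable, List, Tuple


def bucket(dll: str, func: str, size: int = 256) -> int:
    """Deterministically map 'dll:function' to a slot in [0, size)."""
    key = f"{dll.lower()}:{func}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % size


def import_vector(imports: Iterable[Tuple[str, str]], size: int = 256) -> List[int]:
    """Build the 0/1 array: a slot is 1 if any imported function hashes to it."""
    vec = [0] * size
    for dll, func in imports:
        vec[bucket(dll, func, size)] = 1
    return vec
```

Python's built-in `hash()` is randomized per process for strings, so a stable hash from `hashlib` is used to keep each function at a fixed position across runs. Different functions can collide in the same slot, which is an accepted cost of the fixed length.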
II. Malicious code detection based on machine learning
1. Byte view features
- File size
- Visible strings (look for malicious process names, malicious strings, etc.)
- Program entropy
- Grayscale image of the PE file binary
- Byte histogram
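Two of the byte view features above, the byte histogram and program entropy, can be computed directly from the raw bytes. The sketch below uses a normalized 256-bin histogram and Shannon entropy in bits per byte; the function names are illustrative.

```python
import math
from collections import Counter
from typing import List


def byte_histogram(data: bytes) -> List[float]:
    """Relative frequency of each byte value 0..255."""
    counts = Counter(data)
    total = len(data) or 1  # avoid division by zero on empty input
    return [counts.get(value, 0) / total for value in range(256)]


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, ranging from 0.0 to 8.0."""
    return -sum(p * math.log2(p) for p in byte_histogram(data) if p > 0)
```

Packed or encrypted content tends to have entropy close to 8, while plain text and padding score much lower, which is why entropy is often used as a hint that a file is packed.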
2. Assembly view features
- Opcode characteristics
- Register characteristics
- Features of functions
- Data definition features
3. PE View features
- PE structural features (number of API calls, number of DLLs called, number of imported functions, number of exported functions, starting virtual address and virtual size of each section, language, encoding, and other important structural features)
(1) PE metadata: extract the numeric fields of the PE information into a 256-dimensional array. Each position in the array represents a fixed type of information, and the value for that type, such as the compilation timestamp, is filled into the corresponding position.
(2) Property values: test whether the PE file satisfies each of a set of properties, mark each result as 1 or 0, and form the corresponding vector.
- Anti-detection engine features (can be matched with YARA rules)
Each YARA description, or rule, consists of a set of strings and a Boolean expression that defines its logic. YARA rules can be applied to files or to running processes to help researchers identify whether a sample belongs to a malware family that has already been described by rules.
YARA rules can be complex and powerful: they support wildcards, case-insensitive strings, regular expressions, special operators, and other features.
- Compilation features
The compile time and environment of the software (from the version information in the resources) and packer information
- Malicious APIs
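The PE metadata and property-value items, (1) and (2) above, can be illustrated with a small standard-library parser for the COFF file header. This is only a sketch: the slot layout in `meta_vector` and the three properties in `property_vector` are assumptions made for the example, not the feature set of any particular paper or tool.

```python
import struct
from typing import Dict, List

IMAGE_FILE_EXECUTABLE_IMAGE = 0x0002  # COFF Characteristics flag
IMAGE_FILE_DLL = 0x2000               # COFF Characteristics flag


def parse_coff_header(data: bytes) -> Dict[str, int]:
    """Read the 20-byte COFF file header that follows the 'PE\\0\\0' signature."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)  # offset of the PE signature
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    fields = struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)
    names = ("machine", "num_sections", "timestamp", "symtab_ptr",
             "num_symbols", "opt_header_size", "characteristics")
    return dict(zip(names, fields))


def meta_vector(header: Dict[str, int], size: int = 256) -> List[int]:
    """(1) Fixed-position numeric features; the slot assignment is hypothetical."""
    vec = [0] * size
    vec[0] = header["machine"]
    vec[1] = header["num_sections"]
    vec[2] = header["timestamp"]  # compilation timestamp
    return vec


def property_vector(header: Dict[str, int]) -> List[int]:
    """(2) 0/1 flags: is executable image, is DLL, has a zeroed timestamp."""
    flags = header["characteristics"]
    return [
        int(bool(flags & IMAGE_FILE_EXECUTABLE_IMAGE)),
        int(bool(flags & IMAGE_FILE_DLL)),
        int(header["timestamp"] == 0),
    ]
```

Inputs shorter than the DOS header raise `struct.error`; a production feature extractor would more likely rely on a dedicated PE parsing library and cover many more fields.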

4. Function names (call graph, CG)
III. Malicious code detection based on deep learning
Convert malicious code into images, then use a deep learning model for feature extraction, learning, and classification.
- Deep learning far outperforms non-deep-learning algorithms in image recognition, speech recognition, machine translation, and other fields, so malware is converted into the corresponding form of data (images) and advanced models from those fields are applied.
- Transfer learning: reuse the more mature models from those fields here.
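The conversion from a binary to an image can be as simple as treating every byte as one grayscale pixel. The sketch below assumes a fixed row width of 256 pixels and zero padding of the last row, both of which are choices made for the example; `to_pgm` writes the binary PGM (P5) format so the result can be opened in an image viewer without third-party libraries.

```python
import math
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Reshape bytes into rows of `width` pixels (0-255), zero-padding the last row."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = math.ceil(len(data) / width)
    padded = data + b"\x00" * (rows * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(rows)]


def to_pgm(image: List[List[int]]) -> bytes:
    """Serialize the pixel rows as a binary PGM (P5) image."""
    height = len(image)
    width = len(image[0]) if image else 0
    header = b"P5\n%d %d\n255\n" % (width, height)
    return header + bytes(pixel for row in image for pixel in row)
```

Before being fed to an image model, such an array would typically be resized to the fixed input shape the model expects.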
Next steps:
1. Overlay features
2. YARA rules
https://github.com/Yara-Rules/rules/tree/master/malware
3. Compilation features
4. Supplement the attribute values in the PE structural features with additional ones
https://github.com/evilsocket/ergo-pe-av/blob/master/attribute.names
IV. New ideas for bypassing detection
1、

2、
(1) Add an unused function to the import address table
(2) Manipulate existing section names (using names taken from benign samples)
(3) Create a new (unused) section
(4) Append bytes to the slack space at the end of a section
(5) Create a new entry point that immediately jumps to the original entry point
(6) Manipulate (break) the signature
(7) Manipulate debug information
(8) Pack or unpack the file
(9) Modify (break) the header checksum
(10) Append bytes to the overlay (the end of the PE file)

V. References
Binary malicious sample detection based on deep learning: https://zhuanlan.zhihu.com/p/31232907
Malware classification method based on static multi-feature fusion: http://www.infocomm-journal.com/cjnis/CN/article/downloadArticleFile.do?attachType=PDF&id=166605
Practical Malware Analysis, Chapter 1: Fundamentals of static analysis: https://blog.csdn.net/u013687652/article/details/45288925
Malicious code detection technology based on machine learning: https://blog.csdn.net/Eastmount/article/details/107420755
YARA rule overview: https://blog.csdn.net/qq_36090437/article/details/79911375
How to use machine learning to create a malware detection system: https://www.freebuf.com/articles/system/205444.html
Defcon "Hacker competition" champion tells: How to make 50 A malicious file AI System?: https://www.secrss.com/articles/13984
Evading Machine Learning Malware Classifiers: https://towardsdatascience.com/evading-machine-learning-malware-classifiers-ce52dabdb713
Evading Machine Learning Malware Detection: https://www.blackhat.com/docs/us-17/thursday/us-17-Anderson-Bot-Vs-Bot-Evading-Machine-Learning-Malware-Detection-wp.pdf
Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning: https://arxiv.org/pdf/1801.08917.pdf
Code: https://github.com/endgameinc/gym-malware