Static analysis of malicious code

2022-06-13 00:34:00 【P1n9】

I. Traditional detection methods

1. Detection based on malicious code signatures

Signatures (characteristic values) are extracted and selected as follows:
(1) Specific substrings: strings extracted from a virus that are unique to it, such as specific messages or specific signature text. For example, the Stoned (Marijuana) virus displays the message “Your PC is now Stoned!”.
(2) Infection markers: a marker the virus writes so that it does not infect the same target twice. For example, Black Friday (the Jerusalem virus) uses “sUMsDOS”.
(3) Byte strings: starting at a specific location in the virus code, take a contiguous byte string of no more than 64 bytes that contains no spaces (ASCII value 32). The extracted signature must not also occur in normal software, otherwise it will cause false alarms. It can be extracted in two ways:
a) Manual extraction
– An antivirus engineer analyzes the virus sample and identifies the signature by hand.
b) Automatic extraction
– A software system automatically extracts data with certain characteristics within a given range and length.
– The drawback is that automatic extraction can be abused by attackers to cause false positives (mistaken kills). For example, antivirus software may be led to detect and delete Windows core files: if the extraction makes a mistake, an attacker can craft adversarial samples so that normal programs or files are killed by mistake.
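
Rule (3) and the false-alarm check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`extract_signature`, `is_usable`, `scan`) are hypothetical, the signature is matched as a plain substring, and the benign corpus is just a list of byte strings held in memory.

```python
from typing import Dict, List

MAX_SIGNATURE_LEN = 64  # upper bound stated in the text
SPACE = 0x20            # ASCII 32, not allowed inside a signature


def extract_signature(code: bytes, offset: int,
                      max_len: int = MAX_SIGNATURE_LEN) -> bytes:
    """Take a contiguous byte string starting at `offset`, stopping at the
    first space byte or after `max_len` bytes, whichever comes first."""
    window = code[offset:offset + max_len]
    end = window.find(bytes([SPACE]))
    return window if end == -1 else window[:end]


def is_usable(signature: bytes, benign_corpus: List[bytes]) -> bool:
    """Reject empty signatures and signatures that also occur in
    known-good files, since those would cause false alarms."""
    return bool(signature) and not any(signature in sample
                                       for sample in benign_corpus)


def scan(data: bytes, signatures: Dict[str, bytes]) -> List[str]:
    """Return the names of all signatures found in `data`."""
    return [name for name, sig in signatures.items() if sig in data]


# Usage: extract a candidate, vet it, then scan another buffer with it.
candidate = extract_signature(b"\x90\x90sUMsDOS\x00\x01 rest", offset=2)
if is_usable(candidate, benign_corpus=[b"an ordinary program"]):
    hits = scan(b"...sUMsDOS\x00\x01...", {"jerusalem-like": candidate})
```

A real engine would index many signatures at once (for example with a multi-pattern matcher) instead of testing them one by one; the linear scan here is chosen only for readability.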

2. Detection using file fingerprints (checksums)

Automatic system monitoring

This technique is widely used in antivirus software. The checksums are kept resident in memory, and whenever an application starts, its current checksum is automatically compared with the previously saved one. If they differ, the data has been tampered with and a warning is shown.

Dedicated checking tools

Compute the checksum of the target file while it is in a known-good state, write that value into the file itself or into the checking tool, and compare against it later. MD5Checker is one example.

Self-checking

Some applications check themselves. QQ, for example, reports that it has been tampered with if its data is modified. The checksum verification is built into the application as a self-check function: the checksum of the file in its normal state is written into the file itself, and at startup the application compares its current checksum with the original one.
Note: it is not necessary to checksum the entire file; the check may cover only the file header or part of the file content.
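
All three variants reduce to the same comparison of a current checksum against a saved one. The sketch below assumes SHA-256 from Python's `hashlib` (the text mentions MD5-based tools; SHA-256 is swapped in here as an assumption), and the optional `length` parameter illustrates the note about checking only a prefix such as the file header. The function names are hypothetical.

```python
import hashlib
from typing import Optional


def checksum(data: bytes, length: Optional[int] = None) -> str:
    """SHA-256 of `data`, or of only its first `length` bytes
    (for example, just the file header)."""
    region = data if length is None else data[:length]
    return hashlib.sha256(region).hexdigest()


def verify_file(path: str, expected: str,
                length: Optional[int] = None) -> bool:
    """Compare a file's current checksum with the value that was saved
    while the file was in a known-good state."""
    with open(path, "rb") as handle:
        return checksum(handle.read(), length) == expected
```

Checking only a prefix is cheaper, but any modification outside the checked region goes unnoticed, so the choice of region is a trade-off.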

3. Look for malicious strings

4. Analyze PE file headers and sections

5. Analyze linked libraries and functions, that is, the import table

Method 1: hash each library (file) name and function name in the IAT (import address table) of the PE file into the range 0 to 255. If a function appears in the file, the corresponding position is set to 1. Because the hash is fixed, each function always maps to the same position, and the result is a 256-element array.
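
A minimal sketch of Method 1, assuming the imports have already been read out of the PE file as `(library, function)` pairs by some PE parser. The choice of MD5's first byte as the bucket is an assumption made for illustration; the text only requires that the mapping into 0 to 255 be fixed. Python's built-in `hash()` is avoided because it is salted per process for strings, so positions would not be stable between runs.

```python
import hashlib
from typing import Iterable, List, Tuple

VECTOR_SIZE = 256


def bucket(library: str, function: str) -> int:
    """Map a (library, function) pair to a fixed position in 0..255.
    The library name is lower-cased because DLL names are
    case-insensitive on Windows."""
    key = f"{library.lower()}:{function}".encode("utf-8")
    return hashlib.md5(key).digest()[0]


def import_vector(imports: Iterable[Tuple[str, str]]) -> List[int]:
    """Build the 256-element 0/1 array: a position is 1 if any imported
    function hashes to it."""
    vector = [0] * VECTOR_SIZE
    for library, function in imports:
        vector[bucket(library, function)] = 1
    return vector


# Usage with a hand-written import list (hypothetical sample).
features = import_vector([
    ("KERNEL32.dll", "CreateFileA"),
    ("KERNEL32.dll", "WriteFile"),
    ("WS2_32.dll", "connect"),
])
```

With only 256 positions, different functions can collide in the same bucket; that is inherent to this hashing approach and trades precision for a fixed-size input.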

II. Malicious code detection based on machine learning

1. Byte view features

File size
Visible strings (look for malicious process names, malicious strings, and so on)
Program entropy
Grayscale image of the PE file's binary content
Byte histogram
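
Several of these byte-view features can be computed directly from the raw file content. The sketch below is one possible implementation using only the standard library; the function names and the default minimum string length of 4 are assumptions, not something the list above prescribes. File size is simply `len(data)`.

```python
import math
import re
from collections import Counter
from typing import List


def byte_histogram(data: bytes) -> List[int]:
    """Count how often each of the 256 possible byte values occurs."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, between 0.0 and 8.0.
    Packed or encrypted content tends toward the upper end."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((count / total) * math.log2(count / total)
                for count in byte_histogram(data) if count)


def printable_strings(data: bytes, min_len: int = 4) -> List[bytes]:
    """Find runs of at least `min_len` printable ASCII characters."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
```

The extracted strings can then be checked against a list of suspicious process names or other indicators, as the list above suggests.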

2. Assembly view features

Opcode features
Register features
Function features
Data definition features
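
As an illustration of opcode features, the sketch below counts opcode n-grams from a disassembly listing. It assumes a hypothetical listing format in which each line starts with the mnemonic and `;` begins a comment; real disassembler output usually also carries addresses and raw bytes that would have to be stripped first.

```python
from collections import Counter
from typing import Iterable, List


def opcodes(lines: Iterable[str]) -> List[str]:
    """Take the first token of each non-empty line as the opcode
    mnemonic, ignoring anything after a ';' comment marker."""
    result = []
    for line in lines:
        text = line.split(";", 1)[0].strip()
        if text:
            result.append(text.split()[0].lower())
    return result


def opcode_ngrams(ops: List[str], n: int = 2) -> Counter:
    """Count every run of `n` consecutive opcodes."""
    return Counter(tuple(ops[i:i + n]) for i in range(len(ops) - n + 1))


# Usage on a tiny hand-written listing.
listing = ["push ebp", "mov ebp, esp", "  ; prologue done", "MOV eax, 1", "ret"]
counts = opcode_ngrams(opcodes(listing), n=2)
```

The resulting counts can be turned into a fixed-length vector by keeping only a chosen vocabulary of n-grams.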

3. PE View features

PE structural features (number of API calls, number of DLLs referenced, number of imported functions, number of exported functions, starting virtual address and virtual size of each section, language, encoding, and other important structural features)

(1) PE meta information: extract the numerical fields of the PE file into a 256-dimensional array. Each position in the array holds a fixed type of information, and each value is written to the position for its type, for example the compilation timestamp.
(2) Attribute values: check whether the PE file satisfies each of a set of conditions, mark each result as 1 or 0, and form the corresponding vector.
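
The two vectors can be sketched as follows. Everything specific here is an assumption made for illustration: the parsed PE file is represented as a plain dictionary of integers, and the field names in `NUMERIC_FIELDS` and the checks in `BOOLEAN_CHECKS` are hypothetical stand-ins for whatever a real feature set defines.

```python
from typing import Callable, Dict, List

# Hypothetical field order: each position always holds the same kind of value.
NUMERIC_FIELDS = ["timestamp", "number_of_sections", "size_of_code", "entry_point"]

# Hypothetical yes/no checks; a real feature set would define many more.
BOOLEAN_CHECKS: Dict[str, Callable[[Dict[str, int]], bool]] = {
    "has_debug_info": lambda pe: pe.get("debug_size", 0) > 0,
    "has_signature": lambda pe: pe.get("signature_size", 0) > 0,
    "entry_point_is_zero": lambda pe: pe.get("entry_point", 0) == 0,
}


def numeric_vector(pe: Dict[str, int], size: int = 256) -> List[int]:
    """(1) Write each numeric field to its fixed position; unused
    positions and missing fields stay 0."""
    vector = [0] * size
    for position, field in enumerate(NUMERIC_FIELDS):
        vector[position] = pe.get(field, 0)
    return vector


def attribute_vector(pe: Dict[str, int]) -> List[int]:
    """(2) Evaluate each check and record the result as 1 or 0."""
    return [1 if check(pe) else 0 for check in BOOLEAN_CHECKS.values()]
```

Raw values such as timestamps span very different ranges from counts, so a model would normally see them only after scaling.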

Anti-detection engine features (these can be matched with YARA rules)

Each YARA description, or rule, consists of a set of strings and a Boolean expression that defines its logic. YARA rules can be applied to files or to running processes to help researchers determine whether a sample belongs to a malware family that the rules describe.
YARA rules can be complex and powerful: they support wildcards, case-insensitive strings, regular expressions, special operators, and other features.

Compilation features

The time and environment in which the software was compiled (found in the version information in the resources), and packer information

Malicious APIs

4. Function names (call graph, CG)

III. Malicious code detection based on deep learning

Convert malicious code into images, then use a deep learning model to learn features and classify the samples.

Deep learning far outperforms non-deep-learning algorithms in image recognition, speech recognition, machine translation, and other fields. The idea is to convert malware into data of the corresponding form (images) and apply the advanced models from that field.
Transfer learning: reuse the more mature models from those fields here.
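
The conversion step itself is simple: every byte is already a value from 0 to 255, so it can be used directly as a pixel intensity. The sketch below lays the bytes out row by row; the fixed width of 256 and the zero padding of the last row are assumptions, and other layouts are equally possible.

```python
from typing import List


def to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Lay the bytes out row by row; each byte becomes one pixel
    intensity. The last row is zero-padded to `width`."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row.extend([0] * (width - len(row)))
        rows.append(row)
    return rows


# Usage: five bytes at width 2 give three rows, the last one padded.
image = to_grayscale(bytes([1, 2, 3, 4, 5]), width=2)
```

The nested list can then be converted to an array or image object with the library of your choice and resized to the input shape the chosen model expects.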

Next steps:
1. Overlay features

2. YARA rules
https://github.com/Yara-Rules/rules/tree/master/malware
3. Compilation features
4. Supplement the attribute values in the PE structural features with additional ones
https://github.com/evilsocket/ergo-pe-av/blob/master/attribute.names

IV. New ideas for bypassing detection

1.

2.
(1) Add an unused function to the import address table
(2) Manipulate existing section names (using names taken from the sections of benign samples)
(3) Create new (unused) sections
(4) Append bytes to the extra space at the end of a section
(5) Create a new entry point that immediately jumps to the original entry point
(6) Manipulate (break) the signature
(7) Manipulate debug information
(8) Pack or unpack the file
(9) Modify (break) the header checksum
(10) Append bytes to the overlay (the end of the PE file)