
Design and implementation of a server security audit system

2022-06-21 11:21:00 0xtuhao

Security defense before, during, and after a network intrusion should be one linked, end-to-end process, so for detecting and blocking threats on servers, the author believes a unified security audit system is needed. The "audit" here is not about closing the barn door after the horse has bolted; rather, it is part of a "trinity" defense system that combines threat discovery, threat analysis, and threat elimination. The following analyzes the design and implementation of a server security audit system from five aspects: why a security audit is needed, the design philosophy, the implementation approach, its application, and its extension.

Why a security audit?

Just as a system needs port monitoring and service monitoring, we need to station our own "sentry" on the server to understand its security posture in real time. Unlike other operations-monitoring agents, this one is a specialist: dedicated to security monitoring, it differs from a traditional operations agent in performance overhead, functionality, and implementation. So what can a security audit bring us? Why is it a "must-have"?

Server information collection

Consider: a typical server monitoring program collects only hardware information, system performance, service status, and similar data. As for which kernel the machine is running, which processes are active, which ports are open, which users exist, and what is in crontab, the vast majority of monitoring programs cannot collect this, yet it is exactly this information that may reveal the server's security risks or threats at any moment.
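As a sketch of what such a "sentry" might gather, here is a minimal Python snapshot routine. The field names and the login-shell list are illustrative assumptions, not part of any tool described in this article:

```python
import platform
import pwd

def collect_host_snapshot():
    """Gather basic host facts a security agent might report.

    A minimal sketch: kernel release plus all accounts with a real
    login shell (a common first check for unexpected users).
    """
    login_shells = ("/bin/bash", "/bin/sh", "/bin/zsh")
    users = [u.pw_name for u in pwd.getpwall() if u.pw_shell in login_shells]
    return {
        "kernel": platform.release(),  # e.g. "5.15.0-91-generic"
        "login_users": users,
    }
```

A real agent would extend the dict with open ports, running processes, and per-user crontab contents, and ship it to the collector on a schedule.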

Log collection

Typical log collection focuses on business data such as request success rate, PV, and UV, but the attack data hidden in access logs is often drowned out by normal traffic, and at that point conventional log collectors and analyzers cannot surface the intrusion data.

There is another scenario: if the server is compromised, on a lucky day you can log in and find the attack logs; on an unlucky one, the attacker has already deleted history and syslog, and intrusion forensics immediately becomes a level harder. Therefore logs must be forwarded in real time. Only then can security responders or monitoring programs find traces of the intrusion by analyzing the logs, check whether a user's su/sudo records are legitimate, or reconstruct what happened on the machine before it was hacked or before the fault occurred.

Access control check

The services running on servers vary widely, and it is hard for operations staff to understand every service's configuration and open ports, let alone visually inspect and manage access control. Therefore, we need to check whether iptables rules and the access controls of common services are safe and reasonable. Ideally, configuration templates are derived from operating-system or application security baselines, omissions in access control are found by comparison against those templates, and the results are combined with an external vulnerability scanner.

Local vulnerability detection

In 2016, the "Dirty COW" vulnerability (CVE-2016-5195) swept over 95% of Linux systems, letting ordinary users escalate privileges at will; kernel-level vulnerabilities like this hit even the BAT giants. So the question arises: how do we promptly discover local vulnerabilities that cannot be detected remotely, or software packages with known vulnerabilities that have not been upgraded in time? Doesn't that have to depend on a "security sentry" inside the server actively finding them?

Abnormal traffic detection

Suppose one day your server's traffic explodes in the middle of the night: not an inbound DDoS, but a flood of outbound traffic. What happened? Clearly the machine was hacked and turned into someone else's DDoS zombie, while you stare in confusion and, after hours of investigation, still cannot tell how it was compromised. Sometimes, even where traffic monitoring exists, it only watches inbound traffic. Worse still, once the machine has become a bot, the carrier traces it to your IDC rack and simply pulls the network cable, or CNCERT comes knocking; either way, you cannot pretend nothing happened.

The above explains why a security sentry is indispensable. Some readers will say: why not just find an open-source security monitoring program and install it? Well, here come the questions. Before these open-source security monitors run in your production environment, did you audit their code? How much system performance do they consume? Can their monitoring and alerting integrate with your company's existing operations alerting, rather than just sending an email? Manual installation works when you have a handful of servers, but when you have thousands, tens of thousands, or even a hundred thousand, will you still install by hand? And what happens when a new version needs to be rolled out?

How to design a security audit system

Therefore, the security audit system needs to be redefined and redesigned: it must fit into the company's existing operations infrastructure, integrate with existing mass-deployment and monitoring/alerting mechanisms, and be introduced into production only after an organized code audit and performance testing. Moreover, once the deployment scale grows, the system's scheduling and performance become architectural concerns. Starting from the company's actual situation and combining it with industry experience, we need to build a security audit system that fits our own needs.

In the author's view, an enterprise-grade security audit system must consider at least the following aspects:

Functionality

A security audit should include at least log auditing and system security auditing. Log auditing can collect logs from any application; system security auditing can obtain the server's kernel version, process status, open ports, users and crontab tasks, and installed package versions with their configurations. The data is formatted and normalized through custom parsing; call this data cleaning. Next, the data is sent to unified storage and indexed for later analysis; call this data storage. Finally, the various consumers retrieve data through the interface the unified storage provides, analyze it along each dimension, and produce unified reports; call this data analysis.
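The "data cleaning" step above can be sketched as a small normalizer. The target schema (ts/host/type/payload) is hypothetical, chosen purely for illustration:

```python
import json
import time

def clean_record(raw, hostname):
    """Normalize one raw audit item into a unified storage schema.

    The schema here (ts/host/type/payload) is an illustrative
    assumption; a real pipeline would match its storage's mapping.
    """
    return json.dumps({
        "ts": raw.get("ts") or int(time.time()),   # stamp if source had none
        "host": hostname,
        "type": raw.get("type", "unknown"),
        "payload": {k: v for k, v in raw.items() if k not in ("ts", "type")},
    }, sort_keys=True)
```

Emitting one JSON document per record keeps the downstream indexer simple: every source, from crontab dumps to port lists, arrives in the same shape.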

Considering the need for functional iteration, the security audit system should be easy to extend, from the architecture down to individual components.

Operation and maintenance

To support large-scale deployment and upgrades while keeping track of each component's running state, the security audit system must be easy to deploy and easy to upgrade.

Performance

Without doubt, security log analysis is itself an application of big data. To guarantee real-time or offline processing of large data volumes, the system design should be forward-looking, and data-processing performance must be a baseline guarantee.

In addition, a qualified security audit system should include at least the following components:

Client:

Deployment, updates, and data collection, analysis, and presentation all need to be considered.

The client's architecture needs to be chosen: client/server or something else; run from crontab or as a daemon.

Whether to build in-house or adopt open source must be decided; without in-house development capability, adopt an open-source implementation with strong community support and extensibility.

Collector:

The collector must be high-performance and highly available. Facing massive log volumes, collector performance matters most: once data is lost or latency grows, downstream data piles up and the "bucket effect" (the weakest component caps the whole pipeline) becomes glaring.
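One common way to keep a slow sink from stalling producers and triggering the pile-up described above is a bounded buffer that counts drops instead of blocking. A minimal Python sketch (the size and drop policy are illustrative choices, not a prescription):

```python
import queue

class BoundedCollector:
    """Sketch of a collector stage with backpressure handling.

    When the downstream sink is slow, events are dropped and counted
    rather than letting an unbounded backlog build up.
    """
    def __init__(self, maxsize=1000):
        self.q = queue.Queue(maxsize=maxsize)
        self.dropped = 0

    def submit(self, event):
        """Accept one event; return False (and count it) if the buffer is full."""
        try:
            self.q.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1  # visible drop counter beats a silent stall
            return False

    def drain(self, n):
        """Pull up to n buffered events for forwarding to storage."""
        out = []
        for _ in range(n):
            try:
                out.append(self.q.get_nowait())
            except queue.Empty:
                break
        return out
```

Monitoring the `dropped` counter then becomes part of the collector's own health metrics: a rising value is an early warning that the pipeline is undersized.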

Storage:

Massive data storage: large capacity and high I/O performance.

Analyzer:

For data analysis, the most important output is common reports; for example, server information can be aggregated into an overall analysis table to support decision making.

Scheduler:

With so many components, task scheduling and configuration push need a "brain" to manage them. Of course, each component may itself be a subsystem with its own scheduling layer; the two do not conflict. Instead, this layers the system and reduces coupling.

To make the implementation discussion easier to follow, a simple architecture diagram of the security audit system is provided below.

In the diagram, log auditing records user commands and critical syslog entries and forwards them to the log receiver, while system security auditing sends server security status reports to the corresponding receiver via local information collection and vulnerability scanning.

After processing and analysis, the receiving end turns the key information into reports and triggers alarms according to the alerting rules.

How to implement a security audit system

From the simple architecture diagram above we can see the technology choices; the major components mentioned earlier are all covered.

Someone may ask about the route-selection question: build in-house or adopt open source? In the author's view, adopting open source without secondary development and without localizing it to the company's actual situation is simply irresponsible. So, in terms of implementation cost, the best course is to survey open-source solutions first, then do secondary development tailored to the company's situation.

Security audit tool :client

Speaking of open-source security audit tools, the best known in the industry are perhaps the Cisofy-led Lynis and the community-led OSSEC. Each has its merits; must we pick one of the two? The author believes there is no single answer. This article compares them and then offers advice.

System security audit

Lynis: a shell-based system security audit tool for *nix systems

Installation:

Debian: apt-get install lynis

OSSEC: a cross-platform host intrusion detection system

Installation:

Debian: apt-get install ossec-hids / ossec-hids-agent
Windows: ossec-win32/64-agent.exe

Functional comparison

Lynis working principle

Ossec working principle

Client function demonstration

Lynis

Ossec

From the comparison above you will find that in functionality and platform compatibility, OSSEC beats Lynis, notably with Windows support; and since OSSEC is a client/server architecture, its real-time performance as a HIDS is also better. However, if you want to audit OSSEC's own code, it is written in C; and if you would rather not run an agent on your servers, OSSEC requires one. So how to choose? The author's approach is to exploit the strengths of both: Lynis's strong customizability, low performance overhead, and pure-shell implementation, and OSSEC's cross-platform support and real-time capability. Deploying different tools on different platforms does create heterogeneity, but remember that the two solutions share a junction point, namely Lynis+ELK and OSSEC+ELK. We can deploy by platform: OSSEC on Windows, Lynis on Linux, send the data to ELK, and leave the rest of the real-time log analysis and alerting to ELK.

Log audit

Both Windows and Linux can forward their logs to a central syslog receiver. OSSEC itself already supports Windows log auditing, so log auditing here mainly targets the Linux platform. It is easy to implement: just adjust the rsyslog and shell-profile configuration.

rsyslog configuration: forward syslog messages for the specified facilities

echo 'kern.*;security.*;auth.info;authpriv.info;user.info @x.y.z.com:514' > /etc/rsyslog.d/logaudit.conf && /etc/init.d/rsyslog force-reload

User behavior log: record commands executed by users

echo "export PROMPT_COMMAND='{ echo \"HISTORY:PID=\$\$ PPID=\$PPID SID=\$\$ USER=\${USER} CMD=\$(history 1 | tr -s [[:blank:]] | cut -d\" \" -f 3-100)\" ; } | logger -p user.info'" > /etc/profile.d/logger_userlog.sh
source /etc/profile.d/logger_userlog.sh

Demonstration:

The figure below shows that a command executed by a user has been recorded and forwarded:

The figure above shows a SYN flooding attack alarm; the figure below shows a kernel-level alarm also being triggered, which is useful for hardware alerts as well.
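Once the HISTORY records reach the log receiver, parsing them back into fields is straightforward. A minimal sketch matching the PROMPT_COMMAND format shown earlier (the surrounding syslog prefix is assumed, not prescribed):

```python
import re

# Matches the record layout emitted by the PROMPT_COMMAND hook above.
HISTORY_RE = re.compile(
    r"HISTORY:PID=(?P<pid>\d+) PPID=(?P<ppid>\d+) SID=(?P<sid>\d+) "
    r"USER=(?P<user>\S+) CMD=(?P<cmd>.*)"
)

def parse_history_line(line):
    """Extract pid/ppid/sid/user/cmd from one forwarded syslog line,
    or return None if the line is not a HISTORY record."""
    m = HISTORY_RE.search(line)
    return m.groupdict() if m else None
```

A receiver-side filter like this is what lets the analyzer answer questions such as "which commands did this user run before the incident?" directly from the forwarded stream.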

Collection, storage, and analysis: collector, storage, analyzer

ELK (Logstash-Elasticsearch-Kibana) is currently one of the best-known log processing stacks. It is open source and feature-rich, and with Elastic, the company behind it, its prospects are excellent. Log collection, storage, and analysis here are implemented with this architecture.

Core functions: storage, analysis, and display

Dependencies:

logstash-2.3.2

elasticsearch-2.3.2

zookeeper

kafka

kibana_4.5.1

Data processing flow diagram :

The roles are as follows:

Logstash: collection; instances are divided into shipper/receiver roles

Elasticsearch: storage; supports real-time query (indexing) and an API

Kibana: the UI, for data analysis and presentation

Kafka: a distributed publish/subscribe messaging system, used here as message-caching middleware for activity-stream and operational-metrics data

ZooKeeper: coordination; distributed applications can build synchronization, configuration maintenance, and naming services on top of it

ELK demonstration

But is that enough? Did you notice anything missing? Careful readers will have spotted Hadoop in the initial architecture diagram. So why use Hadoop?

Because the ELK stack's strength is real-time log retrieval, data that is not real-time and can be analyzed offline does not need to sit in ELK all the time. After all, Kibana's front-end performance is not great, so we can take another path: offload batch offline processing to Hadoop, opening a new route for massive-scale data analysis.

Here is a Hadoop application example; combined with Python's mrjob library, it can perform custom analysis.

Hadoop Offline analysis log

from mrjob.job import MRJob
from mrjob.step import MRStep
import heapq

class UrlRequest(MRJob):
    # Two-step job: count requests per URL, then emit the top 10.
    def steps(self):
        return [MRStep(mapper=self.mapper, reducer=self.reducer_sum),
                MRStep(reducer=self.reducer_top10)]

    def mapper(self, _, line):
        fields = line.split()
        if len(fields) > 6:          # URL is the 7th field of an access log line
            yield fields[6], 1

    def reducer_sum(self, url, counts):
        yield None, (sum(counts), url)

    def reducer_top10(self, _, pairs):
        for count, url in heapq.nlargest(10, pairs):
            yield url, count

if __name__ == '__main__':
    UrlRequest.run()

Hadoop demonstration

Task scheduling and background management :scheduler

Python + Flask + Bootstrap

Flask provides a RESTful API for creating, reading, updating, and deleting reports:

# GET one resource by name
@app.route('/language/<string:name>')

# POST: create a resource
@app.route('/language', methods=['POST'])

# PUT/PATCH: update a resource.
# PUT requires the client to send the complete resource after the change;
# PATCH requires only the attributes that change.
# PATCH semantics are used uniformly here.
@app.route('/language/<string:name>', methods=['PUT', 'PATCH'])

# DELETE: remove a resource
@app.route('/language/<string:name>', methods=['DELETE'])

Because the scheduler has to be written in-house, for high availability and high performance it is recommended to front it with haproxy/nginx + keepalived. This needs no code changes, only an extra forwarding layer in front of the WSGI deployment.

Take Nginx as an example configuration template:

# log_format must be declared in the http context, not inside a location
log_format postdata '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $bytes_sent '
                    '"$http_referer" "$http_user_agent" "$request_body"';

location / {
    proxy_connect_timeout 75s;
    proxy_read_timeout 300s;
    try_files $uri @gunicorn_proxy;
}
location @gunicorn_proxy {
    access_log /home/test/var/log/access.log postdata;
    proxy_read_timeout 300s;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $http_host;
    proxy_redirect off;
    proxy_pass http://127.0.0.1:8001;
}

Core functions: scheduling system demonstration

Operations tooling: opsys

Puppet, Ansible, or SaltStack can be used. Considering real-time behavior and scalability, Puppet or SaltStack is recommended; Ansible is better suited to low-repetition work such as initialization.

puppet

Modularize the relevant functions; when wiring hosts up, simply include the relevant classes:

lynis class: *nix system security audit

logaudit class: log security audit

ossec class: Windows system security audit

How to apply the security audit system

Regular security inspection reports

Send server audit reports to the relevant staff on a schedule so they stay aware of the servers' security posture. In addition, if a security risk is found, they can handle it together, forming a healthy feedback loop.

Inspection content

Dangerous-command alerting

message:chattr OR message:"touch -r" OR message:"pty.spawn" OR message:"nc -l" OR message:"*etc*passwd" OR message:SimpleHTTPServer OR message:http.server OR message:"ssh -D" OR message:"bash -i" OR message:"useradd"
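The alert query above can be approximated on the analysis side with a simple pattern list. The regexes below mirror the query terms; translating the wildcard `*etc*passwd` into `etc.*passwd` is an assumption about the intended match:

```python
import re

# Patterns mirroring the alert query's terms, one regex per term.
DANGEROUS = [
    r"\bchattr\b", r"touch -r", r"pty\.spawn", r"nc -l",
    r"etc.*passwd", r"SimpleHTTPServer", r"http\.server",
    r"ssh -D", r"bash -i", r"\buseradd\b",
]
DANGEROUS_RE = re.compile("|".join(DANGEROUS))

def is_dangerous(cmd):
    """Return True when a recorded user command matches any alert pattern."""
    return DANGEROUS_RE.search(cmd) is not None
```

Keeping the same pattern list in both the Kibana query and an offline checker means the batch analysis and the real-time alerting flag the same behavior.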

Vulnerability detection

Authentication, file-system browsing, application vulnerabilities, startup services, firewall, rsync service, web server, database, file-system permissions, kernel checks, ssh configuration, etc.

Integration with the existing CMDB and operations monitoring

Use the vulnerable machine's IP, application type, and ownership information to decide where alarms and inspection reports are routed.

Self-monitoring plus external monitoring ensures the availability and performance of the audit system itself.

Combined with port scanning and vulnerability scanning

For example, a batch check for the "Dirty COW" vulnerability can be done by collecting all servers' kernel versions and comparing them.
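A batch kernel-version comparison might look like the sketch below. The per-series fixed versions listed are the mainline stable releases carrying the CVE-2016-5195 fix; distros backport fixes into older version strings, so treat the table as illustrative and adapt it to your environment:

```python
def parse_kernel(release):
    """Turn a release string like '4.4.0-21-generic' into (4, 4, 0)."""
    base = release.split("-")[0]
    return tuple(int(x) for x in base.split("."))

# Illustrative mainline fixed versions for CVE-2016-5195 per stable series.
# Series not listed here return False and need manual review.
DIRTY_COW_FIXED = {(4, 8): (4, 8, 3), (4, 7): (4, 7, 9), (4, 4): (4, 4, 26)}

def dirty_cow_vulnerable(release):
    """Flag kernels below the fixed version of their stable series."""
    v = parse_kernel(release)
    fixed = DIRTY_COW_FIXED.get(v[:2])
    return fixed is not None and v < fixed
```

Run against the kernel versions the agents already collect, this yields the list of hosts to patch without touching the machines a second time.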

Arbitrary log analysis

1. Analyze the web server's access.log to find sensitive statements indicating SQL injection or getshell attempts. Recommended stacks: Nginx+Hadoop or Nginx+ELK.

2. Honeypot log analysis: after deploying honeypots inside and outside the corporate network, use them to collect attack signatures and gather intelligence. Recommended stack: Beeswarm+ELK.

Extension and outlook

Functional iteration

1. Functionally, this solution is undoubtedly an integration of existing tools, but the user experience still needs optimizing and client heterogeneity reducing, which can be achieved through secondary development of Lynis or OSSEC.

2. Should the system push patches, or only provide remediation advice? This touches on system positioning. At the beginning of this article we said the architecture should be a "trinity" of detection + analysis + blocking, yet the current solution requires manual blocking after detection and analysis. So the system needs extending: for example, integrate with the gateway to drop traffic matching attack signatures, or integrate with a blacklist system, reporting discovered attack signatures so that other applications can query the blacklist as a basis for local filtering. Here, web applications can use a web application firewall (WAF), and database applications can intercept traffic with a DB proxy such as Mycat.

Build a security knowledge base

Through this system we will find many system- and application-level vulnerabilities, so effective remediation becomes the next problem to solve. The approach is to build a security knowledge base from industry experience and the company's own practice, providing unified security baselines, secure configuration templates, and vulnerability remediation plans, then rely on the company's automated operations framework to push configurations and upgrade systems or applications.

For the security knowledge base, we used to rely on "WooYun" (now shut down); today you can download the offline knowledge base from https://github.com/hanc00l/wooyun_public for research. You can also build your own security knowledge base and configuration templates around public security baseline standards.

Of course, the ultimate approach is still a crawler: python+scrapy, pulling down the knowledge base you want via search engines. See also: http://www.cnblogs.com/buptzym/p/5320552.html

Combining threat intelligence to detect and respond to the latest vulnerabilities quickly

Without doubt, this solution focuses on discovering internal risks and problems. To build a comprehensive defense against security risks and threats, you also need an external scanning system, correlation with CVE databases, and even access to the latest threat-intelligence feeds, so that defensive measures can be taken before something like the MongoDB ransomware problem breaks out at scale. Whether that means batch-checking application configurations or batch-scanning systems, any preparation made in advance counts as a win. The following offers options for an external scanning system, a self-hosted CVE database, and threat-intelligence collection, in the hope of eventually integrating them with this server security audit system to achieve the "trinity" goal of detection + analysis + blocking for security risks and threats.

External scanning system

OpenVAS: https://github.com/mikesplain/openvas-docker

Self-hosted CVE database

cve-search: https://github.com/cve-search/cve-search

Open-source threat intelligence

OSTrICa: https://github.com/Ptr32Void/OSTrICa

The concrete enterprise application of these three options will be covered in detail in future articles. Stay tuned.

References

OSSEC: http://ossec.github.io/

FreeBuf: http://www.freebuf.com/articles/system/21383.html

Lynis: https://cisofy.com/lynis/

ELK: https://www.elastic.co/webinars/introduction-elk-stack

Puppet: https://puppet.com/

wooyun_public: https://github.com/hanc00l/wooyun_public

OpenVAS: https://github.com/mikesplain/openvas-docker

OSTrICa: https://github.com/Ptr32Void/OSTrICa

Mycat: https://github.com/MyCATApache/Mycat-doc

Original article: https://yzsam.com/2022/172/202206211103451090.html