Implementing a sensitive-word filtering system in PHP
2022-07-01 17:14:00 【Full stack programmer webmaster】
Code description
1. Sensitive-word dictionary maintenance script
reload_dict.php reads the word list in dict.txt and rebuilds the trie-tree file from it, so the tree can be regenerated whenever the dictionary changes.
PHP
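<?php
// Raise the memory limit; adjust it to the size of the dictionary
ini_set('memory_limit', '128M');

// Open the sensitive-word dictionary (one word per line)
$handle = fopen('dict.txt', 'r');
if ($handle === false) {
    exit("Cannot open dict.txt\n");
}

// Create an empty trie-tree filter
$resTrie = trie_filter_new();

// fgets() returns false at end of file, which avoids the extra empty read
// that a feof()-driven loop performs
while (($line = fgets($handle)) !== false) {
    $item = trim($line);
    // Skip blank lines; compare with '' because empty('0') is true
    if ($item === '') {
        continue;
    }
    // Add the word to the trie tree
    trie_filter_store($resTrie, $item);
}
fclose($handle);

// Save the trie tree to a file
$blackword_tree = 'blackword.tree';
trie_filter_save($resTrie, $blackword_tree);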
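To make the data structure behind the script above more concrete, here is a minimal pure-PHP sketch of a trie that stores words byte by byte and reports every match as an [offset, length] pair. It is only an illustration: SimpleTrie, store() and searchAll() are hypothetical names, not part of the trie_filter extension, and the extension's own matching rules (for example, how it treats overlapping words) may differ.

```php
<?php
// Illustrative pure-PHP trie; NOT the trie_filter extension's implementation.
// SimpleTrie, store() and searchAll() are hypothetical names for this sketch.
final class SimpleTrie
{
    // Each node holds its children (keyed by one byte) and an end-of-word flag.
    private array $root = ['next' => [], 'end' => false];

    // Insert a word into the trie, one byte per level.
    public function store(string $word): void
    {
        if ($word === '') {
            return;
        }
        $node = &$this->root;
        $len = strlen($word);
        for ($i = 0; $i < $len; $i++) {
            $byte = $word[$i];
            if (!isset($node['next'][$byte])) {
                $node['next'][$byte] = ['next' => [], 'end' => false];
            }
            $node = &$node['next'][$byte];
        }
        $node['end'] = true;
        unset($node);
    }

    // Return every stored word found in $text as [byte offset, byte length].
    public function searchAll(string $text): array
    {
        $matches = [];
        $len = strlen($text);
        for ($start = 0; $start < $len; $start++) {
            $node = $this->root;
            for ($i = $start; $i < $len; $i++) {
                $byte = $text[$i];
                if (!isset($node['next'][$byte])) {
                    break;
                }
                $node = $node['next'][$byte];
                if ($node['end']) {
                    $matches[] = [$start, $i - $start + 1];
                }
            }
        }
        return $matches;
    }
}

$trie = new SimpleTrie();
foreach (['bad', 'badge', '敏感'] as $word) {
    $trie->store($word);
}
$hits = $trie->searchAll('a badge of 敏感词');
echo json_encode($hits), "\n"; // [[2,3],[2,5],[11,6]]
```

Because the sketch walks the text once per starting byte, lookup cost depends on the text length and the longest stored word, not on the number of words in the dictionary, which is the usual reason for choosing a trie for this kind of filter.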
2. Helper class for obtaining the trie-tree object
FilterHelper.php provides access to the trie-tree object. It caches the loaded tree so that it is not rebuilt on every call, and reloads it when the tree file changes, which keeps the in-memory tree in sync with the sensitive-word dictionary.
PHP
<?php
/**
 * Filter helper
 *
 * getResTrie provides the trie-tree object;
 * getFilterWords extracts the matched words from a string
 *
 * @author W.Y.P
 */
class FilterHelper
{
    // Cached trie-tree object
    private static $_resTrie = null;
    // Modification time of the tree file when it was last loaded
    private static $_mtime = null;

    /**
     * Prevent instantiation
     */
    private function __construct() {}

    /**
     * Prevent cloning
     */
    private function __clone() {}

    /**
     * Provide the trie-tree object, reloading it if the tree file has changed
     *
     * @param string $tree_file Path of the trie-tree file
     * @param int $new_mtime Modification time of the tree file at the time of this call
     * @return mixed The trie-tree handle returned by trie_filter_load()
     */
    public static function getResTrie($tree_file, $new_mtime)
    {
        if (is_null(self::$_mtime)) {
            self::$_mtime = $new_mtime;
        }
        if (($new_mtime != self::$_mtime) || is_null(self::$_resTrie)) {
            self::$_resTrie = trie_filter_load($tree_file);
            self::$_mtime = $new_mtime;
            // Log the time at which the dictionary tree was (re)loaded
            echo date('Y-m-d H:i:s') . "\tdictionary reload success!\n";
        }
        return self::$_resTrie;
    }

    /**
     * Extract the matched sensitive words from the original string
     *
     * @param string $str Original string
     * @param array $res Matches as [offset, length] pairs; [1, 3] means
     *                   length 3 starting at position 1 (substr() counts in bytes)
     * @return array Unique matched words
     */
    public static function getFilterWords($str, $res)
    {
        $result = array();
        foreach ($res as $v) {
            $word = substr($str, $v[0], $v[1]);
            if (!in_array($word, $result)) {
                $result[] = $word;
            }
        }
        return $result;
    }
}
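The substr() call in getFilterWords() implies that each match is a pair of byte offset and byte length. The sketch below, which assumes that format, shows the extraction on a UTF-8 string, where each Chinese character occupies three bytes. extractWords() and the sample $matches are hypothetical stand-ins for the helper and for a real search result.

```php
<?php
// Hypothetical helper mirroring getFilterWords(); $matches below is
// hand-written sample data, assumed to be [byte offset, byte length] pairs.
function extractWords(string $text, array $matches): array
{
    $words = [];
    foreach ($matches as [$offset, $length]) {
        // Use the word as an array key so duplicates collapse without in_array().
        $words[substr($text, $offset, $length)] = true;
    }
    // PHP turns numeric-string keys into integers, so cast them back.
    return array_map('strval', array_keys($words));
}

// "敏感词" starts at byte 6 and again at byte 23; it is 9 bytes long (3 x 3).
$text = '这是敏感词, 又是敏感词';
$matches = [[6, 9], [23, 9]];

$found = extractWords($text, $matches);
echo implode(', ', $found), "\n"; // 敏感词
```

Under this assumption a character-based function such as mb_substr() would cut at the wrong positions, since offset 6 is the third character in bytes but the seventh in characters.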
3. HTTP interface for external filtering requests
filter.php uses Swoole to expose the filter as an HTTP interface that external callers can query.
PHP
<?php
// Maximum memory for the script; adjust it to the size of the dictionary
ini_set('memory_limit', '512M');
// Set the time zone
date_default_timezone_set('Asia/Shanghai');
// Load the helper class
require_once('FilterHelper.php');
// IP address and port the HTTP server binds to
$serv = new swoole_http_server("182.92.177.16", 9502);
/**
 * Handle requests
 */
$serv->on('Request', function($request, $response) {
    // Read the GET parameter
    $content = isset($request->get['content']) ? $request->get['content'] : '';
    $result = '';
    if (!empty($content)) {
        // Path of the dictionary tree file (relative to the working directory)
        $tree_file = 'blackword.tree';
        // Clear the file status cache so filemtime() returns a fresh value
        clearstatcache();
        // Modification time of the dictionary tree file at the time of the request
        $new_mtime = filemtime($tree_file);
        // Get the latest trie-tree object
        $resTrie = FilterHelper::getResTrie($tree_file, $new_mtime);
        // Search the content for all sensitive words
        $arrRet = trie_filter_search_all($resTrie, $content);
        // Extract the matched sensitive words
        $a_data = FilterHelper::getFilterWords($content, $arrRet);
        $result = json_encode($a_data);
    }
    // Set the response cookie and headers, then send the result
    $response->cookie("User", "W.Y.P");
    $response->header("X-Server", "W.Y.P WebServer(Unix) (Red-Hat/Linux)");
    // The header value must not repeat the header name
    $response->header('Content-Type', 'text/html; charset=utf-8');
    $response->end($result);
});
$serv->start();
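The handler above checks the tree file's modification time on every request and lets FilterHelper reload the tree only when that time changes. The pattern can be sketched without the extension as follows; MtimeCache and its $loader callback are hypothetical names, and file_get_contents() stands in for trie_filter_load().

```php
<?php
// Hypothetical sketch of the mtime-based reload idea behind getResTrie();
// MtimeCache and $loader are illustrative names, not part of the article's code.
final class MtimeCache
{
    private $value = null;
    private ?int $mtime = null;
    private $loader;

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    // Return the cached value, reloading it when the file's mtime has changed.
    public function get(string $path)
    {
        clearstatcache(true, $path);
        $mtime = filemtime($path);
        if ($mtime === false) {
            throw new RuntimeException("Cannot stat {$path}");
        }
        if ($this->mtime !== $mtime) {
            $this->value = ($this->loader)($path);
            $this->mtime = $mtime;
        }
        return $this->value;
    }
}

// Demo with a temporary file in place of blackword.tree.
$path = tempnam(sys_get_temp_dir(), 'tree');
file_put_contents($path, 'v1');

$loads = 0;
$cache = new MtimeCache(function (string $p) use (&$loads) {
    $loads++;
    return file_get_contents($p);
});

$first = $cache->get($path);   // first call loads the file
$second = $cache->get($path);  // mtime unchanged, served from memory
file_put_contents($path, 'v2');
touch($path, time() + 10);     // force a different mtime for the demo
$third = $cache->get($path);   // mtime changed, file is reloaded
unlink($path);
```

filemtime() returns whole seconds, so two writes to the tree file within the same second are indistinguishable to this check; the demo uses touch() to force a different timestamp.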
Publisher: Full stack programmer webmaster. When reprinting, please credit the source: https://javaforall.cn/130918.html Original link: https://javaforall.cn