Implementing a sensitive-word filtering system in PHP (worth bookmarking)
2022-07-01 17:14:00 【Full stack programmer webmaster】
Hello everyone, good to see you again. I'm your friend Quan Jun.
Code walkthrough
1. Sensitive-word dictionary maintenance script
reload_dict.php rebuilds the trie-tree file from the sensitive-word dictionary, so that dictionary updates reach the filter automatically.
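To make the data structure concrete, here is a minimal plain-PHP sketch of a trie-based filter. It is not the trie_filter extension's implementation, and it only assumes that the extension reports matches as [offset, length] pairs, which is how the helper class later in this article consumes them. The names trieInsert and trieSearchAll are hypothetical.

```php
<?php
// Illustrative only: a trie stored as nested arrays keyed by single bytes.
// trieInsert and trieSearchAll are hypothetical names, not extension functions.

/** Insert one word into the trie. */
function trieInsert(array &$trie, string $word): void
{
    $node = &$trie;
    $len = strlen($word);
    for ($i = 0; $i < $len; $i++) {
        $byte = $word[$i];
        if (!isset($node[$byte])) {
            $node[$byte] = [];
        }
        $node = &$node[$byte];
    }
    // Multi-character key, so it cannot collide with a single-byte child key
    $node['#end'] = true;
}

/** Return every match in $text as a [byte offset, byte length] pair. */
function trieSearchAll(array $trie, string $text): array
{
    $matches = [];
    $len = strlen($text);
    for ($start = 0; $start < $len; $start++) {
        $node = $trie;
        for ($i = $start; $i < $len; $i++) {
            $byte = $text[$i];
            if (!isset($node[$byte])) {
                break;
            }
            $node = $node[$byte];
            if (isset($node['#end'])) {
                $matches[] = [$start, $i - $start + 1];
            }
        }
    }
    return $matches;
}

// Build a small trie and scan a sample string
$trie = [];
foreach (['bad', 'badword', 'word'] as $w) {
    trieInsert($trie, $w);
}
$text = 'a badword here';
$matches = trieSearchAll($trie, $text); // [[2, 3], [2, 7], [5, 4]]

// Turn the pairs back into words with substr()
$words = [];
foreach ($matches as $m) {
    $words[] = substr($text, $m[0], $m[1]);
}
print_r($words); // bad, badword, word
```

Because the sketch works on bytes, multi-byte UTF-8 words are matched byte by byte, and the offsets it reports are byte offsets suitable for substr().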
PHP
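<?php
// Raise the memory limit; adjust to the dictionary size
ini_set('memory_limit', '128M');

// Open the sensitive-word dictionary (one word per line)
$handle = fopen('dict.txt', 'r');
if ($handle === false) {
    exit("Cannot open dict.txt\n");
}

// Create an empty trie-tree filter
$resTrie = trie_filter_new();

// fgets() returns false at end of file, which avoids the extra pass of a feof() loop
while (($line = fgets($handle)) !== false) {
    $item = trim($line);
    // Compare with '' rather than empty(), so that a word such as "0" is not skipped
    if ($item === '') {
        continue;
    }
    // Add each sensitive word to the trie-tree
    trie_filter_store($resTrie, $item);
}
fclose($handle);

// Save the trie-tree to a file
$blackword_tree = 'blackword.tree';
trie_filter_save($resTrie, $blackword_tree);

2. Helper class for obtaining the trie-tree object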
FilterHelper.php provides access to the trie-tree object. It avoids rebuilding the object on every call, and it reloads the object whenever the tree file changes, which keeps the filter in sync with the sensitive-word dictionary.
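The reload-on-change idea can be tried without the extension. The sketch below caches a plain word list instead of a trie and reloads it only when the file's modification time differs from the one recorded at the last load. DictCache and its members are hypothetical names, and the typed static properties assume PHP 7.4 or later.

```php
<?php
// Illustrative only: reload a word list when the file's mtime changes.
// DictCache is a hypothetical class, not part of the original code.
final class DictCache
{
    /** @var array<string, array{mtime: int, words: string[]}> */
    private static array $cache = [];
    // Number of real reloads, kept only for demonstration
    public static int $loads = 0;

    public static function get(string $path): array
    {
        // Drop PHP's cached stat() result so filemtime() sees the current value
        clearstatcache(true, $path);
        $mtime = @filemtime($path);
        if ($mtime === false) {
            throw new RuntimeException("Cannot read modification time of {$path}");
        }
        if (!isset(self::$cache[$path]) || self::$cache[$path]['mtime'] !== $mtime) {
            $words = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
            if ($words === false) {
                throw new RuntimeException("Cannot read {$path}");
            }
            self::$cache[$path] = ['mtime' => $mtime, 'words' => $words];
            self::$loads++;
        }
        return self::$cache[$path]['words'];
    }
}

// Demonstration with a temporary file
$path = tempnam(sys_get_temp_dir(), 'dict');
file_put_contents($path, "foo\nbar\n");
DictCache::get($path);            // first call loads the file
DictCache::get($path);            // unchanged mtime: served from memory
file_put_contents($path, "baz\n");
touch($path, time() + 10);        // force a different mtime for the demo
$words = DictCache::get($path);   // mtime changed: reloaded
unlink($path);
```

One caveat of this approach: filemtime() has one-second resolution, so two writes within the same second may go unnoticed until the next change.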
PHP
<?php
/**
 * Filter helper
 *
 * getResTrie returns the trie-tree object;
 * getFilterWords extracts the matched words from a string
 *
 * @author W.Y.P ([email protected])
 */
class FilterHelper
{
    // Cached trie-tree object
    private static $_resTrie = null;
    // Modification time of the tree file when it was last loaded
    private static $_mtime = null;

    /**
     * Prevent instantiation
     */
    private function __construct() {}

    /**
     * Prevent cloning
     */
    private function __clone() {}

    /**
     * Return the trie-tree object, reloading it if the tree file has changed
     *
     * @param string $tree_file Path of the trie-tree file
     * @param int    $new_mtime Modification time of the tree file at the time of the call
     * @return mixed Trie-tree object returned by trie_filter_load()
     */
    public static function getResTrie($tree_file, $new_mtime)
    {
        if (is_null(self::$_mtime)) {
            self::$_mtime = $new_mtime;
        }
        if (($new_mtime != self::$_mtime) || is_null(self::$_resTrie)) {
            self::$_resTrie = trie_filter_load($tree_file);
            self::$_mtime = $new_mtime;
            // Log the time of the dictionary reload
            echo date('Y-m-d H:i:s') . "\tdictionary reload success!\n";
        }
        return self::$_resTrie;
    }

    /**
     * Extract the matched sensitive words from the original string
     *
     * @param string $str Original string
     * @param array  $res List of [offset, length] pairs; [1, 3] means length 3
     *                    starting at offset 1, as passed to substr()
     * @return array Unique matched words
     */
    public static function getFilterWords($str, $res)
    {
        $result = array();
        foreach ($res as $v) {
            $word = substr($str, $v[0], $v[1]);
            // Strict comparison, so numeric-looking words are not merged
            if (!in_array($word, $result, true)) {
                $result[] = $word;
            }
        }
        return $result;
    }
}

3. HTTP interface for external filtering
filter.php uses Swoole to expose the filter as an HTTP interface that external callers can query.
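From the caller's side, the interface takes the text in a content GET parameter and answers with a JSON array of matched words, or an empty body when content is missing. A possible client could look like the sketch below. The helper names, the 127.0.0.1 host, and the use of file_get_contents() (which needs allow_url_fopen enabled) are assumptions for illustration and do not come from the original article.

```php
<?php
// Illustrative client helpers for the filter.php endpoint.
// All function names here are hypothetical; host and port are assumptions.

/** Build the request URL, URL-encoding the text to be checked. */
function buildFilterUrl(string $host, int $port, string $content): string
{
    return sprintf('http://%s:%d/?%s', $host, $port, http_build_query(['content' => $content]));
}

/** Decode the response body: a JSON array of words, or an empty string. */
function parseFilterResponse(string $body): array
{
    if ($body === '') {
        return [];
    }
    $words = json_decode($body, true);
    return is_array($words) ? $words : [];
}

/** Query the service; returns an empty array if the request fails. */
function fetchSensitiveWords(string $content, string $host = '127.0.0.1', int $port = 9502): array
{
    $body = @file_get_contents(buildFilterUrl($host, $port, $content));
    return $body === false ? [] : parseFilterResponse($body);
}

// Example of the URL that would be requested
echo buildFilterUrl('127.0.0.1', 9502, 'a b'), "\n"; // http://127.0.0.1:9502/?content=a+b
```

Keeping URL building and response parsing in separate functions makes both easy to check without a running server.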
PHP
<?php
// Set the script's memory limit; adjust to the dictionary size
ini_set('memory_limit', '512M');
// Set the time zone
date_default_timezone_set('Asia/Shanghai');
// Load the helper class
require_once('FilterHelper.php');
// IP address and port the HTTP service binds to
// (recent Swoole releases name this class Swoole\Http\Server)
$serv = new swoole_http_server("182.92.177.16", 9502);
/**
 * Handle requests
 */
$serv->on('Request', function($request, $response) {
    // Read the "content" GET parameter
    $content = isset($request->get['content']) ? $request->get['content'] : '';
    $result = '';
    if (!empty($content)) {
        // Path of the trie-tree file; by default it sits in the current directory
        $tree_file = 'blackword.tree';
        // Clear the file status cache so filemtime() returns a fresh value
        clearstatcache();
        // Modification time of the tree file at the time of this request
        $new_mtime = filemtime($tree_file);
        // Get the latest trie-tree object
        $resTrie = FilterHelper::getResTrie($tree_file, $new_mtime);
        // Search the content for sensitive words
        $arrRet = trie_filter_search_all($resTrie, $content);
        // Extract the matched sensitive words
        $a_data = FilterHelper::getFilterWords($content, $arrRet);
        $result = json_encode($a_data);
    }
    // Set the response cookie and headers, then send the result
    $response->cookie("User", "W.Y.P");
    $response->header("X-Server", "W.Y.P WebServer(Unix) (Red-Hat/Linux)");
    // The header value must not repeat the header name
    $response->header('Content-Type', 'text/html; charset=utf-8');
    $response->end($result);
});
$serv->start();

Publisher: Full stack programmer webmaster. When reprinting, please credit the source: https://javaforall.cn/130918.html Original link: https://javaforall.cn