Implementing a sensitive-word filtering system in PHP (worth bookmarking)
2022-07-01 17:14:00 【Full stack programmer webmaster】
Hello everyone, nice to see you again. I'm your friend Quan Jun.
Code description
1. Sensitive-word dictionary update script
reload_dict.php reads the sensitive-word dictionary and regenerates the trie-tree file from it, so the tree can be refreshed whenever the dictionary changes.
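To make the trie-tree idea concrete, here is a minimal plain-PHP sketch of a trie that stores words byte by byte and reports every match as an [offset, length] pair. It is only an illustration under stated assumptions, not the code of the trie_filter extension used by the scripts in this article: `buildTrie` and `searchAll` are hypothetical names, and the extension's policy for overlapping matches may differ.

```php
<?php
// Plain-PHP stand-in for a trie filter; NOT the trie_filter extension.
// buildTrie() and searchAll() are hypothetical names used for illustration.

/** Build a nested-array trie keyed by single bytes. */
function buildTrie(array $words): array
{
    $root = [];
    foreach ($words as $word) {
        if ($word === '') {
            continue;
        }
        $node = &$root;
        for ($i = 0, $n = strlen($word); $i < $n; $i++) {
            $byte = $word[$i];
            if (!isset($node[$byte])) {
                $node[$byte] = [];
            }
            $node = &$node[$byte];
        }
        // 'end' is three bytes long, so it cannot collide with a one-byte key
        $node['end'] = true;
        unset($node);
    }
    return $root;
}

/** Return every match in $text as a [byte offset, byte length] pair. */
function searchAll(array $trie, string $text): array
{
    $matches = [];
    $len = strlen($text);
    for ($start = 0; $start < $len; $start++) {
        $node = $trie;
        for ($i = $start; $i < $len && isset($node[$text[$i]]); $i++) {
            $node = $node[$text[$i]];
            if (isset($node['end'])) {
                $matches[] = [$start, $i - $start + 1];
            }
        }
    }
    return $matches;
}

$trie = buildTrie(['bad', 'evil']);
print_r(searchAll($trie, 'a bad, evil plan')); // pairs [2, 3] and [7, 4]
```

A lookup walks the tree one byte at a time, so the cost of checking a position depends on the length of the longest stored word rather than on the number of words in the dictionary.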
PHP
<?php
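// Raise the memory limit; adjust to the size of the dictionary
ini_set('memory_limit', '128M');
// Open the sensitive-word dictionary (one word per line)
$handle = fopen('dict.txt', 'r');
if ($handle === false) {
    exit("Cannot open dict.txt\n");
}
// Create an empty trie-tree filter
$resTrie = trie_filter_new();
// fgets() returns false at end of file, which ends the loop
while (($line = fgets($handle)) !== false) {
    $item = trim($line);
    // Skip blank lines (the strict check keeps a word such as "0")
    if ($item === '') {
        continue;
    }
    // Store each sensitive word in the trie-tree
    trie_filter_store($resTrie, $item);
}
fclose($handle);
// Path of the generated trie-tree file
$blackword_tree = 'blackword.tree';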
trie_filter_save($resTrie, $blackword_tree);
2. Helper class for obtaining the trie-tree object
FilterHelper.php provides access to the trie-tree object. It avoids rebuilding the object on every request and keeps the loaded tree in sync with the tree file generated from the sensitive-word dictionary.
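The getFilterWords method in the class that follows consumes matches as [offset, length] pairs and cuts them out with substr(). A worked example may make that convention easier to follow. In this sketch `$matches` is a hand-written stand-in for a search result (an assumption, not captured output), and `extractWords` is a hypothetical name.

```php
<?php
// $matches is a hand-written stand-in for a list of [offset, length] pairs;
// extractWords() is a hypothetical helper used only for this illustration.

function extractWords(string $text, array $matches): array
{
    $words = [];
    foreach ($matches as [$offset, $length]) {
        // substr() counts bytes, so the pairs are treated as byte offsets
        $word = substr($text, $offset, $length);
        if (!in_array($word, $words, true)) {
            $words[] = $word;
        }
    }
    return $words;
}

// In UTF-8 each of these Chinese characters takes 3 bytes, so the first
// occurrence of the word starts at byte 6 and the second at byte 20.
$text = '这是敏感词 and 敏感 again';
$matches = [[6, 6], [20, 6]];

print_r(extractWords($text, $matches)); // one entry: 敏感
```

Because the offsets are in bytes, a character-based function such as mb_substr() would cut at the wrong positions for multibyte text.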
PHP
<?php
/**
 * Filter helper
 *
 * getResTrie() provides the trie-tree object;
 * getFilterWords() extracts the matched words from the original string.
 *
 * @author W.Y.P ([email protected])
 */
class FilterHelper
{
    // Cached trie-tree object
    private static $_resTrie = null;
    // Modification time of the tree file when it was last loaded
    private static $_mtime = null;

    /**
     * Prevent instantiation
     */
    private function __construct() {}

    /**
     * Prevent cloning
     */
    private function __clone() {}

    /**
     * Provide the trie-tree object, reloading it when the tree file has changed
     *
     * @param string $tree_file Path of the trie-tree file
     * @param int    $new_mtime Modification time of the tree file at the time of this call
     * @return mixed The trie-tree handle returned by trie_filter_load()
     */
    public static function getResTrie($tree_file, $new_mtime)
    {
        if (is_null(self::$_mtime)) {
            self::$_mtime = $new_mtime;
        }
        if (($new_mtime != self::$_mtime) || is_null(self::$_resTrie)) {
            self::$_resTrie = trie_filter_load($tree_file);
            self::$_mtime = $new_mtime;
            // Log the time at which the tree file was (re)loaded
            echo date('Y-m-d H:i:s') . "\tdictionary reload success!\n";
        }
        return self::$_resTrie;
    }

    /**
     * Extract the matched sensitive words from the original string
     *
     * @param string $str Original string
     * @param array  $res Matches as [offset, length] pairs; [1, 3] means length 3 starting at offset 1 (substr() counts bytes)
     * @return array Unique matched words
     */
    public static function getFilterWords($str, $res)
    {
        $result = array();
        foreach ($res as $v) {
            $word = substr($str, $v[0], $v[1]);
            if (!in_array($word, $result, true)) {
                $result[] = $word;
            }
        }
        return $result;
    }
}
3. HTTP interface for external filtering
filter.php uses Swoole to expose the filter as an HTTP interface that external callers can submit text to.
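Once the server in this section is running, a client passes the text to check in the `content` GET parameter. The sketch below only builds such a request URL. `buildFilterUrl` is a hypothetical helper, and the host and port are placeholders that would need to match the address the server binds to.

```php
<?php
// buildFilterUrl() is a hypothetical helper; host and port are placeholders
// and must match the address passed to the server in filter.php.

function buildFilterUrl(string $host, int $port, string $content): string
{
    // http_build_query() URL-encodes the text so that spaces and
    // multibyte characters survive the query string
    $query = http_build_query(['content' => $content]);
    return sprintf('http://%s:%d/?%s', $host, $port, $query);
}

$url = buildFilterUrl('127.0.0.1', 9502, 'some text to check');
echo $url, "\n";

// With the server running, the body could be fetched and decoded like this:
// $words = json_decode(file_get_contents($url), true);
```

For non-empty content the server responds with a JSON array of the matched words. When `content` is empty the body is an empty string, so a caller should be prepared for json_decode() returning null.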
PHP
<?php
// Raise the script's memory limit; adjust to the size of the dictionary
ini_set('memory_limit', '512M');
// Set the time zone
date_default_timezone_set('Asia/Shanghai');
// Load the helper class
require_once('FilterHelper.php');
// IP and port the HTTP service binds to
// (swoole_http_server is the older alias of Swoole\Http\Server)
$serv = new swoole_http_server("182.92.177.16", 9502);
/**
 * Handle requests
 */
$serv->on('Request', function($request, $response) {
    // Read the GET parameter
    $content = isset($request->get['content']) ? $request->get['content'] : '';
    $result = '';
    if (!empty($content)) {
        // Path of the trie-tree file, relative to the current working directory
        $tree_file = 'blackword.tree';
        // Clear the file status cache so filemtime() returns a fresh value
        clearstatcache();
        // Modification time of the trie-tree file at the time of this request
        $new_mtime = filemtime($tree_file);
        // Get the latest trie-tree object
        $resTrie = FilterHelper::getResTrie($tree_file, $new_mtime);
        // Search the content for all sensitive words
        $arrRet = trie_filter_search_all($resTrie, $content);
        // Extract the matched sensitive words
        $a_data = FilterHelper::getFilterWords($content, $arrRet);
        $result = json_encode($a_data);
    }
    // Set the response cookie and headers, then send the result
    $response->cookie("User", "W.Y.P");
    $response->header("X-Server", "W.Y.P WebServer(Unix) (Red-Hat/Linux)");
    // The header value must not repeat the header name
    $response->header('Content-Type', 'text/html; charset=utf-8');
    $response->end($result);
});
$serv->start();
Publisher: Full stack programmer stack length. When reprinting, please credit the source: https://javaforall.cn/130918.html. Link to the original text: https://javaforall.cn