
Troubleshooting a slow-query problem in an online interface

2022-07-01 06:41:00 Cold tea ice

Contents

Problem description

Solution

1 Message middleware

2 Code and database optimization

3 ConcurrentLinkedQueue scheme

Scheduled task

Asynchronous processing

Business code

Other

References


Problem description

        There is an interface for classification prediction. Its main business logic: given a piece of input text, the interface internally calls a model to predict the text's category. The model data sits entirely in memory, so the prediction itself is very fast. After a prediction completes, a row is inserted into the prediction record table. Other applications later correct these records and judge whether the prediction succeeded, which feeds the subsequent self-learning.

        When the interface first launched it was highly praised: both intelligent dispatch and the routing prediction for handling units worked very well, and with the self-learning feature added, the application was expected to run even better. But two days ago a customer suddenly reported that the system was very slow: it took three or four seconds, sometimes five or six, for intelligent dispatch and routing to produce results.

        At first I assumed there were too many classification models on site and that the model tree was deep, so a multi-level prediction would naturally take longer. Verification showed otherwise: the deepest model tree was only four levels, and a single model, tested directly, returned its prediction in milliseconds, yet calling the interface was still slow. If it was neither the algorithm model nor machine resources, it had to be some point in the code logic, and as the data volume surged the problem became more and more obvious.

        A quick investigation showed the problem to be very low-level. The main cause of the slowness: the prediction record table had grown to nearly 1,000,000 rows, and the table had no index at all apart from an auto-increment primary key. In the code, business logic dictates that every call to the prediction interface performs a query followed by an update or insert, all synchronously. As the data volume grows, the interface RT is destined to get slower and slower.

Solution

        The problem itself is small; the root cause is that not enough thought went into the initial interface design. The code is beautifully written, but unfortunately it was never load-tested: an embroidered pillow, all show and no substance. For this kind of problem the most intuitive fix is decoupling: make the prediction and the persistence of the prediction record asynchronous. Of course, if the business scenario truly demands synchronous writes, you can instead optimize at the code and database level to increase processing speed.

1 Message middleware

(1) You can introduce Kafka or RocketMQ for decoupling. This approach is safe: the data won't be lost, and it decouples the core function from non-core functions. It is the mainstream solution and also supports horizontal scaling and distributed deployment (see the producer sketch after item (2)).

(2) If you don't want to introduce MQ, you can use Redis directly; Redis can also implement message-queue functionality (see reference 【1】 and the second sketch below).
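To make these two options concrete, here are two minimal sketches. They are illustrative only, not this project's actual code; the topic name, queue key, and class names are hypothetical. First, a Kafka producer that publishes the prediction record from the request path and leaves persistence to a separate consumer service:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PredictResultPublisher
{
    private final KafkaProducer<String, String> producer;

    public PredictResultPublisher(String bootstrapServers)
    {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        this.producer = new KafkaProducer<>(props);
    }

    // Asynchronous and buffered: the request thread returns without waiting for the broker.
    public void publish(String flagValue, String resultJson)
    {
        producer.send(new ProducerRecord<>("predict-result", flagValue, resultJson));
    }
}

And a Redis variant using a list as the queue (assuming the Jedis client): the producer LPUSHes from the request path, and a dedicated thread BRPOPs and writes to the database:

import java.util.List;
import redis.clients.jedis.Jedis;

public class RedisResultQueue
{
    private static final String QUEUE_KEY = "predict:result:queue";

    // Called on the request path: a single O(1) Redis command, no database IO.
    public void produce(Jedis jedis, String resultJson)
    {
        jedis.lpush(QUEUE_KEY, resultJson);
    }

    // Runs in a dedicated consumer thread; blocks up to 5 seconds waiting for data.
    public void consumeLoop(Jedis jedis)
    {
        while (!Thread.currentThread().isInterrupted())
        {
            List<String> item = jedis.brpop(5, QUEUE_KEY);
            if (item != null)
            {
                String resultJson = item.get(1); // element 0 is the key, element 1 is the value
                // deserialize resultJson and write it to the prediction record table here
            }
        }
    }
}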

2 Code and database optimization

        If the write really must stay synchronous, optimize and adjust at the code level. The main ideas are as follows:

(1) Add an index on the queried field (here, the flag_value used for the lookup) to improve query performance; the sketch below assumes such an index.

(2) Introduce partitioned tables or daily tables (or monthly tables) to reduce the amount of data in any single table.

(3) Code optimization: merge the query followed by update-or-insert into a single SQL statement (see reference 【2】; a JDBC sketch follows the documentation note below).

insertOrUpdate is implemented on top of MySQL's ON DUPLICATE KEY UPDATE.

With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values (unless the connection sets the CLIENT_FOUND_ROWS flag, in which case that value is 1).

Official documentation:

13.2.6.2 INSERT … ON DUPLICATE KEY UPDATE Statement
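A minimal JDBC sketch of the merged write, assuming flag_value uniquely identifies a record (the service code below looks records up by flag_value); the root_p_id and m_id columns are omitted for brevity. The unique index both enables ON DUPLICATE KEY UPDATE and covers optimization (1):

import java.sql.Connection;
import java.sql.PreparedStatement;

public class PredictResultUpsert
{
    // Prerequisite, run once (also serves as the lookup index from point (1)):
    //   CREATE UNIQUE INDEX uk_flag_value ON nlp_sub_predict_results (flag_value);
    private static final String UPSERT_SQL =
            "INSERT INTO nlp_sub_predict_results "
          + "(flag_value, predict_result, start_time, end_time, content, created_at, updated_at) "
          + "VALUES (?, ?, ?, ?, ?, NOW(), NOW()) "
          + "ON DUPLICATE KEY UPDATE "
          + "predict_result = VALUES(predict_result), start_time = VALUES(start_time), "
          + "end_time = VALUES(end_time), content = VALUES(content), updated_at = NOW()";

    public void upsert(Connection conn, String flagValue, String result,
                       long begin, long end, String content) throws Exception
    {
        try (PreparedStatement ps = conn.prepareStatement(UPSERT_SQL))
        {
            ps.setString(1, flagValue);
            ps.setString(2, result);
            ps.setLong(3, begin);
            ps.setLong(4, end);
            ps.setString(5, content);
            // executeUpdate returns 1 on insert, 2 on update, 0 if nothing changed
            ps.executeUpdate();
        }
    }
}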

        As for the asynchronous decoupling scheme: if the application did not already use MQ or Redis, installing that middleware just to solve this one problem undoubtedly adds to the system's complexity and the operations and maintenance workload. On that basis, we can instead use Java's own multithreading to achieve asynchrony; the basic idea is that wherever an IO operation occurs, hand it off to a new thread.

        But if request volume is high, this inevitably means frequently creating and releasing threads. Even if a thread pool is introduced, it can still block once its resources are exhausted, so it does not fundamentally solve the synchronization problem, as the sketch below illustrates.
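A minimal sketch of that failure mode (illustrative names, not this project's code): with a bounded queue, once the pool saturates, the rejection policy either drops records or pushes the write back onto the request thread, which quietly turns the call synchronous again:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class NaiveAsyncWriter
{
    // Two workers and a bounded queue of 1000 pending writes.
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 2, 60, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(1000),
            // When the queue is full, the submitting (request) thread runs the task itself,
            // i.e. we fall back to exactly the synchronous write we were trying to avoid.
            new ThreadPoolExecutor.CallerRunsPolicy());

    public void submitWrite(Runnable dbWrite)
    {
        pool.execute(dbWrite);
    }
}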

        Instead, we can simulate a message queue with Java multithreading: in the business code that needs the IO operation, wrap the business data in a BO and put it on a queue, and let an independent thread consume and process the queue (see reference 【3】).

        I mainly used ConcurrentLinkedQueue to solve this. The basic idea: create a scheduled task that fires every 10 seconds (see the cron expression below) and, on each run, drains the data in the ConcurrentLinkedQueue and writes it to the database. The business code wraps the business data in a BO and puts it on the queue.

3 ConcurrentLinkedQueue scheme

Scheduled task

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class TaskJob
{
    private static final Logger logger = LoggerFactory.getLogger(TaskJob.class);

    // Every 10 seconds, asynchronously process the queued prediction results and persist them.
    @Scheduled(cron = "*/10 * * * * ?")
    public void execuPredictResult() throws Exception
    {
        // The queue is a static field, so a fresh instance here still sees the shared data.
        SyncSavePredictResultService syncSavePredictResultService = new SyncSavePredictResultService();
        syncSavePredictResultService.consumeData();
    }
}

Asynchronous processing

import java.util.Date;
import java.util.concurrent.ConcurrentLinkedQueue;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SyncSavePredictResultService
{
    private static final Logger logger = LoggerFactory.getLogger(SyncSavePredictResultService.class);

    // Queue of prediction records, consumed asynchronously to perform the database writes.
    public static final ConcurrentLinkedQueue<PredictResultBo> RESULT_BO = new ConcurrentLinkedQueue<>();

    public void consumeData()
    {
        PredictResultBo resultBo = RESULT_BO.poll();
        while (resultBo != null)
        {
            // Business logic: update or insert the prediction record.
            excuSubPdResult(resultBo.getContent(),
                    resultBo.getBegin(),
                    resultBo.getEnd(),
                    resultBo.getmId(),
                    resultBo.getFlagValue(),
                    resultBo.getPredictResult(),
                    resultBo.getRootPid(),
                    resultBo.getD());
            logger.info("Asynchronously persisting prediction record: {}", resultBo.toString());
            // fetch the next queued record
            resultBo = RESULT_BO.poll();
        }
    }

    private void excuSubPdResult(String pstr, long begin, long end, Integer mId, String flagValue, String result, Integer rootPid, Date d)
    {
        NlpSubPredictResults nResult = NlpSubPredictResults.GetInstance().findFirst("select * from nlp_sub_predict_results where flag_value=?", flagValue);
        if (nResult != null)
        {
            nResult.set("predict_result", result)
                   .set("start_time", begin)
                   .set("end_time", end)
                   .set("content", pstr)
                   .set("updated_at", d)
                   .update();
        }
        else
        {
            nResult = NlpSubPredictResults.GetInstance();
            nResult.set("flag_value", flagValue)
                   .set("root_p_id", rootPid)
                   .set("m_id", mId)
                   .set("predict_result", result)
                   .set("start_time", begin)
                   .set("end_time", end)
                   .set("content", pstr)
                   .set("created_at", d)
                   .set("updated_at", d)
                   .save();
        }
    }
}

Business code

private void excuSubPdResult(String pstr, long begin, long end, Integer mId, String flagValue, String typeId, String typeName, Integer rootPid) {
        Map<String, String> rMap = new HashMap<>();
        rMap.put("name", typeName);
        rMap.put("id", typeId);
        String result = JSON.toJSONString(rMap);

        Date d = new Date();
        PredictResultBo predictResultBo = new PredictResultBo(mId, rootPid, flagValue, result, begin, end, pstr, d);
        // Hand the record off to the queue; the scheduled task persists it asynchronously.
        SyncSavePredictResultService.RESULT_BO.add(predictResultBo);

        /*
        // 2022-05-30: switched to asynchronous processing; the old synchronous write is kept below for reference.
        NlpSubPredictResults nResult = NlpSubPredictResults.GetInstance().findFirst("select * from nlp_sub_predict_results where m_id=? and flag_value=? and root_p_id=?",mId,flagValue,rootPid);
        if(nResult!=null){
            nResult.set("predict_result", result)
                   .set("start_time",begin)
                   .set("end_time",end)
                   .set("content",pstr)
                   .set("updated_at",d)
                   .update();
        }else{
            nResult = NlpSubPredictResults.GetInstance();
            nResult.set("flag_value",flagValue)
                   .set("root_p_id",rootPid)
                   .set("m_id",mId)
                   .set("predict_result", result)
                   .set("start_time",begin)
                   .set("end_time",end)
                   .set("content",pstr)
                   .set("created_at",d)
                   .set("updated_at",d)
                   .save();
        }
         */
    }

Other

        The approach above still has hidden risks and drawbacks:

(1) If the interface call volume is high, messages will inevitably back up in the queue; if the node goes down at that moment, the data is lost.

(2) A large message backlog can exhaust memory, which affects the normal operation of the application and can also cause data loss.

(3) It cannot support multiple consumers across processes; the queue lives inside a single JVM.

(4) The consume-and-persist logic is actually unrelated to the core prediction function, yet if there is a lot of data to consume it will inevitably affect the core prediction function. From a software architecture standpoint, this is unreasonable.

        All of these problems are solved very well by MQ or Redis.

        However, nothing is absolute. Often we need to adapt to circumstances: weigh the actual business requirements, data requirements, project urgency, cost budget, and so on, and then decide which solution to use.

        For example: suppose the single-node, multithreaded ConcurrentLinkedQueue approach solves 99% of the problem at a cost of 1 with a development cycle of 1 day, while MQ solves 99.9% of the problem at a cost of 10 with a development cycle of 7 days. If the fault tolerance the customer allows is 5%, there is clearly no need for the MQ solution; it would be meaningless.

        What I really want to say is that nothing is absolute. We should always face problems with an open mind; there is no need to spend 99% extra effort for a 0.1% advantage.

References

【1】 Implementing a message queue with Redis (Java code implementation)

【2】 MyBatis insertOrUpdate usage

【3】 Introduction to ConcurrentLinkedQueue and its methods
