How to batch insert 100000 pieces of data

2022-06-11 14:57:00 Ant

Ideas

There are generally two approaches to batch insertion:

  1. Use a for loop to insert the rows one by one (batch processing must be enabled for this).
  2. Generate a single insert statement, such as insert into user(username,address) values('aa','bb'),('cc','dd')...

The choice mainly comes down to two factors:

  1. The execution efficiency of the SQL itself
  2. Network I/O

First option

Use a for loop:

  • The advantage of this approach is that JDBC's PreparedStatement is precompiled and the precompiled statement is cached, so subsequent executions of the same SQL are faster. JDBC batch processing can also be enabled, and it makes a large difference.
  • The disadvantage is that the SQL server and the application server are often not the same machine, so network I/O must be taken into account. If network I/O takes a long time, it can slow down the overall SQL execution.

When inserting one by one in a for loop, batch mode (BATCH) must be enabled so that only one SqlSession is used. Without batch mode, repeatedly acquiring and releasing a Connection takes a lot of time, and efficiency is very low.
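The first option can be sketched in plain JDBC as follows. This is a minimal sketch, not a definitive implementation: it assumes the user(username, address) table from the SQL above, the class and method names are hypothetical, and running the whole load in one transaction is an assumption that the text does not prescribe.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper class for illustration only.
public final class JdbcBatchInsert {

    private JdbcBatchInsert() {}

    /**
     * Inserts rows one by one through a single PreparedStatement and sends
     * them to the server in batches of batchSize. Each row is {username, address}.
     * Returns how many times a batch was sent.
     */
    public static int insertUsers(Connection conn, List<String[]> rows, int batchSize)
            throws SQLException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must not be less than one");
        }
        String sql = "INSERT INTO user (username, address) VALUES (?, ?)";
        int flushes = 0;
        conn.setAutoCommit(false); // assumption: one transaction for the whole load
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();               // queue the row on the client side
                if (++pending == batchSize) {
                    ps.executeBatch();       // send the queued rows together
                    flushes++;
                    pending = 0;
                }
            }
            if (pending > 0) {               // send the last, partially filled batch
                ps.executeBatch();
                flushes++;
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
        return flushes;
    }
}
```

The statement is prepared once and reused for every row, which is the precompilation advantage described above, while addBatch and executeBatch keep the number of round trips to roughly rows / batchSize.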

Second option

Generate a single INSERT statement:

  • The advantage of this approach is that there is only one network I/O round trip. Even if the data is sliced, there are only a few round trips, so little time is spent on network I/O.
  • This approach also has several disadvantages. First, the SQL can become too long, and it may still need to be sliced and sent in several batches. Second, it cannot take full advantage of PreparedStatement precompilation: the SQL has to be re-parsed and cannot be reused. Third, the generated SQL is very long, and the database also needs time to parse such a long statement.
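The second option, including the slicing it may require, can be sketched like this. It is a minimal sketch under assumptions: the class and method names are hypothetical, the table and columns are taken from the example statement above, and `?` placeholders are used instead of inlined literal values so that the values can be bound safely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class for illustration only.
public final class MultiRowInsert {

    private MultiRowInsert() {}

    /** Builds one INSERT statement with a "(?, ?)" value group per row. */
    public static String buildSql(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be at least 1");
        }
        return "INSERT INTO user (username, address) VALUES "
                + String.join(", ", Collections.nCopies(rowCount, "(?, ?)"));
    }

    /** Splits rows into slices of at most sliceSize rows, one statement per slice. */
    public static <T> List<List<T>> slice(List<T> rows, int sliceSize) {
        if (sliceSize < 1) {
            throw new IllegalArgumentException("sliceSize must be at least 1");
        }
        List<List<T>> slices = new ArrayList<>();
        for (int from = 0; from < rows.size(); from += sliceSize) {
            int to = Math.min(from + sliceSize, rows.size());
            slices.add(rows.subList(from, to)); // a view, no copying
        }
        return slices;
    }
}
```

Each slice produces its own statement text, whose length grows with the number of rows in the slice. That is why the statement is hard to reuse and costs more to parse as slices get larger.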

In the end, the question is whether the time spent on network I/O exceeds the time spent executing the SQL insert. This is the core issue to consider.

Choose the batch insertion method that fits your actual situation.

How MyBatis Plus does it

MyBatis Plus also has a batch insertion method, saveBatch. Let's look at its source code:

@Transactional(rollbackFor = Exception.class)
@Override
public boolean saveBatch(Collection<T> entityList, int batchSize) {
    String sqlStatement = getSqlStatement(SqlMethod.INSERT_ONE);
    return executeBatch(entityList, batchSize, (sqlSession, entity) -> sqlSession.insert(sqlStatement, entity));
}

The sqlStatement obtained here is just INSERT_ONE, that is, rows are inserted one by one.

The executeBatch method:

public static <E> boolean executeBatch(Class<?> entityClass, Log log, Collection<E> list, int batchSize, BiConsumer<SqlSession, E> consumer) {
    Assert.isFalse(batchSize < 1, "batchSize must not be less than one");
    return !CollectionUtils.isEmpty(list) && executeBatch(entityClass, log, sqlSession -> {
        int size = list.size();
        int i = 1;
        for (E element : list) {
            consumer.accept(sqlSession, element);
            if ((i % batchSize == 0) || i == size) {
                sqlSession.flushStatements();
            }
            i++;
        }
    });
}

Note the lambda expression passed as the third argument in the return statement: it is the core logic of batch insertion in MP. As you can see, MP first splits the data into slices (the default slice size is 1000) and then, within each slice, still inserts the rows one by one. If you keep following the executeBatch method, you will find that the sqlSession is in fact a batch-mode sqlSession, not an ordinary one.
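To make the flush condition easier to follow, here is a standalone sketch of the same loop with the MyBatis types removed. The class name, the method name, and the returned flush counter are hypothetical additions for illustration; only the `(i % batchSize == 0) || i == size` condition comes from the source above.

```java
import java.util.Collection;
import java.util.function.Consumer;

// Hypothetical stand-in for the loop inside executeBatch, for illustration only.
public final class BatchFlushSketch {

    private BatchFlushSketch() {}

    /**
     * Calls perItem for every element and calls flush after every batchSize
     * elements and after the last element. Returns the number of flushes.
     */
    public static <E> int forEachInBatches(Collection<E> items, int batchSize,
                                           Consumer<E> perItem, Runnable flush) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must not be less than one");
        }
        int size = items.size();
        int i = 1;
        int flushes = 0;
        for (E item : items) {
            perItem.accept(item);            // corresponds to sqlSession.insert(...)
            if (i % batchSize == 0 || i == size) {
                flush.run();                 // corresponds to sqlSession.flushStatements()
                flushes++;
            }
            i++;
        }
        return flushes;
    }
}
```

With 2500 elements and a batch size of 1000, the flush runs after elements 1000, 2000, and 2500, so three batches are sent in total.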

License

This article is licensed under the Attribution-NonCommercial-ShareAlike 4.0 International license. Please credit the source when reprinting.

Original site

Copyright notice
This article was written by [Ant]. Please include a link to the original when reprinting. Thank you.
https://yzsam.com/2022/03/202203012040060425.html