摘要
我们经常使用useGenerateKeys来返回自增主键,避免多一次查询。也会经常使用on duplicate key update,来进行insertOrUpdate,来避免先query 在insert/update。用起来很爽,但是经常踩坑,还不知为何。本篇就是深入分析获取自增主键的原理。
问题
首先摘两段我司一些老代码的bug
批量插入用户收藏
for (tries = 0; tries < MAX_RETRY; tries++) {
final int result = collectionMapper.insertCollections(collections);
if (result == collections.size()) {
break;
}
}
if (tries == MAX_RETRY) {
throw new RuntimeSqlException("Insert collections error");
}
// 依赖数据库生成的collectionid
return collections;
collectionMapper.insertCollections 方法
<insert id="insertCollections" parameterType="list" useGeneratedKeys="true"
keyProperty="collectionId">
INSERT INTO collection(
userid, item
)
VALUES
<foreach collection="list" item="collection" separator=",">
(#{collection.userId}, #{collection.item})
</foreach>
ON DUPLICATE KEY UPDATE
status = 0
</insert>
不知道大家能不能发现其中的问题
分析
问题有两个
返回值result的判断错误
使用on duplicate key
批量update返回影响的行数是和插入的数不一样的。犯这种错主要在于想当然,不看文档
看下官网文档
写的很清楚
With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. If you specify the CLIENT_FOUND_ROWS flag to the mysql_real_connect() C API function when connecting to mysqld, the affected-rows value is 1 (not 0) if an existing row is set to its current values.
返回值有三种
0: 没有更新 1 :insert 2. update
还有一个特殊情况,update 一个相同值到原来的值,这个根据客户端配置,可能为0,可能为1。
所以这个判断明显错误
利用批量InsertOrUpdate的userGeneratedKey来返回自增主键
这个问题批量插入时有update语句时,就会发现有问题。返回的自增主键都是错的,这是为什么呢?
1. 首先我们看下mybatis对于useGeneratedKey的描述
>This tells MyBatis to use the JDBC getGeneratedKeys method to retrieve keys generated internally by the database (e.g. auto increment fields in RDBMS like MySQL or SQL Server). Default: false.
就是使用JDBC的getGeneratedKeys的方法来获取的。
2. 我们再找下JDBC的规范
Before version 3.0 of the JDBC API, there was no standard way of retrieving key values from databases that supported auto increment or identity columns. With older JDBC drivers for MySQL, you could always use a MySQL-specific method on the Statement interface, or issue the query SELECT LAST_INSERT_ID() after issuing an INSERT to a table that had an AUTO_INCREMENT key. Using the MySQL-specific method call isn’t portable, and issuing a SELECT to get the AUTO_INCREMENT key’s value requires another round-trip to the database, which isn’t as efficient as possible. The following code snippets demonstrate the three different ways to retrieve AUTO_INCREMENT values. First, we demonstrate the use of the new JDBC 3.0 method getGeneratedKeys() which is now the preferred method to use if you need to retrieve AUTO_INCREMENT keys and have access to JDBC 3.0. The second example shows how you can retrieve the same value using a standard SELECT LAST_INSERT_ID() query. The final example shows how updatable result sets can retrieve the AUTO_INCREMENT value when using the insertRow() method.
意思就是JDBC3.0以前,有些乱七八糟的定义的,没有统一,之后统一成了getGeneratedKeys()方法。两边是一致的。实现的原理主要就是数据库端返回一个LAST_INSERT_ID
。这个跟auto_increment_id
强相关。
我们看下auto_increment_id的定义。重点关注批量插入
For a multiple-row insert, LAST_INSERT_ID() and mysql_insert_id() actually return the AUTO_INCREMENT key from the first of the inserted rows. This enables multiple-row inserts to be reproduced correctly on other servers in a replication setup.
批量插入的时候只会返回一个id,这个id值是第一个插入行的AUTO_INCREMENT值。至于为什么这么干,能够使得mysql-server在master-slave架构下也能保证id值统一的原因可以看下这篇。本篇文章就不展开了。
那么mysql server只返回一个id,客户端批量插入的时候怎么能实现获取全部的id呢
3. 客户端的实现
我们看下客户端getGeneratedKeys
的实现。
JDBC com.mysql.jdbc.StatementImpl
public synchronized ResultSet getGeneratedKeys() throws SQLException {
if (!this.retrieveGeneratedKeys) {
throw SQLError.createSQLException(Messages.getString("Statement.GeneratedKeysNotRequested"), "S1009", this.getExceptionInterceptor());
} else if (this.batchedGeneratedKeys == null) {
// 批量走这边的逻辑
return this.lastQueryIsOnDupKeyUpdate ? this.getGeneratedKeysInternal(1) : this.getGeneratedKeysInternal();
} else {
Field[] fields = new Field[]{new Field("", "GENERATED_KEY", -5, 17)};
fields[0].setConnection(this.connection);
return ResultSetImpl.getInstance(this.currentCatalog, fields, new RowDataStatic(this.batchedGeneratedKeys), this.connection, this, false);
}
}
看下调用的方法 this.getGeneratedKeysInternal()
protected ResultSet getGeneratedKeysInternal() throws SQLException {
// 获取影响的行数
int numKeys = this.getUpdateCount();
return this.getGeneratedKeysInternal(numKeys);
}
这里有个重要知识点了,首先获取本次批量插入的影响行数,然后再执行具体的获取id操作。
getGeneratedKeysInternal方法
protected synchronized ResultSet getGeneratedKeysInternal(int numKeys) throws SQLException {
Field[] fields = new Field[]{new Field("", "GENERATED_KEY", -5, 17)};
fields[0].setConnection(this.connection);
fields[0].setUseOldNameMetadata(true);
ArrayList rowSet = new ArrayList();
long beginAt = this.getLastInsertID();
// 按照受影响的范围+递增步长
for(int i = 0; i < numKeys; ++i) {
if (beginAt > 0L) {
// 值塞进去
row[0] = StringUtils.getBytes(Long.toString(beginAt));
}
beginAt += (long)this.connection.getAutoIncrementIncrement();
}
}
迭代影响的行数,然后依次获取id。
所以批量insert是正确可以返回的。
但是批量insertOrUpdate就有问题了,批量insertOrUpdate的影响行数不是插入的数据行数,可能是0,1,2这样就导致了自增id有问题了。
比如插入3条数据,2条会update,1条会insert,这时候updateCount就是5,generateid就会5个了,mybatis然后取前3个塞到数据里,显然是错的。
以上是原理分析,如果想了解更详细的实验结果,可以看下实验
总结
批量insert
<insert id="insertAuthor" useGeneratedKeys="true"
keyProperty="id">
insert into Author (username, password, email, bio) values
<foreach item="item" collection="list" separator=",">
(#{item.username}, #{item.password}, #{item.email}, #{item.bio})
</foreach>
</insert>
来自官网的例子,mapper中不能指定@Param参数,否则会有问题
批量insertOrUpdate
不能依赖useGeneratedKey返回主键。
关注公众号【方丈的寺院】,第一时间收到文章的更新,与方丈一起开始技术修行之路
参考
https://blog.csdn.net/slvher/article/details/42298355
https://blog.csdn.net/qq_27680317/article/details/81118070#%EF%BC%883%EF%BC%89%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5
https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-usagenotes-last-insert-id.html
https://dev.mysql.com/doc/refman/5.5/en/innodb-auto-increment-handling.html
https://dev.mysql.com/doc/refman/5.5/en/example-auto-increment.html