事务原理与实践

2016/3/11 posted in  数据库

事务简介

  • 事务就是锁和并发的结合体。
  • 所有的针对数据库的一个操作,都可以看做成一个事务。
  • 事务单元之间只有这四种关系:读写、写读、读读、写写
  • 真实业务场景中,如何以最快的速度的方式完成事务和事务之间的关系?又能保证上面四种逻辑的顺序?
  • 做法:

    • 序列化读写ACID中的I破坏了一致性
      • 优势:不需要冲突控制
      • 劣势:慢
    • 针对同一单元的访问进行控制(排他锁):如果两个事务发生冲突,就串行;如果两个事务不会发生冲突,就并行。

      譬如两个事务单元Bob给Smith一百块;李磊给韩梅梅一百块,这两个事务单元完全没有冲突,应该可以并行
      
    • 读写锁分开读读的场景并行;读写写读写写串行

    • MVCC:多版本并发控制Muti-Version Concurrency Control,本质就是Copy on write,每次写都是写一个新的数据,并不是在原位更新,能够做到写不阻塞读,带来写读读写场景的优化。

      • 劣势:系统实现的复杂度增加

现在只剩下写写才会产生冲突

事务谁先谁后?

一个读请求应该读哪一个写之后的数据?

逻辑时间戳:内存中维护一个数据的自增ID号,本质就是用来说明事务和事务单元之间谁先谁后

故障恢复

  • 业务属性不匹配:ACID中的A 回滚,需要记住之前数据没有更改时的状态
  • 系统崩溃

死锁与死锁检测

死锁产生的原因:

  • 两个线程参与
  • 不同方向上加锁
  • 作用在相同资源

解决方案:

  • 尽可能不死锁:降低事务隔离级别
  • 碰撞检测:把所有事务单元维持的所有锁都记下来,终止死锁的一边
  • 等锁超时

事务的ACID(数据库怎么保证的)

  • 原子性

要么全部成功,要么全部失败

只是记录了一个undo日志回滚到之前的版本

Bob给Smith一百块

ver1: Bob有100元 Smith有0元
ver2: Bob有0元 Smith有0元
    (undo:Bob有100元 Smith有0元) <= 数据库回滚段
ver3: Bob有100元 Smith有100元
    (undo:Bob有0元 Smith有0元) <= 数据库回滚段
  • 一致性
-- Lock Bob and Smith --
ver1: Bob有100元 Smith有0元
ver2: Bob有0元 Smith有0元
    (undo:Bob有100元 Smith有0元) <= 数据库回滚段
ver3: Bob有0元 Smith有100元
    (undo:Bob有0元 Smith有0元) <= 数据库回滚段
-- Unlock Bob and Smith --
  • 隔离性

以性能为理由,对一致性的破坏(各种隔离级别)

隔离性扩展:快照

  • 持久性

更改便持久的保存在数据库中,只要提交了就不丢

快照读的核心做法

在回滚段中读

  • 针对读多写少的情况

大部分数据库已经将快照读映射到读未提交、读已提交

事务的调优原则

  • 在不影响业务应用的前提下,减小锁的覆盖范围
  • 增加锁上可并行的线程数:读写锁分离、允许并行读取数据
  • 选择正确锁的类型

Consistent Nonlocking Reads in MySQL

A consistent read means that InnoDB uses multi-versioning to present to a query a snapshot of the database at a point in time.

The query sees the changes made by transactions that committed before that point of time, and no changes made by later or uncommitted transactions.

The exception to this rule is that the query sees the changes made by ealier statements within the same transaction. This exception causes the following anomaly: if you update some rows in a time, a SELECT sees the latest version of the updated rows, but it might also see older versions of any rows. If other sessions simultaneously update the same table, the anomaly means that you might see the table in a state that never existed in the database.

If the transaction isolation level is REPEATABLE READ (the default level), all consistent reads within the same transaction read the snapshot established byu the first such read in that transaction. You can get a fresher snapshot for your queries by committing the current transaction and after that issuing new queries.

With READ COMMITTED isolation level, each consistent read within a transaction sets and reads its own fresh snapshot.

Consistent read is the default mode in which InnoDB processes SELECT statements in READ COMMITTED and REPEATABLE READ isolation levels. A consistent read does not set any locks on the tables it accesses, and therefore other sessions are free to modify those tables at the same time a consistent read is being performed on the table.

Suppose that you are running in the default REPEATABLE READ isolation level. When you issue a consistent read (that is, an ordinary SELECT statement), InnoDB gives your transaction a timepoint according to which your query sees the database. If another transaction deletes a row and commits after your timepoint was assigned, you do not see the row as having been deleted. Inserts and updates are treated similarly.

你读数据的时候,数据不会加锁,其他人都能改你读的数据;但是你一直读到的都是最原始的数据,其他人怎么改都不会影响你读到的东西。

In the following example, session A sees the row inserted by B onluy when B has committed the insert and A has committed as well, so that the timepoint is advanced past the commit of B.

Session ASession B
SET autocommit=0; SET autocommit=0;
SELECT * FROM t; # empty set
INSERT INTO t VALUES (1, 2);
SELECT * FROM t; # empty set
COMMIT;
SELECT * FROM t; # empty set
COMMIT;
SELECT * FROM t;

READ UNCOMMITTED: UserA will see the change made by UserB. This isolation level is called dirty reads, which means that read data is not consistent with other parts of the table or the query, and may not yet have been committed. This isolation level ensures the quickest performance, as data is read directly from the table’s blocks with no further processing, verifications or any other validation. The process is quick and the data is as dirty as it can get.

READ COMMITTED: UserA will not see the change made by UserB. This is because in the READ COMMITTED isolation level, the rows returned by a query are the rows that were committed when the query was started. The change made by UserB was not present when the query started, and therefore will not be included in the query result.

REPEATABLE READ: UserA will not see the change made by UserB. This is because in the REPEATABLE READ isolation level, the rows returned by a query are the rows that were committed when the transaction was started. The change made by UserB was not present when the transaction was started, and therefore will not be included in the query result.

This means that “All consistent reads within the same transaction read the snapshot established by the first read” (from MySQL documentation.).

SERIALIZABLE: This isolation level specifies that all transactions occur in a completely isolated fashion, meaning as if all transactions in the system were executed serially, one after the other. The DBMS can execute two or more transactions at the same time only if the illusion of serial execution can be maintained.

In practice, SERIALIZABLE is similar to REPEATABLE READ, but uses a different implementation for each database engine. In Oracle, the REPEATABLE READ level is not supported and SERIALIZABLE provides the highest isolation level. This level is similar to REPEATABLE READ, but InnoDB implicitly converts all plain SELECT statements to “SELECT … LOCK IN SHARE MODE.

The default isolation level in MySQL’s InnoDB is REPEATABLE READ.

Since old values of row data are required for current queris, databases use a special segment to store old row values and snapshot. MySQL calls this segment a Rollback Segment (Undo Segment in Oracle).