1 of 93

Module_6�Transaction, Management and Concurrency

Ms. Khushbu Tikhe

Assistant Professor

Shree L. R. Tiwari College of Engineering

2 of 93

Transaction

  • A transaction is a program unit whose execution may or may not change the contents of a database.
  • The transaction is executed as a single unit
  • If the database operations do not update the database but only retrieve data, this type of transaction is called a read-only transaction.
  • A successful transaction can change the database from one CONSISTENT STATE to another
  • If the database were in an inconsistent state before a transaction, it would remain in the inconsistent state after the transaction.
  • Defination: A transaction can be defined as a group of tasks. A single task is the minimum processing unit which cannot be divided further.

3 of 93

Example

  • Let’s take an example of a simple transaction. Suppose a bank employee transfers Rs 500 from A's account to B's account. This very simple and small transaction involves several low-level tasks.
  • A’s Account

Open_Account(A)

Old_Balance = A.balance

New_Balance = Old_Balance - 500

A.balance = New_Balance

Close_Account(A)

  • B’s Account

Open_Account(B)

Old_Balance = B.balance

New_Balance = Old_Balance + 500

B.balance = New_Balance

Close_Account(B)

4 of 93

Types of Operation in Transaction

5 of 93

6 of 93

Transaction Structure and Boundaries

7 of 93

8 of 93

ACID Properties in DBMS

  • A transaction is a single logical unit of work which accesses and possibly modifies the contents of a database. Transactions access data using read and write operations.
  • In order to maintain consistency in a database, before and after the transaction, certain properties are followed. These are called ACID properties.

9 of 93

Atomicity�

  • By this, we mean that either the entire transaction takes place at once or doesn’t happen at all. There is no midway i.e. transactions do not occur partially. Each transaction is considered as one unit and either runs to completion or is not executed at all. It involves the following two operations.

—Abort: If a transaction aborts, changes made to database are not visible.

—Commit: If a transaction commits, changes made are visible.

  • Atomicity is also known as the ‘All or nothing rule’.

10 of 93

  • Consider the following transaction T consisting of T1 and T2: Transfer of 100 from account X to account Y.

  • If the transaction fails after completion of T1 but before completion of T2.( say, after write(X) but before write(Y)), then amount has been deducted from X but not added to Y. This results in an inconsistent database state. Therefore, the transaction must be executed in entirety in order to ensure correctness of database state.

11 of 93

Consistency

  • This means that integrity constraints must be maintained so that the database is consistent before and after the transaction. It refers to the correctness of a database. Referring to the example above,
  • The total amount before and after the transaction must be maintained.
  • Total before T occurs = 500 + 200 = 700.
  • Total after T occurs = 400 + 300 = 700.
  • Therefore, database is consistent. Inconsistency occurs in case T1 completes but T2 fails. As a result T is incomplete.

12 of 93

Isolation

  • This property ensures that multiple transactions can occur concurrently without leading to the inconsistency of database state. Transactions occur independently without interference.
  • Changes occurring in a particular transaction will not be visible to any other transaction until that particular change in that transaction is written to memory or has been committed.
  • This property ensures that the execution of transactions concurrently will result in a state that is equivalent to a state achieved these were executed serially in some order.

  • Let X= 500, Y = 500.
  • Consider two transactions T and T”.

13 of 93

Suppose T has been executed till Read (Y) and then T’’ starts. As a result , interleaving of operations takes place due to which T’’ reads correct value of X but incorrect value of Y and sum computed by

T’’: (X+Y = 50, 000+500=50, 500)

is thus not consistent with the sum at end of transaction:

T: (X+Y = 50, 000 + 450 = 50, 450).

This results in database inconsistency, due to a loss of 50 units. Hence, transactions must take place in isolation and changes should be visible only after they have been made to the main memory.

14 of 93

Durability:

  • This property ensures that once the transaction has completed execution, the updates and modifications to the database are stored in and written to disk and they persist even if a system failure occurs.
  • These updates now become permanent and are stored in non-volatile memory. The effects of the transaction, thus, are never lost.

  • The ACID properties, in totality, provide a mechanism to ensure correctness and consistency of a database in a way such that each transaction is a group of operations that acts a single unit, produces consistent results, acts in isolation from other operations and updates that it makes are durably stored.

15 of 93

States of Transactions

16 of 93

State Transition Diagram for a Database Transaction

17 of 93

Let's study a state transition diagram that highlights how a transaction moves between these various states.

  1. Once a transaction states execution, it becomes active. It can issue READ or WRITE operation.
  2. Once the READ and WRITE operations complete, the transactions becomes partially committed state.
  3. Next, some recovery protocols need to ensure that a system failure will not result in an inability to record changes in the transaction permanently. If this check is a success, the transaction commits and enters into the committed state.
  4. If the check is a fail, the transaction goes to the Failed state.
  5. If the transaction is aborted while it's in the active state, it goes to the failed state. The transaction should be rolled back to undo the effect of its write operations on the database.
  6. The terminated state refers to the transaction leaving the system.

18 of 93

Transition State

  • Active − In this state, the transaction is being executed. This is the initial state of every transaction.

  • Partially Committed − When a transaction executes its final operation, it is said to be in a partially committed state.

  • Failed − A transaction is said to be in a failed state if any of the checks made by the database recovery system fails. A failed transaction can no longer proceed further.

  • Aborted − If any of the checks fails and the transaction has reached a failed state, then the recovery manager rolls back all its write operations on the database to bring the database back to its original state where it was prior to the execution of the transaction. Transactions in this state are called aborted. The database recovery module can select one of the two operations after a transaction aborts −
    • Re-start the transaction
    • Kill the transaction

  • Committed − If a transaction executes all its operations successfully, it is said to be committed. All its effects are now permanently established on the database system.

19 of 93

Serial Execution/Transition/Schedules

  • A series of operation from one transaction to another transaction is known as schedule. It is used to preserve the order of the operation in each of the individual transaction.
  • The serial schedule is a type of schedule where one transaction is executed completely before starting another transaction. In the serial schedule, when the first transaction completes its cycle, then the next transaction is executed.

  • For example: Suppose there are two transactions T1 and T2 which have some operations. If it has no interleaving of operations, then there are the following two possible outcomes:
    1. Execute all the operations of T1 which was followed by all the operations of T2.
    2. Execute all the operations of T2 which was followed by all the operations of T1.
    3. In the given (a) figure, Schedule A shows the serial schedule where T1 followed by T2.
    4. In the given (b) figure, Schedule B shows the serial schedule where T2 followed by T1.

20 of 93

Serial Execution/Transition/Schedules

21 of 93

Concurrent Execution/Transition/Schedules

  • When operations of a transaction are interleaved with operations of other transactions of a schedule, the schedule is called Concurrent schedule.
  • Transaction-processing systems usually allow multiple transactions to run concurrently. Allowing multiple transactions to update data concurrently causes several complications with consistency of the data

However, there are some good reasons for allowing concurrency:

  • Improved throughput:
  • Resource utilization:
  • Reduced waiting time:

22 of 93

Serializable Execution/Transition/schedule

  • The serializability of schedules is used to find non-serial schedules that allow the transaction to execute concurrently without interfering with one another.
  • It identifies which schedules are correct when executions of the transaction have interleaving of their operations.
  • A non-serial schedule will be serializable if its result is equal to the result of its transactions executed serially.
  • A non-serial schedule of n number of transactions is said to be serializable schedule, if it is equivalent to the serial schedule of those n transactions. A serial schedule doesn’t allow concurrency, only one transaction executes at a time and the other starts when the already running transaction finished.

  • Types of Serializability

1. Conflict Serializability

2. View Serializability

23 of 93

1. Conflict Serializability

  • Conflict Serializability is one of the type of Serializability, which can be used to check whether a non-serial schedule is conflict serializable or not.
  • A schedule is called conflict serializable if we can convert it into a serial schedule after swapping its non-conflicting operations.

  • Conflicting operations: Two operations are said to be conflicting if all conditions satisfy:
  • They belong to different transactions
  • They operate on the same data item
  • At Least one of them is a write operation

  • Possible conflict:
  • write-write, from different transactions conflict
  • write-read, from different transactions conflict
  • read(Q), and write(Q) from different transactions conflict

Note : read(Q), and read(Q) do not conflict

24 of 93

Conflict Serializability ..........(Cont.)

  • By series of swaps of non-conflicting instructions, the schedule on the left can be transformed into the one on the right, which is a serial schedule where T2 follows T1
  • .

25 of 93

Conflict Serializability ..........(Cont.)

  • Example: –
  • Conflicting operations pair (R1(A), W2(A)) because they belong to two different transactions on same data item A and one of them is write operation.
  • Similarly, (W1(A), W2(A)) and (W1(A), R2(A)) pairs are also conflicting.
  • On the other hand, (R1(A), W2(B)) pair is non-conflicting because they operate on different data item.
  • Similarly, ((W1(A), W2(B)) pair is non-conflicting.

  • Consider the following schedule:

S1: R1(A), W1(A), R2(A), W2(A), R1(B), W1(B), R2(B), W2(B)

  • If Oi and Oj are two operations in a transaction and Oi< Oj (Oi is executed before Oj), same order will follow in the schedule as well. Using this property, we can get two transactions of schedule S1 as:

T1: R1(A), W1(A), R1(B), W1(B)

T2: R2(A), W2(A), R2(B), W2(B)

26 of 93

Conflict Serializability ..........(Cont.)

  • Possible Serial Schedules are: T1->T2 or T2->T1

-> Swapping non-conflicting operations R2(A) and R1(B) in S1, the schedule becomes,

S11: R1(A), W1(A), R1(B), W2(A), R2(A), W1(B), R2(B), W2(B)

-> Similarly, swapping non-conflicting operations W2(A) and W1(B) in S11, the schedule becomes,

S12: R1(A), W1(A), R1(B), W1(B), R2(A), W2(A), R2(B), W2(B)

  • S12 is a serial schedule in which all operations of T1 are performed before starting any operation of T2. Since S has been transformed into a serial schedule S12 by swapping non-conflicting operations of S1, S1 is conflict serializable.

27 of 93

2. View Serializability

  • A Schedule is called view serializable if it is view equal to a serial schedule (no overlapping transactions).
  • To check whether a given schedule is view serializable, we need to check whether the given schedule is View Equivalent to its serial schedule. Lets take an example to understand what I mean by that.

  • Given Schedule:
  • T1 T2
  • ----- ------
  • R(X)
  • W(X)
  • R(X)
  • W(X)
  • R(Y)
  • W(Y)
  • R(Y)
  • W(Y)

28 of 93

View Serializability..........(Cont.)

  • Serial Schedule of the above given schedule:
  • As we know that in Serial schedule a transaction only starts when the current running transaction is finished. So the serial schedule of the above given schedule would look like this:

T1 T2

----- ------

R(X)

W(X)

R(Y)

W(Y)

R(X)

W(X)

R(Y)

W(Y)

  • If we can prove that the given schedule is View Equivalent to its serial schedule then the given schedule is called view Serializable.

29 of 93

Why we need View Serializability?

  • We know that a serial schedule never leaves the database in inconsistent state because there are no concurrent transactions execution. However a non-serial schedule can leave the database in inconsistent state because there are multiple transactions running concurrently. By checking that a given non-serial schedule is view serializable, we make sure that it is a consistent schedule.

  • You may be wondering instead of checking that a non-serial schedule is serializable or not, can’t we have serial schedule all the time? The answer is no, because concurrent execution of transactions fully utilize the system resources and are considerably faster compared to serial schedules.

30 of 93

View Equivalent

  • Two schedules T1 and T2 are said to be view equivalent, if they satisfy all the following conditions:

1. Initial Read: Initial read of each data item in transactions must match in both schedules. For example, if transaction T1 reads a data item X before transaction T2 in schedule S1 then in schedule S2, T1 should read X before T2.

  • Read vs Initial Read: You may be confused by the term initial read. Here initial read means the first read operation on a data item, for example, a data item X can be read multiple times in a schedule but the first read operation on X is called the initial read.

2. Final Write: Final write operations on each data item must match in both the schedules. For example, a data item X is last written by Transaction T1 in schedule S1 then in S2, the last write operation on X should be performed by the transaction T1.

3. Update Read: If in schedule S1, the transaction T1 is reading a data item updated by T2 then in schedule S2, T1 should read the value after the write operation of T2 on same data item. For example, In schedule S1, T1 performs a read operation on X after the write operation on X by T2 then in S2, T1 should read the X after T2 performs write on X.

31 of 93

Precedence Graph For Testing Conflict Serializability

  • Precedence Graph or Serialization Graph is used commonly to test Conflict Serializability of a schedule.
  • The Algorithm can be written as:
  • Create a node T in the graph for each participating transaction in the schedule.
  • For the conflicting operation read_item(X) and write_item(X) – If a Transaction Tj executes a read_item (X) after Ti executes a write_item (X), draw an edge from Ti to Tj in the graph.
  • For the conflicting operation write_item(X) and read_item(X) – If a Transaction Tj executes a write_item (X) after Ti executes a read_item (X), draw an edge from Ti to Tj in the graph.
  • For the conflicting operation write_item(X) and write_item(X) – If a Transaction Tj executes a write_item (X) after Ti executes a write_item (X), draw an edge from Ti to Tj in the graph.

32 of 93

Example:

33 of 93

Explanation:

  • Read(A): In T4,no subsequent writes to A, so no new edges
  • Read(C): In T4, no subsequent writes to C, so no new edges
  • Write(A): A is subsequently read by T5, so add edge T4 → T5
  • Read(B): In T5,no subsequent writes to B, so no new edges
  • Write(C): C is subsequently read by T6, so add edge T4 → T6
  • Write(B): A is subsequently read by T6, so add edge T5 → T6
  • Write(C): In T6, no subsequent reads to C, so no new edges
  • Write(A): In T5, no subsequent reads to A, so no new edges
  • Write(B): In T6, no subsequent reads to B, so no new edges

  • Precedence graph for schedule S2:

  • The precedence graph for schedule S2 contains no cycle that's why Schedule S2 is serializable.

34 of 93

Concurrency Control

  • Concurrency control is the procedure in DBMS for managing simultaneous operations without conflicting with each other.
  • Concurrent access is quite easy if all users are just reading data. There is no way they can interfere with one another.
  • Though for any practical database, would have a mix of reading and WRITE operations and hence the concurrency is a challenge.
  • Concurrency control is used to address such conflicts which mostly occur with a multi-user system. It helps you to make sure that database transactions are performed concurrently without violating the data integrity of respective databases.
  • Therefore, concurrency control is a most important element for the proper functioning of a system where two or multiple database transactions that require access to the same data, are executed simultaneously.

35 of 93

Potential problems of Concurrency

  1. Lost Updates
  2. Uncommitted dependency issues
  3. Inconsistent analysis Problem
  4. Dirty read
  5. Non-Repeatable Read
  6. Incorrect Summary issue
  7. Phantom Read Problem:

36 of 93

1. Lost Updates Problems

  • When two transactions that access the same database items contain their operations in a way that makes the value of some database item incorrect, then the lost update problem occurs.
  • If two transactions T1 and T2 read a record and then update it, then the effect of updating of the first record will be overwritten by the second update.
  • Example:

37 of 93

Here

  • At time t2, transaction-X reads A's value.
  • At time t3, Transaction-Y reads A's value.
  • At time t4, Transactions-X writes A's value on the basis of the value seen at time t2.
  • At time t5, Transactions-Y writes A's value on the basis of the value seen at time t3.
  • So at time T5, the update of Transaction-X is lost because Transaction y overwrites it without looking at its current value.
  • Such type of problem is known as Lost Update Problem as update made by one transaction is lost here.

38 of 93

2. Uncommitted dependency issues

  • There is always a chance that the uncommitted transaction might roll back later.
  • Thus, uncommitted transaction might make other transactions read a value that does not even exist.
  • This leads to inconsistency of the database.
  • Note: It becomes problematic only when the uncommitted transaction fails and roll backs later due to some reason.

In this example,

T2 reads the dirty value of A written by the uncommitted transaction T1.

T1 fails in later stages and roll backs.

Thus, the value that T2 read now stands to be incorrect.

Therefore, database becomes inconsistent.

39 of 93

3. Inconsistent analysis Problem

  • Inconsistent Retrievals Problem is also known as unrepeatable read. When a transaction calculates some summary function over a set of data while the other transactions are updating the data, then the Inconsistent Retrievals Problem occurs.
  • A transaction T1 reads a record and then does some other processing during which the transaction T2 updates the record. Now when the transaction T1 reads the record, then the new value will be inconsistent with the previous value.
  • Example:
  • Suppose two transactions operate on three accounts.

40 of 93

41 of 93

Here

  • Transaction-X is doing the sum of all balance while transaction-Y is transferring an amount 50 from Account-1 to Account-3.
  • Here, transaction-X produces the result of 550 which is incorrect. If we write this produced result in the database, the database will become an inconsistent state because the actual sum is 600.
  • Here, transaction-X has seen an inconsistent state of the database.

42 of 93

4. Dirty read

  • The dirty read occurs in the case when one transaction updates an item of the database, and then the transaction fails for some reason. The updated database item is accessed by another transaction before it is changed back to the original value.
  • A transaction T1 updates a record which is read by T2. If T1 aborts then T2 now has values which have never formed part of the stable database.
  • Example:

43 of 93

Here

  • At time t2, transaction-Y writes A's value.
  • At time t3, Transaction-X reads A's value.
  • At time t4, Transactions-Y rollbacks. So, it changes A's value back to that of prior to t1.
  • So, Transaction-X now contains a value which has never become part of the stable database.
  • Such type of problem is known as Dirty Read Problem, as one transaction reads a dirty value which has not been committed.

44 of 93

5. Non-Repeatable Read

  • Non-Repeatable Read occurs when a second transaction is trying to access the same row several times and reads different data each time.
  • The unrepeatable problem occurs when two or more read operations of the same transaction read different values of the same variable.

  • Example
  • In the above example, once transaction 2 reads the variable X, a write operation in transaction 1 changes the value of the variable X. Thus, when another read operation is performed by transaction 2, it reads the new value of X which was updated by transaction 1.

45 of 93

6. Incorrect Summary issue

  • Incorrect Summary issue occurs when one transaction takes summary over the value of all the instances of a repeated data-item, and second transaction update few instances of that specific data-item. In that situation, the resulting summary does not reflect a correct result.
  • Consider a situation, where one transaction is applying the aggregate function on some records while another transaction is updating these records. The aggregate function may calculate some values before the values have been updated and others after they are updated.
  • In the above example, transaction 2 is calculating the sum of some records while transaction 1 is updating them. Therefore the aggregate function may calculate some values before they have been updated and others after they have been updated.

46 of 93

7. Phantom Read Problem:

  • The phantom read problem occurs when a transaction reads a variable once but when it tries to read that same variable again, an error occurs saying that the variable does not exist.

  • In the above example, once transaction 2 reads the variable X, transaction 1 deletes the variable X without transaction 2’s knowledge. Thus, when transaction 2 tries to read X, it is not able to it.
  • Example:

47 of 93

Concurrency Control Protocol

  • Concurrency control techniques are used to ensure that the Isolation (or non-interference) property of concurrently executing transactions is maintained.
  • Concurrency-control protocols : allow concurrent schedules, but ensure that the schedules are conflict/view serializable, and are recoverable.
  • These protocols do not examine the precedence graph as it is being created, instead a protocol imposes a discipline that avoids non-seralizable schedules.
  • Concurrency control protocols ensure atomicity, isolation, and serializability of concurrent transactions. The concurrency control protocol can be divided into three categories:

  1. Lock based protocol
  2. Time-stamp protocol

48 of 93

1. Lock based protocol

  • A lock is a variable associated with a data item that describes a status of data item with respect to possible operation that can be applied to it.
  • They synchronize the access by concurrent transactions to the database items. It is required in this protocol that all the data items must be accessed in a mutually exclusive manner.
  • Two common locks which are used and some terminology followed in this protocol.

  • Shared Lock (S): also known as Read-only lock. As the name suggests it can be shared between transactions because while holding this lock the transaction does not have the permission to update data on the data item. S-lock is requested using lock-S instruction.
  • Exclusive Lock (X): Data item can be both read as well as written.This is Exclusive and cannot be held simultaneously on the same data item. X-lock is requested using lock-X instruction.

49 of 93

Example:

1. Shared Lock (S):

  • For example, consider a case where two transactions are reading the account balance of a person. The database will let them read by placing a shared lock. However, if another transaction wants to update that account's balance, shared lock prevent it until the reading process is over.

2. Exclusive Lock (X):

  • With the Exclusive Lock, a data item can be read as well as written. This is exclusive and can't be held concurrently on the same data item. X-lock is requested using lock-x instruction. Transactions may unlock the data item after finishing the 'write' operation.
  • For example, when a transaction needs to update the account balance of a person. You can allows this transaction by placing X lock on it. Therefore, when the second transaction wants to read or write, exclusive lock prevent this operation.

50 of 93

Lock Compatibility Matrix

  • A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the item by other transactions.
  • Any number of transactions can hold shared locks on an item, but if any transaction holds an exclusive(X) lock on the item no other transaction may hold any lock on the item.
  • If a lock cannot be granted, the requesting transaction is made to wait till all incompatible locks held by other transactions have been released. Then the lock is granted.

51 of 93

Looking Level

52 of 93

Working of Locking Protocol

53 of 93

Example:

  • Consider two transition T1 and T2

54 of 93

  • Following is the schedule for transition T1 and T2 which is not scheduled properly because of graning locks at improper timing

55 of 93

  • Problems associated with Simple locking:

  1. Data inconsistency between multiple transactions
  2. Deadlock, a situation where the transactions try to access lock on already locked data items
  3. No guarantee of serializability (i.e. execution of a concurrent transaction equivalent to that of a transaction executed serially)

56 of 93

2. Timestamp-based Protocols

  • The timestamp-based algorithm uses a timestamp to serialize the execution of concurrent transactions. This protocol ensures that every conflicting read and write operations are executed in timestamp order. The protocol uses the System Time or Logical Count as a Timestamp.
  • The older transaction is always given priority in this method. It uses system time to determine the time stamp of the transaction. This is the most commonly used concurrency protocol.
  • Lock-based protocols help you to manage the order between the conflicting transactions when they will execute. Timestamp-based protocols manage conflicts as soon as an operation is created.

  • Example:
    • Suppose there are there transactions T1, T2, and T3.
    • T1 has entered the system at time 0010
    • T2 has entered the system at 0020
    • T3 has entered the system at 0030
    • Priority will be given to transaction T1, then transaction T2 and lastly Transaction T3.

57 of 93

Timestamp Ordering Protocol

  • The timestamp-ordering protocol ensures serializability among transactions in their conflicting read and write operations.
  • This is the responsibility of the protocol system that the conflicting pair of tasks should be executed according to the timestamp values of the transactions.

    • The timestamp of transaction Ti is denoted as TS(Ti).
    • Read time-stamp of data-item X is denoted by R-timestamp(X).
    • Write time-stamp of data-item X is denoted by W-timestamp(X).

58 of 93

  • Timestamp ordering protocol works as follows −

  • If a transaction Ti issues a read(X) operation −
    • If TS(Ti) < W-timestamp(X)
      • Operation rejected.
    • If TS(Ti) >= W-timestamp(X)
      • Operation executed.
    • All data-item timestamps updated.

  • If a transaction Ti issues a write(X) operation −
    • If TS(Ti) < R-timestamp(X)
      • Operation rejected.
    • If TS(Ti) < W-timestamp(X)
      • Operation rejected and Ti rolled back.
    • Otherwise, operation executed.

59 of 93

Thomas' Write Rule

  • This rule states if TS(Ti) < W-timestamp(X), then the operation is rejected and Ti is rolled back.

  • Time-stamp ordering rules can be modified to make the schedule view serializable.

  • Instead of making Ti rolled back, the 'write' operation itself is ignored.

60 of 93

61 of 93

Recovery System

  • Database systems, like any other computer system, are subject to failures but the data stored in it must be available as and when required.
  • When a database fails it must possess the facilities for fast recovery. It must also have atomicity i.e. either transactions are completed successfully and committed (the effect is recorded permanently in the database) or the transaction should have no effect on the database.
  • The techniques used to recover the lost data due to system crash, transaction errors, viruses, catastrophic failure, incorrect commands execution etc. are database recovery techniques. So to prevent data loss recovery techniques based on deferred update and immediate update or backing up data can be used.
  • Recovery techniques are heavily dependent upon the existence of a special file known as a system log. It contains information about the start and end of each transaction and any updates which occur in the transaction.
  • The log keeps track of all transaction operations that affect the values of database items. This information is needed to recover from transaction failure.

62 of 93

Failure Classification

  • To see where the problem has occurred, we generalize a failure into various categories, as follows

1. Transaction failure

  • A transaction has to abort when it fails to execute or when it reaches a point from where it can’t go any further. This is called transaction failure where only a few transactions or processes are hurt.
  • Reasons for a transaction failure could be −
  • Logical errors − Where a transaction cannot complete because it has some code error or any internal error condition.
  • System errors − Where the database system itself terminates an active transaction because the DBMS is not able to execute it, or it has to stop because of some system condition. For example, in case of deadlock or resource unavailability, the system aborts an active transaction.

63 of 93

Failure Classification

2. System Crash/ Hardware Failure:

  • There are problems − external to the system − that may cause the system to stop abruptly and cause the system to crash. For example, interruptions in power supply may cause the failure of underlying hardware or software failure.
  • Examples may include operating system errors.

3. Software Failure

4. Media Failure

5. Network Failure

6. Application Software Error

7. Physical Disasters

64 of 93

Log based Recovery in DBMS

  • Atomicity property of DBMS states that either all the operations of transactions must be performed or none. The modifications done by an aborted transaction should not be visible to database and the modifications done by committed transaction should be visible.
  • To achieve our goal of atomicity, user must first output to stable storage information describing the modifications, without modifying the database itself.
  • This information can help us ensure that all modifications performed by committed transactions are reflected in the database. This information can also help us ensure that no modifications made by an aborted transaction persist in the database.
  • Log and log records –
  • The log is a sequence of log records, recording all the update activities in the database. In a stable storage, logs for each transaction are maintained. Any operation which is performed on the database is recorded is on the log. Prior to performing any modification to database, an update log record is created to reflect that modification.
  • An update log record represented as: <Ti, Xj, V1, V2> has these fields:
  • Transaction identifier: Unique Identifier of the transaction that performed the write operation.
  • Data item: Unique identifier of the data item written.
  • Old value: Value of data item prior to write.
  • New value: Value of data item after write operation.

65 of 93

Types of Log Record

  • Example:

66 of 93

Types of Log Record

67 of 93

Types of Log Record

68 of 93

Types of Log Record

69 of 93

70 of 93

Deferred Modification Technique(No-undo/Redo Algorithm)

  • This approach ensures atomicity be recording all database modifications in the log, but deferring the execution of write operations until the transaction is done with its final action (partially committed)
  • .The execution of transaction Ti proceeds as follows:
  • Before Ti starts its execution, a record <Ti start> is written to the log,
  • A write(x) operation by Ti, inserts a new record in the log,
  • When Ti partially commits, a record <Ti commit> is added to the log
  • Consider the following two transactions:

71 of 93

Cont....

  • Assume these two transactions are executed serially in the order of T0, T1. Further assume that the contents of A, B, and C are $1,000, $2,000, and $700.
  • Portion of the log related to these two transactions look like”

<T0 start>

<T0, A, 950> <TO, A, 1000, 950>

<T0, B, 2050> <TO, B, 2000, 2050>

<T0 commit>

<T1 start>

<T1, C, 600>

<T1 commit>

72 of 93

cont........

73 of 93

Case1:

  • Let us look at the following scenarios:
  • Case1: Crash occurs just after the log record write(B) of T0. Then, the
    • log looks as:
    • <T0 start>
    • <T0, A, 950>
    • <T0, B, 2050>
  • When the system comes back up, no redo actions is needed, since no changes has been made to the database and the contents of A and B accounts remain the same. The log records of incomplete transaction T0 may be deleted from the log.

74 of 93

Case2:

  • Crash occurs just after the log record write(C) of T1. Then, the
  • log looks as:

<T0 start>

<T0, A, 950>

<T0, B, 2050>

<T0 commit>

<T1 start>

<T1, C, 600>

  • When the system comes back up, operation redo(T0) is performed (because of <T0 commit>) the contents of A and B accounts will be 950 and 2050, respectively, and C remains the same as before (700). The log records of incomplete transaction T1 may be deleted from the log.

75 of 93

Case3:

  • Crash occurs just after the log record <T1 commit>. Then, the log looks as:

<T0 start>

<T0, A, 950>

<T0, B, 2050>

<T0 commit>

<T1 start>

<T1, C, 600>

<T1 commit>

  • When the system comes back up, operations redo(T0) and redo(T1) are performed (because of <T0 commit> and <T1 commit>). The contents of A, B, and C accounts will be 950, 2050, and 600, respectively.

76 of 93

Deferred Modification Technique(No-undo/Redo Algorithm)

  • Before reaching the commit point, all updates are recorded in the local transaction workspace.
  • Before commit, the updates are recorded persistently in the log, and then after commit, the updates are written to the database on disk.
  • If the transaction fails before reaching commit point, the database is unchanged, so undo is not needed.
  • It may be necessary to redo the effect of the operations of a committed transaction from the log, though.

77 of 93

Immediate Database Update (Undo algorithm)

  • This scheme allows database modification to happen while transaction is in active state. In the event of failure or crash, the system must use the old-value of the log records to store the modified data items to the value they had prior to the execution of transaction (i.e., undo).
  • Before the execution of transaction Ti, the record <Ti start> is written to the log. Any write(x) by Ti precedes by inserting a <Ti, X, Vold, Vnew > record to the log. When Ti partially commits, the <Ti commit> will be recorded to the log.
  • Consider the following two transactions:

78 of 93

Cont...

  • Assume these two transactions are executed serially in the order of T0, T1. Further assume that the contents of A, B, and C are $1,000, $2,000, and $700.
  • Portion of the log related to these two transactions look like:

<T0 start>

<T0, A, 1000, 950>

<T0, B, 2000, 2050>

<T0 commit>

<T1 start>

<T1, C, 700, 600>

<T1 commit>

79 of 93

Cont..

  • One possible order in which the actual output took place in both database and log is as follows:

80 of 93

Cont....

  • The recovery scheme uses two recovery procedures:
    • Undo(Ti) to restore the value of all data items updated by Ti to the old values.
    • Redo(Ti) to set the value of all data items updated by Ti to the new values.
  • After a failure, the recovery scheme consults the log to determine which transactions need to be redone and which transactions need to be undone.
    • Transaction Ti needs to be undone if the log contains <Ti start> but does not contain <Ti commit>.
    • Transaction Ti needs to be redone if the log contains <Ti start> and the <Ti commit>.

81 of 93

Case1

  • Let us look at the following scenarios:
  • Case1: Crash occurs just after the log record write(B) of T0.
  • Then, the log looks as:

<T0 start>

<T0, A, 1000, 950>

<T0, B, 2000, 2050>

  • When the system comes back up, the record <T0 start> is in the log but the record <T0 commit> is not, so T0 must be undone. Undo(T0) restores A and B accounts to 1000 and 2000, respectively

82 of 93

Case2

  • Crash occurs just after the log record write(C) of T1.
  • Then, the log looks as:
    • <T0 start>
    • <T0, A, 1000, 950>
    • <T0, B, 2000, 2050>
    • <T0 commit>
    • <T1 start>
    • <T1, C, 700, 600>
  • When the system comes back up, two recovery actions need to be taken: Undo(T1) and redo(T0). At the end of recovery procedure the contents of A, B, and C accounts will be 950, 2050, and 700, respectively.

83 of 93

Case3:

  • Crash occurs just after the log record <T1 commit>.
  • Then, the log looks as:
    • <T0 start>
    • <T0, A, 1000, 950>
    • <T0, B, 2000, 2050>
    • <T0 commit>
    • <T1 start>
    • <T1, C, 700, 600>
    • <T1 commit>
  • When the system comes back up, both T0 and T1 need to be redone. The contents of A, B, and C accounts will be 950, 2050, and 600, respectively.

84 of 93

Cont.....

  • The database may be updated by some operations of the transaction before reaching its commit point. However, these operations are recorded in the log permanently, by force-writing.
  • If the transaction fails before reaching its commit point then the effect of its operations must be undone (rolled back).

85 of 93

Deadlock in DBMS

86 of 93

Deadlock Prevention

87 of 93

Approach_3: Preventation and Transacition rollback

88 of 93

Deadlock Detection

89 of 93

Recovery from Deadlock

90 of 93

Cont...

91 of 93

  • https://www.slideshare.net/nikunjandy/log-based-and-recovery-with-concurrent-transaction

  • https://hurson.weebly.com/uploads/4/9/2/2/49225097/cs5300_recovery.pdf

92 of 93

  • https://www.geeksforgeeks.org/acid-properties-in-dbms/#:~:text=A%20transaction%20is%20a%20single,These%20are%20called%20ACID%20properties.
  • https://www.tutorialspoint.com/dbms/dbms_transaction.htm
  • https://www.guru99.com/dbms-transaction-management.html

93 of 93

Ref for protocol

  • https://www.guru99.com/dbms-concurrency-control.html
  • https://www.geeksforgeeks.org/lock-based-concurrency-control-protocol-in-dbms/?ref=rp