1 of 28

Overview of Physical Storage Media�

2 of 28

Several types of data storage exist in most computer systems.�They vary in speed of access, cost per unit of data, and reliability.

3 of 28

  • Another classification: Primary, secondary, and tertiary storage.
    • Primary storage: the fastest storage media, such as cache and main memory.
    • Secondary (or on-line) storage: the next level of the hierarchy, e.g., magnetic disks.
    • Tertiary (or off-line) storage: magnetic tapes and optical disk juke boxes.
  • Volatility of storage. Volatile storage loses its contents when the power is removed. Without power backup, data in the volatile storage (the part of the hierarchy from main memory up) must be written to nonvolatile storage for safekeeping.

4 of 28

  • Preferred secondary storage device for high storage capacity and low cost.

  • Data stored as magnetized areas on magnetic disk surfaces.

  • A disk pack contains several magnetic disks connected to a rotating spindle.

  • Disks are divided into concentric circular tracks on each disk surface. Track capacities vary typically from 4 to 50 Kbytes.

5 of 28

  • Because a track usually contains a large amount of information, it is divided into smaller blocks or sectors.
  • The division of a track into sectors is hard-coded on the disk surface and cannot be changed. One type of sector organization calls a portion of a track that subtends a fixed angle at the center as a sector.

  • A track is divided into blocks. The block size B is fixed for each system. Typical block sizes range from B=512 bytes to B=4096 bytes. Whole blocks are transferred between disk and main memory for processing.

6 of 28

Disk Storage Devices

7 of 28

  • A read-write head moves to the track that contains the block to be transferred. Disk rotation moves the block under the read-write head for reading or writing.
  • A physical disk block (hardware) address consists of a cylinder number (imaginary collection of tracks of same radius from all recorded surfaces), the track number or surface number (within the cylinder), and block number (within track).
  • Reading or writing a disk block is time consuming because of the seek time s and rotational delay (latency) rd.
  • Double buffering can be used to speed up the transfer of contiguous disk blocks.

8 of 28

9 of 28

Records

  • Fixed and variable length records
  • Records contain fields which have values of a particular type (e.g., amount, date, time, age)
  • Fields themselves may be fixed length or variable length
  • Variable length fields can be mixed into one record: separator characters or length fields are needed so that the record can be “parsed”.

10 of 28

Blocking

  • Blocking: refers to storing a number of records in one blo ck on the disk.

  • Blocking factor (bfr) refers to the number of records per block.

  • There may be empty space in a block if an integral number of records do not fit in one block.

  • Spanned Records: refer to records that exceed the size of one or more blocks and hence span a number of blocks.

11 of 28

Files of Records

  • A file is a sequence of records, where each record is a collection of data values (or data items).

  • A file descriptor (or file header ) includes information that describes the file, such as the field names and their data types, and the addresses of the file blocks on disk.

  • Records are stored on disk blocks. The blocking factor bfr for a file is the (average) number of file records stored in a disk block.

  • A file can have fixed-length records or variable-length records.

12 of 28

Files of Records (cont.)

  • File records can be unspanned (no record can span two blocks) or spanned (a record can be stored in more than one block).
  • The physical disk blocks that are allocated to hold the records of a file can be contiguous, linked, or indexed.
  • In a file of fixed-length records, all records have the same format. Usually, unspanned blocking is used with such files.
  • Files of variable-length records require additional information to be stored in each record, such as separator characters and field types. Usually spanned blocking is used with such files.

13 of 28

Indexing Structures for Files

  • Types of Single-level Ordered Indexes
    • Primary Indexes
    • Clustering Indexes
    • Secondary Indexes
  • Multilevel Indexes
  • Dynamic Multilevel Indexes Using B-Trees and B+-Trees
  • Indexes on Multiple Keys

14 of 28

B tree�

B-Tree is a self-balanced search tree in which every node contains multiple keys and has more than two children.

15 of 28

Here, the number of keys in a node and number of children for a node depends on the order of B-Tree. Every B-Tree has an order.

B-Tree of Order m has the following properties...

  • Property #1 - All leaf nodes must be at same level.
  • Property #2 - All nodes except root must have at least [m/2]-1 keys and maximum of m-1 keys.
  • Property #3 - All non leaf nodes except root (i.e. all internal nodes) must have at least m/2 children.
  • Property #4 - If the root node is a non leaf node, then it must have atleast 2 children.
  • Property #5 - A non leaf node with n-1 keys must have n number of children.
  • Property #6 - All the key values in a node must be in Ascending Order.

16 of 28

  • For example, B-Tree of Order 4 contains a maximum of 3 key values in a node and maximum of 4 children for a node

17 of 28

Insertion Operation in B-Tree�

  • In a B-Tree, a new element must be added only at the leaf node. That means, the new keyValue is always attached to the leaf node only. The insertion operation is performed as follows...
  • Step 1 - Check whether tree is Empty.
  • Step 2 - If tree is Empty, then create a new node with new key value and insert it into the tree as a root node.
  • Step 3 - If tree is Not Empty, then find the suitable leaf node to which the new key value is added using Binary Search Tree logic.
  • Step 4 - If that leaf node has empty position, add the new key value to that leaf node in ascending order of key value within the node.
  • Step 5 - If that leaf node is already full, split that leaf node by sending middle value to its parent node. Repeat the same until the sending value is fixed into a node.
  • Step 6 - If the spilting is performed at root node then the middle value becomes new root node for the tree and the height of the tree is increased by one.

18 of 28

Construct a B-Tree of Order 3 by inserting numbers from 1 to 10.��

19 of 28

20 of 28

21 of 28

22 of 28

23 of 28

24 of 28

25 of 28

Search Operation in B-Tree�

  • Step 1 - Read the search element from the user.
  • Step 2 - Compare the search element with first key value of root node in the tree.
  • Step 3 - If both are matched, then display "Given node is found!!!" and terminate the function
  • Step 4 - If both are not matched, then check whether search element is smaller or larger than that key value.
  • Step 5 - If search element is smaller, then continue the search process in left subtree.
  • Step 6 - If search element is larger, then compare the search element with next key value in the same node and repeate steps 3, 4, 5 and 6 until we find the exact match or until the search element is compared with last key value in the leaf node.
  • Step 7 - If the last key value in the leaf node is also not matched then display "Element is not found" and terminate the function.

26 of 28

Deletion Operation�

Before going through the steps below, one must know these facts about a B tree of degree m.

  1. A node can have a maximum of m children. (i.e. 3)
  2. A node can contain a maximum of m - 1 keys. (i.e. 2)
  3. A node should have a minimum of ⌈m/2⌉ children. (i.e. 2)
  4. A node (except root node) should contain a minimum of ⌈m/2⌉ - 1 keys. (i.e. 1)

27 of 28

There are three main cases for deletion operation in a B tree.

  • Case I :The key to be deleted lies in the leaf.
  • Case II : If the key to be deleted lies in the internal node
  • Case III: If the target key lies in an internal node

28 of 28