1 of 46

Introduction to Data Structures

By

Mr. Abhijit T. Somnathe

2 of 46

Definition

  • Data: Collection of raw facts.
  • Data structure is representation of the logical relationship existing between individual elements of data.
  • Data structure is a specialized format for organizing and storing data in memory that considers not only the elements stored but also their relationship to each other.

3 of 46

Introduction

  • Data structure affects the design of both structural & functional aspects of a program.
  • Program =algorithm + Data Structure
  • An algorithm is a step by step procedure to solve a particular function.

4 of 46

Introduction

  • The major or the common operations that can be performed on the data structures are:
  • Searching: We can search for any element in a data structure.
  • Sorting: We can sort the elements of a data structure either in an ascending or descending order.

5 of 46

Introduction

  • Insertion: We can also insert the new element in a data structure.
  • Updation: We can also update the element, i.e., we can replace the element with another element.
  • Deletion: We can also perform the delete operation to remove the element from the data structure.

6 of 46

Classification of Data Structure

  • Data structure are normally divided into two broad categories:
    • Primitive Data Structure
    • Non-Primitive Data Structure

7 of 46

Classification of Data Structure

8 of 46

Primitive Data Structure

  • These are basic structures and directly operated upon by the machine instructions.
  • Data structures that are directly operated upon the machine-level instructions are known as primitive data structures.
  • Integer, Floating-point number, Character constants, string constants, pointers etc, fall in this category.

9 of 46

Primitive Data Structure

  • The most commonly used operation on Primitive data structure are broadly categorized into following types:
  • Create
  • Selection
  • Updating
  • Destroy or Delete

10 of 46

Non-Primitive Data Structure

  • These are more sophisticated data structures.
  • The Data structures that are derived from the primitive data structures are called Non-primitive data structure.
  • The non-primitive data structures emphasize on structuring of a group of homogeneous (same type) or heterogeneous (different type) data items.

11 of 46

Non-Primitive Data Structure

Linear Data structures:

  • Linear Data structures are kind of data structure that has homogeneous elements.
  • The data structure in which elements are in a sequence and form a linear series.
  • Linear data structures are very easy to implement, since the memory of the computer is also organized in a linear fashion.
  • Some commonly used linear data structures are Stack, Queue and Linked Lists.

12 of 46

Difference Between Primitive and Non – Primitive

  • A primitive data structure is generally a basic structure that is usually built into the language, such as an integer, a float.

  • A non-primitive data structure is built out of primitive data structures linked together in meaningful ways, such as a or a linked-list, binary search tree, AVL Tree, graph etc.

13 of 46

Array

  • An array is defined as a set of finite number of homogeneous elements or same data items.
  • It means an array can contain one type of data only, either all integer, all float-point number or all character.

14 of 46

Array

  • Simply, declaration of array is as follows: int Arr[10]
  • Where int specifies the data type or type of elements arrays stores.
  • “Arr” is the name of array & the number specified inside the square brackets is the number of elements an array can store. This is also called sized or length of array.

15 of 46

Array

  • The elements of linear array are stored in consecutive memory locations. It is shown below:

16 of 46

Array

  • The elements of array will always be stored in the consecutive (continues) memory location.
  • The number of elements that can be stored in an array, that is the size of array or its length is given by the following equation:

(Upperbound-lowerbound)+1

17 of 46

Array

  • For the above array it would be (9-0)+1=10,where 0 is the lower bound of array and 9 is the upper bound of array.
  • Array can always be read or written through loop.

For(i=0;i<=9;i++)

{ scanf(“%d”,&arr[i]);

printf(“%d”,arr[i]); }

18 of 46

Array Types

  • Single Dimension Array
  • Array with one subscript
  • Two Dimension Array
  • Array with two subscripts (Rows and Column)
  • Multi Dimension Array
  • Array with Multiple subscripts
  • Mostly, Two dimensional array are used as Multi Dimensional Array.

19 of 46

Single Dimension Array

  • An array is defined as a set of finite number of homogeneous elements or same data items.
  • It means an array can contain one type of data only, either all integer, all float-point number or all character.

20 of 46

Two Dimension Array

  • A two dimensional array is a collection of elements and each element is identified by a pair of subscripts. ( A[3] [3] )
  • The elements are stored in continuous memory locations.
  • The elements of two-dimensional array as rows and columns.

21 of 46

Two Dimension Array

  • The number of rows and columns in a matrix is called as the order of the matrix and denoted as m*n.
  • The number of elements can be obtained by multiplying number of rows and number of columns.

A[0]

A[1]

A[2]

A[0]

10

20

30

A[1]

40

50

60

A[2]

70

80

90

22 of 46

Two Dimension Array

  • A is the array of order m*n.
  • To store m*n number of elements, we need m*n memory locations.
  • The elements should be in contiguous memory locations.
  • There are two methods:
  • Row-major method
  • Column-major method

23 of 46

Two Dimension Array

  • Row-Major Method: All the first-row elements are stored in sequential memory locations and then all the second-row elements are stored and so on. Ex: A[Row][Col].

1000

10

A[0][0]

1002

20

A[0][1]

1004

30

A[0][2]

1006

40

A[1][0]

1008

50

A[1][1]

1010

60

A[1][2]

1012

70

A[2][0]

1014

80

A[2][1]

1016

90

A[2][2]

24 of 46

Two Dimension Array

  • Column-Major Method: All the first column elements are stored in sequential memory locations and then all the second- column elements are stored and so on. Ex: A [Col][Row].

1000

10

A[0][0]

1002

40

A[1][0]

1004

70

A[2][0]

1006

20

A[0][1]

1008

50

A[1][1]

1010

80

A[2][1]

1012

30

A[0][2]

1014

60

A[1][2]

1016

90

A[2][2]

25 of 46

Static and Dynamic Array

  • Static Array: A fixed array/static array is an array for which the size or length is determined when the array is created and/or allocated.
  • Static arrays have their size or length determined when the array is created and/or allocated.
  • They may also be referred to as fixed-length arrays or fixed arrays.
  • Array values may be specified when the array is defined, or the array size may be defined without specifying array contents.

26 of 46

Static and Dynamic Array

  • Dynamic Array: A dynamic array is a random access, variable-size list data structure that allows elements to be added or removed.
  • It is supplied with standard libraries in many modern programming languages.
  • Dynamic arrays overcome a limit of static arrays, which have a fixed capacity that needs to be specified at allocation.

27 of 46

Static and Dynamic Array

  • Dynamic arrays allow elements to be added and removed at runtime. Most current programming languages include built-in or standard library functions for creating and managing dynamic arrays.

28 of 46

Advantages of Array

  • It is used to represent multiple data items of same type by using single name.
  • It can be used to implement other data structures like linked lists, stacks, queues, tree, graphs etc.
  • Two-dimensional arrays are used to represent matrices.
  • Many databases include one-dimensional arrays whose elements are records.

29 of 46

Disadvantages of Array

  • We must know in advance, how many elements are to be stored in array.
  • Array is static structure. It means that array is of fixed size. The memory which is allocated to array cannot be increased or decreased.
  • If we allocate more memory than requirement then the memory space will be wasted.
  • The elements of array are stored in consecutive memory locations. So insertion and deletion are very difficult and time consuming.

30 of 46

Algorithm

  • An algorithm is a set of instructions to be followed to solve a problem.
    • There can be more than one solution (more than one algorithm) to solve a given problem.
    • An algorithm can be implemented using different programming languages on different platforms.
  • An algorithm must be correct. It should correctly solve the problem.

31 of 46

Algorithm

  • Once we have a correct algorithm for a problem, we have to determine the efficiency of that algorithm.
  • There are two aspects of algorithmic performance:
  • Time
  • Instructions take time.
  • How fast does the algorithm perform?
  • What affects its runtime?

32 of 46

Algorithm

  • Space
  • Data structures take space
  • What kind of data structures can be used?
  • How does choice of data structure affect the runtime?

  • We will focus on time:
  • How to estimate the time required for an algorithm
  • How to reduce the time required

33 of 46

Algorithm Analysis

  • Algorithm analysis is an important part of computational complexity theory, which provides theoretical estimation for the required resources of an algorithm to solve a specific computational problem.
  • Analysis of algorithms is the determination of the amount of time and space resources required to execute it.

34 of 46

Why Analysis of Algorithms is important?

  • To predict the behavior of an algorithm without implementing it on a specific computer.
  • It is much more convenient to have simple measures for the efficiency of an algorithm than to implement the algorithm and test the efficiency every time a certain parameter in the underlying computer system changes.

35 of 46

Why Analysis of Algorithms is important?

  • It is impossible to predict the exact behavior of an algorithm. There are too many influencing factors.
  • The analysis is thus only an approximation; it is not perfect.
  • More importantly, by analyzing different algorithms, we can compare them to determine the best one for our purpose.

36 of 46

What to Analyze?

  • Correctness
  • Does the input/output relation match algorithm requirement?
  • Amount of Work Done
  • Basic operations to do task finite amount of time.
  • Amount of Space Used
  • Memory used.
  • Simplicity, Clarity
  • Verification and implementation.

37 of 46

What to Analyze?

  • Optimality
  • Is it impossible to do better?
  • Amount of Space Used
  • Memory used.
  • Time Complexity
  • Amount of computer time it needs to execute the program to get the intended result.
  • Space Complexity
  • Memory requirements based on the problem size.

38 of 46

Characteristics of Algorithm

  • An algorithm should have the following characteristics −
  • Unambiguous − Algorithm should be clear and unambiguous. Each of its steps (or phases), and their inputs/outputs should be clear and must lead to only one meaning.
  • Input − An algorithm should have well-defined inputs.

39 of 46

Characteristics of Algorithm

  • Output − An algorithm should have well-defined outputs, and should match the desired output.
  • Finiteness − Algorithms must terminate after a finite number of steps.
  • Feasibility − Should be feasible with the available resources.
  • Independent − An algorithm should have step-by-step directions, which should be independent of any programming code.

40 of 46

Time and Space Complexities

  • Time complexity is the amount of time taken by an algorithm to run, as a function of the length of the input.
  • Time complexity measures the time taken to execute each statement of code in an algorithm.
  • If a statement is set to execute repeatedly then the number of times that statement gets executed is equal to N multiplied by the time required to run that function each time.

41 of 46

Time and Space Complexities

  • Space complexity is a function describing the amount of memory (space) an algorithm takes in terms of the amount of input to the algorithm.
  • When we say “this algorithm takes constant extra space,” because the amount of extra memory needed doesn’t vary with the number of items processed.

42 of 46

Asymptotic Notations

  • Asymptotic notations are the mathematical notations used to describe the running time of an algorithm when the input tends towards a particular value or a limiting value.
  • There are mainly three asymptotic notations:
  • Big-O notation
  • Omega notation
  • Theta notation

43 of 46

Big-O Notation (O-notation)

  • Big-O notation represents the upper bound of the running time of an algorithm. Thus, it gives the worst-case complexity of an algorithm.

  • Since it gives the worst-case running time of an algorithm, it is widely used to analyze an algorithm as we are always interested in the worst-case scenario.
  • O(g(n)) = { f(n): there exist positive constants c and n0 such that
  • 0 ≤ f(n) ≤ cg(n) for all n ≥ n0 }

44 of 46

Omega Notation (Ω-notation)

  • Omega notation represents the lower bound of the running time of an algorithm. Thus, it provides the best case complexity of an algorithm.

  • For any value of n, the minimum time required by the algorithm is given by Omega Ω(g(n)).
  • Ω(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ cg(n) ≤ f(n) for all n ≥ n0 }

45 of 46

Theta Notation (Θ-notation)

  • Theta notation encloses the function from above and below. Since it represents the upper and the lower bound of the running time of an algorithm, it is used for analyzing the average-case complexity of an algorithm.

  • If a function f(n) lies anywhere in between c1g(n) and c2g(n) for all n ≥ n0, then f(n) is said to be asymptotically tight bound.
  • Θ(g(n)) = { f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0 }

46 of 46

ANY DOUBTS?