1 of 45

DATA FILE HANDLING

2 of 45

What are the different types of files?

  • Text files
  • Binary files

3 of 45

The student's name is Babitha, her class is 12B, and her roll number is 12B30.

Table Name: Student

RNO      NAME       CLASS
12B30    BABITHA    12B

['12B30', 'BABITHA', '12B']

4 of 45

INTRODUCTION TO CSV

5 of 45

CSV (comma separated values)

It is used for storing tabular data, such as the data of a spreadsheet or database.

Each line in a CSV file is called a record.

Each record consists of fields separated by commas (the delimiter).
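As a minimal sketch of these terms, the snippet below parses a small CSV text held in memory. The sample text reuses the Student record from the earlier slide; the variable names and the use of io.StringIO in place of a real file are assumptions made for illustration.

```python
import csv
import io

# Sample CSV text: one line of field names and one record,
# with the fields separated by commas (the delimiter).
sample = "RNO,NAME,CLASS\n12B30,BABITHA,12B\n"

# csv.reader accepts any iterable of lines; io.StringIO stands in for a file.
records = list(csv.reader(io.StringIO(sample)))

for record in records:
    print(record)  # each record comes back as a list of its fields
```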


6 of 45

7 of 45

  • Files in the CSV format can be imported to and exported from programs that store data in tables, such as Microsoft Excel or OpenOffice Calc.

  • As already defined, CSV stands for “comma separated values”. Thus, a comma-separated file is a delimited text file that uses a comma to separate values.

8 of 45

Advantages of CSV files

Easier to create

Preferred import and export format for database and spreadsheet

Capable of storing large amounts of data

9 of 45

WHY USE CSV?

  • The extensive use of social networking sites and their associated applications requires the handling of huge amounts of data.
  • CSV files are commonly used because they are easy to read and manage, small in size, and fast to process/transfer.

10 of 45

Thus, in a nutshell, the advantages offered by CSV files are as follows:

• CSV is faster to handle.

• CSV is smaller in size.

• CSV is easy to generate and import onto a spreadsheet or database.

• CSV is human readable and easy to edit manually.

• CSV is simple to implement and parse.

• CSV is processed by almost all existing applications.

11 of 45

CSV FILE HANDLING IN PYTHON

  • For working with CSV files in Python, there is a built-in module called csv. It is used to read and write tabular data in CSV format.
  • Like other files (text and binary) in Python, there are basic operations that can be carried out on a CSV file.

1. Reading a CSV

2. Writing to a CSV.

3. Searching for a record

4. Deleting a Record

5. Updating a Record

12 of 45

Role of Newline Argument in CSV file

  • The newline argument specifies how Python handles newline characters while working with CSV files on different operating systems.
  • Different operating systems store end-of-line (EOL) characters differently:

Symbol/Char     Meaning                       Operating System
CR [\r]         Carriage return               Mac (classic Mac OS)
LF [\n]         Line feed                     Unix
CRLF [\r\n]     Carriage return, line feed    MS-DOS, Windows
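The effect of the newline argument can be sketched as follows. The file name student.csv and the temporary folder are assumptions for this demo; the point is that newline="" switches off Python's own newline translation, so the csv module alone decides how each row ends.

```python
import csv
import os
import tempfile

# Hypothetical file name, created in a temporary folder for this demo.
path = os.path.join(tempfile.mkdtemp(), "student.csv")

# newline="" turns off Python's newline translation; the csv writer
# then ends every row with its default line terminator "\r\n".
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["RNO", "NAME", "CLASS"])
    writer.writerow(["12B30", "BABITHA", "12B"])

# Read the raw bytes back to see the actual end-of-line characters.
with open(path, "rb") as f:
    raw = f.read()
print(raw)
```

Without newline="", Windows would translate the "\n" of each "\r\n" once more, and a blank line would show up between the rows when the file is read back.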

13 of 45

Reading from CSV File

Reading from a CSV file is done using the reader object. The CSV file is opened as a text file with Python’s built-in open() function, which returns a file object.

The file object is then passed to the reader() function, which creates a special type of object for accessing the CSV file (the reader object).

The reader object is an iterable that gives us access to each line of the CSV file as a list of fields. You can use next() directly on it to read the next line of the CSV file, or you can iterate over it in a for loop to read all the lines of the file (as lists of the file’s fields).
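The steps above can be sketched as follows. The sample file is created first so that the example is self-contained; its name and location are assumptions, not part of the original slides.

```python
import csv
import os
import tempfile

# Build a small sample file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "student.csv")
with open(path, "w", newline="") as f:
    f.write("RNO,NAME,CLASS\r\n12B30,BABITHA,12B\r\n")

with open(path, "r", newline="") as f:   # open() returns a file object
    reader = csv.reader(f)               # reader() wraps it in a reader object
    first = next(reader)                 # read one line directly with next()
    rest = [row for row in reader]       # loop over the remaining lines

print(first)
print(rest)
```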

14 of 45

15 of 45

PROGRAM TO WRITE INTO A CSV FILE

16 of 45

PROGRAM TO WRITE INTO A CSV FILE
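A minimal sketch of such a program, assuming the csv module's writer object is used. The file name and the second record are made up for illustration; writerow() writes a single row and writerows() writes several rows in one call.

```python
import csv
import os
import tempfile

# Hypothetical file name, created in a temporary folder for this demo.
path = os.path.join(tempfile.mkdtemp(), "student.csv")

with open(path, "w", newline="") as f:
    writer = csv.writer(f)                       # writer object for the file
    writer.writerow(["RNO", "NAME", "CLASS"])    # write one row (field names)
    writer.writerows([                           # write several rows at once
        ["12B30", "BABITHA", "12B"],
        ["12B31", "SAMPLE NAME", "12B"],         # made-up record
    ])

# Read the file back to check what was written.
with open(path, newline="") as f:
    written = list(csv.reader(f))
print(written)
```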

17 of 45

18 of 45

Programs

  1. Write a function (W.A.F.) to find the number of records in a CSV file.
  2. W.A.F. to print a specified record by asking the user to enter the record number to be printed.
  3. W.A.F. to search for the record of a particular student in a CSV file on the basis of an inputted name.

19 of 45

W.A.F to find the number of records in a CSV file
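One possible version of this function is sketched below. The function name count_records and the sample file are assumptions; note that this simple count includes the line of field names.

```python
import csv
import os
import tempfile

def count_records(filename):
    """Return the number of rows in the CSV file, field names included."""
    with open(filename, newline="") as f:
        return sum(1 for _ in csv.reader(f))

# Sample file with one row of field names and two records (second one made up).
path = os.path.join(tempfile.mkdtemp(), "student.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows([
        ["RNO", "NAME", "CLASS"],
        ["12B30", "BABITHA", "12B"],
        ["12B31", "SAMPLE NAME", "12B"],
    ])

total = count_records(path)
print("Number of records:", total)
```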

20 of 45

21 of 45

To skip the field names (header row): Method 1

22 of 45

To skip the field names (header row): Method 2

  • The next() function returns the current row and advances the iterator to the next row.
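A short sketch of this use of next(), with an in-memory sample standing in for the CSV file (an assumption for illustration): calling next() once consumes the row of field names, so the loop that follows sees only the data rows.

```python
import csv
import io

data = io.StringIO("RNO,NAME,CLASS\n12B30,BABITHA,12B\n")
reader = csv.reader(data)

header = next(reader)    # consumes the row of field names
records = list(reader)   # only the data rows are left

print("Field names:", header)
print("Records:", records)
```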

23 of 45

Program to print the records in the form of comma separated values, instead of lists.
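One way to do this, sketched with an in-memory sample in place of the CSV file, is to join the fields of each row with a comma before printing. This is a simple approach under the assumption that no field itself contains a comma or quotes.

```python
import csv
import io

data = io.StringIO("RNO,NAME,CLASS\n12B30,BABITHA,12B\n")

lines = []
for row in csv.reader(data):
    line = ",".join(row)   # turn the list of fields back into one string
    lines.append(line)
    print(line)
```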

24 of 45

The csv reader object has an attribute called line_num that gives the number of lines read so far from the CSV file.

line_num is nothing but a counter of the lines that have been iterated over; once the whole file has been read, it equals the total number of lines in the file.
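The behaviour of line_num can be seen in the following sketch, which uses an in-memory sample instead of a real file (an assumption for illustration). The counter starts at 0 and grows as rows are read.

```python
import csv
import io

data = io.StringIO("RNO,NAME,CLASS\n12B30,BABITHA,12B\n")
reader = csv.reader(data)

before = reader.line_num           # nothing has been read yet
counts = []
for row in reader:
    counts.append(reader.line_num) # lines read so far
    print(reader.line_num, row)
after = reader.line_num            # total lines once the loop is done
```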

25 of 45

Program to search the record of a particular student from CSV file on the basis of inputted name.
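A possible sketch of such a search is given below. It assumes that the name is stored in the second field of each record, as in the Student table, and that the first line holds the field names; the function name search_by_name and the sample file are made up for illustration.

```python
import csv
import os
import tempfile

def search_by_name(filename, name):
    """Return all records whose NAME field matches name (case-insensitive)."""
    matches = []
    with open(filename, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)                     # skip the field names
        for row in reader:
            # Assumption: NAME is the second field of every record.
            if len(row) > 1 and row[1].strip().lower() == name.strip().lower():
                matches.append(row)
    return matches

# Sample file for the demo.
path = os.path.join(tempfile.mkdtemp(), "student.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows([["RNO", "NAME", "CLASS"],
                             ["12B30", "BABITHA", "12B"]])

found = search_by_name(path, "Babitha")
print(found if found else "Record not found")
```

In an interactive program, the name would typically come from input() rather than being fixed in the code.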

26 of 45

27 of 45


28 of 45


29 of 45

RANDOM ACCESS IN FILES USING TELL() AND SEEK()

  • seek(): The seek() function is used to change the position of the file handle (file pointer) to a given position. The file pointer is like a cursor, which defines where the data is read from or written to in the file.

30 of 45

  • The reference point is defined by the "from_what" argument. It can have any of three values:
  • 0: sets the reference point at the beginning of the file (the default).
  • 1: sets the reference point at the current file position.
  • 2: sets the reference point at the end of the file.
  • In Python 3, the values 1 and 2 can be combined with a nonzero offset only on files opened in binary mode.

31 of 45

seek() can be used in two ways:

  • Absolute positioning
  • Relative positioning

Absolute positioning with seek() gives the byte position, counted from the beginning of the file, at which the file pointer has to place itself. The syntax for seek() is:

f.seek(file_location) # where f is the file object

For example, f.seek(20) moves the file pointer to byte 20 of the file, no matter where it currently is, and returns that new position.
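Absolute positioning can be tried out with the sketch below. The file name and its contents are assumptions chosen so that every character occupies one byte; tell() is used to check where the file pointer ends up.

```python
import os
import tempfile

# Sample text file: 10 digits followed by the 26 lowercase letters.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as f:
    f.write("0123456789abcdefghijklmnopqrstuvwxyz")

with open(path, "r") as f:
    f.read(5)             # move away from the start of the file
    f.seek(20)            # jump to byte 20, counted from the beginning
    position = f.tell()   # current position of the file pointer
    chunk = f.read(3)     # read 3 characters from there

print(position, chunk)
```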

32 of 45

  • Relative positioning takes two arguments: the offset and the reference point from which to move. The syntax for relative positioning is:

f.seek(offset, from_what) # where f is the file object

For example,

  • f.seek(-10, 1) from current position, move 10 bytes backward
  • f.seek(10, 1) from current position, move 10 bytes forward
  • f.seek(-20, 1) from current position, move 20 bytes backward
  • f.seek(10, 0) from beginning of file, move 10 bytes forward
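The calls above can be tried out with the following sketch. The file is opened in binary mode because Python 3 allows nonzero offsets from the current position or from the end only on binary files; the file name and contents are assumptions. Since seek() returns the new position, the results are collected in a list.

```python
import os
import tempfile

# Sample binary file with 36 bytes.
path = os.path.join(tempfile.mkdtemp(), "test.dat")
with open(path, "wb") as f:
    f.write(b"0123456789abcdefghijklmnopqrstuvwxyz")

positions = []
with open(path, "rb") as f:
    positions.append(f.seek(10, 0))    # 10 bytes forward from the beginning
    positions.append(f.seek(10, 1))    # 10 bytes forward from current position
    positions.append(f.seek(10, 1))    # another 10 bytes forward
    positions.append(f.seek(-20, 1))   # 20 bytes backward from current position
    positions.append(f.seek(-10, 1))   # 10 bytes backward from current position
    positions.append(f.seek(-3, 2))    # 3 bytes before the end of the file

print(positions)
```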

33 of 45

Multiple Choice Questions (MCQs)

(a) Which of the following is not a valid mode to open a file?

(i) ab (ii) rw (iii) r+ (iv) w+

(b) Which statement is used to change the file position to an offset value from the start?

(i) fp.seek(offset, 0) (ii) fp.seek(offset, 1)

(iii) fp.seek(offset, 2) (iv) None of the above

34 of 45

(c) The difference between r+ and w+ modes is expressed as?

(i) No difference

(ii) In r+ mode, the pointer is initially placed at the beginning of the file and the pointer is at the end for w+

(iii) In w+ mode, the pointer is initially placed at the beginning of the file and the pointer is at the end for r+

(iv) Depends on the operating system

35 of 45

(d) Which module is used for working with CSV files in Python?

(i) random (ii) statistics (iii) csv (iv) math

(e) Which of the following modes is used for both writing and reading from a binary file?

(i) wb+ (ii) w (iii) wb (iv) w+

36 of 45

(f) Which statement is used to retrieve the current position within the file?

(i) fp.seek() (ii) fp.tell() (iii) fp.loc (iv) fp.pos

(g) What happens if no arguments are passed to the seek() method?

(i) file position is set to the start of file
(ii) file position is set to the end of file
(iii) file position remains unchanged
(iv) results in an error

37 of 45

2. Consider the following code:

f = open("test", "wb+") # binary mode, needed in Python 3 for seeking from the end

f.write(b"0123456789abcdef")

f.seek(-3, 2) # Statement 1

print(f.read(2)) # Statement 2

Explain statement 1 and give output of statement 2.

38 of 45

Ans. Statement 1 uses the seek() method, which positions the file object at a particular place in the file.

Its syntax is:

fileobject.seek(offset [, from_what])

So, f.seek(-3, 2) positions the file object 3 bytes before the end of the file.

Output of Statement 2 is:

b'de'

It reads 2 bytes from where the file object is placed. In Python 3, a file opened in text mode (such as "w+") raises an error for a nonzero offset from the end of the file, which is why the file is opened in binary mode here.

39 of 45

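3. Yogendra intends to position the file pointer at the tenth character from the current position of a text file. Write a Python statement for the same, assuming "F" is the file object.

  • Ans. F.seek(10,1)
  • Note: in Python 3 this statement works only if the file is opened in binary mode; on a file opened in text mode, a nonzero offset from the current position raises an error.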

40 of 45

4. In which of the following file modes the existing data of the file will not be lost?

rb, ab, w, w+b, a+b, wb, wb+, w+, r+

Ans. In file modes rb, ab, a+b and r+, data will not be lost.

In file modes w, w+b, wb, wb+ and w+, data will be truncated i.e. lost.
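The difference can be checked with a small experiment, sketched below with a made-up file name and contents: append mode and r+ mode keep the existing data, while w mode empties the file as soon as it is opened.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w") as f:       # create the file with some data
    f.write("old data")

with open(path, "a") as f:       # append mode keeps the existing data
    f.write(" + more")
with open(path) as f:
    after_append = f.read()

with open(path, "r+") as f:      # r+ also keeps the existing data
    first = f.read(3)

with open(path, "w") as f:       # w truncates the file on opening
    pass
with open(path) as f:
    after_w = f.read()

print(repr(after_append), repr(first), repr(after_w))
```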

41 of 45

5. Write a statement in Python to perform the following operations:

(a) To open a text file "Book.txt" in read mode

(b) To open a binary file "Book.dat" in write mode

Ans. (a) f = open("Book.txt", "r")

(b) f = open("Book.dat", "wb")

42 of 45

6. What is the output of the following code?

fh = open("test.txt", "r")

Size = len(fh.read())

print(fh.read(5))

Ans. No visible output; only an empty line is printed.

Explanation. The fh.read() on line 2 reads the entire file content and leaves the file pointer at the end of the file. The following fh.read(5) therefore returns an empty string, as there is nothing left to read at EOF, and the print() statement outputs just a blank line.
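This behaviour can be verified with the sketch below, which first creates a sample test.txt in a temporary folder (the contents are an assumption). Moving the file pointer back with seek(0) makes the data readable again.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as f:
    f.write("Hello, world")

fh = open(path, "r")
size = len(fh.read())     # reads everything; the pointer is now at EOF
leftover = fh.read(5)     # nothing left to read
print(repr(leftover))     # shows the empty string explicitly

fh.seek(0)                # move the pointer back to the beginning
again = fh.read(5)
print(again)
fh.close()
```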

43 of 45

7. Give the output of the following snippet:

4

8

44 of 45

8. Give the output of the following snippet

45 of 45

A function called words() in Python to count the number of words in the text file count.txt

A function called max_word() in Python to show the word with maximum length from the text file count.txt
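The two functions could be written along the following lines. This is only a sketch: it treats anything separated by whitespace as a word, takes the file name as a parameter with count.txt as the default, and uses a made-up sample text stored in a temporary folder.

```python
import os
import tempfile

def words(filename="count.txt"):
    """Return the number of whitespace-separated words in the file."""
    with open(filename) as f:
        return len(f.read().split())

def max_word(filename="count.txt"):
    """Return the longest word in the file (the first one if there is a tie)."""
    with open(filename) as f:
        all_words = f.read().split()
    return max(all_words, key=len) if all_words else ""

# Sample count.txt in a temporary folder so the demo is self-contained.
path = os.path.join(tempfile.mkdtemp(), "count.txt")
with open(path, "w") as f:
    f.write("Python makes file handling simple\n")

word_count = words(path)
longest = max_word(path)
print("Number of words:", word_count)
print("Word with maximum length:", longest)
```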