1 of 29

Reading and Writing Text Files

2 of 29

Data is often stored in a file on your computer

In what way would we like to have this data available in our program? What are some possibilities?

3 of 29

Opening a File

  • Before we can read or write the contents of a file we must tell Python which file we are going to work with.
  • This is done with the open() function

  • open() returns a file handle
    • What should we do with the file handle returned by open()?
      • store it in a variable
    • a file handle is NOT A FILE; a file handle is a file object which is used to perform operations on the actual file

4 of 29

open() function

myVarName = open(filename, mode)

    • open() returns a file handle
    • filename is a string containing the name of the file
      • for this class the file should be in SAME FOLDER as the program
    • mode
      • no mode given: read from the file (default)
      • 'r' to read from the file
      • 'w' to write to the file (overwrites if file already exists).
      • 'a' to append to the file.

myfhand = open('countries.txt') # read (default)

myfhand = open('countries.txt', 'r') # read

myfhand = open('countries.txt', 'w') # (over)write

myfhand = open('countries.txt', 'a') # append
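
As a minimal sketch of the three modes, the snippet below writes a small file, appends to it, and reads it back. The file name demo_countries.txt and the temporary folder are assumptions made for illustration, so that no real file is overwritten.

```python
import os
import tempfile

# Work in a throwaway folder so no real file is touched (illustrative setup).
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'demo_countries.txt')  # hypothetical file name

fh = open(path, 'w')      # 'w' creates the file (or overwrites an existing one)
fh.write('Albania\n')
fh.close()

fh = open(path, 'a')      # 'a' adds to the end of the existing file
fh.write('Algeria\n')
fh.close()

fh = open(path)           # no mode given, so the file is opened for reading
contents = fh.read()
fh.close()

print(contents)
```

Opening the same file with 'w' a second time would discard 'Albania', which is why 'a' is used for the second write.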

5 of 29

Using with statement to open() file

with open(filename, mode) as myVarName:
    indented code to read/write file

    • open() creates a file handle stored in myVarName.
    • filename is a string containing the name of the file
      • the file should be in SAME FOLDER as the program
    • mode
      • no mode given: read from the file (default)
      • 'r' to read from the file
      • 'w' to write to the file (overwrites if file already exists).
      • 'a' to append to the file.
    • automatically closes file when done with the with block

6 of 29

Using with statement to open() file

Each with line below must be followed by indented code to access the file.

with open('countries.txt', 'r') as myfhand:   # read
with open('countries.txt') as myfhand:        # read (default)
with open('countries.txt', 'w') as myfhand:   # (over)write
with open('countries.txt', 'a') as myfhand:   # append
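
A small sketch of the automatic close, assuming a throwaway file named demo.txt (a hypothetical name): the closed attribute of the file object is checked inside and after the with block.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'demo.txt')  # hypothetical file

with open(path, 'w') as myfhand:
    myfhand.write('hello\n')
    inside = myfhand.closed    # False: the file is still open inside the block

outside = myfhand.closed       # True: the file was closed when the block ended

print(inside, outside)
```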

7 of 29

What is a File Handle?

with open('mbox.txt') as firstHandle:   # firstHandle is a file object
    print(firstHandle)
    print(firstHandle.read())

Output (the encoding shown may differ by platform):

<_io.TextIOWrapper name='mbox.txt' mode='r' encoding='UTF-8'>
From stephen.m..
Return-Path: <p..
Date: Sat, 5 Jan..
⋮ ⋮

read() is one of the file object methods

8 of 29

What if the file does not exist?

with open('stuff.txt') as fh:

Traceback (most recent call last):
    with open('stuff.txt') as fh:
FileNotFoundError: [Errno 2] No such file or directory: 'stuff.txt'

(directory is another name for a folder)

If you are unsure whether a file exists, how can we call the open() function without crashing the program?

import os.path

if os.path.isfile('stuff.txt'):
    with open('stuff.txt') as fh:   # file exists
        pass
else:
    print("file does not exist")
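
The os.path.isfile() check can be wrapped in a small helper, sketched here under stated assumptions: the function name read_if_exists and the temporary test files are hypothetical and only serve to show both outcomes.

```python
import os.path
import tempfile

def read_if_exists(fname):
    """Return the file's text, or None if fname is not an existing file."""
    if os.path.isfile(fname):
        with open(fname) as fh:
            return fh.read()
    return None

# Illustrative setup: one file that exists and one that does not.
folder = tempfile.mkdtemp()
present = os.path.join(folder, 'stuff.txt')
missing = os.path.join(folder, 'notthere.txt')
with open(present, 'w') as fh:
    fh.write('some text\n')

found = read_if_exists(present)
not_found = read_if_exists(missing)
print(repr(found))
print(not_found)
```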

9 of 29

\n - the newline character

  • newline is a character in a string which represents a line break, which means that after this character, a new line will start.

  • In a string it is represented as \n

  • \n is still one character in the string - it is not two ( \ + n)

stuff = 'X\nYZ'
print(stuff)
print(len(stuff))
if stuff[1] == "\n":
    print(stuff[2:])

Output:

X
YZ
4
YZ
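
One way to make the newline visible is repr(), which shows the escape sequence instead of breaking the line. This is a short sketch using the same string as above; the variable name pieces is an assumption for illustration.

```python
stuff = 'X\nYZ'

print(repr(stuff))           # shows the \n escape instead of a line break
print(len(stuff))            # the newline counts as a single character
pieces = stuff.split('\n')   # split the string at the newline
print(pieces)
```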

10 of 29

File Processing

  • A text file is an iterable
  • What other iterable have we seen?
    • String - a sequence of characters
  • A text file is a sequence of ....

1  Country;Area(sq km);Electricity - consumption(kWh); ...
2  String;double;double;double;double;double;double
3  Afghanistan;647500;652200000;540000000;446000000;...
4  Akrotiri;123;;;;;
5  Albania;28748;6760000000;5680000000;552400000;...
6  Algeria;2381740;23610000000;25760000000;32160000000;...
⋮ ⋮ ⋮

This file has 264 lines

11 of 29

File Processing

Each line in a text file has a newline character at the end that you cannot see.

1  Country;Area(sq km);Electricity - consumption(kWh); ...\n
2  String;double;double;double;double;double;double\n
3  Afghanistan;647500;652200000;540000000;446000000;...\n
4  Akrotiri;123;;;;;\n
5  Albania;28748;6760000000;5680000000;552400000;...\n
6  Algeria;2381740;23610000000;25760000000;32160000000;...\n
⋮ ⋮ ⋮

12 of 29

3 ways to read from a file

Iterate over the whole file one line at a time, i.e. treat the file as a sequence of lines

with open('stuff.txt') as fhandle:
    for line in fhandle:
        pass    # line is a string

Explicitly read the next line of the file

with open('stuff.txt') as fhandle:
    oneLine = fhandle.readline()    # oneLine is a string

Read the whole file all at one time

with open('stuff.txt') as fhandle:
    theWholeThing = fhandle.read()    # theWholeThing is a string
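
The three approaches can be compared side by side on a tiny file. This is a sketch only: the three-line contents and the temporary location of stuff.txt are made up for illustration.

```python
import os
import tempfile

# Illustrative setup: a three-line file in a throwaway folder.
path = os.path.join(tempfile.mkdtemp(), 'stuff.txt')
with open(path, 'w') as fhandle:
    fhandle.write('first\nsecond\nthird\n')

# 1. Iterate over the file: each pass of the loop gives one line as a string
lines = []
with open(path) as fhandle:
    for line in fhandle:
        lines.append(line)

# 2. readline(): a single string holding the next line
with open(path) as fhandle:
    oneLine = fhandle.readline()

# 3. read(): a single string holding the whole file
with open(path) as fhandle:
    theWholeThing = fhandle.read()

print(lines)
print(repr(oneLine))
print(repr(theWholeThing))
```

In every case the newline at the end of each line is part of the string that comes back.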

13 of 29

Iterate one line at a time

  • A file handle open for read can be treated as a sequence of lines of the file

  • We can use a for in loop to iterate through the sequence of lines in a file ONE LINE AT A TIME

with open('countries.txt') as cfile:
    for line in cfile:
        print(line)

Country;Area(sq km);Electric…

String;double;double…

Afghanistan;647500;652200000;…

14 of 29

Read the NEXT line of the file

  • We can read the next line of the file into a single string using readline()

  • Each time we execute a readline() we read the next line in the file

with open('countries.txt') as fh:
    ckey = fh.readline()
    print(ckey)
    print(len(ckey))
    tkey = fh.readline()
    print(tkey[:20])

Output:

Country;Area(sq km);Electric…
1155
String;double;double

The first lines of countries.txt:

1  Country;Area(sq km);Electricity - consumption(kWh); ...
2  String;double;double;double;double;double;double
3  Afghanistan;647500;652200000;540000000;446000000;...
⋮ ⋮ ⋮

15 of 29

Read the WHOLE file

  • We can use a single read() call to read the whole file (newlines and all) into a single string.

with open('countries.txt') as fh:
    everything = fh.read()
    print(len(everything))
    print(everything[:20])

Output:

64856
Country;Area(sq km);

16 of 29

Newline Problem

Akrotiri;123;;;;;

Albania;28748;67600000...

Algeria;2381740;2361000...

American Samoa;199;1209...

Why are there blank lines between the lines of text?

with open('countries.txt') as cfile:
    for line in cfile:
        print(line)

17 of 29

Newline Problem

Each line from the file already has a newline at the end.

The print() function then adds its own newline after each line as well.

Akrotiri;123;;;;;\n

\n

Albania;28748;67600000...\n

\n

Algeria;2381740;2361000...\n

\n

American Samoa;199;1209...\n

...

with open('countries.txt') as cfile:
    for line in cfile:
        print(line)

18 of 29

Newline Problem

  • We can strip the “whitespace” (i.e. spaces, tabs, or newlines) from the right-hand side of a string using the rstrip() string method

with open('countries.txt') as cfile:
    for line in cfile:
        line = line.rstrip()
        print(line)

Akrotiri;123;;;;;
Albania;28748;67600000...
Algeria;2381740;2361000...
American Samoa;199;1209...

  • Remember strings are immutable, so rstrip() cannot modify the existing string; instead it returns a new, modified string (which we must then assign to a variable).

19 of 29

rstrip()

[ ] means the parameter is optional

20 of 29

rstrip()

str is just the variable name in the example.
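
A short sketch of rstrip() with and without its optional argument, using one line from the countries data. The variable names stripped and no_semis are assumptions for illustration.

```python
line = 'Akrotiri;123;;;;;\n'

stripped = line.rstrip()          # no argument: trailing whitespace is removed
print(repr(line))                 # the original string is unchanged
print(repr(stripped))

no_semis = stripped.rstrip(';')   # optional argument: the characters to remove
print(repr(no_semis))
```

Passing ';' removes every trailing semicolon, not just one, because the argument is treated as a set of characters to strip.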

21 of 29

Counting Lines in a File

  • Open a file

  • Use a for loop to read each line

  • Count the lines and print out the number of lines

with open('countries.txt') as fh:
    count = 0
    for line in fh:
        count = count + 1

print(f'line Count: {count}')

line Count: 264
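
The same counting pattern can be packaged as a function, sketched here under stated assumptions: the name count_lines and the three-line sample file are hypothetical, standing in for countries.txt.

```python
import os
import tempfile

def count_lines(fname):
    """Count the lines in a text file by iterating over it."""
    count = 0
    with open(fname) as fh:
        for line in fh:
            count = count + 1
    return count

# Illustrative setup: a small sample file with made-up contents.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w') as fh:
    fh.write('Akrotiri\nAlbania\nAlgeria\n')

total = count_lines(path)
print(f'line Count: {total}')
```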

22 of 29

Searching Through a File

We can put an if statement in our for loop to only print lines that meet some criteria

with open('countries.txt') as fh:
    for line in fh:
        if line.startswith('Al'):
            print(line)

What is a good place to find out about string methods?

What must the startswith('Al') method return?

23 of 29

Skipping lines

We can skip a line by using the continue statement

with open('countries.txt') as fh:
    for line in fh:
        line = line.rstrip()
        # Skip things we are not interested in
        if 'United' in line:
            continue
        # Process our 'interesting' line
        print(line)

24 of 29

Using in to select lines

Another use for in is to check whether x is in y (x in y) or is not in y (not x in y)

with open('countries.txt') as fh:
    for line in fh:
        line = line.rstrip()
        if not 'Republic' in line:
            continue
        print(line)

Central African Republic;622984;98580000;...

Congo Democratic Republic of the;2345410;4168000000;6...

Congo Republic of the;342000;573600000;...

Czech Republic;78866;55330000000;..

Dominican Republic;48730;8912000000;...
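
The selection loop can be tried on a few lines without the full data file. In this sketch the sample file holds only four shortened lines (country name and area), and the list named selected is an assumption added so the result can be inspected.

```python
import os
import tempfile

# Illustrative setup: four shortened lines in a throwaway file.
path = os.path.join(tempfile.mkdtemp(), 'sample_countries.txt')
with open(path, 'w') as fh:
    fh.write('Akrotiri;123\n')
    fh.write('Albania;28748\n')
    fh.write('Czech Republic;78866\n')
    fh.write('Dominican Republic;48730\n')

selected = []
with open(path) as fh:
    for line in fh:
        line = line.rstrip()
        if 'Republic' not in line:   # same test as: not 'Republic' in line
            continue
        selected.append(line)

print(selected)
```

Writing the test as x not in y reads a little more naturally than not x in y and gives the same result.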

25 of 29

Testing to see if file can be opened

import os.path

fname = input('Enter the file name: ')
if os.path.isfile(fname):
    with open(fname) as fhand:
        count = 0
        for line in fhand:
            if line.startswith('C'):
                count = count + 1
    print(f'There are {count} countries starting with C in {fname}')
else:
    print(f'Error! The file {fname} cannot be opened.')
    exit()

Enter the file name: countries.txt
There are 24 countries starting with C in countries.txt

Enter the file name: notthere.txt
Error! The file notthere.txt cannot be opened.

26 of 29

Closing a file

After you are done accessing a file you should close it by using the close() method.

If you open() a file using a with statement, the file is closed automatically once the block of code within the with statement has finished executing.

fhand = open('countries.txt')
for line in fhand:
    if line.startswith('Al'):
        print(line)
fhand.close()

27 of 29

Files are consumed

with open("test.txt") as myFh:
    # file pointer at beginning of file
    somestring = myFh.read()
    # file pointer now at end of file

  • When opening a file, the OS sets a file pointer to the beginning of the file.

  • The file pointer moves along as you access the file.

  • Thus, to read the file from the start a second time, open it again; this resets the file pointer to the beginning of the file.

with open("test.txt") as myFh:
    # file pointer back at beginning
    for eachLine in myFh:
        print(eachLine.rstrip())
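
A brief sketch of what "consumed" means in practice, assuming a two-line throwaway file (made-up contents): a second read() on the same handle returns an empty string, while reopening the file starts from the beginning again.

```python
import os
import tempfile

# Illustrative setup: a two-line file in a throwaway folder.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
with open(path, 'w') as myFh:
    myFh.write('one\ntwo\n')

with open(path) as myFh:
    first = myFh.read()     # reads everything; pointer is now at the end
    second = myFh.read()    # nothing left to read, so this is ''

with open(path) as myFh:    # reopening puts the pointer back at the start
    again = myFh.read()

print(repr(first))
print(repr(second))
print(repr(again))
```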

28 of 29

Writing to a file (overwrite existing file)

The .write() method writes a string to the file.

Opening the file with the 'w' mode creates a brand new file with the given name (if the file already exists it is overwritten).

You need to explicitly write a newline \n to the file. Unlike print(), the write() method does not add one.

with open('users.dat','w') as fh:
    fh.write("Any string\n")
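
To see why the explicit \n matters, this sketch writes two strings without newlines and then with them. The names alice and bob and the temporary location of users.dat are made up for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'users.dat')  # throwaway copy

# Without newlines the two strings run together on one line.
with open(path, 'w') as fh:
    fh.write('alice')
    fh.write('bob')
with open(path) as fh:
    joined = fh.read()

# Opening with 'w' again overwrites the file; this time each write ends a line.
with open(path, 'w') as fh:
    fh.write('alice\n')
    fh.write('bob\n')
with open(path) as fh:
    separate = fh.read()

print(repr(joined))
print(repr(separate))
```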

29 of 29

Writing to a file (append to existing file)

The .write() method writes a string to the file.

Opening the file with the 'a' mode allows you to append to an existing file (if the file does not yet exist, it will be created).

[Figure: the contents of users.dat after this code runs, shown for two cases: if users.dat already exists, and if users.dat does not yet exist]

with open('users.dat','a') as fh:
    toFile = "Another string"
    fh.write(toFile)
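
The two cases can be checked directly. This is a sketch under stated assumptions: both files live in a temporary folder, and new_users.dat is a hypothetical name standing in for a file that does not yet exist.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
toFile = "Another string"

# Case 1: the file already exists, so the new text is added at the end.
existing = os.path.join(folder, 'users.dat')
with open(existing, 'w') as fh:
    fh.write("Any string\n")
with open(existing, 'a') as fh:
    fh.write(toFile)
with open(existing) as fh:
    after_existing = fh.read()

# Case 2: the file does not exist yet, so 'a' creates it.
fresh = os.path.join(folder, 'new_users.dat')  # hypothetical file name
with open(fresh, 'a') as fh:
    fh.write(toFile)
with open(fresh) as fh:
    after_fresh = fh.read()

print(repr(after_existing))
print(repr(after_fresh))
```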