1 of 17

File I/O

Andrew S. Fitz Gibbon

UW CSE 160

Winter 2024


2 of 17

File Input and Output

  • As a programmer, when would one use a file?
  • As a programmer, what does one do with a file?

(What even is a file?)


3 of 17

Files store information when a program is not running

Important operations:

  • open a file

  • close a file

  • read data

  • write data


4 of 17

Files and filenames

  • A file object represents data on your disk drive
    • It is an object in your Python program that you create
    • Can read from it and write to it in your program
  • A filename (usually a string) states where to find the data on your drive
    • Can be used to find/create a file
    • Examples of filenames:
      • Linux/Mac: "/home/asfg/class/160/lectures/file_io.pptx"
      • Windows: "C:\Users\asfg\My Documents\cute_dog.jpg"
      • Linux/Mac: "homework3/images/Husky.png"
      • "Husky.png"


5 of 17

Two types of filenames

An Absolute filename gives a specific and complete location in the file system:

  • "/home/asfg/class/160/24wi/lectures/file_io.pptx"
  • "C:\Users\asfg\My Documents\homework3\images\Husky.png"
    • Starts with “/” (Unix) or “C:\” (Windows)
    • Warning: code will fail to find the file if you move or rename files or run your program on a different computer

A Relative filename gives a location relative to the current working directory:

  • "lectures/file_io.pptx"
  • "images\Husky.png"
  • "data\test-small.fastq"
    • Warning: code will fail to find the file unless you run your program from a directory that contains the given contents

  • A relative filename is often a better choice
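One way to check which kind of filename a string is, sketched here with the standard-library pathlib module (which these slides do not otherwise use). The "pure" path classes only inspect the text of the path and never touch the disk, so the Windows examples can be checked on any machine.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Filenames taken from the slide above.
unix_absolute = PurePosixPath("/home/asfg/class/160/24wi/lectures/file_io.pptx")
unix_relative = PurePosixPath("lectures/file_io.pptx")
windows_absolute = PureWindowsPath(r"C:\Users\asfg\My Documents\homework3\images\Husky.png")
windows_relative = PureWindowsPath(r"images\Husky.png")

# is_absolute() looks only at how the path starts ("/" or a drive like "C:\").
for path in [unix_absolute, unix_relative, windows_absolute, windows_relative]:
    print(path, "-> absolute?", path.is_absolute())
```

Note the r prefix on the Windows strings. In an ordinary Python string literal a backslash begins an escape sequence, and "\U" in "C:\Users" is a syntax error in Python 3, so Windows filenames are usually written as raw strings or with doubled backslashes.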


6 of 17

Examples

Linux/Mac: These could all refer to the same file:

"/home/asfg/class/160/homework3/images/Husky.png"

"homework3/images/Husky.png"

"images/Husky.png"

"Husky.png"

Windows: These could all refer to the same file:

"C:\Users\asfg\My Documents\class\160\homework3\images\Husky.png"

"homework3\images\Husky.png"

"images\Husky.png"

"Husky.png"


7 of 17

Aside: “Current Working Directory” in Python

Current Working Directory (sometimes also called the "Present" working directory): the directory from which you ran Python, which is not necessarily the directory that contains your program

To determine it from a Python program:

import os

print("The current working directory is", os.getcwd())

Might print:

The current working directory is /Users/johndoe/Documents


os stands for “operating system”
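To see how a relative filename combines with the current working directory, here is a minimal sketch. The filename is only an illustration and does not need to exist; os.path.abspath just builds the full path that Python would look for.

```python
import os

relative_name = "images/Husky.png"  # hypothetical file; it does not need to exist

# os.path.abspath joins the name onto the current working directory
# and tidies up the result.
resolved = os.path.abspath(relative_name)
expected = os.path.normpath(os.path.join(os.getcwd(), relative_name))

print("Relative name:", relative_name)
print("Python would look for it at:", resolved)
```

Running the same program from a different directory changes the second line of output, which is why a relative filename only works when the program is run from the right place.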

8 of 17

Opening a file in Python

To open a file for reading:

# Open takes a filename and returns a file object.

# This fails if the file cannot be found & opened.

myfile = open("datafile.dat")

  • Or equivalently:

myfile = open("datafile.dat", "r")

To open a file for writing:

# Will create datafile.dat if it does not already
# exist. If datafile.dat already exists, then it
# will be OVERWRITTEN.

myfile = open("datafile.dat", "w")

# If datafile.dat already exists, then we will
# append what we write to the end of that file.

myfile = open("datafile.dat", "a")


By default, file is opened for reading

Adding "r" makes read-only explicit.

Adding "a" opens for appending.
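The difference between the three modes can be checked with a small experiment. This is a sketch that works in a temporary directory so no real file is touched; the name "modes_demo.txt" is made up for the example.

```python
import os
import tempfile

# Work in a temporary directory so no real file is overwritten.
demo_dir = tempfile.mkdtemp()
filename = os.path.join(demo_dir, "modes_demo.txt")  # hypothetical name

myfile = open(filename, "w")   # "w" creates the file
myfile.write("first\n")
myfile.close()

myfile = open(filename, "a")   # "a" keeps "first" and adds to the end
myfile.write("second\n")
myfile.close()

myfile = open(filename)        # same as open(filename, "r")
after_append = myfile.read()
myfile.close()

myfile = open(filename, "w")   # "w" again: existing contents are discarded
myfile.write("third\n")
myfile.close()

myfile = open(filename)
after_overwrite = myfile.read()
myfile.close()

print(repr(after_append))      # 'first\nsecond\n'
print(repr(after_overwrite))   # 'third\n'
```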

9 of 17

Reading a file in Python

# Open takes a filename and returns a file object.

# This fails if the file cannot be found & opened.

myfile = open("datafile.dat")

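# Approach 1: Process one line at a time
for line_of_text in myfile:
    pass  # process line_of_text

# Approach 2: Process entire file at once
# (Use this instead of Approach 1, not after it: once the loop
# above has read to the end of the file, read() returns "".)
all_data_as_a_big_string = myfile.read()

myfile.close() # close the file when done reading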

Assumption: file is a sequence of lines

Where does Python expect to find this file (note the relative pathname)?
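The two approaches see the same data, just in different shapes. The sketch below first creates a small sample file in a temporary directory (its contents are made up for the example), then reads it both ways, opening the file once for each approach.

```python
import os
import tempfile

# Create a small sample file to read (hypothetical contents).
demo_dir = tempfile.mkdtemp()
filename = os.path.join(demo_dir, "datafile.dat")
myfile = open(filename, "w")
myfile.write("line one\nline two\nline three\n")
myfile.close()

# Approach 1: one line at a time. Each line keeps its trailing "\n".
lines = []
myfile = open(filename)
for line_of_text in myfile:
    lines.append(line_of_text)
myfile.close()

# Approach 2: the entire file as one big string.
myfile = open(filename)
all_data_as_a_big_string = myfile.read()
myfile.close()

print(lines)
print(repr(all_data_as_a_big_string))
```

Joining the lines from Approach 1 back together gives exactly the string from Approach 2.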


10 of 17

A simple file-reading example

# Reads in file one line at a time and

# prints the contents of the file.

in_file = "student_info.txt"

myfile = open(in_file)

for line_of_text in myfile:
    print(line_of_text)

myfile.close()
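Each line read from a file keeps its trailing newline, and print adds another one, so the loop above shows a blank line after every line of the file. A minimal sketch of one way to avoid that, assuming a sample file with made-up contents, is to strip the newline before printing:

```python
import os
import tempfile

# Create a sample file (hypothetical contents) in a temporary directory.
demo_dir = tempfile.mkdtemp()
in_file = os.path.join(demo_dir, "student_info.txt")
myfile = open(in_file, "w")
myfile.write("Ada\nGrace\n")
myfile.close()

cleaned_lines = []
myfile = open(in_file)
for line_of_text in myfile:
    print(repr(line_of_text))            # repr shows the trailing '\n'
    cleaned = line_of_text.rstrip("\n")  # drop the newline before printing
    print(cleaned)
    cleaned_lines.append(cleaned)
myfile.close()
```

Another option is print(line_of_text, end=""), which tells print not to add its own newline.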


11 of 17

A file-reading example

# Count the number of words in a text file

in_file = "thesis.txt"

myfile = open(in_file)

num_words = 0

for line_of_text in myfile:
    word_list = line_of_text.split()
    num_words += len(word_list)

myfile.close()

print("Total words in file: ", num_words)


12 of 17

Reading a file multiple times

You can iterate over a list as many times as you like:

mylist = [3, 1, 4, 1, 5, 9]

for elt in mylist:
    pass  # process elt

for elt in mylist:
    pass  # process elt

Iterating over a file uses it up:

myfile = open("datafile.dat")

for line_of_text in myfile:
    pass  # process line_of_text

# This loop body will never be executed!
# The first loop already read to the end of the file.
for line_of_text in myfile:
    pass  # process line_of_text

myfile.close()

In general, try to avoid reading a file more than one time. Reading files is slow.
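The "used up" behavior is easy to confirm by counting how many lines each loop sees. This sketch creates a three-line sample file (made-up contents) in a temporary directory and loops over the same file object twice.

```python
import os
import tempfile

# Create a three-line sample file (hypothetical contents).
demo_dir = tempfile.mkdtemp()
filename = os.path.join(demo_dir, "datafile.dat")
myfile = open(filename, "w")
myfile.write("a\nb\nc\n")
myfile.close()

myfile = open(filename)

first_pass = 0
for line_of_text in myfile:
    first_pass += 1

# The file object is now at the end of the file,
# so this loop has nothing left to read.
second_pass = 0
for line_of_text in myfile:
    second_pass += 1

myfile.close()

print("first pass:", first_pass)    # first pass: 3
print("second pass:", second_pass)  # second pass: 0
```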

13 of 17

Reading a file multiple times

How to read a file multiple times?

Solution 1: Read into a list, then iterate over it

myfile = open("datafile.dat")
mylines = []
for line_of_text in myfile:
    mylines.append(line_of_text)
myfile.close()

for line_of_text in mylines:
    pass  # process line_of_text

for line_of_text in mylines:
    pass  # process line_of_text
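A sketch of Solution 1 in action, using a sample file with made-up contents in a temporary directory. It also shows readlines(), a file method that builds the same list in one call; treat it as an optional shortcut rather than the approach the slide prescribes.

```python
import os
import tempfile

# Create a small sample file (hypothetical contents).
demo_dir = tempfile.mkdtemp()
filename = os.path.join(demo_dir, "datafile.dat")
myfile = open(filename, "w")
myfile.write("alpha\nbeta\ngamma\n")
myfile.close()

# Solution 1 as written above: append each line to a list.
myfile = open(filename)
mylines = []
for line_of_text in myfile:
    mylines.append(line_of_text)
myfile.close()

# Shorter alternative: readlines() returns the same list in one call.
myfile = open(filename)
mylines_short = myfile.readlines()
myfile.close()

# The list can be iterated over as many times as needed.
line_count = 0
for line_of_text in mylines:
    line_count += 1

character_count = 0
for line_of_text in mylines:
    character_count += len(line_of_text)

print(mylines == mylines_short)     # True
print(line_count, character_count)  # 3 17
```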

Solution 2: Re-create the file object (slower, but a better choice if the file does not fit in memory)

myfile = open("datafile.dat")
for line_of_text in myfile:
    pass  # process line_of_text
myfile.close()

myfile = open("datafile.dat")
for line_of_text in myfile:
    pass  # process line_of_text
myfile.close()


14 of 17

Writing to a file in Python

# Open for writing; replaces any existing file of this name.
# (Give no second argument, or "r", for reading.)
myfile = open("output.dat", "w")

# Similar to printing output.
# The next thing written will be on this same line.
myfile.write("a bunch of data")

# But you must add a newline if desired, unlike print.
# "\n" means end of line (newline).
myfile.write("a line of text\n")

# The argument must be a string.
# Incorrect; results in:
#   TypeError: write() argument must be str, not int
# (print(4) works because print converts its argument to a
# string for you; write does not.)
# myfile.write(4)

# Correct: the argument is a string.
myfile.write(str(4))

# Close when done with all writing.
myfile.close()
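Putting the rules for write together, here is a minimal sketch that writes a list of numbers one per line and then shows what happens when a non-string is passed. It writes into a temporary directory, the values are made up, and it uses try/except (not covered in these slides) only to capture the error message without stopping the program.

```python
import os
import tempfile

demo_dir = tempfile.mkdtemp()
filename = os.path.join(demo_dir, "output.dat")

values = [3, 1, 4]  # hypothetical data to write

myfile = open(filename, "w")
for value in values:
    # Convert to a string and add the newline ourselves.
    myfile.write(str(value) + "\n")
myfile.close()

# Passing a non-string raises TypeError and writes nothing.
myfile = open(filename, "a")
try:
    myfile.write(4)
    error_message = ""
except TypeError as error:
    error_message = str(error)
myfile.close()

myfile = open(filename)
contents = myfile.read()
myfile.close()

print(repr(contents))  # '3\n1\n4\n'
print(error_message)
```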

15 of 17

# Count the number of words in a text file and

# make a list of all the words in the file

num_words = 0

word_list = []

silly_file = open("silly.txt", "r")

for line in silly_file:
    print(line, end="")
    # what should come next? (Hint: use split())

silly_file.close()

print("Total words in file: ", num_words)


16 of 17


num_words = 0

word_list = []

silly_file = open("silly.txt", "r")

for line in silly_file:
    new_words = line.split()
    word_list.extend(new_words)
    num_words = num_words + len(new_words)

silly_file.close()

print("Total word count:", num_words)

print(word_list)

17 of 17

Sample input file (apparently the contents of silly.txt):

This is a silly file.

Here is some more silly text.

And even another silly line.

The fourth silly line.
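The word-counting code can be checked against the four lines above. This sketch wraps the same loop in a function (the name count_words is made up for this example) and runs it on a temporary copy of those four lines.

```python
import os
import tempfile


def count_words(filename):
    """Return the number of words and the list of words in a text file."""
    num_words = 0
    word_list = []
    text_file = open(filename, "r")
    for line in text_file:
        new_words = line.split()   # split on whitespace
        word_list.extend(new_words)
        num_words = num_words + len(new_words)
    text_file.close()
    return num_words, word_list


# Write the four sample lines to a temporary file.
demo_dir = tempfile.mkdtemp()
filename = os.path.join(demo_dir, "silly.txt")
silly_file = open(filename, "w")
silly_file.write("This is a silly file.\n")
silly_file.write("Here is some more silly text.\n")
silly_file.write("And even another silly line.\n")
silly_file.write("The fourth silly line.\n")
silly_file.close()

num_words, word_list = count_words(filename)
print("Total word count:", num_words)  # Total word count: 20
print(word_list[:5])                   # ['This', 'is', 'a', 'silly', 'file.']
```

Because split() only separates on whitespace, punctuation stays attached to the word, which is why the fifth word is 'file.' rather than 'file'.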
