Fall 2016
INFO I427:
Pythonic Thinking
To be Explicit
To Choose Simple over Complex
To Maximize Readability
PYTHONIC STYLE
>>> import this
The Zen of Python,
by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Whitespace
Naming
Expressions and statements
PEP 8
Pylint tool:
Pylint provides automated enforcement of the PEP 8 style guide and detects many other types of common errors in Python programs.
from urllib.parse import parse_qs
my_values = parse_qs('red=5&blue=0&green=',
                     keep_blank_values=True)
print(repr(my_values))
>>>
{'red': ['5'], 'blue': ['0'], 'green': ['']}
WRITE A HELPER FUNCTION INSTEAD OF A COMPLEX EXPRESSION
print('Red: ', my_values.get('red'))
print('Green: ', my_values.get('green'))
>>>
Red:  ['5']
Green:  ['']
red = int(my_values.get('red', [''])[0] or 0)
green = int(my_values.get('green', [''])[0] or 0)
The trick here is that the empty string, the empty list, and zero all evaluate to False implicitly, so the expression falls back to 0 when the first value is an empty string.
def get_first_int(values, key, default=0):
    found = values.get(key, [''])
    if found[0]:
        found = int(found[0])
    else:
        found = default
    return found
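A short usage sketch for the helper, restated here so the block runs on its own. The opacity key and its default of 1 are hypothetical, added only to show the default path.

```python
from urllib.parse import parse_qs


def get_first_int(values, key, default=0):
    """Return the first value stored under key as an int, or default if blank or missing."""
    found = values.get(key, [''])
    if found[0]:
        return int(found[0])
    return default


my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)

red = get_first_int(my_values, 'red')                      # present and non-empty
green = get_first_int(my_values, 'green')                  # present but blank
opacity = get_first_int(my_values, 'opacity', default=1)   # hypothetical key, absent

print(red, green, opacity)  # 5 0 1
```

Each call site now reads as one clear intention instead of a chained get / index / or expression.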
Slicing lets you access a subset of a sequence’s items with minimal effort.
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
somelist[-0:] or somelist[:] results in a copy of the original list.
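A minimal sketch of common slices on the list above. The variable names other than a are made up for illustration.

```python
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

first_four = a[:4]    # start defaults to 0
last_four = a[-4:]    # negative indexes count from the end
middle = a[3:-3]      # the end index is exclusive

copy_of_a = a[:]      # a new list holding the same items
copy_of_a[0] = 'z'    # changing the copy leaves a untouched

print(first_four)  # ['a', 'b', 'c', 'd']
print(last_four)   # ['e', 'f', 'g', 'h']
print(middle)      # ['d', 'e']
print(a[0])        # a
```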
HOW TO SLICE A SEQUENCE
Python provides a compact syntax, the list comprehension, for deriving one list from another.
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
squares = [x**2 for x in a]
With map, the same computation needs a lambda:
squares = map(lambda x: x ** 2, a)
LIST COMPREHENSION instead of MAP and FILTER
Suppose you only want to compute the squares of the numbers that are divisible by 2:
even_squares = [x**2 for x in a if x % 2 == 0]
alt = map(lambda x: x**2, filter(lambda x: x % 2 == 0, a))
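A quick check that the two forms agree. Note that in Python 3, map and filter return lazy iterators, so this sketch wraps the result in list() before comparing.

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Comprehension: filter and transform in one readable expression
even_squares = [x**2 for x in a if x % 2 == 0]

# map + filter: same result, but needs two lambdas and a list() call
alt = list(map(lambda x: x**2, filter(lambda x: x % 2 == 0, a)))

print(even_squares)         # [4, 16, 36, 64, 100]
print(alt == even_squares)  # True
```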
The problem with list comprehensions: they build the entire result list in memory, which can be costly for large inputs.
value = [len(x) for x in open('/tmp/my_file.txt')]
print(value)
>>>
[100, 57, 15, 1, 12, 75, 5, 86, 89, 11]
CONSIDER GENERATOR EXPRESSION FOR LARGE COMPREHENSIONS
The solution: a generator expression
it = (len(x) for x in open('/tmp/my_file.txt'))
print(it)
>>>
<generator object <genexpr> at 0x101b81480>
print(next(it))
print(next(it))
>>>
100
57
Generator expressions can be composed together. Continuing with the same iterator it, which has already been advanced twice:
roots = ((x, x**0.5) for x in it)
print(next(roots))
>>>
(15, 3.872983346207417)
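The same idea can be tried without a file on disk. This sketch assumes an in-memory io.StringIO object as a stand-in for the open file; the line contents are invented so that the lengths are perfect squares.

```python
import io

# Three lines whose lengths (including the newline) are 4, 9, and 16
text = 'x' * 3 + '\n' + 'x' * 8 + '\n' + 'x' * 15 + '\n'
fake_file = io.StringIO(text)  # iterates line by line, like an open file

it = (len(line) for line in fake_file)   # nothing has been read yet
roots = ((n, n ** 0.5) for n in it)      # chained on top, still lazy

first = next(it)         # reads only the first line
second = next(roots)     # advances it by one more line
remaining = list(roots)  # drains whatever is left

print(first)      # 4
print(second)     # (9, 3.0)
print(remaining)  # [(16, 4.0)]
```

Advancing roots also advances it, because both share the same underlying iterator.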
flavor_list = ['vanilla', 'chocolate', 'pecan', 'strawberry']
for i in range(len(flavor_list)):
    flavor = flavor_list[i]
    print('%d: %s' % (i + 1, flavor))
PREFER ENUMERATE OVER RANGE
enumerate wraps any iterator with a lazy generator. This generator yields pairs of the loop index and the next value from the iterator.
for i, flavor in enumerate(flavor_list):
    print('%d: %s' % (i + 1, flavor))
The second argument to enumerate sets the number from which counting starts:
for i, flavor in enumerate(flavor_list, 1):
    print('%d: %s' % (i, flavor))
enumerate provides concise syntax for looping over an iterator and getting the
index of each item from the iterator as you go.
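To see the lazy pairs directly, the enumerate object can be stepped by hand with next. A small sketch; the names first_pair, second_pair, and numbered are illustrative only.

```python
flavor_list = ['vanilla', 'chocolate', 'pecan', 'strawberry']

it = enumerate(flavor_list)
first_pair = next(it)    # pairs are produced one at a time
second_pair = next(it)

# Starting the count at 1 removes the need for i + 1
numbered = ['%d: %s' % (i, flavor) for i, flavor in enumerate(flavor_list, 1)]

print(first_pair)   # (0, 'vanilla')
print(second_pair)  # (1, 'chocolate')
print(numbered[0])  # 1: vanilla
```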
names = ['Cecilia', 'Lise', 'Marie']
letters = [len(n) for n in names]
longest_name = None
max_letters = 0
for i in range(len(names)):
    count = letters[i]
    if count > max_letters:
        longest_name = names[i]
        max_letters = count
print(longest_name)
>>>
Cecilia
USE ZIP TO PROCESS ITERATORS IN PARALLEL
for i, name in enumerate(names):
    count = letters[i]
    if count > max_letters:
        longest_name = name
        max_letters = count
In Python 3, zip wraps two or more iterators with a lazy generator. The zip generator yields tuples containing the next value from each iterator.
for name, count in zip(names, letters):
    if count > max_letters:
        longest_name = name
        max_letters = count
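zip stops as soon as the shortest input is exhausted. The zip_longest function from the itertools built-in module lets you iterate over multiple iterators in parallel regardless of their lengths, filling the missing values with None by default.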
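A sketch of the difference when the inputs have unequal lengths. The extra name 'Rosalind' is a hypothetical addition so that names is one item longer than letters.

```python
from itertools import zip_longest

names = ['Cecilia', 'Lise', 'Marie', 'Rosalind']  # 'Rosalind' is made up for this demo
letters = [7, 4, 5]                               # deliberately one item short

short = list(zip(names, letters))                        # stops at the shorter input
full = list(zip_longest(names, letters))                 # pads with None
padded = list(zip_longest(names, letters, fillvalue=0))  # pads with a chosen value

print(len(short))  # 3
print(full[-1])    # ('Rosalind', None)
print(padded[-1])  # ('Rosalind', 0)
```

With plain zip the last name is dropped without any error, which is easy to miss.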
FUNCTIONS
def divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None
x, y = 0, 5
result = divide(x, y)
if not result:
    print('Invalid inputs')  # This is wrong! 0 / 5 is a valid result (0.0)
PREFER EXCEPTION to RETURN NONE
Solution: raise an exception instead of returning None.
def divide(a, b):
    try:
        return a / b
    except ZeroDivisionError as e:
        raise ValueError('Invalid inputs') from e
x, y = 5, 2
try:
    result = divide(x, y)
except ValueError:
    print('Invalid inputs')
else:
    print('Result is %.1f' % result)
>>>
Result is 2.5
KNOW HOW CLOSURES INTERACT WITH VARIABLE SCOPE
def sort_priority(values, group):
    def helper(x):
        if x in group:
            return (0, x)
        return (1, x)
    values.sort(key=helper)
numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 3, 5, 7}
sort_priority(numbers, group)
print(numbers)
>>>
[2, 3, 5, 7, 1, 4, 6, 8]
Python supports closures: functions that refer to variables from the scope in which they were defined. This is why the helper function is able to access the group argument to sort_priority.
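A smaller closure, as a sketch. The names make_threshold_filter, keep, and at_least_five are hypothetical and not part of the slides.

```python
def make_threshold_filter(limit):
    # limit belongs to the enclosing scope; keep() still sees it
    # after make_threshold_filter has returned
    def keep(value):
        return value >= limit
    return keep


at_least_five = make_threshold_filter(5)
kept = [n for n in [8, 3, 1, 5] if at_least_five(n)]

print(kept)  # [8, 5]
```

The inner function reads limit in the same way helper reads group in sort_priority.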
Functions are first-class objects in Python: you can refer to them directly, assign them to variables, pass them as arguments to other functions, compare them in expressions and if statements, etc. This is how the sort method can accept a closure function as the key argument.
Python has specific rules for comparing tuples: it first compares items in index zero, then index one, then index two, and so on. This is why the return value from the helper closure causes the sort order to have two distinct groups.
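The tuple comparison rule can be checked directly. In this sketch the tuples mimic the (0, x) and (1, x) keys that helper returns; the sample values are arbitrary.

```python
# Index zero decides first; index one only matters on a tie
print((0, 7) < (1, 1))  # True: 0 < 1, so the second items are never compared
print((1, 1) < (1, 8))  # True: first items tie, so 1 < 8 decides

keys = [(1, 8), (0, 3), (1, 1), (0, 2)]
ordered = sorted(keys)
print(ordered)  # [(0, 2), (0, 3), (1, 1), (1, 8)]
```

Every key starting with 0 sorts ahead of every key starting with 1, which gives the two groups.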
Suppose you want to find the index of every word in a string:
def index_words(text):
    result = []
    if text:
        result.append(0)
    for index, letter in enumerate(text):
        if letter == ' ':
            result.append(index + 1)
    return result
address = 'Four score and seven years ago...'
result = index_words(address)
print(result[:3])
>>>
[0, 5, 11]
CONSIDER GENERATORS INSTEAD OF RETURNING LIST
Generators are functions that use yield expressions.
When called, generator functions do not actually run but instead immediately return an iterator.
With each call to the next built-in function, the iterator will advance the generator to its next yield expression.
Each value passed to yield by the generator will be returned by the iterator to the caller.
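The behavior described above can be observed step by step. A minimal sketch; countdown and log are hypothetical names used only for this demonstration.

```python
log = []


def countdown(start):
    log.append('body started')  # runs on the first next() call, not at call time
    while start > 0:
        yield start
        start -= 1


it = countdown(2)            # returns an iterator immediately
ran_before_next = list(log)  # snapshot: the body has not run yet

first = next(it)   # runs the body up to the first yield
second = next(it)  # resumes after the yield and runs to the next one
rest = list(it)    # the loop ends, so nothing is left

print(ran_before_next)  # []
print(first, second)    # 2 1
print(rest)             # []
```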
def index_words_iter(text):
    if text:
        yield 0
    for index, letter in enumerate(text):
        if letter == ' ':
            yield index + 1
result = list(index_words_iter(address))
def index_file(handle):
    offset = 0
    for line in handle:
        if line:
            yield offset
        for letter in line:
            offset += 1
            if letter == ' ':
                yield offset
from itertools import islice
with open('/tmp/address.txt', 'r') as f:
    it = index_file(f)
    results = islice(it, 0, 3)
    print(list(results))
>>>
[0, 5, 11]
def normalize(numbers):
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result
visits = [15, 35, 80]
percentages = normalize(visits)
print(percentages)
>>>
[11.538461538461538, 26.923076923076923, 61.53846153846154]
BE DEFENSIVE WHEN ITERATING OVER ARGUMENTS
def read_visits(data_path):
    with open(data_path) as f:
        for line in f:
            yield int(line)
it = read_visits('/tmp/my_numbers.txt')
percentages = normalize(it)
print(percentages)
>>>
[]
The cause of this behavior is that an iterator only produces its results a single time. If you iterate over an iterator or generator that has already raised a StopIteration exception, you won’t get any results the second time around.
it = read_visits('/tmp/my_numbers.txt')
print(list(it))
print(list(it))  # Already exhausted
>>>
[15, 35, 80]
[]
What’s confusing is that you also won’t get any errors when you iterate over an already exhausted iterator. for loops, the list constructor, and many other functions throughout the Python standard library expect the StopIteration exception to be raised during normal operation. These functions can’t tell the difference between an iterator that has no output and an iterator that had output and is now exhausted.
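The ambiguity is easy to reproduce without a file. In this sketch read_visits_from is a hypothetical in-memory stand-in for read_visits.

```python
def read_visits_from(lines):
    # Hypothetical stand-in: yields ints from a list of strings instead of a file
    for line in lines:
        yield int(line)


it = read_visits_from(['15', '35', '80'])
first_pass = list(it)
second_pass = list(it)                  # exhausted: no error, just nothing
never_had_data = list(read_visits_from([]))

print(first_pass)                     # [15, 35, 80]
print(second_pass)                    # []
print(second_pass == never_had_data)  # True
```

From the outside, an exhausted iterator and an empty one look identical.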
def normalize_copy(numbers):
    numbers = list(numbers)  # Copy the iterator
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result
it = read_visits('/tmp/my_numbers.txt')
percentages = normalize_copy(it)
print(percentages)
>>>
[11.538461538461538, 26.923076923076923, 61.53846153846154]
CAN WE DO BETTER?
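One possible answer, offered as a sketch rather than as the intended solution: pass a container whose __iter__ method builds a fresh iterator on every pass, and have the function reject bare iterators. The names ReadVisits and normalize_defensive are assumptions, and the container reads from an in-memory list here instead of a file.

```python
class ReadVisits:
    """Hypothetical container: every iteration starts again from the source data."""

    def __init__(self, lines):
        self.lines = lines

    def __iter__(self):
        for line in self.lines:
            yield int(line)


def normalize_defensive(numbers):
    # An iterator returns itself from iter(); a container returns a new object
    if iter(numbers) is numbers:
        raise TypeError('Must supply a container')
    total = sum(numbers)  # first pass
    return [100 * value / total for value in numbers]  # second, fresh pass


visits = ReadVisits(['15', '35', '80'])
percentages = normalize_defensive(visits)
print(percentages)
```

This avoids copying the whole input into a list, at the cost of reading the source twice.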