1 of 49

The Adventures of a Python Script!

Dema Abu Adas | @human_dema

2 of 49

Who am I

  • 4th year software engineering student at the University of Guelph
  • BangBangCon West Organizer

3 of 49

bit.ly/demastalk

4 of 49

Python Interpreters

5 of 49

Python Interpreters:

Stackless Python

  • Doesn’t use the C stack
  • Runs hundreds of thousands of tiny tasks, called “tasklets”, in a single thread
    • Tasklets can run completely decoupled!

6 of 49

Python Interpreters:

Jython

  • Java implementation of Python
  • Jython programs can interact with Java packages/applications

7 of 49

Python Interpreters:

IronPython

  • Open source implementation that integrates with the .NET Framework
  • IronPython can use .NET languages and vice versa

8 of 49

Python Interpreters:

PyPy

  • Faaast !
  • Can use memory more efficiently
  • Also Stackless!

9 of 49

Python Interpreters:

CPython

  • Official implementation of Python!
  • Written in C

10 of 49

How does CPython work?

def hello_PyGotham():
    print("Hello PyGotham")

hello_PyGotham()

11 of 49

Tokenizing

0,0-0,0: ENCODING 'utf-8'
1,0-1,3: NAME 'def'
1,4-1,18: NAME 'hello_PyGotham'
1,18-1,19: LPAR '('
1,19-1,20: RPAR ')'
1,20-1,21: COLON ':'
1,21-1,22: NEWLINE '\n'
2,0-2,4: INDENT '    '
2,4-2,9: NAME 'print'
2,9-2,10: LPAR '('
2,10-2,26: STRING '"Hello PyGotham"'
2,26-2,27: RPAR ')'
2,27-2,28: NEWLINE '\n'
3,1-3,2: NL '\n'
4,0-4,0: DEDENT ''
4,0-4,14: NAME 'hello_PyGotham'
4,14-4,15: LPAR '('
4,15-4,16: RPAR ')'
4,16-4,17: NEWLINE '\n'
5,0-5,0: ENDMARKER ''

def hello_PyGotham():
    print("Hello PyGotham")

hello_PyGotham()
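
A token listing like the one above can be reproduced with the standard library's tokenize module. This is a minimal sketch: it tokenizes from a string, so no ENCODING token is emitted, and the exact positions depend on the whitespace in the source you feed it.

```python
import io
import tokenize

source = 'def hello_PyGotham():\n    print("Hello PyGotham")\n\nhello_PyGotham()\n'

# generate_tokens() expects a readline callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # exact_type distinguishes operators such as LPAR and COLON,
    # where the plain type would only report OP.
    name = tokenize.tok_name[tok.exact_type]
    print(f"{tok.start}-{tok.end}: {name} {tok.string!r}")
```

Running python -m tokenize -e on a file prints a similar listing from the command line.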

12 of 49

Tokens -> Parse Tree

[257, [269, [295, [263, [1, 'def'], [1, 'hello_PyGotham'], [264, [7, '('], [8, ')']], [11, ':'], [304, [4, ''], [5, ''], [269, [270, [271, [272, [274, [305, [309, [310, [311, [312, [315, [316, [317, [318, [319, [320, [321, [322, [323, [324, [1, 'print']], [326, [7, '('], [334, [335, [305, [309, [310, [311, [312, [315, [316, [317, [318, [319, [320, [321, [322, [323, [324, [3, '"Hello PyGotham!"']]]]]]]]]]]]]]]]]], [8, ')']]]]]]]]]]]]]]]]]]], [4, '']]], [6, '']]]]],[269, [270, [271, [272, [274, [305, [309, [310, [311, [312, [315, [316, [317, [318, [319, [320, [321, [322, [323, [324, [1, 'hello_PyGotham']], [326, [7, '('], [8, ')']]]]]]]]]]]]]]]]]]], [4, '']]], [4, ''], [0, '']]

def hello_PyGotham():
    print("Hello PyGotham")

hello_PyGotham()

15 of 49

Parse Tree Generation

def hello_PyGotham():
    print("Hello PyGotham")

hello_PyGotham()

17 of 49

Parse Tree’s Responsibilities

  • All syntax analysis happens at this stage
  • Catches problems such as indentation errors and misspelled keywords

SyntaxError: invalid syntax

def hello_PyGotham():
    print("Hello PyGotham")

hello_PyGotham()
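
To see that syntax errors are caught before any code runs, you can hand a broken version of the example to the built-in compile(). This is a small sketch; the missing colon is an error introduced for illustration, and the exact error message varies between Python versions.

```python
# The colon after the parameter list is deliberately missing.
broken_source = 'def hello_PyGotham()\n    print("Hello PyGotham")\n'

error = None
try:
    # compile() tokenizes and parses the source without executing it.
    compile(broken_source, "<example>", "exec")
except SyntaxError as exc:
    error = exc
    print(type(exc).__name__, "on line", exc.lineno)
```

Nothing is printed by the function itself, because the source never gets past parsing.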

18 of 49

Parse Tree -> AST

def hello_PyGotham():
    print("Hello PyGotham")

hello_PyGotham()
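
The AST for this example can be inspected with the standard library's ast module. A minimal sketch; the exact text printed by ast.dump() differs slightly between Python versions.

```python
import ast

source = 'def hello_PyGotham():\n    print("Hello PyGotham")\n\nhello_PyGotham()\n'
tree = ast.parse(source)

# The module body holds one node per top-level statement.
node_types = [type(node).__name__ for node in tree.body]
print(node_types)  # ['FunctionDef', 'Expr']

# Dump the second statement, the call to hello_PyGotham().
print(ast.dump(tree.body[1]))
```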

19 of 49

AST Responsibilities

def hello_PyGotham():
    print("Hello PyGotham")

hello_PyGotham()

  • Things that are syntactically correct but don’t make sense logically

TypeError: hello_PyGotham() takes no arguments (1 given)

20 of 49

AST -> Control Flow Graphs

def hello_PyGotham():
    print("Hello PyGotham")

hello_PyGotham()

[Diagram: control flow graph of the example. The module-level block (def hello_PyGotham(): ..., hello_PyGotham()) calls hello_PyGotham, whose block (print("Hello PyGotham")) calls print.]

21 of 49

Another Example

def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib_gen = fib()
for _ in range(10):
    next(fib_gen)

22 of 49

Another Example

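[Diagram: control flow graph of the fib example, with blocks for the module-level code (def fib(): ..., fib_gen = fib(), for _ in range(10):), the call next(fib_gen), and the body of fib (a, b = 0, 1; while True:; yield a; a, b = b, a + b).]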

23 of 49

Control Flow Graphs

  • One point of entry and exit
  • Can have many routes in between
  • Usually the last step before bytecode

24 of 49

CFG -> Bytecode generation!

def hello_PyGotham():
    print("Hello PyGotham")

hello_PyGotham()

  4           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('Hello PyGotham')
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
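
A listing like this comes from the standard library's dis module. A minimal sketch; opcode names and offsets change between CPython versions, so the output on a newer interpreter will not match the listing above exactly (for example, CALL_FUNCTION was replaced in later releases).

```python
import dis

def hello_PyGotham():
    print("Hello PyGotham")

# Print a human-readable disassembly of the function's bytecode.
dis.dis(hello_PyGotham)

# The same instructions are also available as objects.
opnames = [ins.opname for ins in dis.get_instructions(hello_PyGotham)]
print(opnames)
```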

25 of 49

Final steps!

  • Stack oriented virtual machine
  • Push & pop commands!
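
To make the push and pop idea concrete, here is a toy stack machine that runs the six instructions from the bytecode slide. It is only a sketch: the run() function and its parameters are made up for illustration, and CPython's real evaluation loop is written in C and handles far more than this.

```python
def run(instructions, constants, global_names, namespace):
    """Execute a tiny subset of CPython-style opcodes on a value stack."""
    stack = []
    for opname, arg in instructions:
        if opname == "LOAD_GLOBAL":
            stack.append(namespace[global_names[arg]])  # push the named object
        elif opname == "LOAD_CONST":
            stack.append(constants[arg])                # push a constant
        elif opname == "CALL_FUNCTION":
            # Pop arg arguments (restoring their original order), then the callable.
            args = [stack.pop() for _ in range(arg)][::-1]
            func = stack.pop()
            stack.append(func(*args))                   # push the return value
        elif opname == "POP_TOP":
            stack.pop()                                 # discard the top value
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError(f"unsupported opcode: {opname}")

output = []  # stands in for the terminal so the effect can be inspected
result = run(
    [
        ("LOAD_GLOBAL", 0),
        ("LOAD_CONST", 1),
        ("CALL_FUNCTION", 1),
        ("POP_TOP", None),
        ("LOAD_CONST", 0),
        ("RETURN_VALUE", None),
    ],
    constants=(None, "Hello PyGotham"),
    global_names=("print",),
    namespace={"print": output.append},
)
print(output, result)  # ['Hello PyGotham'] None
```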

26 of 49

Back to the Abstract Syntax Tree (AST)

27 of 49

What it is

  • Not a concrete representation of the original source code
  • Abstraction of code that discards details and focuses on the syntactic structure
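
One way to see which details get discarded: two snippets that differ only in spacing, redundant parentheses and comments produce the same tree. A minimal sketch using the standard library's ast module.

```python
import ast

plain = ast.parse("z = 3 * (x + y)")
noisy = ast.parse("z   =   3*((x+y))   # a comment")

# ast.dump() omits line and column information by default,
# so only the syntactic structure is compared.
same_structure = ast.dump(plain) == ast.dump(noisy)
print(same_structure)  # True
```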

28 of 49

Why is it useful

  • Contributing to CPython
  • Linting
  • Debugging

29 of 49

Linting Example

  • Extend Pylint with plugins!
  • You can write three types of checkers
    • Raw checkers: analyze the raw file stream
    • Token checkers: analyze the tokens
    • AST checkers: analyze the AST

30 of 49

Linting Example

  • Pylint uses the Astroid module
  • Astroid = AST + more functionality

31 of 49

Linting Example

  • Ordering of import statements
    • Alphabetically
    • import ... statements go before from ... import ... statements

32 of 49

Correct Import Example

import os
from os import path

from django.conf import settings
from django.db.models import (
    CharField,
    Field,
)

33 of 49

Incorrect Import Example

from os import path  # should be after import os
import os

from django.conf import settings
from django.db.models import (
    Field,  # should be after CharField
    CharField,
)

34 of 49

Astroid Tree Representation

import astroid

source_code = '''

def hello_PyGotham():
    print("Hello PyGotham")

hello_PyGotham()
'''

# Named tree rather than ast to avoid shadowing the standard library module
tree = astroid.parse(source_code)
print(tree)

35 of 49

Astroid Tree Representation

Module(name='',
       doc=None,
       file='<?>',
       path=['<?>'],
       package=False,
       pure_python=True,
       future_imports=set(),
       body=[ <FunctionDef.hello_PyGotham l.3 at 0x107234400>,
              <Expr l.6 at 0x107234470>])

36 of 49

Pylint boilerplate

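# Imports for Pylint 2.x (IAstroidChecker was removed in Pylint 3.0)
from pylint.checkers import BaseChecker
from pylint.interfaces import IAstroidChecker


class AlphabeticallySortedImports(BaseChecker):
    __implements__ = IAstroidChecker

    name = 'alphabetically-sorted-imports-checker'

    UNSORTED_IMPORT_FROM = 'unsorted-import-from'
    DIR_HIGHER = 'higher'
    DIR_LOWER = 'lower'

    # message id -> (message template, symbolic name, description)
    msgs = {
        'C5001': ('"%s" in "%s" is in the wrong position. Move it %s.',
                  UNSORTED_IMPORT_FROM,
                  'Refer to project rules on wiki'),
    }
    options = ()

    priority = -1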

41 of 49

Actual implementation!

def visit_importfrom(self, node):
    # node.names is a list of (name, alias) pairs for one from-import
    names = [name for name, _alias in node.names]
    sorted_names = sorted(names)
    for actual_index, name in enumerate(names):
        correct_index = sorted_names.index(name)
        if correct_index != actual_index:
            # Tell the user which way the name has to move
            direction = self.DIR_LOWER if correct_index > actual_index else self.DIR_HIGHER
            args = name, node.as_string(), direction
            self.add_message(
                self.UNSORTED_IMPORT_FROM, node=node, args=args
            )
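
The same check can be tried without Pylint installed. The sketch below uses the standard library's ast.NodeVisitor in place of a Pylint checker; the class name SortedFromImportChecker and its problems list are made up for illustration and are not part of Pylint or Astroid.

```python
import ast

class SortedFromImportChecker(ast.NodeVisitor):
    """Collects from-imports whose imported names are not in alphabetical order."""

    def __init__(self):
        self.problems = []

    def visit_ImportFrom(self, node):
        # Each entry in node.names is an ast.alias with .name and .asname.
        names = [alias.name for alias in node.names]
        if names != sorted(names):
            self.problems.append((node.lineno, node.module, names))
        self.generic_visit(node)

# The source is only parsed, never imported, so Django does not need to be installed.
source = (
    "from os import path\n"
    "from django.db.models import (\n"
    "    Field,\n"
    "    CharField,\n"
    ")\n"
)
checker = SortedFromImportChecker()
checker.visit(ast.parse(source))
print(checker.problems)  # [(2, 'django.db.models', ['Field', 'CharField'])]
```

The visitor method is dispatched on the node's class name, much like visit_importfrom is in the Pylint checker.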

44 of 49

Debugging Example

Birdseye!

  • Open source project
  • Graphical debugger
  • Uses the AST

45 of 49

How birdseye works!

from birdseye import eye

@eye
def foo():
    x = 1
    y = 2
    z = 3 * (x + y)
    if (x > y):
        y += 30
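
To get a feel for what "uses the AST" can mean for a debugger, the sketch below walks the AST of one statement from foo and evaluates every sub-expression on its own. This is a simplified illustration of recording expression values, not how birdseye itself is implemented; the env dictionary stands in for the function's local variables.

```python
import ast

source = "z = 3 * (x + y)"
env = {"x": 1, "y": 2}  # assumed values of the locals at this point

assignment = ast.parse(source).body[0]

values = {}
for node in ast.walk(assignment.value):
    # Operators and load/store contexts are AST nodes too; keep expressions only.
    if isinstance(node, ast.expr):
        code = compile(ast.Expression(body=node), "<sketch>", "eval")
        text = ast.get_source_segment(source, node)
        values[text] = eval(code, {}, env)

for text, value in values.items():
    print(f"{text!r} -> {value}")
```

Every sub-expression, from the whole right-hand side down to the individual names, ends up with a recorded value.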

48 of 49

Cool things about birdseye

  • Integrations
    • VS Code
    • Jupyter notebooks
    • PyCharm plugins

49 of 49

Thank you!