1 of 47

Python in the real world

John J. Camilleri

2 of 47

Me

  • CTO at Textual.ai
  • Ph.D. in Computer Science
  • Out in the “real world” since 2017
  • Developing web apps for ~20 years
  • Language top 5 (time spent)
    1. JavaScript/TypeScript
    2. Haskell
    3. Python
    4. Shell scripting
    5. PHP

3 of 47

Scalable content creation for E-commerce

4 of 47

5 of 47

├─ subpart/subpart

│ ├─ quality/property='flared_13049'

│ ├─ identifier/kind='leg_567'

│ │ ├─ identifier/hypernym='leg_567'

│ │ ├─ kind/number='pl'

├─ subpart/subpart

│ ├─ identifier/kind='quality_7753'

│ │ ├─ identifier/hypernym='quality_7753'

│ ├─ quality/property='ribbed_1678'

│ ├─ material/cloth='jersey_635'

├─ subpart/subpart

│ ├─ identifier/kind='elastic_43945'

│ │ ├─ identifier/hypernym='elastic_43945'

│ ├─ phrase/position='at_the_waist_6720'

├─ subpart/subpart

│ ├─ identifier/kind='fit_5352'

│ │ ├─ identifier/hypernym='fit_5352'

│ ├─ quality/property='fitted_5842'

├─ template/template='gina-tricot_produktionse…

│ ├─ template/data={'text': 'Pick & Choose AB'}

├─ template/template='gina-tricot_produktionse…

│ ├─ template/data={'text': 'Melek Tekstil -…

├─ identifier/name='Aurora trousers'

├─ identifier/kind='pants_585'

Aurora trousers

Byxa med utsvängda ben. Byxan har en ribbad kvalitet i trikå. Den har en resår i midjan och en figurnära passform.

Material & Hållbarhet

  • leverantör: Pick & Choose AB
  • produktionsenhet: Melek Tekstil - Yahya Körsu

Aurora trousers

Housut levenevillä jaloilla. Housuissa on ribattu laatu trikoosta. Niissä on kuminauha vyötäröllä ja vartalonmyötäinen istuvuus.

Materiaalit ja kestävyys

  • tavarantoimittaja: Pick & Choose AB
  • tuotantoyksikkö: Melek Tekstil - Yahya Körsu

Aurora trousers

Hose mit ausgestellten Beinen. Die Hose hat eine gerippte Qualität aus Jersey. Sie hat ein Gummiband in der Taille und eine taillierte Passform.

Materialien & Nachhaltigkeit

  • lieferant: Pick & Choose AB
  • produzent: Melek Tekstil - Yahya Körsu

6 of 47

7 of 47

  • Gothenburg startup
  • Small team, big customers
  • Combine rule-based NLG with AI
  • “Agile”: build things as they are needed

8 of 47

9 of 47

10 of 47

11 of 47

Tech stack

  • ReactJS, jQuery
  • HTML, JavaScript, TypeScript, SCSS
  • Python, Django
  • Grammatical Framework
  • PostgreSQL
  • Pytest, Jest, Selenium
  • Shell, CI/CD scripts
  • Docker, Kubernetes, Serverless functions

12 of 47

Code repository in numbers

  • 18 590 commits by 26 people since 2017
  • 4 236 files
    • 1 660 Python (~40%)
    • 625 JavaScript/TypeScript
    • 1 951 other
  • 1 636 384 lines of code
    • 256 446 Python (~15%)
    • 84 852 JavaScript/TypeScript
    • 687 682 JSON
    • 56 988 CSS
    • 15 238 HTML
    • 535 178 other

13 of 47

Tickets and boards

14 of 47

Git “feature branch” workflow

  • dev branch is protected
  • New feature branches off dev
  • Merge requests back into dev
  • New releases branch off dev
  • Hotfix to current release: cherry-pick commits from dev into existing release branch

15 of 47

Code review

16 of 47

Continuous integration/delivery (CI/CD)

  • On every push and merge request
    • Vulnerability checks
    • Linting
    • Tests
    • Code coverage
  • On every commit to release branch
    • Build Docker images
    • Store in container registry

17 of 47

Feature flags

if getFeatureFlag(user, 'new-ui'):
    ...  # new code
else:
    ...  # old code

  • Turn features on/off quickly
  • Target different users
  • Control rollout
  • Avoid maintaining multiple versions
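
As a rough illustration of what could sit behind a call like getFeatureFlag, here is a minimal sketch. The FeatureFlag record, the FLAGS store, and the get_feature_flag name are all hypothetical; a real setup would typically read flags from a control-panel service rather than a module-level dictionary.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    """Hypothetical flag record, as it might be edited from a control panel."""
    enabled: bool = False                                  # turn the feature on/off quickly
    allowed_users: set[str] = field(default_factory=set)   # target specific users
    rollout_percent: int = 0                               # control rollout, 0-100


# Stand-in for the flag store (assumption: a real system reads this from a service or database)
FLAGS: dict[str, FeatureFlag] = {
    "new-ui": FeatureFlag(enabled=True, allowed_users={"alice"}, rollout_percent=0),
}


def get_feature_flag(user: str, name: str) -> bool:
    flag = FLAGS.get(name)
    if flag is None or not flag.enabled:
        return False  # unknown or switched-off flags fall back to the old code path
    if user in flag.allowed_users:
        return True
    # Hash flag name + user into a stable bucket 0-99, so a user's answer does not flip between calls
    digest = hashlib.sha256(f"{name}:{user}".encode()).hexdigest()
    return int(digest, 16) % 100 < flag.rollout_percent


print(get_feature_flag("alice", "new-ui"))  # True: explicitly targeted
print(get_feature_flag("bob", "new-ui"))    # False: rollout is at 0%
```

Raising rollout_percent widens the audience gradually without a new deployment, which is the point of the "control rollout" bullet.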

Application code

Control panel

18 of 47

  • Run everything inside OS-level containers
  • Easily reproduce working environment
  • Self-document system requirements
  • Essential when hosting in cloud
  • What about pyenv/virtualenv?
    • Yes, but you’ll still need Docker

19 of 47

Application monitoring (APM)

20 of 47

Error tracking

21 of 47

Unexpected errors happen.

A lot.

Why?

Traceback (most recent call last):
  File "/Users/john/repositories/Textual/textual-app/app/products/views/details/tabs/templates_tab.py", line 285, in make_inline_form_html
    text = TemplateGenerator.render_template(
  File "/Users/john/repositories/Textual/textual-app/app/planner/data.py", line 646, in render_template
    text = template.render(context_data)
  File "/usr/local/lib/python3.9/site-packages/jinja2/asyncsupport.py", line 76, in render
    return original_render(self, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/jinja2/environment.py", line 1008, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/local/lib/python3.9/site-packages/jinja2/environment.py", line 780, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.9/site-packages/jinja2/_compat.py", line 37, in reraise
    raise value.with_traceback(tb)
  File "<template>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/jinja2/environment.py", line 411, in getitem
    return obj[argument]
jinja2.exceptions.UndefinedError: 'phrase' is undefined

22 of 47

Nullability

Anything at any point could be None (or undefined)
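
A small sketch of how a missing value shows up in practice. The product dictionaries and the material_label helper are made up for illustration.

```python
from typing import Optional

# Hypothetical product data: the second product has no "material" entry
products = [
    {"name": "Aurora trousers", "material": "jersey"},
    {"name": "Mystery item"},
]


def material_label(product: dict) -> str:
    material: Optional[str] = product.get("material")  # None when the key is absent
    if material is None:
        return "unknown material"  # handle the missing case explicitly
    return material.capitalize()


for product in products:
    print(product["name"], "->", material_label(product))

# Without the None check, the second product would raise:
# AttributeError: 'NoneType' object has no attribute 'capitalize'
```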

23 of 47

Dynamicity

Anything at any point could have an unexpected type

24 of 47

Python is a dynamic language

Everything happens at runtime

Dynamic typing

Infer types at runtime

Dynamic binding

Add/change object attributes at runtime

“Duck” typing

Consider object attributes rather than type

Introspection

Examine object attributes at runtime

Monkey patching

Modify code, replace functions at runtime

Dynamic loading

Import modules at runtime

Dynamic code execution

Execute arbitrary code at runtime
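
Most of the features listed above fit into one short script. This is only a sketch; the Product class and its attributes are invented for the example.

```python
import importlib


class Product:
    def __init__(self, name):
        self.name = name


item = Product("Aurora trousers")

# Dynamic binding: attach an attribute that the class never declared
item.colour = "black"

# Introspection: list the instance attributes at runtime
print(vars(item))                  # {'name': 'Aurora trousers', 'colour': 'black'}


# Duck typing: anything with a .name attribute is accepted
def describe(thing):
    return f"This is {thing.name}"


# Monkey patching: add a method to the class after it was defined
Product.describe = describe
print(item.describe())             # This is Aurora trousers

# Dynamic loading: the module name is only known at runtime
module_name = "math"
math_module = importlib.import_module(module_name)
print(math_module.ceil(67 / 4))    # 17

# Dynamic code execution: run source code held in a string
namespace = {}
exec("answer = 6 * 7", namespace)
print(namespace["answer"])         # 42
```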

25 of 47

The Python interpreter

$ python3

Python 3.10.12 (main, Jun 20 2023, 17:00:24) [Clang 14.0.3 ...

Type "help", "copyright", "credits" or "license" for more information.

>>> import math

>>> math.ceil(67/4)

17

>>> "hello".replace

<built-in method replace of str object at 0x106343fb0>

>>> "hello".replace("e", "a").replace("o", "å")

'hallå'

>>> ^d

$

How is this implemented?

26 of 47

Implementing the Python interpreter

while True: exec(str(input('>>>')))

27 of 47

Dynamic code

  • DevOps workflow & release cycle → updating production code is slow
  • We want to make customer-specific customisations on-the-fly
  • Store dynamic code snippets in database
  • Easy to implement with eval()
  • Dangerous!
    • Side-steps code review, testing
    • Developer can introduce bugs
    • Even worse if code is malicious
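
A minimal sketch of the idea and of why it is dangerous. The SNIPPETS dictionary stands in for a database table, and the customer names and snippet contents are hypothetical.

```python
# Stand-in for a database table of customer-specific snippets (hypothetical)
SNIPPETS = {
    "acme": "title.upper() + ' | ' + brand",
    "evil": "__import__('os').getcwd()",  # nothing stops a snippet from doing this
}


def run_snippet(customer: str, context: dict):
    source = SNIPPETS[customer]
    # eval() runs whatever the string contains, with the context names in scope
    return eval(source, {}, context)


print(run_snippet("acme", {"title": "Aurora trousers", "brand": "ExampleBrand"}))
# AURORA TROUSERS | ExampleBrand

# The second snippet reaches the operating system from inside what looks like data
print(run_snippet("evil", {}))
```

Passing an empty globals dictionary does not make this a sandbox: Python still provides the built-ins, so the snippet can import modules and do anything the application process can do.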

28 of 47

Dynamic typing

$ cat test.py

foo = "hello"

print(foo[2])

foo_l = [foo]

print(foo_l[0])

foo.append("!")

$ python3 test.py

l

hello

Traceback (most recent call last):

File "…/test.py", line 30, …

foo.append("!")

AttributeError: 'str' object has no attribute 'append'

>>> foo = "hello"
>>> type(foo)
<class 'str'>
>>> foo[2]
'l'
>>> foo_l = [foo]
>>> type(foo_l)
<class 'list'>
>>> foo_l[0]
'hello'
>>> foo.append("!")
AttributeError: 'str' object has no attribute 'append'

29 of 47

Dynamic errors

  • Everything is at runtime: errors too
  • No type checker to catch things early
  • Compensate with testing (boring)
  • Debugging very important (fun)

30 of 47

Dynamicity encourages quick hacks

👹

Convenient when you know what you’re doing

👼🏻

No one in a team knows all parts of a system

31 of 47

Static typing

Without static type checking

def fib(n):
    a, b = 0, 1
    while a < n:
        yield a
        a, b = b, a+b

print(list(fib(4)))
print(list(fib("hello")))

$ python3 fib.py
[0, 1, 1, 2, 3]
Traceback (most recent call last):
  File "/tmp/fib.py", line 10, in <module>
    print(list(fib("hello")))
  File "/tmp/fib.py", line 5, in fib
    while a < n:
TypeError: '<' not supported between instances of 'int' and 'str'

With static type checking

from typing import Iterator

def fib(n: int) -> Iterator[int]:
    a, b = 0, 1
    while a < n:
        yield a
        a, b = b, a+b

print(list(fib(4)))
print(list(fib("hello")))

$ mypy fib.py
fib.py:10: error: Argument 1 to "fib" has incompatible type "str"; expected "int"
Found 1 error in 1 file (checked 1 source file)

$ python3 fib.py
[0, 1, 1, 2, 3]
Traceback (most recent call last): …

32 of 47

mypy: http://mypy-lang.org/

  • Type safety is great!
  • Uses existing Python syntax (not transpilation, e.g. TypeScript → JavaScript)
  • Typing is opt-in
    • Extra work required to enjoy benefits
    • Very easy (a feature!) to have holes in your definitions
  • External libraries are often untyped
  • Somewhat immature
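
One way such a hole can look, as a hedged sketch: a single Any return type is enough to let a wrong value through. The load_age and next_birthday functions are invented for the example.

```python
from typing import Any


def load_age(raw: dict) -> Any:   # Any switches type checking off for this value
    return raw["age"]


def next_birthday(age: int) -> int:
    return age + 1


age = load_age({"age": "8"})      # actually a str at runtime

try:
    # A static checker accepts this call, because Any is compatible with int
    print(next_birthday(age))
except TypeError as exc:
    print("Runtime error:", exc)  # the mistake only surfaces when the code runs
```

Changing the return annotation of load_age to int would not fix the data, but it would at least make the checker hold every caller to that promise.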

“an optional static type checker for Python that aims to combine the benefits of dynamic (or "duck") typing and static typing”

Other typing solutions

33 of 47

Object Orientation (OO) in Python

Class with fields and method

class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def woof(self):
        print(f"{self.name} says woof!")

dog1 = Dog("Lassie", 8)

>>> dog1.woof()
Lassie says woof!
>>> type(dog1)
<class '__main__.Dog'>

Function + POPO (Plain Old Python Object)

def woof(dog):
    print(f"{dog['name']} says woof!")

dog2 = {
    "name": "Rex",
    "age": 4
}

>>> woof(dog2)
Rex says woof!
>>> type(dog2)
<class 'dict'>

34 of 47

Web Apps: Frontend/Backend

  • Code: limited on frontend, anything on backend
  • Data: does not include functions

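[Diagram: Frontend ⇄ Backend ⇄ Storage. Frontend and Backend each hold code. Data passes between Frontend and Backend as JSON/XML (numbers, strings, lists, dictionaries) and between Backend and Storage as SQL/JSON (numbers, strings, relations).]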

35 of 47

Data conversion / “Serialization”

Frontend

  • JSON to POPO
  • Functions in frontend code?
  • Python classes?

Storage

  • Normalisation of data across tables
  • Object-relational mapping (ORM) layer
  • “NoSQL” database can reduce normalisation
  • Efficient retrieval, query writing

Python class & instance

class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def woof(self):
        print(f"{self.name} says woof!")

dog = Dog("Rex", 4)

JSON object

dog = {
    "name": "Rex",
    "age": 4
}
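
The round trip between the two forms can be sketched with the standard library. Using a dataclass here is a choice made for the example, not something the slides prescribe, and woof returns a string instead of printing so the result is easy to inspect.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Dog:
    name: str
    age: int

    def woof(self) -> str:
        return f"{self.name} says woof!"


dog = Dog("Rex", 4)

# Backend -> frontend: only the data survives; woof() is not part of the JSON
payload = json.dumps(asdict(dog))
print(payload)                  # {"name": "Rex", "age": 4}

# Frontend -> backend: the JSON comes back as a plain dict ...
data = json.loads(payload)
print(type(data).__name__)      # dict

# ... and has to be turned into a Dog again before its methods are available
restored = Dog(**data)
print(restored.woof())          # Rex says woof!
```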

36 of 47

“Objects bind functions and data structures together in indivisible units.

This is a fundamental error since functions and data structures belong in totally different worlds.”

37 of 47

Every computer system ever

[Diagram: Input → Processing → Output, labelled with "data" and "functions"]

38 of 47

Object-Relational Mapping (ORM)

  • Layer between business logic and storage
    • Defines models and relationships
    • Avoids writing SQL
  • Swap out database engine
  • Versioning and migrations

You can have ORM without OO!

from django.db import models

# Models: each class maps to a database table
class Manufacturer(models.Model):
    name = models.CharField(max_length=100)

class Car(models.Model):
    name = models.CharField(max_length=50)
    year = models.IntegerField()  # model year stored as an integer, matching year=2021 below
    manufacturer = models.ForeignKey(
        Manufacturer, on_delete=models.CASCADE)

# Create rows without writing SQL
volvo = Manufacturer.objects.create(
    name="Volvo")
xc40 = Car.objects.create(
    name="XC 40",
    year=2021,
    manufacturer=volvo)

# Query through the relationship
all_volvos = Car.objects.filter(manufacturer=volvo)

39 of 47

Concurrency

  • Python “supports” concurrency — threading module
  • Only one thread runs Python code at a time (no parallelism)
  • Threads compete for Global Interpreter Lock (GIL)
  • Not so bad for lots of I/O (which releases the GIL)
  • Very bad for parallel computation

In practice

  • Build concurrency at application level
  • Manager-Worker model
  • Parallelism between processes (not threads)
  • You probably need this anyway
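
A minimal sketch of the manager-worker idea using processes from the standard library. The count_primes workload and the run_jobs helper are placeholders chosen for the example; any CPU-bound function would do.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit: int) -> int:
    """CPU-bound worker: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_jobs(limits: list[int], workers: int = 4) -> list[int]:
    """Manager: hand each job to a separate worker process and collect the results."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":  # guard needed so worker processes can import this module safely
    print(run_jobs([10_000, 20_000, 30_000, 40_000]))
```

Each worker is a separate interpreter with its own GIL, so the jobs can run on different CPU cores. The cost is that arguments and results are pickled and copied between processes.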

40 of 47

Performance: in theory

  • Slow compared to other languages
  • Memory leaks
  • Bad benchmark performance

41 of 47

Performance: in reality

  • Benchmarks are not the “real world”
  • Side-by-side comparison of entire app unlikely
  • Bottlenecks come from many places
    1. Algorithms
    2. Data structures
    3. Database queries
    4. Network and file I/O
    5. External APIs
    6. Hosting environment
  • Speed isn’t everything
  • You have other problems
  • Memory leaks: just restart periodically
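
To illustrate how much a data structure choice alone can matter, here is a small timing sketch. The collection size and repeat count are arbitrary, and absolute timings will vary by machine.

```python
import timeit

n = 50_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# Same question, two data structures: "is target in the collection?"
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list lookup: {list_time:.4f}s")
print(f"set lookup:  {set_time:.4f}s")
```

The list scans its elements one by one, while the set does a hash lookup, so the gap grows with the size of the collection regardless of which language runs the loop.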

42 of 47

Conclusions

  1. Real-world software is a jungle of tech and tools.
  2. Python is easy, but not safe.
  3. Language features can be both great and terrible.
  4. Keeping code manageable takes effort.
  5. Python is well-supported everywhere: API tutorials, hosting platforms, IDEs, AIs…
  6. Community is huge and active: easy to find libraries and developers (but quality varies).
  7. Be sceptical of the object-oriented paradigm.

43 of 47

Sources

44 of 47

🐍 Thanks!

45 of 47

Unused slides…

46 of 47

“Duck-typing”

class Duck:
    def swim(self):
        print("Duck swimming")

    def fly(self):
        print("Duck flying")

class Whale:
    def swim(self):
        print("Whale swimming")

for animal in [Duck(), Whale()]:
    animal.swim()
    animal.fly()

Duck swimming
Duck flying
Whale swimming
AttributeError: 'Whale' object has no attribute 'fly'

"If it walks like a duck and it quacks like a duck, then it must be a duck"

47 of 47

Encapsulation (or lack thereof)

  • Class abstractions are leaky
  • Access modifiers do not exist
  • Anyone can access code from almost any place
  • Mangling protects namespaces when using inheritance, but still allows access to mangled members
  • Potentially unsafe

class Foo:
    # public
    def qux(self):
        print("qux")

    # protected (by convention only)
    def _qib(self):
        print("qib")

    # private (name-mangled to _Foo__qak)
    def __qak(self):
        print("qak")

>>> f = Foo()
>>> f.qux()
qux
>>> f._qib()
qib
>>> f._Foo__qak()
qak