1 of 60

Let’s play: Code Review

How to write better Python code the first time

Dominik 'Disconnect3d' Czarnota

ThaiPy, 19.04.2018

1

2 of 60

# whoami

  • Interested in security, low level stuff and reverse engineering <3
  • Playing Capture The Flag contests in Just Hit the Core team
  • Working for
  • Contributing to some open source projects here and there ;)

https://github.com/disconnect3d/

https://disconnect3d.pl/

disconnect3d # irc.freenode.net

2

3 of 60

Ways to approach this talk...

3

6 of 60

1) C-like languages vs Python

    for i in range(len(collection)):
        item = collection[i]
        # …

    for item in collection:
        # …

If you need both item and index...

    for idx, item in enumerate(collection):
        # …

6
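A small runnable sketch of the three loop styles on this slide. The sample `collection` and the `start=1` variant are illustrative assumptions, not part of the talk.

```python
collection = ["spam", "eggs", "ham"]  # assumed sample data

# C-style: iterate over indices, then look each item up.
c_style = []
for i in range(len(collection)):
    item = collection[i]
    c_style.append((i, item))

# Pythonic: enumerate yields (index, item) pairs directly.
pythonic = [(idx, item) for idx, item in enumerate(collection)]

# enumerate also accepts a start value, handy for 1-based numbering.
numbered = [f"{n}. {item}" for n, item in enumerate(collection, start=1)]
```

Both loops build the same pairs; the `enumerate` version has no index arithmetic to get wrong.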

9 of 60

2) Python batteries

  • Collections:
    • list, tuple, dict, set, frozenset
    • collections.defaultdict, collections.OrderedDict, collections.deque
  • All the fancy builtins and stdlib helpers:
    • map, zip, filter, functools.reduce (reduce is no longer a builtin in Python 3)
    • itertools.*, functools.*
  • All the cool modules and frameworks out there:
    • requests, BeautifulSoup, marshmallow, SQLAlchemy
    • Django, django-rest-framework, Flask
    • And many many more...

9
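To make a few of these batteries concrete, here is a minimal sketch using only the standard library. The word list is assumed sample data; the point is how little code is needed once the right collection is chosen.

```python
from collections import defaultdict, deque
from functools import reduce
from itertools import chain

words = ["apple", "avocado", "banana", "blueberry", "cherry"]  # assumed sample data

# defaultdict: group items without checking whether the key exists yet.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# deque with maxlen: keep only the last N items seen.
last_two = deque(words, maxlen=2)

# itertools.chain + functools.reduce: flatten the groups, then fold to a total.
flattened = list(chain.from_iterable(by_letter.values()))
total_chars = reduce(lambda acc, word: acc + len(word), flattened, 0)
```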

10 of 60

3) Design patterns/principles

  • SOLID: single responsibility, dependency inversion, etc.
  • DRY: Don’t repeat yourself
  • KISS: Keep it simple, stupid
  • YAGNI: You aren't gonna need it
  • >>> Stop writing classes <<< (it's Python, not Java :P)

10
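One way to read the "stop writing classes" advice, sketched with hypothetical names (`Greeter` and `greet` are not from the talk): a class that only has `__init__` and a single method can often be a plain function.

```python
# A class with only __init__ and one method is often a function in disguise.
class Greeter:
    def __init__(self, greeting):
        self.greeting = greeting

    def greet(self, name):
        return f"{self.greeting}, {name}!"


# The same behaviour as a plain function with a default argument.
def greet(name, greeting="Hello"):
    return f"{greeting}, {name}!"


from_class = Greeter("Hello").greet("ThaiPy")
from_function = greet("ThaiPy")
```

This is only a guideline: classes still make sense when there is real state shared between several methods.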

11 of 60

5) Tooling

  • Use Continuous Integration
    • Linting:
      • Typing: mypy (Python >= 3.5)
      • Style: e.g. Pylint
      • Security analysis: Bandit
  • Cool IDE setup: don’t waste time on style etc.
    • Set up a consistent config across editors: EditorConfig
  • Use your shell: e.g. IPython :>

11

12 of 60

6) Git commit

  • Learn git rebase
  • Set proper git commit messages

12

13 of 60

6) Git commit

13

14 of 60

6) Git commit

Lol wut

This sucks

14

15 of 60

6) Git commit

15

16 of 60

7) Be a human

  • ‘How to do code review like a human’
    • Let computers do the boring stuff
    • Be generous with code examples
    • Never say ‘you’
    • Frame feedback as requests, not commands
    • Tie notes to principles, not opinions

16

17 of 60

Okay enough

17

18 of 60

Let’s do some reviews?

18

19 of 60

Example #1

19

20 of 60

What does it do?

    charref = re.compile("&#(0[0-7]+"
                         "|[0-9]+"
                         "|x[0-9a-fA-F]+);")

20

21 of 60

What does it do?

    charref = re.compile("&#(0[0-7]+"
                         "|[0-9]+"
                         "|x[0-9a-fA-F]+);")

    --------------------------------------------------

    charref = re.compile(r"""
     &[#]                # Start of a numeric entity reference
     (
         0[0-7]+         # Octal form
       | [0-9]+          # Decimal form
       | x[0-9a-fA-F]+   # Hexadecimal form
     )
     ;                   # Trailing semicolon
    """, re.VERBOSE)

source: https://docs.python.org/3/howto/regex.html#compilation-flags

21
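A quick check that the compact and the verbose pattern behave the same. The sample strings and the helper `captured` are assumptions added for illustration.

```python
import re

compact = re.compile("&#(0[0-7]+"
                     "|[0-9]+"
                     "|x[0-9a-fA-F]+);")

verbose = re.compile(r"""
 &[#]                # Start of a numeric entity reference
 (
     0[0-7]+         # Octal form
   | [0-9]+          # Decimal form
   | x[0-9a-fA-F]+   # Hexadecimal form
 )
 ;                   # Trailing semicolon
""", re.VERBOSE)

SAMPLES = ["&#065;", "&#65;", "&#x41;", "&amp;", "&#xZZ;", "&#;"]  # assumed inputs


def captured(pattern, text):
    """Return the captured number part, or None when there is no match."""
    match = pattern.search(text)
    return match.group(1) if match else None


# Map each sample to the pair (compact result, verbose result).
results = {s: (captured(compact, s), captured(verbose, s)) for s in SAMPLES}
```

With `re.VERBOSE`, whitespace and `#` comments inside the pattern are ignored, which is why the literal `#` is written as `[#]`.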

22 of 60

Example #2

22

23 of 60

    class AbstractCpu:
        def execute(self):
            """Decode, and execute one instruction pointed by register PC"""
            # (...)

            try:
                try:
                    getattr(self, name)(*insn.operands)
                except AttributeError:
                    text_bytes = ' '.join('%02x' % x for x in insn.bytes)
                    logger.info("Unimplemented instruction: 0x%016x:\t%s\t%s\t%s",
                                insn.address, text_bytes, insn.mnemonic, insn.op_str)
                    self.emulate(insn)
            except (Interruption, Syscall) as e:
                e.on_handled = lambda: self._publish_instruction_as_executed(insn)
                raise e
            else:
                self._publish_instruction_as_executed(insn)


    class SomeCpu(AbstractCpu):
        def CMP(self, op1, op2):
            # ...
        def MOV(self, op1, op2):
            # ...

23

30 of 60

30

Is this code okay?

31 of 60

31

What if CMP, MOV or another instruction handler itself raises AttributeError?

32 of 60

32

Then it gets logged as an unimplemented instruction and we emulate it instead, so the real bug stays hidden...

34 of 60

    class AbstractCpu:
        def execute(self):
            """Decode, and execute one instruction pointed by register PC"""
            # (...)

            try:
                implementation = getattr(self, name, None)

                if implementation is not None:
                    implementation(*insn.operands)
                else:
                    text_bytes = ' '.join('%02x' % x for x in insn.bytes)
                    logger.info("Unimplemented instruction: 0x%016x:\t%s\t%s\t%s",
                                insn.address, text_bytes, insn.mnemonic, insn.op_str)
                    self.emulate(insn)
            except (Interruption, Syscall) as e:
                e.on_handled = lambda: self._publish_instruction_as_executed(insn)
                raise e
            else:
                self._publish_instruction_as_executed(insn)

34
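The difference between the two versions can be reproduced with a toy class. Everything here (`Cpu`, the deliberately buggy `MOV`, `execute_masking`, `execute_explicit`) is a hypothetical stand-in, not the reviewed project's code.

```python
class Cpu:
    """Toy CPU whose MOV handler contains a bug that raises AttributeError."""

    def __init__(self):
        self.emulated = []

    def MOV(self, dst, src):
        # Simulated bug inside an implemented handler: misspelled attribute.
        return self.regsiters

    def emulate(self, name):
        self.emulated.append(name)

    def execute_masking(self, name, *operands):
        # Original shape: the except also catches errors raised INSIDE the handler.
        try:
            getattr(self, name)(*operands)
        except AttributeError:
            self.emulate(name)

    def execute_explicit(self, name, *operands):
        # Revised shape: only a missing handler falls back to emulation.
        implementation = getattr(self, name, None)
        if implementation is not None:
            implementation(*operands)
        else:
            self.emulate(name)


masking = Cpu()
masking.execute_masking("MOV", "eax", "ebx")   # bug silently turned into emulation

explicit = Cpu()
explicit.execute_explicit("FNOP")              # genuinely unimplemented: emulated
try:
    explicit.execute_explicit("MOV", "eax", "ebx")
    bug_surfaced = False
except AttributeError:
    bug_surfaced = True                        # the handler's bug is now visible
```

Keeping the `try` body as small as possible, or avoiding it with a default in `getattr`, stops unrelated errors from being mistaken for the expected one.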

35 of 60

Example #3

35

36 of 60

    import struct

    FIELDS = ['id', 'time_stamp', 'value']
    STRUCT_FMT = 'QII'

    class Feature:
        def __init__(self, values):
            self._tuple = values

        @staticmethod
        def create(**kwargs):
            return Feature(tuple(kwargs.get(field) for field in FIELDS))

        @property
        def time_stamp(self):
            return self._tuple[1]

        def serialize(self):
            return struct.pack(self.STRUCT_FMT, *self._tuple)

        @staticmethod
        def deserialize(buffer):
            values = struct.unpack(Feature.STRUCT_FMT, buffer)
            return Feature(values)

36

37 of 60

37

A property for each field

39 of 60

    In [2]: struct.pack('I', 0xDEADBABE)
    Out[2]: b'\xbe\xba\xad\xde'

    In [3]: struct.unpack('I', b'\xbe\xba\xad\xde')
    Out[3]: (3735927486,)

39

41 of 60

    import struct

    FIELDS = ['id', 'time_stamp', 'value']
    STRUCT_FMT = 'QII'

    class Feature:
        def __init__(self, id=None, time_stamp=None, value=None):
            self._tuple = (id, time_stamp, value)

        @staticmethod
        def create(**kwargs):
            return Feature(tuple(kwargs.get(field) for field in FIELDS))

        @property
        def time_stamp(self):
            return self._tuple[1]

        def serialize(self):
            return struct.pack(self.STRUCT_FMT, *self._tuple)

        @staticmethod
        def deserialize(buffer):
            values = struct.unpack(Feature.STRUCT_FMT, buffer)
            return Feature(values)

41

42 of 60

    import struct

    FIELDS = ['id', 'time_stamp', 'value']
    STRUCT_FMT = 'QII'

    class Feature:
        def __init__(self, id=None, time_stamp=None, value=None):
            self._tuple = (id, time_stamp, value)

        @property
        def time_stamp(self):
            return self._tuple[1]

        def serialize(self):
            return struct.pack(self.STRUCT_FMT, *self._tuple)

        @staticmethod
        def deserialize(buffer):
            values = struct.unpack(Feature.STRUCT_FMT, buffer)
            return Feature(values)

42

43 of 60

43

Is this okay?

44 of 60

    import ctypes


    class Feature(ctypes.LittleEndianStructure):
        _fields_ = (
            ('id', ctypes.c_uint64),
            ('time_stamp', ctypes.c_uint32),
            ('value', ctypes.c_uint32)
        )

        def serialize(self):
            return bytes(self)

        @classmethod
        def deserialize(cls, buf):
            return cls.from_buffer_copy(buf)

44

46 of 60

    import ctypes


    class Feature(ctypes.LittleEndianStructure):
        """
        # ...
        Serialization: bytes(feature)
        Deserialization: Feature.from_buffer_copy(buf)
        """

        _fields_ = (
            ('id', ctypes.c_uint64),
            ('time_stamp', ctypes.c_uint32),
            ('value', ctypes.c_uint32)
        )

46

47 of 60

Usage:

    In [2]: f = Feature(id=1, time_stamp=0xFACE, value=3)

    In [3]: bytes(f)
    Out[3]: b'\x01\x00\x00\x00\x00\x00\x00\x00\xce\xfa\x00\x00\x03\x00\x00\x00'

    In [4]: Feature.from_buffer_copy(_3)
    Out[4]: <__main__.Feature at 0x7f08200e8268>

47
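The round trip from the usage slide, as a standalone script. The comparison against `struct.pack('<QII', ...)` is an added assumption used to show that both approaches produce the same bytes for this layout.

```python
import ctypes
import struct


class Feature(ctypes.LittleEndianStructure):
    """Serialization: bytes(feature). Deserialization: Feature.from_buffer_copy(buf)."""

    _fields_ = (
        ('id', ctypes.c_uint64),
        ('time_stamp', ctypes.c_uint32),
        ('value', ctypes.c_uint32),
    )


feature = Feature(id=1, time_stamp=0xFACE, value=3)

raw = bytes(feature)                       # serialize: 8 + 4 + 4 = 16 bytes
restored = Feature.from_buffer_copy(raw)   # deserialize into a new instance

# Same layout expressed with struct: little-endian uint64, uint32, uint32.
same_as_struct = raw == struct.pack('<QII', 1, 0xFACE, 3)
```

Fields are accessed by name (`restored.time_stamp`), so no hand-written property per field is needed.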

49 of 60

Example #4

49

50 of 60

    import pandas


    class SomeTable:
        TABLE_NAME = 'some_table'
        COL_NAMES = ('timestamp', 'some_col', ...)

        def __init__(self, client):
            self.client = client
            self.table = self.client.open(self.TABLE_NAME)

        def scan(self, ts_min, ts_max):
            scanner = self.table.scanner()
            scanner.add_predicates(
                self.table['timestamp'] >= ts_min,
                self.table['timestamp'] <= ts_max,
            )
            tuples = scanner.get_tuples()

            mapping = lambda t: (t[0], t[1], t[2], ...)

            values = list(map(mapping, tuples))

            return pandas.DataFrame(values, columns=self.COL_NAMES)

50

54 of 60

54

So what do you think here?

57 of 60

57

For me:

  • Hardcoded column names
  • Adding a new column is a pain
  • ...if the table is created by another component, it might be good to check the table schema during __init__
  • and make an abstract/base class out of it

59 of 60

    import operator

    import pandas


    class SomeTable:
        TABLE_NAME = 'some_table'
        COL_NAMES = ('timestamp', 'some_col', ...)

        def __init__(self, client):
            self.client = client
            self.table = self.client.open(self.TABLE_NAME)

        def scan(self, ts_min, ts_max):
            scanner = self.table.scanner()
            scanner.add_predicates(
                self.table['timestamp'] >= ts_min,
                self.table['timestamp'] <= ts_max,
            )
            tuples = scanner.get_tuples()

            # Build each row by column name instead of hardcoded indices
            mapping = lambda t: tuple(self._column_getter(col)(t) for col in self.COL_NAMES)

            values = list(map(mapping, tuples))

            return pandas.DataFrame(values, columns=self.COL_NAMES)

        def _column_getter(self, col):
            idx = self.COL_NAMES.index(col)
            return operator.itemgetter(idx)

59
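The column lookup idea can be tried without a database or pandas. In this sketch the column names, the sample rows and the helpers `column_getter` and `project` are all hypothetical; only `operator.itemgetter` and the name-to-index lookup come from the slide.

```python
import operator

COL_NAMES = ('timestamp', 'some_col', 'other_col')   # hypothetical column names


def column_getter(col):
    """Return a callable that extracts the given column from a row tuple."""
    return operator.itemgetter(COL_NAMES.index(col))


def project(rows, wanted):
    """Pick the wanted columns, by name, from every row."""
    getters = [column_getter(col) for col in wanted]   # resolve indices once
    return [tuple(get(row) for get in getters) for row in rows]


rows = [(10, 'a', 1.5), (20, 'b', 2.5)]               # assumed sample rows
projected = project(rows, ('some_col', 'timestamp'))
```

Adding a column then means extending `COL_NAMES` only; no index has to be edited by hand.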

60 of 60

The end

Questions?

https://disconnect3d.pl/

disconnect3d # irc.freenode.net