Object oriented design in Python

Object oriented design

Benefits of object oriented design:

  • Encapsulation
  • Code reuse (composition, inheritance)
  • Abstraction

Encapsulation

class Location:
    def __init__(self, name, longitude, latitude):
        self.name = name.upper() # Names are always uppercase
        self.longitude = longitude
        self.latitude = latitude

>>> loc = Location("Antwerp", 4.42, 51.22)
>>> loc.name
'ANTWERP'
>>> loc.name = "Antwerpen"
>>> loc.name
"Antwerpen" 😟

Encapsulation - Attributes

Variables prefixed with an underscore (self._name) is a convention to indicate that the instance variable is private.

class Location:
    def __init__(self, name, longitude, latitude):
        self._name = name.upper() # Names are always uppercase
        ...

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value.upper()

>>> loc = Location("Antwerp", 4.42, 51.22)
>>> loc.name = "Antwerpen"
>>> loc.name
"ANTWERPEN" 😊

Composition

Composition in object oriented design is a way to combine objects or data types into more complex objects.

classDiagram

    class Grid{
        + nx
        + dx
        + ny
        + dy
        + find_index()
    }

    class ItemInfo{
        + name
        + type
        + unit
    }

    class DataArray{
        + data
        + time
        + item
        + geometry
        + plot()
    }

    DataArray --* Grid
    DataArray --* ItemInfo

Composition - Example

class Grid:
    def __init__(self, nx, dx, ny, dy):
        self.nx = nx
        self.dx = dx
        self.ny = ny
        self.dy = dy
    
    def find_index(self, x,y):
        ...

class DataArray:
    def __init__(self, data, time, item, geometry):
        self.data = data
        self.time = time
        self.item = item
        self.geometry = geometry

    def plot(self):
        ...

DataArray has a geometry (e.g. Grid) and an item (ItemInfo).

Inheritance

  • Inheritance is a way to reuse code and specialize behavior.
  • A child class inherits the attributes and methods from the parent class.
  • A child class can override the methods of the parent class.
  • A child class can add new methods.

Inheritance - Example

classDiagram

class _GeometryFM{
+ node_coordinates
+ element_table
}

class GeometryFM2D{
+ interp2d()
+ get_element_area()
+ plot()
}

class _GeometryFMLayered{
- _n_layers
- _n_sigma
+ to_2d_geometry()
}

class GeometryFM3D{
+ plot()
}

class GeometryFMVerticalProfile{
+ plot()
}
  _GeometryFM <|-- GeometryFM2D
  _GeometryFM <|-- _GeometryFMLayered
  _GeometryFMLayered <|-- GeometryFM3D
  _GeometryFMLayered <|-- GeometryFMVerticalProfile

GeometryFM3D inherits from _GeometryFMLayered, it is a _GeometryFMLayered.

Inheritance - Example (2)

class _GeometryFMLayered(_GeometryFM):
    def __init__(self, nodes, elements, n_layers, n_sigma):
        # call the parent class init method
        super().__init__(
            nodes=nodes,
            elements=elements,
        )
        self._n_layers = n_layers
        self._n_sigma = n_sigma

Composition vs inheritance

  • Inheritance is often used to reuse code, but this is not the main purpose of inheritance.
  • Inheritance is used to specialize behavior.
  • In most cases, composition is a better choice than inheritance.
  • Some recent programming languages (e.g. Go & Rust) do not support this style of inheritance.
  • Use inheritance only when it makes sense.

Types

C#

int n = 2;
String s = "Hello";

public String RepeatedString(String s, int n) {
    return Enumerable.Repeat(s, n).Aggregate((a, b) => a + b);
}

Python

n = 2
s = "Hello"

def repeated_string(s, n):
    return s * n

Types

  • Python is a dynamically typed language
  • Types are not checked at compile time by the interpreter
  • Types can be checked before runtime using a linter (e.g. mypy)
  • Type hints can be used by VS Code to provide auto-completion
n: int = 2
s: str = "Hello"

def repeated_string(s:str, n:int) -> str:
    return s * n

Abstraction

Version A

total = 0.0
for x in values:
    total = total +x

Version B

total = sum(values)
  • Using functions, e.g. sum() allows us to operate on a higher level of abstraction.
  • Too little abstraction will force you to write many lines of boiler-plate code
  • Too much abstraction limits the flexibility
  • ✨Find the right level of abstraction!✨

Collections Abstract Base Classes

classDiagram
    Container <|-- Collection
    Sized <|-- Collection
    Iterable <|-- Collection
   
    class Container{
        __contains__(self, x)
    }

    class Sized{
        __len__(self)
    }

    class Iterable{
        __iter__(self)
    }

  • If a class implements __len__ it is a Sized object.
  • If a class implements __contains__ it is a Container object.
  • If a class implements __iter__ it is a Iterable object.

Collections Abstract Base Classes

>>> a = [1, 2, 3]
>>> 1 in a
True
>>> a.__contains__(1)
True
>>> len(a)
3
>>> a.__len__()
3
>>> for x in a:
...     v.append(x)
>>> it = a.__iter__()
>>> next(it)
1
>>> next(it)
2
>>> next(it)
3
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Collections Abstract Base Classes

classDiagram
    Container <|-- Collection
    Sized <|-- Collection
    Iterable <|-- Collection
    Collection <|-- Sequence
    Collection <|-- Set
    Sequence <|-- MutableSequence
    Mapping <|-- MutableMapping
    Collection <|-- Mapping

    MutableSequence <|-- List
    Sequence <|-- Tuple
    MutableMapping <|-- Dict

Pythonic

If you want your code to be Pythonic, you have to be familiar with these types and their methods.

Dundermethods:

  • __getitem__
  • __setitem__
  • __len__
  • __contains__
class JavaLikeToolbox:

    def __init__(self, tools: Collection[Tool]):
        self.tools = tools
    
    def getToolByName(self, name: str) -> Tool:
        for tool in self.tools:
            if tool.name == name:
                return tool

    def numberOfTools(self) -> int:
        return len(self.tools)

>>> tb = JavaLikeToolbox([Hammer(), Screwdriver()])
>>> tb.getToolByName("hammer")
Hammer()
>>> tb.numberOfTools()
2
class Toolbox:

    def __init__(self, tools: Collection[Tool]):
        self._tools = {tool.name: tool for tool in tools}

    def __getitem__(self, name: str) -> Tool:
        return self._tools[name]
    
    def __len__(self) -> int:
        return len(self.tools)

>>> tb = Toolbox([Hammer(), Screwdriver()])
>>> tb["hammer"]
Hammer()
>>> len(tb)
2
class SparseMatrix:
    def __init__(self, shape, fill_value=0.0, data=None):
        self.shape = shape
        self._data = data if data is not None else {}
        self.fill_value = fill_value
        
    def __setitem__(self, key, value):
        i,j = key
        self._data[i,j] = float(value) 

    def __getitem__(self, key) -> float:
        i,j = key
        return self._data.get((i,j), self.fill_value)
    
    def transpose(self) -> "SparseMatrix":
        data = {(j,i) : v for (i,j),v in self._data.items()}
        return SparseMatrix(data=data,
                            shape=self.shape,
                            fill_value=self.fill_value)
    
    def __repr__(self):
        matrix_str = ""
        for j in range(self.shape[1]):
            for i in range(self.shape[0]):
                value = self[i, j]
                matrix_str += f"{value:<4}"
            matrix_str += "\n"
        return matrix_str
>>> m = SparseMatrix(shape=(2,2), fill_value=0.0)
>>> m
0.0 0.0
0.0 0.0
>>> m[0,1]
0.0
>>> m[0,1] = 1.0
>>> m[1,0] = 2.0
>>> m
0.0 2.0 
1.0 0.0 
>>> m.transpose()
0.0 1.0 
2.0 0.0

Duck typing

  • If it walks like a duck and quacks like a duck, it’s a duck
  • From the perspective of the caller, it doesn’t matter if it is a rubber duck or a real duck.
  • The type of the object is not important, as long as it has the right methods.
  • Python is different than C# or Java, where you would have to create an interface IToolbox and implement it for Toolbox.

Duck typing - Example

An example is a Scikit learn transformers

  • fit
  • transform
  • fit_transform

If you want to make a transformer compatible with sklearn, you have to implement these methods.

Duck typing - Example

class PositiveNumberTransformer:

    def fit(self, X, y=None):
        # no need to fit (still need to have the method!)
        return self

    def transform(self, X):
        return np.abs(X)

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)

Duck typing - Mixins

We can inherit some behavior from sklearn.base.TransformerMixin

from sklearn.base import TransformerMixin

class RemoveOutliersTransformer(TransformerMixin):

    def __init__(self, lower_bound, upper_bound):
        self.lower_bound = lower_bound
        self.upper_bound = upper_bound
        self.lower_ = None
        self.upper_ = None

    def fit(self, X, y=None):
        self.lower_ = np.quantile(X, self.lower_bound)
        self.upper_ = np.quantile(X, self.upper_bound)

    def transform(self, X):
        return np.clip(X, self.lower_, self.upper_)

    # def fit_transform(self, X, y=None):
    # we get this for free, from TransformerMixin

Let’s revisit the (date) Interval

The Interval class represent an interval in time.

class Interval:
    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __contains__(self, x):
        return self.start < x < self.end

>>> dr = Interval(date(2020, 1, 1), date(2020, 1, 31))

>>> date(2020,1,15) in dr
True
>>> date(1970,1,1) in dr
False

What if we want to make another type of interval, e.g. a interval of numbers \([1.0, 2.0]\)?

A number interval

class Interval:
    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __contains__(self, x):
        return self.start < x < self.end
    
>>> interval = Interval(5, 10)

>>> 8 in interval
True
>>> 12 in interval
False

As long as the start, end and x are comparable, the Interval class is a generic class able to handle integers, floats, dates, datetimes, strings …

Postel’s law

a.k.a. the Robustness principle of software design

  1. Be liberal in what you accept
  2. Be conservative in what you send
def process(number: Union[int,str,float]) -> int:
    # make sure number is an int from now on
    number = int(number)

    result = number * 2
    return result   

The consumers of your package (future self), will be grateful if you are not overly restricitive in what types you accept as input.

Example - Pydantic

from pydantic import BaseModel
from datetime import date

class Sensor(BaseModel):
    name: str
    voltage: float
    install_date: date
    location: tuple[float, float]

s1 = Sensor(name="Sensor 1",
            voltage=3.3,
            install_date=date(2020, 1, 1),
            location=(4.42, 51.22))

data = {
    "name": "Sensor 1",
    "voltage": "3.3",
    "install_date": "2020-01-01",
    "location": ("4.42", "51.22")
}

s2 = Sensor(**data)

Refactoring

  • Refactoring is a way to improve the design of existing code
  • Changing a software system in such a way that it does not alter the external behavior of the code, yet improves its internal structure
  • Refactoring is a way to make code more readable and maintainable
  • Housekeeping

Common refactoring techniques:

  • Extract method
  • Extract variable
  • Rename method
  • Rename variable
  • Rename class
  • Inline method
  • Inline variable
  • Inline class

Rename variable

Before

n = 0
for v in y:
    if v < 0:
        n = n + 1

After

FREEZING_POINT = 0.0
n_freezing_days = 0
for temp in daily_max_temperatures:
    if temp < FREEZING_POINT:
        n_freezing_days = n_freezing_days + 1 

Extract variable

Before

def predict(x):
    return min(0.0, 0.5 + 2.0 * min(0,x) + (random.random() - 0.5) / 10.0)

After

def predict(x):
    scale = 10.0
    error = (random.random() - 0.5) / scale)
    a = 0.5
    b = 2.0 
    draft = a + b * x + error
    return  min(0.0, draft)

Extract method

def error(scale):
    return (random.random() - 0.5) / scale)

def linear_model(x, *, a=0.0, b=1.0):
    return a + b * x

def clip(x, *, min_value=0.0):
    return min(min_value, x)

def predict(x): 
    draft = linear_model(x, a=0.5, b=2.0) + error(scale=10.0)
    return clip(draft, min_value=0.)

Inline method

Opposite of extract mehtod.

def predict(x): 
    draft = linear_model(x, a=0.5, b=2.0) + error(scale=10.0)
    return min(0.0, x)

Composed method

Break up a long method into smaller methods.

# get data
os.shutil.copyfile(thisfile, localfile)
df = read_csv(localfile)

# clean data
df.dropna()
df.drop_duplicates()
df[somevar<0.0] = 0.0

# transform data
df.date = pd.to_datetime(df.date) - 86400

# predict
predictions = df.height + df.weight * df.age

def get_data(filename,...):
    ...

def clean_data(df):
    ...

def transform_data(df):
    ...

def predict(df):
    ...

def main():
    df = get_data("raw_data.csv")
    cleaned_data = clean_data(df)
    final_data = transform_data(cleaned_data)
    predictions = predict(final_data)

Composed method

  • Divide your program into methods that perform one identifiable task
  • Keep all of the operations in a method at the same level of abstraction.
  • This will naturally result in programs with many small methods, each a few lines long.
  • When you use Extract method a bunch of times on a method the original method becomes a Composed method.

If you want to learn more about refactoring, I recommend the book “Refactoring: Improving the Design of Existing Code” by Martin Fowler.

Summary

  • OOP is a way to organize your code
  • Encapsulation, composition, inheritance, abstraction
  • Duck Typing
  • Postel’s law
  • Refactoring