Object oriented design in Python
Object oriented design
Benefits of object oriented design:
- Encapsulation
- Code reuse (composition, inheritance)
- Abstraction
Encapsulation
Encapsulation - Attributes
Variables prefixed with an underscore (self._name
) is a convention to indicate that the instance variable is private.
class Location:
def __init__(self, name, longitude, latitude):
self._name = name.upper() # Names are always uppercase
...
@property
def name(self):
return self._name
@name.setter
def name(self, value):
self._name = value.upper()
>>> loc = Location("Antwerp", 4.42, 51.22)
>>> loc.name = "Antwerpen"
>>> loc.name
"ANTWERPEN" 😊
Composition
Composition in object oriented design is a way to combine objects or data types into more complex objects.
Composition - Example
class Grid:
def __init__(self, nx, dx, ny, dy):
self.nx = nx
self.dx = dx
self.ny = ny
self.dy = dy
def find_index(self, x,y):
...
class DataArray:
def __init__(self, data, time, item, geometry):
self.data = data
self.time = time
self.item = item
self.geometry = geometry
def plot(self):
...
. . .
DataArray
has a geometry
(e.g. Grid
) and an item
(ItemInfo
).
Inheritance
- Inheritance is a way to reuse code and specialize behavior.
- A child class inherits the attributes and methods from the parent class.
- A child class can override the methods of the parent class.
- A child class can add new methods.
Inheritance - Example
. . .
GeometryFM3D
inherits from _GeometryFMLayered
, it is a _GeometryFMLayered
.
Inheritance - Example (2)
class _GeometryFMLayered(_GeometryFM):
def __init__(self, nodes, elements, n_layers, n_sigma):
# call the parent class init method
super().__init__(
=nodes,
nodes=elements,
elements
)self._n_layers = n_layers
self._n_sigma = n_sigma
Composition vs inheritance
- Inheritance is often used to reuse code, but this is not the main purpose of inheritance.
- Inheritance is used to specialize behavior.
- In most cases, composition is a better choice than inheritance.
- Some recent programming languages (e.g. Go & Rust) do not support this style of inheritance.
- Use inheritance only when it makes sense.
Hillard, 2020, Ch. 8 “The rules (and exceptions) of inheritance”
Types
C#
int n = 2;
String s = "Hello";
public String RepeatedString(String s, int n) {
return Enumerable.Repeat(s, n).Aggregate((a, b) => a + b);
}
. . .
Python
Types
- Python is a dynamically typed language
- Types are not checked at compile time by the interpreter
- Types can be checked before runtime using a linter (e.g.
mypy
) - Type hints can be used by VS Code to provide auto-completion
. . .
int = 2
n: str = "Hello"
s:
def repeated_string(s:str, n:int) -> str:
return s * n
Abstraction
Version A
= 0.0
total for x in values:
= total +x total
Version B
= sum(values) total
. . .
- Using functions, e.g.
sum()
allows us to operate on a higher level of abstraction. - Too little abstraction will force you to write many lines of boiler-plate code
- Too much abstraction limits the flexibility
- ✨Find the right level of abstraction!✨
- Which version is easiest to understand?
- Which version is easiest to change?
Collections Abstract Base Classes
. . .
- If a class implements
__len__
it is aSized
object. - If a class implements
__contains__
it is aContainer
object. - If a class implements
__iter__
it is aIterable
object.
Collections Abstract Base Classes
Collections Abstract Base Classes
Pythonic
If you want your code to be Pythonic, you have to be familiar with these types and their methods.
Dundermethods:
__getitem__
__setitem__
__len__
__contains__
- …
class JavaLikeToolbox:
def __init__(self, tools: Collection[Tool]):
self.tools = tools
def getToolByName(self, name: str) -> Tool:
for tool in self.tools:
if tool.name == name:
return tool
def numberOfTools(self) -> int:
return len(self.tools)
>>> tb = JavaLikeToolbox([Hammer(), Screwdriver()])
>>> tb.getToolByName("hammer")
Hammer()>>> tb.numberOfTools()
2
class Toolbox:
def __init__(self, tools: Collection[Tool]):
self._tools = {tool.name: tool for tool in tools}
def __getitem__(self, name: str) -> Tool:
return self._tools[name]
def __len__(self) -> int:
return len(self.tools)
>>> tb = Toolbox([Hammer(), Screwdriver()])
>>> tb["hammer"]
Hammer()>>> len(tb)
2
You want your code to be feel like the built-in types.
class SparseMatrix:
def __init__(self, shape, fill_value=0.0, data=None):
self.shape = shape
self._data = data if data is not None else {}
self.fill_value = fill_value
def __setitem__(self, key, value):
i,j = key
self._data[i,j] = float(value)
def __getitem__(self, key) -> float:
i,j = key
return self._data.get((i,j), self.fill_value)
def transpose(self) -> "SparseMatrix":
data = {(j,i) : v for (i,j),v in self._data.items()}
return SparseMatrix(data=data,
shape=self.shape,
fill_value=self.fill_value)
def __repr__(self):
matrix_str = ""
for j in range(self.shape[1]):
for i in range(self.shape[0]):
value = self[i, j]
matrix_str += f"{value:<4}"
matrix_str += "\n"
return matrix_str
>>> m = SparseMatrix(shape=(2,2), fill_value=0.0)
>>> m
0.0 0.0
0.0 0.0
>>> m[0,1]
0.0
>>> m[0,1] = 1.0
>>> m[1,0] = 2.0
>>> m
0.0 2.0
1.0 0.0
>>> m.transpose()
0.0 1.0
2.0 0.0
Duck typing
- “If it walks like a duck and quacks like a duck, it’s a duck”
- From the perspective of the caller, it doesn’t matter if it is a rubber duck or a real duck.
- The type of the object is not important, as long as it has the right methods.
- Python is different than C# or Java, where you would have to create an interface
IToolbox
and implement it forToolbox
.
Duck typing - Example
An example is a Scikit learn transformers
fit
transform
fit_transform
If you want to make a transformer compatible with sklearn, you have to implement these methods.
Duck typing - Example
class PositiveNumberTransformer:
def fit(self, X, y=None):
# no need to fit (still need to have the method!)
return self
def transform(self, X):
return np.abs(X)
def fit_transform(self, X, y=None):
return self.fit(X, y).transform(X)
Duck typing - Mixins
We can inherit some behavior from sklearn.base.TransformerMixin
from sklearn.base import TransformerMixin
class RemoveOutliersTransformer(TransformerMixin):
def __init__(self, lower_bound, upper_bound):
self.lower_bound = lower_bound
self.upper_bound = upper_bound
self.lower_ = None
self.upper_ = None
def fit(self, X, y=None):
self.lower_ = np.quantile(X, self.lower_bound)
self.upper_ = np.quantile(X, self.upper_bound)
def transform(self, X):
return np.clip(X, self.lower_, self.upper_)
# def fit_transform(self, X, y=None):
# we get this for free, from TransformerMixin
Let’s revisit the (date) Interval
The Interval
class represent an interval in time.
class Interval:
def __init__(self, start, end):
self.start = start
self.end = end
def __contains__(self, x):
return self.start < x < self.end
>>> dr = Interval(date(2020, 1, 1), date(2020, 1, 31))
>>> date(2020,1,15) in dr
True
>>> date(1970,1,1) in dr
False
. . .
What if we want to make another type of interval, e.g. a interval of numbers \([1.0, 2.0]\)?
A number interval
class Interval:
def __init__(self, start, end):
self.start = start
self.end = end
def __contains__(self, x):
return self.start < x < self.end
>>> interval = Interval(5, 10)
>>> 8 in interval
True
>>> 12 in interval
False
. . .
As long as the start
, end
and x
are comparable, the Interval
class is a generic class able to handle integers, floats, dates, datetimes, strings …
Postel’s law
a.k.a. the Robustness principle of software design
- Be liberal in what you accept
- Be conservative in what you send
. . .
def process(number: Union[int,str,float]) -> int:
# make sure number is an int from now on
= int(number)
number
= number * 2
result return result
. . .
The consumers of your package (future self), will be grateful if you are not overly restricitive in what types you accept as input.
Example - Pydantic
from pydantic import BaseModel
from datetime import date
class Sensor(BaseModel):
str
name: float
voltage:
install_date: datetuple[float, float]
location:
= Sensor(name="Sensor 1",
s1 =3.3,
voltage=date(2020, 1, 1),
install_date=(4.42, 51.22))
location
= {
data "name": "Sensor 1",
"voltage": "3.3",
"install_date": "2020-01-01",
"location": ("4.42", "51.22")
}
= Sensor(**data) s2
Refactoring
- Refactoring is a way to improve the design of existing code
- Changing a software system in such a way that it does not alter the external behavior of the code, yet improves its internal structure
- Refactoring is a way to make code more readable and maintainable
- Housekeeping
Common refactoring techniques:
- Extract method
- Extract variable
- Rename method
- Rename variable
- Rename class
- Inline method
- Inline variable
- Inline class
Rename variable
Before
= 0
n for v in y:
if v < 0:
= n + 1 n
. . .
After
= 0.0
FREEZING_POINT = 0
n_freezing_days for temp in daily_max_temperatures:
if temp < FREEZING_POINT:
= n_freezing_days + 1 n_freezing_days
Extract variable
Before
def predict(x):
return min(0.0, 0.5 + 2.0 * min(0,x) + (random.random() - 0.5) / 10.0)
. . .
After
def predict(x):
= 10.0
scale = (random.random() - 0.5) / scale)
error = 0.5
a = 2.0
b = a + b * x + error
draft return min(0.0, draft)
Extract method
def error(scale):
return (random.random() - 0.5) / scale)
def linear_model(x, *, a=0.0, b=1.0):
return a + b * x
def clip(x, *, min_value=0.0):
return min(min_value, x)
def predict(x):
= linear_model(x, a=0.5, b=2.0) + error(scale=10.0)
draft return clip(draft, min_value=0.)
Inline method
Opposite of extract mehtod.
Composed method
Break up a long method into smaller methods.
# get data
os.shutil.copyfile(thisfile, localfile)= read_csv(localfile)
df
# clean data
df.dropna()
df.drop_duplicates()<0.0] = 0.0
df[somevar
# transform data
= pd.to_datetime(df.date) - 86400
df.date
# predict
= df.height + df.weight * df.age predictions
def get_data(filename,...):
...
def clean_data(df):
...
def transform_data(df):
...
def predict(df):
...
def main():
= get_data("raw_data.csv")
df = clean_data(df)
cleaned_data = transform_data(cleaned_data)
final_data = predict(final_data) predictions
Composed method
- Divide your program into methods that perform one identifiable task
- Keep all of the operations in a method at the same level of abstraction.
- This will naturally result in programs with many small methods, each a few lines long.
- When you use Extract method a bunch of times on a method the original method becomes a Composed method.
If you want to learn more about refactoring, I recommend the book “Refactoring: Improving the Design of Existing Code” by Martin Fowler.
Summary
- OOP is a way to organize your code
- Encapsulation, composition, inheritance, abstraction
- Duck Typing
- Postel’s law
- Refactoring