Python Data classes using collections.namedtuple, typing.NamedTuple and latest @dataclass decorator.



Builders

The data class builders provide the following methods automatically:

  • __init__
  • __repr__
  • __eq__

Both collections.namedtuple and typing.NamedTuple build classes that are tuple subclasses. The @dataclass is a class decorator that does not affect the class hierarchy.

collections.namedtuple

The collections.namedtuple is the earliest data class. Example data class:

import collections
from dataclasses import dataclass

Cloth = collections.namedtuple('Cloth', ['color','size'])

class Shirts:
    sizes = 'S M L'.split()
    colors = 'Black White Blue Green Yellow'.split()

    def __init__(self) -> None:
        self._cloths = [Cloth(color, size) for color in self.colors for size in self.sizes]

    def __len__(self):
        return len(self._cloths)     

    def __getitem__(self, position):
        return self._cloths[position]

You can run the following, because in the above code implemented __len__ method.

shirts = Shirts()
len(shirts)

Now you can print all the available shirts:

for s in shirts:
    print(s)

The output is

Cloth(color='Black', size='S')
Cloth(color='Black', size='M')
Cloth(color='Black', size='L')
Cloth(color='White', size='S')
Cloth(color='White', size='M')
Cloth(color='White', size='L')
Cloth(color='Blue', size='S')
Cloth(color='Blue', size='M')
Cloth(color='Blue', size='L')
Cloth(color='Green', size='S')
Cloth(color='Green', size='M')
Cloth(color='Green', size='L')
Cloth(color='Yellow', size='S')
Cloth(color='Yellow', size='M')
Cloth(color='Yellow', size='L')

To sort the shirt from L,M and S order:

size_values = dict(S=3,M=2,L=1)
def high2low(cloth):
    color_value = Shirts.colors.index(cloth.color)
    return color_value * len(size_values) + size_values[cloth.size]

Print and compare with previous update:

for shirt in sorted(shirts, key=high2low):
    print(shirt)

The output is

Cloth(color='Black', size='L')
Cloth(color='Black', size='M')
Cloth(color='Black', size='S')
Cloth(color='White', size='L')
Cloth(color='White', size='M')
Cloth(color='White', size='S')
Cloth(color='Blue', size='L')
Cloth(color='Blue', size='M')
Cloth(color='Blue', size='S')
Cloth(color='Green', size='L')
Cloth(color='Green', size='M')
Cloth(color='Green', size='S')
Cloth(color='Yellow', size='L')
Cloth(color='Yellow', size='M')
Cloth(color='Yellow', size='S')

Another example found from Example 7-131 in the Fluent Python.

from collections import namedtuple

LatLon = namedtuple('LatLon', 'lat lon')  
Metropolis = namedtuple('Metropolis', 'name cc pop coord')  
metro_areas = [Metropolis(name, cc, pop, LatLon(lat, lon))  
    for name, cc, pop, (lat, lon) in metro_data]
metro_areas

the output is

[Metropolis(name='Tokyo', cc='JP', pop=36.933, coord=LatLon(lat=35.689722, lon=139.691667)),
 Metropolis(name='Delhi NCR', cc='IN', pop=21.935, coord=LatLon(lat=28.613889, lon=77.208889)),
...
...

to access

metro_areas[0].coord.lat
# output: 35.689722

more efficient to access using attrgetter in the operator package

from operator import attrgetter
name_lat = attrgetter('name', 'coord.lat')  

for city in metro_areas:  
    print(name_lat(city))

the ouput is

('Tokyo', 35.689722)
('Delhi NCR', 28.613889)
...

Convert to the json

import json
json.dumps(shirts[0]._asdict())

output is

'{"color": "Black", "size": "S"}'

typing.NamedTuple

Since Python 3.6, it made easy to add new methods and override methods by creating data classe extended from tuple.

from typing import NamedTuple
from dataclasses import field

class NTShirts(NamedTuple):
    sizes: list[str] = field(default_factory=list)
    colors: list[str] = field(default_factory=list)
    
nt = NTShirts('S M L'.split(),'Black White Blue Green Yellow'.split())
nt    

output is

NTShirts(sizes=['S', 'M', 'L'], colors=['Black', 'White', 'Blue', 'Green', 'Yellow'])

Data Class

Since Python 3.9, you can find the decorator which can be used to create data class:

from dataclasses import dataclass, field
from typing import ClassVar

@dataclass
class DCShirts:
    sizes: list[str] = field(default_factory=list)
    colors: list[str] = field(default_factory=list)
    inventory: ClassVar[int] = 99

dc = DCShirts('S M L'.split(),'Black White Blue Green Yellow'.split())
dc

This will generate the same output, but important point is ClassVar which hold the class variable, while sizes and colors are instance variables.

Both NamedTuple and docorator support PEP 526 – Syntax for Variable Annotations

  1. Fluent Python, 2nd Edition, Luciano Ramalho, Chapter 7