What are "named tuples" in Python? - python

What are named tuples and how do I use them?
When should I use named tuples instead of normal tuples, or vice versa?
Are there "named lists" too? (i.e. mutable named tuples)
For the last question specifically, see also Existence of mutable named tuple in Python?.

Named tuples are basically easy-to-create, lightweight object types. Named tuple instances can be referenced using object-like variable dereferencing or the standard tuple syntax. They can be used similarly to struct or other common record types, except that they are immutable. They were added in Python 2.6 and Python 3.0, although there is a recipe for implementation in Python 2.4.
For example, it is common to represent a point as a tuple (x, y). This leads to code like the following:
pt1 = (1.0, 5.0)
pt2 = (2.5, 1.5)
from math import sqrt
line_length = sqrt((pt1[0]-pt2[0])**2 + (pt1[1]-pt2[1])**2)
Using a named tuple it becomes more readable:
from collections import namedtuple
Point = namedtuple('Point', 'x y')
pt1 = Point(1.0, 5.0)
pt2 = Point(2.5, 1.5)
from math import sqrt
line_length = sqrt((pt1.x-pt2.x)**2 + (pt1.y-pt2.y)**2)
However, named tuples are still backwards compatible with normal tuples, so the following will still work:
Point = namedtuple('Point', 'x y')
pt1 = Point(1.0, 5.0)
pt2 = Point(2.5, 1.5)
from math import sqrt
# use index referencing
line_length = sqrt((pt1[0]-pt2[0])**2 + (pt1[1]-pt2[1])**2)
# use tuple unpacking
x1, y1 = pt1
Thus, you should use named tuples instead of tuples anywhere you think object notation will make your code more pythonic and more easily readable. I personally have started using them to represent very simple value types, particularly when passing them as parameters to functions. It makes the functions more readable, without seeing the context of the tuple packing.
Furthermore, you can use them to replace ordinary immutable classes that have only fields and no methods. You can even use your named tuple types as base classes:
class Point(namedtuple('Point', 'x y')):
    [...]
However, as with tuples, attributes in named tuples are immutable:
>>> Point = namedtuple('Point', 'x y')
>>> pt1 = Point(1.0, 5.0)
>>> pt1.x = 2.0
AttributeError: can't set attribute
If you want to be able to change the values, you need another type. There is a handy recipe for mutable recordtypes, which allow you to set new values on attributes.
>>> from rcdtype import *
>>> Point = recordtype('Point', 'x y')
>>> pt1 = Point(1.0, 5.0)
>>> pt1.x = 2.0
>>> print(pt1[0])
2.0
I am not aware of any form of "named list" that lets you add new fields, however. You may just want to use a dictionary in this situation. Named tuples can be converted to dictionaries using pt1._asdict() which returns {'x': 1.0, 'y': 5.0} and can be operated upon with all the usual dictionary functions.
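As a small sketch of that round trip (the Point type mirrors the example above; the variable names as_dict and pt2 are only illustrative), a named tuple can be turned into a dictionary, changed there, and turned back into a new named tuple:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
pt1 = Point(1.0, 5.0)

# _asdict() maps field names to values
# (a plain dict on Python 3.8+, an OrderedDict on some older versions).
as_dict = dict(pt1._asdict())
as_dict['x'] = 2.0          # dictionaries are mutable

# Unpack the dictionary back into a new, immutable Point.
pt2 = Point(**as_dict)
print(pt2)                  # Point(x=2.0, y=5.0)
```

The original pt1 is untouched; only the dictionary copy and the newly built pt2 carry the changed value.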
As already noted, you should check the documentation, from which these examples were constructed, for more information.

What are named tuples?
A named tuple is a tuple.
It does everything a tuple can.
But it's more than just a tuple.
It's a specific subclass of a tuple that is programmatically created to your specification, with named fields and a fixed length.
This, for example, creates a subclass of tuple, and aside from being of fixed length (in this case, three), it can be used everywhere a tuple is used without breaking. This is known as Liskov substitutability.
New in Python 3.6, we can use a class definition with typing.NamedTuple to create a namedtuple:
from typing import NamedTuple
class ANamedTuple(NamedTuple):
    """a docstring"""
    foo: int
    bar: str
    baz: list
The above is the same as collections.namedtuple, except the above additionally has type annotations and a docstring. The below is available in Python 2.6+:
>>> from collections import namedtuple
>>> class_name = 'ANamedTuple'
>>> fields = 'foo bar baz'
>>> ANamedTuple = namedtuple(class_name, fields)
This instantiates it:
>>> ant = ANamedTuple(1, 'bar', [])
We can inspect it and use its attributes:
>>> ant
ANamedTuple(foo=1, bar='bar', baz=[])
>>> ant.foo
1
>>> ant.bar
'bar'
>>> ant.baz.append('anything')
>>> ant.baz
['anything']
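To make the "it is still a tuple" point concrete, here is a minimal sketch that reuses the ANamedTuple definition from above. The wrong_types instance is an illustrative assumption, added only to show that the annotations are recorded but not checked at runtime:

```python
from typing import NamedTuple, get_type_hints

class ANamedTuple(NamedTuple):
    """a docstring"""
    foo: int
    bar: str
    baz: list

ant = ANamedTuple(1, 'bar', [])

# The generated class is an ordinary tuple subclass.
print(isinstance(ant, tuple))            # True
print(ANamedTuple._fields)               # ('foo', 'bar', 'baz')

# The annotations are stored on the class...
print(get_type_hints(ANamedTuple)['foo'])    # <class 'int'>

# ...but nothing enforces them when an instance is created.
wrong_types = ANamedTuple('not an int', 2, None)
print(wrong_types.foo)                   # not an int
```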
Deeper explanation
To understand named tuples, you first need to know what a tuple is. A tuple is essentially an immutable (can't be changed in-place in memory) list.
Here's how you might use a regular tuple:
>>> student_tuple = 'Lisa', 'Simpson', 'A'
>>> student_tuple
('Lisa', 'Simpson', 'A')
>>> student_tuple[0]
'Lisa'
>>> student_tuple[1]
'Simpson'
>>> student_tuple[2]
'A'
You can expand a tuple with iterable unpacking:
>>> first, last, grade = student_tuple
>>> first
'Lisa'
>>> last
'Simpson'
>>> grade
'A'
Named tuples are tuples that allow their elements to be accessed by name instead of just index!
You make a namedtuple like this:
>>> from collections import namedtuple
>>> Student = namedtuple('Student', ['first', 'last', 'grade'])
You can also use a single string with the names separated by spaces, a slightly more readable use of the API:
>>> Student = namedtuple('Student', 'first last grade')
How to use them?
You can do everything tuples can do (see above) as well as do the following:
>>> named_student_tuple = Student('Lisa', 'Simpson', 'A')
>>> named_student_tuple.first
'Lisa'
>>> named_student_tuple.last
'Simpson'
>>> named_student_tuple.grade
'A'
>>> named_student_tuple._asdict()  # returns a plain dict on Python 3.8+
OrderedDict([('first', 'Lisa'), ('last', 'Simpson'), ('grade', 'A')])
>>> vars(named_student_tuple)  # older versions only; recent Python 3 raises TypeError
OrderedDict([('first', 'Lisa'), ('last', 'Simpson'), ('grade', 'A')])
>>> new_named_student_tuple = named_student_tuple._replace(first='Bart', grade='C')
>>> new_named_student_tuple
Student(first='Bart', last='Simpson', grade='C')
A commenter asked:
In a large script or programme, where does one usually define a named tuple?
The types you create with namedtuple are basically classes you can create with easy shorthand. Treat them like classes. Define them on the module level, so that pickle and other users can find them.
The working example, on the global module level:
>>> from collections import namedtuple
>>> NT = namedtuple('NT', 'foo bar')
>>> nt = NT('foo', 'bar')
>>> import pickle
>>> pickle.loads(pickle.dumps(nt))
NT(foo='foo', bar='bar')
And this demonstrates the failure to lookup the definition:
>>> def foo():
...     LocalNT = namedtuple('LocalNT', 'foo bar')
...     return LocalNT('foo', 'bar')
...
>>> pickle.loads(pickle.dumps(foo()))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <class '__main__.LocalNT'>: attribute lookup LocalNT on __main__ failed
Why/when should I use named tuples instead of normal tuples?
Use them when it improves your code to have the semantics of tuple elements expressed in your code.
You can use them instead of an object if you would otherwise use an object with unchanging data attributes and no functionality.
You can also subclass them to add functionality, for example:
class Point(namedtuple('Point', 'x y')):
    """adding functionality to a named tuple"""
    __slots__ = ()
    @property
    def hypot(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5
    def __str__(self):
        return 'Point: x=%6.3f y=%6.3f hypot=%6.3f' % (self.x, self.y, self.hypot)
Why/when should I use normal tuples instead of named tuples?
It would probably be a regression to switch from using named tuples to tuples. The upfront design decision centers around whether the cost from the extra code involved is worth the improved readability when the tuple is used.
There is no extra memory used by named tuples versus tuples.
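A quick way to check this claim yourself is sketched below. It assumes CPython, since sys.getsizeof reports implementation-specific sizes; the Point type is only an illustration:

```python
import sys
from collections import namedtuple

Point = namedtuple('Point', 'x y')

plain = (1.0, 5.0)
named = Point(1.0, 5.0)

# Generated classes set __slots__ = (), so instances carry no __dict__.
print(hasattr(named, '__dict__'))

# On CPython the two objects therefore report the same size.
print(sys.getsizeof(plain) == sys.getsizeof(named))
```

The field names live on the class, not on each instance, which is why the per-instance cost matches a plain tuple.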
Is there any kind of "named list" (a mutable version of the named tuple)?
You're looking for either a slotted object that implements all of the functionality of a statically sized list or a subclassed list that works like a named tuple (and that somehow blocks the list from changing in size.)
A now expanded, and perhaps even Liskov substitutable, example of the first:
from collections.abc import Sequence  # "from collections import Sequence" no longer works on Python 3.10+
class MutableTuple(Sequence):
    """Abstract Base Class for objects that work like mutable
    namedtuples. Subclass and define your named fields with
    __slots__ and away you go.
    """
    __slots__ = ()
    def __init__(self, *args):
        for slot, arg in zip(self.__slots__, args):
            setattr(self, slot, arg)
    def __repr__(self):
        return type(self).__name__ + repr(tuple(self))
    # more direct __iter__ than Sequence's
    def __iter__(self):
        for name in self.__slots__:
            yield getattr(self, name)
    # Sequence requires __getitem__ & __len__:
    def __getitem__(self, index):
        return getattr(self, self.__slots__[index])
    def __len__(self):
        return len(self.__slots__)
And to use, just subclass and define __slots__:
class Student(MutableTuple):
    __slots__ = 'first', 'last', 'grade' # customize
>>> student = Student('Lisa', 'Simpson', 'A')
>>> student
Student('Lisa', 'Simpson', 'A')
>>> first, last, grade = student
>>> first
'Lisa'
>>> last
'Simpson'
>>> grade
'A'
>>> student[0]
'Lisa'
>>> student[2]
'A'
>>> len(student)
3
>>> 'Lisa' in student
True
>>> 'Bart' in student
False
>>> student.first = 'Bart'
>>> for i in student: print(i)
...
Bart
Simpson
A

namedtuple is a factory function for making a tuple class. With that class we can create tuples whose elements are also accessible by name.
import collections
# Create a namedtuple class with names "a" "b" "c"
Row = collections.namedtuple("Row", ["a", "b", "c"])
row = Row(a=1, b=2, c=3)  # Make a namedtuple from the Row class we created
print(row)  # Prints: Row(a=1, b=2, c=3)
print(row.a)  # Prints: 1
print(row[0])  # Prints: 1
row = Row._make([2, 3, 4])  # Make a namedtuple from a list of values
print(row)  # Prints: Row(a=2, b=3, c=4)

namedtuples are a great feature; they are a perfect container for data. When you have to "store" data you would use tuples or dictionaries, like:
user = dict(name="John", age=20)
or:
user = ("John", 20)
The dictionary approach is heavyweight, since dicts are mutable and slower than tuples. On the other hand, tuples are immutable and lightweight but lack readability when the data has many fields.
namedtuples are the perfect compromise between the two approaches: they have great readability, are lightweight, and are immutable (plus they are polymorphic!).

named tuples allow backward compatibility with code that checks for the version like this
>>> import sys
>>> sys.version_info[0:2]
(3, 1)
while allowing future code to be more explicit by using this syntax
>>> sys.version_info.major
3
>>> sys.version_info.minor
1
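A short sketch of both styles side by side is shown below. Note that sys.version_info is a built-in named-tuple-like structure rather than a collections.namedtuple, but it supports the same index and attribute access; the (3, 0) threshold is only an example:

```python
import sys

# Old style: positional access still works because version_info is a tuple.
major, minor = sys.version_info[0:2]

# New style: the same values by name.
assert major == sys.version_info.major
assert minor == sys.version_info.minor

# Tuple comparison works on the whole object as well.
if sys.version_info >= (3, 0):
    print("Running Python 3 or newer")
```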

namedtuple is one of the easiest ways to clean up your code and make it more readable. It self-documents what is happening in the tuple. Namedtuple instances are just as memory efficient as regular tuples because they do not have per-instance dictionaries, which also makes them faster than dictionaries.
from collections import namedtuple
Color = namedtuple('Color', ['hue', 'saturation', 'luminosity'])
p = Color(170, 0.1, 0.6)
if p.saturation >= 0.5:
    print("Whew, that is bright!")
if p.luminosity >= 0.5:
    print("Wow, that is light")
Without naming each element in the tuple, it would read like this:
p = (170, 0.1, 0.6)
if p[1] >= 0.5:
    print("Whew, that is bright!")
if p[2] >= 0.5:
    print("Wow, that is light")
It is so much harder to understand what is going on in the second example. With a namedtuple, each field has a name, and you access it by name rather than by position or index. Instead of p[1], we can call it p.saturation. It's easier to understand, and it looks cleaner.
Creating an instance of the namedtuple is easier than creating a dictionary.
# dictionary
>>> p = dict(hue=170, saturation=0.1, luminosity=0.6)
>>> p['hue']
170
# namedtuple
>>> from collections import namedtuple
>>> Color = namedtuple('Color', ['hue', 'saturation', 'luminosity'])
>>> p = Color(170, 0.1, 0.6)
>>> p.hue
170
When might you use namedtuple
As just stated, the namedtuple makes understanding tuples much
easier. So if you need to reference the items in the tuple, then
creating them as namedtuples just makes sense.
Besides being more lightweight than a dictionary, namedtuple also
keeps the order of its fields (dictionaries only guarantee insertion
order from Python 3.7 onwards).
As in the example above, it is simpler to create an instance of
namedtuple than dictionary. And referencing the item in the named
tuple looks cleaner than a dictionary. p.hue rather than
p['hue'].
The syntax
collections.namedtuple(typename, field_names[, verbose=False][, rename=False])
namedtuple is in the collections library.
typename: This is the name of the new tuple subclass.
field_names: A sequence of names for each field. It can be a sequence
such as a list ['x', 'y', 'z'], or a single string with the names
separated by whitespace ('x y z') or by commas ('x, y, z').
rename: If rename is True, invalid fieldnames are automatically
replaced with positional names. For example, ['abc', 'def', 'ghi', 'abc']
is converted to ['abc', '_1', 'ghi', '_3'], eliminating the
keyword 'def' (since that is a reserved word for defining functions)
and the duplicate fieldname 'abc'.
verbose: If verbose is True, the class definition is printed just
before being built. (This parameter was removed in Python 3.7.)
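The rename behaviour described above can be checked with a few lines. This is a minimal sketch; the type name Row is just an illustration:

```python
from collections import namedtuple

# 'def' is a keyword and the second 'abc' is a duplicate, so both are
# replaced with positional names when rename=True.
Row = namedtuple('Row', ['abc', 'def', 'ghi', 'abc'], rename=True)
print(Row._fields)   # ('abc', '_1', 'ghi', '_3')

# Without rename=True the same field list is rejected with ValueError.
try:
    namedtuple('Row', ['abc', 'def', 'ghi', 'abc'])
except ValueError as exc:
    print('rejected:', exc)
```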
You can still access namedtuples by their position, if you so choose. p[1] == p.saturation. It still unpacks like a regular tuple.
Methods
All the regular tuple methods are supported. Ex: min(), max(), len(), in, not in, concatenation (+), index, slice, etc. And there are a few additional ones for namedtuple. Note: these all start with an underscore. _replace, _make, _asdict.
_replace
Returns a new instance of the named tuple replacing specified fields with new values.
The syntax
somenamedtuple._replace(**kwargs)
Example
>>> from collections import namedtuple
>>> Color = namedtuple('Color', ['hue', 'saturation', 'luminosity'])
>>> p = Color(170, 0.1, 0.6)
>>> p._replace(hue=87)
Color(hue=87, saturation=0.1, luminosity=0.6)
>>> p._replace(hue=87, saturation=0.2)
Color(hue=87, saturation=0.2, luminosity=0.6)
Notice: The field names are not in quotes; they are keywords here.
Remember: Tuples are immutable - even if they are namedtuples and have the _replace method. The _replace produces a new instance; it does not modify the original or replace the old value. You can of course save the new result to the variable. p = p._replace(hue=169)
_make
Makes a new instance from an existing sequence or iterable.
The syntax
somenamedtuple._make(iterable)
Example
>>> data = (170, 0.1, 0.6)
>>> Color._make(data)
Color(hue=170, saturation=0.1, luminosity=0.6)
>>> Color._make([170, 0.1, 0.6])  # the list is an iterable
Color(hue=170, saturation=0.1, luminosity=0.6)
>>> Color._make((170, 0.1, 0.6))  # the tuple is an iterable
Color(hue=170, saturation=0.1, luminosity=0.6)
>>> Color._make(170, 0.1, 0.6)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 15, in _make
TypeError: 'float' object is not callable
What happened with the last one? The item inside the parentheses should be a single iterable. So a list or tuple inside the parentheses works, but passing the values as separate arguments raises an error (the exact error message varies between Python versions).
_asdict
Returns a new OrderedDict (a regular dict on Python 3.8 and later) which maps field names to their corresponding values.
The syntax
somenamedtuple._asdict()
Example
>>> p._asdict()
OrderedDict([('hue', 169), ('saturation', 0.1), ('luminosity', 0.6)])
Reference: https://www.reddit.com/r/Python/comments/38ee9d/intro_to_namedtuple/
There is also named list which is similar to named tuple but mutable
https://pypi.python.org/pypi/namedlist

from collections import namedtuple
They subclass tuple, and add a layer to assign property names to the positional elements
'namedtuple' is a function that generates a new class that inherits from "tuple" but also provides "named properties" to access elements of the tuple.
Generating Named Tuple Classes
"namedtuple" is a class factory. It needs a few things to generate the class
the class name
A sequence of field names we want to assign, in the order of elements in the tuple. Field names can be any valid variable names except that they cannot start with an "underscore".
The return value of the call to "namedtuple" will be a class. We need to assign that class to a variable name in our code so we can use it to construct instances. In general, we use the same name as the name of the class that was generated.
# Coords is a class
Coords = namedtuple('Coords', ['x', 'y'])
Now we can create instances of Coords class:
pt=Coords(10,20)
There are many ways we can provide the list of field names to the namedtuple function.
a list of strings
namedtuple('Coords',['x','y'])
a tuple of strings
namedtuple('Coords',('x','y'))
a single string with the field names separated by whitespace or commas
namedtuple('Coords','x, y')
Instantiating Named Tuples
After we have created a named tuple class, we can instantiate them just like an ordinary class. In fact, the __new__ method of the generated class uses the field names we provided as param names.
Coords = namedtuple('Coords', ['x', 'y'])
coord=Coords(10,20)
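Because the field names become the parameter names of the constructor, instances can also be built with keyword arguments. A minimal sketch (the variable names a and b are illustrative):

```python
from collections import namedtuple

Coords = namedtuple('Coords', ['x', 'y'])

# Field names double as keyword parameters of the constructor.
a = Coords(x=10, y=20)
b = Coords(y=20, x=10)      # keyword order does not matter
print(a == b)               # True

# A missing field is rejected, as with any function signature.
try:
    Coords(x=10)
except TypeError as exc:
    print('TypeError:', exc)
```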
Accessing Data in named tuple:
Since named tuples inherit from tuple, we can still handle them just like any other tuple: by index, slicing, or iteration.
Coords = namedtuple('Coords', ['x', 'y'])
coord = Coords(10, 20)
isinstance(coord, tuple)  # True: a namedtuple class is a subclass of tuple
x, y = coord  # unpacking
x = coord[0]  # by index
for e in coord:
    print(e)
Now we can also access the data using the field names, just as we do with ordinary classes.
coord.x  # 10
coord.y  # 20
Since the classes generated by namedtuple inherit from tuple, the generated class is roughly equivalent to writing:
class Coord(tuple):
    ....
"coord" is a tuple and is therefore immutable.
"rename" keyword arg for namedtuple
Field names cannot start with an underscore
Coords = namedtuple('Coords', ['x', '_y']) # does not work
namedtuple has a keyword-only argument, rename (defaults to False) that will automatically rename any invalid field name.
Coords = namedtuple('Coords', ['x', '_y'], rename=True)
The field name "x" won't change, but "_y" will change to _1, where 1 is the index of the field name.
Imagine a scenario where you are updating your application and want to use namedtuple to store its users. You extract the column names from your data source, but some of them are invalid as named tuple field names and would raise an exception. In this case, you use rename=True.
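A sketch of that scenario follows. The column names and rows are made up for illustration; in practice they might come from a database cursor or a CSV header:

```python
from collections import namedtuple

# Hypothetical column names and rows, standing in for real query results.
columns = ['id', 'class', '_internal', 'name']
rows = [(1, 'admin', 'x', 'Ann'), (2, 'guest', 'y', 'Bob')]

# 'class' is a keyword and '_internal' starts with an underscore;
# rename=True swaps each for '_<index>' instead of raising ValueError.
User = namedtuple('User', columns, rename=True)
print(User._fields)          # ('id', '_1', '_2', 'name')

users = [User._make(row) for row in rows]
print(users[0].name)         # Ann
print(users[1][1])           # guest (renamed fields stay reachable by index)
```

The renamed fields lose their descriptive names, so this is mostly useful when only a few columns are invalid and the rest are accessed by name.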
Extracting Named Tuple values into a dictionary
Coords = namedtuple('Coords', ['x', 'y'])
coord=Coords(10,20)
coord._asdict()
{'x': 10, 'y': 20}
Why do we use namedtuple
If you have this class:
class Stock:
    def __init__(self, symbol, year, month, day, open, high, low, close):
        self.symbol = symbol
        self.year = year
        self.month = month
        self.day = day
        self.open = open
        self.high = high
        self.low = low
        self.close = close
Class approach            Tuple approach
stock.symbol              stock[0]
stock.open                stock[4]
stock.close               stock[7]
stock.high - stock.low    stock[5] - stock[6]
As you see, the tuple approach is not readable. The namedtuple function in collections allows us to create a tuple that also has names attached to each field or property. This can be handy to reference data in the tuple structure by "name" instead of just relying on position. But keep in mind that tuples are immutable, so if you want mutability, stick to a class.
Since a namedtuple is iterable, you can use functions that work on iterables. For example, if "coords" is an instance of a plain class, you cannot ask for its largest coordinate with max(); with a named tuple, you can.
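The difference can be sketched as follows. CoordsClass is a hypothetical plain class introduced only for the comparison:

```python
from collections import namedtuple

class CoordsClass:
    """Plain class: attributes only, not iterable."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

Coords = namedtuple('Coords', ['x', 'y'])

coord = Coords(10, 20)
print(max(coord))        # 20
print(sum(coord))        # 30

try:
    max(CoordsClass(10, 20))
except TypeError as exc:
    print('TypeError:', exc)   # the plain instance is not iterable
```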

What is namedtuple ?
As the name suggests, a namedtuple is a tuple with names. In a standard tuple, we access the elements by index, whereas a namedtuple lets the user define a name for each element. This is very handy, especially when processing CSV (comma-separated values) files and working with large, complex datasets, where code full of indices becomes messy (and not very pythonic).
How to use them ?
>>> from collections import namedtuple
>>> saleRecord = namedtuple('saleRecord', 'shopId saleDate salesAmount totalCustomers')
>>> # Assign values to a named tuple
>>> shop11 = saleRecord(11, '2015-01-01', 2300, 150)
>>> shop12 = saleRecord(shopId=12, saleDate="2015-01-01", salesAmount=1512, totalCustomers=125)
Reading
>>> # Reading as a namedtuple
>>> print("Shop Id =", shop12.shopId)
Shop Id = 12
>>> print("Sale Date =", shop12.saleDate)
Sale Date = 2015-01-01
>>> print("Sales Amount =", shop12.salesAmount)
Sales Amount = 1512
>>> print("Total Customers =", shop12.totalCustomers)
Total Customers = 125
Interesting Scenario in CSV Processing :
import csv
from collections import namedtuple
saleRecord = namedtuple('saleRecord', 'shopId saleDate totalSales totalCustomers')
overAllSales = 0.0
# Assumes salesRecord.csv has exactly four columns and no header row.
with open("salesRecord.csv", "r", newline="") as fileHandle:
    for fieldsList in csv.reader(fileHandle):
        shopRec = saleRecord._make(fieldsList)
        # csv yields strings, so convert before summing
        overAllSales += float(shopRec.totalSales)
print("Total Sales of The Retail Chain =", overAllSales)

Python has a useful container called the named tuple. It can be used to define a class, and it keeps all the features of an ordinary tuple.
A named tuple generates a simple class from a built-in class template. This can make a lot of code more readable, and it is also very convenient when you just need to define a simple class.

I think it's worth adding information about NamedTuples using type hinting:
# dependencies
from typing import NamedTuple, Optional
# definition
class MyNamedTuple(NamedTuple):
    an_attribute: str
    my_attribute: Optional[str] = None
    next_attribute: int = 1
# instantiation
my_named_tuple = MyNamedTuple("abc", "def")
# or more explicitly:
other_tuple = MyNamedTuple(an_attribute="abc", my_attribute="def")
# access
assert "abc" == my_named_tuple.an_attribute
assert 1 == other_tuple.next_attribute

Another way (a new way) to use named tuple is using NamedTuple from typing package: Type hints in namedtuple
Let's use the example of the top answer in this post to see how to use it.
(1) Before using the named tuple, the code is like this:
pt1 = (1.0, 5.0)
pt2 = (2.5, 1.5)
from math import sqrt
line_length = sqrt((pt1[0] - pt2[0])**2 + (pt1[1] - pt2[1])**2)
print(line_length)
(2) Now we use the named tuple
from typing import NamedTuple
Inherit from the NamedTuple class and define the field names in the new class. Here, test is the name of the class.
class test(NamedTuple):
    x: float
    y: float
Create instances of the class and assign values to them:
pt1 = test(1.0, 5.0) # x is 1.0, and y is 5.0. The order matters
pt2 = test(2.5, 1.5)
use the variables from the instances to calculate
line_length = sqrt((pt1.x - pt2.x)**2 + (pt1.y - pt2.y)**2)
print(line_length)

Try this:
collections.namedtuple()
Basically, namedtuples are easy to create, lightweight object types.
They turn tuples into convenient containers for simple tasks.
With namedtuples, you don’t have to use integer indices for accessing members of a tuple.
Examples:
Code 1:
>>> from collections import namedtuple
>>> Point = namedtuple('Point','x,y')
>>> pt1 = Point(1,2)
>>> pt2 = Point(3,4)
>>> dot_product = ( pt1.x * pt2.x ) +( pt1.y * pt2.y )
>>> print(dot_product)
11
Code 2:
>>> from collections import namedtuple
>>> Car = namedtuple('Car','Price Mileage Colour Class')
>>> xyz = Car(Price = 100000, Mileage = 30, Colour = 'Cyan', Class = 'Y')
>>> print(xyz)
Car(Price=100000, Mileage=30, Colour='Cyan', Class='Y')
>>> print(xyz.Class)
Y

Everyone else has already answered it, but I think I still have something else to add.
A namedtuple can be thought of as a shortcut for defining a class.
Here is the conventional, more cumbersome way to define a class:
class Duck:
    def __init__(self, color, weight):
        self.color = color
        self.weight = weight
red_duck = Duck('red', '10')
In [50]: red_duck
Out[50]: <__main__.Duck at 0x1068e4e10>
In [51]: red_duck.color
Out[51]: 'red'
As for namedtuple
from collections import namedtuple
Duck = namedtuple('Duck', ['color', 'weight'])
red_duck = Duck('red', '10')
In [54]: red_duck
Out[54]: Duck(color='red', weight='10')
In [55]: red_duck.color
Out[55]: 'red'

Related

converting dict to typing.NamedTuple in python

First of all, I am not sure whether this is the right question to ask, but I haven't yet been able to find out why there are two different types of named tuple...
I have read the "What's the difference between namedtuple and NamedTuple?" page.
However, I still don't understand how to convert a dictionary to a NamedTuple.
I have tried this code :
from collections import namedtuple
def convert(dictionary):
    return namedtuple('GenericDict', dictionary.keys())(**dictionary)
however, this piece of code only converts the dict to a namedtuple from the collection module.
I was wondering if anyone can help me out on this.
How should I write a function that transforms an arbitrary dict into a typing.NamedTuple?
Assume we have a NamedTuple class like this:
class settingdefault(NamedTuple):
    epoch: int = 8
    train_size: float = 0.8
    b: str = "doe"
We just want to take a dict as input from the user and transform it into a NamedTuple, so that any missing element is filled in from the defaults of the settingdefault class.
and lets assume that the example dict is :
config = dict(e=10, b="test")
BTW, I want it to be a function. Otherwise, I know how to do it like this:
setting = settingdefault(config['a'], config['b'])
I want it to work for cases where I don't know the keys of the incoming config dict, as they can be anything.
Once again, for clarification: my question is about typing.NamedTuple, not collections.namedtuple.
In this case, it's probably easiest to use typing's own metaclass for NamedTuple to create the class. The tricky part is that typing.NamedTuples need to have types associated for the fields.
Your conversion functions will end up looking something like this:
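import typing
def convert(class_specs):
    field_types = {field: type(value) for field, value in class_specs.items()}
    # NamedTupleMeta is an undocumented internal of typing; this call worked
    # when the answer was written but may fail on newer Python versions.
    return typing.NamedTupleMeta(
        'GenericDict', [], dict(class_specs, __annotations__=field_types))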
and like your example you can use it as follows:
>>> Coordinates = convert({'x': 1, 'y': 23})
>>> Coordinates()
GenericDict(x=1, y=23)
>>> Coordinates(0, 0)
GenericDict(x=0, y=0)
Getting rid of the defaults
If you don't want to use the values from the dictionary as defaults, you can simply directly use the NamedTuple constructor like so:
def convert(class_specs):
    return typing.NamedTuple('GenericDict',
        [(field, type(value)) for field, value in class_specs.items()])
>>> Coordinates = convert({'x': 42, 'y': 21})
>>> Coordinates()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __new__() missing 2 required positional arguments: 'x' and 'y'
>>> Coordinates(x=1, y=2)
GenericDict(x=1, y=2)
Getting rid of defaults and type information
If you don't want the defaults or typing information, you can simply switch it out for the generic typing.Any using:
def convert(class_specs):
    return typing.NamedTuple('GenericDict',
        [(field, typing.Any) for field in class_specs.keys()])
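For the case in the question, where a NamedTuple class with defaults already exists and only some keys of the incoming dict are relevant, a simpler route may be enough. The sketch below is an assumption about what the asker wants, not part of the answer above: the helper name from_dict is hypothetical, and the class is renamed SettingDefault for readability. It keeps only the keys that match the class's fields and lets the declared defaults fill in the rest:

```python
from typing import NamedTuple

class SettingDefault(NamedTuple):   # mirrors the question's settingdefault
    epoch: int = 8
    train_size: float = 0.8
    b: str = "doe"

def from_dict(cls, config):
    """Build cls from config, ignoring unknown keys and keeping defaults."""
    known = {key: value for key, value in config.items() if key in cls._fields}
    return cls(**known)

config = dict(e=10, b="test")       # 'e' is not a field, so it is dropped
setting = from_dict(SettingDefault, config)
print(setting)   # SettingDefault(epoch=8, train_size=0.8, b='test')
```

Silently dropping unknown keys is a design choice; raising an error for them instead may be safer if typos in the config should be caught.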

Is there a reason to prefer list or tuple for __slots__?

You can define __slots__ in new-style python classes using either list or tuple (or perhaps any iterable?). The type persists after instances are created.
Given that tuples are always a little more efficient than lists and are immutable, is there any reason why you would not want to use a tuple for __slots__?
>>> class foo(object):
...     __slots__ = ('a',)
...
>>> class foo2(object):
...     __slots__ = ['a']
...
>>> foo().__slots__
('a',)
>>> foo2().__slots__
['a']
First, tuples aren't any more efficient than lists; they both support the exact same fast iteration mechanism from C API code, and use the same code for both indexing and iterating from Python.
More importantly, the __slots__ mechanism doesn't actually use the __slots__ member except during construction. This may not be that clearly explained by the documentation, but if you read all of the bullet points carefully enough the information is there.
And really, it has to be true. Otherwise, this wouldn't work:
class Foo(object):
    __slots__ = (x for x in ['a', 'b', 'c'] if x != 'b')
… and, worse, this would:
slots = ['a', 'b', 'c']
class Foo(object):
    __slots__ = slots
foo = Foo()
slots.append('d')
foo.d = 4
For further proof:
>>> a = ['a', 'b']
>>> class Foo(object):
...     __slots__ = a
...
>>> del Foo.__slots__
>>> foo = Foo()
>>> foo.d = 3
AttributeError: 'Foo' object has no attribute 'd'
>>> foo.__dict__
AttributeError: 'Foo' object has no attribute '__dict__'
>>> foo.__slots__
AttributeError: 'Foo' object has no attribute '__slots__'
So, that __slots__ member in Foo is really only there for documentation and introspection purposes. Which means there is no performance issue, or behavior issue, just a stylistic one.
According to the Python docs..
This class variable can be assigned a string, iterable, or sequence of
strings with variable names used by instances.
So, you can define it using any iterable. Which one you use is up to you, but in terms of which to "prefer", I would use a list.
First, let's look at what would be the preferred choice if performance were not an issue, which would mean it would be the same decision you would make between lists and tuples in all Python code. I would say a list, and the reason is that a tuple is designed to have semantic structure: it should semantically mean something that you stored an element as the first item rather than the second. For example, if you stored the first value of an (X, Y) coordinate tuple (the X) as the second item, you just completely changed the semantic value of the structure. If you rearrange the names of the attributes in the __slots__ list, you haven't semantically changed anything. Therefore, in this case, you should use a list.
Now, about performance. First, this is probably premature optimization. I don't know about the performance difference between lists and tuples, but I would guess there isn't any. But even assuming there is, it would really only come into play if the __slots__ variable is accessed many times.
I haven't actually looked at the code for when __slots__ is accessed, but I ran the following test..
print('Defining slotter..')
class Slotter(object):
    def __iter__(self):
        print('Looking for slots')
        yield 'A'
        yield 'B'
        yield 'C'
print('Defining Mine..')
class Mine(object):
    __slots__ = Slotter()
print('Creating first mine...')
m1 = Mine()
m1.A = 1
m1.B = 2
print('Creating second mine...')
m2 = Mine()
m2.A = 1
m2.C = 2
Basically, I use a custom class so that I can see exactly when the slots variable is actually iterated. You'll see that it is done exactly once, when the class is defined.
Defining slotter..
Defining Mine..
Looking for slots
Creating first mine...
Creating second mine...
Unless there is a case that I'm missing where the __slots__ variable is iterated again, I think that the performance difference can be declared negligible at worst.

Python: Counter not taking class name of namedtuple into account ... how do I fix?

>>> Employee = namedtuple("Employee", "name")
>>> Patient = namedtuple("Patient", "name")
>>> e = Employee("Mike")
>>> p = Patient("Mike")
>>> Counter([e, p])
Counter({Employee(name='Mike'): 2})
Why doesn't the Counter differentiate between the two classes of namedtuple?
Namedtuples are, as the name implies, tuples. They are compared elementwise. Since both of your tuples have "Mike" as the first (and only) element, they are equal. It doesn't matter that they're different classes; only the contents are compared.
If you want to take account of the class itself in comparison, you'd have to write your own wrapper class. (One simple possibility would be to make a wrapper that includes the class name as an element of the tuple, so employee-Mike would become ("Employee", "Mike") and patient-Mike would be ("Patient", "Mike").)
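One way to sketch the idea from the parenthetical above, without writing a full wrapper class, is to count (class name, record) pairs instead of the bare records. The extra e in the input list is only there to make the counts differ:

```python
from collections import Counter, namedtuple

Employee = namedtuple("Employee", "name")
Patient = namedtuple("Patient", "name")

e = Employee("Mike")
p = Patient("Mike")

# Plain tuple equality ignores the class, so the two collapse into one key.
print(e == p)                                   # True

# Pairing each record with its class name keeps them apart.
counts = Counter((type(item).__name__, item) for item in [e, p, e])
print(counts[("Employee", e)])                  # 2
print(counts[("Patient", p)])                   # 1
```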

How to make a dictionary that returns key for keys missing from the dictionary instead of raising KeyError?

I want to create a Python dictionary that returns the key itself for keys that are missing from the dictionary.
Usage example:
dic = smart_dict()
dic['a'] = 'one a'
print(dic['a'])
# >>> one a
print(dic['b'])
# >>> b
dicts have a __missing__ hook for this:
class smart_dict(dict):
    def __missing__(self, key):
        return key
Could simplify it as (since self is never used):
class smart_dict(dict):
    @staticmethod
    def __missing__(key):
        return key
Why don't you just use
dic.get('b', 'b')
Sure, you can subclass dict as others point out, but I find it handy to remind myself every once in a while that get can have a default value!
If you want to have a go at the defaultdict, try this:
dic = defaultdict()
dic.__missing__ = lambda key: key
dic['b'] # should set dic['b'] to 'b' and return 'b'
except... well: AttributeError: 'collections.defaultdict' object attribute '__missing__' is read-only, so you will have to subclass:
from collections import defaultdict
class KeyDict(defaultdict):
    def __missing__(self, key):
        return key
d = KeyDict()
print(d['b'])  # prints b
print(list(d.keys()))  # prints [] (the missing key is not stored)
Congratulations. You too have discovered the uselessness of the
standard collections.defaultdict type. If that execrable midden heap of code smell
offends your delicate sensibilities as much as it did mine, this is your lucky
StackOverflow day.
Thanks to the forbidden wonder of the 3-parameter
variant of the type()
builtin, crafting a non-useless default dictionary type is both fun and profitable.
What's Wrong with dict.__missing__()?
Absolutely nothing, assuming you like excess boilerplate and the shocking silliness of collections.defaultdict, which should behave as expected but really doesn't. To be fair, Jochen Ritzel's accepted solution of subclassing dict and implementing the optional __missing__() method is a fantastic workaround for small-scale use cases only requiring a single default dictionary.
But boilerplate of this sort scales poorly. If you find yourself instantiating
multiple default dictionaries, each with their own slightly different logic for
generating missing key-value pairs, an industrial-strength alternative
automating boilerplate is warranted.
Or at least nice. Because why not fix what's broken?
Introducing DefaultDict
In less than ten lines of pure Python (excluding docstrings, comments, and
whitespace), we now define a DefaultDict type initialized with a user-defined
callable generating default values for missing keys. Whereas the callable passed
to the standard collections.defaultdict type uselessly accepts no
parameters, the callable passed to our DefaultDict type usefully accepts the
following two parameters:
The current instance of this dictionary.
The current missing key to generate a default value for.
Given this type, solving sorin's
question reduces to a single line of Python:
>>> dic = DefaultDict(lambda self, missing_key: missing_key)
>>> dic['a'] = 'one a'
>>> print(dic['a'])
one a
>>> print(dic['b'])
b
Sanity. At last.
Code or It Didn't Happen
def DefaultDict(keygen):
    '''
    Sane **default dictionary** (i.e., dictionary implicitly mapping a missing
    key to the value returned by a caller-defined callable passed both this
    dictionary and that key).

    The standard :class:`collections.defaultdict` class is sadly insane,
    requiring the caller-defined callable accept *no* arguments. This
    non-standard alternative requires this callable accept two arguments:

    #. The current instance of this dictionary.
    #. The current missing key to generate a default value for.

    Parameters
    ----------
    keygen : CallableTypes
        Callable (e.g., function, lambda, method) called to generate the default
        value for a "missing" (i.e., undefined) key on the first attempt to
        access that key, passed first this dictionary and then this key and
        returning this value. This callable should have a signature resembling:
        ``def keygen(self: DefaultDict, missing_key: object) -> object``.
        Equivalently, this callable should have the exact same signature as that
        of the optional :meth:`dict.__missing__` method.

    Returns
    ----------
    MappingType
        Empty default dictionary creating missing keys via this callable.
    '''
    # Global variable modified below.
    global _DEFAULT_DICT_ID
    # Unique classname suffixed by this identifier.
    default_dict_class_name = 'DefaultDict' + str(_DEFAULT_DICT_ID)
    # Increment this identifier to preserve uniqueness.
    _DEFAULT_DICT_ID += 1
    # Dynamically generated default dictionary class specific to this callable.
    default_dict_class = type(
        default_dict_class_name, (dict,), {'__missing__': keygen,})
    # Instantiate and return the first and only instance of this class.
    return default_dict_class()

_DEFAULT_DICT_ID = 0
'''
Unique arbitrary identifier with which to uniquify the classname of the next
:func:`DefaultDict`-derived type.
'''
The key ...get it, key? to this arcane wizardry is the call to
the 3-parameter variant
of the type() builtin:
type(default_dict_class_name, (dict,), {'__missing__': keygen,})
This single line dynamically generates a new dict subclass aliasing the
optional __missing__ method to the caller-defined callable. Note the distinct
lack of boilerplate, reducing DefaultDict usage to a single line of Python.
Automation for the egregious win.
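To make the type() call less mysterious, here is a minimal sketch showing that the three-argument form builds the same kind of class as an ordinary class statement. The names echo_key, DynamicDict and StaticDict are hypothetical and used only for illustration.

```python
def echo_key(self, missing_key):
    """Hypothetical default generator: return the missing key itself."""
    return missing_key


# Dynamic form: type(name, bases, namespace) creates a dict subclass.
DynamicDict = type('DynamicDict', (dict,), {'__missing__': echo_key})


# Equivalent static form written as a normal class statement.
class StaticDict(dict):
    __missing__ = echo_key


d1 = DynamicDict(a='one a')
d2 = StaticDict(a='one a')

print(d1['a'], d1['b'])  # one a b
print(d2['a'], d2['b'])  # one a b

# __missing__ only supplies a value here; it does not store the key.
print('b' in d1)  # False
```

Both classes behave the same for lookups; the dynamic form simply saves writing a class body for each default-generating callable.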
The first respondent mentioned defaultdict,
but you can define __missing__ for any subclass of dict:
>>> class Dict(dict):
...     def __missing__(self, key):
...         return key
...
>>> d = Dict(a=1, b=2)
>>> d['a']
1
>>> d['z']
'z'
Also, I like the second respondent's approach:
>>> d = dict(a=1, b=2)
>>> d.get('z', 'z')
'z'
I agree this should be easy to do, and also easy to set up with different defaults or functions that transform a missing value somehow.
Inspired by Cecil Curry's answer, I asked myself: why not have the default-generator (either a constant or a callable) as a member of the class, instead of generating different classes all the time? Let me demonstrate:
# default behaviour: return missing keys unchanged
dic = FlexDict()
dic['a'] = 'one a'
print(dic['a'])
# 'one a'
print(dic['b'])
# 'b'
# regardless of default: easy initialisation with existing dictionary
existing_dic = {'a' : 'one a'}
dic = FlexDict(existing_dic)
print(dic['a'])
# 'one a'
print(dic['b'])
# 'b'
# using constant as default for missing values
dic = FlexDict(existing_dic, default = 10)
print(dic['a'])
# 'one a'
print(dic['b'])
# 10
# use callable as default for missing values
dic = FlexDict(existing_dic, default = lambda missing_key: missing_key * 2)
print(dic['a'])
# 'one a'
print(dic['b'])
# 'bb'
print(dic[2])
# 4
How does it work? Not so difficult:
class FlexDict(dict):
    '''Subclass of dictionary which returns a default for missing keys.
    This default can either be a constant, or a callable accepting the missing key.
    If "default" is not given (or None), each missing key will be returned unchanged.'''
    def __init__(self, content = None, default = None):
        if content is None:
            super().__init__()
        else:
            super().__init__(content)
        if default is None:
            default = lambda missing_key: missing_key
        self.default = default # sets self._default
    @property
    def default(self):
        return self._default
    @default.setter
    def default(self, val):
        if callable(val):
            self._default = val
        else: # constant value
            self._default = lambda missing_key: val
    def __missing__(self, x):
        return self.default(x)
Of course, one can debate whether one wants to allow changing the default-function after initialisation, but that just means removing @default.setter and absorbing its logic into __init__.
Enabling introspection into the current (constant) default value could be added with two extra lines.
Subclass dict's __getitem__ method. For example, How to properly subclass dict and override __getitem__ & __setitem__

Structure accessible by attribute name or index options

I am very new to Python, and trying to figure out how to create an object that has values that are accessible either by attribute name, or by index. For example, the way os.stat() returns a stat_result or pwd.getpwnam() returns a struct_passwd.
In trying to figure it out, I've only come across C implementations of the above types. Nothing specifically in Python. What is the Python native way to create this kind of object?
I apologize if this has been widely covered already. In searching for an answer, I must be missing some fundamental concept that is excluding me from finding an answer.
Python 2.6 introduced collections.namedtuple to make this easy. With older Python versions you can use the named tuple recipe.
Quoting directly from the docs:
>>> Point = namedtuple('Point', 'x y')
>>> p = Point(11, y=22) # instantiate with positional or keyword arguments
>>> p[0] + p[1] # indexable like the plain tuple (11, 22)
33
>>> x, y = p # unpack like a regular tuple
>>> x, y
(11, 22)
>>> p.x + p.y # fields also accessible by name
33
>>> p # readable __repr__ with a name=value style
Point(x=11, y=22)
You can't use the same implementation as the result object of os.stat() and others. However Python 2.6 has a new factory function that creates a similar datatype called named tuple. A named tuple is a tuple whose slots can also be addressed by name. The named tuple should not require any more memory, according to the documentation, than a regular tuple, since they don't have a per instance dictionary. The factory function signature is:
collections.namedtuple(typename, field_names[, verbose])
The first argument specifies the name of the new type, the second argument is a string (space or comma separated) containing the field names and, finally, if verbose is true, the factory function will also print the class generated.
Example
Suppose you have a tuple containing a username and password. To access the username you get the item at position zero and the password is accessed at position one:
credential = ('joeuser', 'secret123')
print('Username:', credential[0])
print('Password:', credential[1])
There's nothing wrong with this code but the tuple isn't self-documenting. You have to find and read the documentation about the positioning of the fields in the tuple. This is where named tuple can come to the rescue. We can recode the previous example as follows:
import collections
# Create a new tuple subclass named Credential
Credential = collections.namedtuple('Credential', 'username, password')
credential = Credential(username='joeuser', password='secret123')
print('Username:', credential.username)
print('Password:', credential.password)
If you are interested in what the code looks like for the newly created Credential type, you can add verbose=True to the argument list when creating the type (this parameter was removed in Python 3.7). In this particular case we get the following output:
import collections
Credential = collections.namedtuple('Credential', 'username, password', verbose=True)
class Credential(tuple):
    'Credential(username, password)'
    __slots__ = ()
    _fields = ('username', 'password')
    def __new__(_cls, username, password):
        return _tuple.__new__(_cls, (username, password))
    @classmethod
    def _make(cls, iterable, new=tuple.__new__, len=len):
        'Make a new Credential object from a sequence or iterable'
        result = new(cls, iterable)
        if len(result) != 2:
            raise TypeError('Expected 2 arguments, got %d' % len(result))
        return result
    def __repr__(self):
        return 'Credential(username=%r, password=%r)' % self
    def _asdict(t):
        'Return a new dict which maps field names to their values'
        return {'username': t[0], 'password': t[1]}
    def _replace(_self, **kwds):
        'Return a new Credential object replacing specified fields with new values'
        result = _self._make(map(kwds.pop, ('username', 'password'), _self))
        if kwds:
            raise ValueError('Got unexpected field names: %r' % kwds.keys())
        return result
    def __getnewargs__(self):
        return tuple(self)
    username = _property(_itemgetter(0))
    password = _property(_itemgetter(1))
The named tuple doesn't only provide access to fields by name; it also contains helper functions such as _make(), which helps create a Credential instance from a sequence or iterable. For example:
cred_tuple = ('joeuser', 'secret123')
credential = Credential._make(cred_tuple)
The python library documentation for namedtuple has more information and code examples, so I suggest that you take a peek.
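As a rough illustration of the other helpers visible in the generated class, the sketch below exercises _make(), _asdict() and _replace() in Python 3 syntax. The replacement password value is made up for the example.

```python
from collections import namedtuple

Credential = namedtuple('Credential', 'username, password')

# Build an instance from an existing sequence.
credential = Credential._make(('joeuser', 'secret123'))

# Convert to a mapping of field names to values.
as_dict = credential._asdict()
print(as_dict['username'])  # joeuser

# _replace returns a new instance; the original is left untouched.
rotated = credential._replace(password='hunter2')
print(rotated)              # Credential(username='joeuser', password='hunter2')
print(credential.password)  # secret123

# The field names are available on the class itself.
print(Credential._fields)   # ('username', 'password')
```

Because named tuples are immutable, _replace is the usual way to get a modified copy.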
an object that has values that are accessible either by attribute name, or by index
I'm not sure what you're finding hard about this.
A collection accessible by index implements __getitem__.
A collection accessible by names implements __getattr__ (or __getattribute__).
You can implement both without any trouble at all. Or, you can use namedtuple.
To make life simpler, you could extend the tuple class so you don't have to implement your own __getitem__. Or you can define an ordinary class that also has __getitem__ so you don't have to mess with __getattr__.
For example
>>> class Foo( object ):
...     def __init__( self, x, y, z ):
...         self.x= x
...         self.y= y
...         self.z= z
...     def __getitem__( self, index ):
...         return { 0: self.x, 1: self.y, 2: self.z }[index]
...
>>> f= Foo(1,2,3)
>>> f.x
1
>>> f[0]
1
>>> f[1]
2
>>> f[2]
3
>>> f.y
2
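The other option mentioned above, extending tuple so that indexing comes for free, might look roughly like the sketch below. It is essentially a hand-written version of what namedtuple generates, and the choice of property plus itemgetter is an assumption made for illustration, not the answer's own code.

```python
from operator import itemgetter


class Foo(tuple):
    """Tuple subclass whose three items are also readable as x, y and z."""
    __slots__ = ()  # no per-instance __dict__, like a named tuple

    def __new__(cls, x, y, z):
        # tuple is immutable, so the contents are fixed in __new__.
        return super().__new__(cls, (x, y, z))

    # Each attribute simply reads the item at a fixed position.
    x = property(itemgetter(0))
    y = property(itemgetter(1))
    z = property(itemgetter(2))


f = Foo(1, 2, 3)
print(f.x, f[0])  # 1 1
print(f[-1])      # 3
print(f[1:])      # (2, 3)
a, b, c = f       # unpacking works like any tuple
```

Unlike the dict-based __getitem__ in the example above, this version inherits negative indexing, slicing and unpacking from tuple.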
