Is it bad practice to use collections.OrderedDict? - python

I like to use collections.OrderedDict sometimes when I need an associative array where the order of the keys should be retained. Best example I have of this is in parsing or creating csv files, where it's useful to have the order of columns retained implicitly in the object.
But I'm worried that this is bad practice, since it seems to me that the whole concept of an associative array is that the order of the keys should never matter, and that any operations which rely on ordering should just use lists because that's why lists exist (this can be done for the csv example above). I don't have data on this, but I'm willing to bet that the performance for lists is universally better than OrderedDict.
So my question is: Are there any really compelling use cases for OrderedDict? Is the csv use case a good example of where it should be used or a bad one?

But I'm worried that this is bad practice, since it seems to me that the whole concept of an associative array is that the order of the keys should never matter,
Nonsense. That's not the "whole concept of an associative array". It's just that the order rarely matters and so we default to surrendering the order to get a conceptually simpler (and more efficient) data structure.
and that any operations which rely on ordering should just use lists because that's why lists exist
Stop it right there! Think a second. How would you use lists? As a list of (key, value) pairs, with unique keys, right? Well congratulations, my friend, you just re-invented OrderedDict, just with an awful API and really slow. Any conceptual objections to an ordered mapping would apply to this ad hoc data structure as well. Luckily, those objections are nonsense. Ordered mappings are perfectly fine, they're just different from unordered mappings. Giving it an aptly-named dedicated implementation with a good API and good performance improves people's code.
Aside from that: Lists are only one kind of ordered data structure. And while they are somewhat universal in that you can build virtually all data structures out of some combination of lists (if you bend over backwards), that doesn't mean you should always use lists.
I don't have data on this, but I'm willing to bet that the performance for lists is universally better than OrderedDict.
Data (structures) doesn't (don't) have performance. Operations on data (structures) have. And thus it depends on what operations you're interested in. If you just need a list of pairs, a list is obviously correct, and iterating over it or indexing it is quite efficient. However, if you want a mapping that's also ordered, or even a tiny subset of mapping functionality (such as handling duplicate keys), then a list alone is pretty awful, as I already explained above.
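To make the comparison concrete, here is a minimal sketch of the ad hoc "list of (key, value) pairs" next to an OrderedDict. The sample data and the lookup helper are made up for illustration. Both keep the order, but the list needs a linear scan for every lookup.

```python
from collections import OrderedDict

# Hypothetical sample data: ordered (key, value) pairs with unique keys.
pairs = [("name", "Ada"), ("lang", "Python"), ("year", 1991)]


def lookup(pairs, key):
    """Linear scan: the only way to find a key in a plain list of pairs."""
    for k, v in pairs:
        if k == key:
            return v
    raise KeyError(key)


# The same data as an ordered mapping: order is kept, lookup is by hash.
od = OrderedDict(pairs)

value_from_list = lookup(pairs, "lang")
value_from_od = od["lang"]
key_order = list(od)
```

Both lookups return the same value, and iterating over od yields the keys in insertion order.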

For your specific use case (writing csv files) an ordered dict is not necessary. Instead, use a DictWriter.
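A minimal sketch of the DictWriter approach, assuming made-up rows and column names and writing to an in-memory buffer instead of a file. The column order comes from fieldnames, so the row dicts themselves do not need to be ordered.

```python
import csv
import io

# Hypothetical rows; the key order inside each dict does not matter here.
rows = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]

buf = io.StringIO()
# fieldnames fixes the column order of the output.
writer = csv.DictWriter(buf, fieldnames=["name", "id"])
writer.writeheader()
writer.writerows(rows)

lines = buf.getvalue().splitlines()
```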
Personally I use OrderedDict when I need some LIFO/FIFO access, for which it even has the popitem method. I honestly couldn't think of a good use case, but the one mentioned in PEP 372 for attribute order is a good one:
XML/HTML processing libraries currently drop the ordering of
attributes, use a list instead of a dict which makes filtering
cumbersome, or implement their own ordered dictionary. This affects
ElementTree, html5lib, Genshi and many more libraries.
If you are ever questioning why there is some feature in Python, the PEP is a good place to start because that's where the justification that leads to the inclusion of the feature is detailed.
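The LIFO/FIFO access mentioned above can be sketched with popitem, which takes a last flag on OrderedDict. The keys and values here are placeholders.

```python
from collections import OrderedDict

tasks = OrderedDict()
tasks["first"] = 1
tasks["second"] = 2
tasks["third"] = 3

# LIFO: remove the most recently inserted item (the default, last=True).
newest = tasks.popitem()

# FIFO: remove the oldest item.
oldest = tasks.popitem(last=False)

remaining = list(tasks)
```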

Probably a comment would suffice...
I think it would be questionable if you use it in places where you don't need it (where order is irrelevant and an ordinary dict would suffice). Otherwise the code will probably be simpler than using lists.
This is valid for any language construct/library - if it makes your code simpler, use the higher level abstraction/implementation.

As long as you feel comfortable with this data structure, and it fits your needs, why care? Perhaps it is not the most efficient one (in terms of speed, etc.), but, if it's there, it's obviously because it's useful in certain cases (or nobody would have thought of writing it).
You can basically use three types of associative arrays in Python:
the classic hash table (no order at all)
the OrderedDict (order which mirrors the way the object was created)
and the binary trees - this is not in the standard lib -, which order their keys exactly as you want, in a custom order (not necessarily the alphabetical one).
So, in fact, the order of the keys can matter. Just choose the structure that you think is the most appropriate for the job.

For CSV and similar constructs of repeated keys use a namedtuple. It is best of both worlds.
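A minimal sketch of the namedtuple suggestion, with a made-up Row type and field names. Each row keeps its column order like a tuple while still allowing access by name.

```python
from collections import namedtuple

# Hypothetical column names for one CSV row.
Row = namedtuple("Row", ["name", "lang", "year"])

row = Row("Ada", "Python", 1991)

by_name = row.lang       # access by field name
by_position = row[0]     # access by column position
columns = Row._fields    # the column order, stored once on the type
```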

Related

When you don't specify an order- why does result order vary?

I am in the very early stages of learning Python. This question has more to do with basic understanding than coding- hopefully I tag it correctly. I am reading my coursework and it says
"Run the program below that displays the ... The indentation and
spacing of the... key-value pairs simply provides more readability.
Note that order is not maintained in the dict when printed."
I know I can specify so that the order is the same each time. I can do that. I want to know when you write a program and run it why do the results get returned in a different order when not specified? Is it because of the way it gets handled in the processor?
Thanks.
The answer has nothing to do with Python, and everything to do with data structures - this behavior is universal and expected across all languages that implement a similar data structure. In Python it's called a dictionary, in other languages it's called a Map or a Hash Map or a Hash Table. There are a few other similar names for the same underlying data structure.
The Python dictionary is an Associative collection, as opposed to a Python List (which is just an Array), where its elements are contiguous in memory.
The big advantage that dictionaries (associative collections) offer is fast and constant look up times (O(1)) - arrays also offer fast look up since calculating an index is trivial - however a dictionary consists of key-value pairs where the key can be anything as long as it is hashable.
Essentially, to determine the "index" where an associated value should go in an associative container, you take the key, hash it, devise some way of mapping the hash to a number and treat that number like an index. As unlikely as it is for two different objects to yield the same hash, it could theoretically happen - what's more likely to happen is that your hash-to-number procedure maps two unique hashes to the same number - in any case, collisions like this can happen, and there are strategies for handling these collisions.
The point is, the hash of a key determines the order in which the associated value appears in the collection - therefore, there is no inherent order.
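The hash-to-index idea can be sketched with a toy table. This is an illustration only, not how CPython's dict is implemented: the class name, the bucket count, and the use of chaining for collisions are all assumptions made for the example. Integer keys are used because small non-negative integers hash to themselves, which keeps the result predictable.

```python
class ToyHashTable:
    """Toy hash table with separate chaining (illustrative only)."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # Map the hash to a bucket index; different keys may collide here.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # new key (or a collision)

    def get(self, key):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)

    def keys(self):
        # Walk the buckets in index order, not in insertion order.
        return [key for bucket in self.buckets for key, _ in bucket]


table = ToyHashTable()
for key in (10, 3, 2):   # 10 and 2 collide: both map to bucket 2
    table.put(key, str(key))

stored_order = table.keys()
```

The keys were inserted as 10, 3, 2, but the table reports them in bucket order, which is why a structure like this has no inherent ordering.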

List of lists vs dictionary

In Python, are there any advantages / disadvantages of working with a list of lists versus working with a dictionary, more specifically when doing numerical operations with them? I'm writing a class of functions to solve simple matrix operations for my linear algebra class. I was using dictionaries, but then I saw that numpy uses list of lists instead, so I guess there must be some advantages in it.
Example: [[1,2,3],[4,5,6],[7,8,9]] as opposed to {0:[1,2,3],1:[4,5,6],2:[7,8,9]}
I think this largely depends on how you plan on using this structure.
Python's dictionaries are (like most) unordered by default. If you plan on iterating over your data like such, you shouldn't use a dictionary:
for row in matrix.values():  # matrix is a dict mapping row index to a list
    for elem in row:
        ...  # Logic
Likewise, it doesn't make a lot of sense to use a dictionary with the keys 1, 2, 3 ... when they have little value other than an index. Dictionaries also take up more space in memory due to the hashing process.
If you plan on accessing items by position (which it sounds like you want to), you'll still want to use a list. Index lookup in a list is O(1), same as key lookup in a dictionary. The only difference is that a dictionary lets you look up an arbitrary key instead of an index; when the key is just an index, the list lookup will be faster.
You should really only consider using a dictionary when you have some sort of key-value relationship mapping, in which a meaningful key needs to be searched for to retrieve a related value. This doesn't sound like one of those cases. Stick with a list-of-lists.
That's not to say Dictionaries are a bad data structure. Ruby and Python introduced me to them, and they're extremely useful for any of those afore-mentioned mapping problems (which I find I run into quite a lot). They're just useful for a specific class of problems, and this isn't one of them.
When the keys of the dictionary are 0, 1, ..., n, a list will be faster, since no hashing is involved. As soon as the keys are not such a sequence, you need to use a dict.
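As a rough illustration of why a list of lists suits this kind of numerical work, here is a sketch of a matrix-vector product on the example matrix from the question. The function name is made up; rows are simply visited in order, so no keys are needed.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result = mat_vec(matrix, [1, 0, 1])

# Positional access works directly: row 1, column 2.
element = matrix[1][2]
```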

Why does Python treat tuples, lists, sets and dictionaries as fundamentally different things?

One of the reasons I love Python is the expressive power / reduced programming effort provided by tuples, lists, sets and dictionaries. Once you understand list comprehensions and a few of the basic patterns using in and for, life gets so much better! Python rocks.
However I do wonder why these constructs are treated as differently as they are, and how this is changing (getting stranger) over time. Back in Python 2.x, I could've made an argument they were all just variations of a basic collection type, and that it was kind of irritating that some non-exotic use cases require you to convert a dictionary to a list and back again. (Isn't a dictionary just a list of tuples with a particular uniqueness constraint? Isn't a list just a set with a different kind of uniqueness constraint?).
Now in the 3.x world, it's gotten more complicated. There are now named tuples -- starting to feel more like a special-case dictionary. There are now ordered dictionaries -- starting to feel more like a list. And I just saw a recipe for ordered sets. I can picture this going on and on ... what about unique lists, etc.
The Zen of Python says "There should be one-- and preferably only one --obvious way to do it". It seems to me this profusion of specialized collections types is in conflict with this Python precept.
What do the hardcore Pythonistas think?
These data types all serve different purposes, and in an ideal world you might be able to unify them more. However, in the real world we need to have efficient implementations of the basic collections, and e.g. ordering adds a runtime penalty.
The named tuples mainly serve to make the interface of stat() and the like more usable, and also can be nice when dealing with SQL row sets.
The big unification you're looking for is actually there, in the form of the different access protocols (getitem, getattr, iter, ...), which these types mix and match for their intended purposes.
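A small sketch of what "mixing and matching protocols" can look like. The Countdown class is hypothetical; it implements the length, indexing, and iteration protocols and so can be used in many places where a list or tuple would be.

```python
class Countdown:
    """Hypothetical read-only sequence: start, start-1, ..., 1."""

    def __init__(self, start):
        self.start = start

    def __len__(self):
        return self.start

    def __getitem__(self, index):
        if not 0 <= index < self.start:
            raise IndexError(index)
        return self.start - index

    def __iter__(self):
        return (self.start - i for i in range(self.start))


c = Countdown(3)
as_list = list(c)      # uses __iter__
first = c[0]           # uses __getitem__
size = len(c)          # uses __len__
has_two = 2 in c       # membership falls back to iteration
```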
tl;dr (duck-typing)
You're correct to see some similarities in all these data structures. Remember that python uses duck-typing (if it looks like a duck and quacks like a duck then it is a duck). If you can use two objects in the same situation then, for your current intents and purposes, they might as well be the same data type. But you always have to keep in mind that if you try to use them in other situations, they may no longer behave the same way.
With this in mind we should take a look at what's actually different and the same about the four data types you mentioned, to get a general idea of the situations where they are interchangeable.
Mutability (can you change it?)
You can make changes to dictionaries, lists, and sets. Tuples cannot be "changed" without making a copy.
Mutable: dict, list, set
Immutable: tuple
Python string is also an immutable type. Why do we want some immutable objects? I would paraphrase from this answer:
Immutable objects can be optimized a lot
In Python, only immutables are hashable (and only hashable objects can be members of sets, or keys in dictionaries).
Comparing across this property, lists and tuples seem like the "closest" two data types. At a high-level a tuple is an immutable "freeze-frame" version of a list. This makes lists useful for data sets that will be changing over time (since you don't have to copy a list to modify it) but tuples useful for things like dictionary keys (which must be immutable types).
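The hashability point can be checked directly. In this sketch (the grid dict and its contents are made up), a tuple works as a dictionary key while the equivalent list is rejected.

```python
grid = {}

# A tuple is immutable and hashable, so it can be a key.
grid[(0, 1)] = "wall"

# A list is mutable and unhashable, so using it as a key raises TypeError.
list_key_rejected = False
try:
    grid[[0, 1]] = "wall"
except TypeError:
    list_key_rejected = True
```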
Ordering (and a note on abstract data types)
A dictionary, like a set, has no inherent conceptual order to it. This is in contrast to lists and tuples, which do have an order. The order for the items in a dict or a set is abstracted away from the programmer, meaning that if element A comes before B in a for k in mydata loop, you shouldn't (and can't generally) rely on A being before B once you start making changes to mydata.
Order-preserving: list, tuple
Non-order-preserving: dict, set
Technically if you iterate over mydata twice in a row it'll be in the same order, but this is more a convenient feature of the mechanics of python, and not really a part of the set abstract data type (the mathematical definition of the data type). Lists and tuples do guarantee order though, especially tuples which are immutable.
What you see when you iterate (if it walks like a duck...)
One "item" per "element": set, list, tuple
Two "items" per "element": dict
I suppose here you could see a named tuple, which has both a name and a value for each element, as an immutable analogue of a dictionary. But this is a tenuous comparison- keep in mind that duck-typing will cause problems if you're trying to use a dictionary-only method on a named tuple, or vice-versa.
Direct responses to your questions
Isn't a dictionary just a list of tuples with a particular uniqueness
constraint?
No, there are several differences. Dictionaries have no inherent order, which is different from a list, which does.
Also, a dictionary has a key and a value for each "element". A tuple, on the other hand, can have an arbitrary number of elements, but each with only a value.
Because of the mechanics of a dictionary, where keys act like a set, you can look up values in constant time if you have the key. In a list of tuples (pairs here), you would need to iterate through the list until you found the key, meaning search would be linear in the number of elements in your list.
Most importantly, though, dictionary items can be changed, while tuples cannot.
Isn't a list just a set with a different kind of uniqueness
constraint?
Again, I'd stress that sets have no inherent ordering, while lists do. This makes lists much more useful for representing things like stacks and queues, where you want to be able to remember the order in which you appended items. Sets offer no such guarantee. However they do offer the advantage of being able to do membership lookups in constant time, while again lists take linear time.
There are now named tuples -- starting to feel more like a special-case dictionary. There are now ordered dictionaries -- starting to feel more like a list. And I just saw a recipe for ordered sets. I can picture this going on and on ... what about unique lists, etc.
To some degree I agree with you. However data structure libraries can be useful to support common use-cases for already well-established data structures. This keeps the programmer from wasting time trying to come up with custom extensions to the standard structures. As long as it doesn't get out of hand, and we can still see the unique usefulness in each solution, it's good to have a wheel on the shelf so we don't need to reinvent it.
A great example is the Counter() class. This specialized dictionary has been of use to me more times than I can count (badoom-tshhhhh!) and it has saved me the effort of coding up a custom solution. I'd much rather have a solution that the community is helping me to develop and keep with proper python best-practices than something that sits around in my custom data structures folder and only gets used once or twice a year.
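For readers who have not used it, a minimal Counter sketch with a made-up sentence:

```python
from collections import Counter

words = "the quick brown fox jumps over the lazy dog the end".split()
counts = Counter(words)

the_count = counts["the"]
missing = counts["cat"]            # missing keys count as 0, no KeyError
top = counts.most_common(1)        # the most frequent word with its count
```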
First of all, Ordered Dictionaries and Named Tuples were introduced in Python 2, but that's beside the point.
I won't point you at the docs since if you were really interested you would have read them already.
The first difference between collection types is mutability. tuple and frozenset are immutable types. This means they can be more efficient than list or set.
If you want something you can access randomly or in order, but will mainly change at the end, you want a list. If you want something you can also change at the beginning, you want a deque.
You simply can't have your cake and eat it too -- every feature you add causes you to lose some speed.
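The list versus deque distinction can be sketched as follows, with placeholder values. A list is efficient at its end; a deque is also efficient at its beginning.

```python
from collections import deque

# A list: cheap to grow and shrink at the end.
stack_like = [1, 2, 3]
stack_like.append(4)
last = stack_like.pop()

# A deque: cheap to grow and shrink at both ends.
queue = deque([1, 2, 3])
queue.appendleft(0)
first = queue.popleft()
queue.append(4)
```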
dict and set are fundamentally different from lists and tuples. They store the hash of their keys, allowing you to see if an item is in them very quickly, but require the key to be hashable. You don't get the same membership testing speed with linked lists or arrays.
When you get to OrderedDict and NamedTuple, you're talking about subclasses of the builtin types implemented in Python, rather than in C. They are for special cases, just like any other code in the standard library you have to import. They don't clutter up the namespace but are nice to have when you need them.
One of these days, you'll be coding, and you'll say, "Man, now I know exactly what they meant by 'There should be one-- and preferably only one --obvious way to do it', a set is just what I needed for this, I'm so glad it's part of the Python language! If I had to use a list, it would take forever." That's when you'll understand why these different types exist.
A dictionary is indexed by key (in fact, it's a hash map); a generic list of tuples won't be. You might argue that both should be implemented as relations, with the ability to add indices at will, but in practice having optimized types for the common use cases is both more convenient and more efficient.
New specialized collections get added because they are common enough that lots of people would end up implementing them using more basic data types, and then you'd have the usual problems with wheel reinvention (wasted effort, lack of interoperability...). And if Python just offered an entirely generic construct, then we'd get lots of people asking "how do I implement a set using a relation", etc.
(btw, I'm using relation in the mathematical or DB sense)
All of these specialized collection types provide specific functionalities that are not adequately or efficiently provided by the "standard" data types of list, tuple, dict, and set.
For example, sometimes you need a collection of unique items, and you also need to retain the order in which you encountered them. You can do this using a set to keep track of membership and a list to keep track of order, but your solution will probably be slower and more memory-hungry than a specialized data structure designed for exactly this purpose, such as an ordered set.
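The set-plus-list combination described above can be sketched like this. The function name is made up, and the items are assumed to be hashable.

```python
def unique_in_order(items):
    """Return the distinct items in first-seen order."""
    seen = set()     # fast membership test
    ordered = []     # remembers the order of first appearance
    for item in items:
        if item not in seen:
            seen.add(item)
            ordered.append(item)
    return ordered


result = unique_in_order([3, 1, 3, 2, 1])
```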
These additional data types, which you see as combinations or variations on the basic ones, actually fill gaps in functionality left by the basic data types. From a practical perspective, if Python's core or standard library did not provide these data types, then anyone who needed them would invent their own inefficient versions. They are used less often than the basic types, but often enough to make it worth while to provide standard implementations.
One of the things I like most in Python is agility. And a lot of functional, effective and usable collection types give it to me.
And there is still one way to do this - each type does its own job.
The world of data structures (language agnostic) can generally be boiled down to a few small basic structures - lists, trees, hash-tables and graphs, etc. and variants and combinations thereof. Each has its own specific purpose in terms of use and implementation.
I don't think that you can do things like reduce a dictionary to a list of tuples with a particular uniqueness constraint without actually specifying a dictionary. A dictionary has a specific purpose - key/value look-ups - and the implementation of the data structure is generally tailored to those needs. Sets are like dictionaries in many ways, but certain operations on sets don't make sense on a dictionary (union, disjunction, etc).
I don't see this violating the 'Zen of Python' of doing things one way. While you can use a sorted dictionary to do what a dictionary does without using the sorted part, you're more violating Occam's razor and likely incurring a performance penalty. I see this as different from being able to syntactically do things in different ways a la Perl.
The Zen of Python says "There should be one-- and preferably only one --obvious way to do it". It seems to me this profusion of specialized collections types is in conflict with this Python precept.
Not remotely. There are several different things being done here. We choose the right tool for the job. All of these containers are modeled on decades-old tried, tested and true CS concepts.
Dictionaries are not like tuples: they are optimized for key-value lookup. The tuple is also immutable, which distinguishes it from a list (you could think of it as sort of like a frozenlist). If you find yourself converting dictionaries to lists and back, you are almost certainly doing something wrong; an example would help.
Named tuples exist for convenience and are intended to replace simple classes rather than dictionaries, really. Ordered dictionaries are just a bit of wrapping to remember the order in which things were added to the dictionary. And neither is new in 3.x (although there may be better language support for them; I haven't looked).

Use of add(), append(), update() and extend() in Python

Is there an article or forum discussion or something somewhere that explains why lists use append/extend, but sets and dicts use add/update?
I frequently find myself converting lists into sets and this difference makes that quite tedious, so for my personal sanity I'd like to know what the rationalization is.
The need to convert between these occurs regularly as we iterate on development. Over time as the structure of the program morphs, various structures gain and lose requirements like ordering and duplicates.
For example, something that starts out as an unordered bunch of stuff in a list might pick up the requirement that there be no duplicates and so need to be converted to a set.
All such changes require finding and changing all places where the relevant structure is added/appended and extended/updated.
So I'm curious to see the original discussion that led to this language choice, but unfortunately I didn't have any luck googling for it.
append has a popular definition of "add to the very end", and extend can be read similarly (in the nuance where it means "...beyond a certain point"); sets have no "end", nor any way to specify some "point" within them or "at their boundaries" (because there are no "boundaries"!), so it would be highly misleading to suggest that these operations could be performed.
x.append(y) always increases len(x) by exactly one (whether y was already in list x or not); no such assertion holds for s.add(z) (s's length may increase or stay the same). Moreover, in these snippets, y can have any value (i.e., the append operation never fails [except for the anomalous case in which you've run out of memory]) -- again no such assertion holds about z (which must be hashable, otherwise the add operation fails and raises an exception). Similar differences apply to extend vs update. Using the same name for operations with such drastically different semantics would be very misleading indeed.
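The differences described above can be observed directly. A short sketch with placeholder values:

```python
items = [1, 2]
items.append(2)            # always grows the list, duplicates included
list_len = len(items)

s = {1, 2}
s.add(2)                   # already present: the set does not grow
set_len = len(s)

items.append([3])          # any object can be appended to a list

add_failed = False
try:
    s.add([3])             # a list is unhashable, so add() raises
except TypeError:
    add_failed = True
```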
it seems pythonic to just use a list
on the first pass and deal with the
performance on a later iteration
Performance is the least of it! lists support duplicate items, ordering, and any item type -- sets guarantee item uniqueness, have no concept of order, and demand item hashability. There is nothing Pythonic in using a list (plus goofy checks against duplicates, etc) to stand for a set -- performance or not, "say what you mean!" is the Pythonic Way;-). (In languages such as Fortran or C, where all you get as a built-in container type are arrays, you might have to perform such "mental mapping" if you need to avoid using add-on libraries; in Python, there is no such need).
Edit: the OP asserts in a comment that they don't know from the start (e.g.) that duplicates are disallowed in a certain algorithm (strange, but, whatever) -- they're looking for a painless way to make a list into a set once they do discover duplicates are bad there (and, I'll add: order doesn't matter, items are hashable, indexing/slicing unneeded, etc). To get exactly the same effect one would have if Python's sets had "synonyms" for the two methods in question:
class somewhatlistlikeset(set):
    def append(self, x): self.add(x)
    def extend(self, x): self.update(x)
Of course, if the only change is at the set creation (which used to be list creation), the code may be much more challenging to follow, having lost the useful clarity whereby using add vs append allows anybody reading the code to know "locally" whether the object is a set vs a list... but this, too, is part of the "exactly the same effect" above-mentioned!-)
set and dict are unordered. "Append" and "extend" conceptually only apply to ordered types.
It's written that way to annoy you.
Seriously. It's designed so that one can't simply convert one into the other easily. Historically, sets are based off dicts, so the two share naming conventions. While you could easily write a set wrapper to add these methods ...
class ListlikeSet(set):
    def append(self, x):
        self.add(x)
    def extend(self, xs):
        self.update(xs)
... the greater question is why you find yourself converting lists to sets with such regularity. They represent substantially different models of a collection of objects; if you have to convert between the two a lot, it suggests you may not have a very good handle on the conceptual architecture of your program.

Memory efficiency: One large dictionary or a dictionary of smaller dictionaries?

I'm writing an application in Python (2.6) that requires me to use a dictionary as a data store.
I am curious as to whether or not it is more memory efficient to have one large dictionary, or to break that down into many (much) smaller dictionaries, then have an "index" dictionary that contains a reference to all the smaller dictionaries.
I know there is a lot of overhead in general with lists and dictionaries. I read somewhere that Python internally allocates space for the number of items in the dictionary/list raised to the power of 2.
I'm new enough to Python that I'm not sure if there are other unexpected internal complexities/surprises like that, which are not apparent to the average user, that I should take into consideration.
One of the difficulties is knowing how the power of 2 system counts "items". Is each key:value pair counted as 1 item? That seems important to know because if you have a 100 item monolithic dictionary then space for 100^2 items would be allocated. If you have 100 single item dictionaries (1 key:value pair each) then each dictionary would only be allocating 1^2 (aka no extra allocation)?
Any clearly laid out information would be very helpful!
Three suggestions:
Use one dictionary.
It's easier, it's more straightforward, and someone else has already optimized this problem for you. Until you've actually measured your code and traced a performance problem to this part of it, you have no reason not to do the simple, straightforward thing.
Optimize later.
If you are really worried about performance, then abstract the problem: make a class to wrap whatever lookup mechanism you end up using and write your code to use this class. You can change the implementation later if you find you need some other data structure for greater performance.
Read up on hash tables.
Dictionaries are hash tables, and if you are worried about their time or space overhead, you should read up on how they're implemented. This is basic computer science. The short of it is that hash tables are:
average case O(1) lookup time
O(n) space (Expect about 2n, depending on various parameters)
I do not know where you read that they were O(n^2) space, but if they were, then they would not be in widespread, practical use as they are in most languages today. There are two advantages to these nice properties of hash tables:
O(1) lookup time implies that you will not pay a cost in lookup time for having a larger dictionary, as lookup time doesn't depend on size.
O(n) space implies that you don't gain much of anything from breaking your dictionary up into smaller pieces. Space scales linearly with number of elements, so lots of small dictionaries will not take up significantly less space than one large one or vice versa. This would not be true if they were O(n^2) space, but lucky for you, they're not.
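If you want to check this rather than take it on faith, a rough measurement is possible with sys.getsizeof. This is a sketch under assumptions: it targets CPython, it counts only the dict objects themselves (not the keys and values they refer to), and the exact byte counts vary by version and platform, so none are shown. It compares the asker's two extremes: one 100-item dictionary versus 100 single-item dictionaries behind an index dictionary.

```python
import sys

N = 100

# One monolithic dictionary with N entries.
flat = {i: i for i in range(N)}
flat_bytes = sys.getsizeof(flat)

# An "index" dictionary pointing at N single-item dictionaries.
nested = {i: {i: i} for i in range(N)}
nested_bytes = sys.getsizeof(nested) + sum(
    sys.getsizeof(inner) for inner in nested.values()
)

print("flat:  ", flat_bytes, "bytes")
print("nested:", nested_bytes, "bytes")
```

The nested layout pays for the index dictionary plus a fixed per-dictionary overhead for every small dictionary, so splitting does not save space in this case.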
Here are some more resources that might help:
The Wikipedia article on Hash Tables gives a great listing of the various lookup and allocation schemes used in hashtables.
The GNU Scheme documentation has a nice discussion of how much space you can expect hashtables to take up, including a formal discussion of why "the amount of space used by the hash table is proportional to the number of associations in the table". This might interest you.
Here are some things you might consider if you find you actually need to optimize your dictionary implementation:
Here is the C source code for Python's dictionaries, in case you want ALL the details. There's copious documentation in here:
dictobject.h
dictobject.c
Here is a python implementation of that, in case you don't like reading C.
(Thanks to Ben Peterson)
The Java Hashtable class docs talk a bit about how load factors work, and how they affect the space your hash takes up. Note there's a tradeoff between your load factor and how frequently you need to rehash. Rehashes can be costly.
If you're using Python, you really shouldn't be worrying about this sort of thing in the first place. Just build your data structure the way it best suits your needs, not the computer's.
This smacks of premature optimization, not performance improvement. Profile your code if something is actually bottlenecking, but until then, just let Python do what it does and focus on the actual programming task, and not the underlying mechanics.
"Simple" is generally better than "clever", especially if you have no tested reason to go beyond "simple". And anyway "Memory efficient" is an ambiguous term, and there are tradeoffs, when you consider persisting, serializing, caching, swapping, and a whole bunch of other stuff that someone else has already thought through so that in most cases you don't need to.
Think "Simplest way to handle it properly", and optimize much later.
Premature optimization bla bla, don't do it bla bla.
I think you're mistaken about what the power-of-two extra allocation does. I think it's just a multiplier of two: x*2, not x^2.
I've seen this question a few times on various python mailing lists.
With regard to memory, here's a paraphrased version of one such discussion (the post in question wanted to store hundreds of millions of integers):
A set() is more space efficient than a dict(), if you just want to test for membership
gmpy has a bitvector type class for storing dense sets of integers
Dicts are kept between 50% and 30% empty, and an entry is about 12 bytes (though the true amount will vary by platform a bit).
So, the fewer objects you have, the less memory you're going to be using, and the fewer lookups you're going to do (since you'll have to lookup in the index, then a second lookup in the actual value).
Like others said, profile to see your bottlenecks. Keeping a membership set() and a value dict() might be faster, but you'll be using more memory.
I'd also suggest reposting this to a python specific list, such as comp.lang.python, which is full of much more knowledgeable people than myself who would give you all sorts of useful information.
If your dictionary is so big that it does not fit into memory, you might want to have a look at ZODB, a very mature object database for Python.
The 'root' of the db has the same interface as a dictionary, and you don't need to load the whole data structure into memory at once e.g. you can iterate over only a portion of the structure by providing start and end keys.
It also provides transactions and versioning.
Honestly, you won't be able to tell the difference either way, in terms of either performance or memory usage. Unless you're dealing with tens of millions of items or more, the performance or memory impact is just noise.
From the way you worded your second sentence, it sounds like the one big dictionary is your first inclination, and matches more closely with the problem you're trying to solve. If that's true, go with that. What you'll find about Python is that the solutions that everyone considers 'right' nearly always turn out to be those that are as clear and simple as possible.
Oftentimes, dictionaries of dictionaries are useful for reasons other than performance, i.e. they allow you to store context information about the data without having extra fields on the objects themselves, and make querying subsets of the data faster.
In terms of memory usage, it would stand to reason that one large dictionary will use less ram than multiple smaller ones. Remember, if you're nesting dictionaries, each additional layer of nesting will roughly double the number of dictionaries you need to allocate.
In terms of query speed, multiple dicts will take longer due to the increased number of lookups required.
So I think the only way to answer this question is for you to profile your own code. However, my suggestion is to use the method that makes your code the cleanest and easiest to maintain. Of all the features of Python, dictionaries are probably the most heavily tweaked for optimal performance.
