Instead of this:
a = {"foo": None, "bar": None}
Is there a way to write this?
b = {"foo", "bar"}
And still let b have constant time access (i.e. not a Python set, which cannot be keyed into)?
Actually, in Python 2.7 and 3.0+, this really does work:
>>> b = {"foo", "bar"}
>>> b
set(['foo', 'bar'])
You can't use [] access on a set ("key into"), but you can test for inclusion:
>>> 'x' in b
False
>>> 'foo' in b
True
Sets are as close to value-less dictionaries as it gets. They have average-case constant-time access, require hashable objects (i.e. no storing lists or dicts in sets), and even support their own comprehension syntax:
{x**2 for x in range(100)}  # use xrange in Python 2
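For example, trying to store an unhashable object in a set fails immediately:
>>> s = {1, 2}
>>> s.add([3, 4])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'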
Yes, sets:
set() -> new empty set object
set(iterable) -> new set object
Build an unordered collection of unique elements.
Related: How is set() implemented?
Time complexity: https://wiki.python.org/moin/TimeComplexity#set
In order to "key" into a set in constant time use in:
>>> s = set(['foo', 'bar', 'baz'])
>>> 'foo' in s
True
>>> 'fork' in s
False
In PyCharm, when I write:
return set([(sy + ady, sx + adx)])
it says "Function call can be replaced with set literal" so it replaces it with:
return {(sy + ady, sx + adx)}
Why is that? Isn't a set() in Python something different from a dictionary {}?
And if PyCharm wants to optimize this, why is the literal more efficient?
Python sets and dictionaries can both be constructed using curly braces:
my_dict = {'a': 1, 'b': 2}
my_set = {1, 2, 3}
The interpreter (and human readers) can distinguish between them based on their contents. However, it isn't possible to distinguish between an empty set and an empty dict: {} always means an empty dict, so you need to use set() to get an empty set.
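A quick REPL check illustrates the ambiguity:
>>> type({})
<class 'dict'>
>>> type(set())
<class 'set'>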
A very simple test suggests that the literal construction is faster (python3.5):
>>> timeit.timeit('a = set([1, 2, 3])')
0.5449375328607857
>>> timeit.timeit('a = {1, 2, 3}')
0.20525191631168127
This question covers some issues of performance of literal constructions over builtin functions, albeit for lists and dicts. The summary seems to be that literal constructions require less work from the interpreter.
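You can see that reduced work with the dis module: the literal pushes the constants and builds the set with a single BUILD_SET, while set([1, 2, 3]) must look up the name set, build a list, and make a function call (output from CPython 3.8; the exact bytecode varies between versions):
>>> import dis
>>> dis.dis('{1, 2, 3}')
  1           0 LOAD_CONST               0 (1)
              2 LOAD_CONST               1 (2)
              4 LOAD_CONST               2 (3)
              6 BUILD_SET                3
              8 RETURN_VALUE
>>> dis.dis('set([1, 2, 3])')
  1           0 LOAD_NAME                0 (set)
              2 LOAD_CONST               0 (1)
              4 LOAD_CONST               1 (2)
              6 LOAD_CONST               2 (3)
              8 BUILD_LIST               3
             10 CALL_FUNCTION            1
             12 RETURN_VALUE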
It is an alternative syntax for set():
>>> a = {1, 2}
>>> b = set()
>>> b.add(1)
>>> b.add(2)
>>> b
set([1, 2])
>>> a
set([1, 2])
>>> a == b
True
>>> type(a) == type(b)
True
dict syntax is different. It consists of key-value pairs. For example:
my_obj = {1: None, 2: None}
Another example of how set() and {} are not interchangeable (as jonrsharpe mentioned):
In: f = 'FH'
In: set(f)
Out: {'F', 'H'}
In: {f}
Out: {'FH'}
set([iterable]) is the constructor that creates a set from an optional iterable. The braces {} create set or dict literals, so what gets created depends on what you put inside them.
In [414]: x = {}
In [415]: type(x)
Out[415]: dict
In [416]: x = {1}
In [417]: type(x)
Out[417]: set
In [418]: x = {1: "hello"}
In [419]: type(x)
Out[419]: dict
Given a list iterator, you can find the original list via the pickle protocol:
>>> L = [1, 2, 3]
>>> Li = iter(L)
>>> Li.__reduce__()[1][0] is L
True
Given a dict iterator, how can you find the original dict? I could only find a hacky way using CPython implementation details (via garbage collector):
>>> import gc
>>> def get_dict(dict_iterator):
...     [d] = gc.get_referents(dict_iterator)
...     return d
...
>>> d = {}
>>> get_dict(iter(d)) is d
True
There is no API to find the source iterable of an iterator. This is intentional; iterators are meant to be single-use objects: iterate, then discard. As such, they often drop their reference to the underlying iterable once they reach the end, because what's the point of keeping it if you can't get more elements anyway?
You can see this in both the list and dict iterators: the hacks you found produce either empty objects or None once you are done iterating. List iterators use an empty list when pickled:
>>> l = [1]
>>> it = iter(l)
>>> it.__reduce__()[1][0] is l
True
>>> list(it) # exhaust the iterator
[1]
>>> it.__reduce__()[1][0] is l
False
>>> it.__reduce__()[1][0]
[]
and the dictionary iterator just sets the pointer to the original dictionary to null, so there are no referents left after that:
>>> import gc
>>> it = iter({'foo': 42})
>>> gc.get_referents(it)
[{'foo': 42}]
>>> list(it)
['foo']
>>> gc.get_referents(it)
[]
Both your hacks are just that: hacks. They are implementation-dependent and can, and probably will, change between Python releases. Currently, iter(dictionary).__reduce__() gets you the equivalent of (iter, ([remaining keys],)), a fresh list of the keys left to iterate, rather than access to the dictionary, because that's deemed a better implementation; future versions might use something different altogether.
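You can see the current behaviour in the REPL (CPython 3.8 shown; since this is an implementation detail, the exact result may differ in other releases):
>>> it = iter({'foo': 42, 'bar': 43})
>>> it.__reduce__()
(<built-in function iter>, (['foo', 'bar'],))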
For dictionaries, the only other option currently available is to access the di_dict pointer in the dictiter struct, with ctypes:
import ctypes

class PyObject_HEAD(ctypes.Structure):
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
    ]

class dictiterobject(ctypes.Structure):
    _fields_ = [
        ("ob_base", PyObject_HEAD),
        ("di_dict", ctypes.py_object),
        ("di_used", ctypes.c_ssize_t),
        ("di_pos", ctypes.c_ssize_t),
        ("di_result", ctypes.py_object),  # always NULL for dictkeys_iter
        ("len", ctypes.c_ssize_t),
    ]

def dict_from_dictiter(it):
    di = dictiterobject.from_address(id(it))
    try:
        return di.di_dict
    except ValueError:  # null pointer: the iterator is exhausted
        return None
This is just as much of a hack as relying on gc.get_referents():
>>> d = {'foo': 42}
>>> it = iter(d)
>>> dict_from_dictiter(it)
{'foo': 42}
>>> dict_from_dictiter(it) is d
True
>>> list(it)
['foo']
>>> dict_from_dictiter(it) is None
True
For now, at least in CPython versions up to and including Python 3.8, there are no other options available.
I have a nested dict like this, but much larger:
d = {'a': {'b': 'c'}, 'd': {'e': {'f':2}}}
I've written a function which takes a dictionary and a path of keys as input and returns the value associated with that path.
>>> p = 'd/e'
>>> get_from_path(d, p)
{'f': 2}
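The implementation isn't important here, but a minimal sketch of such a function, assuming '/'-separated string paths as above, might be:
def get_from_path(d, path):
    # walk the nested dict one key at a time
    for key in path.split('/'):
        d = d[key]
    return d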
Once I get the nested dictionary, I will need to modify it; however, d must not be modified. Do I need to use deepcopy, or is there a more efficient solution that doesn't require constantly making copies of the dictionary?
Depending on your use case, one approach to avoid making changes to an existing dictionary is to wrap it in a collections.ChainMap:
>>> import collections
>>> # here's a dictionary we want to avoid dirtying
>>> d = {i: i for i in range(10)}
>>> # wrap into a chain map and make changes there
>>> c = collections.ChainMap({}, d)
Now we can add new keys and values to c without corresponding changes happening in d:
>>> c[0] = -100
>>> print(c[0], d[0])
-100 0
Whether this solution is appropriate depends on your use case. In particular, the ChainMap will:
- not behave like a regular mapping when it comes to some things, like deleting keys:
>>> del c[0]
>>> print(c[0])
0
- still allow you to modify values in place:
>>> d = dict(a=[])
>>> collections.ChainMap({}, d)["a"].append(1)
which will alter the list in d.
However, if you merely wish to take your embedded dictionary and add some new keys and values to it, then a ChainMap may be appropriate.
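If you eventually need a standalone dictionary with the overrides applied, you can flatten the chain: dict(c) copies every key, with the child map taking precedence, and leaves the wrapped dictionary untouched:
>>> c = collections.ChainMap({}, {'a': 1})
>>> c['a'] = -1
>>> dict(c)
{'a': -1}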
In trying to use a list comprehension to make a list given a conditional, I see the following:
In [1]: mydicts = [{'foo':'val1'},{'foo':''}]
In [2]: mylist = [d for d in mydicts if d['foo']]
In [3]: mylist
Out[3]: [{'foo': 'val1'}]
In [4]: mydicts[1]['foo'] = 'val2'
In [5]: mydicts
Out[5]: [{'foo': 'val1'}, {'foo': 'val2'}]
In [6]: mylist
Out[6]: [{'foo': 'val1'}]
I've been reading the docs to try to understand this, but have come up with nothing so far, so I'll ask my question here: why does mylist never include {'foo': 'val2'}, even though the list comprehension iterated over mydicts, which by In [6] contains {'foo': 'val2'}? Is this because Python eagerly evaluates list comprehensions? Or is the lazy/eager dichotomy totally irrelevant to this?
There's no lazy evaluation of lists in Python. List comprehensions simply create a new list. If you want "lazy" evaluation, use a generator expression instead.
my_generator_expression = (d for d in mydicts if d['foo']) # note parentheses
mydicts[1]['foo'] = 'val2'
print(my_generator_expression) # >>> <generator object <genexpr> at 0x00000000>
for d in my_generator_expression:
print(d) # >>> {'foo': 'val1'}
# >>> {'foo': 'val2'}
Note that generators differ from lists in several important ways. Perhaps the most notable is that once you iterate over them, they are exhausted, so they're best to use if you only need the data they contain once.
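Continuing the session above: the for loop already exhausted the generator, so a second pass yields nothing:
>>> list(my_generator_expression)
[]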
I think you're a bit confused about what list comprehensions do.
When you do this:
[d for d in mydicts if d['foo']]
That evaluates to a new list. So, when you do this:
mylist = [d for d in mydicts if d['foo']]
You're assigning that list as the value of mylist. You can see this very easily:
assert type(mylist) == list
You're not assigning "a list comprehension" that gets reevaluated every time to mylist. There are no magic values in Python that get reevaluated every time. (You can fake them by, e.g., creating a class with a @property, but that's not really an exception; it's the expression myobj.myprop that's being reevaluated, not myprop itself.)
In fact, mylist = [d for d in mydicts if d['foo']] is basically the same as mylist = [1, 2, 3].* In both cases, you're creating a new list and assigning it to mylist. You wouldn't expect the second one to re-evaluate [1, 2, 3] each time (otherwise, doing mylist[0] = 0 wouldn't do much good, because as soon as you tried to view mylist you'd get a new, pristine list!). The same is true here.
* In Python 3.x, they aren't just basically the same; they're both just different types of list displays. In 2.x, it's a bit more murky, and they just happen to both evaluate to new list objects.
mylist contains the result of a previous list comprehension evaluation; it won't magically be updated just because you update a variable that was used in its computation.
To define a singleton tuple in Python, one uses singleton = ('singleton'). A Python dictionary can use a tuple as a key, as in
{('one', 'two'): 5}
But is it possible to do
{('singleton'): 5}
somehow?
Yes, you can do this, but not with ('Singleton'); you've got to use ('Singleton',).
The reason is that Python interprets parentheses around a single item as merely the item itself. Adding a comma enforces the tuple interpretation.
>>> d = {}
>>> d[('Thing')] = "one"
>>> d.keys()
['Thing']
>>> d[('Thing',)] = "another"
>>> d
{'Thing': 'one', ('Thing',): 'another'}
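Note that it's the comma, not the parentheses, that makes the tuple; the parentheses are optional here:
>>> t = 'Thing',
>>> t
('Thing',)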
Signify to Python that 'singleton' is a tuple to make it work:
>>> a = {}
>>> a[('singleton',)] = 5
>>> a
{('singleton',): 5}