How to get Python to tell equal integers apart - python

I'm having a bit of a problem distinguishing between identical integers.
In the following (which is obviously a trivial case), a, b and c are integers. I wish to create a dictionary, diction, which will contain {a: 'foo', b: 'bar', c: 'baz'}:
diction = {}
for i in (a, b, c):
    j = ('foo', 'bar', 'baz')[(a, b, c).index(i)]
    diction[i] = j
All runs very nicely until, for example, a and b are the same: the third line will give index 0 for both a and b, resulting in j = 'foo' for each case.
I know lists can be copied by
list_a = [1, 2, 3]
list_b = list(list_a)
or
list_b = list_a[:]
So, is there any way of maybe doing this with my identical integers?
(I tried making one a float, but the value remains the same, so that doesn't work.)

To create a dictionary from two different iterables, you can use the following code:
d = dict(zip((a, b, c), ('foo', 'bar', 'baz')))
where zip is used to combine both iterables in a list of tuples that can be passed to the dictionary constructor.
Note that if a == b, then 'foo' will be overwritten with 'bar', since the values are added to the dictionary in the same order as they appear in the iterable, as if you were using this code:
d[a] = 'foo'
d[b] = 'bar'
d[c] = 'baz'
This is just the standard behaviour of a dictionary: when a new value is assigned to a key that is already present, the old value is overwritten.
If you prefer to keep all values in a list, then you can use a collections.defaultdict as follows:
from collections import defaultdict
d = defaultdict(list)
for key, value in zip((a, b, c), ('foo', 'bar', 'baz')):
    d[key].append(value)
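To see both behaviours side by side, here is a minimal sketch. The values a = 1, b = 1, c = 2 are assumed purely for illustration (the question does not give any):

```python
from collections import defaultdict

# Assumed example values: a and b are deliberately equal.
a, b, c = 1, 1, 2
labels = ('foo', 'bar', 'baz')

# Plain dict: the later value for an equal key replaces the earlier one.
plain = dict(zip((a, b, c), labels))
print(plain)  # {1: 'bar', 2: 'baz'}

# defaultdict(list): every value seen for an equal key is kept.
grouped = defaultdict(list)
for key, value in zip((a, b, c), labels):
    grouped[key].append(value)
print(dict(grouped))  # {1: ['foo', 'bar'], 2: ['baz']}
```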

You can't distinguish between identical objects.

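In CPython you can sometimes tell them apart with the is operator if they do not fall between -5 and 256, but this is an implementation detail rather than something to rely on.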
See also "is" operator behaves unexpectedly with integers
http://docs.python.org/c-api/int.html
The current implementation keeps an array of integer objects for all
integers between -5 and 256, when you create an int in that range you
actually just get back a reference to the existing object. So it
should be possible to change the value of 1. I suspect the behaviour
of Python in this case is undefined. :-)
In [30]: a = 257
In [31]: a is 257
Out[31]: False
In [32]: a = 256
In [33]: a is 256
Out[33]: True
You may have to roll your own dictionary like object that implements this sort of behavior though... and it still wouldn't be able to do anything between -5 and 256. I'd need to do more digging to be sure though.

If a and b have the same value then you can't expect them to point to different positions in dictionary if used as keys. Key values in dictionaries must be unique.
Also if you have two sequences the simplest way to make a dictionary out of them is to zip them together:
tup = (a,b,c)
val = ('foo', 'bar', 'baz')
diction = dict(zip(tup, val))
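This also explains why turning one of the integers into a float did not help. A small sketch (the values are made up for illustration): dictionary keys are matched by equality and hash, and 1 == 1.0 with the same hash, so both land on the same entry.

```python
d = {}
d[1] = 'foo'
d[1.0] = 'bar'  # 1 == 1.0 and hash(1) == hash(1.0), so this is the same key

print(d)       # {1: 'bar'}
print(len(d))  # 1
```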

All of the answers so far are correct - identical keys can't be re-used in a dictionary. If you absolutely have to try to do something like this, but can't ensure that a, b, and c have distinct values you could try something like this:
d = dict(zip((id(k) for k in (a,b,c)), ('foo', 'bar', 'baz')))
When you go to look up your values though, you'll have to remember to do so like this:
d[id(a)]
That might help, but I am not certain what you're actually after here.
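Bear in mind that, per the cached-integer behaviour quoted earlier, equal small integers are normally the very same object in CPython, so id() may not separate them either. If one entry per variable is really what is wanted, a different technique (not the one in this answer, just a sketch assuming that position is an acceptable stand-in for identity) is to key by position and store the integer alongside its label:

```python
# Assumed example values; a and b are deliberately equal.
a, b, c = 1, 1, 2
labels = ('foo', 'bar', 'baz')

# Key by position (0, 1, 2) and keep the (integer, label) pair as the value.
by_position = dict(enumerate(zip((a, b, c), labels)))

print(by_position)  # {0: (1, 'foo'), 1: (1, 'bar'), 2: (2, 'baz')}
```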

Related

Python defaultdict

I found something strange that I couldn't understand.
This is the case:
from collections import defaultdict
a = defaultdict(lambda: len(a))
This is just the part of the code, and the code has never defined 'a' above.
The questions are:
Is it possible to use defaultdict as is, not specifying the variable previously?
If possible, what is the meaning of that code?
Maybe it is best explained in an example:
>>> a = defaultdict(lambda: len(b))
>>> b = 'abcd'
>>> a[0]
4
As you can see, it is possible to use b in the lambda even though the b does not yet exist at that point. What is important is that b exists at the time when the lambda is executed. At that point, Python will look for a variable named b and use it.
Note also that the original code does not necessarily use length of the defaultdict itself. It simply evaluates whatever a is at that point. See this example:
>>> a = defaultdict(lambda: len(a))
>>> a['a']
0
>>> a['b']
1
So far, so good. But then rename some things:
>>> x = a
>>> a = []
>>> x['c']
0
>>> x['d']
0
Now the defaultdict is named x, but it does not use len(x). It still uses len(a). This caveat may become important if you send the defaultdict to a function where a does not mean anything.
You are telling the defaultdict: when I access a key that doesn't exist, call this lambda to get the initial value for that key. Since your lambda refers to a (i.e. the dict itself) and returns its length, any operation on a key that's not yet in the dict will use the current length of the dict as that key's starting value:
from collections import defaultdict
a = defaultdict(lambda: len(a))
a['one'] += 5    # here the dict length is 0, so the value is 0 + 5 = 5
a['two'] += 2    # here the dict length is 1, so the value is 1 + 2 = 3
a['three'] += 1  # here the dict length is 2, so the value is 2 + 1 = 3
print(a.items())
print(a['newval'])  # 'newval' doesn't exist, so the default is used: the length of the dict, i.e. 3
OUTPUT
dict_items([('one', 5), ('two', 3), ('three', 3)])
3
Here's how defaultdict works. Say you have a dict of lists and you're setting values for keys that might not exist. In that case you'd do something like this:
d = dict()
if some_key not in d:
    d[some_key] = list()
d[some_key].append(some_value)
defaultdict does this automatically for you when you pass it a callable, e.g. int, list or set, which it calls as int() (default value 0), list() (default value empty list) or set() (default value empty set) respectively. Your lambda is also a callable, which returns integers, so you'll have a dict with int values. But the value you get from the expression will depend on the size of the dict at the time the key is first accessed.
Can you do a = defaultdict(lambda: len(a))?
Yes, you can. The lambda will not be executed until called which is when it'll look up the name a. Compare these two cases.
f = lambda: len(a)
a = defaultdict(f)
a[0] # this is when the lambda is called for the first time
But,
g = lambda: len(b)
g() # this will raise a NameError
b = defaultdict(g)
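One place this pattern can be handy is handing out consecutive integer IDs to keys the first time they are seen. The following is only a sketch; the names word_ids and tokens are made up for illustration and do not come from the original code:

```python
from collections import defaultdict

# Each missing key gets the current size of the dict as its value,
# so new keys are numbered 0, 1, 2, ... in order of first appearance.
word_ids = defaultdict(lambda: len(word_ids))

tokens = ['the', 'cat', 'saw', 'the', 'dog']
ids = [word_ids[token] for token in tokens]

print(ids)             # [0, 1, 2, 0, 3]
print(dict(word_ids))  # {'the': 0, 'cat': 1, 'saw': 2, 'dog': 3}
```

As the renaming example above shows, this only keeps working while the name used inside the lambda still refers to the same defaultdict.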

how to ignore the order of elements in a tuple

I am using tuples as the key for a dictionary I created. For example:
example_dict = {}
example_dict[("A", "B")] = "1"
Later when I wish to modify the value of an entry in the dictionary I don't currently have control over the order of the tuple. For example:
("B", "A") may be the case, instead of ("A", "B")
I'm aware that these tuples are not equal from a simple == comparison that I tried in the python shell.
What I am wondering is how I could work around this? How could I make the following not produce a KeyError:
print (example_dict["B", "A"])
Is there a way to consistently order the elements of a tuple? Is there a way to ignore order completely? Any other work arounds? I'm aware I could just include all arrangements of the tuples as keys in the dictionary, and then collate the values of the different permutations later. I strongly want to avoid doing this as that only adds difficulty and complexity to the problem.
The usual ways are to either sort the keys:
example_dict[tuple(sorted(key_tuple))] = "1"
use frozensets as keys (if there won't be duplicate elements in the tuples):
example_dict[frozenset(key_tuple)] = "1"
or use frozensets of (item, count) tuples as keys (if there can be duplicate elements in the tuples):
example_dict[frozenset(Counter(key_tuple).items())] = "1"  # Counter is collections.Counter; on Python 2.7, viewitems() also works
Whichever option you choose, you'll have to apply the same transformation when you look up values.
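One way to make sure the same transformation is applied everywhere is to route both stores and lookups through a single function. A minimal sketch, where unordered_key is a hypothetical helper name (not part of any library) built on the Counter option so that duplicate elements still count:

```python
from collections import Counter

def unordered_key(items):
    """Hypothetical helper: an order-insensitive key that preserves duplicates."""
    return frozenset(Counter(items).items())

example_dict = {}
example_dict[unordered_key(("A", "B"))] = "1"
example_dict[unordered_key(("A", "A", "B"))] = "2"

print(example_dict[unordered_key(("B", "A"))])       # 1
print(example_dict[unordered_key(("A", "B", "A"))])  # 2
```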
You want your dictionary keys to be "sets" (a set is a collection for which an item is either in or not in the set, but that has no concept of order). Luckily python has what you need. Specifically because you need something hashable you want to use frozenset.
>>> example_dict = {}
>>> example_dict[frozenset(("A", "B"))] = "1"
>>> example_dict[frozenset(("B", "A"))]
'1'
>>> example_dict[frozenset(("A", "B"))]
'1'
Instead of using a tuple, use a frozenset. A frozenset is just a constant set, just as a tuple can be thought of as a constant list.
Here's an example (from Python 3, but it will work in Python 2 as well):
>>> d = {}
>>> k1 = frozenset((1, 2))
>>> k2 = frozenset((2, 1))
>>> k1
frozenset({1, 2})
>>> k2
frozenset({1, 2})
>>> k1 == k2
True
>>> d[k1] = 123
>>> d[k2]
123
>>>

evaluation of list comprehensions in python

In trying to use a list comprehension to make a list given a conditional, I see the following:
In [1]: mydicts = [{'foo':'val1'},{'foo':''}]
In [2]: mylist = [d for d in mydicts if d['foo']]
In [3]: mylist
Out[3]: [{'foo': 'val1'}]
In [4]: mydicts[1]['foo'] = 'val2'
In [5]: mydicts
Out[5]: [{'foo': 'val1'}, {'foo': 'val2'}]
In [6]: mylist
Out[6]: [{'foo': 'val1'}]
I've been reading the docs to try and understand this but have come up with nothing so far, so I'll ask my question here: why is it that mylist never includes {'foo': 'val2'} even though the reference in the list comprehension points to mydicts, which by In [6] contains {'foo': 'val2'}? Is this because Python eagerly evaluates list comprehensions? Or is the lazy/eager dichotomy totally irrelevant to this?
There's no lazy evaluation of lists in Python. List comprehensions simply create a new list. If you want "lazy" evaluation, use a generator expression instead.
my_generator_expression = (d for d in mydicts if d['foo']) # note parentheses
mydicts[1]['foo'] = 'val2'
print(my_generator_expression) # >>> <generator object <genexpr> at 0x00000000>
for d in my_generator_expression:
    print(d) # >>> {'foo': 'val1'}
             # >>> {'foo': 'val2'}
Note that generators differ from lists in several important ways. Perhaps the most notable is that once you iterate over them, they are exhausted, so they're best to use if you only need the data they contain once.
I think you're a bit confused about what list comprehensions do.
When you do this:
[d for d in mydicts if d['foo']]
That evaluates to a new list. So, when you do this:
mylist = [d for d in mydicts if d['foo']]
You're assigning that list as the value of mylist. You can see this very easily:
assert type(mylist) == list
You're not assigning "a list comprehension" that gets reevaluated every time to mylist. There are no magic values in Python that get reevaluated every time. (You can fake them by, e.g., creating a class with a @property, but that's not really an exception; it's the expression myobj.myprop that's being reevaluated, not myprop itself.)
In fact, mylist = [d for d in mydicts if d['foo']] is basically the same as mylist = [1, 2, 3].* In both cases, you're creating a new list, and assigning it to mylist. You wouldn't expect the second one to re-evaluate [1, 2, 3] each time (otherwise, doing mylist[0] = 0 wouldn't do much good, because as soon as you try to view mylist you'd be getting a new, pristine list!). The same is true here.
* In Python 3.x, they aren't just basically the same; they're both just different types of list displays. In 2.x, it's a bit more murky, and they just happen to both evaluate to new list objects.
mylist contains the result of a previous list comprehension evaluation. It won't magically update just because you update a variable that was used for its computation.
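A short sketch that makes the distinction concrete. The filter runs once, when the comprehension is evaluated, but the resulting list still refers to the same dict objects, so it does see later changes to the dicts it already contains. The helper name non_empty is made up for illustration:

```python
mydicts = [{'foo': 'val1'}, {'foo': ''}]

def non_empty(dicts):
    # Hypothetical helper: re-runs the filter each time it is called.
    return [d for d in dicts if d['foo']]

mylist = non_empty(mydicts)
mydicts[1]['foo'] = 'val2'

print(len(mylist))              # 1: the old result is not re-filtered
print(len(non_empty(mydicts)))  # 2: calling again evaluates a new list

mydicts[0]['foo'] = 'changed'
print(mylist[0])                # {'foo': 'changed'}: same dict object
```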

Iterate over a pair of iterables, sorted by an attribute

One way (the fastest way?) to iterate over a pair of iterables a and b in sorted order is to chain them and sort the chained iterable:
from itertools import chain
for i in sorted(chain(a, b)):
    print(i)
For instance, if the elements of each iterable are:
a: 4, 6, 1
b: 8, 3
then this construct would produce elements in the order
1, 3, 4, 6, 8
However, if the iterables iterate over objects, this sorts the objects by their memory address. Assuming each iterable iterates over the same type of object,
1. What is the fastest way to iterate over a particular attribute of the objects, sorted by this attribute?
2. What if the attribute to be chosen differs between iterables? If iterables a and b both iterate over objects of type foo, which has attributes foo.x and foo.y of the same type, how could one iterate over elements of a sorted by x and b sorted by y?
For an example of #2, if
a: (x=4,y=3), (x=6,y=2), (x=1,y=7)
b: (x=2,y=8), (x=2,y=3)
then the elements should be produced in the order
1, 3, 4, 6, 8
as before. Note that only the x attributes from a and the y attributes from b enter into the sort and the result.
Tim Pietzcker has already answered for the case where you're using the same attribute for each iterable. If you're using different attributes of the same type, you can do it like this (using complex numbers as a ready-made class that has two attributes of the same type):
In Python 2:
>>> a = [1+4j, 7+0j, 3+6j, 9+2j, 5+8j]
>>> b = [2+5j, 8+1j, 4+7j, 0+3j, 6+9j]
>>> keyed_a = ((n.real, n) for n in a)
>>> keyed_b = ((n.imag, n) for n in b)
>>> from itertools import chain
>>> sorted_ab = zip(*sorted(chain(keyed_a, keyed_b), key=lambda t: t[0]))[1]
>>> sorted_ab
((1+4j), (8+1j), (3+6j), 3j, (5+8j), (2+5j), (7+0j), (4+7j), (9+2j), (6+9j))
Since in Python 3 zip() returns an iterator, we need to coerce it to a list before attempting to subscript it:
>>> # ... as before up to 'from itertools import chain'
>>> sorted_ab = list(zip(*sorted(chain(keyed_a, keyed_b), key=lambda t: t[0])))[1]
>>> sorted_ab
((1+4j), (8+1j), (3+6j), 3j, (5+8j), (2+5j), (7+0j), (4+7j), (9+2j), (6+9j))
Answer to question 1: You can provide a key attribute to sorted(). For example if you want to sort by the object's .name, then use
sorted(chain(a, b), key=lambda x: x.name)
As for question 2: I guess you'd need another attribute for each object (like foo.z, as suggested by Zero Piraeus) that can be accessed by sorted(), since that function has no way of telling where the object it's currently sorting used to come from. After all, it is receiving a new iterator from chain() which doesn't contain any information about whether the current element is from a or b.
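For question 2, the keying idea from the complex-number answer can be applied directly to the example in the question, without adding an extra attribute. This is only a sketch: Foo is a stand-in namedtuple for the asker's class, which the question does not show.

```python
from collections import namedtuple
from itertools import chain

# Stand-in for the asker's class with attributes x and y (an assumption).
Foo = namedtuple('Foo', 'x y')

a = [Foo(x=4, y=3), Foo(x=6, y=2), Foo(x=1, y=7)]
b = [Foo(x=2, y=8), Foo(x=2, y=3)]

# Pair each object with its sort key before chaining:
# x for objects from a, y for objects from b.
keyed = chain(((obj.x, obj) for obj in a), ((obj.y, obj) for obj in b))
ordered = sorted(keyed, key=lambda pair: pair[0])

values = [key for key, obj in ordered]
print(values)  # [1, 3, 4, 6, 8]
```

Sorting on pair[0] only (rather than on the whole tuple) avoids comparing the objects themselves when two keys are equal.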

python sort a list of objects based on attributes in the order of the other list

I am working with Python list sort.
I have two lists: one is a list of integers, the other is a list of objects. Each object has an attribute id, which is also an integer. I want to sort the object list by the id attribute, in the order in which the same ids appear in the first list. Here is an example:
I have a = [1,2,3,4,5]
and b = [o,p,q,r,s], where o.id = 2, p.id = 1, q.id = 3, r.id = 5, s.id = 4
and I want my list b to be sorted in the order in which its ids appear in list a, like this:
sorted_b = [p, o, q, s, r]
Of course, I can achieve this by using nested loops:
sorted_b = []
for i in a:
    for j in b:
        if j.id == i:
            sorted_b.append(j)
            break
but this is a classic ugly and un-Pythonic way to solve the problem. I wonder if there is a neater way to do this, such as using the sort method, but I don't know how.
>>> from collections import namedtuple
>>> Foo = namedtuple('Foo', 'name id') # this represents your class with id attribute
>>> a = [1,2,3,4,5]
>>> b = [Foo(name='o', id=2), Foo(name='p', id=1), Foo(name='q', id=3), Foo(name='r', id=5), Foo(name='s', id=4)]
>>> sorted(b, key=lambda x: a.index(x.id))
[Foo(name='p', id=1), Foo(name='o', id=2), Foo(name='q', id=3), Foo(name='s', id=4), Foo(name='r', id=5)]
This is a simple way to do it:
# Create a dictionary that maps from an ID to the corresponding object
object_by_id = dict((x.id, x) for x in b)
sorted_b = [object_by_id[i] for i in a]
If the list gets big, it's probably the fastest way, too.
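If you would rather keep using sorted() with a key, the same dictionary idea can be turned around: map each id to its position in a once, instead of calling a.index() for every element. A sketch, reusing a namedtuple as a stand-in for the real class (the name position is made up for illustration):

```python
from collections import namedtuple

# Stand-in for the asker's class with an id attribute (an assumption).
Foo = namedtuple('Foo', 'name id')

a = [1, 2, 3, 4, 5]
b = [Foo('o', 2), Foo('p', 1), Foo('q', 3), Foo('r', 5), Foo('s', 4)]

# Map each id to its position in a, so each key lookup is a dict access.
position = {value: index for index, value in enumerate(a)}
sorted_b = sorted(b, key=lambda obj: position[obj.id])

print([obj.name for obj in sorted_b])  # ['p', 'o', 'q', 's', 'r']
```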
You can do it with a list comprehension, but in general it is the same as the nested loops.
sorted_b = [ y for x in a for y in b if y.id == x ]
There is a sorted function in Python. In Python 2 it takes an optional keyword argument cmp, to which you can pass your own comparison function.
cmp definition from the docs:
custom comparison should return a negative, zero or positive number depending on whether the first argument is considered smaller than, equal to, or larger than the second argument
a = [1,2,3,4,5]
def compare(el1, el2):
    if a.index(el1.id) < a.index(el2.id): return -1
    if a.index(el1.id) > a.index(el2.id): return 1
    return 0
sorted(b, cmp=compare)
This is more straightforward; however, I would encourage you to use the key argument as jamylak described in his answer, because it's more Pythonic and in Python 3 cmp is no longer supported.
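If you do have a comparison function and need it on Python 3, functools.cmp_to_key can wrap it so that it can be passed as key. A sketch using a namedtuple as a stand-in for the real class:

```python
from collections import namedtuple
from functools import cmp_to_key

# Stand-in for the asker's class with an id attribute (an assumption).
Foo = namedtuple('Foo', 'name id')

a = [1, 2, 3, 4, 5]
b = [Foo('o', 2), Foo('p', 1), Foo('q', 3), Foo('r', 5), Foo('s', 4)]

def compare(el1, el2):
    # Negative, zero or positive, depending on the positions of the ids in a.
    return a.index(el1.id) - a.index(el2.id)

sorted_b = sorted(b, key=cmp_to_key(compare))
print([obj.name for obj in sorted_b])  # ['p', 'o', 'q', 's', 'r']
```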
