I have a list of tuples with 97510 entries, like this:
a = [ (1,2,3), (3,4,5), (5,4,2)]
Every first value (index 0) is unique, and I need to find the tuples that share the same second value (index 1).
In the example, I need to find the second and third tuples, which have the second item 4 in common.
How do I do it?
If you want to find all matches:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for inner in a:
...     d[inner[1]].append(inner)
...
>>> d
defaultdict(<type 'list'>, {2: [(1, 2, 3)], 4: [(3, 4, 5), (5, 4, 2)]})
>>> d[4]
[(3, 4, 5), (5, 4, 2)]
If you want to pick out all matches for a particular second value:
>>> filter(lambda inner: inner[1] == 4, a)
[(3, 4, 5), (5, 4, 2)]
Edit: As pointed out in the comments, a list comprehension is preferable as it is more efficient for such work:
>>> [inner for inner in a if inner[1] == 4]
[(3, 4, 5), (5, 4, 2)]
Using timeit shows the list comprehension is about 2.5 times faster (on my machine, anyway):
>>> import timeit
>>> timeit.timeit('[inner for inner in a if inner[1] == 4]', 'a=[(1,2,3), (3,4,5), (5, 4, 2)]')
2.5041549205780029
>>> timeit.timeit('filter(lambda inner: inner[1] == 4, a)', 'a=[(1,2,3), (3,4,5), (5, 4, 2)]')
6.328679084777832
Here is one way to do it:
>>> from collections import defaultdict
>>> result = defaultdict(list)
>>> for item in a:
...     result[item[1]].append(item)
...
>>> result
defaultdict(<type 'list'>, {2: [(1, 2, 3)], 4: [(3, 4, 5), (5, 4, 2)]})
This will result in a dictionary of lists where all items with the same second value are in one list, with that value as the key.
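Since the question asks only for tuples that share a second value with another tuple, the grouped dictionary can be filtered down to the groups with more than one member. A minimal sketch, written for Python 3; the names groups and shared are chosen here just for illustration:

```python
from collections import defaultdict

a = [(1, 2, 3), (3, 4, 5), (5, 4, 2)]

# Group the tuples by their second item (index 1).
groups = defaultdict(list)
for item in a:
    groups[item[1]].append(item)

# Keep only the second values that occur in more than one tuple.
shared = {key: items for key, items in groups.items() if len(items) > 1}
print(shared)  # {4: [(3, 4, 5), (5, 4, 2)]}
```

Building the dictionary is a single pass over the list, which should matter with roughly 97510 tuples.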
Another alternative:
from operator import itemgetter
from itertools import groupby
a = [ (1,2,3), (3,4,5), (5,4,2)]
# groupby only groups adjacent items, so sort by the same key first
b = groupby(sorted(a, key=itemgetter(1)), itemgetter(1))
for val, group in b:
    print val, list(group)
# 2 [(1, 2, 3)]
# 4 [(3, 4, 5), (5, 4, 2)]
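One caveat with groupby: it only groups adjacent items, so the list should be sorted by the same key that is used for grouping. A small Python 3 sketch, using a made-up list (not the one from the question) in which sorting by the whole tuple would split a group:

```python
from itertools import groupby
from operator import itemgetter

# Illustrative data: the two tuples with second item 4 are not adjacent
# when the list is sorted by the whole tuple.
data = [(1, 4, 3), (3, 2, 5), (5, 4, 2)]
second = itemgetter(1)

# Sorting by the whole tuple leaves the key 4 in two separate runs.
unsorted_keys = [key for key, _ in groupby(sorted(data), second)]
print(unsorted_keys)  # [4, 2, 4]

# Sorting by the grouping key puts equal keys next to each other.
grouped = {key: list(group)
           for key, group in groupby(sorted(data, key=second), second)}
print(grouped)  # {2: [(3, 2, 5)], 4: [(1, 4, 3), (5, 4, 2)]}
```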
Note that you can also use groupby:
from itertools import groupby
data = [ (1,2,3), (3,4,5), (5,4,2)]
res = groupby(sorted(data, key=lambda x: x[1]), key=lambda x: x[1])
Edited as per comment
I played around with the problem and found one more solution. It is not the best one, but it works:
inputVals = [(1,2,3), (3,4,5), (5,4,2), (2,2,3), (7,3,1)]
for val in set(x[1] for x in inputVals):
    print val, list(set(sval for sval in inputVals if sval[1] == val))
I have a list of lists,
e.g. list_a = [[1,2,3], [2,3], [4,3,2], [2,3]]
I want to count how often each inner list occurs, like this:
[1,2,3]: 1
[2,3]: 2
[4,3,2]: 1
The collections module has a Counter class, but it does not work for unhashable elements such as lists. Currently I work around this indirectly, for example by converting the list [1,2,3] into the string "1_2_3". Is there another way to count the lists directly?
Not the prettiest way to do it, but this works:
list_a = [[1,2,3], [2,3], [4,3,2], [2,3]]
counts = {}
for x in list_a:
    counts.setdefault(tuple(x), list()).append(1)
for a, b in counts.items():
    counts[a] = sum(b)
print(counts)
{(2, 3): 2, (4, 3, 2): 1, (1, 2, 3): 1}
A possible approach to do this job is using a dict.
Create an empty dict.
Iterate over the list using a for loop.
For each element, check whether the dict already contains it.
If it doesn't, save it in the dict as a key with a value of 1. The value is the occurrence counter.
If it does, just increment its value.
Possible implementation:
occurrence_dict = {}
for item in list_a:
    key = str(item)  # lists are unhashable, so use their string form as the key
    if key in occurrence_dict:
        occurrence_dict[key] += 1
    else:
        occurrence_dict[key] = 1
print(occurrence_dict)
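The steps above can also be sketched with tuple keys instead of string keys, which keeps the original values recoverable from the result. This is an illustrative variant rather than the answer's exact code; the name occurrences is made up here:

```python
list_a = [[1, 2, 3], [2, 3], [4, 3, 2], [2, 3]]

occurrences = {}
for inner in list_a:
    key = tuple(inner)  # tuples are hashable, lists are not
    # dict.get returns 0 for a key that has not been seen yet.
    occurrences[key] = occurrences.get(key, 0) + 1

print(occurrences)  # {(1, 2, 3): 1, (2, 3): 2, (4, 3, 2): 1}
```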
You can achieve this easily by using tuples instead of lists:
from collections import Counter
c = Counter(tuple(item) for item in list_a)
# or
c = Counter(map(tuple, list_a))
# Counter({(2, 3): 2, (1, 2, 3): 1, (4, 3, 2): 1})
# exactly what you expected
(1, 2, 3) 1
(2, 3) 2
(4, 3, 2) 1
Way 1
Through the indexes of repeatable lists
list_a = [[1,2,3], [2,3], [4,3,2], [2,3], [1,2,3]] # just add some data
# step 1
dd = {i:v for i, v in enumerate(list_a)}
print(dd)
Out[1]:
{0: [1, 2, 3], 1: [2, 3], 2: [4, 3, 2], 3: [2, 3], 4: [1, 2, 3]}
# step 2
tpl = [tuple(x for x,y in dd.items() if y == b) for a,b in dd.items()]
print(tpl)
Out[2]:
[(0, 4), (1, 3), (2,), (1, 3), (0, 4)] # here is the tuple of indexes of matching lists
# step 3
result = {tuple(list_a[a[0]]):len(a) for a in set(tpl)}
print(result)
Out[3]:
{(4, 3, 2): 1, (2, 3): 2, (1, 2, 3): 2}
Way 2
Through converting nested lists to tuples
{i:[tuple(a) for a in list_a].count(i) for i in [tuple(a) for a in list_a]}
Out[1]:
{(1, 2, 3): 2, (2, 3): 2, (4, 3, 2): 1}
What is the best way to find the products of every combination of elements from a list?
E.g. if I have [a,b,c] as the input, I should get [a,b,c,a*b,a*c,b*c,a*b*c] as the output (the order of the output elements doesn't matter).
PS: Can we do it recursively? (E.g. you just need the product of a*b and c to get the product a*b*c.)
Any idea or suggestion is welcome. Thanks in advance!
Here you go:
from itertools import combinations
l = [2, 3, 5]
result = []
for i in range(1, len(l) + 1):
    result += list(combinations(l, i))
multiplied_result = [reduce(lambda x, y: x*y, lst) for lst in result]
Now if we print the result, we get
>>> print multiplied_result
[2, 3, 5, 6, 10, 15, 30]
You can use itertools.combinations within a list comprehension:
>>> from itertools import combinations
>>> def find_mul(li):
...     return [[reduce(lambda x,y:x*y,j) for j in combinations(li,i)] for i in xrange(2,len(li)+1)]
...
DEMO:
>>> [list(combinations([2,3,4],i)) for i in xrange(2,len([2,3,4])+1)]
[[(2, 3), (2, 4), (3, 4)], [(2, 3, 4)]]
>>> l=[2,3,4]
>>> find_mul(l)
[[6, 8, 12], [24]]
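On the PS in the question about reusing earlier products: one possible sketch builds the result incrementally, so each new element is multiplied with every product found so far. This is Python 3, and the function name subset_products is made up for this illustration:

```python
def subset_products(values):
    """Return the product of every non-empty combination of values."""
    products = []
    for v in values:
        # v on its own, plus v times every product of the earlier elements.
        products += [v] + [p * v for p in products]
    return products

result = subset_products([2, 3, 5])
print(sorted(result))  # [2, 3, 5, 6, 10, 15, 30]
```

Each product is computed with a single multiplication from an earlier one, so a*b*c comes from the stored a*b times c, as the question suggests.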
I am a newbie to Python and need to convert a list to a dictionary. I know that we can convert a list of tuples to a dictionary.
This is the input list:
L = [1,term1, 3, term2, x, term3,... z, termN]
and I want to convert this list to a list of tuples (or directly to a dictionary) like this:
[(1, term1), (3, term2), (x, term3), ...(z, termN)]
How can we do that easily in Python?
>>> L = [1, "term1", 3, "term2", 4, "term3", 5, "termN"]
# Create an iterator
>>> it = iter(L)
# zip the iterator with itself
>>> zip(it, it)
[(1, 'term1'), (3, 'term2'), (4, 'term3'), (5, 'termN')]
You want to group three items at a time?
>>> zip(it, it, it)
You want to group N items at a time?
# Create N copies of the same iterator
it = [iter(L)] * N
# Unpack the copies of the iterator, and pass them as parameters to zip
>>> zip(*it)
Try with the group clustering idiom:
zip(*[iter(L)]*2)
From https://docs.python.org/2/library/functions.html:
The left-to-right evaluation order of the iterables is guaranteed.
This makes possible an idiom for clustering a data series into
n-length groups using zip(*[iter(s)]*n).
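A short Python 3 sketch of that idiom applied to the list from the question. In Python 3 zip returns an iterator, so it is wrapped in list() here. Note that zip stops at the shortest input, so a trailing incomplete group is dropped:

```python
L = [1, "term1", 3, "term2", 4, "term3", 5, "termN"]

# [iter(L)] * 2 holds the SAME iterator twice, so zip pulls two
# consecutive items from it for every tuple it builds.
pairs = list(zip(*[iter(L)] * 2))
print(pairs)  # [(1, 'term1'), (3, 'term2'), (4, 'term3'), (5, 'termN')]

# A list of 2-tuples converts directly into a dictionary.
as_dict = dict(pairs)

# With an odd number of items, the last unpaired item is dropped.
odd = list(zip(*[iter([1, 2, 3])] * 2))
print(odd)  # [(1, 2)]
```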
Convert the list directly into a dictionary, using zip to pair consecutive even- and odd-indexed elements:
m = [ 1, 2, 3, 4, 5, 6, 7, 8 ]
d = { x : y for x, y in zip(m[::2], m[1::2]) }
or, since you are familiar with the tuple -> dict direction:
d = dict(t for t in zip(m[::2], m[1::2]))
or even:
d = dict(zip(m[::2], m[1::2]))
Using slicing?
L = [1, "term1", 2, "term2", 3, "term3"]
L = zip(L[::2], L[1::2])
print L
Try this:
>>> L = [1, "term1", 3, "term2", 4, "term3", 5, "termN"]
>>> it = iter(L)
>>> [(x, next(it)) for x in it ]
[(1, 'term1'), (3, 'term2'), (4, 'term3'), (5, 'termN')]
>>>
OR
>>> L = [1, "term1", 3, "term2", 4, "term3", 5, "termN"]
>>> [i for i in zip(*[iter(L)]*2)]
[(1, 'term1'), (3, 'term2'), (4, 'term3'), (5, 'termN')]
OR
>>> L = [1, "term1", 3, "term2", 4, "term3", 5, "termN"]
>>> map(None,*[iter(L)]*2)
[(1, 'term1'), (3, 'term2'), (4, 'term3'), (5, 'termN')]
>>>
[(L[i], L[i+1]) for i in xrange(0, len(L), 2)]
The code below handles both even- and odd-sized lists (with an odd-sized list, the last element ends up alone in a 1-tuple):
[tuple(L[i:i+2]) for i in range(0, len(L), 2)]
It seems like a simple task:
I am trying to merge 2 dictionaries without overwriting the values, appending them instead.
a = {1: [(1,1)],2: [(2,2),(3,3)],3: [(4,4)]}
b = {3: [(5,5)], 4: [(6,6)]}
number of tuples a = 4, number of tuples b = 2
This is why I have ruled out these options, since they overwrite:
all = dict(a.items() + b.items())
all = dict(a, **b)
all = a.update([b])
The following solution works just fine, BUT it also appends values to my original dictionary a:
all = {}
for k in a.keys():
    if k in all:
        all[k].append(a[k])
    else:
        all[k] = a[k]
for k in b.keys():
    if k in all:
        all[k].append(b[k])
    else:
        all[k] = b[k]
Output =
a = {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4), **[(5, 5)]**]}
b = {3: [(5, 5)], 4: [(6, 6)]}
all = {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4), [(5, 5)]], 4: [(6, 6)]}
number of tuples a = 5 !!!!!, number of tuples b = 2 (correct), number of tuples all = 6 (correct)
It appended the list [(5,5)] from b to a. I have no idea why this happens, because all my code does is write everything into the combined dictionary "all".
Can anyone tell me where it is changing dict a?
Any help is greatly welcome.
Use .extend instead of .append for merging lists together.
>>> example = [1, 2, 3]
>>> example.append([4, 5])
>>> example
[1, 2, 3, [4, 5]]
>>> example.extend([6, 7])
>>> example
[1, 2, 3, [4, 5], 6, 7]
Moreover, you can loop over the keys and values of both a and b together using itertools.chain:
from itertools import chain
all = {}
for k, v in chain(a.iteritems(), b.iteritems()):
    all.setdefault(k, []).extend(v)
.setdefault() looks up a key, and sets it to a default if it is not yet there. Alternatively you could use collections.defaultdict to do the same implicitly.
outputs:
>>> a
{1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4)]}
>>> b
{3: [(5,5)], 4: [(6,6)]}
>>> all
{1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4), (5, 5)], 4: [(6, 6)]}
Note that because we now create a clean new list for each key first, then extend, your original lists in a are unaffected. In your code you do not create a copy of the list; instead you copied the reference to the list. In the end both the all and the a dict values point to the same lists, and using append on those lists results in the changes being visible in both places.
It's easy to demonstrate that with simple variables instead of a dict:
>>> foo = [1, 2, 3]
>>> bar = foo
>>> bar
[1, 2, 3]
>>> bar.append(4)
>>> foo, bar
([1, 2, 3, 4], [1, 2, 3, 4])
>>> id(foo), id(bar)
(4477098392, 4477098392)
Both foo and bar refer to the same list, the list was not copied. To create a copy instead, use the list() constructor or use the [:] slice operator:
>>> bar = foo[:]
>>> bar.append(5)
>>> foo, bar
([1, 2, 3, 4], [1, 2, 3, 4, 5])
>>> id(foo), id(bar)
(4477098392, 4477098536)
Now bar is a new copy of the list and changes no longer are visible in foo. The memory addresses (the result of the id() call) differ for the two lists.
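Putting the two points together (extend instead of append, and a fresh list per key instead of a shared reference), a merge helper might look like the following. This is a Python 3 sketch, so it uses items() rather than iteritems(); the name merge_list_dicts is made up for this example:

```python
def merge_list_dicts(*dicts):
    """Merge dicts of lists into a new dict without touching the inputs."""
    merged = {}
    for d in dicts:
        for key, values in d.items():
            # setdefault creates a new empty list on first sight of a key,
            # so merged never shares a list object with the inputs.
            merged.setdefault(key, []).extend(values)
    return merged

a = {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4)]}
b = {3: [(5, 5)], 4: [(6, 6)]}
combined = merge_list_dicts(a, b)
print(combined)
# {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4), (5, 5)], 4: [(6, 6)]}
```

The name combined is used instead of all so the built-in all() function is not shadowed.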
If you want a third dictionary that is the combined one, I would use collections.defaultdict:
from collections import defaultdict
from itertools import chain
all = defaultdict(list)
for k,v in chain(a.iteritems(), b.iteritems()):
    all[k].extend(v)
outputs
defaultdict(<type 'list'>, {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4), (5, 5)], 4: [(6, 6)]})
As an explanation of why your a changes, consider your loop:
for k in a.keys():
    if k in all:
        all[k].append(a[k])
    else:
        all[k] = a[k]
So, if k is not yet in all, you enter the else part and now, all[k] points to the a[k] list. It's not a copy, it's a reference to a[k]: they're basically the same object. At the next iteration, all[k] is defined, and you append to it: but as all[k] points to a[k], you end up also appending to a[k].
You want to avoid the assignment all[k] = a[k]. You could try this:
for k in a.keys():
    if k not in all:
        all[k] = []
    all[k].extend(a[k])
(Note the extend instead of the append, as pointed out by @Martijn Pieters.) Here, you never have all[k] pointing to a[k], so you're safe. @Martijn Pieters' answer is far more concise and elegant, though, so you should go with it.
I have two lists of tuples, for example:
a = [(1,2,3),(4,5,6),(7,8,9)]
b = [(1,'a'),(4,'b'),(7,'c')]
The first elements of the tuples in a and b match, and I want to get a list like this:
merged = [(1,2,3,'a'),(4,5,6,'b'),(7,8,9,'c')]
Perhaps I will have another list like:
c = [(1,'xx'),(4,'yy'),(7,'zz')]
and merge it into the "merged" list later. I tried zip and map, which are not right for this case.
>>> a = [(1,2,3),(4,5,6),(7,8,9)]
>>> b = [(1,'a'),(4,'b'),(7,'c')]
>>>
>>> [x + (z,) for x, (y, z) in zip(a, b)]
[(1, 2, 3, 'a'), (4, 5, 6, 'b'), (7, 8, 9, 'c')]
To check that the first elements actually match:
>>> [x + y[1:] for x, y in zip(a, b) if x[0] == y[0]]
def merge(a,b):
    for ax, (first, bx) in zip(a,b):
        if ax[0] != first:
            raise ValueError("Items don't match")
        yield ax + (bx,)
print list(merge(a,b))
print list(merge(merge(a,b),c))
>>> [a[i]+(k,) for i,(j, k) in enumerate(b)]
[(1, 2, 3, 'a'), (4, 5, 6, 'b'), (7, 8, 9, 'c')]
Using timeit, this is the fastest of the posted solutions that return a merged list.
[ (x,y,z,b[i][1]) for i,(x,y,z) in enumerate(a) if x == b[i][0] ]
This makes sure that the values are matched and then merged.
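The approaches above pair items by position, so they rely on both lists being in the same order. If that is not guaranteed, one alternative is to look up the extra value by the first element. A Python 3 sketch under that assumption; the helper name merge_by_key is made up here, and it raises KeyError if a key from the first list is missing in the second:

```python
def merge_by_key(rows, extras):
    """Append to each row the value from extras that has the same first element."""
    lookup = dict(extras)  # maps first element -> value to append
    return [row + (lookup[row[0]],) for row in rows]

a = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
b = [(7, 'c'), (1, 'a'), (4, 'b')]  # deliberately out of order
c = [(1, 'xx'), (4, 'yy'), (7, 'zz')]

merged = merge_by_key(a, b)
print(merged)  # [(1, 2, 3, 'a'), (4, 5, 6, 'b'), (7, 8, 9, 'c')]

# A later list such as c can be merged in the same way.
merged_again = merge_by_key(merged, c)
```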