Given a list of iterables:
li = [(1,2), (3,4,8), (3,4,7), (9,)]
I want to sort by the third element if present, otherwise leave the order unchanged. So here the desired output would be:
[(1,2), (3,4,7), (3,4,8), (9,)]
Using li.sort(key=lambda x:x[2]) raises an IndexError. I tried a custom function:
def safefetch(li, idx):
    try:
        return li[idx]
    except IndexError:
        return  # (ie return None)
li.sort(key=lambda x: safefetch(x, 2))
But a None key raises a TypeError when it is compared with the integer keys during sorting.
Broader context: I first want to sort by the first element, then the second, then the third, and so on up to the length of the longest element; i.e. I want to run several sorts of decreasing priority (as in SQL's ORDER BY COL1, COL2), while preserving the order among elements to which a given key does not apply. So: first sort everything by the first element; then, among the ties on el_1, sort on el_2, and so on until el_n. My feeling is that calling a sort function on the whole list is probably the wrong approach.
(Note that this was an "XY question": for my actual question, just using sorted on tuples is simplest, as Patrick Artner pointed out in the comments. But the question as posed is trickier.)
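To illustrate the note above: Python compares tuples lexicographically, so plain sorted already orders by the first element, then the second, and so on, and a shorter tuple sorts before a longer one that shares its prefix. A minimal sketch using the sample list from the question:

```python
li = [(1, 2), (3, 4, 8), (3, 4, 7), (9,)]

# Tuples compare element by element, so no key function is needed.
result = sorted(li)
print(result)  # [(1, 2), (3, 4, 7), (3, 4, 8), (9,)]
```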
We can first get the indices for distinct lengths of elements in the list via a defaultdict and then sort each sublist with numpy's fancy indexing:
from collections import defaultdict
import numpy as np
# {length -> inds} mapping
d = defaultdict(list)
# collect indices per length
for j, tup in enumerate(li):
    d[len(tup)].append(j)
# sort
li = np.array(li, dtype=object)
for inds in d.values():
    li[inds] = sorted(li[inds])
# convert back to list if desired
li = li.tolist()
to get li at the end as
[(1, 2), (3, 4, 7), (3, 4, 8), (9,)]
For some other samples:
In [134]: the_sorter([(12,), (3,4,8), (3,4,7), (9,)])
Out[134]: [(9,), (3, 4, 7), (3, 4, 8), (12,)]
In [135]: the_sorter([(12,), (3,4,8,9), (3,4,7), (11, 9), (9, 11), (2, 4, 4, 4)])
Out[135]: [(12,), (2, 4, 4, 4), (3, 4, 7), (9, 11), (11, 9), (3, 4, 8, 9)]
where the_sorter is the above procedure wrapped in a function (the name lacks imagination...):
def the_sorter(li):
    # {length -> inds} mapping
    d = defaultdict(list)
    # collect indices per length
    for j, tup in enumerate(li):
        d[len(tup)].append(j)
    # sort; dtype=object is needed because the tuples have different lengths
    li = np.array(li, dtype=object)
    for inds in d.values():
        li[inds] = sorted(li[inds])
    return li.tolist()
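If you would rather avoid the numpy dependency, the same idea (group positions by tuple length, sort each group, write the sorted tuples back into the same positions) can be sketched in plain Python. The name sort_within_lengths is made up for this illustration; it is not from the answer above.

```python
from collections import defaultdict

def sort_within_lengths(items):
    """Sort tuples only among tuples of the same length, keeping each
    length group in the positions it originally occupied."""
    positions = defaultdict(list)  # {length -> list of indices}
    for i, tup in enumerate(items):
        positions[len(tup)].append(i)

    result = list(items)
    for idxs in positions.values():
        # Sort this group's tuples and put them back into the group's slots.
        for i, tup in zip(idxs, sorted(items[j] for j in idxs)):
            result[i] = tup
    return result

print(sort_within_lengths([(1, 2), (3, 4, 8), (3, 4, 7), (9,)]))
# [(1, 2), (3, 4, 7), (3, 4, 8), (9,)]
```

On the other two samples shown above it should give the same results as the_sorter.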
Whatever you return as fallback value must be comparable to the other key values that might be returned. In your example that would require a numerical value.
import sys
def safefetch(li, idx):
    try:
        return li[idx]
    except IndexError:
        return sys.maxsize  # very large sentinel (Python 3 ints are unbounded, so not a true maximum)
This would put all the short ones at the back of the sort order, but maintain a stable order among them.
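A small variation, sketched here as an assumption rather than as part of the answer above: float("inf") compares greater than every finite number, so it can serve as the fallback without relying on sys.maxsize. The default parameter is added for this illustration only.

```python
def safefetch(seq, idx, default=float("inf")):
    """Return seq[idx], or a fallback that sorts after any finite number."""
    try:
        return seq[idx]
    except IndexError:
        return default

li = [(1, 2), (3, 4, 8), (3, 4, 7), (9,)]
result = sorted(li, key=lambda t: safefetch(t, 2))
print(result)  # [(3, 4, 7), (3, 4, 8), (1, 2), (9,)]
```

Note that the short tuples move to the back (in their original relative order), which is not quite the output the question asked for.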
Inspired by @Mustafa Aydın, here is a solution in pandas. I would prefer one without the memory overhead of a DataFrame, but this might be good enough.
import pandas as pd
li = [(1,2), (3,4,8), (3,4,7), (9,)]
tmp = pd.DataFrame(li)
[tuple(int(el) for el in t if not pd.isna(el)) for t in tmp.sort_values(by=tmp.columns.tolist()).values]
> [(1, 2), (3, 4, 7), (3, 4, 8), (9,)]
Alright. So I've been through some SO answers such as Find an element in a list of tuples in python, but they don't seem specific to my case, and I can't figure out how to apply them to my problem.
Let us say I have a list of tuples of tuples; i.e. the list stores several data points, each referring to a Cartesian point. Each outer tuple represents the entire data of the point. Inside it is an inner tuple, which is the point itself. That is, let us take the point (1,2) and let 5 carry some meaning for this point. The outer tuple will then be ((1,2),5).
Well, it is easy to figure out how to generate this. However, I want to search for an outer tuple based on the value of the inner tuple. That is I wanna do:
for y in range(0, 10):
    for x in range(0, 10):
        if (x, y) in ###:
            print("Found")
or something of this sense. How can this be done?
Based on the suggestion posted as a comment by @timgen, here is some pseudo-sample data.
The list is gonna be
selectPointSet = [((9, 2), 1), ((4, 7), 2), ((7, 3), 0), ((5, 0), 0), ((8, 1), 2)]
So I may wanna iterate through the whole domain of points which ranges from (0,0) to (9,9) and do something if the point is one among those in selectPointSet; i.e. if it is (9, 2), (4, 7), (7, 3), (5, 0) or (8, 1)
Using your current data structure, you can do it like this:
listTuple = [((1,1),5),((2,3),5)]  # dummy list of tuples
for y in range(0, 10):
    for x in range(0, 10):
        for i in listTuple:  # loop through the list of tuples
            if (x, y) == i[0]:  # compare (x, y) with the inner tuple
                print(str((x,y)), "Found")
You can make use of a dictionary.
temp = [((1,2),3),((2,3),4),((6,7),4)]
newDict = {}
# a dictionary with inner tuple as key
for t in temp:
    newDict[t[0]] = t[1]
for y in range(0, 10):
    for x in range(0, 10):
        if (x, y) in newDict:
            print("Found")
I hope this is what you are asking for.
Make a set from the two-element tuples for O(1) lookup.
>>> data = [((1,2),3),((2,3),4),((6,7),4)]
>>> tups = {x[0] for x in data}
Now you can query tups with any tuple you like.
>>> (6, 7) in tups
True
>>> (3, 2) in tups
False
Searching for values from 0 to 9:
>>> from itertools import product
>>> for x, y in product(range(10), range(10)):
...     if (x, y) in tups:
...         print('found ({}, {})'.format(x, y))
...
found (1, 2)
found (2, 3)
found (6, 7)
If you need to retain information about the third number (and the two-element inner tuples in data are unique) then you can also construct a dictionary instead of a set.
>>> d = dict(data)
>>> d
{(1, 2): 3, (2, 3): 4, (6, 7): 4}
>>> (2, 3) in d
True
>>> d[(2, 3)]
4
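Applied to the sample data from the question, the dictionary approach might look like the sketch below. The names labels and found are made up for this illustration, and it assumes the inner points are unique.

```python
from itertools import product

selectPointSet = [((9, 2), 1), ((4, 7), 2), ((7, 3), 0), ((5, 0), 0), ((8, 1), 2)]

# Map each inner point to its attached value for O(1) membership tests.
labels = dict(selectPointSet)

# Walk the whole 10 x 10 domain and keep the points present in the data.
found = [(point, labels[point])
         for point in product(range(10), repeat=2)
         if point in labels]

for point, value in found:
    print("Found", point, "with value", value)
```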
In the following code, I want to multiprocess sum_ for three different values of z, which are contained in np.array([1,2,3]):
from multiprocessing import Pool
from functools import partial
import numpy as np
def sum_(x, y, z):
    return x**1+y**2+z**3
sum_partial = partial(sum_, x = 1, y = 2) # freeze x and y
a = np.array([1,2,3]) # three different values for z
p = Pool(4)
p.map(sum_partial, a)
p.map(sum_partial, a) gives the following error: TypeError: sum_() got multiple values for keyword argument 'x', because Python passes each element of a positionally, so it binds to x, which is already frozen as a keyword. How can I make each value of np.array([1,2,3]) fill the argument z of sum_ instead of x, so that I get the following result:
[6, 13, 32]
which are respectively:
sum_partial(z=1), sum_partial(z=2), sum_partial(z=3)
?
I would like to keep using pool.map.
Btw, is it possible to use multiprocessing with both an array of y and an array of z, to finally get a list of len(y)*len(z) values?
I found my answer here.
In my case it'll be:
import multiprocessing as mp
def sum_(x, y, z):
    return x**1+y**2+z**3
def mf_wrap(args):
    return sum_(*args)
p = mp.Pool(4)
a = [1,2,3]
b = [0.1,0.2,0.3]
fl = [(1, i, j) for i in a for j in b]
# mf_wrap = lambda args: sum_(*args) -> more compact, but it won't work: Pool.map has to pickle the function, and lambdas can't be pickled
p.map(mf_wrap, fl)
According to this thread and PEP 309, partial cannot skip over the first (leftmost) positional parameter: a value passed positionally always fills it, even if it was already frozen by keyword.
Hence, you should slightly modify your code so that your iterable z is the first argument:
def sum_(z, x, y):
    return x**1+y**2+z**3
This works for me and yields the desired result.
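Another option, offered as a sketch rather than as what the thread recommends: keep the original signature and freeze x and y positionally instead of by keyword. The mapped value then lands in the next free slot, which is z. The built-in map is used below to keep the example self-contained; pool.map should accept the same partial object, since sum_ is defined at module top level and can be pickled.

```python
from functools import partial

def sum_(x, y, z):
    return x**1 + y**2 + z**3

# Positional freezing fills x and y, so the next positional argument is z.
sum_partial = partial(sum_, 1, 2)

results = list(map(sum_partial, [1, 2, 3]))
print(results)  # [6, 13, 32]
```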
Edit:
Regarding your second question, you could use itertools to generate the arguments:
import itertools
a = [1, 2, 3]
b = [7, 8, 9]
c = list(itertools.product(a, b))
print(c)
Out[74]: [(1, 7), (1, 8), (1, 9), (2, 7), (2, 8), (2, 9), (3, 7), (3, 8), (3, 9)]
In that case, your sum_ should expect a tuple as input:
def sum_(values, z):
    x, y = values
    return x**1+y**2+z**3
sum_partial = partial(sum_, z=2)
list(map(sum_partial, c))
Out[88]: [58, 73, 90, 59, 74, 91, 60, 75, 92]
I have an array of 3-tuples and I want to sort them in order of decreasing product of the elements of each tuple in Python. So, for example, given the array
[(3,2,3), (2,2,2), (6,4,1)]
since 3*2*3 = 18, 2*2*2 = 8, 6*4*1 = 24, the final result would be
[(6,4,1), (3,2,3), (2,2,2)]
I know how to sort by, for example, the first element of the tuple, but I'm not sure how to tackle this.
Any help would be greatly appreciated. Thanks!
Use the key argument of sorted/list.sort to specify a function for computing the product, and set the reverse argument to True to make the results descending rather than ascending, e.g.:
from functools import reduce  # reduce is no longer a builtin in Python 3
from operator import mul
print(sorted([(3,2,3), (2,2,2), (6,4,1)], key=lambda tup: reduce(mul, tup), reverse=True))
In [176]: L = [(3,2,3), (2,2,2), (6,4,1)]
In [177]: L.sort(key=lambda t: t[0]*t[1]*t[2], reverse=True)
In [178]: L
Out[178]: [(6, 4, 1), (3, 2, 3), (2, 2, 2)]
A simpler solution from my point of view:
a = [(3,2,3), (2,2,2), (6,4,1)]
def f(L):
    return L[0]*L[1]*L[2]
print(sorted(a, key=f, reverse=True))
key must be a function that returns the value used to sort the list.
reverse is True because you want the result in decreasing order.
>>> from functools import reduce
>>> from operator import mul
>>> input_list = [(3,2,3), (2,2,2), (6,4,1)]
>>> input_list.sort(key=lambda tup: reduce(mul, tup), reverse=True)
>>> print(input_list)
[(6, 4, 1), (3, 2, 3), (2, 2, 2)]
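On Python 3.8 or later, math.prod computes the product of an iterable directly, so it can serve as the key function on its own. A minimal sketch under that version assumption, using the sample data from the question:

```python
import math

triples = [(3, 2, 3), (2, 2, 2), (6, 4, 1)]

# math.prod multiplies all elements of each tuple; reverse=True gives
# decreasing order of the product.
by_product = sorted(triples, key=math.prod, reverse=True)
print(by_product)  # [(6, 4, 1), (3, 2, 3), (2, 2, 2)]
```

Unlike the hard-coded L[0]*L[1]*L[2] variants, this works for tuples of any length.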
I have a list of lists of tuples
A= [ [(1,2,3),(4,5,6)], [(7,8,9),(8,7,6),(5,4,3)],[(2,1,0),(1,3,5)] ]
The outer list can have any number of inner lists, the inner lists can have any number of tuples, a tuple always has 3 integers.
I want to generate all combination of tuples, one from each list:
[(1,2,3),(7,8,9),(2,1,0)]
[(1,2,3),(7,8,9),(1,3,5)]
[(1,2,3),(8,7,6),(2,1,0)]
...
[(4,5,6),(5,4,3),(1,3,5)]
A simple way to do it is to use a function similar to itertools.product(),
but it must be called like this:
itertools.product([(1,2,3),(4,5,6)], [(7,8,9),(8,7,6),(5,4,3)],[(2,1,0),(1,3,5)])
i.e. the outer list is removed, and I don't know how to do that. Is there a better way to generate all combinations of tuples?
itertools.product(*A)
For more details check the python tutorial
This works for your example, if there is only one level of nested lists (no lists of lists of lists):
itertools.product(*A)
you can probably call itertools.product like so:
itertools.product(*A) # where A is your list of lists of tuples
This way it expands your list's elements into arguments for the function you are calling.
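For the sample list from the question, the unpacking call can be checked with a short sketch like this one (the name combos is made up for the illustration). product yields tuples, so convert each with list(...) if lists are needed.

```python
import itertools

A = [[(1, 2, 3), (4, 5, 6)],
     [(7, 8, 9), (8, 7, 6), (5, 4, 3)],
     [(2, 1, 0), (1, 3, 5)]]

# *A passes each inner list as a separate argument to product.
combos = list(itertools.product(*A))

print(len(combos))  # 12, i.e. 2 * 3 * 2
print(combos[0])    # ((1, 2, 3), (7, 8, 9), (2, 1, 0))
print(combos[-1])   # ((4, 5, 6), (5, 4, 3), (1, 3, 5))
```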
Late to the party but ...
I'm new to python and come from a lisp background. This is what I came up with (check out the var names for lulz):
def flatten(lst):
    if lst:
        car, *cdr = lst  # head and tail, lisp style
        if isinstance(car, list):
            if cdr: return flatten(car) + flatten(cdr)
            return flatten(car)
        if cdr: return [car] + flatten(cdr)
        return [car]
    return []  # an empty list flattens to an empty list
Seems to work. Test:
A = [ [(1,2,3),(4,5,6)], [(7,8,9),(8,7,6),(5,4,3)],[(2,1,0),(1,3,5)] ]
flatten(A)
Result:
[(1, 2, 3), (4, 5, 6), (7, 8, 9), (8, 7, 6), (5, 4, 3), (2, 1, 0), (1, 3, 5)]
Note: the line car,*cdr=lst only works in Python 3.0 and later.
This is not exactly one step, but this would do what you want if for some reason you don't want to use the itertools solution:
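def crossprod(listoflists):
    # base case: a single list yields one single-element combination per item
    if len(listoflists) == 1:
        return [[item] for item in listoflists[0]]
    else:
        result = []
        # combinations of all the remaining lists, computed recursively
        remaining_product = crossprod(listoflists[1:])
        for outertuple in listoflists[0]:
            for innercombo in remaining_product:
                # prepend the current tuple to each remaining combination
                result.append([outertuple] + innercombo)
        return result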
def flatten(A):
    ans = []
    for i in A:
        if type(i) == list:
            ans.extend(i)  # one level of nesting: add the inner items
        else:
            ans.append(i)
    return ans
This may also be achieved using list comprehension.
In [62]: A = [ [(1,2,3),(4,5,6)], [(7,8,9),(8,7,6),(5,4,3)],[(2,1,0),(1,3,5)] ]
In [63]: improved_list = [num for elem in A for num in elem]
In [64]: improved_list
Out[64]: [(1, 2, 3), (4, 5, 6), (7, 8, 9), (8, 7, 6), (5, 4, 3), (2, 1, 0), (1, 3, 5)]