I have 2 lists:
edges = [[0,1],[0,2],[0,3],[1,2],[1,3]]
weight = [10,8,7,3,7]
edges is a list of edges, each connecting two nodes, and weight holds the corresponding weights.
For each starting node (edges[i][0]) I want to choose the connection with the smallest weight, so in this case the result would look like:
connect = [[0,3],[1,2]]
weight = [7,3]
Of all the nodes connected to 0, node 3 is the closest one, and for node 1, node 2 is the closest one.
I am not able to formulate the problem. Any help is appreciated!
edges = [[0,1],[0,2],[0,3],[1,2],[1,3]]
weight = [10,8,7,3,7]
connect = []
wght = []
In [8]: for i in set(e[0] for e in edges):
   ...:     temp = [(a, b) for a, b in zip(edges, weight) if a[0] == i]
   ...:     temp = min(temp, key=lambda x: x[1])
   ...:     connect += [temp[0]]
   ...:     wght += [temp[1]]
In [9]: connect
Out[9]: [[0, 3], [1, 2]]
In [10]: wght
Out[10]: [7, 3]
In case you prefer a one-liner:
In [20]: [min([(a, b) for a, b in zip(edges, weight) if a[0] == i], key=lambda x: x[1])
    ...:  for i in set([e[0] for e in edges])]
Out[20]: [([0, 3], 7), ([1, 2], 3)]
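Both versions above rescan the whole edge list once per starting node. As a minimal sketch of an alternative (the dictionary name best is only illustrative and not part of the original answer), a single pass can keep the lightest edge seen so far for each start node:

```python
edges = [[0, 1], [0, 2], [0, 3], [1, 2], [1, 3]]
weight = [10, 8, 7, 3, 7]

# best maps each start node to the lightest (edge, weight) pair seen so far
best = {}
for edge, w in zip(edges, weight):
    start = edge[0]
    if start not in best or w < best[start][1]:
        best[start] = (edge, w)

connect = [edge for edge, _ in best.values()]
wght = [w for _, w in best.values()]
print(connect)  # [[0, 3], [1, 2]]
print(wght)     # [7, 3]
```

Because dictionaries preserve insertion order, the results come out in the order in which the start nodes first appear in edges.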
Another solution using Pandas:
import pandas as pd
df = pd.DataFrame(edges, columns=['start','end'])
df['weight'] = weight
df.loc[df.groupby('start')['weight'].idxmin()]
With the results being:
start end weight
0 3 7
1 2 3
I want to maximize the following function:
f(i, j, k) = min(A(i, j), B(j, k))
Where A and B are matrices and i, j and k are indices that range up to the respective dimensions of the matrices. I would like to find (i, j, k) such that f(i, j, k) is maximized. I am currently doing that as follows:
import numpy as np
import itertools
shape_a = (100 , 150)
shape_b = (shape_a[1], 200)
A = np.random.rand(shape_a[0], shape_a[1])
B = np.random.rand(shape_b[0], shape_b[1])
# All the different i,j,k
combinations = itertools.product(np.arange(shape_a[0]), np.arange(shape_a[1]), np.arange(shape_b[1]))
combinations = np.asarray(list(combinations))
A_vals = A[combinations[:, 0], combinations[:, 1]]
B_vals = B[combinations[:, 1], combinations[:, 2]]
f = np.min([A_vals, B_vals], axis=0)
best_indices = combinations[np.argmax(f)]
print(best_indices)
[ 49 14 136]
This is faster than iterating over all (i, j, k) in a Python loop, but most of the time is spent constructing the A_vals and B_vals arrays. This is unfortunate, because they contain many duplicate values, since the same i, j and k appear multiple times. Is there a way to do this where (1) the speed of numpy's array computation is preserved and (2) I don't have to construct the memory-intensive A_vals and B_vals arrays?
In other languages you could maybe construct the matrices so that they contain pointers to A and B, but I do not see how to achieve this in Python.
Perhaps you could re-evaluate how you look at the problem in the context of what min and max actually do. Say you have the following concrete example:
>>> np.random.seed(1)
>>> print(A := np.random.randint(10, size=(4, 5)))
[[5 8 9 5 0]
[0 1 7 6 9]
[2 4 5 2 4]
[2 4 7 7 9]]
>>> print(B := np.random.randint(10, size=(5, 3)))
[[1 7 0]
[6 9 9]
[7 6 9]
[1 0 1]
[8 8 3]]
You are looking for a pair of numbers, one from A and one from B, such that the column index in A equals the row index in B and the smaller of the two numbers is as large as possible.
For a fixed j, the largest achievable minimum comes from pairing the largest value in column j of A with the largest value in row j of B. You are therefore looking for the max of each column of A and of each row of B, the minimum of each such pair, and then the maximum of those minima. Here is a relatively simple formulation of the solution:
candidate_i = A.argmax(axis=0)
candidate_k = B.argmax(axis=1)
j = np.minimum(A[candidate_i, np.arange(A.shape[1])], B[np.arange(B.shape[0]), candidate_k]).argmax()
i = candidate_i[j]
k = candidate_k[j]
And indeed, you see that
>>> i, j, k
(0, 2, 2)
>>> A[i, j]
9
>>> B[j, k]
9
If there are collisions, argmax will always pick the first option.
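As a sanity check, the shortcut can be compared against a brute-force evaluation of f on small inputs. The sketch below is an illustration rather than part of the answer: the seed and shapes are arbitrary, and the broadcasted array still holds every f(i, j, k), so it is only meant for verifying small cases.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility
A = rng.integers(10, size=(4, 5))
B = rng.integers(10, size=(5, 3))

# Fast route: only the column maxima of A and the row maxima of B matter.
col_max_A = A.max(axis=0)   # one value per j
row_max_B = B.max(axis=1)   # one value per j
fast_best = np.minimum(col_max_A, row_max_B).max()

# Brute force: f[i, j, k] = min(A[i, j], B[j, k]) via broadcasting.
f = np.minimum(A[:, :, None], B[None, :, :])
i, j, k = np.unravel_index(f.argmax(), f.shape)

# Both routes agree on the maximum value of f.
assert f[i, j, k] == fast_best
```

The fast route needs only O(I*J + J*K) work, whereas the brute-force array has I*J*K entries.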
Your values i,j,k are determined by the index of the maximum value from the set {A,B}. You can simply use np.argmax().
if np.max(A) < np.max(B):
    ind = np.unravel_index(np.argmax(A),A.shape)
else:
    ind = np.unravel_index(np.argmax(B),B.shape)
It will return only two values, either i,j if max({A,B}) = max({A}) or j,k if max({A,B}) = max({B}). If, for example, you get i,j, then k can be any value that fits the shape of the array B, so pick one of these values at random.
If you also need to maximize the other value then:
if np.max(A) < np.max(B):
    ind = np.unravel_index(np.argmax(A),A.shape)
    ind = ind + (np.argmax(B[ind[1],:]),)
else:
    ind = np.unravel_index(np.argmax(B),B.shape)
    ind = (np.argmax(A[:,ind[0]]),) + ind
I'm new to Python and have a problem that is quite simple on paper but difficult for me in Python.
I have two samples of values (which are lists):
X = [2, 2, 4, 6]
Y = [1, 3, 4, 5]
I have a concatenated list which is sorted as
Z     = [ 1 ,  2 ,  2 ,  3 ,  4 ,  4 ,  5 ,  6 ]
#rank:    1     2.5      4     5.5      7    8
I would like to get the sum of ranks of X values in Z. For this example, the ranks of 2, 2, 4 and 6 in Z are 2.5 + 2.5 + 5.5 + 8 = 18.5
(ranks of Y values in Z are 1 + 4 + 5.5 + 7 = 17.5)
Here is what I've done but it doesn't work with these lists X and Y (it works if each value appears only one time)
def funct(X, Z):
    rank = []
    for i in range(len(Z)):
        for j in range(len(X)):
            if Z[i] == X[j]:
                rank = rank + [(i+1)]
    print(sum(rank))
    return
I would like to solve my problem without overly complicated functions (only loops and fairly simple constructs).
You can use a dictionary to keep track of the rank sums and counts once you've sorted the combined list.
X = [2, 2, 4, 6]
Y = [1, 3, 4, 5]
Z = sorted(X + Y)
ranksum = {}
counts = {}
for i, v in enumerate(Z):
    ranksum[v] = ranksum.get(v, 0) + (i + 1) # Add the 1-based position to the running sum
    counts[v] = counts.get(v, 0) + 1 # Increment count
Then, when you want to look up the rank of an element, you need ranksum[v] / counts[v].
r = [ranksum[x] / counts[x] for x in X]
print(r)
# Out: [2.5, 2.5, 5.5, 8.0]
Here's a solution for how to build the list of ranks:
X = ...
Y = ...
Z = sorted(X + Y)
rank = [1]
z = Z[:1]
for i, e in enumerate(Z[1:], start=2):
    if e == z[-1]:
        rank[-1] += 0.5
    else:
        rank.append(i)
        z.append(e)
Now you can convert that into a dictionary:
ranks = dict(zip(z, rank))
That will make lookup easier:
sum(ranks[e] for e in X)
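The steps above can be sketched as one self-contained function and tried on the lists from the question. The function name average_ranks is hypothetical and only used for illustration. Adding 0.5 per repeated value works for ties of any length, because each extra tied element shifts the average position by half a step:

```python
def average_ranks(values):
    """Map each distinct value to its average 1-based rank in sorted(values)."""
    Z = sorted(values)
    rank = [1]
    z = Z[:1]
    for i, e in enumerate(Z[1:], start=2):
        if e == z[-1]:
            rank[-1] += 0.5   # each extra tie shifts the average by half a position
        else:
            rank.append(i)
            z.append(e)
    return dict(zip(z, rank))

X = [2, 2, 4, 6]
Y = [1, 3, 4, 5]
ranks = average_ranks(X + Y)
print(sum(ranks[e] for e in X))  # 18.5
print(sum(ranks[e] for e in Y))  # 17.5

# A value tied three times at positions 2, 3, 4 gets rank (2 + 3 + 4) / 3 = 3
print(average_ranks([1, 5, 5, 5, 9]))  # {1: 1, 5: 3.0, 9: 5}
```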
Here's another option where you build a dictionary of the rank indexes and then create a rank dictionary from there:
from collections import defaultdict
X = [2, 2, 4, 6]
Y = [1, 3, 4, 5]
Z = sorted(X + Y)
rank_indexes = defaultdict(lambda: [])
for i,v in enumerate(Z):
    rank_indexes[v].append(i+1)
ranks = {k:(sum(v)/len(v)) for (k,v) in rank_indexes.items()}
print("Sum of X ranks:", sum([ranks[v] for v in X]))
print("Sum of Y ranks:", sum([ranks[v] for v in Y]))
Output:
Sum of X ranks: 18.5
Sum of Y ranks: 17.5
You can do the same thing without defaultdict, but it's slightly slower and I'd argue less Pythonic:
rank_indexes = {}
for i,v in enumerate(Z):
    rank_indexes.setdefault(v, []).append(i+1)
ranks = {k:(sum(v)/len(v)) for (k,v) in rank_indexes.items()}
I have two lists of elements
a = [1,2,3,2,3,1,1,1,1,1]
b = [3,1,2,1,2,3,3,3,3,3]
and I am trying to uniquely match the elements of a to those of b. My expected result looks like this:
1: 3
2: 1
3: 2
So I tried to construct an assignment matrix and then use scipy.optimize.linear_sum_assignment
import numpy as np
a = [1,2,3,2,3,1,1,1,1,1]
b = [3,1,2,1,2,3,3,3,3,3]
total_true = np.unique(a)
total_pred = np.unique(b)
matrix = np.zeros(shape=(len(total_pred), len(total_true)))
for n, i in enumerate(total_true):
    for m, j in enumerate(total_pred):
        matrix[n, m] = sum(1 for item in b if item==(i))
I expected the matrix to be:
    1  2  3
1   0  2  0
2   0  0  2
3   6  0  0
But the output is:
[[2. 2. 2.]
[2. 2. 2.]
[6. 6. 6.]]
What mistake did I make here? Thank you very much.
You don't even need Pandas for this. Try using zip and dict:
In [42]: a = [1,2,3,2,3,1,1,1,1,1]
...: b = [3,1,2,1,2,3,3,3,3,3]
...:
In [43]: c =zip(a,b)
In [44]: dict(c)
Out[44]: {1: 3, 2: 1, 3: 2}
UPDATE: as the OP said, if we need to store all the values with the same key, we can use defaultdict:
In [58]: from collections import defaultdict
In [59]: d = defaultdict(list)
In [60]: for k,v in c:
    ...:     d[k].append(v)
    ...:
In [61]: d
Out[61]: defaultdict(list, {1: [3, 3, 3, 3, 3, 3], 2: [1, 1], 3: [2, 2]})
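One caveat when replaying this session under Python 3: zip returns a one-shot iterator, so after dict(c) has consumed it, a later loop over c would see no pairs. A minimal sketch that avoids this by materializing the pairs first (the names pairs, mapping and grouped are illustrative only):

```python
from collections import defaultdict

a = [1, 2, 3, 2, 3, 1, 1, 1, 1, 1]
b = [3, 1, 2, 1, 2, 3, 3, 3, 3, 3]

pairs = list(zip(a, b))   # materialize so the pairs can be traversed more than once

mapping = dict(pairs)     # later pairs overwrite earlier ones with the same key
grouped = defaultdict(list)
for k, v in pairs:
    grouped[k].append(v)

print(mapping)        # {1: 3, 2: 1, 3: 2}
print(dict(grouped))  # {1: [3, 3, 3, 3, 3, 3], 2: [1, 1], 3: [2, 2]}
```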
This line:
matrix[n, m] = sum(1 for item in b if item==(i))
counts the occurrences of i in b and saves the result to matrix[n, m]. Each cell of the matrix will contain either the number of 1's in b (i.e. 2) or the number of 2's in b (i.e. 2) or the number of 3's in b (i.e. 6). Notice that this value is completely independent of j, which means that the values in one row will always be the same.
In order to take j into consideration, try replacing that line with:
matrix[n, m] = sum(1 for x, y in zip(a, b) if (x, y) == (j, i))
As for your expected output: the matrix is indexed as a(i, j), where i is the row index and j is the column index. Looking at a(3,1) in your expected matrix, the value is 6, which means the combination (3,1) occurs 6 times, with 3 coming from b and 1 coming from a. We can collect all the pairs from the two lists:
matches = [tuple([x, y]) for x,y in zip(b, a)]
Then we can find how many matches there are of a specific combination, for example a(3, 1).
result = matches.count((3,1))
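Counting every combination this way means one pass over matches per cell. As a hedged sketch (not part of the original answer), collections.Counter can tally all pairs in one pass and then fill the whole expected matrix, with rows labelled by the values of b and columns by the values of a:

```python
from collections import Counter

a = [1, 2, 3, 2, 3, 1, 1, 1, 1, 1]
b = [3, 1, 2, 1, 2, 3, 3, 3, 3, 3]

pair_counts = Counter(zip(b, a))   # keys are (value from b, value from a)
rows = sorted(set(b))              # row labels come from b
cols = sorted(set(a))              # column labels come from a

# Counter returns 0 for pairs that never occur
matrix = [[pair_counts[(r, c)] for c in cols] for r in rows]
for row in matrix:
    print(row)
# [0, 2, 0]
# [0, 0, 2]
# [6, 0, 0]
```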
In Python, is there a good way to iterate through lists of different lengths?
For example,
a = [1,2,3]
b=[4,5]
c = [a,b]
for val1, val2, val3 in c:
    print val1
    print val2
    print val3
Assume that each list has at least 2 values and that the 3rd value is optional in some lists. The for loop above didn't work for b, obviously, because val3 is not available for list 'b'. In that case, I want to print val3 as 0. Can I give a default value in case it is unavailable?
for val1, val2, val3=0 in c:
The above syntax didn't work either. Please help.
If you want to be fancy ("elegant"?), you can pad a given list with zeros:
def pad_list(t, size, default):
    return t + [default] * (size - len(t))
for x in c:
    v1, v2, v3 = pad_list(x, 3, 0)
    print(v1)
    print(v2)
    print(v3)
Similarly, if you're working with tuples, here's another function:
def pad_list(t, size, default):
    return t + (default,) * (size - len(t))
You could use zip_longest, whose fillvalue argument fills the empty slots in this case:
from itertools import zip_longest
a = [1,2,3]
b = [4,5]
l = []
for x, y in zip_longest(a, b, fillvalue=0):
    l.append((x, y))
print(list(zip(*l)))
# [(1, 2, 3), (4, 5, 0)]
If you need the individual values out of the lists, just replace the last print with:
for val1, val2, val3 in zip(*l):
    print(val1)
    print(val2)
    print(val3)
# 1
# 2
# 3
# 4
# 5
# 0
This is very simple with starred unpacking:
c = [1,2,3]
val1, val2, *val3 = c
# val1 = 1, val2 = 2, val3 = [3]
c = [1,2]
val1, val2, *val3 = c
# val1 = 1, val2 = 2, val3 = []
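Starred unpacking leaves the optional part as a list, so getting the default of 0 that the question asks for takes one more step. A minimal sketch, assuming each inner list has two or three values (the name rest is illustrative):

```python
c = [[1, 2, 3], [4, 5]]

for item in c:
    val1, val2, *rest = item
    val3 = rest[0] if rest else 0   # fall back to 0 when the third value is missing
    print(val1, val2, val3)
# 1 2 3
# 4 5 0
```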
def foreach(l):
    def deco(f):
        for xs in l:
            f(*xs)
    return deco
@foreach([[1, 2, 3], [4, 5]])
def _(a, b, c=6):
    print(a, b, c, sep='\n')
The simplest method to concatenate the lists is via the chain function from the itertools module.
Example Snippet
import itertools
a = [1, 2, 3, 4, 5, 6]
b = [7, 8, 9, 10, 11, 12]
c = ['A', 'B', 'C', 'D', 'E', 'F', 'U']
combined = itertools.chain(a, b, c)  # yields the items of a, then b, then c
# iterate over the chained values
for value in combined:
    print(value, end=' ')
Output
1 2 3 4 5 6 7 8 9 10 11 12 A B C D E F U
I am trying to do the following:
import numpy as np
A = np.array([1,5,2,7,1])
B = np.sort(A)
print B
>>> [1,1,2,5,7]
I want to find the location of all elements in B as in original array A. i.e. I want to create an array C such that
print C
>>[0,4,2,1,3]
which means that the 1s in B are found in A at positions 0 and 4, the 2 at position 2, the 5 at position 1, and so on.
I tried using np.where( B == A) but it produces gibberish.
import numpy as np
A = np.array([1,5,2,7,1])
print(np.argsort(A))  # prints [0 4 2 1 3]
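When the array contains repeated values, as the two 1s do here, the relative order of their indices depends on the sorting algorithm. A short sketch, assuming you want equal elements to keep their original order, is to request a stable sort through the kind parameter of np.argsort:

```python
import numpy as np

A = np.array([1, 5, 2, 7, 1])
C = np.argsort(A, kind='stable')   # 'stable' keeps equal elements in original order
B = A[C]                           # indexing A with C reproduces the sorted array

print(C)  # [0 4 2 1 3]
print(B)  # [1 1 2 5 7]
```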
If you don't want to import numpy for any reason you can also use this code:
a = [1,5,2,7,1]
b = zip(a, range(len(a)))
tmp = sorted(b, key=lambda x: x[0])
c = [idx for _, idx in tmp]
print(c)
[0, 4, 2, 1, 3]
https://repl.it/CVbI
A = [1,5,2,7,1]
for i,e in sorted(enumerate(A), key=lambda x: x[1]):
    print(i, e)
B = [x for x,_ in sorted(enumerate(A), key=lambda x: x[1])]
A = sorted(A)
print(A)
print(B)
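The same index list can also be obtained without building (index, value) pairs. The following is a small sketch of that idea: it sorts the positions directly and uses each position's value as the sort key.

```python
A = [1, 5, 2, 7, 1]

# Sort the positions 0..len(A)-1 by the value stored at each position.
# sorted() is stable, so equal values keep their original relative order.
C = sorted(range(len(A)), key=A.__getitem__)
B = [A[i] for i in C]

print(C)  # [0, 4, 2, 1, 3]
print(B)  # [1, 1, 2, 5, 7]
```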