reduce lists given single value of 2d lists

I have 2 lists:
edges = [[0,1],[0,2],[0,3],[1,2],[1,3]]
weight = [10,8,7,3,7]
edges is a list of edges, each connecting two nodes, and weight holds the corresponding weights.
For each starting node (edges[i][0]), I want to choose the connected node with the smallest weight, so in this case the result would be:
connect = [[0,3],[1,2]]
weight = [7,3]
because of all the nodes connected to 0, node 3 is the closest, and for node 1, node 2 is the closest.
I am not able to formulate the problem; any help is appreciated!

edges = [[0,1],[0,2],[0,3],[1,2],[1,3]]
weight = [10,8,7,3,7]
connect = []
wght = []
In [8]: for i in set(e[0] for e in edges):
   ...:     temp = [(a, b) for a, b in zip(edges, weight) if a[0] == i]
   ...:     temp = min(temp, key=lambda x: x[1])
   ...:     connect += [temp[0]]
   ...:     wght += [temp[1]]
In [9]: connect
Out[9]: [[0, 3], [1, 2]]
In [10]: wght
Out[10]: [7, 3]
If you prefer a one-liner:
In [20]: [min([(a, b) for a, b in zip(edges, weight) if a[0] == i], key=lambda x: x[1])
    ...:  for i in set([e[0] for e in edges])]
Out[20]: [([0, 3], 7), ([1, 2], 3)]
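Both versions above rescan the whole edge list once per start node. A single pass with a dictionary avoids that. The following is only a minimal sketch and not part of the original answer: the function name closest_neighbors is made up for illustration, and it assumes that ties should go to the edge that appears first.

```python
def closest_neighbors(edges, weights):
    """Return, for each start node, its lightest outgoing edge and that weight."""
    best = {}  # start node -> (edge, weight) of the lightest edge seen so far
    for edge, w in zip(edges, weights):
        start = edge[0]
        # Strict '<' keeps the earliest edge when two weights tie.
        if start not in best or w < best[start][1]:
            best[start] = (edge, w)
    connect = [edge for edge, _ in best.values()]
    wght = [w for _, w in best.values()]
    return connect, wght


edges = [[0, 1], [0, 2], [0, 3], [1, 2], [1, 3]]
weight = [10, 8, 7, 3, 7]
connect, wght = closest_neighbors(edges, weight)
print(connect)  # [[0, 3], [1, 2]]
print(wght)     # [7, 3]
```

Because dictionaries preserve insertion order, the results come back in the order in which each start node first appears in edges.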

Another solution using Pandas:
import pandas as pd
df = pd.DataFrame(edges, columns=['start','end'])
df['weight'] = weight
df.loc[df.groupby('start')['weight'].idxmin()]
With the results being:
start  end  weight
    0    3       7
    1    2       3

Related

Fastest way to find the maximum minimum value of two 'connected' matrices

I want to maximize the following function:
f(i, j, k) = min(A(i, j), B(j, k))
Where A and B are matrices and i, j and k are indices that range up to the respective dimensions of the matrices. I would like to find (i, j, k) such that f(i, j, k) is maximized. I am currently doing that as follows:
import numpy as np
import itertools
shape_a = (100 , 150)
shape_b = (shape_a[1], 200)
A = np.random.rand(shape_a[0], shape_a[1])
B = np.random.rand(shape_b[0], shape_b[1])
# All the different i,j,k
combinations = itertools.product(np.arange(shape_a[0]), np.arange(shape_a[1]), np.arange(shape_b[1]))
combinations = np.asarray(list(combinations))
A_vals = A[combinations[:, 0], combinations[:, 1]]
B_vals = B[combinations[:, 1], combinations[:, 2]]
f = np.min([A_vals, B_vals], axis=0)
best_indices = combinations[np.argmax(f)]
print(best_indices)
[ 49 14 136]
This is faster than iterating over all (i, j, k), but most of the time is spent constructing the A_vals and B_vals arrays. This is unfortunate, because they contain many duplicate values: the same i, j and k appear multiple times. Is there a way to do this where (1) the speed of numpy's matrix computation is preserved and (2) I don't have to construct the memory-intensive A_vals and B_vals arrays?
In other languages you could perhaps construct the matrices so that they contain pointers into A and B, but I do not see how to achieve this in Python.

Perhaps you could re-evaluate how you look at the problem in context of what min and max actually do. Say you have the following concrete example:
>>> np.random.seed(1)
>>> print(A := np.random.randint(10, size=(4, 5)))
[[5 8 9 5 0]
 [0 1 7 6 9]
 [2 4 5 2 4]
 [2 4 7 7 9]]
>>> print(B := np.random.randint(10, size=(5, 3)))
[[1 7 0]
 [6 9 9]
 [7 6 9]
 [1 0 1]
 [8 8 3]]
You are looking for a pair of numbers, one from A and one from B, such that the column in A is the same as the row in B and the smaller of the two numbers is as large as possible.
For any set of numbers, the largest pairwise minimum occurs when you take the two largest numbers. You are therefore looking for the max of each column of A and of each row of B, then the minimum of each such pair, and finally the maximum of those minima. Here is a relatively simple formulation of the solution:
candidate_i = A.argmax(axis=0)
candidate_k = B.argmax(axis=1)
j = np.minimum(A[candidate_i, np.arange(A.shape[1])], B[np.arange(B.shape[0]), candidate_k]).argmax()
i = candidate_i[j]
k = candidate_k[j]
And indeed, you see that
>>> i, j, k
(0, 2, 2)
>>> A[i, j]
9
>>> B[j, k]
9
If there are collisions, argmax will always pick the first option.
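To gain some confidence in the column-max / row-max reasoning, it can be compared against an exhaustive search on small inputs. The sketch below is a dependency-free illustration using nested lists instead of NumPy arrays; the function names best_by_candidates and best_by_brute_force are made up for this example and do not come from the answer.

```python
import itertools


def best_by_candidates(A, B):
    """For each shared index j, pair the largest value in column j of A with the
    largest value in row j of B, then keep the j whose smaller value is largest."""
    best = None
    for j in range(len(B)):
        i = max(range(len(A)), key=lambda r: A[r][j])     # row of the column-j max in A
        k = max(range(len(B[j])), key=lambda c: B[j][c])  # column of the row-j max in B
        value = min(A[i][j], B[j][k])
        if best is None or value > best[0]:
            best = (value, (i, j, k))
    return best


def best_by_brute_force(A, B):
    """Evaluate f(i, j, k) = min(A[i][j], B[j][k]) for every index triple."""
    triples = itertools.product(range(len(A)), range(len(B)), range(len(B[0])))
    return max(min(A[i][j], B[j][k]) for i, j, k in triples)


A = [[5, 8, 9, 5, 0], [0, 1, 7, 6, 9], [2, 4, 5, 2, 4], [2, 4, 7, 7, 9]]
B = [[1, 7, 0], [6, 9, 9], [7, 6, 9], [1, 0, 1], [8, 8, 3]]
value, indices = best_by_candidates(A, B)
print(value, indices)             # 9 (0, 2, 2)
print(best_by_brute_force(A, B))  # 9
```

The candidate search touches each entry of A and B once, whereas the brute-force version evaluates every (i, j, k) triple, so the latter is only suitable as a check on small matrices.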

Your values i, j, k are determined by the index of the maximum value from the set {A, B}. You can simply use np.argmax():
if np.max(A) < np.max(B):
    ind = np.unravel_index(np.argmax(A), A.shape)
else:
    ind = np.unravel_index(np.argmax(B), B.shape)
This returns only two values: either (i, j) if max({A, B}) = max({A}), or (j, k) if max({A, B}) = max({B}). If, for example, you get (i, j), then k can be any index that fits the shape of the array B, so pick one of those values at random.
If you also need to maximize the other value, then:
if np.max(A) < np.max(B):
    ind = np.unravel_index(np.argmax(A), A.shape)
    ind = ind + (np.argmax(B[ind[1], :]),)
else:
    ind = np.unravel_index(np.argmax(B), B.shape)
    ind = (np.argmax(A[:, ind[0]]),) + ind

How to get ranks from a sample in a list of values?

I'm new to Python and have a problem that is quite simple on paper but difficult for me in Python.
I have two samples of values (which are lists):
X = [2, 2, 4, 6]
Y = [1, 3, 4, 5]
I have a concatenated list which is sorted as
Z     = [ 1 ,  2 ,  2 ,  3 ,  4 ,  4 ,  5 ,  6 ]
# rank:   1   2.5  2.5   4   5.5  5.5   7    8
I would like to get the sum of the ranks of the X values in Z. For this example, the ranks of 2, 2, 4 and 6 in Z are 2.5 + 2.5 + 5.5 + 8 = 18.5
(the ranks of the Y values in Z are 1 + 4 + 5.5 + 7 = 17.5).
Here is what I've done, but it doesn't work with these lists X and Y (it works only if each value appears exactly once):
def funct(X, Z):
    rank = []
    for i in range(len(Z)):
        for j in range(len(X)):
            if Z[i] == X[j]:
                rank = rank + [(i+1)]
    print(sum(rank))
    return
I would like to solve my problem without overly complicated functions (only loops and fairly simple techniques).

You can use a dictionary to keep track of the rank sums and counts once you've sorted the combined list.
X = [2, 2, 4, 6]
Y = [1, 3, 4, 5]
Z = sorted(X + Y)
ranksum = {}
counts = {}
for i, v in enumerate(Z):
    ranksum[v] = ranksum.get(v, 0) + (i + 1)  # Add this position's 1-based rank
    counts[v] = counts.get(v, 0) + 1  # Increment count
Then, when you want to look up the rank of an element, you need ranksum[v] / counts[v].
r = [ranksum[x] / counts[x] for x in X]
print(r)
# Out: [2.5, 2.5, 5.5, 8.0]
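The same idea can be wrapped into small reusable helpers that return the rank sum directly. This is only a sketch: the names average_ranks and rank_sum are hypothetical, and it assumes the tied-rank convention from the question (tied values share the mean of their positions).

```python
def average_ranks(values):
    """Map each distinct value to the mean of its 1-based positions in sorted order."""
    positions = {}
    for rank, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(rank)
    return {v: sum(p) / len(p) for v, p in positions.items()}


def rank_sum(sample, combined):
    """Sum the average ranks that the items of `sample` receive within `combined`."""
    ranks = average_ranks(combined)
    return sum(ranks[v] for v in sample)


X = [2, 2, 4, 6]
Y = [1, 3, 4, 5]
print(rank_sum(X, X + Y))  # 18.5
print(rank_sum(Y, X + Y))  # 17.5
```

A quick sanity check: the two rank sums must add up to 1 + 2 + ... + n = n(n + 1)/2, which is 36 for n = 8, and indeed 18.5 + 17.5 = 36.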

Here's a solution for how to build the list of ranks:
X = ...
Y = ...
Z = sorted(X + Y)
rank = [1]
z = Z[:1]
for i, e in enumerate(Z[1:], start=2):
    if e == z[-1]:
        rank[-1] += 0.5
    else:
        rank.append(i)
        z.append(e)
Now you can convert that into a dictionary:
ranks = dict(zip(z, rank))
That will make lookup easier:
sum(ranks[e] for e in X)
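It may not be obvious that adding 0.5 per repeated value also works for ties of three or more. If a run of n equal values starts at position r, the mean of r, r+1, ..., r+n-1 is r + (n-1)/2, which is exactly r plus 0.5 for each of the n-1 repeats. The sketch below wraps the loop above in a function (the name tie_averaged_ranks is made up here) and tries it on a three-way tie.

```python
def tie_averaged_ranks(sorted_values):
    """Build value -> average rank using the +0.5-per-repeat rule."""
    if not sorted_values:
        return {}
    distinct = [sorted_values[0]]
    rank = [1]
    for i, e in enumerate(sorted_values[1:], start=2):
        if e == distinct[-1]:
            rank[-1] += 0.5  # each extra tie shifts the mean rank by half a position
        else:
            rank.append(i)   # a new value starts at its own 1-based position
            distinct.append(e)
    return dict(zip(distinct, rank))


ranks = tie_averaged_ranks([1, 3, 3, 3, 7])
print(ranks)  # {1: 1, 3: 3.0, 7: 5}
```

The three 3s occupy positions 2, 3 and 4, whose mean is 3, matching the result.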

Here's another option where you build a dictionary of the rank indexes and then create a rank dictionary from there:
from collections import defaultdict
X = [2, 2, 4, 6]
Y = [1, 3, 4, 5]
Z = sorted(X + Y)
rank_indexes = defaultdict(list)
for i, v in enumerate(Z):
    rank_indexes[v].append(i + 1)
ranks = {k: (sum(v) / len(v)) for (k, v) in rank_indexes.items()}
print("Sum of X ranks:", sum([ranks[v] for v in X]))
print("Sum of Y ranks:", sum([ranks[v] for v in Y]))
Output:
Sum of X ranks: 18.5
Sum of Y ranks: 17.5
You can do the same thing without defaultdict, but it's slightly slower and I'd argue less Pythonic:
rank_indexes = {}
for i, v in enumerate(Z):
    rank_indexes.setdefault(v, []).append(i + 1)
ranks = {k: (sum(v) / len(v)) for (k, v) in rank_indexes.items()}

Construct an assignment matrix - Python

I have two lists of elements
a = [1,2,3,2,3,1,1,1,1,1]
b = [3,1,2,1,2,3,3,3,3,3]
and I am trying to uniquely match the elements of a to those of b. My expected result is:
1: 3
2: 1
3: 2
So I tried to construct an assignment matrix and then use scipy.optimize.linear_sum_assignment:
import numpy as np
a = [1,2,3,2,3,1,1,1,1,1]
b = [3,1,2,1,2,3,3,3,3,3]
total_true = np.unique(a)
total_pred = np.unique(b)
matrix = np.zeros(shape=(len(total_pred), len(total_true)))
for n, i in enumerate(total_true):
    for m, j in enumerate(total_pred):
        matrix[n, m] = sum(1 for item in b if item==(i))
I expected the matrix to be:
   1 2 3
1  0 2 0
2  0 0 2
3  6 0 0
But the output is:
[[2. 2. 2.]
 [2. 2. 2.]
 [6. 6. 6.]]
What mistake did I make here? Thank you very much.

You don't even need Pandas for this. Try zip and dict (c is materialized as a list because in Python 3 zip returns a one-shot iterator, which dict(c) would otherwise exhaust):
In [42]: a = [1,2,3,2,3,1,1,1,1,1]
    ...: b = [3,1,2,1,2,3,3,3,3,3]
    ...:
In [43]: c = list(zip(a, b))
In [44]: dict(c)
Out[44]: {1: 3, 2: 1, 3: 2}
UPDATE: as the OP said, if we need to store all the values that share a key, we can use defaultdict:
In [58]: from collections import defaultdict
In [59]: d = defaultdict(list)
In [60]: for k, v in c:
    ...:     d[k].append(v)
    ...:
In [61]: d
Out[61]: defaultdict(list, {1: [3, 3, 3, 3, 3, 3], 2: [1, 1], 3: [2, 2]})
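Note that dict(zip(a, b)) silently keeps only the last value seen for each key, so it gives the expected mapping here only because every element of a is always paired with the same element of b. If that is not guaranteed, a variant that checks for conflicts may be safer. The following is a sketch under that assumption; the function name consistent_mapping is hypothetical.

```python
def consistent_mapping(a, b):
    """Build {x: y} from paired lists, failing if some x is paired with two different y."""
    mapping = {}
    for x, y in zip(a, b):
        # setdefault stores y on first sight and returns the stored value afterwards.
        if mapping.setdefault(x, y) != y:
            raise ValueError(f"{x!r} is paired with both {mapping[x]!r} and {y!r}")
    return mapping


a = [1, 2, 3, 2, 3, 1, 1, 1, 1, 1]
b = [3, 1, 2, 1, 2, 3, 3, 3, 3, 3]
mapping = consistent_mapping(a, b)
print(mapping)  # {1: 3, 2: 1, 3: 2}
```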

This line:
matrix[n, m] = sum(1 for item in b if item==(i))
counts the occurrences of i in b and saves the result to matrix[n, m]. Each cell of the matrix will therefore contain either the number of 1's in b (i.e. 2), the number of 2's in b (i.e. 2), or the number of 3's in b (i.e. 6). Notice that this value is completely independent of j, which means that all the values in one row will always be the same.
In order to take j into consideration, try replacing the line with:
matrix[n, m] = sum(1 for x, y in zip(a, b) if (x, y) == (j, i))
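For comparison, the whole matrix can also be built by counting each (b, a) pair once instead of rescanning both lists for every cell. This is a minimal sketch using nested lists rather than a NumPy array; it assumes, as in the expected output from the question, that rows follow the values of b and columns follow the values of a.

```python
from collections import Counter

a = [1, 2, 3, 2, 3, 1, 1, 1, 1, 1]
b = [3, 1, 2, 1, 2, 3, 3, 3, 3, 3]

pair_counts = Counter(zip(b, a))  # (value in b, value in a) -> number of occurrences
row_labels = sorted(set(b))       # rows follow the values of b
col_labels = sorted(set(a))       # columns follow the values of a

# Counter returns 0 for pairs that never occur, so no special-casing is needed.
matrix = [[pair_counts[(r, c)] for c in col_labels] for r in row_labels]
for row in matrix:
    print(row)
# [0, 2, 0]
# [0, 0, 2]
# [6, 0, 0]
```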

As for your expected output: a matrix entry is usually written a(i, j), where i is the row index and j is the column index. The entry a(3, 1) in your matrix is 6, which means that the combination (3, 1) occurs 6 times, with 3 coming from b and 1 coming from a. We can collect all the pairs from the two lists:
matches = list(zip(b, a))
Then we can count how many times a specific combination occurs, for example a(3, 1):
result = matches.count((3, 1))

Good way to iterate lists of different lengths and set default value

In Python, is there a good way to iterate through lists of different lengths?
For example,
a = [1,2,3]
b = [4,5]
c = [a,b]
for val1, val2, val3 in c:
    print(val1)
    print(val2)
    print(val3)
Assume that each list has at least 2 values and that, in some lists, the 3rd value is optional. The above for loop doesn't work for b, obviously, because val3 is not available for list 'b'. In that case, I want to print val3 as 0. Can I give a default value for when it is unavailable?
for val1, val2, val3=0 in c:
The above syntax didn't work either. Please help.

If you want to be fancy ("elegant"?), you can pad a given list with zeros:
def pad_list(t, size, default):
    return t + [default] * (size - len(t))
for x in c:
    v1, v2, v3 = pad_list(x, 3, 0)
    print(v1)
    print(v2)
    print(v3)
Similarly, if you're working with tuples, here's another function:
def pad_tuple(t, size, default):
    return t + (default,) * (size - len(t))
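The two padding helpers differ only in the container type. A generator-based variant built from itertools can handle lists, tuples and any other iterable with one function. This is a sketch rather than part of the original answer; the name padded is made up, and note that, unlike the helpers above, it truncates inputs longer than size.

```python
from itertools import chain, islice, repeat


def padded(iterable, size, default=0):
    """Yield exactly `size` items, filling missing trailing items with `default`."""
    # repeat(default) is endless, so islice always finds `size` items to take.
    return islice(chain(iterable, repeat(default)), size)


c = [[1, 2, 3], [4, 5]]
for row in c:
    val1, val2, val3 = padded(row, 3)
    print(val1, val2, val3)
# 1 2 3
# 4 5 0
```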

You could use zip_longest with a fillvalue to handle the empty slot in this case:
from itertools import zip_longest
a = [1,2,3]
b = [4,5]
l = []
for x, y in zip_longest(a, b, fillvalue=0):
    l.append((x, y))
print(list(zip(*l)))
# [(1, 2, 3), (4, 5, 0)]
If you need the values out of the list, just replace the last print with:
for val1, val2, val3 in zip(*l):
    print(val1)
    print(val2)
    print(val3)
# 1
# 2
# 3
# 4
# 5
# 0

This is very simple with starred unpacking:
c = [1, 2, 3]
val1, val2, *val3 = c
# val1 = 1, val2 = 2, val3 = [3]
c = [1, 2]
val1, val2, *val3 = c
# val1 = 1, val2 = 2, val3 = []
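Starred unpacking leaves val3 as a list (possibly empty) rather than a value with a default. One way to get the 0 default the question asks for is to inspect that list afterwards. The helper below is a sketch with a made-up name, unpack_with_default, and it assumes each row has at least two items.

```python
def unpack_with_default(values, default=0):
    """Split a sequence of two or more items into three names, defaulting the third."""
    val1, val2, *rest = values
    val3 = rest[0] if rest else default  # an empty `rest` means the third item was absent
    return val1, val2, val3


for row in [[1, 2, 3], [4, 5]]:
    print(unpack_with_default(row))
# (1, 2, 3)
# (4, 5, 0)
```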

def foreach(l):
    def deco(f):
        for xs in l:
            f(*xs)
    return deco
@foreach([[1, 2, 3], [4, 5]])
def _(a, b, c=6):
    print(a, b, c, sep='\n')

The simplest method to concatenate the lists is via the chain function from the itertools module.
Example Snippet
import itertools
a = [1, 2, 3, 4, 5, 6]
b = [7, 8, 9, 10, 11, 12]
c = ['A', 'B', 'C', 'D', 'E', 'F', 'U']
combined = itertools.chain(a, b, c)  # combines in order
# enumerate lists to allow for iteration
for index, value in enumerate(combined):
    print(value, end=' ')
Output
1 2 3 4 5 6 7 8 9 10 11 12 A B C D E F U

Python find index of all array elements in another array

I am trying to do the following:
import numpy as np
A = np.array([1,5,2,7,1])
B = np.sort(A)
print(B)
# [1 1 2 5 7]
I want to find the location in the original array A of every element of B, i.e. I want to create an array C such that
print(C)
# [0 4 2 1 3]
which says that the 1s in B are at positions 0 and 4 in A, the 2 is at position 2, the 5 is at position 1, and so on.
I tried using np.where(B == A), but it produces gibberish.

import numpy as np
A = np.array([1,5,2,7,1])
print(np.argsort(A))  # prints [0 4 2 1 3]

If you don't want to import numpy for any reason, you can also use this code:
a = [1,5,2,7,1]
b = zip(a, range(len(a)))
tmp = sorted(b, key=lambda x: x[0])
c = list(map(lambda x: x[1], tmp))
print(c)
# [0, 4, 2, 1, 3]
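A shorter pure-Python variant sorts the indices directly, using the values as the sort key. This is a sketch with a hypothetical helper name, argsort; because Python's sorted is stable, equal values keep their original relative order.

```python
def argsort(values):
    """Return the indices that would sort `values`; equal items keep their original order."""
    return sorted(range(len(values)), key=values.__getitem__)


A = [1, 5, 2, 7, 1]
C = argsort(A)
print(C)                  # [0, 4, 2, 1, 3]
print([A[i] for i in C])  # [1, 1, 2, 5, 7]
```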

https://repl.it/CVbI
A = [1,5,2,7,1]
for i, e in sorted(enumerate(A), key=lambda x: x[1]):
    print(i, e)
B = [x for x, _ in sorted(enumerate(A), key=lambda x: x[1])]
A = sorted(A)
print(A)
print(B)
