Find minimum values of both "columns" of list of lists - python

Given a list like the next one:
foo_list = [[1,8],[2,7],[3,6]]
I've found in questions like "Tuple pairs, finding minimum using python" and "minimum of list of lists" that the pair with the minimum value in a list of lists can be found using a generator like:
min(x for x in foo_list)
which returns
[1, 8]
But I was wondering if there is a similar way to return both minimum values of the "columns" of the list:
output = [1,6]
I know this can be achieved using numpy arrays:
output = np.min(np.array(foo_list), axis=0)
But I'm interested in finding such a way of doing so with generators (if possible).
Thanks in advance!

[min(l) for l in zip(*foo_list)]
returns [1, 6]
zip(*foo_list) transposes the list, and then we take the minimum of each of the resulting columns.
Thanks to @mousetail for the suggestion.
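A minimal sketch of the transpose idea, using a generator expression as the question asked. The helper name column_minima is made up for illustration, and it assumes every inner list has the same length (zip stops at the shortest one).

```python
foo_list = [[1, 8], [2, 7], [3, 6]]


def column_minima(rows):
    """Return the minimum of each "column" of a list of lists.

    Hypothetical helper for illustration, not part of any library.
    """
    # zip(*rows) yields one tuple per column: (1, 2, 3) and then (8, 7, 6).
    return list(min(column) for column in zip(*rows))


output = column_minima(foo_list)
print(output)  # [1, 6]
```

The same function works for any number of columns, since zip(*rows) produces one tuple per column.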

You can use two min() calls for this, like so:
min1 = min(a for a, _ in foo_list)
min2 = min(b for _, b in foo_list)
print([min1, min2])
Will this do? If you don't want to use a third-party library, I think a plain old loop that tracks both minima in a single pass would be more efficient.
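The plain-loop alternative could look roughly like this. It is only a sketch: column_minima_loop is a hypothetical name, it assumes a non-empty list of equal-length rows, and whether it is actually faster than two min() calls would need to be measured.

```python
foo_list = [[1, 8], [2, 7], [3, 6]]


def column_minima_loop(rows):
    """Track a running minimum per column in one pass (hypothetical helper)."""
    rows = iter(rows)
    # Start from a copy of the first row; an empty input raises StopIteration.
    minima = list(next(rows))
    for row in rows:
        for j, value in enumerate(row):
            if value < minima[j]:
                minima[j] = value
    return minima


output = column_minima_loop(foo_list)
print(output)  # [1, 6]
```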

Related

Determining index each group duplicate values in an array in Python with the fastest way

I want to find the indices of each group of duplicate values, like this:
s = [2,6,2,88,6,...]
The result should contain the indices from the original s: [[0,2],[1,4],..], though another representation would also be fine.
I have found many solutions; the fastest way I found to get the duplicate groups is:
s = np.sort(a, axis=None)
s[:-1][s[1:] == s[:-1]]
But after sorting, the indices no longer match the original s.
In my case, I have about 200 million values in the list and I want to find the fastest way to do this. I use an array to store the values because I want to use the GPU to make it faster.
Using hash structures like dict helps.
For example:
import numpy as np
from collections import defaultdict
a=np.array([2,4,2,88,15,4])
table=defaultdict(list)
for ind,num in enumerate(a):
    table[num]+=[ind]
Outputs:
{2: [0, 2], 4: [1, 5], 88: [3], 15: [4]}
If you want to show duplicated elements in the order from small to large:
for k,v in sorted(table.items()):
    if len(v)>1:
        print(k,":",v)
Outputs:
2 : [0, 2]
4 : [1, 5]
The speed depends on how many distinct values the list contains.
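As a sketch, the dict-based approach can be wrapped in a function that returns only the groups with more than one index. The name duplicate_index_groups is hypothetical, and the sample s is the visible part of the list from the question.

```python
from collections import defaultdict


def duplicate_index_groups(values):
    """Map each repeated value to the list of positions where it occurs.

    Hypothetical helper; values that occur only once are dropped.
    """
    table = defaultdict(list)
    for index, value in enumerate(values):
        table[value].append(index)
    return {value: indices for value, indices in table.items() if len(indices) > 1}


s = [2, 6, 2, 88, 6]
groups = duplicate_index_groups(s)
print(groups)                 # {2: [0, 2], 6: [1, 4]}
print(list(groups.values()))  # [[0, 2], [1, 4]]
```

This runs in pure Python, so for hundreds of millions of values the NumPy-based answers are likely the better starting point.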
See if this meets your performance requirements (here, s is your input array):
counts = np.bincount(s)
cum_counts = np.add.accumulate(counts)
sorted_inds = np.argsort(s)
result = np.split(sorted_inds, cum_counts[:-1])
Notes:
The result would be a list of arrays.
Each of these arrays would contain indices of a repeated value in s. Eg, if the value 13 is repeated 7 times in s, there would be an array with 7 indices among the arrays of result
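Note that minlength only sets the minimum number of bins that np.bincount() returns; it does not filter by count. To ignore singleton values of s (values that occur only once in s), as well as the empty arrays produced for values that never occur, filter the result afterwards, e.g. [g for g in result if g.size > 1]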
(This is a variation of my other answer. Here, instead of splitting the large array sorted_inds, we take slices from it, so it's likely to have a different kind of performance characteristic)
If s is the input array:
counts = np.bincount(s)
cum_counts = np.add.accumulate(counts)
sorted_inds = np.argsort(s)
result = [sorted_inds[:cum_counts[0]]] + [sorted_inds[cum_counts[i]:cum_counts[i+1]] for i in range(cum_counts.size-1)]

index of a first occurrence (inequality match) in a list

A=[2,3,5,7,11,13]
print(A.index(5))
The answer is 2,
But what I need is the first one which is bigger than 4 (the answer will be the same - 2).
I can apply a while loop, but is there a more elegant or a builtin way to do it?
In my problem the list is sorted in an ascending order (no duplication),
and my target is to split it into two lists: lower or equal to 4, and bigger than 4; and given the list is sorted it would be redundant to scan it twice (or even once).
As @DanD. mentioned, you can use the bisect module for this. In your example you can use bisect_left:
>>> import bisect
>>> bisect.bisect_left(A, 5)
2
This will use a binary search since your data is sorted, which will be faster than a linear search (O(logN) instead of O(N)).
If you want the index of the first value greater than 4, then you can switch to bisect_right
>>> bisect.bisect_right(A, 4)
2
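To make the difference between the two functions concrete, here is a small sketch using the list from the question. The two only disagree when the searched value is actually present in the list.

```python
import bisect

A = [2, 3, 5, 7, 11, 13]

# 4 is not in the list, so both functions return the same insertion point.
absent_left = bisect.bisect_left(A, 4)    # 2
absent_right = bisect.bisect_right(A, 4)  # 2

# 5 is in the list: bisect_left lands on it, bisect_right just past it.
present_left = bisect.bisect_left(A, 5)    # 2
present_right = bisect.bisect_right(A, 5)  # 3

print(absent_left, absent_right, present_left, present_right)  # 2 2 2 3
```

So bisect_right(A, x) gives the index of the first element strictly greater than x, and bisect_left(A, x) gives the index of the first element greater than or equal to x.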
You're totally correct about efficiency: if you already have a sorted list, do not iterate linearly, it's a waste of time.
There's the built-in bisect module, made exactly for binary search in sorted containers.
You're probably looking for the bisect_right function.
Thanks everybody, the answer using your kind help is:
import bisect
A=[2,3,5,7,11,13]
N=bisect.bisect_right(A,4)
print(A[:N]) #[2,3]
print(A[N:]) #[5,7,11,13]
Use next with a default argument:
val = next((i for i, x in enumerate(A) if x > 4), len(A))
Given the above result, you can then do:
left, right = A[:val], A[val:]
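A short sketch of how the default argument behaves, including the case where no element matches. The function name first_index_greater is made up for illustration. Unlike bisect, this is a linear scan, but it also works on unsorted lists.

```python
A = [2, 3, 5, 7, 11, 13]


def first_index_greater(values, threshold):
    """Index of the first element > threshold, or len(values) if none is.

    Hypothetical helper wrapping the next(...) expression.
    """
    return next((i for i, x in enumerate(values) if x > threshold), len(values))


val = first_index_greater(A, 4)
left, right = A[:val], A[val:]
print(val, left, right)  # 2 [2, 3] [5, 7, 11, 13]

# No element exceeds 100, so the default len(A) is returned and
# the split puts everything on the left.
none_found = first_index_greater(A, 100)
print(none_found, A[:none_found], A[none_found:])  # 6 [2, 3, 5, 7, 11, 13] []
```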

Associating values from one list with values in another programmatically

(Asking again in a more concise way)
I have four lists of values and I need to link the first and last together like this:
so that I can plot the points (4, 8350.1416), (10, 13167.329), (15, 29200.063), etc.
The enumerate function can give me access to the indices of the rightmost list, but how can I associate the values in that one with the correct values in the leftmost list?
The lists change with each run of the code, so I need to do it programmatically, like in a for loop for example.
EDIT: My program reads the pixel values along a randomly selected row. List1 holds the minimum-valued pixels, and list2 holds their values. Then list3 holds the minimum values of those minimum values, and list4 holds their values. Describing it like that sounds a lot more confusing than it is!
I've tried using
ubermin_vals_x = []
for i in ubermin_values:
    value = ubermin_pixels[i]
    ubermin_vals_x.append(minimum_pixels[i])
but it tries to iterate over the values (8350.1416, 13167.329...) which of course can't be done.
I'm trying to plot the lists to look like this:
but have the black carets from list4 at the correct points along the x-axis, which are given in list1.
Naming the lists from left to right as l1, l2, l3, l4: l2 seems useless to me, since it just replicates the values in l4. So, if I understand the problem, the code could be:
for i,v in zip(l3,l4):
    print (l1[i],v) #or plot
and you can replace v with l2[i].
Or even simpler:
for i in l3:
    print (l1[i],l2[i])
As per the comment below, the elements of l3 in your example seem to be single-element lists, so the code becomes:
for i in l3:
    print (l1[i[0]],l2[i[0]])
It's not very clear to me what you are trying to do, but here is my guess:
1. Find the index of each element of the 4th array in the 2nd array.
2. Use that index to extract the number from the 1st array.
The implementation is as follows:
a4 = [ 8350.1416, 13167.329, 29200.063 ]
a2 = [13846, 8350.1416, 0, 13167.329, 0, 29200.063]
a1 = [1, 4, 7, 10, 12, 15, 18]
idx = [a1[a2.index(x)] for x in a4]
result = list(zip(idx, a4))  # list() is needed in Python 3, where zip is lazy
I also suspect @Vincenzooo's answer is already very close to what you want. Maybe:
for i in l3:
    print (l1[i[0]],l2[i[0]])
Thanks, everyone, especially @Vincenzoo.
It works now with this:
uberminxlist = []
uberminylist = []
for i in ubermin_pixels:
    uberminxlist.append(minimum_pixels[i[0]])
    uberminylist.append(minimum_values[i[0]])
Lovely :)
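For reference, a self-contained sketch of that final loop, written as a comprehension that builds the (x, y) points directly. The sample data is assumed: the numbers are borrowed from the a1/a2 example in the earlier answer, and ubermin_pixels is assumed to hold single-element lists of indices, as discussed above.

```python
# Assumed sample data, not the asker's real arrays.
minimum_pixels = [1, 4, 7, 10, 12, 15]
minimum_values = [13846, 8350.1416, 0, 13167.329, 0, 29200.063]
ubermin_pixels = [[1], [3], [5]]  # each entry wraps one index into the lists above

# Pair the x position and the y value that share the same index.
points = [(minimum_pixels[i[0]], minimum_values[i[0]]) for i in ubermin_pixels]
print(points)  # [(4, 8350.1416), (10, 13167.329), (15, 29200.063)]

# Split back into separate x and y lists if the plotting call expects them.
uberminxlist = [x for x, _ in points]
uberminylist = [y for _, y in points]
```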

How to get the second half of a list of lists as a list of lists?

So I know that to get a single column, I'd have to write
a = list(zip(*f)[0])
and the resulting a will be a list containing the first element in the lists in f.
How do I do this to get more than one element per list? I tried
a = list(zip(*f)[1:19])
But it just returned a list of lists where each inner list is composed of the ith element of every list.
The easy way is not to use zip(). Instead, use a list comprehension:
a = [sub[1:19] for sub in f]
If it is actually the second half that you are looking for:
a = [sub[len(sub) // 2:] for sub in f]
That will include the 3 in [1, 2, 3, 4, 5]. If you don't want to include it:
a = [sub[(len(sub) + 1) // 2:] for sub in f]
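A quick sketch of how the two slices differ on odd-length and even-length inner lists. The sample f is made up for illustration.

```python
f = [[1, 2, 3, 4, 5], [6, 7, 8, 9]]

# len(sub) // 2 keeps the middle element of an odd-length list.
with_middle = [sub[len(sub) // 2:] for sub in f]
print(with_middle)  # [[3, 4, 5], [8, 9]]

# (len(sub) + 1) // 2 skips the middle element instead.
without_middle = [sub[(len(sub) + 1) // 2:] for sub in f]
print(without_middle)  # [[4, 5], [8, 9]]
```

For even-length lists both expressions give the same result, since there is no middle element to include or skip.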
You should definitely prefer @zondo's solution for both performance and readability. However, a zip-based solution is possible and would look as follows (in Python 2):
zip(*zip(*f)[1:19])
You should not consider this cycle of unpacking, zipping, slicing, unpacking and re-zipping in any serious code though ;)
In Python 3, you would have to cast both zip results to list, making this even less sexy.
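As a sketch of what the Python 3 version might look like, with a small made-up f and the slice 1:3 instead of 1:19. Note that the zip-based result contains tuples rather than lists.

```python
f = [[1, 2, 3, 4], [5, 6, 7, 8]]

# zip objects cannot be sliced in Python 3, so materialise the transpose first.
columns = list(zip(*f))
print(columns)  # [(1, 5), (2, 6), (3, 7), (4, 8)]

# Slice the columns, then transpose back to rows.
via_zip = list(zip(*columns[1:3]))
print(via_zip)  # [(2, 3), (6, 7)]

# The list comprehension gives the same values directly, as lists.
via_comprehension = [sub[1:3] for sub in f]
print(via_comprehension)  # [[2, 3], [6, 7]]
```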

Basic python: how to increase value of item in list [duplicate]

This is such a simple issue that I don't know what I'm doing wrong. Basically I want to iterate through the items in an empty list and increase each one according to some criteria. This is an example of what I'm trying to do:
list1 = []
for i in range(5):
    list1[i] = list1[i] + 2*i
This fails with a list index out of range error and I'm stuck. The expected result (what I'm aiming at) would be a list with values:
[0, 2, 4, 6, 8]
Just to be more clear: I'm not after producing that particular list. The question is about how can I modify items of an empty list in a recursive way. As gnibbler showed below, initializing the list was the answer. Cheers.
Ruby (for example) lets you assign items beyond the end of the list. Python doesn't - you would have to initialise list1 like this
list1 = [0] * 5
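With the list initialised this way, the loop from the question runs unchanged. A minimal sketch:

```python
# Pre-fill the list so that indices 0..4 exist before the loop reads them.
list1 = [0] * 5

for i in range(5):
    list1[i] = list1[i] + 2 * i

print(list1)  # [0, 2, 4, 6, 8]
```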
In this loop the value you store depends only on i, so there is no need to read what is already in the list; you can compute it from i directly. A list comprehension does this:
list1 = [2*i for i in range(5)]
Since you say that your real case is more complex, you can skip the list comprehension and edit your for loop like this instead (this assumes list1 already has five elements, e.g. list1 = [0] * 5):
for i in range(5):
    x = 2*i
    list1[i] = x
This way you can keep doing things until you finally have the outcome you want, store it in a variable, and set it accordingly! You could also do list1.append(x), which I actually prefer because it will work with any list even if it's not in order like a list made with range
Edit: Since you want to be able to manipulate the array like you do, I would suggest using numpy! There is this great thing called vectorize so you can actually apply a function to a 1D array:
import numpy as np
list1 = range(5)
def my_func(x):
    y = x * 2
    return y  # without this, the function would return None
vfunc = np.vectorize(my_func)
vfunc(list1)
>>> array([0, 2, 4, 6, 8])
I would advise only using this for more complex functions, because you can use numpy broadcasting for easy things like multiplying by two.
Your list is empty, so when you try to read an element of the list (right hand side of this line)
list1[i] = list1[i] + 2*i
it doesn't exist, so you get the error message.
You may also wish to consider using numpy. The multiplication operation is overloaded to be performed on each element of the array. Depending on the size of your list and the operations you plan to perform on it, using numpy very well may be the most efficient approach.
Example:
>>> import numpy
>>> 2 * numpy.arange(5)
array([0, 2, 4, 6, 8])
I would instead write
for i in range(5):
    list1.append(2*i)
Yet another way to do this is to use the append method on your list. The reason you're getting an out of range error is because you're saying:
list1 = []
list1.__getitem__(0)
and then manipulate this item, BUT that item does not exist since you made an empty list.
Proof of concept:
list1 = []
list1[1]
IndexError: list index out of range
We can, however, append new stuff to this list like so:
list1 = []
for i in range(5):
    list1.append(i * 2)
