I have a list, e.g. [1, 3, 0, 2, 6, 0, 7].
If there is a 0 in it, I have to move it to the end of the list without changing the order of the other numbers.
It should return [1, 3, 2, 6, 7, 0, 0].
Thanks in advance! :)
Just sort by zeroness:
>>> a = [1, 3, 0, 2, 6, 0, 7]
>>> a.sort(key=bool, reverse=True)
>>> a
[1, 3, 2, 6, 7, 0, 0]
my_list = [1, 3, 0, 2, 6, 0, 7]
count = my_list.count(0)
my_list = [value for value in my_list if value != 0]
my_list.extend([0]*count)
print(my_list)
output:
[1, 3, 2, 6, 7, 0, 0]
Here is another way of doing it that is different than the answers so far. This will work whether it is 0 or another number that you need the same operation for (as long as you replace the 0 with the number you want in both the filter functions).
lst = [1, 3, 0, 2, 6, 0, 7]
[*filter((0).__ne__, lst)] + [*filter((0).__eq__, lst)]
Output:
[1, 3, 2, 6, 7, 0, 0]
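The same idea can be wrapped in a small helper that works for any value, not just 0. This is only a sketch; the name move_to_end and its parameters are made up for illustration and do not come from the answers above:

```python
def move_to_end(items, value=0):
    """Return a new list with every occurrence of `value` moved to the end.

    The relative order of all other items is preserved.
    (Hypothetical helper, not part of the original answers.)
    """
    kept = [item for item in items if item != value]
    # Pad with as many copies of `value` as were removed.
    return kept + [value] * (len(items) - len(kept))


print(move_to_end([1, 3, 0, 2, 6, 0, 7]))     # [1, 3, 2, 6, 7, 0, 0]
print(move_to_end([1, 3, 0, 2, 6, 0, 7], 3))  # [1, 0, 2, 6, 0, 7, 3]
```

Because it builds a new list, the input is left untouched, which may or may not be what you want.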
There is definitely an easier way, but here is what I came up with:
l = [1, 3, 0, 2, 6, 0, 7]
le = len(l)
l = [x for x in l if x != 0]
for i in range(0, le - len(l)):
    l.append(0)
This is a basic programming exercise, so let's approach it with the most basic Python constructs. A simple algorithm would be to track the index where the first zero is detected and swap each subsequent non-zero number into that position, which we then move forward. This will carry the zeroes across the list until they are all bunched up at the end.
numbers = [1, 3, 0, 2, 6, 0, 7]
firstZero = -1                            # track position of first zero
for index,number in enumerate(numbers):   # go through numbers in list
    if firstZero < 0:
        if number == 0:                   # do nothing until a zero is found
            firstZero = index             # start tracking first zero position
    elif number != 0:
        numbers[firstZero] = number       # swap non-zero numbers
        numbers[index] = 0                # with zero position
        firstZero += 1                    # and advance first zero position
print(numbers)
# [1, 3, 2, 6, 7, 0, 0]
Tracing the progress of the loop, you can see the movement of the firstZero position relative to the index and the carrying of the zeroes through the list:
# (firstZero and numbers are shown as they are after each step)
# firstZero  index  number  numbers
#    -1        0      1     [1, 3, 0, 2, 6, 0, 7]   # do nothing
#    -1        1      3     [1, 3, 0, 2, 6, 0, 7]   # do nothing
#     2        2      0     [1, 3, 0, 2, 6, 0, 7]   # record position of 1st zero
#     3        3      2     [1, 3, 2, 0, 6, 0, 7]   # swap non-zero (2), advance position of 1st zero
#     4        4      6     [1, 3, 2, 6, 0, 0, 7]   # swap non-zero (6), advance position of 1st zero
#     4        5      0     [1, 3, 2, 6, 0, 0, 7]   # ignore subsequent zeroes, do nothing
#     5        6      7     [1, 3, 2, 6, 7, 0, 0]   # swap non-zero (7), advance position of 1st zero
On the other hand, if you're into more meta trickery, you can leverage the stability of the Python sort and its ability to sort on a calculated key to group the numbers between non-zeroes and zeroes:
numbers.sort(key=lambda n: not n)
The key parameter uses a function to obtain a key to sort on (as opposed to the numbers themselves). That function here is not n which, when applied to an integer, will return True if it is zero and False if it is not. This will match sorting keys with numbers in the following way:
False, False, True, False, False, True, False
[1, 3, 0, 2, 6, 0, 7]
Sorting boolean values will place False values before True and the stability of the Python sort will keep the relative order of items for identical key values:
False, False, False, False, False, True, True
[1, 3, 2, 6, 7, 0, 0]
The calculated key values only exist during the sort process; the result is just the list of numbers sorted accordingly.
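To make the key mapping concrete, here is a small sketch that prints the keys next to the values and then sorts a copy with sorted(), which takes the same key parameter as list.sort. The variable names are only for illustration:

```python
numbers = [1, 3, 0, 2, 6, 0, 7]

# The key computed for each element: True for zeroes, False otherwise.
keys = [not n for n in numbers]
print(keys)    # [False, False, True, False, False, True, False]

# False sorts before True, and the stable sort keeps the original
# order among elements that share the same key.
result = sorted(numbers, key=lambda n: not n)
print(result)  # [1, 3, 2, 6, 7, 0, 0]
```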
Although using the sort function is a nice trick, in terms of complexity it will perform in O(n log n) time. The basic algorithm, being more specialized, is able to do the same work in a single pass through the data, which takes O(n) time.
Another way to group the zeroes at the end of the list in O(n) time is to build a new list by assembling a list of the non-zero items with a list of zeros of the appropriate length:
nonZero = [*filter(bool,numbers)]
numbers = nonZero + [0]*(len(numbers)-len(nonZero))
The nonZero list is built by unpacking the result of the filter iterator into a new list (this is unpacking inside a list display, not a list comprehension). The second part repeats a one-element list containing zero as many times as needed to reach the original length.
For example, let's consider the following numpy array:
[1, 5, 0, 5, 4, 6, 1, -1, 5, 10]
Also, let's suppose that the threshold is equal to 3.
That is to say that we are looking for sequences of at least two consecutive values that are all above the threshold.
The output would be the indices of those values, which in our case is:
[[3, 4, 5], [8, 9]]
If the output array was flattened that would work as well!
[3, 4, 5, 8, 9]
Output Explanation
In our initial array we can see that for index = 1 we have the value 5, which is greater than the threshold, but is not part of a sequence (of at least two values) where every value is greater than the threshold. That's why this index would not make it to our output.
On the other hand, for indices [3, 4, 5] we have a sequence of (at least two) neighboring values [5, 4, 6] where each of them is above the threshold, and that is why their indices are included in the final output!
My Code so far
I have approached the issue with something like this:
(arr > 3).nonzero()
The above command gathers the indices of all the items that are above the threshold. However, I cannot determine whether they are consecutive or not. I have thought of taking a diff of the outcome of the above snippet and then maybe locating the ones (which would mean that the indices are one after the other). That would give us:
np.diff((arr > 3).nonzero())
But I'd still be missing something here.
If you convolve a boolean array with a window of ones of size win_size ([1] * win_size), you obtain an array that holds the value win_size wherever the condition held for win_size consecutive items:
import numpy as np
def groups(arr, *, threshold, win_size, merge_contiguous=False, flat=False):
    # strictly above the threshold, as in the question
    conv = np.convolve((arr > threshold).astype(int), [1] * win_size, mode="valid")
    indexes_start = np.where(conv == win_size)[0]
    indexes = [np.arange(index, index + win_size) for index in indexes_start]
    if flat or merge_contiguous:
        indexes = np.unique(indexes)
        if merge_contiguous:
            indexes = np.split(indexes, np.where(np.diff(indexes) != 1)[0] + 1)
    return indexes
arr = np.array([1, 5, 0, 5, 4, 6, 1, -1, 5, 10])
threshold = 3
win_size = 2
print(groups(arr, threshold=threshold, win_size=win_size))
print(groups(arr, threshold=threshold, win_size=win_size, merge_contiguous=True))
print(groups(arr, threshold=threshold, win_size=win_size, flat=True))
[array([3, 4]), array([4, 5]), array([8, 9])]
[array([3, 4, 5]), array([8, 9])]
[3 4 5 8 9]
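To see why the convolution finds the window starts, it can help to look at the intermediate arrays. The following is just an illustrative sketch on the example data with win_size = 2 and the threshold hard-coded to 3; the variable names above and starts are not part of the answer:

```python
import numpy as np

arr = np.array([1, 5, 0, 5, 4, 6, 1, -1, 5, 10])
win_size = 2

# 1 where the value is above the threshold, 0 elsewhere.
above = (arr > 3).astype(int)
print(above.tolist())   # [0, 1, 0, 1, 1, 1, 0, 0, 1, 1]

# Each entry counts how many of the win_size neighbouring items
# satisfy the condition.
conv = np.convolve(above, [1] * win_size, mode="valid")
print(conv.tolist())    # [1, 1, 1, 2, 2, 1, 0, 1, 2]

# A full window starts wherever the count equals win_size.
starts = np.where(conv == win_size)[0]
print(starts.tolist())  # [3, 4, 8]
```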
You can do what you want using simple numpy operations
import numpy as np
arr = np.array([1, 5, 0, 5, 4, 6, 1, -1, 5, 10])
arr_padded = np.concatenate(([0], arr, [0]))
a = np.where(arr_padded > 3, 1, 0)
da = np.diff(a)
idx_start = (da == 1).nonzero()[0]
idx_stop = (da == -1).nonzero()[0]
valid = (idx_stop - idx_start >= 2).nonzero()[0]
result = [list(range(idx_start[i], idx_stop[i])) for i in valid]
print(result)
Explanation
Array a is a padded binary version of the original array, with 1s where the original elements are greater than three. da contains 1s where "islands" of 1s begin in a, and -1 where the "islands" end in a. Due to the padding, there is guaranteed to be an equal number of 1s and -1s in da. Extracting their indices, we can calculate the length of the islands. Valid index pairs are those whose respective "islands" have length >= 2. Then, it's just a matter of generating all numbers between the index bounds of the valid "islands".
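As a rough illustration of that explanation, the intermediate arrays for the example input look like this. This is only a sketch that repeats the steps of the answer and prints what each one produces:

```python
import numpy as np

arr = np.array([1, 5, 0, 5, 4, 6, 1, -1, 5, 10])

# Padded binary version: 1 where the value is greater than 3.
a = np.where(np.concatenate(([0], arr, [0])) > 3, 1, 0)
da = np.diff(a)
print(a.tolist())   # [0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0]
print(da.tolist())  # [0, 1, -1, 1, 0, 0, -1, 0, 1, 0, -1]

# Start (inclusive) and stop (exclusive) of each island, in arr's indices.
idx_start = (da == 1).nonzero()[0]
idx_stop = (da == -1).nonzero()[0]
print(idx_start.tolist())               # [1, 3, 8]
print(idx_stop.tolist())                # [2, 6, 10]
print((idx_stop - idx_start).tolist())  # island lengths: [1, 3, 2]
```

Only the islands of length 3 and 2 pass the >= 2 test, which gives [[3, 4, 5], [8, 9]].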
I followed your original idea; you were almost done.
I use a second diff (diff2) to pick out the index of the first value in each sequence. See the comments in the code for details.
import numpy as np
arr = np.array([ 1, 5, 0, 5, 4, 6, 1, -1, 5, 10])
threshold = 3
all_idx = (arr > threshold).nonzero()[0]
# array([1, 3, 4, 5, 8, 9])
result = np.empty(0)
if all_idx.size > 1:
    diff1 = np.zeros_like(all_idx)
    diff1[1:] = np.diff(all_idx)
    # array([0, 2, 1, 1, 3, 1])
    diff1[0] = diff1[1]
    # array([2, 2, 1, 1, 3, 1])
    # **Positions with a value 1 in diff1 should be kept.**
    # But we also want the position before each 1. Create another diff2
    diff2 = np.zeros_like(all_idx)
    diff2[:-1] = np.diff(diff1)
    # array([ 0, -1,  0,  2, -2,  0])
    # **Positions with a negative value in diff2 should be kept.**
    result = all_idx[(diff1==1) | (diff2<0)]
print(result)
# [3 4 5 8 9]
I'll try something different using window views. I'm not sure this works all the time, so counterexamples are welcome. It has the advantage of not requiring Python loops.
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as window
def consec_thresh(arr, thresh):
    win = window(np.argwhere(arr > thresh), (2, 1))
    return np.unique(win[np.diff(win, axis=2).ravel() == 1, :,:].ravel())
How does it work?
So we start with the array and gather the indices where the threshold is met:
In [180]: np.argwhere(arr > 3)
Out[180]:
array([[1],
[3],
[4],
[5],
[8],
[9]])
Then we build a sliding window that makes up pair of values along the column (which is the reason for the (2, 1) shape of the window).
In [181]: window(np.argwhere(arr > 3), (2, 1))
Out[181]:
array([[[[1],
[3]]],
[[[3],
[4]]],
[[[4],
[5]]],
[[[5],
[8]]],
[[[8],
[9]]]])
Now we want to take the difference inside each pair, if it's one then the indices are consecutive.
In [182]: np.diff(window(np.argwhere(arr > 3), (2, 1)), axis=2)
Out[182]:
array([[[[2]]],
[[[1]]],
[[[1]]],
[[[3]]],
[[[1]]]])
We can plug those values back in the windows we created above,
In [185]: window(np.argwhere(arr > 3), (2, 1))[np.diff(window(np.argwhere(arr > 3), (2, 1)), axis=2).ravel() == 1, :, :]
Out[185]:
array([[[[3],
[4]]],
[[[4],
[5]]],
[[[8],
[9]]]])
Then we can ravel (flatten without copy when possible), we have to get rid of the repeated indices created by windowing so I call np.unique. We ravel again and get:
array([3, 4, 5, 8, 9])
The iterative code below solves this in O(n) time:
arr = [1, 5, 0, 5, 4, 6, 1, -1, 5, 10]
threshold = 3
sequence = 2
output = []
temp_arr = []
for i in range(len(arr)):
    if arr[i] > threshold:
        temp_arr.append(i)
    else:
        if len(temp_arr) >= sequence:
            output.append(temp_arr)
        temp_arr = []
# a run that reaches the end of the list must also be long enough
if len(temp_arr) >= sequence:
    output.append(temp_arr)
    temp_arr = []
print(output)
# Output
# [[3, 4, 5], [8, 9]]
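The same single-pass idea can also be written with itertools.groupby, which groups consecutive items that share a key. The sketch below is one possible way to do it; the function name runs_above and the min_len parameter are made up for illustration:

```python
from itertools import groupby


def runs_above(values, threshold, min_len=2):
    """Return index lists for runs of at least `min_len` consecutive
    values strictly greater than `threshold`.
    (Hypothetical helper name; not taken from the answers above.)
    """
    runs = []
    # Group (index, value) pairs by whether the value is above the threshold.
    for above, group in groupby(enumerate(values), key=lambda pair: pair[1] > threshold):
        indices = [index for index, _ in group]
        if above and len(indices) >= min_len:
            runs.append(indices)
    return runs


print(runs_above([1, 5, 0, 5, 4, 6, 1, -1, 5, 10], 3))  # [[3, 4, 5], [8, 9]]
print(runs_above([5, 5, 0, 5], 3))                       # [[0, 1]]
```

The second call shows that a single qualifying value at the very end of the list is not reported.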
I would suggest using a for loop with two indices, i starting at 0 and j at 1 (j = i + 1).
You then check whether the values at both indices are greater than the threshold; if so,
add the indices to a list and keep moving j forward until the value at j is no longer greater than the threshold.
values = [1, 5, 0, 5, 4, 6, 1, -1, 5, 10]
res=[]
threshold= 3
i=0
j=0
for _ in values:
    j=i+1
    lista=[]
    try:
        print(f"i: {i} j:{j}")
        # check if condition is met
        if(values[i] > threshold and values[j] > threshold):
            lista.append(i)
            # add sequence
            while values[j] > threshold:
                lista.append(j)
                print(f"j while: {j}")
                j+=1
                if(j>=len(values)):
                    break
            res.append(lista)
        i=j
        if(j>=len(values)):
            break
    except IndexError:
        # j ran past the end of the list
        print("ex")
print(res)
# [[3, 4, 5], [8, 9]]
This works, but needs refactoring.
Let's try the following code:
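# Simple is better than complex
# Complex is better than complicated
arr = [1, 5, 0, 5, 4, 6, 1, -1, 5, 10]
arr_3=[i if arr[i]>3 else 'a' for i in range(len(arr))]
arr_4=''.join(str(x) for x in arr_3)
# Split on the 'a' markers. This line was missing from the snippet and is
# reconstructed from context. Note that joining the indices as digits only
# works while every index is a single digit.
arr_5=arr_4.split('a')
i=0
while i<len(arr_5):
    if len(arr_5[i]) <=1:
        del arr_5[i]
    else:
        i+=1
arr_6=[list(map(lambda x: int(x), list(x))) for x in arr_5]
print(arr_6)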
Outputs:
[[3, 4, 5], [8, 9]]
Here is a solution that makes use of a pandas Series:
import numpy as np
import pandas as pd
arr = np.array([1, 5, 0, 5, 4, 6, 1, -1, 5, 10])
thresh = 3
win_size = 2
s = pd.Series(arr)
# locating groups of values where there are at least (win_size) consecutive values above the threshold
groups = s.groupby(s.le(thresh).cumsum().loc[s.gt(thresh)]).transform('count').ge(win_size)
0 False
1 False
2 False
3 True
4 True
5 True
6 False
7 False
8 True
9 True
dtype: bool
We can now easily take their indices in a 1D array:
np.flatnonzero(groups)
# array([3, 4, 5, 8, 9], dtype=int64)
OR multiple lists:
[np.arange(index.start, index.stop) for index in np.ma.clump_unmasked(np.ma.masked_not_equal(groups.values, value=True))]
# [array([3, 4, 5], dtype=int64), array([8, 9], dtype=int64)]
Is there any efficient way to replace all zeros in a tensor with the last non-zero value in torch?
For example if I had the tensor:
tensor([[1, 0, 0, 4, 0, 5, 0, 0],
[0, 3, 0, 6, 0, 0, 8, 0]])
The output should be:
tensor([[1, 1, 1, 4, 4, 5, 5, 5],
[0, 3, 3, 6, 6, 6, 8, 8]])
I currently have the following code:
def replace_zeros_with_prev_nonzero(tensor):
    output = tensor.clone()
    for i in range(len(output)):
        prev_value = 0
        for j in range(len(tensor[i])):
            if tensor[i,j] == 0:
                output[i,j] = prev_value
            else:
                prev_value = tensor[i,j].item()
    return output
But it feels a bit clunky, and I'm sure there has to be a better way to do this. So is it possible to write it in fewer lines or, better yet, parallelise the operation without treating the tensors as arrays?
You can remove one of the loops by vectorising over the first dimension.
def replace_zeros_with_prev_nonzero(tensor):
    output = tensor.clone()
    for i in range(1, tensor.shape[1]):
        mask = tensor[:, i] == 0
        output[mask, i] = output[mask, i-1]
    return output
The line output[mask, i] = output[mask, i-1] replaces each 0 with the previous value in its row (which has itself already been replaced if it was originally 0, except at index 0).
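The remaining loop over columns can in principle be removed as well by forward-filling indices instead of values. The sketch below shows the idea with NumPy rather than torch, simply because it is easy to run anywhere; the function name is made up for illustration. In torch, torch.cummax and torch.gather should be able to play the roles of np.maximum.accumulate and np.take_along_axis, but that translation is an assumption and is not tested here:

```python
import numpy as np


def forward_fill_zeros(arr):
    """Replace zeros in each row with the last non-zero value to their left.

    Leading zeros stay 0. (Hypothetical helper for a 2D array.)
    """
    cols = np.arange(arr.shape[1])
    # Column index of each non-zero entry, 0 where the entry is zero.
    idx = np.where(arr != 0, cols, 0)
    # Running maximum = index of the most recent non-zero entry so far.
    idx = np.maximum.accumulate(idx, axis=1)
    # Pick the value at that index for every position.
    return np.take_along_axis(arr, idx, axis=1)


data = np.array([[1, 0, 0, 4, 0, 5, 0, 0],
                 [0, 3, 0, 6, 0, 0, 8, 0]])
print(forward_fill_zeros(data).tolist())
# [[1, 1, 1, 4, 4, 5, 5, 5], [0, 3, 3, 6, 6, 6, 8, 8]]
```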
I want to go through just a section of a 2D list, rather than the whole thing.
Here's essentially what it is I want to do:
Let's say the user inputs the coordinates [1,1] (so, row 1 column 1)
If I have the 2D list:
[[1,3,7],
[4,2,9],
[13,5,6]]
Then I want to iterate through all the elements adjacent to the element at [1,1]
Furthermore, if the element is at a corner or on the edge of the 2D list (so basically if the user enters [0,0], for example), then I just want to get back the elements at [0,0], [0,1], [1,0], and [1,1].
So essentially I just want the elements adjacent to a certain point in the 2D array.
Here's what I've done so far:
I've made it so that it assigns 4 variables at the start of the code: row_start, row_end, column_start, and column_end. These variables are assigned values based on the coordinates the user inputs (if the row is 0 or len(list), then the for loop runs accordingly. The same goes for the columns).
Then, I use a nested for loop to go through every element:
for row in range(row_start, row_end+1):
    for column in range(column_start, column_end+1):
        print(lst[row][column])
The only thing is, it doesn't seem to work correctly and often outputs the entire 2D list when I enter a list size of more than 3x3 elements (all the lists will be square lists).
You can slice the list of lists according to the given row and column. For the lower bounds, use max with 0 to avoid slicing with a negative index, but not so for the upper bounds since it is okay for the stopping index of a slice to be out of the range of a list:
def get_adjacent_items(matrix, row, col):
    output = []
    for r in matrix[max(row - 1, 0): row + 2]:
        for i in r[max(col - 1, 0): col + 2]:
            output.append(i)
    return output
or, with a list comprehension:
def get_adjacent_items(matrix, row, col):
    return [i for r in matrix[max(row - 1, 0): row + 2] for i in r[max(col - 1, 0): col + 2]]
so that given:
m = [[1, 3, 7],
[4, 2, 9],
[13, 5, 6]]
get_adjacent_items(m, 0, 0) returns: [1, 3, 4, 2]
get_adjacent_items(m, 1, 1) returns: [1, 3, 7, 4, 2, 9, 13, 5, 6]
get_adjacent_items(m, 2, 1) returns: [4, 2, 9, 13, 5, 6]
get_adjacent_items(m, 2, 2) returns: [2, 9, 5, 6]
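If you also need the coordinates, or want to leave out the centre element itself, a generator along the following lines might do. It is a sketch only: the name neighbours and the include_centre flag are hypothetical and not part of the answer above:

```python
def neighbours(matrix, row, col, include_centre=False):
    """Yield (r, c, value) for the cells surrounding matrix[row][col].

    Hypothetical variant of get_adjacent_items that also reports the
    coordinates and can skip the centre cell itself.
    """
    # Clamp both bounds so that range() never leaves the matrix.
    for r in range(max(row - 1, 0), min(row + 2, len(matrix))):
        for c in range(max(col - 1, 0), min(col + 2, len(matrix[r]))):
            if include_centre or (r, c) != (row, col):
                yield r, c, matrix[r][c]


m = [[1, 3, 7],
     [4, 2, 9],
     [13, 5, 6]]

print([value for _, _, value in neighbours(m, 0, 0)])  # [3, 4, 2]
print(list(neighbours(m, 2, 2)))  # [(1, 1, 2), (1, 2, 9), (2, 1, 5)]
```

Here the upper bounds are clamped with min because range, unlike a slice, would otherwise produce indices past the end of the list.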
I have tried using scipy.stats mode to find the most common value. My matrix contains a lot of zeros, though, and so this is always the mode.
For example, if my matrix looks like the following:
array = np.array([[0, 0, 3, 2, 0, 0],
[5, 2, 1, 2, 6, 7],
[0, 0, 2, 4, 0, 0]])
I'd like to have the value of 2 returned.
Try collections.Counter:
import numpy as np
from collections import Counter
a = np.array(
[[0, 0, 3, 2, 0, 0],
[5, 2, 1, 2, 6, 7],
[0, 0, 2, 4, 0, 0]]
)
ctr = Counter(a.ravel())
second_most_common_value, its_frequency = ctr.most_common(2)[1]
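Note that taking the second entry of most_common only gives the wanted answer when zero really is the most frequent value. A variant that filters the zeros out before counting avoids that assumption. The sketch below uses plain nested lists so that it runs without NumPy; for an array, a.ravel().tolist() could be fed in instead. The variable names are only illustrative:

```python
from collections import Counter
from itertools import chain

rows = [[0, 0, 3, 2, 0, 0],
        [5, 2, 1, 2, 6, 7],
        [0, 0, 2, 4, 0, 0]]

# Count only the non-zero entries, so it does not matter whether
# zero happens to be the most frequent value or not.
ctr = Counter(value for value in chain.from_iterable(rows) if value != 0)
most_common_value, its_frequency = ctr.most_common(1)[0]
print(most_common_value, its_frequency)  # 2 4
```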
As mentioned in some comments, you probably are speaking of numpy arrays.
In this case, it is rather simple to mask the value you want to avoid:
import numpy as np
from scipy.stats import mode
array = np.array([[0, 0, 3, 2, 0, 0],
[5, 2, 1, 2, 6, 7],
[0, 0, 2, 4, 0, 0]])
flat_non_zero = array[np.nonzero(array)]
mode(flat_non_zero)
Which returns (array([2]), array([ 4.])) meaning the value appearing the most is 2, and it appears 4 times (see the doc for more info). So if you want to only get 2, you just need to get the first index of the return value of the mode : mode(flat_non_zero)[0][0]
EDIT: if you want to filter another specific value x from array instead of zero, you can use array[array != x]
original_list = [1, 2, 3, 1, 2, 5, 6, 7, 8]  # original list
noDuplicates = list(set(original_list))  # creates a list of all the unique numbers of the original list
most_common = [None, 0]  # start with no candidate, so that 0 can never be reported by default
for number in noDuplicates:  # loops through the unique numbers
    if number != 0:  # makes sure that we do not check 0
        count = original_list.count(number)  # checks how many times that unique number appears in the original list
        if count > most_common[1]:  # if the count is greater than the most_common count
            most_common = [number, count]  # resets most_common to the current number and count
print(str(most_common[0]) + " is listed " + str(most_common[1]) + " times!")
This loops through your list and finds the most used number and prints it with the number of occurrences in your original list.
This is probably really easy to do, but I am looking to calculate the length of consecutive positive occurrences in a list in Python. For example, I have a and I am looking to return b:
a=[0,0,1,1,1,1,0,0,1,0,1,1,1,0]
b=[0,0,4,4,4,4,0,0,1,0,3,3,3,0]
I note a similar question, Counting consecutive positive value in Python array, but that only returns running counts, not the length of the group each element belongs to.
Thanks
This is similar to a run length encoding problem, so I've borrowed some ideas from that Rosetta code page:
import itertools
a=[0,0,1,1,1,1,0,0,1,0,1,1,1,0]
b = []
for item, group in itertools.groupby(a):
    size = len(list(group))
    for i in range(size):
        if item == 0:
            b.append(0)
        else:
            b.append(size)
b
Out[8]: [0, 0, 4, 4, 4, 4, 0, 0, 1, 0, 3, 3, 3, 0]
At last, after so many tries, I came up with these two lines.
In [9]: from itertools import groupby
In [10]: lst=[list(g) for k,g in groupby(a)]
In [21]: [x*len(_lst) if x>=0 else x for _lst in lst for x in _lst]
Out[21]: [0, 0, 4, 4, 4, 4, 0, 0, 1, 0, 3, 3, 3, 0]
Here's one approach.
The basic premise is that when in a consecutive run of positive values, it will remember all the indices of these positive values. As soon as it hits a zero, it will backtrack and replace all the positive values with the length of their run.
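a=[0,0,1,1,1,1,0,0,1,0,1,1,1,0]
glob = []
last = None
for idx, i in enumerate(a):
    if i>0:
        glob.append(idx)
    if i==0 and last != i:
        for j in glob:
            a[j] = len(glob)
        glob = []
    last = i  # remember the previous value
# a run that reaches the end of the list still has to be written back
for j in glob:
    a[j] = len(glob)
print(a)
# > [0, 0, 4, 4, 4, 4, 0, 0, 1, 0, 3, 3, 3, 0]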
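For comparison, here is a compact groupby-based sketch that returns a new list instead of modifying a in place, and that also handles a list ending in a positive run. The name run_lengths is made up for illustration:

```python
from itertools import groupby


def run_lengths(values):
    """Replace each positive value with the length of the run it belongs to.

    Zeros (and any non-positive values) are kept as they are.
    (Hypothetical helper name, for illustration only.)
    """
    result = []
    # Consecutive items are grouped by whether they are positive.
    for positive, group in groupby(values, key=lambda v: v > 0):
        run = list(group)
        result.extend([len(run)] * len(run) if positive else run)
    return result


print(run_lengths([0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]))
# [0, 0, 4, 4, 4, 4, 0, 0, 1, 0, 3, 3, 3, 0]
print(run_lengths([1, 1, 0, 1, 1, 1]))  # [2, 2, 0, 3, 3, 3]
```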