How can I find the second most common number in an array? - python

I have tried using scipy.stats mode to find the most common value. My matrix contains a lot of zeros, though, and so this is always the mode.
For example, if my matrix looks like the following:
array = np.array([[0, 0, 3, 2, 0, 0],
[5, 2, 1, 2, 6, 7],
[0, 0, 2, 4, 0, 0]])
I'd like to have the value of 2 returned.

Try collections.Counter:
import numpy as np
from collections import Counter
a = np.array(
[[0, 0, 3, 2, 0, 0],
[5, 2, 1, 2, 6, 7],
[0, 0, 2, 4, 0, 0]]
)
ctr = Counter(a.ravel())
second_most_common_value, its_frequency = ctr.most_common(2)[1]

As mentioned in some comments, you probably are speaking of numpy arrays.
In this case, it is rather simple to mask the value you want to avoid:
import numpy as np
from scipy.stats import mode
array = np.array([[0, 0, 3, 2, 0, 0],
[5, 2, 1, 2, 6, 7],
[0, 0, 2, 4, 0, 0]])
flat_non_zero = array[np.nonzero(array)]
mode(flat_non_zero)
This returns (array([2]), array([ 4.])), meaning the most frequent value is 2 and it appears 4 times (see the doc for more info). To get only the 2, take the first element of the mode part of the result: mode(flat_non_zero)[0][0]. Recent SciPy releases may return plain scalars instead of one-element arrays, in which case mode(flat_non_zero)[0] is enough.
EDIT: if you want to filter another specific value x from array instead of zero, you can use array[array != x]
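If SciPy is not available, the same masking idea can be sketched with NumPy alone. This is only an illustration: the helper name most_common_excluding is made up here, and np.unique with return_counts=True stands in for mode.

```python
import numpy as np

def most_common_excluding(arr, exclude=0):
    """Return (value, count) of the most frequent entry other than `exclude`.

    Hypothetical helper for illustration, not part of NumPy or SciPy.
    """
    kept = arr[arr != exclude]            # boolean masking yields a flat 1-D array
    values, counts = np.unique(kept, return_counts=True)
    best = counts.argmax()                # on ties, the first (smallest) value wins
    return values[best], counts[best]

array = np.array([[0, 0, 3, 2, 0, 0],
                  [5, 2, 1, 2, 6, 7],
                  [0, 0, 2, 4, 0, 0]])
value, count = most_common_excluding(array)
print(value, count)  # 2 4
```

If nothing is left after filtering, argmax raises a ValueError, so an all-zero input would need its own check.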

original_list = [1, 2, 3, 1, 2, 5, 6, 7, 8] #original list
noDuplicates = list(set(original_list)) #creates a list of all the unique numbers of the original list
most_common = [None, 0] #[number, count]; starts at count 0 so the first non-zero number replaces it
for number in noDuplicates: #loops through the unique numbers
    if number != 0: #makes sure that we do not check 0
        count = original_list.count(number) #checks how many times that unique number appears in the original list
        if count > most_common[1]: #if the count is greater than the most_common count
            most_common = [number, count] #resets most_common to the current number and count
print(str(most_common[0]) + " is listed " + str(most_common[1]) + " times!")
This loops through the unique numbers in your list, finds the most used non-zero number, and prints it with its number of occurrences in your original list.

Related

Find runs and lengths of consecutive values in an array

I'd like to find equal values in an array and their indices if they occur consecutively more than 2 times.
[0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4]
So in this example I would find that the value "2" occurred "4" times, starting from position "8". Is there any built-in function to do that?
I found a way with collections.Counter
collections.Counter(a)
# Counter({2: 5, 1: 4, 0: 3, 3: 2, 4: 1})
but this is not what I am looking for.
Of course I can write a loop, compare neighbouring values and count them, but maybe there is a more elegant solution?
Find consecutive runs and length of runs with condition
import numpy as np
arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4])
res = np.ones_like(arr)
np.bitwise_xor(arr[:-1], arr[1:], out=res[1:]) # set equal, consecutive elements to 0
# use this for np.floats instead
# arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2.4, 2.4, 2.4, 2, 1, 3, 4, 4, 4, 5])
# res = np.hstack([True, ~np.isclose(arr[:-1], arr[1:])])
idxs = np.flatnonzero(res) # get indices of non zero elements
values = arr[idxs]
counts = np.diff(idxs, append=len(arr)) # difference between consecutive indices are the length
cond = counts > 2
values[cond], counts[cond], idxs[cond]
Output
(array([2]), array([4]), array([8]))
# (array([2.4, 4. ]), array([3, 3]), array([ 8, 14]))
_, i, c = np.unique(np.r_[[0], ~np.isclose(arr[:-1], arr[1:])].cumsum(),
                    return_index=True,
                    return_counts=True)
for index, count in zip(i, c):
    if count > 2:  # more than 2 consecutive occurrences, as asked in the question
        print([arr[index], count, index])
Out[]: [2, 4, 8]
A little more compact way of doing it that works for all input types.
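For plain Python lists, the same runs can be found without NumPy. The sketch below uses itertools.groupby; the function name find_runs and its min_length parameter are made up for this example.

```python
from itertools import groupby

def find_runs(seq, min_length=3):
    """Yield (value, length, start) for each run of equal consecutive items.

    Hypothetical helper; min_length=3 means "more than 2 times".
    """
    start = 0
    for value, group in groupby(seq):
        length = sum(1 for _ in group)   # size of this run
        if length >= min_length:
            yield value, length, start
        start += length                  # next run begins right after this one

a = [0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4]
print(list(find_runs(a)))  # [(2, 4, 8)]
```

groupby compares with ==, so float data that should match only approximately would still need the np.isclose approach.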

How to modify a list if a matrix contain a value?

I have a list with values [5, 5, 5, 5, 5] and a matrix filled with 1s and 0s.
I want a new list built like this:
wherever there's a 1 in the matrix, add 2 to the corresponding value of v if the 1 is in the first row, and add 3 if it is in the second row.
example:
list:
v = [5,5,5,5,5]
matrix:
m = [[0, 1, 1, 0, 0], [0, 0, 1, 1, 0]]
final result:
v1 = [5,7,10,8,5]
Create a function that adds array lines; its parameters can be 1D numeric arrays. It loops through the arrays and returns a result array that is the element-wise sum.
If your task requires it, add a check that the lines are of equal length and abort the function with an error if they are not.
Run this function on all of the matrix lines, and then run it on the result of that and the input array.
Hope that was clear enough.
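A rough sketch of that description in plain Python. The name add_rows is made up, and scaling each row by its weight (2 for the first row, 3 for the second, taken from the question) is a step the description leaves implicit.

```python
def add_rows(first, second):
    """Element-wise sum of two equal-length 1D sequences (hypothetical helper)."""
    if len(first) != len(second):
        raise ValueError("rows must have the same length")
    return [x + y for x, y in zip(first, second)]

v = [5, 5, 5, 5, 5]
m = [[0, 1, 1, 0, 0], [0, 0, 1, 1, 0]]
weights = [2, 3]  # assumption: one weight per matrix row, as in the question

v1 = v
for weight, row in zip(weights, m):
    scaled = [weight * cell for cell in row]  # turn each 1 into its row's weight
    v1 = add_rows(v1, scaled)                 # returns a new list, v is untouched
print(v1)  # [5, 7, 10, 8, 5]
```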
You can use NumPy package for efficient code.
import numpy as np
v = [5,5,5,5,5]
matrix = [[0, 1, 1, 0, 0],
[0, 0, 1, 1, 0]]
weights = np.array([2,3])
w_matrix = np.multiply(matrix, weights[:, np.newaxis]).sum(axis=0)
v1 = v + w_matrix
classical python:
You can use a list comprehension:
factors = [2, 3]  # weight for each row of m; v and m are the list and matrix from the question
to_add = [sum((A*B) for A,B in zip(factors,x)) for x in zip(*m)]
[a+b for a,b in zip(v, to_add)]
output: [5, 7, 10, 8, 5]
numpy:
That said, this is a perfect use case for numpy that is more efficient and less verbose:
import numpy as np
v = [5,5,5,5,5]
m = [[0, 1, 1, 0, 0], [0, 0, 1, 1, 0]]
factors = [2,3]
V = np.array(v)
M = np.array(m)
F = np.array(factors)
V+(M*F[:,None]).sum(0)
output: array([ 5, 7, 10, 8, 5])
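The multiply-then-sum step is a vector-matrix product, so it could also be written with the @ operator. A small sketch with the same v, m and factors:

```python
import numpy as np

V = np.array([5, 5, 5, 5, 5])
M = np.array([[0, 1, 1, 0, 0], [0, 0, 1, 1, 0]])
F = np.array([2, 3])

# F @ M sums F[row] * M[row, col] over the rows, giving one total per column
result = V + F @ M
print(result.tolist())  # [5, 7, 10, 8, 5]
```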

Replacing the values of a numpy array of zeros using a array of indexes

I'm working with numpy and have a problem with indexing. I have a numpy array of zeros and a 2D array of indexes, and I need to use these indexes to set the matching entries of the array of zeros to 1. Here is what I tried, but it's not working:
import numpy as np
idx = np.array([[0, 3, 4],
                [1, 3, 5],
                [0, 4, 5]]) #Array of index
zeros = np.zeros(6) #Array of zeros [0, 0, 0, 0, 0, 0]
repeat = np.tile(zeros, (idx.shape[0], 1)) #This repeats the array of zeros to match the number of rows of the index array
res = []
for i, j in zip(repeat, idx):
    res.append(i[j] = 1) #Here I try to replace the matching index by the value of 1
output = np.array(res)
but I get the syntax error
expression cannot contain assignment, perhaps you meant "=="?
my desired output should be
output = [[1, 0, 0, 1, 1, 0],
[0, 1, 0, 1, 0, 1],
[1, 0, 0, 0, 1, 1]]
This is just an example; the idx array can be bigger. I think the problem is the indexing, and I believe there is a much simpler way of doing this without repeating the array of zeros and using the zip function, but I can't figure it out. Any help would be appreciated, thank you!
EDIT: When I change the = to == I get a boolean array, which I don't need, so I don't know what's happening there either.
You can use np.put_along_axis to assign values into the array repeat based on indices in idx. This is more efficient than a loop (and easier).
import numpy as np
idx = np.array([[0, 3, 4],
[1, 3, 5],
[0, 4, 5]]) #Array of index
zeros = np.zeros(6).astype(int) #Array of zeros [0, 0, 0, 0, 0, 0]
repeat = np.tile(zeros, (idx.shape[0], 1))
np.put_along_axis(repeat, idx, 1, 1)
repeat will then be:
array([[1, 0, 0, 1, 1, 0],
[0, 1, 0, 1, 0, 1],
[1, 0, 0, 0, 1, 1]])
FWIW, you can also make the array of zeros directly by passing in the shape:
np.zeros([idx.shape[0], 6])
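Putting both points together, a possible variant creates the integer array in its final shape and fills it with plain integer-array indexing instead of np.put_along_axis. The names n_cols, rows and out are chosen for this sketch, and the width of 6 is taken from the question.

```python
import numpy as np

idx = np.array([[0, 3, 4],
                [1, 3, 5],
                [0, 4, 5]])
n_cols = 6  # assumption: width of each output row, as in the question

out = np.zeros((idx.shape[0], n_cols), dtype=int)
rows = np.arange(idx.shape[0])[:, None]  # column vector, broadcasts against idx
out[rows, idx] = 1                       # sets out[r, idx[r, k]] = 1 for every r, k
print(out.tolist())
# [[1, 0, 0, 1, 1, 0], [0, 1, 0, 1, 0, 1], [1, 0, 0, 0, 1, 1]]
```

Passing dtype=int avoids the float zeros that np.zeros produces by default.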

Why is this index function showing a 2 instead of a 4?

When I am running the following code, I get 0,1,2,3,2 to be printed out. Why would a 2 be printed as the last number when it should be a 4? The index function should be printing the index of each of the lists inside of the big list, correct? Using Python btw
game_board = [["o", 1, 0, 0, 0],
[2, 2, 0, 0, 0],
[0, 0, 2, 2, 2],
[0, 0, 2, "o", 2],
[0, 0, 2, 2, 2]]
for i in game_board:
print(game_board.index(i))
The index method of a list returns the position of the first occurrence of an item. Here's a simplified example:
game_board = [0,2,5,2,3]
game_board.index(2)
>>> 1
This returns the index of the first place the number 2 appears in the list.
In your example, the row
[0, 0, 2, 2, 2]
is at index 2 and index 4. When iterating over each row and looking up its index, rows 0, 1, 2, and 3 return the expected values, but because the row at index 4 is equal to the row at index 2, index returns 2 for it.
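If the goal is simply the position of each row during the loop, enumerate gives it directly without searching the list. A short sketch, where the list name positions is only for illustration:

```python
game_board = [["o", 1, 0, 0, 0],
              [2, 2, 0, 0, 0],
              [0, 0, 2, 2, 2],
              [0, 0, 2, "o", 2],
              [0, 0, 2, 2, 2]]

# enumerate pairs each row with its position, so equal rows keep distinct indices
positions = [i for i, row in enumerate(game_board)]
print(positions)  # [0, 1, 2, 3, 4]

# index() searches by equality and stops at the first match
print(game_board.index(game_board[4]))  # 2
```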

python consecutive counts of an occurence with length

This is probably really easy to do, but I am looking to calculate the length of consecutive positive occurrences in a list in Python. For example, I have a and I am looking to return b:
a=[0,0,1,1,1,1,0,0,1,0,1,1,1,0]
b=[0,0,4,4,4,4,0,0,1,0,3,3,3,0]
I noted a similar question, Counting consecutive positive value in Python array, but it only returns running counts, not the length of the group each element belongs to.
Thanks
This is similar to a run length encoding problem, so I've borrowed some ideas from that Rosetta code page:
import itertools
a=[0,0,1,1,1,1,0,0,1,0,1,1,1,0]
b = []
for item, group in itertools.groupby(a):
    size = len(list(group))
    for i in range(size):
        if item == 0:
            b.append(0)
        else:
            b.append(size)
b
Out[8]: [0, 0, 4, 4, 4, 4, 0, 0, 1, 0, 3, 3, 3, 0]
At last after so many tries came up with these two lines.
In [9]: from itertools import groupby
In [10]: lst=[list(g) for k,g in groupby(a)]
In [21]: [x*len(_lst) if x>=0 else x for _lst in lst for x in _lst]
Out[21]: [0, 0, 4, 4, 4, 4, 0, 0, 1, 0, 3, 3, 3, 0]
Here's one approach.
The basic premise is that when in a consecutive run of positive values, it will remember all the indices of these positive values. As soon as it hits a zero, it will backtrack and replace all the positive values with the length of their run.
a=[0,0,1,1,1,1,0,0,1,0,1,1,1,0]
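glob = []  # indices of the current run of positive values
for idx, i in enumerate(a):
    if i > 0:
        glob.append(idx)
    elif i == 0:  # a zero ends the current run
        for j in glob:
            a[j] = len(glob)
        glob = []
for j in glob:  # a run that reaches the end of the list is never followed by a zero
    a[j] = len(glob)
# > [0, 0, 4, 4, 4, 4, 0, 0, 1, 0, 3, 3, 3, 0]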
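A variant that leaves a untouched and returns a new list might group on whether each item is positive rather than on its value. The function name run_lengths is made up for this sketch, and treating any x > 0 as part of the same run is an assumption that goes slightly beyond the 0/1 data in the question.

```python
from itertools import groupby

def run_lengths(values):
    """Replace each positive item by the length of its run; other items become 0.

    Hypothetical helper for illustration.
    """
    result = []
    for is_positive, group in groupby(values, key=lambda x: x > 0):
        size = sum(1 for _ in group)                      # length of this run
        result.extend([size if is_positive else 0] * size)
    return result

a = [0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
print(run_lengths(a))                   # [0, 0, 4, 4, 4, 4, 0, 0, 1, 0, 3, 3, 3, 0]
print(run_lengths([1, 1, 0, 1, 1, 1]))  # [2, 2, 0, 3, 3, 3]
```

Because the result is built from the groups, a run that reaches the end of the list needs no special handling.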
