Suppose I have a pandas Series that consists of 0s and 1s (though this could equally be a numpy array or any iterable). I would like to write a function that takes an array and an input n and returns a new series that contains 1s at the n indices leading up to every 1 in the original series. Here is an example:
array = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1])
> preceding_indices_function(array, 2)
np.array([0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1])
For each time there is a 1 in the input array, the two indices preceding it are filled in with 1 regardless of whether there is a 0 or 1 in that index in the original array.
I would really appreciate some help on this. Thanks!
Use a convolution with np.convolve:
N = 2
# craft a custom kernel
kernel = np.ones(2*N+1)
kernel[-N:] = 0
# array([1, 1, 1, 0, 0])
out = (np.convolve(array, kernel, mode='same') != 0).astype(int)
Output:
array([0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1])
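For reference, the same idea can be wrapped into the function signature from the question. This is only a sketch of a variation, not the exact kernel above: it assumes a kernel of n+1 ones with mode='full' and slices off the first n entries, which gives the same result for 0/1 input and also copes with n = 0 and with inputs shorter than the kernel.

```python
import numpy as np

def preceding_indices_function(array, n):
    """Return 1 wherever the input has a 1 at that index or within the next n indices."""
    values = np.asarray(array)
    # full[m] sums values[m-n .. m], so full[i + n] sums values[i .. i+n]
    full = np.convolve(values, np.ones(n + 1), mode="full")
    return (full[n:] != 0).astype(int)

array = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1])
print(preceding_indices_function(array, 2))
```

Because of np.asarray, this should accept a pandas Series or a plain list as well.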
Unless you don't want to use numpy, mozway's convolution is the best solution.
But since several iterative approaches have been given, I'll add my itertools-based solution:
import itertools
[a or b or c for a,b,c in itertools.zip_longest(array, array[1:], array[2:], fillvalue=0)]
zip_longest works like the classic zip, but when the iterables have different lengths it keeps going until the longest one is exhausted. Exhausted iterables yield None, unless you pass a fillvalue parameter to zip_longest.
So here itertools.zip_longest(array, array[1:], array[2:], fillvalue=0) yields a sequence of triplets (a, b, c) of 3 consecutive elements: a is the current element, b the next one and c the one after that, with b and c being 0 when there is no such element.
From there, a simple comprehension builds a list of a or b or c, which is 1 if any of a, b or c is 1, and 0 otherwise.
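The comprehension above is hardcoded for n = 2. A possible generalization to any n, as a rough sketch: the name preceding_indices and the use of islice to build the shifted views are my own choices, not part of the answer above.

```python
from itertools import islice, zip_longest

def preceding_indices(values, n):
    """For each position, report 1 if that element or any of the next n is truthy."""
    values = list(values)
    # One view per offset 0..n; shorter views are padded with 0 by zip_longest
    shifted = [islice(values, k, None) for k in range(n + 1)]
    return [int(any(window)) for window in zip_longest(*shifted, fillvalue=0)]

data = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1]
print(preceding_indices(data, 2))
```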
import numpy as np
array = np.array([0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1])
array = np.array([a or array[idx+1] or array[idx+2] for idx, a in enumerate(array[:-2])] + [array[-2] or array[-1]] + [array[-1]])
This function works if array is a list, and should work with other mutable sequences as well. Note that it modifies array in place:
def preceding_indices_function(array, n):
    for i in range(len(array)):
        if array[i] == 1:
            for j in range(n):
                if i-j-1 >= 0:
                    array[i-j-1] = 1
    return array
I have a solution that is similar to the other one but, in my opinion, slightly simpler. Note that it stops two elements before the end, so the result is shorter than the input:
>>> [1 if (array[i+1] == 1 or array[i+2] == 1) else x for i,x in enumerate(array) if i < len(array) - 2]
[0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]
I am new to programming and am taking my first steps in Python. I am trying to collect, in a single list, the indices of certain values of my list (those that are greater than 10). When using append I get an error and I don't understand where it comes from.
dbs = [0, 1, 0, 0, 0, 0, 1, 0, 1, 23, 1, 0, 1, 1, 0, 0, 0,
       1, 1, 0, 20, 1, 1, 15, 1, 0, 0, 0, 40, 15, 0, 0]
exceed2 = []
for d, i in enumerate(dbs):
    if i > 10:
        exceed2.append= (d,i)
        print(exceed2)
You probably mean to write
for i, d in enumerate(dbs):
    if d > 10:
        exceed2.append(i)
print(exceed2)
A few fixes here:
exceed2.append= (d,i) does not call append; it tries to assign a tuple to the list's append attribute, which raises an AttributeError. You should just write append(...).
enumerate() yields (index, value) pairs, so in your loop d was the index and i the value. You should be checking d > 10 once d is the value (per your description of the task), and then putting only i into the exceed2 list. (I switched the i and d variables so that i is the index, as that's more conventional.)
append(d,i) wouldn't work anyway, as append takes exactly one argument. If you want to append both the value and the index, use .append((d, i)), which appends a tuple of both to the list.
You probably don't want to print exceed2 every time the condition is hit, when you could just print it once at the end.
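Once the loop works, the same thing is often written as a list comprehension. A small sketch, using the dbs list from the question; the name pairs is just for illustration:

```python
dbs = [0, 1, 0, 0, 0, 0, 1, 0, 1, 23, 1, 0, 1, 1, 0, 0, 0,
       1, 1, 0, 20, 1, 1, 15, 1, 0, 0, 0, 40, 15, 0, 0]

# Indices only: enumerate yields (index, value), keep the index when value > 10
exceed2 = [i for i, d in enumerate(dbs) if d > 10]

# Index and value together, stored as tuples
pairs = [(i, d) for i, d in enumerate(dbs) if d > 10]

print(exceed2)  # [9, 20, 23, 28, 29]
print(pairs)    # [(9, 23), (20, 20), (23, 15), (28, 40), (29, 15)]
```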
Welcome to this world :D
The problem is that .append is a method that takes exactly one argument and appends it to the end of the list you call it on; it has to be called with parentheses, not assigned to.
Try this instead (enumerate yields the index first, then the value):
dbs = [0, 1, 0, 0, 0, 0, 1, 0, 1, 23, 1, 0, 1, 1, 0, 0, 0,
       1, 1, 0, 20, 1, 1, 15, 1, 0, 0, 0, 40, 15, 0, 0]
exceed2 = []
for i, d in enumerate(dbs):
    if d > 10:
        exceed2.append(i)
print(exceed2)
Write a function that produces a stream generator for a given iterable object (list, generator, etc.) whose elements contain a position and a value and are sorted by order of appearance. The generated stream should be equal to the initial stream (without positions) but with the gaps filled with zeroes. For example:
gen = gen_stream(9,[(4,111),(7,12)])
list(gen)  # [0, 0, 0, 0, 111, 0, 0, 12, 0]
# The first element has index zero, so 111 is in the fifth position and 12 in the eighth.
I.e. the 2 significant elements have indexes 4 and 7, and all other elements are filled with zeroes.
To simplify things, the elements of the initial stream are sorted (i.e. an element with a lower position precedes an element with a higher position).
The first parameter can be None, in which case the stream should be infinite, e.g. an infinite stream of zeroes:
gen_stream(None, [])
The following stream starts with 0, 0, 0, 0, 111, 0, 0, 12, ... and then generates zeroes indefinitely:
gen_stream(None, [(4,111),(7,12)])
Function should also support custom position-value extractor for more advanced cases, e.g.
def day_extractor(x):
    months = [31,28,31,30,31,30,31,31,30,31,30,31]
    acc = sum(months[:x[1]-1]) + x[0] - 1
    return (acc, x[2])
precipitation_days = [(3,1,4),(5,2,6)]
list(gen_stream(59,precipitation_days,day_extractor)) #59: January and February to limit output
[0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
The precipitation_days format is (d, m, mm), where d is the day of the month, m is the month and mm is the precipitation in millimeters.
So, in the example:
(3,1,4) - January 3, precipitation: 4 mm
(5,2,6) - February 5, precipitation: 6 mm
The extractor is passed as an optional third parameter whose default value is a lambda function that handles (position, value) pairs as in the first example.
This is what I did:
import sys
a=[(4,111),(7,12)]
n = 9
def gen_stream(n1, a1):
    if n1==None:
        b = [0 for i in range(sys.maxsize)]
    else:
        b = [0 for i in range(n1)]
    for i in range(len(a1)):
        b[a[i][0]]=a[i][1]
    for i in range(len(b)):
        yield b[i]
for i in gen_stream(None, a):
    print(i)
So far I have tried to produce a stream with infinite zeros, but the function never gets going: the program eats a lot of RAM and crashes with a MemoryError. And how do I then handle the months case? Please help.
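The memory problem most likely comes from building a list of sys.maxsize zeros before anything is yielded. One possible way around it, as a sketch rather than the reference solution, is to never build the list and instead walk the positions lazily, yielding either the next pending value or 0. The parameter names total, sorted_iterable and extractor are my own; the behaviour follows the task description above.

```python
from itertools import count, islice

def gen_stream(total, sorted_iterable, extractor=lambda x: (x[0], x[1])):
    """Lazily yield values at their positions, with zeroes filling the gaps."""
    # Convert items to (position, value) pairs on demand, one at a time
    extracted = (extractor(item) for item in sorted_iterable)
    next_pos, next_val = next(extracted, (None, None))
    # count() never ends, so total=None gives an infinite stream
    positions = count() if total is None else range(total)
    for pos in positions:
        if pos == next_pos:
            yield next_val
            next_pos, next_val = next(extracted, (None, None))
        else:
            yield 0

print(list(gen_stream(9, [(4, 111), (7, 12)])))
print(list(islice(gen_stream(None, [(4, 111), (7, 12)]), 10)))
```

With an infinite stream, take a finite slice with itertools.islice instead of calling list() on it directly. A custom extractor such as day_extractor is passed as the third argument.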
I want to convert this hexadecimal array :
[7,3,2,0,1,9,0,4]
into this one
[0,1,1,1,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,1,1,0,0,1,0,0,0,0,0,1,0,0]
where you can see that the first 4 integers are 7 in binary format (0111), and so on.
I tried to use format(x, '04b'), but the result is a list of strings:
['0111','0011','0010','0000','0001','1001','0000','0100']
Consequently I can't use the result as a binary array. How can I do that?
This one-liner will return a list of integers as you want:
hex_digits = [7,3,2,0,1,9,0,4]  # renamed to avoid shadowing the built-in hex()
list(map(int,"".join([format(x, '04b') for x in hex_digits])))
You can use bitwise operations:
h = [7,3,2,0,1,9,0,4]
[i >> b & 1 for i in h for b in range(3, -1, -1)]
This returns:
[0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]
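If the data is already in numpy, np.unpackbits offers another route. This is a sketch that assumes every entry is a single hex digit (0 to 15) so that it fits in a uint8; only the low 4 bits of each byte are kept.

```python
import numpy as np

h = np.array([7, 3, 2, 0, 1, 9, 0, 4], dtype=np.uint8)

# unpackbits expands each byte into 8 bits (most significant first);
# keeping the last 4 columns leaves one nibble per hex digit
bits = np.unpackbits(h[:, None], axis=1)[:, -4:].ravel()

print(bits.tolist())
```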
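arr = [7,3,2,0,1,9,0,4]
hexa = ''.join(format(e, 'x') for e in arr)
bits = format(int(hexa, 16), '0{}b'.format(4 * len(arr)))
print([int(b) for b in bits])
This takes an array of hexadecimal digits and converts it to binary. Using format(e, 'x') handles digits above 9, and padding to 4 bits per digit keeps the leading zeros that bin() would drop.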
How can I find the number of consecutive 1s (or any other value) in each row of the following numpy array? I need a pure numpy solution.
array([[0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 2, 0, 0, 1, 1, 1],
[0, 0, 0, 4, 1, 0, 0, 0, 0, 1, 1, 0]])
There are two parts to my question. First: what is the length of the longest run of consecutive 1s in each row? It should be
array([2,3,2])
in the example case.
And second, what is the index of the start of the first set of multiple consecutive 1s in a row? For the example case this would be
array([3,9,9])
In this example I put 2 consecutive 1s in a row. But it should be possible to change that to 5 consecutive 1s in a row, this is important.
A similar question was answered using np.unique, but it only works for one row and not an array with multiple rows as the result would have different lengths.
Here's a vectorized approach based on differencing -
import numpy as np
import pandas as pd
# Append zeros columns at either sides of counts
append1 = np.zeros((counts.shape[0],1),dtype=int)
counts_ext = np.column_stack((append1,counts,append1))
# Get start and stop indices with 1s as triggers
diffs = np.diff((counts_ext==1).astype(int),axis=1)
starts = np.argwhere(diffs == 1)
stops = np.argwhere(diffs == -1)
# Get intervals using differences between start and stop indices
start_stop = np.column_stack((starts[:,0], stops[:,1] - starts[:,1]))
# Get indices corresponding to max. interval lens and thus lens themselves
SS_df = pd.DataFrame(start_stop)
out = start_stop[SS_df.groupby([0],sort=False)[1].idxmax(),1]
Sample input, output -
Original sample case :
In [574]: counts
Out[574]:
array([[0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 2, 0, 0, 1, 1, 1],
[0, 0, 0, 4, 1, 0, 0, 0, 0, 1, 1, 0]])
In [575]: out
Out[575]: array([2, 3, 2], dtype=int64)
Modified case :
In [577]: counts
Out[577]:
array([[0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0],
[0, 0, 1, 0, 0, 1, 2, 0, 1, 1, 1, 1],
[0, 0, 0, 4, 1, 1, 1, 1, 1, 0, 1, 0]])
In [578]: out
Out[578]: array([2, 4, 5], dtype=int64)
Here's a Pure NumPy version that is identical to the previous until we have start, stop. Here's the full implementation -
# Append zeros columns at either sides of counts
append1 = np.zeros((counts.shape[0],1),dtype=int)
counts_ext = np.column_stack((append1,counts,append1))
# Get start and stop indices with 1s as triggers
diffs = np.diff((counts_ext==1).astype(int),axis=1)
starts = np.argwhere(diffs == 1)
stops = np.argwhere(diffs == -1)
# Get intervals using differences between start and stop indices
intvs = stops[:,1] - starts[:,1]
# Store intervals as a 2D array for further vectorized ops to make.
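# minlength keeps one count per row even when the last rows contain no 1s
c = np.bincount(starts[:,0], minlength=counts.shape[0])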
mask = np.arange(c.max()) < c[:,None]
intvs2D = mask.astype(float)
intvs2D[mask] = intvs
# Get max along each row as final output
out = intvs2D.max(1)
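The second part of the question (the start index of the first run of at least N consecutive 1s) can be sketched with the same start/stop idea. The function name first_run_start and the -1 marker for rows without a long enough run are my own assumptions, not something the question specifies.

```python
import numpy as np

def first_run_start(counts, value=1, min_len=2):
    """Per row: start index of the first run of `value` at least `min_len` long, else -1."""
    hits = (counts == value).astype(int)
    pad = np.zeros((hits.shape[0], 1), dtype=int)
    diffs = np.diff(np.column_stack((pad, hits, pad)), axis=1)
    starts = np.argwhere(diffs == 1)    # (row, start index), in row-major order
    stops = np.argwhere(diffs == -1)    # (row, index one past the end of the run)
    keep = (stops[:, 1] - starts[:, 1]) >= min_len
    rows, cols = starts[keep, 0], starts[keep, 1]
    # return_index gives the first occurrence of each row, i.e. its earliest run
    uniq_rows, first = np.unique(rows, return_index=True)
    out = np.full(counts.shape[0], -1)
    out[uniq_rows] = cols[first]
    return out

counts = np.array([[0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0],
                   [0, 0, 1, 0, 0, 1, 2, 0, 0, 1, 1, 1],
                   [0, 0, 0, 4, 1, 0, 0, 0, 0, 1, 1, 0]])
print(first_run_start(counts, min_len=2))  # [3 9 9]
```

Changing min_len to 5 (or any other length) covers the other case mentioned in the question.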
A very similar problem is checking whether, in rows sorted in descending order, elements a fixed distance apart differ by a given amount. For example, a row of distinct card values contains a straight if two cards 4 positions apart differ by exactly 4. The same idea works with a difference of 0 between adjacent cards to detect a pair:
cardAmount=cards[0,:].size
has4=cards[:,:cardAmount-4]-cards[:,4:]
isStraight=np.any(has4 == 4, axis=1)