Count only first occurrence of each sequence python


I have some acceleration data, and I have set up a new column that gives a 1 if the value in the accelpos column is >= 2.5, using the following code:
frame["new3"] = np.where((frame.accelpos >=2.5), '1', '0')
I end up getting data in sequences like so:
0,0,0,0,1,1,1,1,1,0,0,0,1,1,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0
I want to add a second column that gives a 1 only at the start of each sequence, as follows:
0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
Any help would be much appreciated.

You can compare each value with the previous one using Series.shift and keep only the rows equal to '1'. Chain the two conditions with & (bitwise AND), then cast to integers to map True/False to 1/0:
df = pd.DataFrame({'col':'0,0,0,0,1,1,1,1,1,0,0,0,1,1,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0'.split(',')})
df['new'] = (df['col'].ne(df['col'].shift()) & df['col'].eq('1')).astype(int)
Or test the difference between consecutive values. Because the first value may itself be a 1, replace the missing first difference with the original value using fillna:
s = df['col'].astype(int)
df['new'] = s.diff().fillna(s).eq(1).astype(int)
print (df)
col new
0 0 0
1 0 0
2 0 0
3 0 0
4 1 1
5 1 0
6 1 0
7 1 0
8 1 0
9 0 0
10 0 0
11 0 0
12 1 1
13 1 0
14 0 0
15 0 0
16 0 0
17 1 1
18 1 0
19 1 0
20 1 0
21 1 0
22 1 0
23 1 0
24 1 0
25 1 0
26 1 0
27 0 0
28 0 0
29 0 0
30 0 0
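The same idea can also be applied to the boolean threshold mask directly, without going through the '0'/'1' strings. The following is only a minimal sketch: the sample readings and the column name start are made up for illustration, and only accelpos and the 2.5 threshold come from the question.

```python
import pandas as pd

# Hypothetical readings; only the column name "accelpos" comes from the question.
frame = pd.DataFrame({"accelpos": [0.1, 2.6, 3.0, 1.2, 2.5, 2.7, 0.4]})

# True wherever the threshold is met.
above = frame["accelpos"] >= 2.5

# Flag of the previous row; the first row has no predecessor, so use False.
prev_above = above.shift(fill_value=False).astype(bool)

# A sequence starts where the current row is above and the previous one is not.
frame["start"] = (above & ~prev_above).astype(int)

print(frame["start"].tolist())  # [0, 1, 0, 0, 1, 0, 0]
```

Working on booleans avoids the string comparison and the later cast from text to integers.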

I am not familiar with the where function, so I will try to help from an algorithmic point of view.
Assume we have a list a = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, ..., 0]
From an algorithmic point of view, if you want to replace each sequence of 1s with a single 1 at the beginning of that sequence, here is what you want to do:
parse the list
assess whether the current item is a one or a zero
if it is a one, set each following item to 0 until you reach an actual zero
You might want to have something like this:
a = [0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1]
for i in range(len(a)-1):
    if a[i] == 1:
        for j in range(1, len(a)-i):
            if a[i+j] == 1:
                a[i+j] = 0
            else:
                break
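The steps above can also be written as a single pass that compares each item with the one before it and builds a new list instead of modifying the input. This is just a sketch; the function name mark_run_starts is made up for illustration.

```python
def mark_run_starts(bits):
    """Return a new list with a 1 only where a run of 1s begins."""
    starts = []
    previous = 0  # treat the position before the list as a 0
    for bit in bits:
        # A run starts when the current item is 1 and the previous one is 0.
        starts.append(1 if bit == 1 and previous == 0 else 0)
        previous = bit
    return starts


a = [0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1]
print(mark_run_starts(a))
# [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
```

Because the original list is left untouched, the comparison always sees the real previous value rather than one that was already overwritten.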

Related

Combining looping and conditional to make new columns on dataframe

I want to make a function with a loop and a conditional that counts only when Actual Result = 1.
So the number should increase by 1 each time Actual Result = 1.
This is my dataframe:
This is my code, but it doesn't produce the result that I want:
def func_count(x):
    for i in range(1,880):
        if x['Actual Result']==1:
            result = i
        else:
            result = '-'
        return result
X_machine_learning['Count'] = X_machine_learning.apply(lambda x:func_count(x),axis=1)
When I check and filter with Count != '-', the result looks like this:
The number is always equal to 1 and does not increase by 1 every time Actual Result = 1. Is there any solution?

Try something like this:
import pandas as pd
df = pd.DataFrame({
    'age': [30,25,40,12,16,17,14,50,22,10],
    'actual_result': [0,1,1,1,0,0,1,1,1,0]
})
count = 0
lst_count = []
for i in range(len(df)):
    if df['actual_result'][i] == 1:
        count+=1
        lst_count.append(count)
    else:
        lst_count.append('-')
df['count'] = lst_count
print(df)
Result
   age  actual_result count
0   30              0     -
1   25              1     1
2   40              1     2
3   12              1     3
4   16              0     -
5   17              0     -
6   14              1     4
7   50              1     5
8   22              1     6
9   10              0     -

Actually, you don't need to loop over the dataframe; explicit row loops are mostly a Pandas anti-pattern that should be avoided. With df as your dataframe, you could try the following instead:
m = df["Actual Result"] == 1
df["Count"] = m.cumsum().where(m, "-")
Result for the following dataframe
df = pd.DataFrame({"Actual Result": [1, 1, 0, 1, 1, 1, 0, 0, 1, 0]})
is
   Actual Result Count
0              1     1
1              1     2
2              0     -
3              1     3
4              1     4
5              1     5
6              0     -
7              0     -
8              1     6
9              0     -
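Filling with "-" turns the column into mixed strings and integers. If the counts need to stay numeric, one possible variation (an assumption on my part, not something the question asked for) is to leave the gaps as missing values and use the nullable Int64 dtype:

```python
import pandas as pd

df = pd.DataFrame({"Actual Result": [1, 1, 0, 1, 1, 1, 0, 0, 1, 0]})

m = df["Actual Result"] == 1

# where(m) without a replacement leaves NaN in the rows where m is False;
# the nullable "Int64" dtype keeps the remaining counts as integers.
df["Count"] = m.cumsum().where(m).astype("Int64")

print(df)
```

The rows that showed "-" before now hold a missing value, so numeric operations such as sum or max still work on the column.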

Pandas replace all but first in consecutive group

The problem description is simple, but I cannot figure how to make this work in Pandas. Basically, I'm trying to replace consecutive values (except the first) with some replacement value. For example:
data = {
"A": [0, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2, 2, 3]
}
df = pd.DataFrame.from_dict(data)
A
0 0
1 1
2 1
3 1
4 0
5 0
6 0
7 0
8 2
9 2
10 2
11 2
12 3
If I run this through some function foo(df, 2, 0) I would get the following:
A
0 0
1 1
2 1
3 1
4 0
5 0
6 0
7 0
8 2
9 0
10 0
11 0
12 3
Which replaces all values of 2 with 0, except for the first one. Is this possible?

You can find all the rows where A = 2 and A is also equal to the previous A value and set them to 0:
data = {
"A": [0, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2, 2, 3]
}
df = pd.DataFrame.from_dict(data)
df[(df.A == 2) & (df.A == df.A.shift(1))] = 0
Output:
A
0 0
1 1
2 1
3 1
4 0
5 0
6 0
7 0
8 2
9 0
10 0
11 0
12 3
If you have more than one column in the dataframe, use df.loc to just set the A values:
df.loc[(df.A == 2) & (df.A == df.A.shift(1)), 'A'] = 0
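The question asks for a function of the form foo(df, 2, 0). One way to wrap the shift comparison is sketched below; the col parameter and the decision to return a copy are assumptions for illustration, not part of the question.

```python
import pandas as pd


def foo(df, val, repl, col="A"):
    """Replace every occurrence of val that directly follows another val."""
    out = df.copy()  # leave the caller's dataframe untouched
    is_repeat = out[col].eq(val) & out[col].shift().eq(val)
    out.loc[is_repeat, col] = repl
    return out


df = pd.DataFrame({"A": [0, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2, 2, 3]})
result = foo(df, 2, 0)
print(result["A"].tolist())
# [0, 1, 1, 1, 0, 0, 0, 0, 2, 0, 0, 0, 3]
```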

Try this, if 'A' is duplicated further down the dataframe and is monotonic increasing:
def foo(df, val=2, repl=0):
    return df.mask((df.groupby('A').transform('cumcount') > 0) & (df['A'] == val), repl)
foo(df, 2, 0)
Output:
A
0 0
1 1
2 1
3 1
4 0
5 0
6 0
7 0
8 2
9 0
10 0
11 0
12 3
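Counting within groupby('A') and comparing with the previous row are not equivalent when the same value appears in more than one run. The following sketch uses a made-up column with two separate runs of 2 to show the difference; it calls groupby(...).cumcount() directly rather than going through transform.

```python
import pandas as pd

# Hypothetical data with two separate runs of 2.
df = pd.DataFrame({"A": [2, 2, 0, 2, 2]})
is_val = df["A"].eq(2)

# Repeats within a consecutive run: the previous row also holds the value.
repeat_in_run = is_val & df["A"].shift().eq(2)

# Repeats overall: any occurrence after the first one in the whole column.
repeat_overall = is_val & (df.groupby("A").cumcount() > 0)

per_run = df["A"].mask(repeat_in_run, 0).tolist()
overall = df["A"].mask(repeat_overall, 0).tolist()

print(per_run)  # [2, 0, 0, 2, 0]
print(overall)  # [2, 0, 0, 0, 0]
```

On the question's data both give the same result, because 2 occurs in a single run there.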

I'm not sure if this is the best way, but I came up with this solution. I hope it helps:
import pandas as pd
data = {
    "A": [0, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2, 2, 3]
}
df = pd.DataFrame(data)
def replecate(df, number, replacement):
    i = 1
    for column in df.columns:
        for index,value in enumerate(df[column]):
            if i == 1 and value == number :
                i = 0
            elif value == number and i != 1:
                # .loc avoids chained assignment; assumes the default integer index
                df.loc[index, column] = replacement
        i = 1
    return df
replecate(df, 2 , 0)
Output
A
0 0
1 1
2 1
3 1
4 0
5 0
6 0
7 0
8 2
9 0
10 0
11 0
12 3

I've managed a solution to this problem by shifting the row down by one and checking to see if the values align. Also included a function which can take multiple values to check for (not just 2).
import pandas as pd
data = {
"A": [0, 1, 1, 1, 0, 0, 0, 0, 2, 2, 2, 2, 3]
}
df = pd.DataFrame(data)
def replace_recurring(df,key,offset=1,values=[2]):
    df['offset'] = df[key].shift(offset)
    df.loc[(df[key]==df['offset']) & (df[key].isin(values)),key] = 0
    df = df.drop(['offset'],axis=1)
    return df
df = replace_recurring(df,'A',offset=1,values=[2])
Giving the output:
A
0 0
1 1
2 1
3 1
4 0
5 0
6 0
7 0
8 2
9 0
10 0
11 0
12 3

how to print ones and zeros in columns with their indexes in python?

I have a list of zeros and ones, I want to print them in two different columns with headings and index numbers. Something like this.
list = [1,0,1,1,1,0,1,0,1,0,0]
ones zeros
1 1 2 0
3 1 6 0
4 1 8 0
5 1 10 0
7 1 11 0
9 1
This is the desired output.
I tried this:
list = [1,0,1,1,1,0,1,0,1,0,0]
print('ones',end='\t')
print('zeros')
for index,ele in enumerate(list,start=1):
    if ele==1:
        print(index,ele,end=" ")
    elif ele==0:
        print(" ")
        print(index,ele,end=" ")
    else:
        print()
But this gives output like this:
ones zeros
1 1
2 0 3 1 4 1 5 1
6 0 7 1
8 0 9 1
10 0
11 0
How do I get the desired output?
Any help is appreciated.

You can use itertools.zip_longest, str.ljust, f-strings (for formatting), and some calculations for the printing part, and use two lists to hold the indices of both zeros and ones:
l = [1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0]
ones, zeros = [], []
max_len_zeros = max_len_ones = 0
for index, num in enumerate(l, 1):
    if num == 0:
        zeros.append(index)
        max_len_zeros = max(max_len_zeros, len(str(index)))
    else:
        ones.append(index)
        max_len_ones = max(max_len_ones, len(str(index)))
from itertools import zip_longest
print('ones' + ' ' * (max_len_ones + 2) + 'zeros')
for ones_index, zeros_index in zip_longest(ones, zeros, fillvalue = ''):
    one = '1' if ones_index else ' '
    this_one_index = str(ones_index).ljust(max_len_ones)
    zero = '0' if zeros_index else ''
    this_zero_index = str(zeros_index).ljust(max_len_zeros)
    print(f'{this_one_index} {one} {this_zero_index} {zero}')
Output:
ones zeros
1 1 2 0
3 1 6 0
4 1 8 0
5 1 10 0
7 1 11 0
9 1
List with more zeros than ones:
In: l = [1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0]
Out:
ones zeros
1 1 2 0
4 1 3 0
7 1 5 0
9 1 6 0
10 1 8 0
14 1 11 0
12 0
13 0
15 0
List with equal number of zeros and ones:
In: l = [1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1]
Out:
ones zeros
1 1 2 0
3 1 4 0
5 1 6 0
8 1 7 0
9 1 10 0
11 1 13 0
12 1 14 0
15 1 16 0
18 1 17 0
20 1 19 0

It's hard to do what you need in an iterative way. I have kind of a "broken" solution that both shows how you could better do what you are trying to do and why an iterative approach is limited in this case.
I updated your code as follows:
list = [1,0,1,1,1,0,1,0,1,0,0]
print('ones',end='\t')
print('zeros')
for index,ele in enumerate(list,start=1):
    # First check if extra space OR new lines OR both are needed
    if index > 1:
        if ele==1:
            print()
        elif ele==0:
            if list[index-2]==1:
                print('', end=' \t')
            else:
                print('', end='\n\t\t')
    # THEN, write your desired output without any end
    if ele==1:
        print(index,ele,end="")
    elif ele==0:
        print(index,ele,end="")
# Finally an empty line
print()
It gives the following output:
ones zeros
1 1 2 0
3 1
4 1
5 1 6 0
7 1 8 0
9 1 10 0
11 0
As you can see, the limitation is that you can't go "up" and rewrite earlier lines.
However, if you need to display the output EXACTLY as you've shown, you need to construct an intermediate data structure (for example a dict) and then display it using zip.
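A minimal sketch of that idea follows. It assumes tab-separated columns are acceptable; the names values, positions and lines are made up for illustration. It uses itertools.zip_longest instead of plain zip so that the longer column is not cut off.

```python
from itertools import zip_longest

values = [1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0]

# Intermediate structure: the 1-based positions of each digit.
positions = {1: [], 0: []}
for index, ele in enumerate(values, start=1):
    positions[ele].append(index)

# Pair the two columns row by row; the shorter column is padded with None.
lines = ["ones\tzeros"]
for one_idx, zero_idx in zip_longest(positions[1], positions[0]):
    left = f"{one_idx} 1" if one_idx is not None else ""
    right = f"{zero_idx} 0" if zero_idx is not None else ""
    lines.append(f"{left}\t{right}".rstrip())

print("\n".join(lines))
```

Collecting the positions first means each output row can be assembled in full before anything is printed, which is what the purely iterative version cannot do.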

convert values in a series to either one of two values

I have a series y which has values between -3 and 3.
I want to convert numbers that are above 0 to 1 and numbers that are less than or equal to zero to 0.
What is the best way to do this?
I wrote the code below. However, it doesn't give me the expected output. The first line works, but after running the second line the values that were 1 change to something random, which I don't understand.
import numpy as np
y_final = np.where(y > 0, 1, y).tolist()
y_final = np.where(y <= 0, 0, y).tolist()

I think you need Series.clip if values are integers:
y = pd.Series(range(-3, 4))
print (y)
0 -3
1 -2
2 -1
3 0
4 1
5 2
6 3
dtype: int64
print (y.clip(lower=0, upper=1))
0 0
1 0
2 0
3 0
4 1
5 1
6 1
dtype: int64
Your solution can be simplified by setting 1 and 0 directly:
y_final = np.where(y > 0, 1, 0)
print (y_final)
[0 0 0 0 1 1 1]
Or convert the boolean mask for values greater than 0 to integers:
y_final = y.gt(0).astype(int)
#alternative
#y_final = (y > 0).astype(int)
print (y_final)
0 0
1 0
2 0
3 0
4 1
5 1
6 1
dtype: int32
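A likely explanation for the "random" values in the question is that the second np.where call reads from the original y again rather than from the result of the first call, so the positive entries fall back to their original values. The sketch below reproduces this with a made-up integer array; the names step1, step2 and chained are for illustration only.

```python
import numpy as np

y = np.array([-3, -2, -1, 0, 1, 2, 3])

# First line from the question: positives become 1, the rest keep their value.
step1 = np.where(y > 0, 1, y)

# Second line from the question: built from y again, so step1 is discarded
# and the positive entries are the original values of y.
step2 = np.where(y <= 0, 0, y)

# Feeding the first result into the second call gives the intended output.
chained = np.where(step1 <= 0, 0, step1)

print(step1.tolist())    # [-3, -2, -1, 0, 1, 1, 1]
print(step2.tolist())    # [0, 0, 0, 0, 1, 2, 3]
print(chained.tolist())  # [0, 0, 0, 0, 1, 1, 1]
```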

You can also use a simple map:
numbers = range(-3,4)
print(list(map(lambda n: 1 if n > 0 else 0, numbers)))

How to reorder a binary list but keep 1's roughly evenly spread apart from each other in the list?

I basically want to reorder (I don't think this is a shuffling task) a list of 100 binary numbers. The following properties should hold after the reorder: the fixed frequency of 1's, which is 10, should remain, and the 1's should be roughly spread apart from each other as shown below, so every 9th, 10th, or 11th digit is a 1. I want this reordering to be random. The trivial approach I had in mind is to track the index of the first 1 in the input list and generate a new start index. Any ideas on other solutions?
x = [1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0]

Code as follows:
def main():
    from random import shuffle
    from random import randint
    from itertools import chain
    # Equal numbers of 9-long and 11-long blocks keep the total length at 100
    num_of_10th = randint(0, 5) * 2
    num_of_11th = num_of_9th = (10 - num_of_10th) // 2
    lsts = []
    for i in range(num_of_10th):
        lsts.append([1, 0, 0, 0, 0, 0, 0, 0, 0, 0])
    for i in range(num_of_9th):
        lsts.append([1, 0, 0, 0, 0, 0, 0, 0, 0])
    for i in range(num_of_11th):
        lsts.append([1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
    shuffle(lsts)
    lsts = list(chain.from_iterable(lsts))
    print(lsts)

main()

You can use Python's list multiplication.
My solution generates a random size between 1 and 10 using random.randint. From this size I create the repeated_part, which starts with a 1 and fills in the rest with zeros. For example,
when size is 5, repeated_part will be [1, 0, 0, 0, 0].
From the size we can calculate the number of times it fits in a list of 100 (100 // size), and we add one extra repetition as overflow. Now the list will be too large: for example, with a size of 3 the total length is (100 // 3 + 1) * 3 = 102, so we truncate the list to a length of 100 with [:100].
import random
size = random.randint(1, 10)
repeated_part = [1] + [0]*(size-1)
result = (repeated_part * (100 // size + 1)) [:100]
Note: if you do not want the 1 to come first, you could use random.shuffle(repeated_part), which still holds all your other requirements.
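Whichever approach is used, the result can be checked against the question's requirements: ten 1's, each 9, 10 or 11 positions after the previous one. This is only a sketch; the helper names gaps_between_ones and looks_evenly_spread are made up for illustration.

```python
def gaps_between_ones(bits):
    """Distances between consecutive 1s in the list."""
    positions = [i for i, bit in enumerate(bits) if bit == 1]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


def looks_evenly_spread(bits, expected_ones=10, allowed_gaps=(9, 10, 11)):
    """True if the list has the expected number of 1s with allowed spacing."""
    return sum(bits) == expected_ones and all(
        gap in allowed_gaps for gap in gaps_between_ones(bits)
    )


# The list from the question: a 1 followed by nine 0s, repeated ten times.
x = ([1] + [0] * 9) * 10
print(gaps_between_ones(x))    # [10, 10, 10, 10, 10, 10, 10, 10, 10]
print(looks_evenly_spread(x))  # True

# Ten 1s clumped at the front fail the spacing check.
print(looks_evenly_spread([1] * 10 + [0] * 90))  # False
```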
