How can I solve the following problem in Python?
I have a dataframe that contains this column:
Column X
1
0
0
0
1
1
0
0
1
I want to create a list b in which every 0 is replaced by the length of its run of consecutive 0s, to get something like this:
List X
1
3
3
3
1
1
2
2
1
If I understand your question correctly, you want to replace all the zeros with the number of consecutive zeros in the current streak, but leave non-zero numbers untouched. So
1 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 0 0 0
becomes
1 4 4 4 4 1 1 1 1 2 2 1 1 1 5 5 5 5 5
To do that, this should work, assuming your input column (a pandas Series) is called x.
result = []
i = 0
while i < len(x):
    if x[i] != 0:
        result.append(x[i])
        i += 1
    else:
        # See how many times zero occurs in a row
        j = i
        n_zeros = 0
        while j < len(x) and x[j] == 0:
            n_zeros += 1
            j += 1
        result.extend([n_zeros] * n_zeros)
        i += n_zeros
result
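As an alternative sketch of the same idea, the runs can be collected with itertools.groupby instead of manual index bookkeeping. The function name zero_run_lengths is hypothetical, and the input is assumed to be any iterable of numbers (a list, or the values of the Series).

```python
from itertools import groupby

def zero_run_lengths(values):
    """Replace every 0 with the length of its run; keep other values as-is."""
    result = []
    # groupby yields consecutive runs that share the same key (zero / non-zero)
    for is_zero, run in groupby(values, key=lambda v: v == 0):
        run = list(run)
        if is_zero:
            result.extend([len(run)] * len(run))
        else:
            result.extend(run)
    return result

print(zero_run_lengths([1, 0, 0, 0, 1, 1, 0, 0, 1]))
```

For the column in the question this should give 1 3 3 3 1 1 2 2 1.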
I wrote an algorithm that prints a triangle in the upper half of a matrix (without a recursive function), but now I want to do it recursively. Can anyone help?
Thanks
My python code:
def T(row,column):
    print("Row: ",row)
    print("Column", column)
    if row != column and row%2 == 0:
        print("Please enter valid number")
    else:
        matrix = []
        start = 1
        cteRow = row
        cteColumn = column
        for i in range(0,column):
            matrix.append(list(range(start,row + 1)))
            start = start + cteRow
            row = row + cteColumn
        print("The Matrix: \n")
        for i in range(len(matrix)):
            for j in range(len(matrix)):
                print(matrix[i][j], end= " ")
            print()
        print()
        length = len(matrix)
        middle = int(length/2)
        for i in range(length):
            for j in range(length):
                matrix[i][j] = 0
                if (middle + i)<= length and (middle - i)>= 0:
                    matrix[i][middle] = 1
                    myRangeList=list(range(middle-i,middle+i+1))
                    for n in myRangeList:
                        matrix[i][n] = 1
                print(matrix[i][j], end = " ")
            print()
T(5,5)
To make it recursive, you have to come up with a method to convert the result of a smaller matrix into a larger one.
for example:
From T(3,5)  -->  T(4,7)
0 0 1 0 0         0 0 0 1 0 0 0
0 1 1 1 0         0 0 1 1 1 0 0
1 1 1 1 1         0 1 1 1 1 1 0
                  1 1 1 1 1 1 1
The transformation could be adding zeros on each side and a line of 1s at the bottom:
0 0 1 0 0     0 x x x x x 0     0 0 0 1 0 0 0
0 1 1 1 0     0 x x x x x 0     0 0 1 1 1 0 0
1 1 1 1 1     0 x x x x x 0     0 1 1 1 1 1 0
              1 1 1 1 1 1 1     1 1 1 1 1 1 1
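A single step of that transformation can be sketched as a small helper. The name grow is hypothetical, and the input is assumed to be a non-empty list of equal-length rows that already forms a triangle.

```python
def grow(smaller):
    """Turn a triangle of width w into one of width w + 2."""
    width = len(smaller[0]) + 2
    # add a zero on each side of every existing row
    padded = [[0] + row + [0] for row in smaller]
    # append the new bottom line made only of 1s
    return padded + [[1] * width]

for row in grow([[0, 1, 0], [1, 1, 1]]):
    print(*row)
```

Starting from [[1]] and applying grow repeatedly builds each larger triangle from the previous one, which is the step a recursive function can rely on.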
This will be easy if the parameters provided fit the triangle size exactly: rows = (columns+1)/2. You can handle the disproportionate dimensions by padding the result from an appropriate size ratio with zeros so that the function only needs to handle proper proportions:
def R(rows,cols):
    if not cols%2: # must have odd number of columns (pad trailing zero)
        return [ row+[0] for row in R(rows,cols-1)]
    height = (cols+1)//2
    if rows>height: # too high, pad with lines of zeros
        return R(height,cols)+[[0]*cols for _ in range(rows-height)]
    if rows<height: # not high enough
        return R(height,cols)[:rows] # truncate bottom
    if cols == 1: # base case (1,1)
        return [[1]]
    result = R(rows-1,cols-2) # Smaller solution
    result = [[0]+row+[0] for row in result] # pad with zeros
    result += [[1]*cols] # add line of 1s
    return result
The function only generates the matrix. User input and printing should always be separate from data manipulation (especially for recursion)
output:
for row in R(5,5): print(*row)
0 0 1 0 0
0 1 1 1 0
1 1 1 1 1
0 0 0 0 0
0 0 0 0 0
Note that this will work for any combination of row & column sizes but, if you always provide the function with a square matrix, only the "upper half" will contain the triangle because triangle height = (columns+1)/2. It would also be unnecessary to ask the user for both a number of rows and a number of columns if the two are required to be equal.
For only square matrices with an odd number of columns, the process can be separated into a recursive part and a padding function that uses it:
def R(cols):
    if cols == 1: return [[1]] # base case (1,1)
    result = R(cols-2) # Smaller solution
    result = [[0]+row+[0] for row in result] # pad with zeros
    result += [[1]*cols] # add line of 1s
    return result
def T(n):
    # square matrix with padding (each zero row is its own list, not a shared reference)
    return R(n)+[[0]*n for _ in range((n-1)//2)]
for row in T(7): print(*row)
0 0 0 1 0 0 0
0 0 1 1 1 0 0
0 1 1 1 1 1 0
1 1 1 1 1 1 1
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
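One way to sanity-check the recursion is to compare it with a direct, non-recursive formula. The sketch below is an assumption-laden illustration: triangle_recursive repeats the recursive part under a different name so the block is self-contained, and triangle_direct is a hypothetical helper that fills row r with 1s in the columns within distance r of the middle.

```python
def triangle_recursive(cols):
    # Same recursion as the recursive part above; cols is assumed to be odd.
    if cols == 1:
        return [[1]]
    smaller = triangle_recursive(cols - 2)
    padded = [[0] + row + [0] for row in smaller]
    return padded + [[1] * cols]

def triangle_direct(cols):
    # Row r (0-based) has 1s where the column is at most r away from the middle.
    middle = cols // 2
    height = (cols + 1) // 2
    return [[1 if abs(c - middle) <= r else 0 for c in range(cols)]
            for r in range(height)]

# Both constructions should agree for small odd widths.
for cols in (1, 3, 5, 7, 9):
    assert triangle_recursive(cols) == triangle_direct(cols)
```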
I have a dataframe as below:
       Datetime     Data  Fn
0  18747.385417  11275.0   0
1  18747.388889   8872.0   1
2  18747.392361   7050.0   0
3  18747.395833   8240.0   1
4  18747.399306   5158.0   1
5  18747.402778   3926.0   0
6  18747.406250   4043.0   0
7  18747.409722   2752.0   1
8  18747.420139   3502.0   1
9  18747.423611   4026.0   1
I want to calculate the running sum of consecutive non-zero values in column Fn, restarting at every zero.
I want my result dataframe to look like this:
       Datetime     Data  Fn  Sum
0  18747.385417  11275.0   0    0
1  18747.388889   8872.0   1    1
2  18747.392361   7050.0   0    0
3  18747.395833   8240.0   1    1
4  18747.399306   5158.0   1    2  <<<
5  18747.402778   3926.0   0    0
6  18747.406250   4043.0   0    0
7  18747.409722   2752.0   1    1
8  18747.420139   3502.0   1    2
9  18747.423611   4026.0   1    3
You can use groupby() and cumsum():
groups = df.Fn.eq(0).cumsum()
df['Sum'] = df.Fn.ne(0).groupby(groups).cumsum()
Details
First use df.Fn.eq(0).cumsum() to create pseudo-groups of consecutive non-zeros. Each zero will get a new id while consecutive non-zeros will keep the same id:
groups = df.Fn.eq(0).cumsum()
#    groups  Fn  (Fn added just for comparison)
# 0       1   0
# 1       1   1
# 2       2   0
# 3       2   1
# 4       2   1
# 5       3   0
# 6       4   0
# 7       4   1
# 8       4   1
# 9       4   1
Then group df.Fn.ne(0) on these pseudo-groups and cumsum() to generate the within-group sequences:
df['Sum'] = df.Fn.ne(0).groupby(groups).cumsum()
#        Datetime     Data  Fn  Sum
# 0  18747.385417  11275.0   0    0
# 1  18747.388889   8872.0   1    1
# 2  18747.392361   7050.0   0    0
# 3  18747.395833   8240.0   1    1
# 4  18747.399306   5158.0   1    2
# 5  18747.402778   3926.0   0    0
# 6  18747.406250   4043.0   0    0
# 7  18747.409722   2752.0   1    1
# 8  18747.420139   3502.0   1    2
# 9  18747.423611   4026.0   1    3
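Note that grouping df.Fn.ne(0) counts the rows in each streak. If Fn could hold values other than 0 and 1 and a true running sum were wanted, a possible variant (an assumption, not part of the answer above) is to cumsum the values themselves within the same pseudo-groups. The sample data and the column names Count and Sum below are made up for illustration.

```python
import pandas as pd

# Hypothetical data with non-zero values other than 1.
df = pd.DataFrame({"Fn": [0, 2, 0, 3, 1, 0, 0, 1, 4, 1]})

# Every zero starts a new pseudo-group, as in the answer above.
groups = df["Fn"].eq(0).cumsum()

# Running count of non-zero rows within each streak.
df["Count"] = df["Fn"].ne(0).groupby(groups).cumsum()

# Running sum of the actual values within each streak.
df["Sum"] = df["Fn"].groupby(groups).cumsum()

print(df)
```

When Fn only contains 0 and 1, the two columns are identical.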
How about using cumsum and resetting it whenever the value is 0?
df['Fn2'] = df['Fn'].ne(0)
running = df['Fn2'].cumsum()
# subtract the running total seen at the most recent zero (0 if there has been none yet)
df['Fn2'] = running - running.where(~df['Fn2']).ffill().fillna(0).astype(int)
df
You can store the Fn column in a list, then build a new list by iterating over it: if the current value is greater than zero, add it to the previous running total; otherwise reset the total to 0. Afterwards, add the new list to the existing dataframe as a column.
fn = df['Fn'].tolist()
sum_list = [fn[0]]
for i in range(1, len(fn)):
    if fn[i] > 0:
        # extend the current streak using the previous running total
        sum_list.append(sum_list[i - 1] + fn[i])
    else:
        # a zero resets the running total
        sum_list.append(0)
df['Sum'] = sum_list
Hope this helps.
try this:
sum_arr = [0]
for val in df['Fn']:
    if val > 0:
        sum_arr.append(sum_arr[-1] + 1)
    else:
        sum_arr.append(0)
df['sum'] = sum_arr[1:]
df
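The same loop can also be written with itertools.accumulate, shown here as a rough sketch. The function name streak_counts is hypothetical, and the initial keyword assumes Python 3.8 or later.

```python
from itertools import accumulate

def streak_counts(values):
    """Running count of consecutive positive values, reset to 0 at each zero."""
    # initial=0 seeds the running count; the seed itself is dropped at the end
    running = accumulate(values, lambda run, v: run + 1 if v > 0 else 0, initial=0)
    return list(running)[1:]

print(streak_counts([0, 1, 0, 1, 1, 0, 0, 1, 1, 1]))
```

The result could then be assigned with something like df['sum'] = streak_counts(df['Fn']).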
I have a pandas dataframe and I want to loop over the last column "n" times based on a condition.
import random
import pandas as pd
p = 0.5
df = pd.DataFrame()
start = []
for i in range(5):
    if random.random() < p:
        start.append("0")
    else:
        start.append("1")
df['start'] = start
print(df['start'])
Essentially, I want to loop over the final column "n" times and if the value is 0, change it to 1 with probability p so the results become the new final column. (I am simulating on-off every time unit with probability p).
e.g. after one iteration, the dataframe would look something like:
0 0
0 1
1 1
0 0
0 1
after two:
0 0 1
0 1 1
1 1 1
0 0 0
0 1 1
What is the best way to do this?
Sorry if I am asking this wrong, I have been trying to google for a solution for hours and coming up empty.
Like this: append a new column named 1, 2, ... on each iteration.
# continue from question code ...
# colname is 1, 2, ...
for col in range(1, 5):
    tmp = []
    for i in range(5):
        # check final col (purely positional lookup)
        if df.iloc[i, col-1] == "0":
            if random.random() < p:
                tmp.append("0")
            else:
                tmp.append("1")
        else: # == 1
            tmp.append("1")
    # append new col
    df[str(col)] = tmp
print(df)
# initial
   s
0  0
1  1
2  0
3  0
4  0
# result
   s  1  2  3  4
0  0  0  1  1  1
1  0  0  0  0  1
2  0  0  1  1  1
3  1  1  1  1  1
4  0  0  0  0  0
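The simulation itself can also be sketched independently of the dataframe. This version is only an illustration under a few assumptions: states are the integers 0 and 1 rather than strings, an off cell switches on with probability p (as worded in the question), and the names simulate and rng are hypothetical. Passing a seeded random.Random makes a run repeatable.

```python
import random

def simulate(start, n_steps, p, rng=random):
    """Return a list of columns; each column is derived from the previous one."""
    columns = [list(start)]
    for _ in range(n_steps):
        previous = columns[-1]
        # A cell that is already on stays on; an off cell turns on with probability p.
        current = [1 if state == 1 or rng.random() < p else 0 for state in previous]
        columns.append(current)
    return columns

columns = simulate([0, 0, 1, 0, 0], n_steps=4, p=0.5, rng=random.Random(0))
for row in zip(*columns):  # transpose so each printed line is one unit over time
    print(*row)
```

Keeping the columns as plain lists until the end avoids repeated per-cell dataframe lookups; they can be turned into a dataframe once the loop has finished.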
I am very new to python and coding. I have this homework that I have to do:
On the first line you will receive the number of rows of the matrix (n), and on the next n lines you will get each row of the matrix as a string (zeros and ones separated by a single space). You have to calculate how many blocks there are (connected ones horizontally or diagonally). Here are examples:
Input:
5
1 1 0 0 0
1 1 0 0 0
0 0 0 0 0
0 0 0 1 1
0 0 0 1 1
Output:
2
Input:
6
1 1 0 1 0 1
0 1 1 1 1 1
0 1 0 0 0 0
0 1 1 0 0 0
0 1 1 1 1 0
0 0 0 1 1 0
Output:
1
Input:
4
0 1 0 1 1 0
1 0 1 1 0 1
1 0 0 0 0 0
0 0 0 1 0 0
Output:
5
The code I have come up with so far is:
n = int(input())
blocks = 0
matrix = [[int(i) for i in input().split()] for j in range(n)]
#loop or something to find the blocks in the matrix
print(blocks)
Any help will be greatly appreciated.
def valid(y,x):
    if y>=0 and x>=0 and y<N and x<horizontal_len:
        return True
def find_blocks(y,x):
    Q.append(y)
    Q.append(x)
    #search around 4 directions (up, right, left, down)
    dy = [0,1,0,-1]
    dx = [1,0,-1,0]
    # if nothing is in Q then terminate counting block
    while Q:
        y = Q.pop(0)
        x = Q.pop(0)
        for dir in range(len(dy)):
            next_y = y + dy[dir]
            next_x = x + dx[dir]
            #if around component is valid range(inside the matrix) and it is 1(not 0) then include it as a part of block
            if valid(next_y,next_x) and matrix[next_y][next_x] == 1:
                Q.append(next_y)
                Q.append(next_x)
                matrix[next_y][next_x] = -1
N = int(input())
matrix = []
for rows in range(N):
    row = list(map(int, input().split()))
    matrix.append(row)
#row length
horizontal_len = len(matrix[0])
blocks = 0
#search from matrix[0][0] to matrix[N][horizontal_len]
for start_y in range(N):
    for start_x in range(horizontal_len):
        #if a number is 1 then start calculating
        if matrix[start_y][start_x] == 1:
            #make 1s to -1 for not to calculate again
            matrix[start_y][start_x] = -1
            Q=[]
            #start function
            find_blocks(start_y, start_x)
            blocks +=1
print(blocks)
I used the BFS (breadth-first search) algorithm to solve this question. The comments may not be enough to understand the logic.
If you have questions about this solution, let me know!
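For comparison, here is a sketch of the same BFS packaged as a function without global state, so it can be tried on the sample inputs directly. The name count_blocks is hypothetical; it uses collections.deque for the queue and a separate seen grid instead of overwriting the matrix, and it assumes the same 4-direction connectivity as the code above.

```python
from collections import deque

def count_blocks(matrix):
    """Count groups of 1s connected up, down, left or right."""
    if not matrix:
        return 0
    n_rows, n_cols = len(matrix), len(matrix[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    blocks = 0
    for start_y in range(n_rows):
        for start_x in range(n_cols):
            if matrix[start_y][start_x] != 1 or seen[start_y][start_x]:
                continue
            # an unvisited 1 starts a new block; flood it with BFS
            blocks += 1
            seen[start_y][start_x] = True
            queue = deque([(start_y, start_x)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and matrix[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return blocks

third_example = [
    [0, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]
print(count_blocks(third_example))
```

With this connectivity the three sample inputs from the question give 2, 1 and 5 blocks.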
I have 2 columns of data called level 1 event and level 2 event.
Both are columns of 1s and zeros.
   lev_1  lev_2  lev_2_&_lev_1
0      1      0              0
1      0      0              0
2      1      0              0
3      1      1              1
4      1      0              0
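col['lev_2_&_lev_1'] = 1 if lev_2 of the current row and lev_1 of the previous row are both 1.
I have achieved this using a loop:
i = 1
while i < a.shape[0]:
    # "and" compares the two conditions; "&" binds tighter than "==" and would be parsed differently
    if a['lev_1'].iloc[i - 1] == 1 and a['lev_2'].iloc[i] == 1:
        # assign through .loc to avoid chained assignment on a copy
        a.loc[a.index[i], 'lev_2_&_lev_1'] = 1
    i += 1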
I wanted to know a computationally efficient way to do this because my original df is very big.
Thank you!
Use np.where and .shift():
df['lev_2_&_lev_1'] = np.where(df['lev_2'].eq(1) & df['lev_1'].shift().eq(1), 1, 0)
   lev_1  lev_2  lev_2_&_lev_1
0      1      0              0
1      0      0              0
2      1      0              0
3      1      1              1
4      1      0              0
Explanation
df['lev_2'].eq(1): checks if current row is equal to 1
df['lev_1'].shift().eq(1): checks if previous row is equal to 1
np.where(condition, 1, 0): if condition is True return 1 else 0
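To make the effect of .shift() concrete, here is a plain-Python sketch of what the vectorized expression computes row by row. The function name lev2_and_prev_lev1 is hypothetical, and None stands in for the missing value that .shift() puts in the first row.

```python
def lev2_and_prev_lev1(lev_1, lev_2):
    # Pair each row's lev_2 with the previous row's lev_1; the first row has
    # no previous row, so it is paired with None.
    previous_lev_1 = [None] + list(lev_1[:-1])
    return [int(cur == 1 and prev == 1) for prev, cur in zip(previous_lev_1, lev_2)]

result = lev2_and_prev_lev1([1, 0, 1, 1, 1], [0, 0, 0, 1, 0])
print(result)
```

Only the row at index 3 gets a 1, because it is the only row where lev_2 is 1 and the row before it has lev_1 equal to 1. On a large dataframe the vectorized version remains the one to use; this loop is only meant to show the logic.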
You want:
(df['lev_2'] & df['lev_1'].shift(fill_value=0)).astype(int)