Getting a specific value inside dataframe and the position - python

I have a DataFrame similar to the example table below, and I want to find a specific value inside it, for example the value 2. For this I use np.where. In the next step I want to check whether a value follows it and, if so, whether that next value is smaller, bigger, or equal.
My current solution would be to print out the np.where result and hardcode the index with [-x] for each value after the 2, so I am looking for a smarter solution for cases with, for example, 100 values.
The output should be: 2 is bigger, 2 is smaller, 2 is the last number.
Value
1
2
1
2
3
2

If I understand your question correctly, you could try this code:
import pandas as pd
frame = pd.DataFrame([1, 2, 1, 2, 3, 2], columns=['num'])
def find_values(df, number):
    for index, row in df.iterrows():
        if row['num'] == number:
            if len(df) == index+1:
                print(number, 'is the last number')
            else:
                next_num = df.loc[index+1, 'num']
                if next_num > number:
                    print(number, '<', next_num)
                elif next_num < number:
                    print(number, '>', next_num)
                else:
                    print(number, '==', number)
find_values(frame, 2)
output:
2 > 1
2 < 3
2 is the last number
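Since the question mentions np.where and larger inputs, here is a minimal loop-free sketch of the same comparison. It assumes the column is called 'num' as in the answer above; compare_with_next is a hypothetical helper name, and the wording of the labels follows the output requested in the question.

```python
import numpy as np
import pandas as pd

# Same sample data as the answer above; 'num' is that answer's column name.
frame = pd.DataFrame([1, 2, 1, 2, 3, 2], columns=['num'])

def compare_with_next(df, number, col='num'):
    """Hypothetical helper: describe how each occurrence of `number`
    relates to the value in the following row."""
    nxt = df[col].shift(-1).to_numpy()   # next row's value; NaN for the last row
    hits = (df[col] == number).to_numpy()  # rows holding the value we look for
    labels = np.select(
        [np.isnan(nxt), nxt > number, nxt < number],
        ['is the last number', 'is smaller', 'is bigger'],
        default='is equal',
    )
    return [f'{number} {label}' for label in labels[hits]]

print(compare_with_next(frame, 2))
```

The first matching condition wins in np.select, so the NaN check for the last row has to come before the two comparisons.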

Related

How to print list of numbers but counting in like a pyramid fashion in IDLE

I'm trying to make a loop that prints numbers exactly like this:
1
12
123
1234
12345
I already have this pattern down for different characters, and can even print this:
1
22
333
4444
55555
But I'm having a big headache trying to figure out how I can make it count. Any help would be greatly appreciated.
Here is the code I have to print the list above:
for row in range (number_of_rows + 1):
    for column in range(row):
        print (row, end='')
    print()
Sorry, I was just going to make a comment, but this will be easier:
for row in range (number_of_rows + 1):
    for column in range(row):
        print (column+1, end='') #<-- put column here instead of row and add a "+1"
    print()
Some more details of what is going on:
for row in range (number_of_rows + 1):
Iterate from zero to number_of_rows. E.g. if number_of_rows was 5, row would take the values 0, 1, 2, 3, 4, 5.
for column in range(row):
For each row, iterate from 0 to the row number minus 1. E.g. row 3 would iterate through 0, 1, 2.
print (column+1, end='')
Print the column, one digit at a time. To start each row at 1 we need to add 1.
If I'm understanding correctly, you could do something like
for i in range(1, rows + 1):
    print(''.join(str(j) for j in range(1, i + 1)))
outputs:
1
12
123
1234
12345
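For comparison, a small sketch that builds both patterns from the question as lists of strings, which makes the difference between using the row and using the column easy to see. pattern_rows and its counting flag are hypothetical names, not part of the answers above.

```python
def pattern_rows(number_of_rows, counting=True):
    """Hypothetical helper: return one string per pyramid row."""
    rows = []
    for row in range(1, number_of_rows + 1):
        if counting:
            # Count up inside the row: 1, 12, 123, ...
            rows.append(''.join(str(column) for column in range(1, row + 1)))
        else:
            # Repeat the row number: 1, 22, 333, ...
            rows.append(str(row) * row)
    return rows

print('\n'.join(pattern_rows(5)))
```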

Python pandas DataFrame: Check if n elements have continuous values?

I want to check if a column has at least one instance of 5 consecutive days with a nonzero value. In the following example, this would be false for column '1' (result=0) and true for column '2' (result=1). This code will do the job:
import pandas as pd
days = pd.date_range('1900-1-1', periods=14, freq='D')
df = pd.DataFrame({'1': [0,1,0,1,1,0,1,0,1,1,0,1,1,0], '2': [0,1,0,1,1,0,1,0,1,1,1,1,1,0]}, index=days)
col = '2'   # Select any column (the result for column '1' should be 0 and for column '2' should be 1)
nday = 5    # Number of consecutive days with nonzero values
result = 0  # If nonzero values lasted for 5 consecutive days, then result=1
for index, row in df.iterrows():
    if row[col] == 0:    # Restart counting if the nonzero values are not continuous for five days
        nday = 5
    elif row[col] == 1:  # Check for continuous nonzero values
        nday -= 1
        if nday == 0:
            result = 1
            break
print(result)
Is there an easier way than this long code?
The code seems good in terms of complexity and the number of lines. Just a few suggestions, see below.
def has_continuous(col, n_days=5) -> bool:
    days_left = n_days
    for value in col:
        if not value:  # Restart counting
            days_left = n_days
        else:
            # I assume that all values are non-negative. If it is not zero, it is positive
            days_left -= 1
            if not days_left:
                return True
    return False
result = has_continuous(df['2'], 5)
If you are always checking for 0, you can use rolling with min:
col='2'
nday=5
print (df[col].rolling(nday).min().ge(1).any())
# True
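If the length of the longest run is also of interest, one possible sketch groups consecutive nonzero values with a cumulative sum. It reuses the sample data from the question; longest_nonzero_run is a hypothetical helper name and it assumes a non-empty column.

```python
import pandas as pd

days = pd.date_range('1900-1-1', periods=14, freq='D')
df = pd.DataFrame({'1': [0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0],
                   '2': [0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0]}, index=days)

def longest_nonzero_run(series):
    """Hypothetical helper: length of the longest stretch of nonzero values."""
    nonzero = series.ne(0)
    # Every zero bumps the counter, so the nonzero rows between two zeros
    # share one group id.
    run_id = (~nonzero).cumsum()
    # Summing the boolean flags per group gives the length of each run.
    return int(nonzero.groupby(run_id).sum().max())

print(longest_nonzero_run(df['1']) >= 5)
print(longest_nonzero_run(df['2']) >= 5)
```

Unlike the rolling minimum, this does not require the values to be non-negative, since it only looks at whether each value is zero.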

Formatting unknown output in a table in Python

Help! I'm a Python beginner given the assignment of displaying the Collatz Sequence from a user-inputted integer, and displaying the contents in columns and rows. As you may know, the results could be 10 numbers, 30, or 100. I'm supposed to use '\t'. I've tried many variations, but at best, only get a single column. e.g.
def sequence(number):
    if number % 2 == 0:
        return number // 2
    else:
        result = number * 3 + 1
        return result
n = int(input('Enter any positive integer to see Collatz Sequence:\n'))
while sequence != 1:
    n = sequence(int(n))
    print('%s\t' % n)
    if n == 1:
        print('\nThank you! The number 1 is the end of the Collatz Sequence')
        break
Which yields a single vertical column, rather than the results being displayed horizontally. Ideally, I'd like to display 10 results left to right, and then go to another line. Thanks for any ideas!
Something like this maybe:
def get_collatz(n):
    return [n // 2, n * 3 + 1][n % 2]
while True:
    user_input = input("Enter a positive integer: ")
    try:
        n = int(user_input)
        assert n > 1
    except (ValueError, AssertionError):
        continue
    else:
        break
sequence = [n]
while True:
    last_item = sequence[-1]
    if last_item == 1:
        break
    sequence.append(get_collatz(last_item))
print(*sequence, sep="\t")
Output:
Enter a positive integer: 12
12 6 3 10 5 16 8 4 2 1
>>>
EDIT Trying to keep it similar to your code:
I would change your sequence function to something like this:
def get_collatz(n):
    if n % 2 == 0:
        return n // 2
    return n * 3 + 1
I called it get_collatz because I think that is more descriptive than sequence, it's still not a great name though - if you wanted to be super explicit maybe get_collatz_at_n or something.
Notice, I took the else branch out entirely, since it's not required. If n % 2 == 0, then we return from the function, so either you return in the body of the if or you return one line below - no else necessary.
For the rest, maybe:
last_number = int(input("Enter a positive integer: "))
while last_number != 1:
    print(last_number, end="\t")
    last_number = get_collatz(last_number)
In Python, print has an optional keyword parameter named end, which defaults to \n. It specifies the string printed at the very end of each print call. By simply changing it to \t, you can print all elements of the sequence on one line, separated by tabs (since each number in the sequence triggers a separate print call).
With this approach, however, you'll have to make sure to print the trailing 1 after the while loop has ended, since the loop will terminate as soon as last_number becomes 1, which means the loop won't have a chance to print it.
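A minimal sketch of that loop with the trailing 1 included. print_collatz and its out parameter are hypothetical additions, there only so the target stream can be swapped; the input is assumed to be a positive integer.

```python
import sys

def get_collatz(n):
    if n % 2 == 0:
        return n // 2
    return n * 3 + 1

def print_collatz(start, out=sys.stdout):
    """Hypothetical wrapper around the loop above."""
    last_number = start
    while last_number != 1:
        print(last_number, end="\t", file=out)
        last_number = get_collatz(last_number)
    # The loop exits as soon as last_number is 1, so print it here.
    print(last_number, file=out)

print_collatz(12)
```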
Another way of printing the sequence (with separating tabs), would be to store the sequence in a list, and then use str.join to create a string out of the list, where each element is separated by some string or character. Of course this requires that all elements in the list are strings to begin with - in this case I'm using map to convert the integers to strings:
result = "\t".join(map(str, [12, 6, 3, 10, 5, 16, 8, 4, 2, 1]))
print(result)
Output:
12 6 3 10 5 16 8 4 2 1
>>>
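The question also asked for 10 results per line. One way to get there, as a sketch building on the list-and-join idea: slice the sequence into chunks and join each chunk with tabs. collatz_sequence and format_rows are hypothetical helper names, and the input is assumed to be a positive integer.

```python
def get_collatz(n):
    if n % 2 == 0:
        return n // 2
    return n * 3 + 1

def collatz_sequence(n):
    """Collect the sequence from n down to 1 in a list."""
    sequence = [n]
    while sequence[-1] != 1:
        sequence.append(get_collatz(sequence[-1]))
    return sequence

def format_rows(values, per_row=10):
    """Join values with tabs, starting a new line after every per_row items."""
    rows = [values[i:i + per_row] for i in range(0, len(values), per_row)]
    return "\n".join("\t".join(map(str, row)) for row in rows)

print(format_rows(collatz_sequence(7)))
```

Starting from 7 the sequence has 17 terms, so this prints one full row of 10 followed by a row of 7.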

Use pandas to count value greater than previous value

I am trying to count the number of times a value is greater than the previous value by 2.
I have tried
df['new'] = df.ms.gt(df.ms.shift())
and other similar lines but none give me what I need.
This might be less than elegant, but:
import numpy as np
df['prev_ms'] = df['ms'].shift(1)
df['new'] = np.where((df['ms'] - df['prev_ms']) >= 2, 1, 0)
df['new'].sum()
Are you looking for diff? Find the difference between consecutive values and check that their difference is greater than, or equal to 2, then count rows that are True:
(df.ms.diff() >= 2).sum()
If you need to check if the difference is exactly 2, then change >= to ==:
(df.ms.diff() == 2).sum()
Since you need a specific difference, gt won't work. You could simply subtract and see if the difference is bigger than 2:
(df.ms - df.ms.shift() > 2).sum()
edit: changed to get you your answer instead of creating a new column. sum works here because it converts booleans to 1 and 0.
Your question was ambiguous, but since you wanted a program that counts the number of times a value is greater than the previous value by exactly 2 in pandas, here it is:
import pandas as pd
lst2 = [11, 13, 15, 35, 55, 66, 68]  # list of int
dataframe = pd.DataFrame(lst2)  # converting into a DataFrame with a single column 0
count = 0  # counts how many times a value exceeds the previous one by exactly 2
d = dataframe[0][0]  # start from the first value
for i in range(len(dataframe)):
    d = d + 2  # previous value plus 2, the value we hope to see at position i
    if d == dataframe[0][i]:
        count = count + 1  # the current value is the previous value plus 2
    d = dataframe[0][i]  # remember the current value for the next iteration
print("total count ", count)  # how many times a value was the previous value plus 2
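To see how the three readings of "greater by 2" differ, here is a short sketch that applies diff to the same numbers as the loop above. The column name ms comes from the question; the variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'ms': [11, 13, 15, 35, 55, 66, 68]})

# Difference to the previous row; the first row has no predecessor, so it is NaN
step = df['ms'].diff()

# Comparisons against NaN are False, so the first row never counts
exactly_two = int((step == 2).sum())
at_least_two = int((step >= 2).sum())
more_than_two = int((step > 2).sum())

print(exactly_two, at_least_two, more_than_two)
```

On this data the exact match gives 3, the same count as the loop, while >= 2 gives 6 and > 2 gives 3.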

Append values of a pandas data frame to a list by if condition

I've got a pandas df, which has one column with either positive or negative float values:
snapshot 0 (column name)
2018-06-21 00:00:00 -60.18
2018-06-21 00:00:15 43.78
2018-06-21 00:00:30 -22.08
Now I want to append the positive values to a list that's called:
excessSupply=[]
and the negative values to:
excessLoad=[]
by
for row in self.dfenergyBalance:
    if self.dfenergyBalance['0'] < 0:
        self.excessLoad.append(self.dfenergyBalance['0'])
    else:
        self.excessLoad.append(0)
(For excessSupply, the if condition is self.dfenergyBalance > 0.)
The result is a KeyError for the column name '0'.
In my opinion no (slow) loops are necessary. Also, it seems the column name is the number 0, not the string '0':
mask = dfenergyBalance[0] < 0
excessLoad = dfenergyBalance.loc[mask, 0].tolist()
excessSupply = dfenergyBalance.loc[~mask, 0].tolist()
print (excessLoad)
[-60.18, -22.08]
print (excessSupply)
[43.78]
EDIT:
For a list containing only zeros, as long as the number of non-negative values:
excessLoad = [0] * (~mask).sum()
print (excessLoad)
[0]
If you need only one list, with the positive values replaced by 0 (np is numpy, imported with import numpy as np):
L = np.where(mask, dfenergyBalance[0], 0).tolist()
print (L)
[-60.18, 0.0, -22.08]
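A self-contained sketch of the same idea, rebuilt from the three sample rows in the question. The frame name balance and the snake_case list names are made up for illustration; the column label is assumed to be the integer 0, which would explain the KeyError for '0'.

```python
import pandas as pd

index = pd.to_datetime(['2018-06-21 00:00:00',
                        '2018-06-21 00:00:15',
                        '2018-06-21 00:00:30'])
# Assumption: the column label is the integer 0, not the string '0'
balance = pd.DataFrame({0: [-60.18, 43.78, -22.08]}, index=index)

print(0 in balance.columns, '0' in balance.columns)

values = balance[0]
excess_supply = values[values > 0].tolist()  # positive values only
excess_load = values[values < 0].tolist()    # negative values only

# Same length as the original column, with the other sign replaced by 0
supply_or_zero = values.clip(lower=0).tolist()
load_or_zero = values.clip(upper=0).tolist()

print(excess_supply, excess_load)
print(supply_or_zero, load_or_zero)
```

Series.clip is used here instead of np.where; both keep the list aligned with the original timestamps.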
