Python if statement's results into pandas dataframe - python

I am trying to create a DataFrame from a simple if statement result with no success. Could you show me the right method, please? This is what I have so far but the value of discrep is not added to the DataFrame.
discrepancy_value=round(system_availability.iloc[0,0]-data_av.iloc[0,0],2)
discrep=[]
if discrepancy_value>1:
    discrep=discrepancy_value
else:
    discrep=r'Discrepancy is not significant'
discrepancy=pd.DataFrame()
discrepancy['Discrepancy']=discrep

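Your problem is that you are assigning a single (scalar) value to a column of an empty DataFrame. A scalar is broadcast over the existing rows, and there are none, so the column is created but stays empty. Assign a list instead, so that the column gets one row per list element.
What you should be doing is: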
discrep=[]
if discrepancy_value>1:
    discrep.append(discrepancy_value)
else:
    discrep.append(r'Discrepancy is not significant')
discrepancy=pd.DataFrame()
discrepancy['Discrepancy']=discrep
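For reference, here is a self-contained sketch of the list-based approach. The two input frames and their numbers are made up for illustration; only the names system_availability, data_av and discrepancy_value come from the question.

```python
import pandas as pd

# Hypothetical inputs: the question does not show the real data.
system_availability = pd.DataFrame({"availability": [99.5]})
data_av = pd.DataFrame({"availability": [97.25]})

discrepancy_value = round(system_availability.iloc[0, 0] - data_av.iloc[0, 0], 2)

# Wrap the result in a list so the new column gets exactly one row.
discrep = []
if discrepancy_value > 1:
    discrep.append(discrepancy_value)
else:
    discrep.append("Discrepancy is not significant")

discrepancy = pd.DataFrame()
discrepancy["Discrepancy"] = discrep
print(discrepancy)
```

With these sample numbers the frame should end up with a single row holding the numeric discrepancy.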

On one line:
discrepancy = pd.DataFrame({'Discrepancy': [discrepancy_value if discrepancy_value > 1 else r'Discrepancy is not significant']})

You are trying to set a column on an empty DataFrame with 0 rows. If the DataFrame already had rows, the following would assign the same value to every row:
discrepancy['Discrepancy']=discrep
But because there are no rows in the DataFrame, the column is created without any values.
You could add a new row with the column value like this (DataFrame.append did this in older pandas but was removed in pandas 2.0; either way the result is a new DataFrame that has to be assigned back):
discrepancy=pd.concat([discrepancy, pd.DataFrame([{'Discrepancy': discrep}])], ignore_index=True)
Or add the row when you create the DataFrame:
discrepancy=pd.DataFrame([{'Discrepancy': discrep}])
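The difference between an empty and a populated DataFrame can be seen in a short sketch. The frame names and the value 2.5 are placeholders, not taken from the question.

```python
import pandas as pd

discrep = 2.5  # placeholder standing in for the if/else result

# Scalar assigned to an empty DataFrame: the column exists, but there are no rows.
empty = pd.DataFrame()
empty["Discrepancy"] = discrep

# Scalar assigned to a DataFrame that already has rows: broadcast to every row.
populated = pd.DataFrame(index=range(3))
populated["Discrepancy"] = discrep

# Building the row at construction time gives exactly one row.
one_row = pd.DataFrame([{"Discrepancy": discrep}])

print(len(empty), len(populated), len(one_row))
```

Checking len() on the result is a quick way to tell whether the value actually landed in a row.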

Related

How do I make a new dataframe with data from a previous dataframe

I have the data shown in the screenshot. I want to create a new DataFrame whose column headers are the cells of the Forces column in the screenshot, with the respective values listed in each column. I have tried indexing each variable and creating a new DataFrame, but that hasn't worked. Could I get some help?
When I index the variables I get a single value as opposed to a list of values.
One way to do this is to filter your DataFrame on each distinct value of the Forces column:
column = df['Forces'].unique()
dict_of_column_value = {}
for col in column:
    dict_of_column_value[col] = list(df[df['Forces'] == col].Values)
pd.DataFrame(dict_of_column_value)
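As a runnable sketch of the same idea: the original data is only visible in the screenshot, so the sample rows below are invented, and the column names Forces and Values are assumed from the snippet above.

```python
import pandas as pd

# Made-up long-format data; the real data is only visible in the screenshot.
df = pd.DataFrame({
    "Forces": ["Fx", "Fy", "Fz", "Fx", "Fy", "Fz"],
    "Values": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# One list of values per distinct entry in the Forces column.
dict_of_column_value = {}
for col in df["Forces"].unique():
    dict_of_column_value[col] = list(df.loc[df["Forces"] == col, "Values"])

# Works as long as every force has the same number of values.
wide = pd.DataFrame(dict_of_column_value)
print(wide)
```

If the forces do not all have the same number of values, building the DataFrame from plain lists raises a ValueError. Wrapping each list in pd.Series instead pads the shorter columns with NaN.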

How to add a row to a Pandas dataframe with the same value all the way down

I am trying to add a new column (series) to my dataframe with all values set to 'dummy'.
df['new_col'] = pd.Series(data='dummy')
This does add a new column, but none of the values are populated. I want to get the data to be dummy all the way down for however many rows are already in the dataframe.
Did you try assigning the scalar directly? It is broadcast to every existing row:
df['new_col'] = 'dummy'
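A likely explanation for the behaviour in the question is index alignment: pd.Series(data='dummy') holds a single element with index label 0, so only a row labelled 0 (if there is one) can receive it. A small sketch with an invented three-row frame:

```python
import pandas as pd

df = pd.DataFrame({"a": [10, 20, 30]})

# A one-element Series has index [0]; assignment aligns on the index,
# so only the row labelled 0 receives a value and the rest become NaN.
df["from_series"] = pd.Series(data="dummy")

# A plain scalar is broadcast to every existing row.
df["new_col"] = "dummy"

print(df)
```

If the DataFrame's index does not contain the label 0 at all, the Series version leaves the whole column empty, which would match "none of the values are populated".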

How to delete rows in a Pandas DataFrame?

I have a list in my pandas DataFrame and I want to delete all the rows that have a specific value in a column, up to the 10th row.
Something like this:
del df[df.value == 5][:10]
but it's not working.
Does anyone know the proper syntax?
Thanks
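One possible reading of the question is "within the first 10 rows, drop those whose value equals 5". Under that assumption, a boolean mask is a reasonable sketch. The sample data is invented; only the column name value comes from the question.

```python
import pandas as pd

# Made-up data; the column name `value` comes from the question.
df = pd.DataFrame({"value": [5, 1, 5, 2, 3, 5, 4, 5, 6, 7, 5, 5]})

# True for rows that sit in the first 10 positions AND hold the unwanted value.
in_first_ten = pd.Series(range(len(df)), index=df.index) < 10
to_drop = in_first_ten & (df["value"] == 5)

# Keep everything else; boolean indexing returns a new DataFrame.
result = df[~to_drop]
print(result)
```

If the intent is instead "drop the first 10 matching rows", then df.drop(df[df["value"] == 5].index[:10]) should do that, assuming the index labels are unique. The del statement is meant for removing a column by label, as in del df['value'], not for removing rows.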

pandas max function results in inoperable DataFrame

I have a DataFrame with four columns and want to generate a new DataFrame with only one column containing the maximum value of each row.
Using df2 = df1.max(axis=1) gave me the correct results, but the column is titled 0 and is not operable, meaning I cannot check its data type or change its name, which is critical for further processing. Does anyone know what is going on here? Or better yet, has a better way to generate this new DataFrame?
The result is a Series, not a DataFrame. For a one-column DataFrame use Series.to_frame:
df2 = df1.max(axis=1).to_frame('maximum')
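A short sketch with an invented four-column frame shows the two types involved. The name row_max is just for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "a": [1, 8, 3],
    "b": [4, 2, 9],
    "c": [7, 5, 6],
    "d": [0, 3, 2],
})

row_max = df1.max(axis=1)          # a Series: one value per row, no column name
df2 = row_max.to_frame("maximum")  # a one-column DataFrame named "maximum"

print(type(row_max).__name__, type(df2).__name__)
print(df2.dtypes)
```

The Series can also be inspected directly with row_max.dtype and given a name with row_max.rename("maximum") if a DataFrame is not strictly needed.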

adding row from one dataframe to another

I am trying to insert or add rows from one dataframe into another dataframe. I am going through the original dataframe looking for certain words in one column. When I find one of these terms I want to add that row to a new dataframe.
I get the row using:
entry = df.loc[df['A'] == item]
But when I try to add this row to another dataframe using .add, .insert, .update or other methods, I just get an empty dataframe.
I have also tried adding the column to a dictionary and turning that into a dataframe, but it writes data for the entire row rather than just the column value. So is there a way to add one specific row to a new dataframe from my existing variable?
So entry is a dataframe containing the rows you want to add?
You can simply concatenate the two dataframes with the concat function, provided both have the same column names:
import pandas as pd
entry = df.loc[df['A'] == item]
concat_df = pd.concat([new_df,entry])
pandas.concat reference:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html
The append function expects a list of rows in this form:
[row_1, row_2, ..., row_N]
where each row holds the values for the columns.
So, assuming you are trying to add one row, you should use:
entry = df.loc[df['A'] == item]
df2=df2.append( [entry] )
Notice that, unlike Python's list.append, DataFrame.append returns a new object and does not change the object it was called on. DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions use pd.concat([df2, entry]) instead.
Not sure how large your operations will be, but from an efficiency standpoint you're better off adding all of the found rows to a list and concatenating them once with pandas.concat, then using concat again to combine the found entries with the "insert into" dataframe. This will be much faster than calling concat on every iteration. If you're searching from a list of items search_keys, then something like:
entries = []
for item in search_keys:
    entry = df.loc[df['A'] == item]
    entries.append(entry)
found_df = pd.concat(entries)
result_df = pd.concat([old_df, found_df])
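Here is a self-contained sketch of the collect-then-concatenate pattern. The sample frames are invented; column A and the names search_keys, old_df, found_df and result_df follow the snippet above. The isin line at the end is an alternative not mentioned in the answers.

```python
import pandas as pd

# Made-up data; column "A" and the name search_keys follow the answer above.
df = pd.DataFrame({"A": ["apple", "pear", "fig", "apple"], "B": [1, 2, 3, 4]})
old_df = pd.DataFrame({"A": ["plum"], "B": [0]})
search_keys = ["apple", "fig"]

# Collect the matching slices first, then concatenate once.
entries = [df.loc[df["A"] == key] for key in search_keys]
found_df = pd.concat(entries)
result_df = pd.concat([old_df, found_df], ignore_index=True)

# Same rows without a Python loop (kept in their original row order).
found_isin = df[df["A"].isin(search_keys)]
print(result_df)
```

Note that pd.concat raises a ValueError when given an empty list, so the loop version needs a guard if search_keys can be empty.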
