I am replacing values of an existing dataframe in Python during a for loop. My original dataframe is of this format:
After trying to update the items in the "slice_file_name" and "fsID" columns via these pandas replace calls:
df1['slice_file_name'] = df1['slice_file_name'].replace(to_replace=str(row["slice_file_name"]), value=f'{actual_filename}_{directory}.wav')
fsID_Name = str(row["fsID"])
df1['fsID'] = df1['fsID'].replace(to_replace=str(row["fsID"]), value=f'{fsID_Name}_{directory}')
Only the "slice_file_name" gets updated correctly:
The "fsID" does not get updated correctly. Can you tell me what I am doing wrong here?
I want to update the "fsID" column as follows: for example, for the first data row, "fsID" should become 102305_TimeShift-10pct. I can see in my IDE that f'{fsID_Name}_{directory}' produces the correct string, but it does not update the fsID cell. How can I update the fsID cell accordingly?
Thanks!
If I understand correctly what you are looking for in 'fsID', would this work (after your 'slice_file_name' has been updated)?
df['fsID'] = df['fsID'] + '_' + df['slice_file_name'].str.split('_', expand=True)[1].str.replace('.wav', '', regex=False)
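For illustration, a minimal self-contained sketch of that idea; the sample rows are made up, and it assumes fsID is stored as a number and that the original file name contains no underscores:

import pandas as pd

# Made-up sample in the described layout, after slice_file_name has already been updated
df = pd.DataFrame({
    'slice_file_name': ['102305-6-0-0_TimeShift-10pct.wav', '100263-2-0-117_TimeShift-10pct.wav'],
    'fsID': [102305, 100263],  # assumed to be stored as integers
})

# Pull the directory suffix back out of the updated file name and append it to fsID.
# regex=False keeps '.wav' from being interpreted as a regular expression,
# and astype(str) is needed because fsID is numeric in this sketch.
suffix = df['slice_file_name'].str.split('_', expand=True)[1].str.replace('.wav', '', regex=False)
df['fsID'] = df['fsID'].astype(str) + '_' + suffix

print(df['fsID'])  # expected: 102305_TimeShift-10pct, 100263_TimeShift-10pct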
Trying to permanently delete all rows that contain a given string. I tried this code; it runs, but if you call df.head() afterwards it doesn't show that anything was dropped.
df[df["column"].str.contains('text')==False]
Try assigning the result back to df, like:
df = df[df["column"].str.contains('text')==False]
I am looking to delete a row in a dataframe that is imported into python by pandas.
If you see the sheet below, the first column has the same name multiple times. So the condition is: if the first column's value re-appears in a later row, delete that row; otherwise keep the row in the dataframe.
My final output should look like the following:
Presently I am doing it by converting each column into a list and deleting entries by index value. I am hoping there is an easier way than this workaround.
df.drop_duplicates([df.columns[0]])
should do the trick.
Try the following code:
df.drop_duplicates(subset='columnName', keep='first', inplace=True)
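For what it's worth, a minimal sketch of both forms on made-up data (the column names are invented):

import pandas as pd

df = pd.DataFrame({
    'name': ['alpha', 'alpha', 'beta', 'beta', 'gamma'],
    'value': [1, 2, 3, 4, 5],
})

# Keep only the first row for each repeated value in the first column.
deduped = df.drop_duplicates(subset=df.columns[0], keep='first')
print(deduped)

# Or drop the duplicates in place instead of creating a new dataframe:
df.drop_duplicates(subset='name', keep='first', inplace=True)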
I imported a .csv file with a single column of data into a dataframe that I am trying to clean up by splitting the column based on various string occurrences within the cells. I've tried numerous means to split the column, but can't seem to get it to work. My latest attempt was using the following:
df.loc[:,'DataCol'] = df.DataCol.str.split(pat=':\n',expand=True)
df
The result is a dataframe that is still one column and completely unchanged. What am I doing wrong? This is my first time doing anything like this so please forgive the simple question.
df.loc creates a copy of the column you've selected; try replacing the expression below with df['DataCol'], which references the actual column in the original dataframe.
df.loc[:,'DataCol']
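As an illustration only, here is a small sketch of one way the split could be used, on made-up data and assuming each cell holds two pieces separated by ':\n':

import pandas as pd

# Made-up single-column data in the described shape
df = pd.DataFrame({'DataCol': ['name:\nAlice', 'name:\nBob']})

# expand=True returns a dataframe with one column per piece,
# so assign the pieces to new columns rather than back into 'DataCol'.
parts = df['DataCol'].str.split(pat=':\n', expand=True)
df[['label', 'value']] = parts

# Or keep only the part after the separator in the original column:
df['DataCol'] = parts[1]

print(df)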
I'm trying to make Python append all the data starting from count=1 into the next column, but it prints it at the bottom of my result from count=0.
I'm using 'self' because of my class and function. The first time that count==0 it makes two columns: the first column is my self.header and the second one is self.oneVariableSum(self.times2). But once the count goes to 1, it adds self.oneVariableSum(self.times2) to the bottom of the second column, and I need it to go into a new column instead.
I have the relevant portion of that code below, but I can't figure out what I'm doing wrong.
if (count == 0):
    self.all.append([self.header, self.oneVariableSum(self.times2)])
else:
    self.all.append([[None, self.oneVariableSum(self.times2)]])
As others have said, it's not really possible/easy to do this with plain Python lists. I ended up converting the data to a pandas dataframe and used the line below to append the new result as a new column.
self.result = pd.concat([all, all2], axis=1, sort=False)
This did the trick.
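For anyone with the same issue, a small self-contained sketch of the idea, with made-up frames standing in for the class attributes:

import pandas as pd

# Each pass of the loop produces one column's worth of results (made-up values).
run0 = pd.DataFrame({'header': ['sum', 'mean'], 'run_0': [10, 2.5]})
run1 = pd.DataFrame({'run_1': [14, 3.5]})

# axis=1 glues the frames together side by side as new columns
# instead of stacking them underneath one another.
result = pd.concat([run0, run1], axis=1, sort=False)
print(result)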
So I have an Excel sheet with the following format:
Now what I'm looking to do is to loop through each index cell in column A and assign all cells the same value until the next 0 is reached. So, for example:
Now I have tried importing the Excel file into a pandas dataframe and then using for loops to do this, but I can't seem to make it work. Any suggestions or directions to the appropriate method would be much appreciated!
Thank you for your time
Edit:
Using @wen-ben's method: s.index = pd.Series((s.index == 0).cumsum()).map({1: 'bananas', 2: 'cherries', 3: 'pineapples'})
just enters the first element (bananas) for all cells in Column A
Assuming you have dataframe s, use cumsum:
s.index = pd.Series((s.index == 0).cumsum()).map({1: 'bananas', 2: 'cherries', 3: 'pineapples'})
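To show how the cumsum trick behaves, here is a small sketch on made-up data where the restarting 0, 1, 2, ... values form the dataframe's index; if they live in an ordinary column instead, compare against that column rather than the index:

import pandas as pd

# Made-up frame in the described shape: the index restarts at 0 for each fruit block.
s = pd.DataFrame({'B': [5, 3, 8, 1, 9, 4, 7]},
                 index=[0, 1, 2, 0, 1, 0, 1])

# Each time the index hits 0 a new block starts, so the running count of zeros
# labels the blocks 1, 2, 3, ..., which map() then turns into names.
block = pd.Series((s.index == 0).cumsum())
s.index = block.map({1: 'bananas', 2: 'cherries', 3: 'pineapples'})

print(s)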