Add a column to a dataframe - Python

I need to add a column named Id to a data frame, containing author IDs such as Author-Id-001, Author-Id-002, and so on up to 150.
How can I do that?
Thanks in advance.
Something like this (but instead of Test-Document-00* I need Author-Id-00*).

I think you need:
df['Id'] = ['Author-Id-{:03d}'.format(x) for x in range(1, 151)]
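The one-liner above hard-codes 150 rows. As a minimal sketch, assuming a hypothetical frame with a made-up Title column (the real data is not shown in the question), the same idea can take the row count from the dataframe itself:

```python
import pandas as pd

# Hypothetical frame with 150 rows; the real data is not shown in the question.
df = pd.DataFrame({"Title": [f"Doc {i}" for i in range(150)]})

# Build one zero-padded label per row, numbered from 1.
# Using len(df) avoids a length mismatch if the row count changes.
df["Id"] = [f"Author-Id-{i:03d}" for i in range(1, len(df) + 1)]

print(df["Id"].iloc[0])   # Author-Id-001
print(df["Id"].iloc[-1])  # Author-Id-150
```

The list must have exactly as many items as the dataframe has rows, otherwise pandas raises a ValueError.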

Related

How to make Date a column in dataframe

I have the dataframe below and I am trying to display how many rides there were per day.
But only "near_penn" is treated as a column; "Date" is not.
c = df[['start day','near_penn','Date']]
c=c.loc[c['near_penn']==1]
pre_pandemic_df_new=pd.DataFrame()
pre_pandemic_df_new=c.groupby('Date').agg({'near_penn':'sum'})
print(pre_pandemic_df_new)
print(pre_pandemic_df_new.columns)
Why doesn't it consider "Date" as a column?
How can I make Date a column of "pre_pandemic_df_new"?
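You can use the to_datetime method to convert the dates, but it only works once Date is a column again (after the groupby it is the index):
import pandas as pd
pre_pandemic_df_new = pre_pandemic_df_new.reset_index()
pre_pandemic_df_new["Date"] = pd.to_datetime(pre_pandemic_df_new["Date"])
Hope this works.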
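Why doesn't it consider "Date" as a column?
Because Date is the index of your dataframe: groupby moves the grouping key into the index of the result.
How can I make Date a column of "pre_pandemic_df_new"?
You can try this (reset_index returns a new dataframe, so assign the result back):
pre_pandemic_df_new = pre_pandemic_df_new.reset_index(level=['Date'])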
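To illustrate why Date ends up in the index, here is a small sketch with made-up sample data (only the column names come from the question). It shows two ways to get Date back as a regular column: reset_index after the fact, or as_index=False in the groupby call.

```python
import pandas as pd

# Made-up sample data; only the column names follow the question.
df = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-01", "2020-01-02"],
    "near_penn": [1, 1, 1],
})

# Default groupby: the grouping key becomes the index of the result.
by_index = df.groupby("Date").agg({"near_penn": "sum"})
print(list(by_index.columns))  # ['near_penn']
print(by_index.index.name)     # Date

# Option 1: move the index back into a regular column.
as_column = by_index.reset_index()

# Option 2: ask groupby to keep the key as a column from the start.
as_column_2 = df.groupby("Date", as_index=False).agg({"near_penn": "sum"})
print(list(as_column.columns))  # ['Date', 'near_penn']
```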
df[['Date','near_penn']] = df[['Date_new','near_penn_new']]
Once you have created your dataframe, you can try this to add new columns at the end of it and test whether it works before you make adjustments.
OR
You can check the value in the first row that corresponds to the first "Date" entry.
These are the first things that came to mind; hope it helps.

How to delete a row from a dataframe if a value in the index is NaN or blank string

I am new to Python and wondering if there is a simple way to fix my problem below:
I would like to remove an entire row from the dataframe if the index is 'NaN'. I have tried the following:
df.dropna(inplace=True,subset=[df.index.values])
I know how to do this if I want to only look for NaN values in any other column, but I am not sure how to do it for the index values. Thanks much!
Try this:
df = df[df.index.notnull()]
Maybe try this (note that it drops rows in which every value is NaN; it does not look at the index):
df = df.dropna(how='all')
print(df)
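The title also asks about blank strings, which notnull() alone does not catch. A minimal sketch on made-up data that removes rows whose index label is either NaN or an empty (or whitespace-only) string:

```python
import numpy as np
import pandas as pd

# Made-up data: the second index label is NaN, the third is a blank string.
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=["a", np.nan, "", "d"])

# Keep rows whose index label is neither NaN nor blank.
labels = df.index.to_series()
keep = labels.notna() & (labels.astype(str).str.strip() != "")

# Convert the mask to a plain array so it is applied by position, not by label.
df = df[keep.to_numpy()]
print(list(df.index))  # ['a', 'd']
```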

Filtering Pandas Dataframe by the ending of the string

I have a data frame called df, and in one column, 'Properties', I have listed the properties of some products. Each property is a single sentence. Some of them have the same ending, e.g. 'stock'.
I was trying to do something like:
df.loc[df['Properties'][-6:] == 'stock']
to filter these values, but it was not working.
I'd like to be able to filter the data frame by the last 5 characters of that column.
Do you have any ideas how to do this?
Try this:
df = df[df['Properties'].str.endswith('stock')]
If you want to stay close to your original attempt, this would work:
df = df[df['Properties'].str[-5:]=='stock']
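The original attempt failed because df['Properties'][-6:] slices the last six rows of the column, not the last characters of each string; the .str accessor is what applies string operations element by element. A small sketch on made-up data, including a missing value to show the na=False option:

```python
import pandas as pd

# Made-up sample data, including one missing value.
df = pd.DataFrame({
    "Properties": ["Item is in stock", "Ships in two days", None, "Out of stock"],
})

# na=False treats missing values as non-matches instead of propagating NaN
# into the boolean mask.
mask = df["Properties"].str.endswith("stock", na=False)
in_stock = df[mask]
print(in_stock["Properties"].tolist())  # ['Item is in stock', 'Out of stock']

# Equivalent filter using per-string slicing of the last 5 characters.
same = df[df["Properties"].str[-5:] == "stock"]
```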

How would you extract or slice string like this?

As shown in the picture, how would you slice or extract 'id' from the 'user' column?
df['id'] = df['user'].apply(lambda x: x['id'])
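This should work if each value in the user column is a dict.
The new id column will contain the ids.
The user column looks like JSON. If the values are JSON strings rather than dicts, json.loads has to be applied to each value, not to the whole column. Try df['user'].apply(lambda x: json.loads(x)['id']).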
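Since the picture is not available, it is unclear whether the user column holds dicts or JSON strings. The sketch below covers both cases on made-up data; the name field is an assumption for illustration only.

```python
import json

import pandas as pd

# Case 1: each cell is already a Python dict.
df_dicts = pd.DataFrame({
    "user": [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}],
})
df_dicts["id"] = df_dicts["user"].apply(lambda u: u["id"])

# Case 2: each cell is a JSON string, so parse it first.
df_json = pd.DataFrame({
    "user": ['{"id": 1, "name": "ann"}', '{"id": 2, "name": "bob"}'],
})
df_json["id"] = df_json["user"].apply(lambda s: json.loads(s)["id"])

print(df_dicts["id"].tolist())  # [1, 2]
print(df_json["id"].tolist())   # [1, 2]
```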

How to create a column in the existing dataframe with independent row values

I have a pandas DataFrame like this, where A is the column name:
results.head()
A
when you are away
when I was away
when they are away
I want to add a new column B which would seem like the following:
A                   B
when you are away   you
when I was away     I
when they are away  they
I tried with this code but it did not work:
results.assign(B = you, I, they)
I am new to pandas and would very much appreciate the help.
Try this:
B_list = ['you','I','they']
results['B'] = B_list
OR
results = results.assign(B=['you', 'I', 'they'])
If you are interested in placing the column in a specific location (index), then use the insert method:
results.insert(index, "ColName" , new_set)
Otherwise, the answer by Mayank is simpler.
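A short sketch of the insert approach, using the data from the question. The position 0 is an arbitrary choice for illustration. Note that insert modifies the dataframe in place, while assign returns a new dataframe and leaves the original untouched.

```python
import pandas as pd

results = pd.DataFrame({
    "A": ["when you are away", "when I was away", "when they are away"],
})

# assign returns a new dataframe with B appended as the last column.
with_b_last = results.assign(B=["you", "I", "they"])
print(list(with_b_last.columns))  # ['A', 'B']

# insert works in place: put B at position 0, in front of A.
results.insert(0, "B", ["you", "I", "they"])
print(list(results.columns))  # ['B', 'A']
```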
