How would you extract or slice a string like this? - python

As the picture shows, how would you slice or extract 'id' from the 'user' column?

df['id'] = df['user'].apply(lambda x: x['id'])
This should work if each value in the 'user' column is already a dict.
The new 'id' column will then contain the ids.

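The user column looks like JSON. If the values are JSON strings, parse each row before indexing: df['user'].apply(lambda s: json.loads(s)['id']). Calling json.loads(df['user']) on the whole column raises a TypeError, because json.loads expects a single string.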
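Which approach applies depends on what the cells actually hold: dicts can be indexed directly, while JSON text has to be parsed row by row first. A minimal sketch that handles both, assuming made-up sample rows (the question only showed a picture) and a hypothetical helper named extract_id:

```python
import json

import pandas as pd

# Hypothetical sample data; the real frame was only shown as a picture.
df_dicts = pd.DataFrame({"user": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]})
df_json = pd.DataFrame({"user": ['{"id": 1, "name": "a"}', '{"id": 2, "name": "b"}']})


def extract_id(cell):
    """Return cell['id'], parsing the cell as JSON first if it is a string."""
    if isinstance(cell, str):
        cell = json.loads(cell)
    return cell["id"]


# apply() calls extract_id once per row, so json.loads always sees one string.
df_dicts["id"] = df_dicts["user"].apply(extract_id)
df_json["id"] = df_json["user"].apply(extract_id)
```

If the strings use single quotes (a printed Python dict rather than JSON), json.loads will reject them, so check a sample cell first.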

Related

Using casefold() with dataframe Column Names and .contains method

How do I look for instances in the dataframe where the 'Campaign' column contains 'b0'?
I would like to not alter the dataframe values but instead just view them as if they were lowercase.
df.loc.str.casefold()[df['Campaign'].str.casefold().contains('b0')]
I recently asked how to do this when matching a specific string exactly, as below, but I am finding the case above more difficult.
df['Record Type'].str.lower() == 'keyword'
Try with
df.loc[df['Campaign'].str.contains('b0',case=False)]
Alternatively, if you want to create a subset of the dataframe:
df_subset = df[df['Campaign'].str.casefold().str.contains('b0', na=False)]
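To see the case-insensitive filter on concrete data, here is a small sketch. Only the column name 'Campaign' comes from the question; the sample values are invented for illustration:

```python
import pandas as pd

# Hypothetical sample data, including a missing value.
df = pd.DataFrame({"Campaign": ["Spring B0 sale", "b0-retarget", "Brand", None]})

# case=False ignores case; na=False treats missing values as non-matches.
mask = df["Campaign"].str.contains("b0", case=False, na=False)

# Selecting with the mask returns the matching rows; df itself is not modified.
matches = df.loc[mask]
```

The original values keep their casing, since the lowercase comparison happens only inside the mask.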

How do I strip data from a row in Pandas?

I have a Pandas dataframe and I need to strip out components.schemas.Person.properties so that only id remains.
column                                   data_type  data_description
components.schemas.Person.properties.id  string     Unique Mongo Id generated for the person.
Like this?
df['column'] = df['column'].apply(lambda x: x.split('.')[-1])
Or, a more compact solution by @Chris Adams:
df['column'] = df['column'].str.split('.').str[-1]
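A quick sketch of the vectorized split on the row from the question. The second row, which contains no dot, is an added assumption to show that such values pass through unchanged:

```python
import pandas as pd

df = pd.DataFrame({
    # First value is from the question; "name" is a hypothetical extra row.
    "column": ["components.schemas.Person.properties.id", "name"],
    "data_type": ["string", "string"],
})

# Split each value on '.' and keep the last piece.
df["column"] = df["column"].str.split(".").str[-1]
```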

Get a pandas column name as a string

I have a dataframe containing multiple columns and multiple rows. I am trying to find the column which contains the entry 'some_string'. I managed to do this with
col = df.columns[df.isin(['some_string']).any()]
I would like to have col as a string, but instead it is of the following type
In [47]:
print(col)
Out[47]:
Index(['col_N'], dtype='object')
So how can I get just 'col_N' returned? I just can't find an answer to that! Thanks.
You can treat your output as a list. If you have only one match, you can ask for the first element:
print(col[0])
If you have one or more matches and you want to print them all, you can convert it to a list:
print(list(col))
or you can unpack the values of col into print:
print(*col)
I think converting to a list will help. Apply it to col rather than to df.columns, which would list every column:
list_of_columns = list(col)
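Putting the pieces together, a minimal sketch with an invented two-column frame (the names col_A and col_N and the cell values are placeholders):

```python
import pandas as pd

# Hypothetical frame; only 'some_string' and 'col_N' echo the question.
df = pd.DataFrame({"col_A": ["x", "y"], "col_N": ["some_string", "z"]})

# Boolean mask over columns: True where any cell equals 'some_string'.
col = df.columns[df.isin(["some_string"]).any()]

# Indexing the Index gives a plain string; guard against no match at all.
first_name = col[0] if not col.empty else None

# tolist() gives every matching name as a regular Python list.
all_names = col.tolist()
```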

how to set columns of pandas dataframe as list

I have a pandas dataframe, and when I try to access its columns (like df[["a"]]) it is not possible because
the columns are defined as an "Index" object (pandas.core.indexes.base.Index), shown as Index(['col2','col2'], dtype='object').
I tried converting it with df.columns = df.columns.tolist() and also df.columns = [str(col) for col in df.columns],
but the columns remained an Index object.
What I want is for df.columns to return a list object.
What can I do?
columns is not callable, so you need to remove the parentheses ():
df.columns will give you the column names as an Index object.
list(df.columns) will give you the column names as a list.
In your example, list(ss.columns) will return a list of column names.
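Try this:
df.columns.values.tolist()
df.columns.tolist() also returns a list, without the values attribute. Assigning a list back to df.columns does not change the type, because pandas always stores the column labels as an Index; convert at the point where you need a list.
Alternatively, wrap it in the list constructor so that it behaves like a list:
list(ss.columns)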
Hope this works!
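A short sketch of why the assignment in the question appeared to do nothing. The frame ss and its column names are placeholders:

```python
import pandas as pd

# Hypothetical frame standing in for the asker's ss.
ss = pd.DataFrame({"col1": [1], "col2": [2]})

# Converting on read gives a real Python list.
names = ss.columns.tolist()

# Assigning the list back is accepted, but pandas wraps it in an Index again.
ss.columns = names
still_index = isinstance(ss.columns, pd.Index)
```

So the list has to be kept in its own variable (names here) rather than expected from df.columns itself.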

add column in dataframe

I need to add a column named Id to a data frame, containing author ids like Author-Id-001, Author-Id-002, and so on up to 150.
How can I do that?
Thanks in advance.
Something like this (instead of Test-Document-00* I need Author-Id-00*).
I think you need:
df['Id'] = ['Author-Id-{:03d}'.format(x) for x in range(1, 151)]
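The list must have exactly as many entries as the frame has rows, otherwise the assignment raises a ValueError. A sketch that derives the count from the frame instead of hard-coding 151, assuming a placeholder frame with 150 rows:

```python
import pandas as pd

# Hypothetical 150-row frame; the 'value' column is only filler.
df = pd.DataFrame({"value": range(150)})

# :03d zero-pads the number to three digits: 1 -> 001, 150 -> 150.
df["Id"] = [f"Author-Id-{n:03d}" for n in range(1, len(df) + 1)]
```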
