This question already has answers here:
How to filter Pandas dataframe using 'in' and 'not in' like in SQL
(11 answers)
Closed 2 years ago.
I am filtering a dataframe where a column == one of these numbers (must be in string format):
numbers = [
'013237918',
'70261045',
'70261125',
'70275745',
'70325295',
'70346445',
'70377495',
'70428835',
'74908083',
'75316203'
]
# filter the DF
df[df['column'].str.contains('|'.join(numbers))]
Which I'm 99% sure works.
However, what if one of the rows contained my number but also an extra digit or letter at the end, for example '70377495abx'? That row would match the search above, but I don't want it.
I want the rows where the column equals exactly one of the items in the list.
Like
df[df['column'].str.equals('|'.join(numbers))]
Does something like this exist?
Use
print(df[df['column'].isin(numbers)])
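For illustration, here is a minimal sketch with made-up data (the column name 'column' is taken from the question) showing that isin keeps only exact matches, whereas str.contains also picks up rows like '70377495abx'; Series.str.fullmatch (pandas 1.1+) is another way to force an exact match:
import pandas as pd

numbers = ['013237918', '70377495']
df = pd.DataFrame({'column': ['013237918', '70377495', '70377495abx']})

# exact membership test: the '70377495abx' row is excluded
exact = df[df['column'].isin(numbers)]

# substring search: the '70377495abx' row slips through
loose = df[df['column'].str.contains('|'.join(numbers))]

# regex alternative that also enforces an exact match (pandas 1.1+)
full = df[df['column'].str.fullmatch('|'.join(numbers))]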
This question already has answers here:
I'm getting an IndentationError. How do I fix it?
(6 answers)
Closed 12 months ago.
In a dataframe I would like to replace the values in 2 columns for certain IDs that are in another column. To be more specific:
globalproduct_all dataframe:
I have 34 identifiers in a list, and I would like to iterate through the list and make this change:
list = ['8012542891901',
'4001869429854',
'4001869429816',
'4001869429809',
'4001869429762',
'4001869429755',
'4001869429717',
'4001869429700',
'4001869429687',
'4001869429670']
for i in list:
globalproduct_all.loc[globalproduct_all.identifier == i, ['ETIM_class_cat', 'ETIM_class']] = "EC000042", "Miniature circuit breaker (MCB)"
I get an IndentationError.
How can I make this work?
Thank you.
Oh sorry, I needed to add a tab before the code inside the loop... then it works.
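For reference, a minimal runnable sketch of the corrected code (a tiny stand-in dataframe is used here, the list is trimmed and renamed identifiers to avoid shadowing the built-in list, and the assignment is indented so it runs inside the loop); a vectorized isin version is shown as an alternative:
import pandas as pd

# toy stand-in for the real globalproduct_all dataframe (columns assumed from the question)
globalproduct_all = pd.DataFrame({
    'identifier': ['8012542891901', '4001869429854', '9999999999999'],
    'ETIM_class_cat': ['', '', ''],
    'ETIM_class': ['', '', ''],
})

identifiers = ['8012542891901', '4001869429854']  # trimmed list for brevity

for i in identifiers:
    # the indented assignment is what the for loop executes
    globalproduct_all.loc[
        globalproduct_all.identifier == i,
        ['ETIM_class_cat', 'ETIM_class']
    ] = "EC000042", "Miniature circuit breaker (MCB)"

# vectorized alternative: match all identifiers at once, no loop needed
mask = globalproduct_all.identifier.isin(identifiers)
globalproduct_all.loc[mask, ['ETIM_class_cat', 'ETIM_class']] = "EC000042", "Miniature circuit breaker (MCB)"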
This question already has answers here:
How do I select rows from a DataFrame based on column values?
(16 answers)
Closed 1 year ago.
What I'm trying to do is create a function that returns the frequency or count of a name input into the function. For example, if I inputted 'Olivia' into my get_frequency(name) function I would want 19674 returned.
You don't actually need an if condition here; you could just look up the 'Count' value where df['Name'] equals the name you are searching for.
snippet of code:
def get_frequency(name):
    return df['Count'][df['Name'] == name]
If you want the scalar value, you could use values like this:
def get_frequency(name):
    return df['Count'][df['Name'] == name].values[0]
As a result you'd get:
get_frequency('Emma')
>> 20799
In .loc syntax, the lookup above would be:
df.loc[df.Name == name, 'Count']
Or you can use at:
df.at[row_label, col_label]
which returns the scalar at that position, or you can use positional indices with iat:
df.iat[row_position, col_position]
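A runnable sketch, assuming a df with 'Name' and 'Count' columns like the baby-names data described in the question (the 'Ava' row is invented; the 'Emma' and 'Olivia' counts are the ones quoted above):
import pandas as pd

# toy stand-in for the real dataframe
df = pd.DataFrame({
    'Name': ['Emma', 'Olivia', 'Ava'],
    'Count': [20799, 19674, 18500],
})

def get_frequency(name):
    # the boolean mask selects the matching row(s); .values[0] extracts the first scalar
    return df['Count'][df['Name'] == name].values[0]

print(get_frequency('Emma'))    # 20799
print(get_frequency('Olivia'))  # 19674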
This question already has answers here:
How to filter Pandas dataframe using 'in' and 'not in' like in SQL
(11 answers)
Closed 3 years ago.
I'm trying to get Python to loop through all of the rows in my pandas dataframe's x column and return True if that value appears anywhere in the dataframe's y column.
I've already tried to do it like this, but it isn't working.
#Create a column for storage
df["y_exists"] = ""
#Here is a loop that always returns FALSE
if df.x in df.y:
df.x_exists="TRUE"
else:
df.x_exists="FALSE"
This just results in a column full of FALSE when I know that many values should have returned TRUE.
Ideally, if row one of the dataframe had a value of "105" in column x, it would search for "105" in column y, return TRUE if it was there and FALSE if it was not, and record this in the new column I created.
You are looking for pandas.DataFrame.isin.
However, you shouldn't loop but use vectorization instead, as it is more efficient, like this:
df['exists_in_y']=df['x'].isin(df['y'])
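A small runnable sketch with made-up data (column names x and y as in the question) showing what the result looks like:
import pandas as pd

# toy data: the x values '105' and '200' also appear somewhere in y, '300' does not
df = pd.DataFrame({
    'x': ['105', '200', '300'],
    'y': ['105', '999', '200'],
})

# element-wise membership test: True where the x value occurs anywhere in column y
df['exists_in_y'] = df['x'].isin(df['y'])

print(df['exists_in_y'].tolist())  # [True, True, False]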
This question already has an answer here:
Replace values in a pandas series via dictionary efficiently
(1 answer)
Closed 4 years ago.
I have a df where each row has a code that indicates a department.
On the other hand, I have a dictionary in which each code corresponds to a region name (a region is made up of multiple departments).
I thought of a loop to put the value in a new column that indicates the region.
Here is what I thought would work:
for r in df:
    dep = r["dep"].astype(str)
    r["region"] = dep_dict.get(dep)
But the only thing I get is "string indices must be integers".
Do you know how I could make it work? Or should I take a totally different route (like joining)?
Thanks! 🙏
df['region'] = df['dep'].apply(lambda x: dep_dict[x])
Would this help?
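A runnable sketch with an invented dep_dict and department codes; Series.map does the same dictionary lookup as the apply above and, per the duplicate linked at the top, is the usual way to do it (unknown codes become NaN instead of raising KeyError):
import pandas as pd

# toy data: department codes and a code-to-region dictionary (values invented for illustration)
df = pd.DataFrame({'dep': ['01', '02', '75']})
dep_dict = {'01': 'Auvergne-Rhone-Alpes', '02': 'Hauts-de-France', '75': 'Ile-de-France'}

# dictionary lookup per row, as in the answer above
df['region'] = df['dep'].apply(lambda x: dep_dict[x])

# equivalent with map; codes missing from the dictionary become NaN instead of raising KeyError
df['region'] = df['dep'].map(dep_dict)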
This question already has answers here:
SQLAlchemy and scalar values [duplicate]
(2 answers)
Closed 6 years ago.
I want to get the values of a column with SQLAlchemy. I know I can do it like this:
For example:
result = db.session.query(User.id).all()
but the result is a list of tuples:
result = [(1,), (2,), (3,), ...]
result = [i[0] for i in result]  # how do I omit this step?
I would like to get a list ([1, 2, 3, ...]) or a set directly. Does anyone have a good idea?
It seems there is no direct approach for this in SQLAlchemy.
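A sketch of the usual workarounds, assuming a typical Flask-SQLAlchemy setup with a User model: unpack the one-element tuples yourself, or, in SQLAlchemy 1.4+, use the 2.0-style session.scalars(), which skips the tuples entirely:
rows = db.session.query(User.id).all()   # [(1,), (2,), (3,), ...]

ids = [row_id for (row_id,) in rows]     # unpack each one-element tuple -> [1, 2, 3, ...]
unique_ids = set(ids)                    # as a set, if that is what you need

# SQLAlchemy 1.4+ only: 2.0-style query that returns plain scalars
from sqlalchemy import select
ids = db.session.scalars(select(User.id)).all()   # [1, 2, 3, ...]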