This question already has answers here:
Remove duplicates from dataframe, based on two columns A,B, keeping row with max value in another column C
(4 answers)
how do I remove rows with duplicate values of columns in pandas data frame?
(4 answers)
Drop all duplicate rows across multiple columns in Python Pandas
(8 answers)
Closed 2 years ago.
I have an Excel file that contains the following information
and I need to extract that data into a dataframe with the unique values from the columns column ID and value1, so that I end up with something like this:
I have done the following:
df=pd.read_excel(file)
dfUnique=pd.unique(df[["column ID","value1"]]).values.ravel('K')
but I get the following error:
could not broadcast input array from shape (5,2) into shape (5)
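The error happens because pd.unique expects a 1-D array, not a two-column DataFrame. One way around it is drop_duplicates, which does work across multiple columns. A minimal sketch, using made-up sample data since the original Excel file isn't shown:

```python
import pandas as pd

# Hypothetical stand-in for pd.read_excel(file); the real data isn't shown
df = pd.DataFrame({
    "column ID": [1, 1, 2, 2, 3],
    "value1":    ["a", "a", "b", "c", "c"],
})

# drop_duplicates keeps one row per unique ("column ID", "value1") pair
dfUnique = df[["column ID", "value1"]].drop_duplicates().reset_index(drop=True)
print(dfUnique)
```

reset_index(drop=True) just renumbers the rows after the duplicates are removed.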
This question already has answers here:
Use a list of values to select rows from a Pandas dataframe
(8 answers)
Closed 10 months ago.
I have a pandas dataframe containing data and a Python list of ids. I want to extract the rows of the dataframe whose id values appear in the list.
ids = ['SW00003062', 'SW00003063', 'SW00003067', 'SW00003072']
Dataframe is this:
You can use isin:
out = df[df['id'].isin(ids)]
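A self-contained run of that answer, with a hypothetical dataframe since the original isn't shown:

```python
import pandas as pd

ids = ['SW00003062', 'SW00003063', 'SW00003067', 'SW00003072']

# Made-up dataframe; only some ids overlap with the list
df = pd.DataFrame({
    "id":    ['SW00003061', 'SW00003062', 'SW00003063', 'SW00003099'],
    "value": [10, 20, 30, 40],
})

out = df[df['id'].isin(ids)]
print(out)  # keeps only the SW00003062 and SW00003063 rows
```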
This question already has answers here:
How do I select rows from a DataFrame based on column values?
(16 answers)
Closed 1 year ago.
I have an array (named df) and a dataframe (named data).
The array consists of unique ids, say df=array([10,11,12]).
The dataframe consists of 3 columns: data, id, value.
I want to filter the dataframe so that it only contains the ids specified in the array.
IIUC:
data = data[data["id"].isin(df)]
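A sketch of the same one-liner with the setup filled in; the dataframe contents are assumed, since only the column names are given in the question. Note that Series.isin accepts a NumPy array directly:

```python
import numpy as np
import pandas as pd

df = np.array([10, 11, 12])  # the id array from the question

# Hypothetical dataframe with the three columns described
data = pd.DataFrame({
    "data":  ["a", "b", "c", "d"],
    "id":    [10, 11, 13, 12],
    "value": [1.0, 2.0, 3.0, 4.0],
})

data = data[data["id"].isin(df)]  # drops the row with id 13
print(data)
```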
This question already has answers here:
How can I pivot a dataframe?
(5 answers)
How to pivot a dataframe in Pandas? [duplicate]
(2 answers)
Closed 1 year ago.
I have a dataframe like this:
index,col1,value
1,A,1
1,B,2
2,A,3
2,D,4
2,C,5
2,B,6
And I would like to convert this dataframe to this:
index,col1_A,col1_B,col1_C,col1_D
1,1,2,np.nan,np.nan
2,3,6,5,4
The conversion is based on the index column: for each unique index, the values from col1 become column names, and each new column is filled with the corresponding entry from the value column.
Currently my solution loops: for each index I create a subset of df as a temporary dataframe and then loop over that. I am wondering if pandas already has a built-in solution for this. Please feel free to suggest one.
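There is a built-in for exactly this: pivot (or pivot_table, if an (index, col1) pair can repeat). A sketch using the sample data from the question; note the values end up keyed by column name, not by the order they appear in the long data:

```python
import pandas as pd

df = pd.DataFrame({
    "index": [1, 1, 2, 2, 2, 2],
    "col1":  ["A", "B", "A", "D", "C", "B"],
    "value": [1, 2, 3, 4, 5, 6],
})

# Long-to-wide reshape: one row per index, one column per col1 value
out = df.pivot(index="index", columns="col1", values="value")
out.columns = ["col1_" + c for c in out.columns]  # col1_A, col1_B, ...
out = out.reset_index()
print(out)
```

Missing (index, col1) combinations, like (1, C) and (1, D) here, come out as NaN automatically.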
This question already has answers here:
select columns based on columns names containing a specific string in pandas
(4 answers)
Closed 4 years ago.
Due to the large number of columns, I was trying to search for column names that contain 'el'. This is the code I tried:
combined.columns.str.contains("el")
But the result I'm getting is a Boolean array; how can I view the column names that returned True?
Use boolean indexing with loc, passing : to select all rows:
combined.loc[:, combined.columns.str.contains("el")]
Or use DataFrame.filter:
combined.filter(like='el')
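A runnable sketch of both answers, with made-up column names since the real dataframe isn't shown. To view just the matching names (what the question literally asks), index the columns with the Boolean mask:

```python
import pandas as pd

# Hypothetical dataframe; "element" and "label" contain "el", "other" does not
combined = pd.DataFrame({
    "element": [1, 2],
    "label":   [3, 4],
    "other":   [5, 6],
})

mask = combined.columns.str.contains("el")
print(combined.columns[mask])       # just the matching column names

subset = combined.filter(like="el")  # same selection as combined.loc[:, mask]
print(subset)
```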
This question already has answers here:
Filter dataframe rows if value in column is in a set list of values [duplicate]
(7 answers)
Closed 4 years ago.
I have a pandas dataframe that looks like this:
From this, I want to grab all the rows for particular Filters (1st column). So for example, I want to grab the rows for F218W, F336W, and F373N.
What is the easiest way to do this in pandas?
In addition if I wanted to grab the rows for those filters but also only for Chip 1, how could I do that easily?
Thanks!
This is a simple slicing:
df[df["# Filter"].isin(["F218W", "F336W","F373N"])]
If the rules span multiple columns, you can simply combine them using &:
df[df["# Filter"].isin(["F218W", "F336W","F373N"]) & (df["Chip"] == 1)]
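Both answers in one self-contained run, with a hypothetical stand-in for the filter table in the question:

```python
import pandas as pd

# Made-up data; the real table isn't shown in the question
df = pd.DataFrame({
    "# Filter": ["F218W", "F336W", "F373N", "F438W", "F218W"],
    "Chip":     [1, 2, 1, 1, 2],
})

# Rows for the three wanted filters
wanted = df[df["# Filter"].isin(["F218W", "F336W", "F373N"])]

# Same filters, but restricted to Chip 1 by AND-ing a second condition
chip1 = df[df["# Filter"].isin(["F218W", "F336W", "F373N"]) & (df["Chip"] == 1)]
print(chip1)
```

The parentheses around (df["Chip"] == 1) matter: & binds tighter than == in Python.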