How to set index of pandas dataframe in python? [duplicate] - python

This question already has answers here:
Dataframe set_index not setting
(2 answers)
Pandas set_index does not set the index
(1 answer)
Closed 3 years ago.
I have tried to choose a column to be the index of a data frame. The examples I've seen so far suggest using the method set_index(), but it doesn't work in my case. I'm using Python 3.7.0.
import pandas as pd
df = pd.DataFrame({'Fruit': ['Apples', 'Oranges'],
                   'Amount': [1, 17]})
df.set_index('Fruit')
print(df)
The output that I get is
Fruit Amount
0 'Apples' 1
1 'Oranges' 17
The output I want would be something like
Amount
Fruit
'Apples' 1
'Oranges' 17
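The thread was closed as a duplicate, but the fix the linked answers point to is that set_index() returns a new DataFrame by default instead of modifying the original in place, so the result has to be assigned back (or inplace=True passed). A minimal sketch:

```python
import pandas as pd

df = pd.DataFrame({'Fruit': ['Apples', 'Oranges'],
                   'Amount': [1, 17]})

# set_index returns a NEW DataFrame; assign it back to keep the change
df = df.set_index('Fruit')
print(df)
```

Equivalently, df.set_index('Fruit', inplace=True) modifies df directly, though reassignment is generally preferred.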


How do I create one column in pandas (Python) based on indexes out of multiple columns? [duplicate]

This question already has answers here:
Pandas Melt Function
(2 answers)
Closed 2 years ago.
I have a data frame where there are multiple values for a certain index (a 1-to-many relationship) - e.g. states as the index and counties spread across several columns. I want to reshape it so that all the values end up in a single column. This is a basic transformation, but somehow I can't get it right.
Sorry, I don't know how to insert code that has actually been run, so here is the code that creates example data frames showing what I'd like to achieve.
import numpy as np
import pandas as pd

df = pd.DataFrame({'INDEX': ['INDEX1', 'INDEX2', 'INDEX3'],
                   'col1': ['a', 'b', 'd'],
                   'col2': ['c', 'f', np.nan],
                   'col3': ['e', np.nan, np.nan]})
and I want to transform it so that I end up with this data frame:
pd.DataFrame({'INDEX': ['INDEX1', 'INDEX1', 'INDEX1', 'INDEX2', 'INDEX2', 'INDEX3'],
              'col1': ['a', 'c', 'e', 'b', 'f', 'd']})
You can use melt here:
df = pd.melt(df, id_vars=['INDEX']).drop(columns=['variable']).dropna()
print(df)
INDEX value
0 INDEX1 a
1 INDEX2 b
2 INDEX3 d
3 INDEX1 c
4 INDEX2 f
6 INDEX1 e
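Run end to end, the melt answer can also be extended to reproduce the asked-for frame exactly. The sort_values, reset_index and rename steps below are additions beyond the original answer, included only to match the desired row order and column name:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'INDEX': ['INDEX1', 'INDEX2', 'INDEX3'],
                   'col1': ['a', 'b', 'd'],
                   'col2': ['c', 'f', np.nan],
                   'col3': ['e', np.nan, np.nan]})

# melt to long form, drop the 'variable' column and the NaN rows,
# then regroup rows by their index value (stable sort keeps melt order)
out = (pd.melt(df, id_vars=['INDEX'])
         .drop(columns=['variable'])
         .dropna()
         .sort_values('INDEX', kind='mergesort')
         .reset_index(drop=True)
         .rename(columns={'value': 'col1'}))
print(out)
```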

Change the value from another dataframe [duplicate]

This question already has answers here:
Pandas Merging 101
(8 answers)
Closed 2 years ago.
I am new to Python. I want to replace every value in the 'Starting' column of df_2 with the matching 'Station' value from df_1. I did it with a for loop, but how can I perform this task in the simplest way?
df_1:
ID Station
0 1 Satose
1 2 Forlango
2 3 poterio
.
.
df_2:
Rail_Number Starting Ending
AABDD 3 44433
DLRAKA 1 45232
MiGOMu 2 18756
.
.
I have answered a similar question here:
Replace a value in a dataframe with a value from another dataframe
Step 1: Convert the two columns of df_1 into a dictionary:
d = dict(zip(df_1.ID, df_1.Station))
Step 2: Map the 'Starting' column of df_2 through this dictionary:
df_2['Starting'] = df_2['Starting'].map(d)
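Put together on the sample data from the question (truncated to the three rows shown), the two steps look like this:

```python
import pandas as pd

df_1 = pd.DataFrame({'ID': [1, 2, 3],
                     'Station': ['Satose', 'Forlango', 'poterio']})
df_2 = pd.DataFrame({'Rail_Number': ['AABDD', 'DLRAKA', 'MiGOMu'],
                     'Starting': [3, 1, 2],
                     'Ending': [44433, 45232, 18756]})

# build an ID -> Station lookup, then translate the Starting codes through it
d = dict(zip(df_1.ID, df_1.Station))
df_2['Starting'] = df_2['Starting'].map(d)
print(df_2)
```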

subset the dataframe into a new one using copy [duplicate]

This question already has answers here:
why should I make a copy of a data frame in pandas
(8 answers)
Closed 4 years ago.
I have a dataframe df
a b c
0 5 6 9
1 6 7 10
2 7 8 11
3 8 9 12
So if I want to select only col a and b and store it in another df I would use something like this
df1 = df[['a','b']]
But I have seen places where people write it this way
df1 = df[['a','b']].copy()
Can anyone let me know what .copy() does? The earlier code works just fine.
A plain assignment like df2 = df does not create a new DataFrame; it just binds a second name to the same object, so modifying the data in place through either name changes both:
df2 = df
df2.loc[0, 'a'] = 100
Here:
df.loc[0, 'a']
is also:
100
With .copy() the two frames are independent:
df2 = df.copy()
df2.loc[0, 'a'] = 100
and df is left unchanged. In your df1 = df[['a','b']] case, .copy() additionally avoids the SettingWithCopyWarning that pandas emits when you later assign into a frame that may be a view of another one.
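A short runnable demonstration of the difference, using the df from the question:

```python
import pandas as pd

df = pd.DataFrame({'a': [5, 6, 7, 8],
                   'b': [6, 7, 8, 9],
                   'c': [9, 10, 11, 12]})

alias = df           # second name for the SAME object
df_copy = df.copy()  # independent object with the same data

alias.loc[0, 'a'] = 100    # changes df as well
df_copy.loc[0, 'b'] = 999  # does not touch df

print(df.loc[0, 'a'])  # 100
print(df.loc[0, 'b'])  # 6
```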

group by two columns count in pandas [duplicate]

This question already has answers here:
Counting duplicate values in Pandas DataFrame
(3 answers)
Count unique values using pandas groupby [duplicate]
(5 answers)
Closed 4 years ago.
I have a Pandas DataFrame like this :
df = pd.DataFrame({
    'Date': ['2017-1-1', '2017-1-1', '2017-1-2', '2017-1-2', '2017-1-3'],
    'Groups': ['one', 'one', 'one', 'two', 'two']})
Date Groups
0 2017-1-1 one
1 2017-1-1 one
2 2017-1-2 one
3 2017-1-2 two
4 2017-1-3 two
How can I generate a new DataFrame like this?
Date Groups_counts
0 2017-1-1 1
1 2017-1-2 2
2 2017-1-3 1
Thanks a lot!
To get count of unique records use:
df.groupby('Date')['Groups'].nunique()
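That groupby returns a Series indexed by Date; to reproduce the exact frame asked for, with a Groups_counts column, the result can be reset back into a DataFrame (the reset_index step is an addition beyond the one-liner above):

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2017-1-1', '2017-1-1', '2017-1-2', '2017-1-2', '2017-1-3'],
    'Groups': ['one', 'one', 'one', 'two', 'two']})

# count distinct Groups per Date, then turn the Series into a DataFrame
out = (df.groupby('Date')['Groups']
         .nunique()
         .reset_index(name='Groups_counts'))
print(out)
```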

removing rows in pandas based on a column's values [duplicate]

This question already has answers here:
Use a list of values to select rows from a Pandas dataframe
(8 answers)
Filter dataframe rows if value in column is in a set list of values [duplicate]
(7 answers)
Closed 5 years ago.
This is a subset of a dataframe:
index drug_id values
1 le.1 f
2 le.7 h
3 le.10 9
4 le.11 10
5 le.15 S
I want to remove the rows whose values in the drug_id column are: le.7, le.10, le.11.
This is my code:
df.drop(df.drug_id[['le.7', 'le.10', 'le.11']], inplace = True )
I also tried this:
df.drop(df.drug_id == ['le.7', 'le.10', 'le.11'], inplace = True )
But neither of them worked. Any suggestions?
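No answer is quoted in this thread, but the approach the linked duplicates describe is to filter with a boolean mask built from isin(), negated with ~ to keep everything not in the list. A sketch on the sample data:

```python
import pandas as pd

df = pd.DataFrame({'drug_id': ['le.1', 'le.7', 'le.10', 'le.11', 'le.15'],
                   'values': ['f', 'h', '9', '10', 'S']},
                  index=[1, 2, 3, 4, 5])

# keep only the rows whose drug_id is NOT in the unwanted list
df = df[~df['drug_id'].isin(['le.7', 'le.10', 'le.11'])]
print(df)
```

df.drop() expects index labels, which is why passing column values to it failed; filtering on a mask sidesteps that entirely.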
