accessing a pandas DataFrame cell

I'm having a pandas issue.
I have a dataframe that looks like the following:
       A      B      C      D
0    max   kate  marie   John
1   kate  marie    max   john
2   john    max   kate  marie
3  marie   john   kate    max
And I need to access, for instance, the cell in row 0, column D.
I tried using df.iloc[0, 3] but it returns the whole D column.
Any help would be appreciated.

You could use
df.iloc[0]['D']
or
df.loc[0,'D']
to get the value at that location. For reference, see the DataFrame.iloc documentation.

df.iloc[0]["D"]
seems to do the trick

df.iloc[0, 3] works fine, provided your DataFrame is named df:
df.iloc[0,3]
Out[15]: 'John'
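The access patterns from the answers above can be compared side by side. This is a minimal sketch that rebuilds the question's frame by hand. It also shows .at and .iat, which the answers do not mention but which pandas provides for reading a single cell.

```python
import pandas as pd

# Rebuild the frame from the question (default integer index 0..3).
df = pd.DataFrame({
    "A": ["max", "kate", "john", "marie"],
    "B": ["kate", "marie", "max", "john"],
    "C": ["marie", "max", "kate", "kate"],
    "D": ["John", "john", "marie", "max"],
})

by_position = df.iloc[0, 3]     # row position 0, column position 3
by_label = df.loc[0, "D"]       # row label 0, column label "D"
chained = df.iloc[0]["D"]       # first take row 0 as a Series, then pick "D"
fast_label = df.at[0, "D"]      # scalar-only access by label
fast_position = df.iat[0, 3]    # scalar-only access by position

print(by_position, by_label, chained, fast_label, fast_position)
```

All five expressions read the same cell. The chained form is fine for reading, but a single .loc or .at call is usually the safer choice when assigning a value.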

You can refer to this for your solution
# Import pandas package
import pandas as pd
# Define a dictionary containing employee data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Age': [27, 24, 22, 32],
'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
print(df)
This prints the table. If you want the value in row 0 of the "Name" column:
# syntax: dataframe.iloc[row_index]['ColumnName']
print(df.iloc[0]['Name'])

Related

How can I add/merge values from one existing column to another column - Python - Pandas - Jupyter Notebook

Good Morning,
This is my code
data = {'Names_Males_GroupA': ['Robert', 'Andrew', 'Gordon', 'Steve'], 'Names_Females_GroupA': ['Brenda', 'Sandra', 'Karen', 'Megan'], 'Name_Males_GroupA': ['David', 'Patricio', 'Noe', 'Daniel']}
df = pd.DataFrame(data)
df
The column Name_Males_GroupA is misspelled (it is missing an 's'),
so I need to move all of its values to the correct column, Names_Males_GroupA.
In other words, I want to add the names David, Patricio, Noe and Daniel below the names Robert, Andrew, Gordon and Steve.
After that I can delete the wrong column.
Thank you.

If I understand you correctly, you can try
df = pd.concat([df.iloc[:, :2], df.iloc[:, 2].to_frame('Names_Males_GroupA')], ignore_index=True)
print(df)
  Names_Males_GroupA Names_Females_GroupA
0             Robert               Brenda
1             Andrew               Sandra
2             Gordon                Karen
3              Steve                Megan
4              David                  NaN
5           Patricio                  NaN
6                Noe                  NaN
7             Daniel                  NaN

I would break the columns apart and put them back together with pd.concat:
data = {'Names_Males_GroupA': ['Robert', 'Andrew', 'Gordon', 'Steve'], 'Names_Females_GroupA': ['Brenda', 'Sandra', 'Karen', 'Megan'], 'Name_Males_GroupA': ['David', 'Patricio', 'Noe', 'Daniel']}
df = pd.DataFrame(data)
df1 = df[['Name_Males_GroupA', 'Names_Females_GroupA']]
df1.columns = ['Names_Males_GroupA', 'Names_Females_GroupA']
df = df[['Names_Males_GroupA', 'Names_Females_GroupA']]
pd.concat([df, df1])
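The two answers above do not produce the same frame, which is worth checking before picking one: the first leaves the female column empty for the appended rows, while the second repeats the female names. The sketch below runs both on the question's data. The names stacked, extra and repeated are only illustrative, and ignore_index=True on the second concat is an addition made here so that both results share a 0..7 index.

```python
import pandas as pd

data = {
    "Names_Males_GroupA": ["Robert", "Andrew", "Gordon", "Steve"],
    "Names_Females_GroupA": ["Brenda", "Sandra", "Karen", "Megan"],
    "Name_Males_GroupA": ["David", "Patricio", "Noe", "Daniel"],
}
df = pd.DataFrame(data)

# First answer: append the misspelled column under the correct name.
stacked = pd.concat(
    [df.iloc[:, :2], df.iloc[:, 2].to_frame("Names_Males_GroupA")],
    ignore_index=True,
)

# Second answer: pair the misspelled column with the female names, rename, append.
extra = df[["Name_Males_GroupA", "Names_Females_GroupA"]].copy()
extra.columns = ["Names_Males_GroupA", "Names_Females_GroupA"]
repeated = pd.concat(
    [df[["Names_Males_GroupA", "Names_Females_GroupA"]], extra],
    ignore_index=True,
)

# Count the missing female names in the first result.
print(stacked["Names_Females_GroupA"].isna().sum())
# List the female names in the second result.
print(repeated["Names_Females_GroupA"].tolist())
```

Both results hold the eight male names in one column; which treatment of the female column is right depends on what the rows are meant to represent.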

How can I create a column in an existing df from dictionary and using conditions?

I have the following df (just for example):
data={'Name': ['Tom', 'Joseph', 'Krish', 'John']}
df=pd.DataFrame(data)
print(df)
city={"New York": "123",
"LA":"456",
"Miami":"789"}
Output:
Name
0 Tom
1 Joseph
2 Krish
3 John
I would like to add another column to the df which will be based on the city dictionary.
I would like to do it by the following conditions:
If the Name is Tom or Krish then they should get 123 (New York).
If the Name is John then he should get 456 (LA).
If the Name is Joseph then he should get 789 (Miami).
Thanks in advance :)

Try loc with boolean masking (John gets LA and Joseph gets Miami, as the question asks):
df.loc[df['Name'].isin(['Tom', 'Krish']), 'City'] = 'New York'
df.loc[df['Name'].eq('John'), 'City'] = 'LA'
df.loc[df['Name'].eq('Joseph'), 'City'] = 'Miami'
Finally:
df['Value'] = df['City'].map(city)
# you can also use replace() in place of map()
Or use numpy.select:
import numpy as np
cond = [
    df['Name'].isin(['Tom', 'Krish']),
    df['Name'].eq('John'),
    df['Name'].eq('Joseph'),
]
# default=None keeps unmatched names empty instead of mixing in the numeric default 0
df['City'] = np.select(cond, ['New York', 'LA', 'Miami'], default=None)
df['Value'] = df['City'].map(city)
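When every name maps to exactly one city, the conditions can also be written as a plain lookup table and applied with Series.map twice. This is only a sketch of that idea: name_to_city is a hypothetical dictionary introduced here for illustration, filled in from the conditions stated in the question.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Tom", "Joseph", "Krish", "John"]})
city = {"New York": "123", "LA": "456", "Miami": "789"}

# Hypothetical lookup table built from the question's conditions.
name_to_city = {
    "Tom": "New York",
    "Krish": "New York",
    "John": "LA",
    "Joseph": "Miami",
}

df["City"] = df["Name"].map(name_to_city)   # name -> city
df["Value"] = df["City"].map(city)          # city -> code
print(df)
```

Names missing from name_to_city would end up as NaN in both new columns, which makes gaps in the lookup table easy to spot.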

Turning repeated row labels into column headers in pandas

I have a questionnaire in this format
import pandas as pd
df = pd.DataFrame({'Question': ['Name', 'Age', 'Income','Name', 'Age', 'Income'],
'Answer': ['Bob', 50, 42000, 'Michelle', 42, 62000]})
As you can see the same 'Question' appears repeatedly, and I need to reformat this so that the result is as follows
df2 = pd.DataFrame({'Name': ['Bob', 'Michelle'],
'Age': [ 50, 42],
'Income': [42000,62000]})

Use numpy.reshape:
print (pd.DataFrame(df["Answer"].to_numpy().reshape((2,-1)), columns=df["Question"][:3]))
Or transpose and pd.concat:
s = df.set_index("Question").T
print (pd.concat([s.iloc[:, n:n+3] for n in range(0, len(s.columns), 3)]).reset_index(drop=True))
Both yield the same result:
Question Name Age Income
0 Bob 50 42000
1 Michelle 42 62000
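The reshape approach above hard-codes the number of questions. A possible variation derives it from the data instead. This is a sketch that assumes every respondent answers the same questions in the same order with none missing; questions, n_questions and wide are names chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Question": ["Name", "Age", "Income", "Name", "Age", "Income"],
    "Answer": ["Bob", 50, 42000, "Michelle", 42, 62000],
})

# Distinct questions in order of first appearance.
questions = list(df["Question"].drop_duplicates())
n_questions = len(questions)

# One row per respondent, one column per question.
wide = pd.DataFrame(
    df["Answer"].to_numpy().reshape(-1, n_questions),
    columns=questions,
)
print(wide)
```

If a respondent skipped a question, the reshape would either fail or silently misalign the rows, so the groupby and cumcount approach is the more defensive option for messy data.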

You can create a new column, group, with .assign, using .groupby and .cumcount (Bob ends up in the first group and Michelle in the second, with the groups determined by the repetition of Name, Age, and Income).
Then .pivot the dataframe with group as the index.
code:
df3 = (df.assign(group=df.groupby('Question').cumcount())
.pivot(index='group', values='Answer', columns='Question')
.reset_index(drop=True)[['Name','Age','Income']]) #[['Name','Age','Income']] at the end reorders the columns.
df3
Out[76]:
Question Name Age Income
0 Bob 50 42000
1 Michelle 42 62000

Here is a solution. It assumes that every respondent answers the same number of questions (here 3: Name, Age and Income, for both Bob and Michelle):
import pandas as pd
df = pd.DataFrame({'Question': ['Name', 'Age', 'Income','Name', 'Age', 'Income'],
'Answer': ['Bob', 50, 42000, 'Michelle', 42, 62000]})
df=df.set_index("Question")
pd.concat([df.iloc[i:i+3,:].transpose() for i in range(0,len(df),3)],axis=0).reset_index(drop=True)

Having trouble with pandas

import pandas as pd
stack = pd.DataFrame(['adam',25,28,'steve',25,28,'emily',18,21])
print(stack[0].to_list()[0::2])
print(stack[0].to_list()[1::2])
df = pd.DataFrame(
{'Name': stack[0].to_list()[0::3],
'Age': stack[0].to_list()[1::3],
'New Age': stack[0].to_list()[2::3] }
)
print(df)
How do I separate adam and steve into different rows?
I want the values to line up as a table, with one row per person.

You can convert the column to a list and slice it with [0::2] and [1::2]:
import pandas as pd
data = pd.DataFrame(['adam',22,'steve',25,'emily',18])
print(data)
#print(data[0].to_list()[0::2])
#print(data[0].to_list()[1::2])
df = pd.DataFrame({
'Name': data[0].to_list()[0::2],
'Age': data[0].to_list()[1::2],
})
print(df)
Before (as in the original image, which was removed from the question):
0
0 adam
1 22
2 steve
3 25
4 emily
5 18
After:
Name Age
0 adam 22
1 steve 25
2 emily 18
EDIT: the same works with a plain list:
import pandas as pd
data = ['adam',22,'steve',25,'emily',18]
print(data)
df = pd.DataFrame({
'Name': data[0::2],
'Age': data[1::2],
})
print(df)
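The slicing idea extends to the three-field data in the question (name, age, new age): the step is the number of fields, and each column starts at its own offset. A minimal sketch, assuming the flat list always repeats the fields in the same order; flat, columns and step are illustrative names.

```python
import pandas as pd

flat = ['adam', 25, 28, 'steve', 25, 28, 'emily', 18, 21]
columns = ['Name', 'Age', 'New Age']
step = len(columns)

# Column i takes every step-th item, starting at offset i.
df = pd.DataFrame({name: flat[i::step] for i, name in enumerate(columns)})
print(df)
```

The asker's own print statements use a step of 2 on data with 3 fields per person, which is why the preview values look misaligned; the step has to match the number of fields.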

These two lines should do it. However, without knowing what code you have, what you're trying to accomplish, or what else you intend to do with it, the following code is only valid in this situation.
d = {'Name': ['adam', 'steve', 'emily'], 'Age': [22, 25, 18]}
df = pd.DataFrame(d)

How to replace column values based on a list?

I have a list like this:
x = ['Las Vegas', 'San Francisco', 'Dallas']
And a dataframe that looks a bit like this:
import pandas as pd
data = [['Las Vegas (Clark County)', 25], ['New York', 23],
        ['Dallas', 27]]
df = pd.DataFrame(data, columns = ['City', 'Value'])
I want to replace city values in the DataFrame such as "Las Vegas (Clark County)" with "Las Vegas". My dataframe contains multiple cities with variant names that need to be changed. I know I could use a regular expression to strip off the part in parentheses, but I was wondering if there is a more clever, generic way.

Use Series.str.extract with the list values joined by | (regex OR), then restore the original value for non-matching rows with Series.fillna:
df['City'] = df['City'].str.extract(f'({"|".join(x)})', expand=False).fillna(df['City'])
print (df)
City Value
0 Las Vegas 25
1 New York 23
2 Dallas 27
Another idea is to use Series.str.contains in a loop, but it is likely to be slow for a large DataFrame and many values in the list:
for val in x:
    df.loc[df['City'].str.contains(val), 'City'] = val
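Both approaches above treat the list entries as regular expressions. If a city name could contain regex metacharacters such as parentheses or dots, escaping each entry first avoids surprises. This is a sketch under that assumption; the use of re.escape and the name pattern are additions made here, not part of the original answer.

```python
import re

import pandas as pd

x = ['Las Vegas', 'San Francisco', 'Dallas']
data = [['Las Vegas (Clark County)', 25], ['New York', 23], ['Dallas', 27]]
df = pd.DataFrame(data, columns=['City', 'Value'])

# Escape each name so it is matched literally, then join with | for regex OR.
pattern = "(" + "|".join(re.escape(name) for name in x) + ")"

# Rows without a match yield NaN from extract; fillna restores their original value.
df['City'] = df['City'].str.extract(pattern, expand=False).fillna(df['City'])
print(df)
```

For the loop variant, passing regex=False to Series.str.contains has a similar effect, since the value is then matched as a literal substring.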
