Convert Dictionary of List of Tuples into Pandas Dataframe [duplicate] - python

I am a beginner in Python.
I know similar questions have been asked, and I have spent the past 2 hours reading through the answers, but I can't get my code to work. I would appreciate advice on where I might have gone wrong.
I have a dictionary like this:
data = {"Tom": [("Math", 98),
                ("English", 75)],
        "Betty": [("Science", 42),
                  ("Humanities", 15)]}
What is the most efficient way to convert to the following Pandas Dataframe?
Tom Math 98
Tom English 75
Betty Science 42
Betty Humanities 15
I have tried the following, which raises TypeError: cannot unpack non-iterable int object:
df = pd.DataFrame(columns=['Name', 'Subject', 'Score'])
i = 0
for name in enumerate(data):
    for subject, score in name:
        df.loc[i] = [name, subject, score]
        i += 1
Thanks a million!

You can loop over the dictionary and build a list of lists that Pandas can consume.
d = {'Tom': [('Math', 98),
             ('English', 75)],
     'Betty': [('Science', 42),
               ('Humanities', 15)]}
data = [[k, *v] for k, lst in d.items() for v in lst]
df = pd.DataFrame(data, columns=['Name', 'Subject', 'Score'])
Name Subject Score
0 Tom Math 98
1 Tom English 75
2 Betty Science 42
3 Betty Humanities 15
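As for the TypeError in the question: enumerate(data) yields (index, key) pairs such as (0, 'Tom'), so the inner loop iterates over that pair and tries to unpack the integer 0 into subject and score. Below is a minimal corrected sketch of the same loop, assuming data is the dictionary from the question. It iterates over data.items() and collects the rows in a list before building the frame once; the name rows is introduced here only for illustration.

```python
import pandas as pd

data = {"Tom": [("Math", 98), ("English", 75)],
        "Betty": [("Science", 42), ("Humanities", 15)]}

rows = []  # hypothetical helper list, not part of the original question
for name, pairs in data.items():    # name is the key, pairs is its list of tuples
    for subject, score in pairs:    # each tuple unpacks into exactly two values
        rows.append([name, subject, score])

df = pd.DataFrame(rows, columns=["Name", "Subject", "Score"])
print(df)
```

Building the frame once from a list tends to be cheaper than growing it row by row with df.loc[i], which is why the sketch avoids the original assignment pattern.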

You can also use melt (here data is the dictionary from the question):
df = pd.DataFrame(data).melt(var_name = "Name", value_name = "Data")
new_df = pd.DataFrame(df["Data"].tolist(), columns = ["Subject", "Marks"])
new_df.insert(loc = 0, column = "Name", value = df["Name"])
Output (melt stacks the columns one after another, so all rows for Tom come before the rows for Betty):
    Name     Subject  Marks
0    Tom        Math     98
1    Tom     English     75
2  Betty     Science     42
3  Betty  Humanities     15

Related

Check if string starts with a list of values & doesn't contain a certain value [duplicate]

I have a dataframe:
df = pd.DataFrame([['Jim', 'CF'], ['Toby', 'RW'], ['Joe', 'RWF'], ['Tom', 'RAMF'], ['John', 'RWB']], columns=['Name', 'Position'])
I want a subset of this dataframe that contains only the rows where Position is 'RW', 'RWF', or 'RAMF', and I need to do this in one line of code.
I can currently do this in two lines:
RW = df[df['Position'].str.startswith(('RW', 'RAMF', 'RWF'), na = False)]
RW = RW[RW['Position'].str.contains('RWB')==False]
The issue is that subjects with position 'RWB' show up when subsetting by str.startswith('RW'). Therefore, I have to specify in the second line to remove these 'RWB'.
Is it possible to do this in one line of code?
To test whether the strings start with one of the values, use:
RW = df[df['Position'].str.match('RW|RAMF|RWF', na=False) &
        ~df['Position'].str.contains('RWB', na=False)]
print (RW)
Name Position
1 Toby RW
2 Joe RWF
3 Tom RAMF
Or:
RW = df[df['Position'].str.startswith(('RW', 'RAMF', 'RWF'), na=False) &
        ~df['Position'].str.contains('RWB', na=False)]
print (RW)
Name Position
1 Toby RW
2 Joe RWF
3 Tom RAMF
To test whether the column values exactly match one of the values in the tuple, use:
RW = df[df['Position'].isin(('RW', 'RAMF', 'RWF'))]
print (RW)
Name Position
1 Toby RW
2 Joe RWF
3 Tom RAMF
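The difference between the prefix test and isin can be checked on the frame from the question. This is only a small sketch: the names wanted, by_prefix and by_value are made up for illustration, and passing a tuple of prefixes to str.startswith (as the question does) assumes a pandas version that accepts it, which as far as I know means 1.5 or later.

```python
import pandas as pd

df = pd.DataFrame([['Jim', 'CF'], ['Toby', 'RW'], ['Joe', 'RWF'],
                   ['Tom', 'RAMF'], ['John', 'RWB']],
                  columns=['Name', 'Position'])

wanted = ('RW', 'RAMF', 'RWF')  # hypothetical name for the allowed positions

# Prefix test: 'RWB' also starts with 'RW', so it must be excluded explicitly.
by_prefix = df[df['Position'].str.startswith(wanted, na=False)
               & ~df['Position'].str.contains('RWB', na=False)]

# Exact-value test: 'RWB' is not in the tuple, so no extra condition is needed.
by_value = df[df['Position'].isin(wanted)]

print(by_prefix.equals(by_value))
```

When the goal is an exact match on a fixed set of values, as in the question, isin is the shorter of the two and needs no special case for 'RWB'.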

Conversion of nested dictionary into data frame in Python

My list/dictionary is nested with lists for different items in it like this:
scores = [{"Student":"Adam","Subjects":[{"Name":"Math","Score":85},{"Name":"Science","Score":90}]},
{"Student":"Bec","Subjects":[{"Name":"Math","Score":70},{"Name":"English","Score":100}]}]
If I use pd.DataFrame directly on the dictionary I get:
What should I do to get a data frame that looks like this?
Student Subject.Name Subject.Score
Adam Math 85
Adam Science 90
Bec Math 70
Bec English 100
Thanks very much
Use json_normalize with rename:
df = (pd.json_normalize(scores, 'Subjects', 'Student')
        .rename(columns={'Name': 'Subject.Name', 'Score': 'Subject.Score'}))
print (df)
Subject.Name Subject.Score Student
0 Math 85 Adam
1 Science 90 Adam
2 Math 70 Bec
3 English 100 Bec
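Or use a list comprehension with a dict comprehension and the DataFrame constructor (note that x.pop('Subjects') removes the 'Subjects' key from the original dictionaries in scores):
df = (pd.DataFrame([{**x, **{f'Subject.{k}': v for k, v in y.items()}}
                    for x in scores for y in x.pop('Subjects')]))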
print (df)
Student Subject.Name Subject.Score
0 Adam Math 85
1 Adam Science 90
2 Bec Math 70
3 Bec English 100
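If the separate rename step feels redundant, json_normalize also accepts a record_prefix argument that is prepended to the names of the record columns. The following is a sketch under that assumption, with a final column selection added so the order matches the layout requested in the question.

```python
import pandas as pd

scores = [
    {"Student": "Adam", "Subjects": [{"Name": "Math", "Score": 85},
                                     {"Name": "Science", "Score": 90}]},
    {"Student": "Bec", "Subjects": [{"Name": "Math", "Score": 70},
                                    {"Name": "English", "Score": 100}]},
]

# One row per entry in "Subjects"; "Student" is carried along as metadata.
df = pd.json_normalize(scores, record_path="Subjects", meta="Student",
                       record_prefix="Subject.")

# Put the metadata column first, as in the desired output.
df = df[["Student", "Subject.Name", "Subject.Score"]]
print(df)
```

Unlike the comprehension with x.pop('Subjects'), this leaves the scores list unchanged.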

How do I increase the values in a column in Pandas?

Hello, I have the Pandas code below, but it gives me this error: TypeError: can only concatenate str (not "int") to str
import pandas as pd
import numpy as np
import os
_data0 = pd.read_excel("C:\\Users\\HP\\Documents\\DataScience task\\Gender_Age.xlsx")
_data0['Age' + 1]
I want to change the values in the 'Age' column, for example to increase every value in 'Age' by 1. How do I do that? (The same goes for Number of Children.)
The output I wanted:
First Name Last Name Age Number of Children
0 Kimberly Watson 36 2
1 Victor Wilson 35 6
2 Adrian Elliott 35 2
3 Richard Bailey 36 5
4 Blake Roberts 35 6
Original output:
First Name Last Name Age Number of Children
0 Kimberly Watson 24 1
1 Victor Wilson 23 5
2 Adrian Elliott 23 1
3 Richard Bailey 24 4
4 Blake Roberts 23 5
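Try the following, where df is your DataFrame (_data0 in the question). The desired output shows Age increased by 12 and Number of Children increased by 1:
df['Age'] = df['Age'] + 12
df['Number of Children'] = df['Number of Children'] + 1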
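The TypeError in the question comes from 'Age' + 1, which tries to add an integer to the string 'Age' before any column lookup happens; the arithmetic has to be applied to the selected column instead. Here is a self-contained sketch that rebuilds the sample rows by hand rather than reading the Excel file, and uses the offsets implied by the desired output (12 for Age, 1 for Number of Children).

```python
import pandas as pd

# Sample rows copied from the "Original output" table in the question.
df = pd.DataFrame({
    "First Name": ["Kimberly", "Victor", "Adrian", "Richard", "Blake"],
    "Last Name": ["Watson", "Wilson", "Elliott", "Bailey", "Roberts"],
    "Age": [24, 23, 23, 24, 23],
    "Number of Children": [1, 5, 1, 4, 5],
})

# Select the column first, then apply the arithmetic to every element.
df["Age"] += 12
df["Number of Children"] += 1
print(df)
```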

How to split comma separated cell values in a list in pandas?

I hope you are doing well!
Code:
df = pd.read_excel('Grade.xlsx')
df = df.loc[df['Current Year'] == "Final Year"]
Num = df['Available Number'].values
Sample data:
Sr no. Name Current Year City Available Number
1 joe First Year NY 125,869,589,852
2 mike Final Year MI 586
3 Ross Final Year NY 589,639,741
4 juli Second Year NY 869,253
Now my code copies the values "586" (row 2) and "589,639,741" (row 3) into a variable. But I want to convert those values into lists of integers so that I can iterate over them in a for loop later.
I want something like this:
list1 = [586]
list2 = [589,639,741]
I don't know how to separate those values and convert them into lists.
Can anyone here help me? I started learning pandas recently. Thanks in advance.
Use apply with a list comprehension to convert each string into a list of int:
df['Available Number'] = df['Available Number'] \
    .apply(lambda x: [int(i) for i in x.split(',')])
Output:
Sr no. Name Current Year City Available Number
0 1 joe First Year NY [125, 869, 589, 852]
1 2 mike Final Year MI [586]
2 3 Ross Final Year NY [589, 639, 741]
3 4 juli Second Year NY [869, 253]
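To get back to the original goal (lists for the "Final Year" rows only, ready for a for loop), the conversion can be combined with the filter. The sketch below rebuilds the sample data by hand instead of reading Grade.xlsx. It casts each cell to str first, on the assumption that Excel may deliver a single number such as 586 as an integer rather than a string; to_int_list is a hypothetical helper name.

```python
import pandas as pd

df = pd.DataFrame({
    "Sr no.": [1, 2, 3, 4],
    "Name": ["joe", "mike", "Ross", "juli"],
    "Current Year": ["First Year", "Final Year", "Final Year", "Second Year"],
    "City": ["NY", "MI", "NY", "NY"],
    # 586 is stored as an int here to mimic a numeric Excel cell (assumption).
    "Available Number": ["125,869,589,852", 586, "589,639,741", "869,253"],
})

def to_int_list(cell):
    """Split a comma-separated cell into a list of ints (hypothetical helper)."""
    return [int(part) for part in str(cell).split(",")]

# Keep only the "Final Year" rows, then convert each cell to a list of ints.
final_year = df.loc[df["Current Year"] == "Final Year", "Available Number"]
lists = final_year.apply(to_int_list).tolist()

for numbers in lists:
    print(numbers)
```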

Creating a new column in a specific place in Pandas [duplicate]

I would like to create a new column in Python and place in a specific position. For instance, let "example" be the following dataframe:
import pandas as pd
example = pd.DataFrame({
    'name': ['alice', 'bob', 'charlie'],
    'age': [25, 26, 27],
    'job': ['manager', 'clerk', 'policeman'],
    'gender': ['female', 'male', 'male'],
    'nationality': ['french', 'german', 'american']
})
I would like to create a new column containing twice the values of the column "age":
example['age_times_two']= example['age'] *2
Yet, this code creates a column at the end of the dataframe. I would like to place it as the third column, or, in other words, the column right next to the column "age". How could this be done:
a) By setting an absolute place to the new column (e.g. third position)?
b) By setting a relative place for the new column (e.g. right to the column "age")?
You can use df.insert here.
example.insert(2,'age_times_two',example['age']*2)
example
name age age_times_two job gender nationality
0 alice 25 50 manager female french
1 bob 26 52 clerk male german
2 charlie 27 54 policeman male american
This is a somewhat manual way of doing it:
example['age_times_two']= example['age'] *2
cols = list(example.columns.values)
cols
This gives you a list of all the columns, which you can rearrange manually and use in the code below:
example = example[['name', 'age', 'age_times_two', 'job', 'gender', 'nationality']]
Another way to do it:
example['age_times_two']= example['age'] *2
cols = example.columns.tolist()
cols = cols[:2]+cols[-1:]+cols[2:-1]
example = example[cols]
print(example)
name age age_times_two job gender nationality
0 alice 25 50 manager female french
1 bob 26 52 clerk male german
2 charlie 27 54 policeman male american
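For part (b) of the question, a position relative to an existing column, one option is to look up that column's index with columns.get_loc and pass the next index to insert. This is a minimal sketch, assuming the column labels are unique (get_loc returns a plain integer only in that case); the name position is introduced here for illustration.

```python
import pandas as pd

example = pd.DataFrame({
    'name': ['alice', 'bob', 'charlie'],
    'age': [25, 26, 27],
    'job': ['manager', 'clerk', 'policeman'],
    'gender': ['female', 'male', 'male'],
    'nationality': ['french', 'german', 'american'],
})

# Index of the column directly to the right of "age".
position = example.columns.get_loc('age') + 1

# insert modifies the frame in place and returns None.
example.insert(position, 'age_times_two', example['age'] * 2)
print(example.columns.tolist())
```

Because the position is computed from the column name, the new column still lands next to "age" if other columns are added or removed in front of it.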
