I have a list of nested dictionaries in python.
I tried to convert it into a dataframe using:
data=pd.DataFrame(list_of_dicts)
This converts most of the dicts into columns. However, the first column still contains another list of dicts. The data looks like this:
FIS mid LI DE PBT
4182 L234 L3133 2020-02-13T09:50:53Z
The FIS column still contains dictionaries. The first row of FIS looks like this:
[{'FI': [{'TMC': {'PC': 6671, 'DE': 'Pohlheim-Dorf-Güll', 'QD': '+', 'LE': 0.04984}, 'SHP': [], 'CF': [{'TY': 'TR', 'SP': 30.0, 'SU': 30.0, 'FF': 30.0, 'JF': 0.0, 'CN': 0.7}]}
I tried to apply the method described above again to the FIS column, but this doesn't write the dicts into new columns.
So my question is: How can I convert this list of dicts to a dataframe?
I extracted the data from the HERE API (https://developer.here.com/documentation/traffic/dev_guide/topics/examples.html).
Thank you in advance!
This works:
list_of_dicts = [{"qwerty":[1,2,3]}, {"lol":["l","o","l"]}]
df = pd.concat([pd.DataFrame(e) for e in list_of_dicts], axis=1)
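For the nested FIS cell from the question, one possible approach is pd.json_normalize. The following is only a sketch: the closing brackets of the sample record are assumed (the sample in the question is cut off), and the names fis_cell, fi_records and flat are made up for illustration.

```python
import pandas as pd

# One cell of the FIS column; the closing brackets are assumed, since the
# sample in the question is truncated.
fis_cell = [{"FI": [{
    "TMC": {"PC": 6671, "DE": "Pohlheim-Dorf-Güll", "QD": "+", "LE": 0.04984},
    "SHP": [],
    "CF": [{"TY": "TR", "SP": 30.0, "SU": 30.0, "FF": 30.0, "JF": 0.0, "CN": 0.7}],
}]}]

# Collect the inner 'FI' records into one flat list of dicts.
fi_records = [fi for entry in fis_cell for fi in entry["FI"]]

# One output row per 'CF' entry, with the 'TMC' fields attached as metadata.
flat = pd.json_normalize(
    fi_records,
    record_path="CF",
    meta=[["TMC", "PC"], ["TMC", "DE"], ["TMC", "QD"], ["TMC", "LE"]],
)
print(flat)
```

The metadata columns are named with a dot separator (for example TMC.PC). To process the whole column, the same steps could be applied to each cell and the results combined with pd.concat.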
{'labels': ['travel', 'dancing', 'cooking'],
'scores': [0.9938651323318481, 0.0032737774308770895, 0.002861034357920289],
'sequence': 'one day I will see the world'}
I have this in a df['prediction'] column. I want to split the result into three separate columns, df['travel'], df['dancing'] and df['cooking'], holding their respective scores. I am sorry if the question is not appropriate.
You can restructure your data as a list of dicts, where each dict holds the data for one row.
At the end, you can use set_index to choose which column becomes the index:
import pandas as pd
list_t = [{
"travel":0.9938651323318481,
"dancing": 0.0032737774308770895,
"cooking":0.002861034357920289,
"sequence":'one day I will see the world'
}]
df = pd.DataFrame(list_t)
df.set_index("sequence")
#output
travel dancing cooking
sequence
one day I will see the world 0.993865 0.003274 0.002861
What you can do is iterate over this dict and build another dictionary.
Say s is the source dictionary and x is the new dictionary that you want:
x = {}
x['sequence']=s['sequence']
for i, l in enumerate(s['labels']):
    x[l] = s['scores'][i]
This should solve your problem.
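Applied to a whole df['prediction'] column, the same idea could look like the sketch below. The helper name prediction_to_row and the one-row sample frame are assumptions made for illustration; only the dictionary layout comes from the question.

```python
import pandas as pd

# Sample frame with one prediction dict per row (layout taken from the question).
prediction = {
    "labels": ["travel", "dancing", "cooking"],
    "scores": [0.9938651323318481, 0.0032737774308770895, 0.002861034357920289],
    "sequence": "one day I will see the world",
}
df = pd.DataFrame({"prediction": [prediction]})


def prediction_to_row(pred):
    """Pair each label with its score and keep the sequence text."""
    row = dict(zip(pred["labels"], pred["scores"]))
    row["sequence"] = pred["sequence"]
    return row


# Build one column per label, aligned with the original index.
scores = pd.DataFrame([prediction_to_row(p) for p in df["prediction"]], index=df.index)
result = df.join(scores)
print(result[["travel", "dancing", "cooking"]])
```

If different rows carry different labels, the missing entries come out as NaN.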
I am new to pandas, and I would appreciate any help. I have a pandas dataframe that comes from a csv file. The data contains 2 columns: dates and cashflows. Is it possible to convert these columns into a list of tuples? Here is how my dataset looks:
2021/07/15 4862.306832
2021/08/15 3474.465543
2021/09/15 7121.260118
The desired output is :
[(2021/07/15, 4862.306832),
(2021/08/15, 3474.465543),
(2021/09/15, 7121.260118)]
Use apply with a lambda function:
data = {
"date":["2021/07/15","2021/08/15","2021/09/15"],
"value":["4862.306832","3474.465543","7121.260118"]
}
df = pd.DataFrame(data)
# axis=1 applies the lambda to each row
listt = df.apply(lambda x: (x["date"], x["value"]), axis=1).tolist()
Output:
[('2021/07/15', '4862.306832'),
('2021/08/15', '3474.465543'),
('2021/09/15', '7121.260118')]
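As an alternative to apply, the two columns can be zipped directly, or the rows can be read with itertuples. This is a sketch that assumes the columns are called date and value, as in the answer above, and that the cashflows were parsed as floats when the csv was read.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2021/07/15", "2021/08/15", "2021/09/15"],
    "value": [4862.306832, 3474.465543, 7121.260118],
})

# Pair the two columns element by element.
pairs = list(zip(df["date"], df["value"]))

# Equivalent: plain tuples for every row, without the index.
pairs_alt = list(df.itertuples(index=False, name=None))

print(pairs)
```

Both variants avoid calling a Python function once per row through apply, which usually makes them faster on large frames.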
I would like to construct a MultiIndex DataFrame from a deeply-nested dictionary of the form
md = {'50': {'100': {'col1': ('0.100',
'0.200',
'0.300',
'0.400'),
'col2': ('6.263E-03',
'6.746E-03',
'7.266E-03',
'7.825E-03')},
'101': {'col1': ('0.100',
'0.200',
'0.300',
'0.400'),
'col2': ('6.510E-03',
'7.011E-03',
'7.553E-03',
'8.134E-03')}
'102': ...
}
'51': ...
}
I've tried
df = pd.DataFrame.from_dict({(i,j): md[i][j][v] for i in md.keys() for j in md[i].keys() for v in md[i][j]}, orient='index')
following Construct pandas DataFrame from items in nested dictionary, but I get a DataFrame with 1 row and many columns.
Bonus:
I'd also like to label the MultiIndex keys and the columns 'col1' and 'col2', as well as convert the strings to int and float, respectively.
How can I reconstruct my original dictionary from the dataframe?
I tried df.to_dict('list').
Check out this answer: https://stackoverflow.com/a/24988227/9404057. This method unpacks the keys and values of the dictionary and reshapes the data into a format that is easy to process into a MultiIndex dataframe. Note that if you are using Python 3, you will need to use .items() rather than .iteritems() as shown in the linked answer:
>>> import pandas as pd
>>> reform = {(firstKey, secondKey, thirdKey): values for firstKey, middleDict in md.items() for secondKey, innerdict in middleDict.items() for thirdKey, values in innerdict.items()}
>>> df = pd.DataFrame(reform)
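To change the labels col1 and col2 to an int and a float, you can then use pandas.DataFrame.rename() and specify any values you want. Note that this only renames the column labels; it does not convert the string values stored in the columns: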
df.rename({'col1':1, 'col2':2.5}, axis=1, level=2, inplace=True)
Also, if you'd rather have the levels on the index rather than the columns, you can also use pandas.DataFrame.T
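For the bonus part of the question (naming the levels, int keys and float values), one option is to convert while building the reformed dictionary. This is a sketch on a shortened copy of md; the level names outer, inner and field are invented for illustration.

```python
import pandas as pd

# Shortened copy of the nested dictionary from the question.
md = {"50": {"100": {"col1": ("0.100", "0.200", "0.300", "0.400"),
                     "col2": ("6.263E-03", "6.746E-03", "7.266E-03", "7.825E-03")},
             "101": {"col1": ("0.100", "0.200", "0.300", "0.400"),
                     "col2": ("6.510E-03", "7.011E-03", "7.553E-03", "8.134E-03")}}}

# Convert the two outer keys to int and every value to float while flattening.
reform = {
    (int(first), int(second), third): [float(v) for v in values]
    for first, middle in md.items()
    for second, inner in middle.items()
    for third, values in inner.items()
}
df = pd.DataFrame(reform)

# Label the three column levels (hypothetical names).
df.columns.names = ["outer", "inner", "field"]
print(df)
```

Calling df.T afterwards moves the three named levels onto the row index.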
If you wanted to reconstruct your dictionary from this MultiIndex, you could do something like this:
>>> md2 = {}
>>> for i in df.columns:
...     if i[0] not in md2.keys():
...         md2[i[0]] = {}
...     if i[1] not in md2[i[0]].keys():
...         md2[i[0]][i[1]] = {}
...     md2[i[0]][i[1]][i[2]] = tuple(df[i[0]][i[1]][i[2]].values)
I have a dataframe read from a csv file, like this:
print(test.loc[1])
outlook sunny
temperature mild
humidity normal
wind weak
playtennis yes
Name: 1, dtype: object
I want to convert this into something like -
outlook.sunny.temperature.mild.humidity.normal.wind.weak.playtennis.yes
How can I achieve this?
Let ser = test.loc[1].
You can convert this series to a dictionary with .to_dict(),
Then convert the dictionary into a list of key/value tuples with .items(),
Then merge the tuples into one list with itertools.chain, and finally
Join the list items with periods with .join().
Python code:
from itertools import chain
'.'.join(chain.from_iterable(ser.to_dict().items()))
#'outlook.sunny.temperature.mild.humidity.normal....yes'
This is one way using a list comprehension and str.join:
import pandas as pd
test = pd.DataFrame([['sunny', 'mild', 'normal', 'weak', 'yes']],
columns=['outlook', 'temperature', 'humidity', 'wind', 'playtennis'])
res = '.'.join([k+'.'+test[k].iloc[0] for k in test])
print(res)
'outlook.sunny.temperature.mild.humidity.normal.wind.weak.playtennis.yes'
Alternatively, you can zip column names and dataframe values:
res = '.'.join(i+'.'+j for i, j in zip(test, test.values[0]))
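The approaches above rely on every value being a string. If a row also holds numbers, a variant that formats each pair explicitly might be safer. This is a sketch on a made-up two-entry Series; the value 70 is invented to show the non-string case.

```python
import pandas as pd

# Hypothetical row with one non-string value.
ser = pd.Series({"outlook": "sunny", "humidity": 70})

# f-strings convert each value to text before joining.
res = ".".join(f"{key}.{value}" for key, value in ser.items())
print(res)
```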
I am new to pandas/python. I am reading a .xlsx file, and from it I created a bunch of dataframes, 16 to be precise, plus a master dataframe which is empty. Now I want to append all 16 dataframes to the master dataframe one by one, using a for loop.
One method I thought of is iterating through a list. But can df_1, df_2, etc. be stored in a list so that we can iterate over them?
Suppose I had csv files instead. Then:
df1 = pd.read_csv('---.csv')
df2 = pd.read_csv('---.csv')
then i create a list,
filenames = ['---.csv','---.csv']
create an empty master dataframe :
master_df= []
finally, loop through the list :
for f in filenames:
    master_df.append(pd.read_csv(f))
But this doesn't apply to my case. I need something similar, so how can I iterate over all the dataframes? Any solution would be appreciated.
FINALLY, this is my master_df :
master_df = pd.DataFrame({'Variable_Name': [], 'Value':[], 'Count': []})
and this is the 1st dataframe :
df_1 = pd.DataFrame({
'Variable_Name': ['Track', 'Track', 'Track', 'Track'],
'Value': ['Track 38','Track 39', 'Track 40', 'Track 37'],
'Count': [161, 160, 158, 152]})
Similarly 15 more are there.
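This is because DataFrame.append() returns a new dataframe, and that object has to be stored somewhere (this assumes master_df is a DataFrame, not a list).
Try:
for f in filenames:
    master_df = master_df.append(pd.read_csv(f))
Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, where pd.concat is the replacement.
More info on the append function: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.append.html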
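On recent pandas versions, the same result can be sketched with pd.concat by putting the dataframes in a list, which also answers the question of whether df_1, df_2, etc. can be stored in a list. The contents of df_2 below are invented; only df_1 comes from the question.

```python
import pandas as pd

df_1 = pd.DataFrame({
    "Variable_Name": ["Track", "Track", "Track", "Track"],
    "Value": ["Track 38", "Track 39", "Track 40", "Track 37"],
    "Count": [161, 160, 158, 152],
})
# Hypothetical second dataframe standing in for the remaining 15.
df_2 = pd.DataFrame({
    "Variable_Name": ["Lane", "Lane"],
    "Value": ["Lane 1", "Lane 2"],
    "Count": [10, 20],
})

# Dataframes are ordinary objects, so they can be collected in a list.
frames = [df_1, df_2]

# Concatenate once, renumbering the rows from 0.
master_df = pd.concat(frames, ignore_index=True)
print(master_df)
```

Concatenating once at the end is also cheaper than growing the master dataframe inside a loop, because every append or concat copies the accumulated data.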