I am new to Python, so any tips will be helpful :)
I have a pandas DataFrame with multiple columns, and I need to convert it to a list of objects. Of all the DataFrame's columns, I only want two (lat, lon) as attributes in each new object.
index  city    lat    lon
0      London  42.33  55.44
1      Rome    92.44  88.11
My new list of objects needs to look something like this:
[
{'lat': 42.33, 'lon': 55.44},
{'lat': 92.44, 'lon': 88.11}
]
More specifically I need this for Machine Learning with ML Studio.
Thanks!
Use pandas.DataFrame.to_dict(orient=...) to convert a DataFrame into a dictionary or a list of dictionaries. There are multiple orientations; for your case use orient='records'.
You also want to select only the lat and lon columns, like this:
df[['lat','lon']].to_dict(orient='records')
This will give you your result:
[{'lat': 42.33, 'lon': 55.44}, {'lat': 92.44, 'lon': 88.11}]
Here are some other orientations you could try out:
‘dict’ (default) : dict like {column -> {index -> value}}
‘list’ : dict like {column -> [values]}
‘series’ : dict like {column -> Series(values)}
‘split’ : dict like {‘index’ -> [index], ‘columns’ -> [columns], ‘data’ -> [values]}
‘records’ : list like [{column -> value}, … , {column -> value}]
‘index’ : dict like {index -> {column -> value}}
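To see how the orientations differ in practice, here is a small sketch that rebuilds the two-row frame from the question and converts it a few different ways. The variable names (coords, records, as_lists, by_index) are only illustrative.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "city": ["London", "Rome"],
    "lat": [42.33, 92.44],
    "lon": [55.44, 88.11],
})

# Keep only the two columns of interest.
coords = df[["lat", "lon"]]

records = coords.to_dict(orient="records")  # list of row dicts
as_lists = coords.to_dict(orient="list")    # column -> list of values
by_index = coords.to_dict(orient="index")   # index label -> row dict

print(records)
print(as_lists)
print(by_index)
```

Only orient='records' returns a list; the other two shown here return a dict keyed by column name or by index label.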
You can choose the columns you want and then use to_dict with orient='records' to get the required result
df[["lat", "lon"]].to_dict(orient='records')
{'labels': ['travel', 'dancing', 'cooking'],
'scores': [0.9938651323318481, 0.0032737774308770895, 0.002861034357920289],
'sequence': 'one day I will see the world'}
I have this in a df['prediction'] column. I want to split the result into three separate columns, df['travel'], df['dancing'], and df['cooking'], holding their respective scores. I am sorry if the question is not appropriate.
You can restructure your data as a list of dicts, where each dict holds the data for one row.
At the end, you can use set_index to choose the index column.
import pandas as pd
list_t = [{
    "travel": 0.9938651323318481,
    "dancing": 0.0032737774308770895,
    "cooking": 0.002861034357920289,
    "sequence": 'one day I will see the world'
}]
df = pd.DataFrame(list_t)
# set_index returns a new DataFrame, so assign the result
df = df.set_index("sequence")
#output
                                travel   dancing   cooking
sequence
one day I will see the world  0.993865  0.003274  0.002861
You can iterate over this dict and build another dictionary.
Say s is the source dictionary and x is the new dictionary that you want:
x = {}
x['sequence'] = s['sequence']
for i, l in enumerate(s['labels']):
    x[l] = s['scores'][i]
This should solve your problem.
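The loop above handles a single dictionary. As a rough sketch of how the same idea could be applied to a whole df['prediction'] column, assuming every cell holds a dict shaped like the one in the question (the frame below and the helper name flatten are made up for illustration):

```python
import pandas as pd

# Hypothetical frame: one prediction dict per row, shaped like the question's.
df = pd.DataFrame({
    "prediction": [
        {
            "labels": ["travel", "dancing", "cooking"],
            "scores": [0.9938651323318481, 0.0032737774308770895, 0.002861034357920289],
            "sequence": "one day I will see the world",
        },
    ]
})

def flatten(pred):
    """Turn one prediction dict into {'sequence': ..., label: score, ...}."""
    row = {"sequence": pred["sequence"]}
    row.update(zip(pred["labels"], pred["scores"]))  # pair each label with its score
    return row

# Build one column per label and align the rows with the original index.
scores = pd.DataFrame([flatten(p) for p in df["prediction"]], index=df.index)
result = df.join(scores)

print(result[["travel", "dancing", "cooking"]])
```

If different rows carry different labels, the missing label/row combinations would come out as NaN.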
I am new to pandas, and I would appreciate any help. I have a pandas DataFrame that comes from a CSV file. The data contains two columns: dates and cashflows. Is it possible to convert these columns into a list of tuples? Here is what my dataset looks like:
2021/07/15 4862.306832
2021/08/15 3474.465543
2021/09/15 7121.260118
The desired output is :
[(2021/07/15, 4862.306832),
(2021/08/15, 3474.465543),
(2021/09/15, 7121.260118)]
Use apply with a lambda function:
import pandas as pd
data = {
    "date": ["2021/07/15", "2021/08/15", "2021/09/15"],
    "value": ["4862.306832", "3474.465543", "7121.260118"]
}
df = pd.DataFrame(data)
# axis=1 applies the lambda to each row
listt = df.apply(lambda x: (x["date"], x["value"]), axis=1).tolist()
Output:
[('2021/07/15', '4862.306832'),
('2021/08/15', '3474.465543'),
('2021/09/15', '7121.260118')]
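As an alternative to a row-wise apply, the same list of tuples can be built with itertuples or with zip. This is only a sketch: the column names date and value follow the answer above, and the cashflows are stored as floats here, on the assumption that read_csv would parse them as numbers.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2021/07/15", "2021/08/15", "2021/09/15"],
    "value": [4862.306832, 3474.465543, 7121.260118],
})

# index=False drops the index; name=None yields plain tuples instead of namedtuples.
pairs = list(df[["date", "value"]].itertuples(index=False, name=None))

# Equivalent result by zipping the two columns directly.
pairs_zip = list(zip(df["date"], df["value"]))

print(pairs)
```

Both variants avoid calling a Python function once per row through apply, which tends to matter on larger frames.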
In python, would the following be considered a list or a dict?
temp = [{'lat': 39.7612992, 'lon': -86.1519681},
{'lat': 39.762241, 'lon': -86.158436 },
{'lat': 39.7622292, 'lon': -86.1578917}]
I have a pandas dataframe that I am trying to convert to look like the above but I am not certain what I should be converting it to.
Yes, it is a list. More precisely, it is a list object, containing a sequence of dict objects. You can run type(temp) to know the type of that object.
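A quick way to confirm this, and to check that a DataFrame converts back to the same shape, is sketched below. Using orient='records' is the assumption here, since the question does not say how the conversion was attempted.

```python
import pandas as pd

temp = [{'lat': 39.7612992, 'lon': -86.1519681},
        {'lat': 39.762241, 'lon': -86.158436},
        {'lat': 39.7622292, 'lon': -86.1578917}]

print(type(temp))     # type of the outer container
print(type(temp[0]))  # type of each element

# A DataFrame built from temp converts back to the same list of dicts.
df = pd.DataFrame(temp)
round_trip = df.to_dict(orient="records")
print(round_trip == temp)
```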
I would like to construct a MultiIndex DataFrame from a deeply-nested dictionary of the form
md = {'50': {'100': {'col1': ('0.100',
'0.200',
'0.300',
'0.400'),
'col2': ('6.263E-03',
'6.746E-03',
'7.266E-03',
'7.825E-03')},
'101': {'col1': ('0.100',
'0.200',
'0.300',
'0.400'),
'col2': ('6.510E-03',
'7.011E-03',
'7.553E-03',
'8.134E-03')}
'102': ...
}
'51': ...
}
I've tried
df = pd.DataFrame.from_dict({(i,j): md[i][j][v] for i in md.keys() for j in md[i].keys() for v in md[i][j]}, orient='index')
following Construct pandas DataFrame from items in nested dictionary, but I get a DataFrame with 1 row and many columns.
Bonus:
I'd also like to label the MultiIndex keys and the columns 'col1' and 'col2', as well as convert the strings to int and float, respectively.
How can I reconstruct my original dictionary from the dataframe?
I tried df.to_dict('list').
Check out this answer: https://stackoverflow.com/a/24988227/9404057. This method unpacks the keys and values of the dictionary and reshapes the data into a format that is easy to load into a MultiIndex DataFrame. Note that if you are using Python 3, you will need to use .items() rather than the .iteritems() shown in the linked answer:
>>> import pandas as pd
>>> reform = {(firstKey, secondKey, thirdKey): values for firstKey, middleDict in md.items() for secondKey, innerdict in middleDict.items() for thirdKey, values in innerdict.items()}
>>> df = pd.DataFrame(reform)
To change the column labels col1 and col2 to an int and a float, you can then use pandas.DataFrame.rename() and specify any values you want:
df.rename({'col1':1, 'col2':2.5}, axis=1, level=2, inplace=True)
If you'd prefer to have the levels on the index instead of the columns, you can transpose the result with pandas.DataFrame.T.
If you wanted to reconstruct your dictionary from this MultiIndex, you could do something like this:
>>> md2 = {}
>>> for i in df.columns:
...     if i[0] not in md2:
...         md2[i[0]] = {}
...     if i[1] not in md2[i[0]]:
...         md2[i[0]][i[1]] = {}
...     md2[i[0]][i[1]][i[2]] = tuple(df[i[0]][i[1]][i[2]].values)
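Putting the pieces together, here is a self-contained sketch of the round trip on a trimmed-down stand-in for md. It also touches the bonus points from the question by naming the column levels and converting the string values to floats. The level names outer, middle, and column are invented for illustration, and converting everything to float (rather than int for one column) is a simplifying assumption.

```python
import pandas as pd

# Trimmed-down stand-in for the nested dictionary in the question.
md = {
    "50": {
        "100": {"col1": ("0.100", "0.200"), "col2": ("6.263E-03", "6.746E-03")},
        "101": {"col1": ("0.100", "0.200"), "col2": ("6.510E-03", "7.011E-03")},
    },
}

# Flatten the three nesting levels into tuple keys.
reform = {
    (outer, middle, col): values
    for outer, middle_dict in md.items()
    for middle, inner_dict in middle_dict.items()
    for col, values in inner_dict.items()
}

# Tuple keys become a three-level MultiIndex on the columns.
df = pd.DataFrame(reform).astype(float)
df.columns.names = ["outer", "middle", "column"]  # hypothetical level names

# Rebuild the nested dictionary from the column tuples.
rebuilt = {}
for outer, middle, col in df.columns:
    rebuilt.setdefault(outer, {}).setdefault(middle, {})[col] = tuple(df[(outer, middle, col)])

print(df)
print(rebuilt)
```

Because of the astype(float) call, the rebuilt dictionary holds floats rather than the original strings; dropping that call keeps the strings intact.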
I am using Python's csv.DictReader to read in values from a CSV file to create a dictionary where keys are first row or headers in the CSV and other rows are values. It works perfectly as expected and I am able to get a dictionary, but I only want certain keys to be in the dictionary rather than all of the column values. What is the best way to do this? I tried using csv.reader but I don't think it has this functionality. Maybe this can be achieved using pandas?
Here is the code I was using with the csv module, where Fieldnames held the keys that I wanted to retain in my dict. I have since realized that the fieldnames argument does not do what I described above.
import csv
with open(target_path+target_file) as csvfile:
    reader = csv.DictReader(csvfile, fieldnames=Fieldnames)
    for i in reader:
        print(i)
You can do this very simply using pandas.
import pandas as pd
# get only the columns you want from the csv file
df = pd.read_csv(target_path + target_file, usecols=['Column Name1', 'Column Name2'])
result = df.to_dict(orient='records')
Sources:
pandas.read_csv
pandas.DataFrame.to_dict
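If you would rather stay with the csv module, one possible approach is to let DictReader read every column and then keep only the keys you need from each row. The sketch below uses an in-memory CSV and made-up column names (name, age, city) so that it runs on its own; in practice you would open the real file instead.

```python
import csv
import io

# Stand-in for the CSV file; in practice, open the real path instead.
csv_text = "name,age,city\nAda,36,London\nLinus,28,Helsinki\n"

wanted = ["name", "city"]  # hypothetical column names to keep

with io.StringIO(csv_text) as csvfile:
    reader = csv.DictReader(csvfile)  # the header row supplies the keys
    # Keep only the wanted keys from each row dict.
    rows = [{key: row[key] for key in wanted} for row in reader]

print(rows)
```

Note that the csv module leaves every value as a string, whereas pandas.read_csv infers numeric types.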
You can use the to_dict method to get a list of dicts:
import pandas as pd
df = pd.read_csv(target_path+target_file, names=Fieldnames)
records = df.to_dict(orient='records')
for row in records:
    print(row)
to_dict documentation:
In [67]: df.to_dict?
Signature: df.to_dict(orient='dict')
Docstring:
Convert DataFrame to dictionary.

Parameters
----------
orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}
    Determines the type of the values of the dictionary.

    - dict (default) : dict like {column -> {index -> value}}
    - list : dict like {column -> [values]}
    - series : dict like {column -> Series(values)}
    - split : dict like
      {index -> [index], columns -> [columns], data -> [values]}
    - records : list like
      [{column -> value}, ... , {column -> value}]
    - index : dict like {index -> {column -> value}}

      .. versionadded:: 0.17.0

    Abbreviations are allowed. `s` indicates `series` and `sp`
    indicates `split`.

Returns
-------
result : dict like {column -> {index -> value}}
File: /usr/local/lib/python2.7/dist-packages/pandas/core/frame.py
Type: instancemethod