I have a dictionary like the one below:
d = {'a':'1,2,3','b':'3,4,5,6'}
I want to create dataframes from it in a loop, such as
a = 1,2,3
b = 3,4,5,6
Creating a single dataframe that can reference dictionary keys such as df['a'] does not work for what I am trying to achieve. Any suggestions?
Try this to get a list of dataframes:
>>> import pandas as pd
>>> import numpy as np
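>>> dfs = [pd.DataFrame(np.array(b.split(',')), columns=[a]) for a,b in d.items()]
This gives the following output (columns=[a] is used rather than list(a) so that keys longer than one character are not split into several column names):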
>>> dfs[0]
a
0 1
1 2
2 3
>>> dfs[1]
b
0 3
1 4
2 5
3 6
To convert your dictionary into a list of DataFrames, run:
lst = [pd.Series(v.split(','), name=k).to_frame()
       for k, v in d.items()]
Then, for your sample data, lst[0] contains:
a
0 1
1 2
2 3
and lst[1]:
b
0 3
1 4
2 5
3 6
Hope this helps:
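dfs = []
for key, value in d.items():
    # split the comma-separated string first; filter(None, ...) then drops empty pieces
    df = pd.DataFrame(list(filter(None, value.split(','))), columns=[key])
    dfs.append(df)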
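If the goal is to look each frame up by its key rather than by position, a dict of DataFrames may be a closer fit than a list. A minimal sketch, assuming the values should also be converted to integers (the name frames is illustrative and not from the question):

```python
import pandas as pd

d = {'a': '1,2,3', 'b': '3,4,5,6'}

# One single-column DataFrame per key, with the string pieces converted to int
frames = {
    key: pd.DataFrame({key: [int(x) for x in value.split(',')]})
    for key, value in d.items()
}

print(frames['a'])
print(frames['b'])
```

This keeps the frames addressable as frames['a'] and frames['b'] without creating variables dynamically.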
I have a dict in python like this:
d = {"a": [1,2,3], "b": [4,5,6]}
I want to transform it into a dataframe like this:
letter number
a 1
a 2
a 3
b 4
b 5
b 6
I have tried this code:
df = pd.DataFrame.from_dict(d, orient = 'index').T
but this gave me:
a b
1 4
2 5
3 6
You can always read your data in as you already have and then .melt it:
When passed no id_vars or value_vars, melt turns each of your columns into their own rows.
import pandas as pd
d = {"a": [1,2,3], "b": [4,5,6]}
out = pd.DataFrame(d).melt(var_name='letter', value_name='value')
print(out)
letter value
0 a 1
1 a 2
2 a 3
3 b 4
4 b 5
5 b 6
To use 'letter' and 'number' as column labels you could use:
a2 = [[key, val] for key, x in d.items() for val in x]
dict2 = pd.DataFrame(a2, columns = ['letter', 'number'])
which gives
letter number
0 a 1
1 a 2
2 a 3
3 b 4
4 b 5
5 b 6
Yet another possible solution:
(pd.Series(d, index=d.keys(), name='numbers')
.rename_axis('letters').reset_index()
.explode('numbers', ignore_index=True))
Output:
letters numbers
0 a 1
1 a 2
2 a 3
3 b 4
4 b 5
5 b 6
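One possible advantage of the explode approach is that it does not appear to need the lists to be the same length. A minimal sketch, assuming pandas 1.1 or later (for ignore_index) and a hypothetical ragged dict that is not from the question:

```python
import pandas as pd

# Hypothetical input: the lists have different lengths
ragged = {"a": [1, 2, 3], "b": [4, 5]}

out = (pd.Series(ragged, name='number')   # one list per index label
         .rename_axis('letter')           # name the index so it becomes a column
         .reset_index()
         .explode('number', ignore_index=True))  # one row per list element

print(out)
```

The exploded column keeps the object dtype, so a cast such as out['number'].astype(int) may be wanted afterwards.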
This will yield what you want (there might be a simpler way though):
import pandas as pd
my_dict = {"a": [1,2,3], "b": [4,5,6]}
my_list = [[key, val] for key in my_dict for val in my_dict[key] ]
df = pd.DataFrame(my_list, columns=['letter','number'])
df
# Out[106]:
# letter number
# 0 a 1
# 1 a 2
# 2 a 3
# 3 b 4
# 4 b 5
# 5 b 6
I want to know how I can create a dataframe from two lists. I have the following lists:
List_time = [1,2,3]
List_item = ['a','b','c']
For every item in List_item, I want a row for each time in List_time, i.e. every combination of the two:
df = [1 a
1 b
1 c
2 a
2 b
2 c
3 a
3 b
3 c]
Sorry if it's a very basic question, I'm exhausted right now. Thanks
Use itertools.product:
import pandas as pd
from itertools import product
df = pd.DataFrame(product(List_time, List_item))
Try this:
List_time = [1,2,3]
List_item = ["a","b","c"]
import pandas as pd
# repeat each time once per item, and tile the items once per time
df = pd.DataFrame({"List_time":[i for i in List_time for _ in List_item],
                   "List_item":List_item*len(List_time)})
# output of df:
List_time List_item
0 1 a
1 1 b
2 1 c
3 2 a
4 2 b
5 2 c
6 3 a
7 3 b
8 3 c
I like to use the itertools.product function for just this purpose. It combines the lists as a Cartesian product, and pandas ingests the result nicely.
import itertools
import pandas as pd
a = [1, 2, 3]
b = ['a', 'b', 'c']
df = pd.DataFrame(data=itertools.product(a, b))
Output:
0 1
0 1 a
1 1 b
2 1 c
3 2 a
4 2 b
5 2 c
6 3 a
7 3 b
8 3 c
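The default column labels 0 and 1 can be replaced by passing columns to the constructor. A minimal sketch, assuming the labels 'time' and 'item' (illustrative names, not from the question); the second variant uses pd.MultiIndex.from_product as a pandas-only alternative:

```python
import itertools
import pandas as pd

a = [1, 2, 3]
b = ['a', 'b', 'c']

# Cartesian product with explicit column labels
df = pd.DataFrame(list(itertools.product(a, b)), columns=['time', 'item'])

# The same pairs without itertools: build the product as a MultiIndex,
# then turn its levels into ordinary columns
df2 = pd.MultiIndex.from_product([a, b], names=['time', 'item']).to_frame(index=False)

print(df)
```

Both variants list the pairs in the same order, with the first list varying slowest.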
Edit: I misread the question, my mistake
How do I convert the list below to a dataframe like the expected output shown? Please help. I tried other answers on SO, but they used a different input format.
ab = [{'q1':[7,2,6]},{'q2':[1,2,3]}]
import pandas as pd
pd.DataFrame(ab)
Current output:
q1 q2
0 [7, 2, 6] NaN
1 NaN [1, 2, 3]
Expected Output
q1 q2
0 7 1
1 2 2
2 6 3
A simple transformation:
ds = {k: v for d in ab for k, v in d.items()}
df = pd.DataFrame(ds)
Other options.
Using pandas.concat:
df = pd.concat(map(pd.DataFrame, ab), axis=1)
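or using collections.ChainMap (the list is reversed because a ChainMap iterates from its last mapping to its first, which would otherwise put q2 before q1):
from collections import ChainMap
df = pd.DataFrame(dict(ChainMap(*ab[::-1])))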
output:
q1 q2
0 7 1
1 2 2
2 6 3
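The options behave differently when the lists are not all the same length. A minimal sketch of that difference, assuming a hypothetical ragged input (not from the question): concat pads the shorter column with NaN, while building the frame from the merged dict raises a ValueError.

```python
import pandas as pd

# Hypothetical input: q2 is one element shorter than q1
ab = [{'q1': [7, 2, 6]}, {'q2': [1, 2]}]

# concat aligns on the row index and pads the shorter column with NaN
padded = pd.concat([pd.DataFrame(d) for d in ab], axis=1)

# the merged-dict constructor refuses columns of different lengths
merged = {k: v for d in ab for k, v in d.items()}
try:
    pd.DataFrame(merged)
    raised = False
except ValueError:
    raised = True

print(padded)
print(raised)
```

Note that the padded column becomes floating point, since NaN cannot be stored in a plain integer column.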
Consider a dictionary like the following:
>>> dict_temp = {'a': np.array([[0,1,2], [3,4,5]]),
...              'b': np.array([[3,4,5], [2,5,1], [5,3,7]])}
How can I build a pandas DataFrame out of this, using a multi-index with level 0 and 1 as follows:
level_0 = ['a', 'b']
level_1 = [[0,1], [0,1,2]]
I expect the code to build the multi-index levels itself... I don't care about the column names for now.
Appreciate comments...
Try concat:
pd.concat({k:pd.DataFrame(d) for k, d in dict_temp.items()})
Output:
     0  1  2
a 0  0  1  2
  1  3  4  5
b 0  3  4  5
  1  2  5  1
  2  5  3  7
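The dict keys become level 0 of the index and each array's row positions become level 1. A minimal sketch of naming those levels and selecting from them, assuming the level names 'key' and 'row' (illustrative, not from the question):

```python
import numpy as np
import pandas as pd

dict_temp = {'a': np.array([[0, 1, 2], [3, 4, 5]]),
             'b': np.array([[3, 4, 5], [2, 5, 1], [5, 3, 7]])}

# names= labels the two index levels created by concatenating a dict
df = pd.concat({k: pd.DataFrame(v) for k, v in dict_temp.items()},
               names=['key', 'row'])

print(df.loc['b'])       # all rows that came from dict_temp['b']
print(df.loc[('b', 2)])  # a single row, addressed by (key, row)
```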
I need to add columns iteratively to a DataFrame object. This is a simplified version:
>>> x=DataFrame()
>>> for i in 'ps':
...     x = x.append(DataFrame({i:[3,4]}))
...
>>> x
p s
0 3 NaN
1 4 NaN
0 NaN 3
1 NaN 4
What should I do to get:
p s
0 3 3
1 4 4
?
Your idea of creating the dict first is probably the best way:
>>> from pandas import *
>>> DataFrame({c: [1,2] for c in 'sp'})
p s
0 1 1
1 2 2
(This uses a dictionary comprehension, available since Python 2.7.) Just for completeness, though, you could, inefficiently, use join or concat to get a column-by-column approach to work:
>>> df = DataFrame()
>>> for c in 'sp':
...     df = concat([df, DataFrame({c: [1,2]})], axis=1)
...
>>> print(df)
s p
0 1 1
1 2 2
>>>
>>> df = DataFrame()
>>> for c in 'sp':
...     df = df.join(DataFrame({c: [1,2]}), how='outer')
...
>>> print(df)
s p
0 1 1
1 2 2
[You can see the difference in column order. On Python 3.7 and later, where dicts preserve insertion order, the dict version should also come out as s p.] But your idea of building the dict first and then constructing the DataFrame from it is a much better approach.
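Applied to the loop from the question, the dict-first approach might look like the sketch below. It assumes the per-column values are known inside the loop (here the fixed list [3, 4] from the question); the name columns is illustrative. It also avoids DataFrame.append, which was removed in pandas 2.0.

```python
import pandas as pd

# Collect the columns in a plain dict first...
columns = {}
for name in 'ps':
    columns[name] = [3, 4]

# ...then build the DataFrame once, so the rows line up on a shared index
x = pd.DataFrame(columns)
print(x)
```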