Pandas Create Data Frame Column from List - python

Given the following list:
list=['a','b','c']
I'd like to create a data frame where the list is the column of values.
I'd like the header to be "header".
Like this:
header
a
b
c
Thanks in advance!

Wouldn't that be:
import pandas as pd
values = ['a', 'b', 'c']  # renamed from `list` so the built-in is not shadowed
df = pd.DataFrame({'header': values})
  header
0      a
1      b
2      c
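The same frame can also be built by passing the list as the data and naming the column. A minimal sketch; the variable names `values` and `df_alt` are illustrative and not from the question:

```python
import pandas as pd

values = ['a', 'b', 'c']

# Pass the list as the data and name the single column explicitly.
df = pd.DataFrame(values, columns=['header'])

# Equivalent route through a named Series.
df_alt = pd.Series(values, name='header').to_frame()

print(df)
```

Both routes give a one-column frame with the default 0, 1, 2 index.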

Related

sorting out data from two rows in csv file python

I have two rows of data, in rows 4 and 5. Row 4 has the titles for the data and row 5 holds the actual data. I want to sort them out in any sort of format. I am completely new to Python, so I don't even know where to start. It's a CSV file and I want a CSV file as output as well. This is what the data looks like:
A  B  C  D  A  B  C  D  A  B  C   D
0  1  2  3  4  5  6  7  8  9  10  11
I would like data to look something like this if possible:
A  B  C   D
0  1  2   3
4  5  6   7
8  9  10  11
So I want to sort it out by the titles, but since the row is not a header row I don't know what to do. Again, the titles "A", "B", "C", "D" are in row 4 and the data 0, 1, 2, 3, ... are in row 5. Any help would be appreciated.
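You can read the CSV file with pandas, take row 4 as the titles and row 5 as the values, and pivot them so that each title becomes one column. Sorting alone will not reshape the data, and sort_values cannot sort by duplicated column names. Here is some sample code:
import pandas as pd
raw = pd.read_csv('file.csv', header=None)
# row 4 holds the titles, row 5 holds the data (positions 3 and 4)
df = pd.DataFrame({'title': raw.iloc[3], 'value': raw.iloc[4]})
# number each repeat of a title (0, 1, 2, ...) to get the output row
df['n'] = df.groupby('title').cumcount()
out = df.pivot(index='n', columns='title', values='value')
out.to_csv('output.csv', index=False)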
You can use a dictionary to store the original data, using the first row as the dictionary keys. Then you can use pandas to create your final CSV file. Something like this:
from collections import defaultdict
import pandas
# read the two rows
with open('data.txt') as ifile:
    headers = [name.strip() for name in ifile.readline().split(",")]
    values = [int(value.strip()) for value in ifile.readline().split(",")]
# use a dictionary to store the data, using the
# names in the first row as dictionary keys
dd = defaultdict(list)
for name, val in zip(headers, values):
    dd[name].append(val)
# use the pandas package to create the csv
data_frame = pandas.DataFrame.from_dict(dd)
data_frame.to_csv("final.csv", index=False)
I am assuming that your data.txt file contains:
A,B,C,D,A,B,C,D,A,B,C,D
0,1,2,3,4,5,6,7,8,9,10,11
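To try the dictionary approach without touching the file system, the two lines can be held in memory. This is a sketch under the same assumption about the file contents; `raw_text`, `grouped`, and `csv_text` are names introduced here for illustration:

```python
import io
from collections import defaultdict

import pandas as pd

# In-memory stand-in for the two-line data.txt assumed above.
raw_text = "A,B,C,D,A,B,C,D,A,B,C,D\n0,1,2,3,4,5,6,7,8,9,10,11\n"

lines = raw_text.splitlines()
headers = [name.strip() for name in lines[0].split(",")]
values = [int(value.strip()) for value in lines[1].split(",")]

# Collect the values under their title, in the order they appear.
grouped = defaultdict(list)
for name, val in zip(headers, values):
    grouped[name].append(val)

data_frame = pd.DataFrame(grouped)

# Write to a buffer instead of a file so the result can be inspected.
buffer = io.StringIO()
data_frame.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

This only works when every title occurs the same number of times; otherwise the lists have different lengths and pandas refuses to build the frame.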

How to convert first column of dataframe in to its headers

I have dataframe df:
0
0 a
1 b
2 c
3 d
4 e
The output should be:
a b c d e
0
1
2
3
4
5
I want the column containing (a, b, c, d, e) to become the header of my dataframe.
Could anyone help?
If df is a pandas dataframe, you can solve it with pandas:
First convert the first column of df to a list, then create a new dataframe that uses the list as its columns.
import pandas as pd
headers = df[0].tolist()  # df[0] selects the first column
dfSolved = pd.DataFrame([], columns=headers)
You may want to provide more details, such as the index and values of the expected output and the operation you want to do, so that we can give a solution specific to your case.
Here is the solution:
import pandas as pd
import io
import numpy as np
data_string = """ columns_name
0 a
1 b
2 c
3 d
4 e
"""
df = pd.read_csv(io.StringIO(data_string), sep=r'\s+')
# Solution: one row of NaN under the new headers
df_result = pd.DataFrame(data=[[np.nan] * len(df)],
                         columns=df['columns_name'].tolist())
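If the goal is the empty frame with rows 0 to 5 shown in the question, the index can be passed along with the new headers. A minimal sketch; the stand-in frame and the name `df_result` are assumptions made for illustration:

```python
import pandas as pd

# Stand-in for the question's frame: a single column labelled 0.
df = pd.DataFrame({0: ['a', 'b', 'c', 'd', 'e']})

headers = df[0].tolist()

# Empty frame with the new headers and rows 0..5, as in the expected output.
df_result = pd.DataFrame(index=range(6), columns=headers)
print(df_result)
```

Every cell starts out as NaN, so the frame can be filled in later column by column.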

Convert 2 columns of a pandas dataframe to a list

I know that you can pull out a single column from a dataframe to a list by doing this:
newList = df['column1'].tolist()
and that you can convert all values to a list like this:
newList = df.values.tolist()
But is there a way to convert 2 columns from a dataframe to a list, so that from a dataframe like this
  Column 1  Column 2
0    apple         9
1    peach        12
the resulting list is:
[['apple', 9], ['peach', 12]]
Thanks
As per your example, you can convert a pandas DataFrame to a list of rows with df.values.tolist().
If you want only specific columns, select them first and then convert: df[['column1', 'column2', ..., 'columnN']].values.tolist()
You can use zip:
[list(i) for i in zip(df['Column 1'], df['Column 2'])]
Output:
[['apple', 9], ['peach', 12]]
To convert the entire data frame to a list of lists:
lst = df.to_numpy().tolist()
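The answers above can be checked side by side on a small frame. A sketch only; `Column 3` is a hypothetical extra column added to show that it is left out:

```python
import pandas as pd

df = pd.DataFrame({'Column 1': ['apple', 'peach'],
                   'Column 2': [9, 12],
                   'Column 3': [0.5, 0.7]})  # hypothetical extra column

# Select the two columns first, then convert the rows to lists.
pairs = df[['Column 1', 'Column 2']].values.tolist()

# The zip-based variant builds the same rows.
pairs_zip = [list(row) for row in zip(df['Column 1'], df['Column 2'])]

print(pairs)
```

Selecting the columns with a list of names (double brackets) keeps the result a DataFrame, which is why `.values.tolist()` returns one inner list per row.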

Python Pandas data frame with column value as a dictionary

I have a dataframe DF in the following format (the first row holds the column names):
name  val1  val2
A     1     2
B     3     4
How can I convert it to the following format?
name  map
A     {val1:1,val2:2}
B     {val1:3,val2:4}
You can achieve it with the to_dict() method of the dataframe, then keep only the columns you need:
DF['map'] = DF[['val1', 'val2']].to_dict(orient='records')
DF = DF[['name', 'map']]
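A small runnable version of the to_dict() approach, assuming the frame from the question; the names `df` and `result` are chosen here for illustration:

```python
import pandas as pd

df = pd.DataFrame({'name': ['A', 'B'], 'val1': [1, 3], 'val2': [2, 4]})

# orient='records' yields one dict per row, keyed by column name.
df['map'] = df[['val1', 'val2']].to_dict(orient='records')

# Keep only the columns shown in the desired output.
result = df[['name', 'map']]
print(result)
```

Each cell of the `map` column holds an ordinary Python dict, so the column has object dtype.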

Python: create new row based on column names in DataFrame

I would like to know how to make a new row based on the column names row in a python dataframe, and append it to the same dataframe.
example
df = pd.DataFrame(np.random.randn(10, 5),columns=['abx', 'bbx', 'cbx', 'acx', 'bcx'])
I want to create a new row based on the column names that gives:
b | b | b | c | c
by taking the middle character of each column name.
The idea is to use that new row later for multi-indexing the columns.
I'm assuming this is what you want, as you've not responded. We can append a new row by creating a dict from zipping the df columns with a list comprehension of the middle characters (assuming that the column names are all 3 characters long). Note that DataFrame.append was removed in pandas 2.0, so on newer versions the row has to be added with pd.concat instead:
In [126]:
df.append(dict(zip(df.columns, [col[1] for col in df])), ignore_index=True)
Out[126]:
abx bbx cbx acx bcx
0 -0.373421 -0.1005462 -0.8280985 -0.1593167 1.335307
1 1.324328 -0.6189612 -0.743703 0.9419248 1.282682
2 0.3730312 -0.06697892 1.113707 -0.9691056 1.779643
3 -0.6644958 1.379606 -0.3751724 -1.135034 0.3287292
4 0.4406139 -0.5767996 -0.2267589 -1.384412 -0.03038372
5 -1.242734 -0.838923 -0.6724592 1.405247 -0.3716862
6 -1.682637 -1.69309 -1.291833 1.781704 0.6321988
7 -0.5793783 -0.6809975 1.03502 -0.6498381 -1.124236
8 1.589016 1.272961 -1.968225 0.5515182 0.3058628
9 -2.275342 2.892237 2.076253 -0.1422845 -0.09776171
10 b b b c c
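On pandas versions that no longer have DataFrame.append, the same row can be added with pd.concat. This is a sketch of that idea, not the original answer's code; it uses a small deterministic frame instead of random data, and `middle`, `new_row`, and `multi` are names introduced here:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(2, 5),
                  columns=['abx', 'bbx', 'cbx', 'acx', 'bcx'])

# Middle character of each three-letter column name.
middle = [col[1] for col in df.columns]

# Build a one-row frame with the same columns and concatenate it.
new_row = pd.DataFrame([middle], columns=df.columns)
result = pd.concat([df, new_row], ignore_index=True)

# For the multi-indexing mentioned in the question, the labels can also
# go straight into a two-level column index without an extra row.
multi = df.copy()
multi.columns = pd.MultiIndex.from_arrays([middle, df.columns])

print(result)
```

Putting strings into a numeric frame turns every column into object dtype, which is one reason to prefer the MultiIndex route when the extra row is only needed as a label.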
.iloc lets you read an entire row by position; you just say whichever row you want. (The .ix indexer did this in older pandas, but it was removed in pandas 1.0.)
Then you take that row's values and assign them as the columns.
See the example below.
virData = pd.DataFrame(df)
virData.columns = virData.iloc[1].values
virData.columns
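A short sketch of promoting a row to the header with .iloc, assuming a hypothetical frame whose second row holds the labels; dropping the label row afterwards is an extra step not shown in the answer above:

```python
import pandas as pd

# Hypothetical frame whose second row (position 1) holds the labels.
df = pd.DataFrame([[1, 2, 3],
                   ['x', 'y', 'z'],
                   [4, 5, 6]])

promoted = df.copy()
# The values of row 1 become the column labels.
promoted.columns = df.iloc[1].values
# Remove the label row and renumber the remaining rows.
promoted = promoted.drop(index=1).reset_index(drop=True)

print(promoted)
```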
