Convert 2 columns of a pandas dataframe to a list

I know that you can pull a single column out of a dataframe into a list by doing this:
newList = df['column1'].tolist()
and that you can convert all values to a list like this:
newList = df.values.tolist()
But is there a way to convert 2 columns of a dataframe to a list, so that a dataframe that looks like this
  Column 1  Column 2
0    apple         9
1    peach        12
gives this resulting list:
[['apple', 9], ['peach', 12]]
Thanks

As in your example, you can convert a whole pandas DataFrame to a list of lists with df.values.tolist().
If you want only specific columns, select those columns first and convert the result: df[['Column 1', 'Column 2']].values.tolist(). In general, pass a list of column names, as in df[[column1, column2, ..., columnN]].values.tolist()
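A minimal sketch of the column-selection approach, using the two-row frame from the question. The variable names pairs and pairs_np are only illustrative.

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({"Column 1": ["apple", "peach"], "Column 2": [9, 12]})

# Select the two columns first, then convert the rows to nested lists
pairs = df[["Column 1", "Column 2"]].values.tolist()

# to_numpy() is the newer accessor for the same underlying array
pairs_np = df[["Column 1", "Column 2"]].to_numpy().tolist()

print(pairs)  # [['apple', 9], ['peach', 12]]
```

Selecting with a list of names (double brackets) returns a DataFrame, so each row becomes one inner list.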

You can use zip:
[list(i) for i in zip(df['Column 1'], df['Column 2'])]
Output:
[['apple', 9], ['peach', 12]]

To convert the entire data frame to a list of lists:
lst = df.to_numpy().tolist()

Related

Python pandas - multiple columns to one series

I have an excel spreadsheet with raw data in:
demo-data:
1
2
3
4
5
6
7
8
9
How do I combine all the numbers into one series, so I can start doing math on it? They are all just numbers of the same "kind".

Given your dataframe as df, df.values.flatten() may help: it returns all the values as a one-dimensional NumPy array.
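As a rough sketch of that idea, assuming the spreadsheet loads as a 3x3 grid of the numbers 1 to 9 (the layout is a guess, since the question only shows the raw values), the flattened array can be wrapped in a Series:

```python
import pandas as pd

# Stand-in for the spreadsheet data; the 3x3 layout is an assumption
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# flatten() reads the values row by row into a 1-D NumPy array
flat = df.values.flatten()

# Wrapping the array in a Series gives access to pandas' numeric methods
series = pd.Series(flat)

total = series.sum()
mean = series.mean()
```

If column-by-column order matters instead, df.values.flatten(order="F") reads down each column first.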

You can convert your dataframe to a list and iterate through it to extract and put values into a 1D list:
import pandas as pd

df = pd.read_excel("data.xls")
lst = df.to_numpy().tolist()
result = []
for row in lst:
    for item in row:
        result.append(item)

Store nth row elements in a list panda dataframe

I am new to Python. Could you help with the following?
I have a dataframe as follows.
a, d, f and g are the column names. The dataframe is named df1.
a d f g
20 30 20 20
0 1 NaN NaN
I need to put the second row of df1 into a list, without the NaNs.
Ideally as follows.
x=[0,1]

Select the second row with df1.iloc[1], remove the NaN values with .dropna(), and finally convert the Series to a Python list with .tolist():
x = df1.iloc[1].dropna().astype(int).tolist()
# x = [0, 1]
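A self-contained sketch of the steps above, rebuilding df1 from the table in the question. It also shows why the astype(int) step is there: a row that mixes integer and float columns comes back as floats.

```python
import numpy as np
import pandas as pd

# df1 as described in the question
df1 = pd.DataFrame({"a": [20, 0], "d": [30, 1], "f": [20, np.nan], "g": [20, np.nan]})

# The second row is upcast to float because columns f and g hold NaN
second_row = df1.iloc[1]

# Without the cast the values stay floats: [0.0, 1.0]
as_floats = second_row.dropna().tolist()

# Casting after dropna() gives plain integers: [0, 1]
x = second_row.dropna().astype(int).tolist()
```

The cast has to come after dropna(), because NaN cannot be converted to an integer.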

Check itertuples().
So you would have something like this:
for row in df1.itertuples():
    row[0]  # the row's index; the whole row is a tuple, so you can do whatever you want with it
You can also use iloc and dropna() like this:
row_2 = df1.iloc[1].dropna().to_list()
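One possible way to put itertuples() to work here, sketched with df1 rebuilt from the question. The names rows and second_row are only illustrative.

```python
import numpy as np
import pandas as pd

# df1 as described in the question
df1 = pd.DataFrame({"a": [20, 0], "d": [30, 1], "f": [20, np.nan], "g": [20, np.nan]})

rows = []
for row in df1.itertuples():
    # row[0] is the index label; row[1:] holds the column values
    rows.append([value for value in row[1:] if pd.notna(value)])

# The list for the second row of the frame, with NaNs filtered out
second_row = rows[1]
```

This collects every row at once, which may be handy if more than one row is needed; for a single row, the iloc version is shorter.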

Add array of new columns to Pandas dataframe

How do I append a list of integers as new columns to each row in a dataframe in Pandas?
I have a dataframe which I need to append a 20 column sequence of integers as new columns. The use case is that I'm translating natural text in a cell of the row into a sequence of vectors for some NLP with Tensorflow.
But to illustrate, I create a simple data frame to append:
df = pd.DataFrame([(1, 2, 3),(11, 12, 13)])
df.head()
Which generates the output:
    0   1   2
0   1   2   3
1  11  12  13
And then, for each row, I need to apply a function that takes a particular value in the column '2' and returns an array of integers that need to be appended as columns in the data frame, not as an array in a single cell:
def foo(x):
    return [x+1, x+2, x+3]
Ideally, to run a function like:
df[3, 4, 5] = df['2'].applyAsColumns(foo)
The only solution I can think of is to create the data frame with 3 blank columns [3, 4, 5], and then use a for loop to iterate through the blank columns and fill in the values.
Is this the best way to do it, or are there any functions built into Pandas that would do this? I've tried checking the documentation, but haven't found anything.
Any help is appreciated!

IIUC,
import pandas as pd

def foo(x):
    # Returning a Series makes apply() expand the result into columns
    return pd.Series([x+1, x+2, x+3])

df = pd.DataFrame([(1, 2, 3),(11, 12, 13)])
df[[3,4,5]] = df[2].apply(foo)
df
Output:
    0   1   2   3   4   5
0   1   2   3   4   5   6
1  11  12  13  14  15  16
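An alternative sketch, not from the answer above, that keeps foo returning a plain list as in the question: build the new columns as their own DataFrame and join them with pd.concat. The names new_cols and result are illustrative.

```python
import pandas as pd

def foo(x):
    # Same function as in the question: returns a plain list
    return [x + 1, x + 2, x + 3]

df = pd.DataFrame([(1, 2, 3), (11, 12, 13)])

# Build all new columns at once, sharing the original index so rows line up
new_cols = pd.DataFrame([foo(x) for x in df[2]], index=df.index, columns=[3, 4, 5])

# Attach the new columns side by side with the original ones
result = pd.concat([df, new_cols], axis=1)
```

This avoids creating one Series per row, which may matter for the 20-column, many-row case described in the question.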

Pandas Return a list of Rows whose Substrings are found in another column

I have the following dataset;
I would like to end up with a column like this one;
Ideally, I would like to convert the columns to the same case and split the strings by spaces and return rows that contain a substring that is found on the other column.

Split the first column with Series.str.split, then use DataFrame.isin to check the resulting values against the flattened, split values of the second column. DataFrame.any then marks the rows with at least one match. Pass that mask to boolean indexing to filter the first column and, if necessary, create a one-column DataFrame with Series.to_frame:
import numpy as np
import pandas as pd

df = pd.DataFrame({'column_a': ['ga lt', 'ka', 'ku', 'na ma', np.nan, np.nan],
                   'column_b': ['se', 'ga', 'ma po', 'na', 'ka ch', 'wa wo']})
# All whitespace-separated tokens found anywhere in column_b
vals = [y for x in df['column_b'] for y in x.split()]
# True for rows where at least one token of column_a appears in vals
mask = df['column_a'].str.split(expand=True).isin(vals).any(axis=1)
df = df.loc[mask, 'column_a'].to_frame('column_a_in_column_b')
print(df)
  column_a_in_column_b
0                ga lt
1                   ka
3                na ma
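The question also asks for the comparison to ignore case, which the code above does not do. A hedged variant, with mixed-case sample data invented for illustration, lowercases both columns before matching:

```python
import numpy as np
import pandas as pd

# Mixed-case sample data, made up to illustrate case-insensitive matching
df = pd.DataFrame({
    "column_a": ["Ga lt", "ka", "ku", "na MA", np.nan, np.nan],
    "column_b": ["se", "GA", "ma po", "na", "ka ch", "wa wo"],
})

# Lowercased tokens from column_b, skipping missing cells
vals = [tok for cell in df["column_b"].dropna() for tok in cell.lower().split()]

# Lowercase column_a before splitting so the comparison ignores case
tokens = df["column_a"].str.lower().str.split(expand=True)
mask = tokens.isin(vals).any(axis=1)

# Keep the original spelling of the matching rows
matches = df.loc[mask, "column_a"].tolist()
```

Only the comparison uses the lowercased text, so the returned rows keep their original capitalization.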

Pandas Create Data Frame Column from List

Given the following list:
list=['a','b','c']
I'd like to create a data frame where the list is the column of values.
I'd like the header to be "header".
Like this:
header
a
b
c
Thanks in advance!

Wouldn't that be:
import pandas as pd

values = ['a', 'b', 'c']  # renamed from list to avoid shadowing the built-in
df = pd.DataFrame({'header': values})
  header
0      a
1      b
2      c
