I want to add a column to a data frame and set each element of that column to a list. After running the code below, nothing changed:
df = pd.DataFrame({'A':[1,2,3],'B':[4,5,6]})
df['C'] = 0
for i in range(len(df)):
    lst = [6,7,8]
    df.iloc[i]['C'] = []
    df.iloc[i]['C'] = lst
Also, based on the question "Assigning a list value to pandas dataframe", I tried df.at[i,'C'] in the code above, and got the following error: 'setting an array element with a sequence.'
You can use np.tile with np.ndarray.tolist:
import numpy as np
l = len(df)
df['C'] = np.tile([6,7,8],(l,1)).tolist()
df
A B C
0 1 4 [6, 7, 8]
1 2 5 [6, 7, 8]
2 3 6 [6, 7, 8]
One idea is to use a list comprehension:
lst = [6,7,8]
df['C'] = [lst for _ in df.index]
print (df)
A B C
0 1 4 [6, 7, 8]
1 2 5 [6, 7, 8]
2 3 6 [6, 7, 8]
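One thing to watch with the comprehension above: because it reuses lst, every row most likely ends up referencing the same list object, so an in-place change to one cell would show up in all of them. A minimal sketch, assuming independent lists per row are wanted, copies the list for each row:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
lst = [6, 7, 8]

# list(lst) makes a fresh shallow copy for every row
df['C'] = [list(lst) for _ in df.index]

# mutating one cell leaves the other rows (and lst itself) untouched
df.at[0, 'C'].append(9)
print(df)
```

Whether the copy is needed depends on whether the cells are ever modified in place later.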
This variant of your solution works for me:
df['C'] = ''
for i in range(len(df)):
    lst = [6,7,8]
    df.iloc[i, df.columns.get_loc('C')] = lst
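The 'setting an array element with a sequence' error from df.at is consistent with column C having an integer dtype after df['C'] = 0. A minimal sketch, assuming the column can be created with object dtype before the loop runs:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# start with an object-dtype column so each cell can hold any Python object
df['C'] = pd.Series([None] * len(df), index=df.index, dtype=object)

for i in df.index:
    # .at sets a single cell; a new list per row avoids sharing one object
    df.at[i, 'C'] = [6, 7, 8]

print(df)
```

The chained form df.iloc[i]['C'] = ... from the question assigns into a temporary row object, which is why the frame itself stays unchanged.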
Related
Suppose I have
list1 = [3, 4, 6, 8, 13]
In a for loop, I want to subtract each value from the value that comes right after it. In the above example: 4-3, 6-4, 8-6, 13-8. (The first value should be kept as it is.)
Desired result:
list2 = [3, 1, 2, 2, 5]
Can I do this in a for loop / list comprehension?
More specifically, I want to do this in a dataframe:
list1
0 3
1 4
2 6
3 8
4 13
and after the operation
list1 list2
0 3 3
1 4 1
2 6 2
3 8 2
4 13 5
I have tried for loops, lambda functions and list comprehensions, and tried to access the positional index with enumerate(), but I can't figure out how to access the value just before the one I want to subtract from.
Edit: the answers below worked. Thank you very much!
The dataframe solution has already been posted. This is an implementation for lists:
list1 = [3, 4, 6, 8, 13]
list2 = []
for i, v in enumerate(list1):
    # for i == 0 this wraps around to list1[-1]; corrected after the loop
    list2.append(list1[i] - list1[i-1])
list2[0] = list1[0]
print(list2) # [3, 1, 2, 2, 5]
And lastly, as a list comprehension:
list2 = [list1[i] - list1[i-1] for i, v in enumerate(list1)]
list2[0] = list1[0]
You should use shift to access the previous row:
df['list2'] = df['list1'].sub(df['list1'].shift(fill_value=0))
Or, using diff with fillna:
df['list2'] = df['list1'].diff().fillna(df['list1'])
Output:
list1 list2
0 3 3
1 4 1
2 6 2
3 8 2
4 13 5
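As a quick check of the two pandas variants above: shift with fill_value=0 keeps the integer dtype, whereas diff leaves a NaN in the first row and therefore returns floats. A sketch, assuming integer output is wanted, casts the diff result back:

```python
import pandas as pd

df = pd.DataFrame({'list1': [3, 4, 6, 8, 13]})

# shift() moves each value down one row, so this is current minus previous
via_shift = df['list1'].sub(df['list1'].shift(fill_value=0))

# diff() leaves NaN in the first row; fill it with the original value, then cast
via_diff = df['list1'].diff().fillna(df['list1']).astype(int)

df['list2'] = via_diff
print(df)
```

Both routes give the same values here; the cast only matters if the column must stay integer.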
For a pure python solution:
list1 = [3, 4, 6, 8, 13]
list2 = [a-b for a,b in zip(list1, [0]+list1)]
Output: [3, 1, 2, 2, 5]
You could loop backwards with for x in range(len(list1) - 1, 0, -1): and do the calculation in place: list1[x] = list1[x] - list1[x - 1]
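The backwards loop described above can be sketched as follows, assuming it is acceptable to overwrite a copy of the list in place (list2 is a name chosen here for that copy). Walking from the end means each subtraction still sees the original value of its predecessor:

```python
list1 = [3, 4, 6, 8, 13]
list2 = list(list1)  # work on a copy so list1 stays intact

# stop before index 0 so the first value is left as it is
for x in range(len(list2) - 1, 0, -1):
    list2[x] = list2[x] - list2[x - 1]

print(list2)  # [3, 1, 2, 2, 5]
```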
Try this code; it works:
import pandas as pd
list1 = [3, 4, 6, 8, 13]
list2 = [list1[i+1]-list1[i] for i in range(len(list1)-1)]
list2.insert(0, list1[0])
data = {
    "list1": list1,
    "list2": list2
}
df = pd.DataFrame(data)
print(df)
output:
$ python3 solution.py
list1 list2
0 3 3
1 4 1
2 6 2
3 8 2
4 13 5
I have a dataframe with categorical data in it.
I have come up with a procedure that keeps only the desired categories and moves the remaining values up into the cells left empty by the deleted ones.
But I want to do it without the intermediate lists, if possible.
import pandas as pd
mydf = pd.DataFrame(data={'a': [9, 6, 3, 8, 5],
                          'b': [4, 3, 5, 6, 7],
                          'c': [5, 3, 6, 9, 10]})
selecList = [5, 8, 4, 6]  # only these categories shall remain
mydf
a b c
0 9 4 5
1 6 3 3
2 3 5 6
3 8 6 9
4 5 7 10
Desired Output
a b c
0 6 4 5
1 8 5 6
2 5 6 <NA>
My workaround:
myList = mydf.T.values.tolist()
myList
[[9, 6, 3, 8, 5], [4, 3, 5, 6, 7], [5, 3, 6, 9, 10]]
filtered_list = [[x for x in y if x in selecList ] for y in myList]
filtered_list
[[6, 8, 5], [4, 5, 6], [5, 6]]
filtered_df = pd.DataFrame(filtered_list).T
filtered_df.columns = list(mydf)
filtered_df = filtered_df.astype('Int64')
Unsuccessful try:
pd.DataFrame(mydf.apply(lambda y: [x for x in y if x in selecList ])).T
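Here is an alternative solution. Note that it only masks the unwanted values and drops rows that are entirely NaN; it does not move the remaining values up:
mydf.where(mydf.isin(selecList)).dropna(how='all')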
Here is another solution:
(mydf.where(mydf.isin(selecList))
     .stack()
     .droplevel(0)
     .to_frame()
     .assign(i=lambda x: x.groupby(level=0).cumcount())
     .set_index('i', append=True)[0]
     .unstack(level=0))
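Another way to avoid the intermediate lists, sketched here under the assumption that a column-wise apply is acceptable: filter each column with isin and reset its index, so the kept values move up and pandas pads the shorter columns with missing values.

```python
import pandas as pd

mydf = pd.DataFrame({'a': [9, 6, 3, 8, 5],
                     'b': [4, 3, 5, 6, 7],
                     'c': [5, 3, 6, 9, 10]})
selecList = [5, 8, 4, 6]

# keep only wanted values per column, then renumber so they start at row 0
filtered_df = mydf.apply(lambda col: col[col.isin(selecList)].reset_index(drop=True))

# the padded column becomes float; Int64 is the nullable integer dtype
filtered_df = filtered_df.astype('Int64')
print(filtered_df)
```

This differs from the unsuccessful attempt in that each column is returned as a Series, so results of different lengths can be aligned on their new index.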
I have the following dataframe:
df = pd.DataFrame({'cols': ['a', 'b', 'c'], 'vals': [[1,2], [3,4], [5,6]]})
series = pd.Series([3,5])
df
OUT:
cols vals
0 a [1, 2]
1 b [3, 4]
2 c [5, 6]
series
OUT:
0 3
1 5
I would like to get the following result:
cols vals
0 a [1, 2, 3]
1 b [3, 4, 5]
2 c [5, 6]
How can I achieve this without using iterrows?
Good old += with index alignment:
df.loc[series.index, 'vals'] += pd.Series([[i] for i in series], index=series.index)
Alternatively, with explode (Series.append was removed in pandas 2.0, so pd.concat is used here):
df['vals'] = pd.concat([df['vals'].explode(), series]).groupby(level=0).agg(list)
print(df)
cols vals
0 a [1, 2, 3]
1 b [3, 4, 5]
2 c [5, 6]
You could use a list comprehension and slice assign back to vals (this assumes the index is a normal range):
df.loc[:len(series)-1, 'vals'] = [i+[j] for i,j in zip(df.loc[:len(series)-1, 'vals'], series)]
print(df)
cols vals
0 a [1, 2, 3]
1 b [3, 4, 5]
2 c [5, 6]
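If the index is not a plain range, a label-based variant may be safer. The sketch below assumes series shares its labels with df and holds integers, and it appends only where a matching label exists:

```python
import pandas as pd

df = pd.DataFrame({'cols': ['a', 'b', 'c'], 'vals': [[1, 2], [3, 4], [5, 6]]})
series = pd.Series([3, 5])

# build a new list per row; rows without a matching label are kept unchanged
# int() assumes integer values in series; drop it for other types
df['vals'] = [
    vals + [int(series.loc[label])] if label in series.index else vals
    for label, vals in df['vals'].items()
]
print(df)
```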
I have a list of lists,
lst = [[2, 0, 1, 6, 7, 8], [4, 3, 5]]
and I want to flatten it into a data frame, assigning each sublist a unique id.
Desired output:
value group
0 2 0
1 0 0
2 1 0
3 6 0
4 7 0
5 8 0
6 4 1
7 3 1
8 5 1
You're going to need to do some fancy flattening:
flattened = [(item, index) for index, sublist in enumerate(lst) for item in sublist]
df = pd.DataFrame(flattened, columns=['value','group'])
If you want a Pandas DataFrame:
import pandas as pd
lst = [[2, 0, 1, 6, 7, 8], [4, 3, 5]]
final_list = []
for i, l in enumerate(lst):
    for num in l:
        final_list.append({'value': num, 'group': i})
df = pd.DataFrame(final_list)
You can use this code:
new_lst = []
# enumerate avoids lst.index(group), which returns the first match if sublists repeat
for i, group in enumerate(lst):
    for n in group:
        new_lst.append({"group": i, "value": n})
You should try something before asking for the desired output.
When looping through a list of lists while keeping a unique identifier, you may want to use the function enumerate, which yields the index of each sublist along with the sublist itself.
for i, sub_list in enumerate(lst):
    identifier = i
    [(value, identifier) for value in sub_list]
    ....
Hope this helps.
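For comparison, a pandas-only sketch of the same flattening, assuming Series.explode is available (pandas 0.25 or later). Each sublist becomes one row of a Series, and exploding it repeats the row label once per item, which then serves as the group id:

```python
import pandas as pd

lst = [[2, 0, 1, 6, 7, 8], [4, 3, 5]]

# one row per sublist; after explode the index (0, 1, ...) is the group id
s = pd.Series(lst).explode()

df = pd.DataFrame({
    'value': s.to_numpy().astype(int),  # explode yields object dtype
    'group': s.index.to_numpy(),
})
print(df)
```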
How can I remove elements from a list based on an index range in a pandas DataFrame?
Suppose the DataFrame looks like this:
df:
values size
0 [1,2,3,4,5,6,7] 2 #delete first 2 elements from list
1 [1,2,3,4] 3 #delete first 3 elements from list
2 [9,8,7,6,5,4,3] 5 #delete first 5 elements from list
The expected output is
df:
values size
0 [3,4,5,6,7] 2
1 [4] 3
2 [4,3] 5
Use a list comprehension with slicing:
df['values'] = [i[j:] for i, j in zip(df['values'], df['size'])]
print (df)
values size
0 [3, 4, 5, 6, 7] 2
1 [4] 3
2 [4, 3] 5
Using df.apply:
import pandas as pd
df = pd.DataFrame({"values": [[1,2,3,4,5,6,7], [1,2,3,4], [9,8,7,6,5,4,3]], "size": [2, 3, 5]})
df["values"] = df.apply(lambda x: x["values"][x['size']:], axis=1)
print(df)
Output:
size values
0 2 [3, 4, 5, 6, 7]
1 3 [4]
2 5 [4, 3]
Using map in base Python, you could do
dat['values'] = pd.Series(map(lambda x, y : x[y:], dat['values'], dat['size']))
which returns
dat
Out[34]:
values size
0 [3, 4, 5, 6, 7] 2
1 [4] 3
2 [4, 3] 5
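One caveat with the map version: pd.Series(...) gets a fresh 0..n-1 index, and column assignment aligns on index labels. A minimal sketch, using a hypothetical frame with a non-default index, passes the frame's own index so the rows line up:

```python
import pandas as pd

# hypothetical frame with a non-default index to show the alignment issue
dat = pd.DataFrame(
    {"values": [[1, 2, 3, 4, 5, 6, 7], [1, 2, 3, 4], [9, 8, 7, 6, 5, 4, 3]],
     "size": [2, 3, 5]},
    index=[10, 11, 12],
)

# reuse dat's index so the assignment lines up row by row
dat["values"] = pd.Series(
    list(map(lambda x, y: x[y:], dat["values"], dat["size"])),
    index=dat.index,
)
print(dat)
```

With the default range index from the question, the extra index argument makes no difference.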