This question already has answers here:
How do I melt a pandas dataframe?
(3 answers)
Closed 7 days ago.
I would like to change the arrangement of this table:
import pandas as pd
original_dict = {
"group A" : [10,9,11],
"group B" :[23,42,56]
}
original_df = pd.DataFrame(original_dict)
original_df
Here is the desired output:
Value  Group Type
   10     group A
    9     group A
   11     group A
   23     group B
   42     group B
   56     group B
Thank you!
You can use the pandas melt function:
https://pandas.pydata.org/docs/reference/api/pandas.melt.html
df = pd.melt(original_df)
df.columns=['Group Type', 'Value']
df
  Group Type  Value
0    group A     10
1    group A      9
2    group A     11
3    group B     23
4    group B     42
5    group B     56
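As a small variation on the snippet above, melt can name the output columns directly through its var_name and value_name parameters, which makes the separate rename step unnecessary:

```python
import pandas as pd

original_df = pd.DataFrame({
    "group A": [10, 9, 11],
    "group B": [23, 42, 56],
})

# var_name/value_name set the output column names in one call,
# so no rename afterwards is needed
df = original_df.melt(var_name="Group Type", value_name="Value")
print(df)
```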
This question already has answers here:
How to test if a string contains one of the substrings in a list, in pandas?
(4 answers)
Closed 5 months ago.
I am trying to extract specific rows from a dataframe where values in a column contain a designated string. For example, the current dataframe looks like:
df1=
Location Value Name Type
Up 10 Test A X
Up 12 Test B Y
Down 11 Prod 1 Y
Left 8 Test C Y
Down 15 Prod 2 Y
Right 30 Prod 3 X
And I am trying to build a new dataframe with all rows that have "Test" in the 'Name' column.
df2=
Location Value Name Type
Up 10 Test A X
Up 12 Test B Y
Left 8 Test C Y
Is there a way to do this with regex or match?
Try:
df_out = df[df["Name"].str.contains("Test")]
print(df_out)
Prints:
Location Value Name Type
0 Up 10 Test A X
1 Up 12 Test B Y
3 Left 8 Test C Y
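If you need to match any of several substrings (as in the linked duplicate question), one common approach is to join them into a single regex alternation before passing it to str.contains. The substring list below is just for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "Location": ["Up", "Up", "Down", "Left", "Down", "Right"],
    "Value": [10, 12, 11, 8, 15, 30],
    "Name": ["Test A", "Test B", "Prod 1", "Test C", "Prod 2", "Prod 3"],
    "Type": ["X", "Y", "Y", "Y", "Y", "X"],
})

# "|" builds a regex alternation, so a row matches if its Name
# contains any one of the joined substrings
substrings = ["Test", "Prod 1"]
pattern = "|".join(substrings)
df_out = df[df["Name"].str.contains(pattern)]
print(df_out)
```

If the substrings may contain regex metacharacters, escape each one with re.escape before joining.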
How about: df2 = df1.loc[['Test' in name for name in df1.Name ]]
In column J I would like to compute the value given by the Excel formula IF(H3>I3, C2, 0), i.e. compare H and I on the current row and, when H is larger, take the value of C from the row above, otherwise 0. Based on that value I also want the occurrence order counted from the bottom up, so the latest occurrence is the 1st and the one above it is the 2nd.
Here is the solution:
import pandas as pd
import numpy as np
# suppose we have this DataFrame:
df = pd.DataFrame({'A':[55,23,11,100,9] , 'B':[12,72,35,4,100]})
# suppose we want to keep the values of column 'A' where they are greater
# than or equal to the values in column 'B', and return 0 otherwise;
# the results go into a new column named 'Result'
df['Result'] = np.where(df['A'] >= df['B'] , df['A'] , 0)
Then if you print the DataFrame:
df
result:
     A    B  Result
0   55   12      55
1   23   72       0
2   11   35       0
3  100    4     100
4    9  100       0
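Note that the original Excel formula IF(H3>I3,C2,0) takes C from the row above the one being compared, which np.where alone does not cover. A sketch of that variant, using hypothetical columns named after the Excel ranges in the question, combines np.where with shift:

```python
import pandas as pd
import numpy as np

# hypothetical columns named after the Excel ranges in the question
df = pd.DataFrame({
    "C": [5, 7, 3, 9],
    "H": [10, 9, 8, 6],
    "I": [4, 12, 1, 2],
})

# IF(H3>I3, C2, 0): compare H and I on the current row, but take C
# from the previous row via shift(1); the first row has no previous
# value, so fillna(0) mirrors Excel's empty-cell behaviour
df["J"] = np.where(df["H"] > df["I"], df["C"].shift(1).fillna(0), 0)
print(df)
```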
This question already has answers here:
How to unnest (explode) a column in a pandas DataFrame, into multiple rows
(16 answers)
Closed 3 years ago.
I have the following dataframe
df_in = pd.DataFrame({
'State':['C','B','D','A','C','B'],
'Contact':['alpha a. theta| beta','beta| alpha a. theta| delta','Theta','gamma| delta','alpha|Eta| gamma| delta','beta'],
'Timestamp':[911583000000,912020000000,912449000000,912742000000,913863000000,915644000000]})
How do I transform it so that the second column which has pipe separated data is broken out into different rows as follows:
df_out = pd.DataFrame({
'State':['C','C','B','B','B','D','A','A','C','C','C','C','B'],
'Contact':['alpha a. theta','beta','beta','alpha a. theta','delta','Theta','gamma', 'delta','alpha','Eta','gamma','delta','beta'],
'Timestamp':[911583000000,911583000000,912020000000,912020000000,912020000000,912449000000,912742000000,912742000000,913863000000,913863000000,913863000000,913863000000,915644000000]})
print(df_in)
print(df_out)
I can use pd.melt but for that I already need to have the 'Contact' column broken out into multiple columns and not have all the contacts in one column separated by a delimiter.
You could split the column, then merge on the index:
df_in.Contact.str.split('|',expand=True).stack().reset_index()\
.merge(df_in.reset_index(),left_on ='level_0',right_on='index')\
.drop(['level_0','level_1','index','Contact'], axis=1)
Out:
0 State Timestamp
0 alpha a. theta C 911583000000
1 beta C 911583000000
2 beta B 912020000000
3 alpha a. theta B 912020000000
4 delta B 912020000000
5 Theta D 912449000000
6 gamma A 912742000000
7 delta A 912742000000
8 alpha C 913863000000
9 Eta C 913863000000
10 gamma C 913863000000
11 delta C 913863000000
12 beta B 915644000000
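In pandas 0.25 and later, the same result can be had more directly with DataFrame.explode. A sketch, reusing df_in from the question:

```python
import pandas as pd

df_in = pd.DataFrame({
    'State': ['C', 'B', 'D', 'A', 'C', 'B'],
    'Contact': ['alpha a. theta| beta', 'beta| alpha a. theta| delta',
                'Theta', 'gamma| delta', 'alpha|Eta| gamma| delta', 'beta'],
    'Timestamp': [911583000000, 912020000000, 912449000000,
                  912742000000, 913863000000, 915644000000]})

# split Contact into lists, emit one row per list element,
# then strip the stray spaces left around the pipe separators
df_out = (df_in.assign(Contact=df_in['Contact'].str.split('|'))
               .explode('Contact')
               .assign(Contact=lambda d: d['Contact'].str.strip())
               .reset_index(drop=True))
print(df_out)
```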
This question already has answers here:
pandas: records with lists to separate rows
(3 answers)
Closed 4 years ago.
I have a pandas dataframe as shown here:
id pos value sent
1 a/b/c test/test2/test3 21
2 d/a test/test5 21
I would like to split (= explode) df['pos'] and df['value'] so that the dataframe looks like this:
id pos value sent
1 a test 21
1 b test2 21
1 c test3 21
2 d test 21
2 a test5 21
It doesn't work if I split each column and then concat them à la
pos = df.token.str.split('/', expand=True).stack().str.strip().reset_index(level=1, drop=True)
df1 = pd.concat([pos,value], axis=1, keys=['pos','value'])
Any ideas? I'd really appreciate it.
EDIT:
I tried using this solution here : https://stackoverflow.com/a/40449726/4219498
But I get the following error:
TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'
I suppose this is a numpy related issue although I'm not sure how this happens. I'm using Python 2.7.14
I tend to avoid the stack magic in favour of building a new dataframe from scratch. This is usually also more efficient. Below is one way.
import numpy as np
from itertools import chain
lens = list(map(len, df['pos'].str.split('/')))
res = pd.DataFrame({'id': np.repeat(df['id'], lens),
'pos': list(chain.from_iterable(df['pos'].str.split('/'))),
'sent': np.repeat(df['sent'], lens),
'value': list(chain.from_iterable(df['value'].str.split('/')))})
print(res)
id pos sent value
0 1 a 21 test
0 1 b 21 test2
0 1 c 21 test3
1 2 d 21 test
1 2 a 21 test5
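For completeness, pandas 1.3+ lets explode take a list of columns, so both slash-separated columns can be expanded in one step, as long as the per-row lists have matching lengths. A sketch on a small copy of the question's data:

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2],
    'pos': ['a/b/c', 'd/a'],
    'value': ['test/test2/test3', 'test/test5'],
    'sent': [21, 21]})

# explode accepts a list of columns (pandas >= 1.3) as long as the
# per-row lists have matching lengths
out = (df.assign(pos=df['pos'].str.split('/'),
                 value=df['value'].str.split('/'))
         .explode(['pos', 'value'])
         .reset_index(drop=True))
print(out)
```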