Splitting columns made with groupby [duplicate]

This question already has answers here:
Groupby and aggregate using lambda functions
(2 answers)
How can I pivot a dataframe?
(5 answers)
Python: Counting specific occurrences in dataframe by group
(2 answers)
Closed 1 year ago.
I can't seem to find a solution to my problem. I have used groupby and value_counts to get all the results for each team, but I cannot find a way to split the result into three new columns named Win, Draw and Loss (H = Win, D = Draw, A = Loss).
I don't know if I am just searching for the wrong thing or if the approach I have now will not work.
matchdata.groupby("HomeTeam")["FullTimeResult"].value_counts()
Results in:
HomeTeam     FullTimeResult
Arsenal      H                 10
             D                  6
             A                  3
Aston Villa  A                  9
             H                  7
             D                  3
Wanted output:
HomeTeam     Win  Draw  Loss
Arsenal       10     6     3
Aston Villa    7     3     9
etc.
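One possible way to get from the grouped counts to the wanted layout is to unstack the inner index level into columns. This is a minimal sketch, not an accepted answer from the thread: the toy matchdata frame below is made up for illustration, and only the column names HomeTeam and FullTimeResult come from the question.

```python
import pandas as pd

# Hypothetical toy data standing in for the asker's matchdata frame
matchdata = pd.DataFrame({
    "HomeTeam": ["Arsenal", "Arsenal", "Arsenal", "Arsenal",
                 "Aston Villa", "Aston Villa", "Aston Villa"],
    "FullTimeResult": ["H", "H", "D", "A", "A", "A", "H"],
})

result = (
    matchdata.groupby("HomeTeam")["FullTimeResult"]
    .value_counts()                                   # Series indexed by (HomeTeam, FullTimeResult)
    .unstack(fill_value=0)                            # move FullTimeResult into columns, 0 where absent
    .reindex(columns=["H", "D", "A"], fill_value=0)   # fix the column order
    .rename(columns={"H": "Win", "D": "Draw", "A": "Loss"})
    .reset_index()                                    # make HomeTeam a regular column again
)
print(result)
```

The reindex step also guards against a result code that never occurs in the data, which would otherwise leave a column missing.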

Related

Count the number of times a pair of values occurs in a DataFrame [duplicate]

This question already has answers here:
Pandas, groupby and count
(3 answers)
Get statistics for each group (such as count, mean, etc) using pandas GroupBy?
(9 answers)
Adding a 'count' column to the result of a groupby in pandas?
(2 answers)
Pandas create new column with count from groupby
(5 answers)
Closed 3 years ago.
I have the following dataframe:
print(df)
Product Store Quantity_Sold
A NORTH 10
A NORTH 5
A SOUTH 8
B SOUTH 8
B SOUTH 5
(...)
I would like to count how many times each pair of product and store occurs. To illustrate:
print(final_df)
Product Store count
A NORTH 2
A SOUTH 1
B SOUTH 2
(...)
I tried with:
df["Product"].value_counts()
But that only counts values of a single column. How can I create final_df?
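A minimal sketch of one way to build final_df, assuming the frame has exactly the columns shown in the question: group on both columns and take the size of each group. The sample rows are the five visible in the question; the elided rows are left out.

```python
import pandas as pd

# The five rows visible in the question
df = pd.DataFrame({
    "Product": ["A", "A", "A", "B", "B"],
    "Store": ["NORTH", "NORTH", "SOUTH", "SOUTH", "SOUTH"],
    "Quantity_Sold": [10, 5, 8, 8, 5],
})

# size() counts the rows in each (Product, Store) group;
# reset_index(name=...) turns the result into a flat frame with a "count" column
final_df = df.groupby(["Product", "Store"]).size().reset_index(name="count")
print(final_df)
```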

How to calculate in Python the sum of a variable for each unique row value in a dataframe? [duplicate]

This question already has answers here:
How do I Pandas group-by to get sum?
(11 answers)
Closed 3 years ago.
I have a dataframe with information about different users (id), a categorical variable with many repeated values (photo_type) and the corresponding number of interactions (likes). How can I calculate the total number of likes for each photo type?
For example:
id photo_type likes
1 nature 2
2 art 4
3 art 1
4 fashion 3
5 fashion 2
I expect to get information like this:
total number of likes for nature: 2
total number of likes for art: 5
total number of likes for fashion: 5

Use pandas.DataFrame.groupby:
df.groupby('photo_type')['likes'].sum()
Output:
photo_type
art 5
fashion 5
nature 2
Name: likes, dtype: int64

How do I clean phone numbers in pandas [duplicate]

This question already has answers here:
How to only do string manupilation on column of pandas that have 4 digits or less?
(3 answers)
Closed 3 years ago.
I have a pandas dataframe with a Phone column; however, the data is a bit inconsistent. Here are some examples that I would like to focus on.
df["Phone"]
0 732009852
1 738073222
2 755920306
3 0755353288
Row 3 has the necessary leading 0 for an Australian number. How do I update rows like 0, 1 and 2?

Use pandas.Series.str.zfill:
s = pd.Series(['732009852', '0755353288'])
s.str.zfill(10)
Output:
0 0732009852
1 0755353288

Or pd.Series.str.rjust:
print(df["Phone"].str.rjust(10, '0'))
Output:
0 0732009852
1 0738073222
2 0755920306
3 0755353288
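Both answers rely on the .str accessor, which only works on string data. If the Phone column was read in as integers (an assumption, since the question does not show the dtype), a conversion is needed first. A minimal sketch with made-up integer input:

```python
import pandas as pd

# Assumption: the column was parsed as integers, so any leading zero is already lost
df = pd.DataFrame({"Phone": [732009852, 738073222, 755920306, 755353288]})

# Convert to strings first, then left-pad with zeros to 10 characters
df["Phone"] = df["Phone"].astype(str).str.zfill(10)
print(df["Phone"])
```

Reading the file with dtype={"Phone": str} in pd.read_csv would keep existing leading zeros from being dropped in the first place.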

subset the dataframe into a new one using copy [duplicate]

This question already has answers here:
why should I make a copy of a data frame in pandas
(8 answers)
Closed 4 years ago.
I have a dataframe df
   a  b   c
0  5  6   9
1  6  7  10
2  7  8  11
3  8  9  12
If I want to select only columns a and b and store them in another dataframe, I would use something like this:
df1 = df[['a','b']]
But I have seen places where people write it this way:
df1 = df[['a','b']].copy()
Can anyone explain what .copy() does here, given that the first version seems to work just fine?

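For example, suppose you want to modify a dataframe in place (example using replace with inplace=True):
df2 = df
df2.replace('blah', 'foo', inplace=True)
Here df and df2 are two names for the same object, so df is changed as well:
df.equals(df2)
Will be:
True
If you want the change to apply only to df2, make a copy first:
df2 = df.copy()
df2.replace('blah', 'foo', inplace=True)
Then now (assuming df contains 'blah' somewhere):
df.equals(df2)
Returns:
False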
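Applied to the column subset from the question, a minimal sketch looks like this. With .copy(), df1 is an independent frame, so assigning to it cannot affect df. Without it, depending on the pandas version, a later assignment to df1 may trigger a SettingWithCopyWarning.

```python
import pandas as pd

# The frame from the question
df = pd.DataFrame({"a": [5, 6, 7, 8], "b": [6, 7, 8, 9], "c": [9, 10, 11, 12]})

# Explicit copy: df1 owns its data and is safe to modify
df1 = df[["a", "b"]].copy()
df1["a"] = df1["a"] * 10

print(df["a"].tolist())   # the original column is unchanged
print(df1["a"].tolist())  # only the copy was modified
```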

Error subsetting a data frame in python [duplicate]

This question already has an answer here:
Python - splitting dataframe into multiple dataframes based on column values and naming them with those values [duplicate]
(1 answer)
Closed 4 years ago.
I am learning python and pandas and am having trouble overcoming an error while trying to subset a data frame.
I have an input data frame:
df0-
Index Group Value
1 A 10
2 A 15
3 B 20
4 C 10
5 C 10
df0.dtypes-
Group object
Value float64
That I am trying to split out into unique values based off of the Group column. With the output looking something like this:
df1-
Index Group Value
1 A 10
2 A 15
df2-
Index Group Value
3 B 20
df3-
Index Group Value
4 C 10
5 C 10
So far I have written this code to subset the input:
UniqueGroups = df0['Group'].unique().tolist()
OutputFrame = {}
for x in UniqueAgencies:
ReturnFrame[str('ConsolidateReport_')+x] = UniqueAgencies[df0['Group']==x]
The code above returns the following error, which I can't quite wrap my head around. Can anyone point me in the right direction?
*** TypeError: list indices must be integers or slices, not str

You can use groupby to split the frame by the Group column:
for _, g in df0.groupby('Group'):
    print(g)
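To keep the pieces around under names, as the question attempts, the groups can be collected into a dictionary. This is a sketch rather than the answerer's code: the output_frames name is hypothetical, and the 'ConsolidateReport_' prefix is taken from the question.

```python
import pandas as pd

# The input frame from the question
df0 = pd.DataFrame(
    {"Group": ["A", "A", "B", "C", "C"], "Value": [10.0, 15.0, 20.0, 10.0, 10.0]},
    index=[1, 2, 3, 4, 5],
)

# One sub-frame per unique Group value, keyed by a prefixed name
output_frames = {
    "ConsolidateReport_" + name: group
    for name, group in df0.groupby("Group")
}

print(output_frames["ConsolidateReport_A"])
```

The original error most likely came from indexing the Python list of unique values with a boolean mask instead of indexing the dataframe df0 itself.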
