This question already has answers here:
Pandas: sum DataFrame rows for given columns
(8 answers)
Closed 4 years ago.
I want to add the row values of three different columns in pandas, like:
dctr mctr tctr
100 20 10
20 90 70
30 10 80
40 05 120
50 20 60
I want to add these three columns row by row into a total_ctr column. What command should I use in pandas?
Likewise I have seven such totals ("total_ctr", "total_cpc", "total_avg", "total_cost" and so on), and I want to collect these seven values into a new dataframe. Is that possible?
I know there's a similar question on sum of rows, but I've not managed to get that one to work for this problem.
This will work, assuming the above is a dataframe named df and it contains only those three columns:
df['total_ctr'] = df.sum(axis=1)
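df.sum(axis=1) adds every numeric column, so once the frame also holds cpc, avg, or cost columns it will sum too much. A minimal sketch that names the three columns explicitly, using the sample data from the question:

```python
import pandas as pd

df = pd.DataFrame({
    "dctr": [100, 20, 30, 40, 50],
    "mctr": [20, 90, 10, 5, 20],
    "tctr": [10, 70, 80, 120, 60],
})

# Sum only the intended columns, so unrelated columns are not pulled in
df["total_ctr"] = df[["dctr", "mctr", "tctr"]].sum(axis=1)
print(df["total_ctr"].tolist())  # [130, 180, 120, 165, 130]
```

The same pattern repeated for the other column groups gives the seven totals, which can then be collected with pd.DataFrame({...}).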
This question already has answers here:
How to remove common rows in two dataframes in Pandas?
(4 answers)
Closed 25 days ago.
I have two data frames with exactly the same columns, but one of them has 1000 rows (df1) and the other 500 (df2). Every row of df2 also appears in df1, but I want the rows that do not.
For example, lets say this is df1:
Gender Age
1 F 43
3 F 56
33 M 76
476 F 30
810 M 29
and df2:
Gender Age
3 F 56
476 F 30
I want a new data frame, df3, that has the unshared rows:
Gender Age
1 F 43
33 M 76
810 M 29
How can I do that?
Use pd.Index.difference:
df3 = df1.loc[df1.index.difference(df2.index)]
There are several ways to do this; I know three.
First way:
df = df1[~df1.index.isin(df2.index)]
Second way:
Left-merge the two dataframes, then keep only the rows that exist in df1 alone.
Third way:
Add a column to both dataframes that marks the source, concat the two dataframes along axis=0, then count each index and keep every index that appears only once and whose record has source=df1.
Finally:
Use the first way. It is by far the fastest.
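The left-merge approach described above can be sketched as follows; indicator=True adds a _merge column marking where each row came from (note that merge compares on the shared columns rather than the index, which also works when the indexes do not line up, but it resets the index in the result):

```python
import pandas as pd

df1 = pd.DataFrame({"Gender": ["F", "F", "M", "F", "M"],
                    "Age": [43, 56, 76, 30, 29]},
                   index=[1, 3, 33, 476, 810])
df2 = pd.DataFrame({"Gender": ["F", "F"],
                    "Age": [56, 30]},
                   index=[3, 476])

# 'left_only' in the _merge column marks rows of df1 absent from df2
merged = df1.merge(df2, on=["Gender", "Age"], how="left", indicator=True)
df3 = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
print(df3["Age"].tolist())  # [43, 76, 29]
```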
You can concatenate two tables and delete any rows that have duplicates:
df3 = pd.concat([df1, df2]).drop_duplicates(keep=False)
The keep parameter controls whether one copy of duplicated rows is retained. With keep='first' (or 'last') one copy of each duplicate group is kept; with keep=False every row that has a duplicate is dropped, which is what removes the shared rows here.
I have a dataframe with 78 columns, but I want to melt just 10 consecutive ones. Is there any way to select that column range and leave the others just as they are?
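A possible sketch, assuming the consecutive block can be picked by position from df.columns and the remaining columns passed as id_vars (the toy frame and the positions here are made up for illustration; for the real frame the slice would be something like df.columns[10:20]):

```python
import pandas as pd

# Toy frame standing in for the 78-column one: melt columns at positions 1..3
df = pd.DataFrame({"id": [1, 2], "a": [10, 20], "b": [30, 40],
                   "c": [50, 60], "other": ["x", "y"]})

to_melt = df.columns[1:4]              # the consecutive block
keep = df.columns.difference(to_melt)  # everything else stays as identifiers
long_df = df.melt(id_vars=list(keep), value_vars=list(to_melt),
                  var_name="column", value_name="value")
print(long_df.shape)  # (6, 4): 2 rows x 3 melted columns
```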
This question already has answers here:
how do I remove rows with duplicate values of columns in pandas data frame?
(4 answers)
How to create a list in Python with the unique values of a CSV file?
(3 answers)
Closed 3 years ago.
I have df like this:
df1:
PL IN
22 NE22
22 NE22
22 NE22
33 DE33
33 DE33
66 NL66
66 NL66
66 NL66
I need to save a csv with only the unique rows, so the result should be:
22 NE22
33 DE33
66 NL66
I know the .unique() method, but it works only on a Series (?), and I need to pick 2 columns. Can someone give me advice?
Drop the duplicates then write to csv.
df1 = df1.drop_duplicates(subset=['PL', 'IN'], keep='first')
df1.to_csv('my_unique_csv.csv', index=False)
This question already has answers here:
Pandas dataframe: truncate string fields
(4 answers)
Closed 4 years ago.
I have a dataframe with some columns having large sentences.
How do I truncate the columns to say 50 characters max?
current df:
a b c
I like data science 1 2
new truncated df for ONLY column a:
a b c
I like data 1 2
(The above is an example sentence I made up)
For a specific column:
df['a'] = df['a'].str[:50]
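A quick check on a made-up frame; .str[:50] only touches column a and leaves strings already shorter than 50 characters untouched:

```python
import pandas as pd

# 95-character sentence in column a, untouched values in b and c
df = pd.DataFrame({"a": ["I like data science" * 5], "b": [1], "c": [2]})
df["a"] = df["a"].str[:50]           # truncate only column a
print(df["a"].str.len().tolist())    # [50]
```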
This question already has an answer here:
Python - splitting dataframe into multiple dataframes based on column values and naming them with those values [duplicate]
(1 answer)
Closed 4 years ago.
I am learning python and pandas and am having trouble overcoming an error while trying to subset a data frame.
I have an input data frame:
df0-
Index Group Value
1 A 10
2 A 15
3 B 20
4 C 10
5 C 10
df0.dtypes-
Group object
Value float64
I am trying to split it into separate frames based on the unique values of the Group column, with the output looking something like this:
df1-
Index Group Value
1 A 10
2 A 15
df2-
Index Group Value
3 B 20
df3-
Index Group Value
4 C 10
5 C 10
So far I have written this code to subset the input:
UniqueGroups = df0['Group'].unique().tolist()
OutputFrame = {}
for x in UniqueAgencies:
    ReturnFrame[str('ConsolidateReport_')+x] = UniqueAgencies[df0['Group']==x]
The code above returns the following error, which I can't quite get my head around. Can anyone point me in the right direction?
*** TypeError: list indices must be integers or slices, not str
The error comes from indexing the list UniqueAgencies with a boolean Series; you should index the dataframe df0 instead, or simply use groupby to split on the Group column (with parentheses on print under Python 3):
for _, g in df0.groupby('Group'):
    print(g)
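To also get the named sub-frames the question asks for (the ConsolidateReport_ prefix is taken from the asker's own code), a dict comprehension over groupby avoids the manual loop entirely:

```python
import pandas as pd

df0 = pd.DataFrame({"Group": ["A", "A", "B", "C", "C"],
                    "Value": [10.0, 15.0, 20.0, 10.0, 10.0]},
                   index=[1, 2, 3, 4, 5])

# One sub-frame per unique Group value, keyed by a descriptive name
frames = {"ConsolidateReport_" + g: sub for g, sub in df0.groupby("Group")}
print(sorted(frames))
# ['ConsolidateReport_A', 'ConsolidateReport_B', 'ConsolidateReport_C']
```

Each value in frames is a regular DataFrame, e.g. frames["ConsolidateReport_A"] holds the two Group A rows.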