I have this csv file from https://www.data.gov.au/dataset/airport-traffic-data/resource/f0fbdc3d-1a82-4671-956f-7fee3bf9d7f2
I'm trying to aggregate with
airportdata = Airports.groupby(['Year_Ended_December'])('Dom_Pax_in','Dom_Pax_Out')
airportdata.sum()
However, I keep getting 'DataFrameGroupBy' object is not callable
and it wont print the data I want
How to fix the issue?
You need to execute the sum aggregation before extracting the columns:
airportdata_agg = Airports.groupby(['Year_Ended_December']).sum()[['Dom_Pax_in','Dom_Pax_Out']]
Alternatively, if you'd like to ensure you're not aggregating columns you are not going to use:
airportdata_agg = Airports[['Dom_Pax_in','Dom_Pax_Out', 'Year_Ended_December']].groupby(['Year_Ended_December']).sum()
Related
I am working on dataset named "data" and what to use explode function but facing error in this code. Error:-'DataFrame' object has no attribute 'explode' and showing attribute error
artists_exploded = data[['artists_upd','id']].explode('artists_upd')
Not sure whether you are trying to explode data on 2 columns (artists & ID), but this is how you do it for just artists_upd. Perhaps you want to use a different value for str.split(), depending on your data.
artists_exploded = data.assign(artists_upd = data.artists_upd.str.split(",")).explode("artists_upd")
I am new to python, I have imported a file into jupyter as follows:
df = pd.read_csv(r"C:\Users\shalotte1\Documents\EBQS_INTEGRATEDQUOTEDOCUMENT\groceries.csv")
I am using the following code to determine the number of rows and columns in the data
df.shape()
However I am getting the following error:
TypeError: 'tuple' object is not callable
You want df.shape - this will return a tuple as in (n_rows, n_cols). You are then trying to call this tuple as though it were a function.
As you are new to python, I would recommend you to read this page. This will make you get aware of other causes too so that you can solve this problem again if it appears in the future.
https://careerkarma.com/blog/python-typeerror-tuple-object-is-not-callable/
I'm trying to merge two dataframes using
grouped_data = pd.merge(grouped_data, df['Pattern'].str[7:11]
,how='left',left_on='Calc_DRILLING_Holes',
right_on='Calc_DRILLING_Holes')
But I get an error saying can not merge DataFrame with instance of type <class 'pandas.core.series.Series'>
What could be the issue here. The original dataframe that I'm trying to merge to was created from a much larger dataset with the following code:
import pandas as pd
raw_data = pd.read_csv(r"C:\Users\cherp2\Desktop\test.csv")
data_drill = raw_data.query('Activity =="DRILL"')
grouped_data = data_drill.groupby([data_drill[
'PeriodStartDate'].str[:10], 'Blast'])[
'Calc_DRILLING_Holes'].sum().reset_index(
).sort_values('PeriodStartDate')
What do I need to change here to make it a regular normal dataframe?
If I try to convert either of them to a dataframe using .to_frame() I get an error saying that 'DataFrame' object has no attribute 'to_frame'
I'm so confused at to what kind of data type it is.
Both objects in a call to pd.merge need to be DataFrame objects. Is grouped_data a Series? If so, try promoting it to a DataFrame by passing pd.DataFrame(grouped_data) instead of just grouped_data.
I have a dataframe and I want to make a deep copy of it so I can modify the copy and use it in further processing.
I am working in Azure Databricks.
My dataframe is called "a" and I tried the following command:
b = a.copy(deep=True)
When I run it, I encounter the following error :
'DataFrame' object has no attribute 'copy'
I also tried to use 'iloc' or 'loc' function to create a new dataframe only with the columns that I need, but same error ('DataFrame' object has no attribute 'lit').
Any ideas why is this happening?..
Assuming you're working in Python, check whether you're using a Spark DataFrame or a pandas DataFrame. If you're using a pandas one then I couldn't tell you what's going on without more information; if you're using the spark one then you should use
newDataFrame = oldDataFrame.select('*')
I have a data frame "df" with a column called "column1". By running the below code:
df.column1.value_counts()
I get the output which contains values in column1 and its frequency. I want this result in the excel. When I try to this by running the below code:
df.column1.value_counts().to_excel("result.xlsx",index=None)
I get the below error:
AttributeError: 'Series' object has no attribute 'to_excel'
How can I accomplish the above task?
You are using index = None, You need the index, its the name of the values.
pd.DataFrame(df.column1.value_counts()).to_excel("result.xlsx")
If go through the documentation Series had no method to_excelit applies only to Dataframe.
So either you can save it another frame and create an excel as:
a=df.column1.value_counts()
a.to_excel("result.xlsx")
Look at Merlin comment I think it is the best way:
pd.DataFrame(df.column1.value_counts()).to_excel("result.xlsx")