Why is fillna not working as expected for mode? - python

I am working on a car sale data set with the columns 'car', 'price', 'body', 'mileage', 'engV', 'engType', 'registration', 'year', 'model', 'drive'.
The columns 'drive' and 'engType' have NaN (missing) values. I want to calculate the mode of, say, 'drive' within each group given by ['car', 'model'], and then replace the NaN values in each group with that group's mode.
I have tried these methods:
For numeric data:
carsale['engV2'] = (carsale.groupby(['car','body','model']))['engV'].transform(lambda x: x.fillna(x.median()))
This works fine, filling/replacing the data accurately.
For categorical data:
carsale['driveT'] = (carsale.groupby(['car','model']))['drive'].transform(lambda x: x.fillna(x.mode()))
carsale['driveT'] = (carsale.groupby(['car','model']))['drive'].transform(lambda x: x.fillna(pd.Series.mode(x)))
Both give the same result.
Here is the full code:
# carsale['price2'] = (carsale.groupby(['car','model','year']))['price'].transform(lambda x: x.fillna(x.median()))
# carsale['engV2'] = (carsale.groupby(['car','body','model']))['engV'].transform(lambda x: x.fillna(x.median()))
# carsale['mileage2'] = (carsale.groupby(['car','model','year']))['mileage'].transform(lambda x: x.fillna(x.median()))
# mode = carsale.filter(['car','drive']).mode()
# carsale[['test1','test2']] = carsale[['car','engType']].fillna(carsale.mode().iloc[0])
carsale.groupby(['car', 'model'])['engType'].apply(pd.Series.mode)
# carsale.apply()
# carsale
# carsale['engType2'] = carsale.groupby('car').engType.transform(lambda x: x.fillna(x.mode()))
carsale['driveT'] = carsale.groupby(['car', 'model'])['drive'].transform(lambda x: x.fillna(x.mode()))
carsale['driveT'] = carsale.groupby(['car', 'model'])['drive'].transform(lambda x: x.fillna(pd.Series.mode(x)))
# carsale[carsale.car == 'Mercedes-Benz'].sort_values(['body','engType','model','mileage']).tail(50)
# carsale[carsale.engV.isnull()]
# carsale.sort_values(['car','body','model'])
carsale
Both of the above methods give the same result: the new column driveT just repeats the values of the original column 'drive'. If some indexes have NaN in 'drive', the same NaN shows up in driveT, and likewise for the other values.
But for numerical data, applying the median adds/replaces the correct value.
So it is actually not calculating the mode over the ['car', 'model'] groups; it is effectively computing the mode of single values in 'drive'. Yet if you run this command
carsale.groupby(['car','model'])['engType'].apply(pd.Series.mode)
it correctly calculates the mode per (car, model) group.
Can anyone help with this?
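The heart of the problem is that Series.mode returns a Series (there can be ties), not a scalar, and fillna aligns a Series argument by index. A minimal sketch illustrating this, with hypothetical values:
import numpy as np
import pandas as pd

s = pd.Series(['rear', 'rear', np.nan], index=[4, 8, 9])
print(s.mode())            # a Series with a fresh 0-based index: 0    rear
print(s.fillna(s.mode()))  # index 9 stays NaN: label 9 is absent from the filler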

My approach was to:
Use .groupby() to create a look-up dataframe that contains the mode of the drive feature for each car/model combo.
Write a method that looks up the mode in this dataframe and returns it for a given car/model, when that car/model's value in drive is null.
However, it turned out there were two key corner cases specific to the OP's dataset that needed to be handled:
When a particular car/model combo has no mode (because all entries in the drive column for this combo were NaN).
When a particular car brand has no mode.
Below are the steps I followed, beginning with an example extended from the first several rows of the sample dataframe in the question:
import numpy as np
import pandas as pd

carsale = pd.DataFrame({'car': ['Ford', 'Mercedes-Benz', 'Mercedes-Benz', 'Mercedes-Benz', 'Mercedes-Benz', 'Nissan', 'Honda', 'Renault', 'Mercedes-Benz', 'Mercedes-Benz', 'Toyota', 'Toyota', 'Ferrari'],
                        'price': [15500.000, 20500.000, 35000.000, 17800.000, 33000.000, 16600.000, 6500.000, 10500.000, 21500.000, 21500.000, 1280.000, 2005.00, 300000.000],
                        'body': ['crossover', 'sedan', 'other', 'van', 'vagon', 'crossover', 'sedan', 'vagon', 'sedan', 'sedan', 'compact', 'compact', 'sport'],
                        'mileage': [68.0, 173.0, 135.0, 162.0, 91.0, 83.0, 199.0, 185.0, 146.0, 146.0, 200.0, 134, 123.0],
                        'engType': ['Gas', 'Gas', 'Petrol', 'Diesel', np.nan, 'Petrol', 'Petrol', 'Diesel', 'Gas', 'Gas', 'Hybrid', 'Gas', 'Gas'],
                        'registration': ['yes', 'yes', 'yes', 'yes', 'yes', 'yes', 'yes', 'yes', 'yes', 'yes', 'yes', 'yes', 'yes'],
                        'year': [2010, 2011, 2008, 2012, 2013, 2013, 2003, 2011, 2012, 2012, 2009, 2003, 1988],
                        'model': ['Kuga', 'E-Class', 'CL 550', 'B 180', 'E-Class', 'X-Trail', 'Accord', 'Megane', 'E-Class', 'E-Class', 'Prius', 'Corolla', 'Testarossa'],
                        'drive': ['full', 'rear', 'rear', 'front', np.nan, 'full', 'front', 'front', 'rear', np.nan, np.nan, 'front', np.nan],
                        })
carsale
car price body mileage engType registration year model drive
0 Ford 15500.0 crossover 68.0 Gas yes 2010 Kuga full
1 Mercedes-Benz 20500.0 sedan 173.0 Gas yes 2011 E-Class rear
2 Mercedes-Benz 35000.0 other 135.0 Petrol yes 2008 CL 550 rear
3 Mercedes-Benz 17800.0 van 162.0 Diesel yes 2012 B 180 front
4 Mercedes-Benz 33000.0 vagon 91.0 NaN yes 2013 E-Class NaN
5 Nissan 16600.0 crossover 83.0 Petrol yes 2013 X-Trail full
6 Honda 6500.0 sedan 199.0 Petrol yes 2003 Accord front
7 Renault 10500.0 vagon 185.0 Diesel yes 2011 Megane front
8 Mercedes-Benz 21500.0 sedan 146.0 Gas yes 2012 E-Class rear
9 Mercedes-Benz 21500.0 sedan 146.0 Gas yes 2012 E-Class NaN
10 Toyota 1280.0 compact 200.0 Hybrid yes 2009 Prius NaN
11 Toyota 2005.0 compact 134.0 Gas yes 2003 Corolla front
12 Ferrari 300000.0 sport 123.0 Gas yes 1988 Testarossa NaN
Create a dataframe that shows the mode of the drive feature for each car/model combination.
If a car/model combo has no mode (such as the row with Toyota Prius), I fill with the mode of that particular car brand (Toyota).
However, if the car brand itself (such as Ferrari here in my example) has no mode, I fill with the dataset's mode for the drive feature.
def get_drive_mode(x):
    brand = x.name[0]
    if x.count() > 0:
        # Return the mode for a brand/model combo if the mode exists.
        return x.mode()
    elif carsale.groupby(['car'])['drive'].count()[brand] > 0:
        # Return the mode of the brand if this particular brand/model combo
        # has no mode, but the brand itself has one for the 'drive' feature.
        brand_mode = carsale.groupby(['car'])['drive'].apply(lambda x: x.mode())[brand]
        return brand_mode
    else:
        # Otherwise return the dataset's mode for the 'drive' feature.
        return carsale['drive'].mode()
drive_modes = carsale.groupby(['car','model'])['drive'].apply(get_drive_mode).reset_index().drop('level_2', axis=1)
drive_modes.rename(columns={'drive': 'drive_mode'}, inplace=True)
drive_modes
car model drive_mode
0 Ferrari Testarossa front
1 Ford Kuga full
2 Honda Accord front
3 Mercedes-Benz B 180 front
4 Mercedes-Benz CL 550 rear
5 Mercedes-Benz E-Class rear
6 Nissan X-Trail full
7 Renault Megane front
8 Toyota Corolla front
9 Toyota Prius front
Write a method that looks up the drive mode value for a given car/model in a given row if that row's value for drive is NaN:
def fill_with_mode(x):
    if pd.isnull(x['drive']):
        return drive_modes[(drive_modes['car'] == x['car']) &
                           (drive_modes['model'] == x['model'])]['drive_mode'].values[0]
    else:
        return x['drive']
Apply the above method to the rows in the carsale dataframe in order to create the driveT feature:
carsale['driveT'] = carsale.apply(fill_with_mode, axis=1)
del(drive_modes)
Which results in the following dataframe:
carsale
car price body mileage engType registration year model drive driveT
0 Ford 15500.0 crossover 68.0 Gas yes 2010 Kuga full full
1 Mercedes-Benz 20500.0 sedan 173.0 Gas yes 2011 E-Class rear rear
2 Mercedes-Benz 35000.0 other 135.0 Petrol yes 2008 CL 550 rear rear
3 Mercedes-Benz 17800.0 van 162.0 Diesel yes 2012 B 180 front front
4 Mercedes-Benz 33000.0 vagon 91.0 NaN yes 2013 E-Class NaN rear
5 Nissan 16600.0 crossover 83.0 Petrol yes 2013 X-Trail full full
6 Honda 6500.0 sedan 199.0 Petrol yes 2003 Accord front front
7 Renault 10500.0 vagon 185.0 Diesel yes 2011 Megane front front
8 Mercedes-Benz 21500.0 sedan 146.0 Gas yes 2012 E-Class rear rear
9 Mercedes-Benz 21500.0 sedan 146.0 Gas yes 2012 E-Class NaN rear
10 Toyota 1280.0 compact 200.0 Hybrid yes 2009 Prius NaN front
11 Toyota 2005.0 compact 134.0 Gas yes 2003 Corolla front front
12 Ferrari 300000.0 sport 123.0 Gas yes 1988 Testarossa NaN front
Notice that in rows 4 and 9 of the driveT column, the NaN value that was in the drive column has been replaced by the string rear, which as we would expect, is the mode of drive for a Mercedes E-Class.
Also, in row 10, since there is no mode for the Toyota Prius car/model combo, we fill with the mode for the Toyota brand, which is front.
Finally, in row 12, since there is no mode for the Ferrari car brand, we fill with the mode of the entire dataset's drive column, which is also front.
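If the brand-level fallbacks above are not needed, the original transform approach can also be rescued by extracting the first mode value as a scalar, which sidesteps the index-alignment problem entirely (a sketch; groups that are entirely NaN simply stay NaN):
carsale['driveT'] = carsale.groupby(['car', 'model'])['drive'].transform(
    # .iloc[0] turns the mode Series into a scalar, so no index alignment occurs
    lambda x: x.fillna(x.mode().iloc[0]) if not x.mode().empty else x
)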

Related

How to select the first value that is not NaN after using groupby().agg()?

I have the following code:
df.groupby(["id", "year"], as_index=False).agg({"brand":"first", "color":"first"})
However, I have some values that are NaN. I want to select the first value that is not NaN.
Suppose my dataframe looks like this:
id   year  brand  color
001  2010  NaN    Blue
001  2010  Audi   NaN
001  2010  Audo   Blue
001  2011  Bmw    NaN
001  2011  NaN    NaN
001  2012  BMW    Green
002  2010  Tesla  White
I want to find all unique combinations of id and year, i.e. df.groupby(["id", "year"]), and then take the first non-NaN value. The motivation behind this is that I have a large and messy data set with many missing values and many typos. In the example table I also simulated a typo. Note that it is irrelevant whether the typo is first and gets chosen, as long as I keep track of the data per combination of id and year. Typos are a completely separate problem for now.
The desired output would be:
id   year  brand  color
001  2010  Audi   Blue
001  2011  BMW    NaN
001  2012  BMW    Green
002  2010  Tesla  White
This approach will produce the below output.
df1 = df.groupby(['id', 'year']).agg({
    'brand': lambda x: x.dropna().iloc[0] if x.dropna().any() else np.nan,
    'color': lambda x: x.dropna().iloc[0] if x.dropna().any() else np.nan,
}).reset_index()
print(df1)
Using a custom function:
def get_first_non_nan(x):
    return x.dropna().iloc[0] if x.dropna().any() else np.nan

df1 = df.groupby(['id', 'year']).agg({
    'brand': get_first_non_nan,
    'color': get_first_non_nan,
}).reset_index()
print(df1)
id year brand color
0 001 2010 Audi Blue
1 001 2011 Bmw NaN
2 001 2012 BMW Green
3 002 2010 Tesla White
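As a side note, GroupBy.first() already skips NaN within each group, so a sketch like the following (assuming the same df) may be all that's needed:
df1 = df.groupby(['id', 'year'], as_index=False).first()
print(df1)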

Sum two variables by two specific columns and compute quotient

I have a dataframe df1:
Plant name Brand Region Units produced capacity Cost incurred
Gujarat Plant Hyundai Asia 8500 9250 18500000
Haryana Plant Honda Asia 10000 10750 21500000
Chennai Plant Hyundai Asia 12000 12750 25500000
Zurich Plant Volkswagen Europe 25000 25750 77250000
Chennai Plant Suzuki Asia 6000 6750 13500000
Rengensburg BMW Europe 12500 13250 92750000
Dingolfing Mercedes Europe 14000 14750 103250000
I want an output dataframe with the following format:
df2= Region BMW Mercedes Volkswagen Toyota Suzuki Honda Hyundai
Europe
North America
Asia
Oceania
where the contents of each cell equals sum(cost incurred) / sum(units produced) for that specific Region and Brand.
Code I have tried, resulting in a ValueError:
for i, j in itertools.zip_longest(range(len(df2)), range(len(df2.columns))):
    if (df2.index[i] in list(df1["Region"]) & df2.columns[j] in list(df1["Brand"])) == True:
        temp1 = df1["Region"] == df2.index[i]
        temp2 = df1["Brand"] == df2.columns[j]
        df2.loc[df2.index[i], df2.columns[j]] = df1[temp1 & temp2]["Cost incurred"].sum() / \
            df1[temp1 & temp2]["Units Produced"].sum()
    elif (df2.index[i] in list(df1["Region"]) & df2.columns[j] in list(df1["Brand"])) == False:
        df2.loc[df2.index[i], df2.columns[j]] = 0
ValueError: The truth value of an array with more than one element is
ambiguous. Use a.any() or a.all()
df.pivot_table() is designed for exactly this pivot-and-aggregate job. A quick(?) and dirty solution:
df1.pivot_table(index="Region", columns="Brand", values="Cost incurred", aggfunc=np.sum)\
/ df1.pivot_table(index="Region", columns="Brand", values="Units produced", aggfunc=np.sum)
Output
Brand BMW Honda Hyundai Mercedes Suzuki Volkswagen
Region
Asia NaN 2150.0 2146.341463 NaN 2250.0 NaN
Europe 7420.0 NaN NaN 7375.0 NaN 3090.0
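A single-pass alternative sketch with groupby (same column names assumed; append .fillna(0) if you want the zeros from the question's elif branch):
out = (df1.groupby(["Region", "Brand"])
          .apply(lambda g: g["Cost incurred"].sum() / g["Units produced"].sum())
          .unstack("Brand"))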

Pandas: removing float values from output of a pivot_table used for counting

I have the following (toy) dataset:
import pandas as pd
import numpy as np
df = pd.DataFrame({'System_Key':['MER-002', 'MER-003', 'MER-004', 'MER-005', 'BAV-378', 'BAV-379', 'BAV-380', 'BAV-381', 'AUD-220', 'AUD-221', 'AUD-222', 'AUD-223'],
'Manufacturer':['Mercedes', 'Mercedes', 'Mercedes', 'Mercedes', 'BMW', 'BMW', 'BMW', 'BMW', 'Audi', 'Audi', 'Audi', 'Audi'],
'Region':['Americas', 'Europe', 'Americas', 'Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'Americas', 'Asia', 'Americas', 'Americas'],
'Department':[np.nan, 'Sales', np.nan, 'Operations', np.nan, np.nan, 'Accounting', np.nan, 'Finance', 'Finance', 'Finance', np.nan]
})
System_Key Manufacturer Region Department
0 MER-002 Mercedes Americas NaN
1 MER-003 Mercedes Europe Sales
2 MER-004 Mercedes Americas NaN
3 MER-005 Mercedes Asia Operations
4 BAV-378 BMW Asia NaN
5 BAV-379 BMW Europe NaN
6 BAV-380 BMW Europe Accounting
7 BAV-381 BMW Europe NaN
8 AUD-220 Audi Americas Finance
9 AUD-221 Audi Asia Finance
10 AUD-222 Audi Americas Finance
11 AUD-223 Audi Americas NaN
First, I remove the NaN values in the data frame:
df = df.fillna('')
Then, I pivot the data frame as follows:
pivot = pd.pivot_table(df, index='Manufacturer', columns='Region', values='System_Key', aggfunc='size').applymap(str)
Notice that I'm passing aggfunc='size' for counting.
This results in the following pivot table:
Region Americas Asia Europe
Manufacturer
Audi 3.0 1.0 NaN
BMW NaN 1.0 3.0
Mercedes 2.0 1.0 1.0
How would I convert the float values in this pivot table to integers?
Thanks in advance!
Try fill_value:
pivot = pd.pivot_table(df, index='Manufacturer', columns='Region', values='System_Key', aggfunc='size', fill_value=-1).astype(int)
The only reason you get floats when aggregating integers is that some of the size() values are missing, i.e. NaN. So use fill_value=0 to impute them to zeros, avoiding the NaNs in the first place:
df.pivot_table(index='Manufacturer', columns='Region', values='System_Key', aggfunc='size', fill_value=0)
Region Americas Asia Europe
Manufacturer
Audi 3 1 0
BMW 0 1 3
Mercedes 2 1 1
Notes:
This is much better than kludging the dtype after.
You also don't need the df.fillna(''); filling NaN with the string '' on an integer(/float) column is a bad idea.
Note you don't need to call pd.pivot_table(df, ...); just call df.pivot_table(...) directly, since it's a method of the dataframe.
Since you have NaN in your data, pandas automatically downcasts to float. You can either use the Int64 datatype (available from pandas 0.24+):
pivot = (pd.pivot_table(df, index='Manufacturer', columns='Region',
                        values='System_Key', aggfunc='size')
         .astype('Int64'))
Output:
Region Americas Asia Europe
Manufacturer
Audi 3 1 <NA>
BMW <NA> 1 3
Mercedes 2 1 1
or fill NaN with, say, -1 in pivot_table:
pivot = (pd.pivot_table(df, index='Manufacturer', columns='Region',
                        values='System_Key', aggfunc='size',
                        fill_value=-1))  # <--- here
Output:
Region Americas Asia Europe
Manufacturer
Audi 3 1 -1
BMW -1 1 3
Mercedes 2 1 1
Use the Int64 datatype, which allows for integer NaNs. The convert_dtypes() function is handy here.
pivot.convert_dtypes()
Americas Asia Europe
Manufacturer
Audi 3 1 <NA>
BMW <NA> 1 3
Mercedes 2 1 1
Also...
I'd probably do df.fillna('', inplace=True) instead of df = df.fillna('') to minimize data copies
I assume you meant to ditch the .applymap(str) bit at the end of your call to pivot_table().
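And if you already have the float pivot in hand, a direct post-hoc conversion works too (a sketch, assuming the .applymap(str) is dropped and NaN should count as zero):
pivot = pivot.fillna(0).astype(int)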

populate a dataframe column based on a list

I have a dataframe
vehicle_make vehicle_model vehicle_year
Toyota Corolla 2016
Hyundai Sonata 2016
Cadillac DTS 2006
Toyota Prius 2014
Kia Optima 2015
I want to add a new column 'vehicle_make_category' which is populated based on lists I have:
luxury=['Bentley',
'Maserati',
'Hummer',
'Porsche',
'Lexus']
non_luxury=['Saab',
'Mazda',
'Dodge',
'Volkswagen',
'Kia',
'Chevrolet',
'Hyundai',
'Ford',
'Nissan',
'Honda',
'Toyota'
]
How can I accomplish this? I have tried using
df['vehicle_make_category'] = np.where(df['vehicle_make'] == i for i in luxury, 'luxury')
but it doesn't work...
Simply
df["vehicle_make_category"] = None
df.loc[df["vehicle_make"].isin(luxury), "vehicle_make_category"] = "luxury"
df.loc[df["vehicle_make"].isin(non_luxury), "vehicle_make_category"] = "non_luxury"
Use isin, and also give np.where a second value that fills the gaps where the condition does not evaluate as true:
df['vehicle_make_category'] = np.where(df.vehicle_make.isin(luxury),'luxury','non-luxury')
vehicle_make vehicle_model vehicle_year vehicle_make_category
0 Toyota Corolla 2016 non-luxury
1 Hyundai Sonata 2016 non-luxury
2 Cadillac DTS 2006 non-luxury
3 Toyota Prius 2014 non-luxury
4 Kia Optima 2015 non-luxury
Using np.select we can create a conditions list and assign values based on a condition being true
conditions = [df.vehicle_make.isin(luxury),df.vehicle_make.isin(non_luxury)]
df['vehicle_make_category'] = np.select(conditions,['luxury','non-luxury'],default='no-category')
vehicle_make vehicle_model vehicle_year vehicle_make_category
0 Toyota Corolla 2016 non-luxury
1 Hyundai Sonata 2016 non-luxury
2 Cadillac DTS 2006 no-category
3 Toyota Prius 2014 non-luxury
4 Kia Optima 2015 non-luxury
You can use df.join.
You'll have to make a new dataframe identifying luxury/nonluxury.
veh = ['toyota', 'hyundai', 'cadillac']
yr = [2016, 2016, 2016]
lux = ['non', 'non', 'lux']
# recreating your lux/non layout
n_lux = [veh[0], veh[1]]
lux = [veh[2]]
# then making a new column
b = ['non' if v in n_lux else 'lux' for v in veh]
A = pd.DataFrame(np.array([veh, yr]).T)
B = pd.DataFrame(np.array([veh, b]).T)
pd.concat([A, B], axis=1, keys=[0])
You can create the column via list comprehension:
df['vehicle_make_category'] = [
    'luxury' if row.vehicle_make in luxury
    else 'non_luxury'
    for _, row in df.iterrows()
]
You can create a lookup_df from the lists for non_luxury and luxury.
lookup_df = pd.DataFrame({
    'vehicle_make': luxury + non_luxury,
    'vehicle_make_category': (["luxury"] * len(luxury)) + (["non_luxury"] * len(non_luxury))
})
Then left join on the original df that you have.
df.merge(lookup_df, how='left',left_on='vehicle_make', right_on='vehicle_make')
Output:
vehicle_make vehicle_model vehicle_year vehicle_make_category
0 Toyota Corolla 2016 non_luxury
1 Hyundai Sonata 2016 non_luxury
2 Cadillac DTS 2006 NaN
3 Toyota Prius 2014 non_luxury
4 Kia Optima 2015 non_luxury
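A map-based sketch is another common idiom here (assuming the same luxury/non_luxury lists; makes in neither list come out as NaN, as in the merge above):
category_map = {make: 'luxury' for make in luxury}
category_map.update({make: 'non_luxury' for make in non_luxury})
df['vehicle_make_category'] = df['vehicle_make'].map(category_map)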

How to compare fields from two CSV files with an arithmetic condition?

I have two csv files. The first file contains names of all countries with their capital cities,
CSV 1:
Capital Country Country Code
Budapest Hungary HUN
Rome Italy ITA
Dublin Ireland IRL
Paris France FRA
Berlin Germany DEU
...
CSV 2:
The second CSV file contains the trip details of a bus:
Trip City Trip Country No. of pax
Budapest HUN 24
Paris FRA 36
Munich DEU 9
Florence ITA 5
Milan ITA 25
Rome ITA 2
Rome ITA 45
I would like to add a new column df["Tourism visit"] containing the number of pax, if the Trip City (from CSV 2) is the capital of a country (from CSV 1) and the number of pax is more than 10.
Thank you.
Try this:
mask = df2['Trip City'].isin(df1['Capital']) & (df2['No. of pax'] > 10)
df2['tourism'] = 0
df2.loc[mask, 'tourism'] = df2.loc[mask, 'No. of pax']
I get :
Trip_City Trip_Country No._of_pax tourism
0 Budapest HUN 24 24
1 Paris FRA 36 36
2 Munich DEU 9 0
3 Florence ITA 5 0
4 Milan ITA 25 0
5 Rome ITA 2 0
6 Rome ITA 45 45
(I had to add _s to get pd.read_clipboard() to work properly)
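The same logic also fits in a single np.where call (a sketch, assuming the same column names as above):
import numpy as np
df2['tourism'] = np.where(df2['Trip City'].isin(df1['Capital']) & (df2['No. of pax'] > 10), df2['No. of pax'], 0)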
This might also help.
Import the dfs:
df1 = pd.read_csv("CSV1.csv")
df2 = pd.read_csv("CSV2.csv")
Make a dictionary out of the pandas Series:
my_dict = dict(zip((df1["Country_Code"]), (df1["Capital"])))
Define a function that tests your conditions (note I used np.logical_and() to combine the conditions; a plain and would not work element-wise on arrays):
def isTourism(country_code, trip_city, No_of_pax):
    if np.logical_and((my_dict[country_code] == trip_city), (No_of_pax >= 10)):
        return "Yes"
    else:
        return "No"
Call the function with map:
df2["Tourism"] = list(map(isTourism,df2["Trip Country"],df2["Trip City"], df2["No. Of pax"]))
print(df2)
Trip City Trip Country No. Of pax Tourism
0 Budapest HUN 24 Yes
1 Paris FRA 36 Yes
2 Munich DEU 9 No
3 Florence ITA 5 No
4 Milan ITA 25 No
5 Rome ITA 2 No
6 Rome ITA 45 Yes
If you filter your second dataframe to only the values > 10, you could merge and sum as follows:
import pandas as pd

df1 = pd.DataFrame({'Capital': ['Budapest', 'Rome', 'Dublin', 'Paris', 'Berlin'],
                    'Country': ['Hungary', 'Italy', 'Ireland', 'France', 'Germany'],
                    'Country Code': ['HUN', 'ITA', 'IRL', 'FRA', 'DEU']
                    })
df2 = pd.DataFrame({'Trip City': ['Budapest', 'Paris', 'Munich', 'Florence', 'Milan', 'Rome', 'Rome'],
                    'Trip Country': ['HUN', 'FRA', 'DEU', 'ITA', 'ITA', 'ITA', 'ITA'],
                    'No. of pax': [24, 36, 9, 5, 25, 2, 45]
                    })
df2 = df2[df2['No. of pax'] > 10]
combined = df1.merge(df2,
                     left_on=['Capital', 'Country Code'],
                     right_on=['Trip City', 'Trip Country'],
                     how='left').groupby(['Capital', 'Country Code'],
                                         sort=False,
                                         as_index=False)['No. of pax'].sum()
print(combined)
This prints:
Capital Country Code No. of pax
0 Budapest HUN 24.0
1 Rome ITA 45.0
2 Dublin IRL NaN
3 Paris FRA 36.0
4 Berlin DEU NaN
