I am trying to add two columns together using the Decimal module in Python but can't seem to get the syntax right for this. I have 2 columns called month1 and month2 and do not want these to become floats at any point in the outcome as division and then rounding will later be required.
The month1 and month2 columns already carry several decimal places, as they are averages, and I need to preserve this accuracy in the addition.
I can see guidance online for how to add numbers together using Decimal but not how to apply it to columns in a pandas dataframe. I've tried things like:
df['MonthTotal'] = Decimal.decimal(df['Month1']) + Decimal.decimal(df['Month1'])
What is the solution?
from decimal import Decimal
def convert_decimal(row):
    # Convert both values to Decimal before adding, so the sum is a Decimal too
    row["monthtotal"] = Decimal(row["month1"]) + Decimal(row["month2"])
    return row
df = df.apply(convert_decimal, axis=1)
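One caveat with the approach above: if the columns hold floats, Decimal(value) captures the exact binary value of the float, which usually has a long tail of digits. A minimal sketch of one way around that, assuming the columns are floats and that their printed form is the value you want to keep, is to go through str first. The sample values and the helper name to_decimal are made up for illustration.

```python
from decimal import Decimal

import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({"month1": [0.1, 1.25], "month2": [0.2, 2.5]})


def to_decimal(value):
    # Going through str() keeps the short printed form of the float,
    # instead of its exact binary expansion.
    return Decimal(str(value))


# Each mapped column is an object-dtype Series of Decimal values,
# so "+" adds them element-wise with Decimal arithmetic.
df["monthtotal"] = df["month1"].map(to_decimal) + df["month2"].map(to_decimal)

print(df["monthtotal"].tolist())
```

The resulting column has object dtype, so later division and rounding also have to be done with Decimal operations rather than float ones.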
decimal.Decimal is designed to accept a single value, not a pandas.Series of them. Assuming that your columns hold strings representing numeric values, you can use .applymap to apply decimal.Decimal element-wise (in pandas 2.1 and later, DataFrame.map is the preferred name and applymap is deprecated), i.e.:
import decimal
import pandas as pd
df = pd.DataFrame({'x':['0.1','0.1','0.1'],'y':['0.1','0.1','0.1'],'z':['0.1','0.1','0.1']})
df_decimal = df.applymap(decimal.Decimal)
df_decimal["total"] = df_decimal.x + df_decimal.y + df_decimal.z
print(df_decimal.total[0])
print(type(df_decimal.total[0]))
output
0.3
<class 'decimal.Decimal'>
Related
I have a dataframe that looks like this.
df = pd.DataFrame([0.70,1.0,0.75,0.0,5.0], columns=['pitch'])
I want to convert it into
df = pd.DataFrame([0.7,1,0.75,0,5], columns=['pitch'])
If I convert the floats to int, 0.7 becomes 0.
How can I solve this problem? Thanks!
You can use the astype(int) function in conjunction with the round() function on the dataframe. This rounds each value in the 'pitch' column to the desired number of decimal places and then converts the rounded values to integers. Note that astype(int) drops the fractional part, so after round(1) a value such as 0.7 still becomes 0.
df['pitch'] = df['pitch'].round(1).astype(int)
Or use round(0) to round to the nearest whole number before converting:
df['pitch'] = df['pitch'].round(0).astype(int)
Be careful when using astype(int) to convert to int: it truncates toward zero rather than rounding, so 0.7 and 0.75 both become 0 unless you round first.
You can use the round() function to round the decimal values to the desired number of decimal places before converting them to integers. For example, you can use round(df['pitch'], 1) to round the values in the 'pitch' column to one decimal place, then use astype(int) to convert them to integers.
df['pitch'] = round(df['pitch'], 1).astype(int)
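A float column cannot hold 0.7 and an integer 1 side by side, so if the goal is only to show the values without trailing zeros, one option is to format them as text. This is a minimal sketch under that assumption; the column name pitch_text is made up for illustration, and "{:g}" keeps six significant digits by default, which may be too few for other data.

```python
import pandas as pd

df = pd.DataFrame([0.70, 1.0, 0.75, 0.0, 5.0], columns=["pitch"])

# "{:g}" drops trailing zeros and a bare trailing decimal point.
# The result is a column of strings, meant for display only.
df["pitch_text"] = df["pitch"].map("{:g}".format)

print(df["pitch_text"].tolist())
```

The original pitch column stays numeric, so calculations can still use it.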
How can I retain only 2 decimals for each value in a pandas Series? (I'm working with latitudes and longitudes.) The dtype is float64.
series = [-74.002568, -74.003085, -74.003546]
I tried using the round function but, as the name suggests, it rounds. I looked into trunc() but this can only remove all decimals. Then I figured, why not try running a for loop? I tried the following:
for i in series:
    i = "{0:.2f}".format(i)
I was able to run the code without any errors but it didn't modify the data in any way.
Expected output would be the following:
[-74.00, -74.00, -74.00]
Does anyone know how to achieve this? Thanks!
series = [-74.002568, -74.003085, -74.003546]
["%0.2f" % (x,) for x in series]
['-74.00', '-74.00', '-74.00']
This converts your data to strings (object dtype), so it is for display purposes only. If you need the values for calculations, you can cast them back to float; only one decimal digit will then be visible, because floats are not printed with trailing zeros.
[float('{0:.2f}'.format(x)) for x in series]
[-74.0, -74.0, -74.0]
Here is one way to do it.
import pandas as pd
# You described a Series but defined only a list,
# so this assumes you meant a pandas.Series.
series = [-74.002568, -74.003085, -74.003546]
s = pd.Series(series)
# use regex extract to pick the number until first two decimal places
out=s.astype(str).str.extract(r"(.*\..{2})")[0]
out
0    -74.00
1    -74.00
2    -74.00
Name: 0, dtype: object
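If the values need to stay numeric, a common alternative is to truncate arithmetically: scale by 100, drop the fraction with numpy.trunc, and scale back. This is a sketch, not the only way to do it, and the variable names truncated and shown are made up for illustration. Binary floating point can make a value that looks like exactly x.xx sit slightly below it, so results at such boundaries may be one step lower than expected.

```python
import numpy as np
import pandas as pd

s = pd.Series([-74.002568, -74.003085, -74.003546])

# Scale up, cut off the fractional part (toward zero), and scale back down.
truncated = np.trunc(s * 100) / 100

# The Series is still float64; formatting is only needed to show two decimals.
shown = truncated.map("{:.2f}".format)

print(truncated.tolist())
print(shown.tolist())
```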
Change the display options. This shouldn't change your underlying data.
pd.options.display.float_format = "{:,.2f}".format
I have a pandas dataframe with a column that contains the numbers:
[4.534000e-01, 6.580000e-01, 1.349300e+00, 2.069180e+01, 3.498000e-01,...]
I want to round this column to 3 decimal places, for which I use the round() function; however, I have noticed that pandas gives me the following:
[0.453, 0.658, 1.349, 20.692, 0.35,...]
where the last element doesn't have three digits after the decimal.
I would like to have all the numbers shown with the same number of digits after the decimal point, for example: [0.453, 0.658, 1.349, 20.692, 0.350,...].
How can this be done within pandas?
You can use pandas.DataFrame.round to specify a precision.
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.round.html
import pandas as pd
# instantiate dataframe
dataframe = pd.DataFrame({'column_to_round': [4.534000e-01, 6.580000e-01, 1.349300e+00, 2.069180e+01, 3.498000e-01,]})
# create a new column with this new precision
dataframe['set_decimal_level'] = dataframe['column_to_round'].round(3)
import pandas as pd
df = pd.DataFrame([4.534000e-01, 6.580000e-01, 1.349300e+00, 2.069180e+01, 3.498000e-01], columns=['numbers'])
df.round(3)
Prints:
0.453 0.658 1.349 20.692 0.350
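Rounding changes the stored values, but a float such as 0.35 has no trailing zero of its own, so a list of the rounded values still prints 0.35. If every value must show exactly three decimals, one option is to format the numbers as text. A minimal sketch, with the column name as_text made up for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    [4.534000e-01, 6.580000e-01, 1.349300e+00, 2.069180e+01, 3.498000e-01],
    columns=["numbers"],
)

# "{:.3f}" rounds to three decimals and pads with zeros where needed.
# The result is a column of strings, suitable for display or export.
df["as_text"] = df["numbers"].map("{:.3f}".format)

print(df["as_text"].tolist())
```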
When using pandas, how do I convert a decimal to a percentage when the value is obtained by dividing two columns and stored in another column?
For example:
df_income_structure['C'] = (df_income_structure['A']/df_income_structure['B'])
If the value of df_income_structure['C'] is a decimal, how can I convert it to a percentage?
Format it like this:
df_income_structure.style.format({'C': '{:,.2%}'.format})
Change the number depending on how many decimal places you'd like.
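The Styler only affects how the frame is rendered. If you would rather keep the percentage as text in the dataframe itself, a sketch of one alternative is to map a format string over the column. The sample values and the column name C_pct are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names A, B and C follow the question.
df_income_structure = pd.DataFrame({"A": [1, 1], "B": [4, 3]})
df_income_structure["C"] = df_income_structure["A"] / df_income_structure["B"]

# "{:.2%}" multiplies by 100, keeps two decimals and appends a percent sign.
df_income_structure["C_pct"] = df_income_structure["C"].map("{:.2%}".format)

print(df_income_structure["C_pct"].tolist())
```

Column C stays numeric, so it can still be used in further calculations.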
Use basic pandas operators
For example, if we have a dataframe with columns named column1, column2, column3, and so on, we can write:
Columns = ['column1', 'column2', 'column3', ...]  # list your column names here
df[Columns] = df[Columns].div(df[Columns].sum(axis=1), axis=0).multiply(100)
df[Columns].sum(axis=1) computes the total of each row (axis=1 sums across the columns).
.div(..., axis=0) divides every value by the total of its row (axis=0 matches the totals to the rows by index).
.multiply(100) turns the resulting fractions into percentages.
I hope this answer has solved your problem.
Good luck
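The row-percentage steps can be checked on a tiny frame. This is a sketch with made-up numbers; the names columns and pct are for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"column1": [1, 2], "column2": [3, 2]})
columns = ["column1", "column2"]

# Row totals: 4 for the first row, 4 for the second.
row_totals = df[columns].sum(axis=1)

# Divide each value by the total of its own row, then scale to percent.
pct = df[columns].div(row_totals, axis=0).multiply(100)

print(pct)
```

Each row of pct adds up to 100.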
I have a table example input :
Energy1 Energy2
-966.463549649 -966.463549649
-966.463608088 -966.463585840
So I need a script that sums the two energies E1 and E2, converts the result with a factor of 627.51 (hartree to kcal/mol), and at the end truncates the number to 4 digits.
I never attempted this with Python. I've always written this in Julia, but I think it should be simple.
Do you know how I can find an example of reading the table and then doing operations with the numbers in it?
something like:
import numpy
data = numpy.loadtxt('table.tab')
print(data[?:,?].sum())
You can use pandas for this if you convert the table to a csv file. You add the columns directly, then use the apply function with a lambda to multiply each element by the conversion factor. To show 4 digits, you can change the pandas global display setting to scientific notation with 1 digit before and 3 digits after the decimal point. Note that this only changes how the values are displayed (rounded, not truncated); the stored values keep their full precision.
import pandas as pd
df = pd.read_csv('something.csv')
pd.set_option('display.float_format', '{:.3E}'.format)
df['Sum Energies'] = (df['Energy1'] + df['Energy2']).apply(lambda x: x*627.51)
print(df)
This outputs:
     Energy1    Energy2  Sum Energies
0 -9.665E+02 -9.665E+02    -1.213E+06
1 -9.665E+02 -9.665E+02    -1.213E+06
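pandas can also read the whitespace-separated table directly, without converting it to csv first. The sketch below reads the sample rows from the question out of an in-memory string (a file path works the same way) and assumes that "4 digits" means four decimal places, truncated rather than rounded. The constant name and the output column name are made up for illustration.

```python
import io

import numpy as np
import pandas as pd

# The sample table from the question; pass a file path instead for real data.
table = io.StringIO(
    "Energy1 Energy2\n"
    "-966.463549649 -966.463549649\n"
    "-966.463608088 -966.463585840\n"
)

# sep=r"\s+" splits on any run of whitespace.
df = pd.read_csv(table, sep=r"\s+")

HARTREE_TO_KCAL_MOL = 627.51  # conversion factor given in the question

kcal = (df["Energy1"] + df["Energy2"]) * HARTREE_TO_KCAL_MOL

# Truncate (not round) to four decimal places; the values stay numeric.
df["Sum kcal/mol"] = np.trunc(kcal * 1e4) / 1e4

print(df["Sum kcal/mol"].map("{:.4f}".format).tolist())
```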