Rounding numbers with pandas - python

I have a pandas dataframe with a column that contains the numbers:
[4.534000e-01, 6.580000e-01, 1.349300e+00, 2.069180e+01, 3.498000e-01,...]
I want to round this column to 3 decimal places, for which I use the round(col) function; however, I have noticed that pandas gives me the following:
[0.453, 0.658, 1.349, 20.692, 0.35,...]
where the last element doesn't have three digits after the decimal.
I would like to have all the numbers rounded with the same amount of digits, for example, like: [0.453, 0.658, 1.349, 20.692, 0.350,...].
How can this be done within pandas?

You can use pandas.DataFrame.round to specify a precision.
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.round.html
import pandas as pd
# instantiate dataframe
dataframe = pd.DataFrame({'column_to_round': [4.534000e-01, 6.580000e-01, 1.349300e+00, 2.069180e+01, 3.498000e-01,]})
# create a new column with this new precision
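# DataFrame.round returns a DataFrame, so select the rounded column before assigning it
dataframe['set_decimal_level'] = dataframe.round({'column_to_round': 3})['column_to_round']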

import pandas as pd
df = pd.DataFrame([4.534000e-01, 6.580000e-01, 1.349300e+00, 2.069180e+01, 3.498000e-01], columns=['numbers'])
df.round(3)
Prints:
0.453 0.658 1.349 20.692 0.350
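The answers above round the stored values, but a float rounded to 0.35 carries no trailing zero once it leaves the DataFrame display (for example in a list). A minimal sketch of the difference, assuming the goal is a fixed number of digits in the output; the column name `numbers` follows the second answer, and converting to strings is one possible approach rather than the only one:

```python
import pandas as pd

df = pd.DataFrame({"numbers": [0.4534, 0.658, 1.3493, 20.6918, 0.3498]})

# round() changes the stored values, but they remain floats,
# so the last element is simply 0.35 with no trailing zero.
rounded = df["numbers"].round(3)

# Formatting to strings pads every value to exactly three decimals.
# The result is text, so it is meant for display or export, not arithmetic.
formatted = df["numbers"].map("{:.3f}".format)

print(rounded.tolist())
print(formatted.tolist())
```

Keeping the numeric column and formatting only at output time avoids losing the ability to compute with the values.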

Related

In pandas, how to convert the result of dividing two columns from a decimal to a percentage?

When using pandas, how do I convert a decimal to a percentage when the value is the result of dividing two columns and is stored as another column?
For example:
df_income_structure['C'] = (df_income_structure['A']/df_income_structure['B'])
If the value of df_income_structure['C'] is a decimal, how do I convert it to a percentage?
Format it like this:
df_income_structure.style.format({'C': '{:,.2%}'.format})
Change the number depending on how many decimal places you'd like.
Use basic pandas operators
For example, if we have a dataframe with columns named column1, column2, column3, ..., we can write:
Columns = ['column1', 'column2', 'column3', ....]
df[Columns] = df[Columns].div(df[Columns].sum(axis=1), axis=0).multiply(100)
df[Columns].sum(axis=1) computes the sum of each row (axis=1 sums across the columns).
df[Columns].div(..., axis=0) divides each row by its row sum (axis=0 aligns the divisor with the row index).
Multiplying the result by 100 turns the fractions into percentages.
I hope this answer has solved your problem.
Good luck
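As a rough illustration of the first answer's idea without the Styler, the ratio can be kept numeric and a separate text column can hold the percentage. This is only a sketch: the sample values and the column name `C_pct` are made up for the example.

```python
import pandas as pd

# Hypothetical sample data; only the column names A, B and C come from the question.
df_income_structure = pd.DataFrame({"A": [1, 3, 5], "B": [4, 4, 8]})

# Keep the numeric ratio for further calculations.
df_income_structure["C"] = df_income_structure["A"] / df_income_structure["B"]

# Build a display-only string column; the '%' format multiplies by 100 itself.
df_income_structure["C_pct"] = df_income_structure["C"].map("{:.2%}".format)

print(df_income_structure)
```

The `C_pct` column contains strings, so sorting or arithmetic should still use `C`.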

How to add two columns of a dataframe as Decimals?

I am trying to add two columns together using the Decimal module in Python but can't seem to get the syntax right for this. I have 2 columns called month1 and month2 and do not want these to become floats at any point in the outcome as division and then rounding will later be required.
The month1 and month2 columns are already to several decimals as they are averages and I need to preserve this accuracy in the addition.
I can see guidance online for how to add numbers together using Decimal but not how to apply it to columns in a pandas dataframe. I've tried things like:
df['MonthTotal'] = Decimal.decimal(df['Month1']) + Decimal.decimal(df['Month1'])
What is the solution?
from decimal import Decimal
def convert_decimal(row):
    row["monthtotal"] = Decimal(row["month1"]) + Decimal(row["month2"])
    return row
df = df.apply(convert_decimal, axis=1)
decimal.Decimal is designed to accept a single value, not a pandas.Series of them. Assuming that your columns hold strings representing numeric values, you can use .applymap to apply decimal.Decimal element-wise (in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map), i.e.:
import decimal
import pandas as pd
df = pd.DataFrame({'x':['0.1','0.1','0.1'],'y':['0.1','0.1','0.1'],'z':['0.1','0.1','0.1']})
df_decimal = df.applymap(decimal.Decimal)
df_decimal["total"] = df_decimal.x + df_decimal.y + df_decimal.z
print(df_decimal.total[0])
print(type(df_decimal.total[0]))
output
0.3
<class 'decimal.Decimal'>
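For the two columns in the question, a per-column variant might look like the sketch below. It assumes the values are available as strings (for example read with dtype=str), because constructing a Decimal from a float would carry over the float's binary representation error; the sample values are invented.

```python
from decimal import Decimal

import pandas as pd

# Hypothetical data: the values are strings so Decimal sees the exact digits.
df = pd.DataFrame({"month1": ["10.125", "0.1"], "month2": ["2.375", "0.2"]})

# Series.map applies Decimal to each element, giving object columns of Decimals.
month1 = df["month1"].map(Decimal)
month2 = df["month2"].map(Decimal)

# Adding two object Series adds the Decimal objects element by element.
df["monthtotal"] = month1 + month2

print(df["monthtotal"].tolist())
```

Object columns of Decimals are slower than float columns, which is the usual trade-off for exact decimal arithmetic.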

How do you display values in a pandas dataframe column with 2 decimal places?

What I am looking to do is make it so that regardless of the value, it displays 2 decimal places.
What I have tried thus far:
DF['price'] = DF['price'].apply(lambda x: round(x, 2))
However, the problem is that I wish to display everything in 2 decimal places, but values like 0.5 are staying at 1 decimal place since they don't need to be rounded.
Is there a function I can apply that gives the following type of output:
Current     After Changes
0           0.00
0.5         0.50
1.01        1.01
1.133333    1.13
Ideally, these values will be rounded but I am open to truncating if that is all that works.
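I think you want something like this:
DF['price'] = DF['price'].apply(lambda x: "{:.2f}".format(x))
This applies the change just to that column. Note that the column then holds strings; converting the result back with float() would drop the trailing zeros again.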
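You have to set the precision for pandas display. Put this at the top of your script after importing pandas (on recent pandas versions the short key 'precision' may be rejected as ambiguous, in which case use the full key 'display.precision'):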
import pandas as pd
pd.set_option('precision', 2)
If you want to only modify the format of your values without doing any operation in pandas, you should just execute the following instruction:
pd.options.display.float_format = "{:,.2f}".format
You should be able to get more info here:
https://pandas.pydata.org/docs/user_guide/options.html#number-formatting
Try:
import pandas as pd
pd.set_option('display.precision', 2)
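If changing the global option feels too broad, the same float format can be applied temporarily. The following is a sketch that assumes a small `price` column like the one in the question; it uses pd.option_context so the setting is restored afterwards.

```python
import pandas as pd

DF = pd.DataFrame({"price": [0, 0.5, 1.01, 1.133333]})

# The option only affects how floats are rendered inside the with-block;
# the stored values are not rounded or otherwise modified.
with pd.option_context("display.float_format", "{:.2f}".format):
    shown = DF.to_string()

print(shown)
```

Because only the rendering changes, later calculations still see the full-precision values.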

Table conversion in python

I have a table example input :
Energy1 Energy2
-966.463549649 -966.463549649
-966.463608088 -966.463585840
So I need a script that sums the two energies E1 and E2, converts the result with a factor of 627.51 (hartree to kcal/mol), and finally truncates the number to 4 digits.
I never attempted this with Python. I've always written this in Julia, but I think it should be simple.
Do you know how I can find an example of reading the table and then doing operations with the numbers in it?
something like:
import numpy
data = numpy.loadtxt('table.tab')
print(data[?:,?].sum())
You can use pandas for this if you convert the table to a csv file. You add the columns directly then use the apply function with lambda to multiply each of the elements by the conversion factor. To truncate to 4 digits, you can change pandas global settings to display the format as 1 digit + 3 decimal in scientific notation.
import pandas as pd
df = pd.read_csv('something.csv')
pd.set_option('display.float_format', '{:.3E}'.format)
df['Sum Energies'] = (df['Energy1'] + df['Energy2']).apply(lambda x: x*627.51)
print(df)
This outputs:
Energy1 Energy2 Sum Energies
0 -9.665E+02 -9.665E+02 -1.213E+06
1 -9.665E+02 -9.665E+02 -1.213E+06
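If "4 digits" means four decimal places rather than four significant digits, the table can also be read directly without converting it to CSV. The sketch below rests on two assumptions: the file is whitespace-separated with a header row, and rounding is acceptable in place of strict truncation. The table is inlined through io.StringIO only to keep the example self-contained.

```python
import io

import pandas as pd

HARTREE_TO_KCAL_PER_MOL = 627.51  # conversion factor given in the question

# Stand-in for the file; with a real file, pass its path instead of the StringIO.
table_text = """Energy1 Energy2
-966.463549649 -966.463549649
-966.463608088 -966.463585840
"""

# sep=r"\s+" splits on any run of whitespace, so no CSV conversion is needed.
df = pd.read_csv(io.StringIO(table_text), sep=r"\s+")

# Sum the two energies, convert the units, and round to four decimal places.
total = (df["Energy1"] + df["Energy2"]) * HARTREE_TO_KCAL_PER_MOL
df["Sum Energies"] = total.round(4)

print(df["Sum Energies"].tolist())
```

Multiplying the whole column at once replaces the apply with a lambda and gives the same numbers.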

Selecting rows by value in a floating point column in pandas

I import a csv data file into a pandas DataFrame df with pd.read_csv. The text file contains a column with strings like these:
y
0.001
0.0003
0.0001
3e-05
1e-05
1e-06
If I print the DataFrame, pandas outputs the decimal representation of these values with 6 digits after the decimal point, and everything looks good.
When I try to select rows by value, like here:
df[df['y'] == value],
by typing the corresponding decimal representation of value, pandas correctly matches certain values (example: rows 0, 2, 4) but does not match others (rows 1, 3, 5). This is of course due to the fact that those rows' values do not have an exact representation in base two.
I was able to work around this problem in this way:
df[abs(df['y']/value-1) <= 0.0001]
but it seems somewhat awkward. What I'm wondering is: numpy already has a method, .isclose, that is specifically for this purpose.
Is there a way to use .isclose in a case like this? Or a more direct solution in pandas?
Yes, you can use numpy's isclose
df[np.isclose(df['y'], value)]
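One detail worth checking with values this small: np.isclose defaults to rtol=1e-05 and atol=1e-08, and the absolute tolerance is already 1% of 1e-06. A sketch with a purely relative tolerance follows; the data mirrors the question's column, and the way `value` is computed is just an illustrative stand-in for a number that may differ from 0.0003 in the last bits.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"y": [0.001, 0.0003, 0.0001, 3e-05, 1e-05, 1e-06]})

# A computed value that should equal 0.0003 up to floating-point rounding.
value = 0.1 * 3 * 0.001

# atol=0.0 disables the absolute tolerance, so only the relative
# difference to `value` decides whether a row matches.
mask = np.isclose(df["y"], value, rtol=1e-9, atol=0.0)
matches = df[mask]

print(matches)
```

The mask is a plain boolean array, so it can be combined with other conditions using & and | as usual.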
