I'm working on a project that lets users track different data types over time. Part of the core idea is that a user should be able to enter data in whatever units they need.
http://pypi.python.org/pypi/units/
and quantities:
http://pypi.python.org/pypi/quantities/
However, I'm not sure which is the better way to go. From what I can tell, quantities is more complex, but includes a better initial list of units.
I applaud the use of explicit units in scientific computing applications. Using explicit units is analogous to brushing your teeth: it adds some tedium up front, but the type safety you get can save a lot of trouble in the long run. Like, say, not crashing $125 million orbiters into planets.
You should also probably check out these two other python unit/quantity packages:
Unum
Scientific.Physics.PhysicalQuantity
I once investigated Scientific.Physics.PhysicalQuantity. It did not quite meet my needs, but might satisfy yours. It's hard to tell what features you need from your brief description.
I ended up writing my own python package for unit conversion and dimensional analysis, but it is not properly packaged for release yet. We are using my unit system in the python bindings for our OpenMM system for GPU accelerated molecular mechanics. You can browse the svn repository of my python units code at:
SimTK python units
Eventually I intend to package it for distribution. If you find it interesting, please let me know. That might motivate me to package it up sooner. The features I was looking for when I was designing the SimTK python units system included the following:
Units are NOT necessarily stored in terms of SI units internally. This is very important for me, because one important application area for us is at the molecular scale. Using SI units internally can lead to exponent overflow in commonly used molecular force calculations. Internally, all unit systems are equally fundamental in SimTK.
I wanted similar power and flexibility to the Boost.Units system in C++. Both because I am familiar with that system, and because it was designed under the scrutiny of a large group of brilliant engineers. Boost.Units is a well crafted second generation dimensional analysis system. Thus I might argue that the SimTK units system is a third generation system :). Be aware that while Boost.Units is a "zero overhead" system with no runtime cost, all python quantity implementations, including SimTK units, probably exact a runtime cost.
I wanted dimensioned Quantities that are compatible with numpy arrays, but do not necessarily require the Python numpy package. In other words, Quantities can be based on either numpy arrays or on built-in Python types.
What features are important to you?
Pint has recently come onto the scene. Anybody care to share their experiences? It looks good. FYI: it looks like Pint will be integrated with Uncertainties in the near future.
There is another package called unyt from the yt-project. The authors of unyt acknowledge the existence of Pint and astropy.units. Conversions from and to these other packages are supported.
The selling point of unyt is speed. It is faster than the other two. The unit packages are compared in several benchmarks in this paper.
The benchmarks are disappointing for anyone obsessed with performance. :-( The slowdown of calculations with any of these unit systems is large. The slowdown factor is 6-10 for arrays with 1000 entries (worse for smaller arrays).
Disclaimer: I am not affiliated with unyt, I just want to share what I learned about unit systems.
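For reference, basic unyt usage looks roughly like this (a sketch based on my reading of the unyt docs; double-check the names against the current API):

import unyt

# Attach units by multiplying plain numbers (or lists/arrays) by unit objects.
d = 3.0 * unyt.km
print(d.to("m"))            # 3000.0 m

speeds = [10.0, 20.0] * unyt.m / unyt.s
print(speeds.to("km/hr"))   # [36. 72.] km/hr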
Note that quantities has very bad support for temperature:
>>> (100 * pq.degC).rescale(pq.degF)
array(179.99999999999997) * degF
>>> (0 * pq.degC).rescale(pq.degF)
array(0.0) * degF
0 degrees Celsius isn't 0 degrees Fahrenheit. Their framework doesn't support any kind of conversion that isn't just multiplying by a factor.
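For comparison, Pint (mentioned elsewhere in this thread) does handle offset units like these, as long as you build the quantity explicitly rather than by multiplication. A quick sketch, based on my understanding of Pint's offset-unit rules:

import pint

ureg = pint.UnitRegistry()
Q_ = ureg.Quantity

# Offset units like degC must go through Quantity(); plain multiplication
# is rejected precisely because the conversion is not a pure scale factor.
print(Q_(100, ureg.degC).to(ureg.degF))  # 212.0 degree_Fahrenheit
print(Q_(0, ureg.degC).to(ureg.degF))    # 32.0 degree_Fahrenheit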
I am surprised that nobody has mentioned SymPy yet. SymPy is a mature and well-maintained symbolic mathematics library for Python that is, moreover, a NumFOCUS-sponsored project.
It has a physics module with many useful classes and functions for "solving problems in physics". Most relevant for you, it has a units sub-module that contains everything you need, I think; just read the excellent documentation.
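A small taste of what that looks like (a sketch using sympy.physics.units; convert_to is available in recent SymPy versions):

from sympy.physics.units import convert_to, kilometer, meter, hour, second

# Quantities are ordinary SymPy expressions with unit symbols attached.
speed = 72 * kilometer / hour
print(convert_to(speed, meter / second))  # 20*meter/second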
You may want to look at a new package called natu. It addresses the three issues that @ChristopherBruns listed. It's available on PyPI.
I'm the author of that package, and I'd appreciate any comments or suggestions.
It looks like another package has come out for doing this as well, written by Massimo DiPierro of web2py fame, called Buckingham.
Also of note, Brian has had something like this for some time.
I thought I'd mention the units package, which is part of the Astropy package.
It's well maintained, easy to use, and has all the basic units (as well as astrophysics-related units).
It provides tools for both units and quantities. And there's also a module for physical constants.
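A minimal sketch of the astropy.units style (from my own recollection of the API; see the astropy docs for details):

from astropy import units as u
from astropy.constants import c  # the physical constants module

d = 42.0 * u.km
print(d.to(u.m))         # 42000.0 m
print(c.to(u.km / u.s))  # ~299792.458 km / s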
I'd like to point to a separate library for dealing with units: Barril
https://github.com/ESSS/barril
Docs at: https://barril.readthedocs.io/en/latest/
While it does have support for creating "random" units from computation (as Pint, unum, etc. do), it's more tailored to having a database of units (which the library has by default -- see: https://barril.readthedocs.io/en/latest/units.html and the implementation: https://github.com/ESSS/barril/blob/master/src/barril/units/posc.py), which you can then query and transform against.
One thing it supports that makes a big difference in that regard is handling unit conversions that would otherwise be "dimensionless" -- such as m3/m3 (i.e., volume per volume) -- and then converting to cm3/m3 while keeping the dimension.
For example, in Pint:
>>> import pint
>>> ureg = pint.UnitRegistry()
>>> m = ureg.meter
>>> v = 1 * (m**3)/(m**3)
>>> v
<Quantity(1.0, 'dimensionless')>
And then, after that (as far as I know), it's not really possible to do additional unit conversions properly knowing that it was m3/m3.
In barril:
>>> from barril.units import Scalar
>>> a = Scalar(3, 'm3/m3')
>>> a.GetValue('cm3/m3')
3000000.0
>>> a.category
'volume per volume'
>>> a.unit
'm3/m3'
and something such as a.GetValue('m3') (an invalid target unit) would give an error saying that the conversion is actually invalid.
The unit database (which was initially based on the POSC Units of Measure Dictionary) is a bit more tailored for the Oil & Gas field, but should be usable outside of it too.
I think you should use quantities, because a quantity has some units associated with it.
Pressure, for example, will be a quantity that could be entered in and converted to different units (Pa, psi, atm, etc.). You could probably create new quantities specific to your application.
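A minimal sketch of what that looks like with the quantities package (from memory of the API, so verify the unit names):

import quantities as pq

p = 1.0 * pq.atm
print(p.rescale(pq.psi))     # ~14.696 psi
print(p.rescale(pq.pascal))  # 101325.0 Pa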
My preferred package is QuantiPhy. It takes a different approach than most other packages. With QuantiPhy the units are simply strings, and the package is largely used when reading or writing quantities. As such, it is much easier to incorporate into your software. QuantiPhy supports unit and scale factor conversion both when creating quantities and when rendering them to strings. Here is an example that reads and then writes a table of times and temperatures, converting from minutes/°F to seconds/K on the way in and back to the original units on the way out:
>>> from quantiphy import Quantity
>>> rawdata = '0 450, 10 400, 20 360'
>>> data = []
>>> for pair in rawdata.split(','):
... time, temp = pair.split()
... time = Quantity(time, 'min', scale='s')
... temp = Quantity(temp, '°F', scale='K')
... data += [(time, temp)]
>>> for time, temp in data:
... print(f'{time:9q} {temp:9q}')
0 s 505.37 K
600 s 477.59 K
1.2 ks 455.37 K
>>> for time, temp in data:
... print(f"{time:<7smin} {temp:s°F}")
0 min 450 °F
10 min 400 °F
20 min 360 °F
I find the units packages to be more than what I want. It doesn't take much code to start building your own functions that refer back to the very few basic fundamental numbers. Also, it forces you to do the dimensional analysis, which prevents errors.
def FtoC(Tf):
return (Tf-32)*5/9
def CtoF(Tc):
return 9*Tc/5+32
def CtoK(Tc):
return Tc+273.15
def INCHtoCM(Inch):
return 2.54 * Inch
def CMtoINCH(cm):
return cm / INCHtoCM(1)
def INCHtoMETER(inch):
return .01*INCHtoCM(inch)
def FOOTtoMETER(foot):
return INCHtoMETER(12*foot)
def METERtoINCH(Meter):
return CMtoINCH(100 * Meter)
def METERtoFOOT(Meter):
return METERtoINCH(Meter)/12
def M3toINCH3(M3):
    # scale by (inches per meter)**3; cubing the converted value, as in
    # (METERtoINCH(M3))**3, would only be correct for M3 == 1
    return METERtoINCH(1)**3 * M3
def INCH3toGALLON(Inch3):
return Inch3 / 231
def M3toGALLON(M3):
return INCH3toGALLON(M3toINCH3(M3))
def KG_M3toLB_GALLON(KGperM3):
return KGtoLBM(KGperM3) / M3toGALLON(1)
def BARtoPASCAL(bar):
return 100000 * bar
def KGtoLBM(kilogram):
return kilogram * 2.20462262185
def LBMtoKG(lbm):
return lbm/KGtoLBM(1)
def NEWTONtoLBF(newton):
return newton * KGtoLBM(1) * METERtoFOOT(1) / STANDARD_GRAVITY_IMPERIAL()
def LBFtoNEWTON(lbf):
return lbf * STANDARD_GRAVITY_IMPERIAL() * LBMtoKG(1) * FOOTtoMETER(1)
def STANDARD_GRAVITY_IMPERIAL():
return 32.174049
def STANDARD_GRAVITY_SI():
return 9.80665
def PASCALtoPSI(pascal):
return pascal * NEWTONtoLBF(1) / METERtoINCH(1)**2
def PSItoPASCAL(psi):
return psi * LBFtoNEWTON(1) / INCHtoMETER(1)**2
Then let's say you want to plot the static head pressure of 1,3-butadiene at 44 °F, and you use gauges in PSI because you live in the US, but the density tables are in SI units, as they should be...
import numpy as np  # needed for np.arange and np.array below

# butadiene temperature in Fahrenheit
Tf = 44
# DIPPR105 Equation Parameters (Density in kg/m3, T in K)
# valid in temperature 165 to 424 Kelvin
A=66.9883
B=0.272506
C=425.17
D=0.288139
# array of pressures in psi
Pscale = np.arange(0,5,.1, dtype=float)
Tk = CtoK(FtoC(Tf))
Density = A / (B**(1+(1-Tk/C)**D)) # KG/M3
Height = [PSItoPASCAL(P) / (Density * STANDARD_GRAVITY_SI()) for P in Pscale]
Height_inch = METERtoINCH(1) * np.array(Height, dtype=np.single)
Another package to mention is Axiompy.
Installation: pip install axiompy
from axiompy import Units
units = Units()
print(units.unit_convert(3 * units.metre, units.foot))
<Value (9.84251968503937 <Unit (foot)>)>
Introduction
Today I found some weird behaviour in Python while running experiments with exponentiation, and I was wondering if someone here knows what's happening. In my experiments, I was trying to check which is faster in Python: int**int or float**float. To check that, I ran some small snippets, and I found a really weird behaviour.
Weird results
My first approach was just to write some for loops and prints to check which one is faster. The snippet I used is this one:
import time

EXPERIMENTS = 1_000_000  # iteration count; not given in the original post, so this value is assumed

# Run powers outside a method
ti = time.time()
for i in range(EXPERIMENTS):
x = 2**2
tf = time.time()
print(f"int**int took {tf-ti:.5f} seconds")
ti = time.time()
for i in range(EXPERIMENTS):
x = 2.**2.
tf = time.time()
print(f"float**float took {tf-ti:.5f} seconds")
After running it I got
int**int took 0.03004 seconds
float**float took 0.03070 seconds
Cool, it seems that data types do not affect the execution time. However, since I try to be a clean coder, I refactored the repeated logic into a function, power_time:
import time
# Run powers in a method
def power_time(base, exponent):
ti = time.time()
for i in range(EXPERIMENTS):
x = base ** exponent
tf = time.time()
return tf-ti
print(f"int**int took {power_time(2, 2):.5f} seconds")
print(f"float**float took {power_time(2., 2.):5f} seconds")
And to my surprise, I got these results:
int**int took 0.20140 seconds
float**float took 0.05051 seconds
The refactor didn't affect the float case much, but it multiplied the time required for the int case by ~7.
Conclusions and questions
Apparently, running something in a method can slow down your process depending on your data types, and that's really weird to me.
Also, if I run the same experiments but replace ** with * or +, the weird results disappear, and all the approaches give more or less the same results.
Does someone know why this is happening? Am I missing something?
Apparently, running something in a method can slow down your process depending on your data types, and that's really weird to me.
It would be really weird if it were not like this! You can write your own class with its own ** operator (by implementing the __pow__(self, other) method), and you could, for example, sleep for 1 s in there. Why should that take as long as raising a float to the power of another?
So, yeah, Python is a dynamically typed language. So, the operations done on data depend on the type of that data, and things can generally take different times.
In your first example, the difference never arises because a) the constant expression 2**2 is folded to 4 by CPython's compile-time optimizer, so it never needs to be evaluated inside the loop, and b) even if that were not the case, the time it costs to run one loop iteration in Python is hundreds of times what it takes to actually execute the math here - again, dynamically typed, dynamically named.
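You can see the folding directly with the dis module (a quick check you can run yourself):

import dis

# The compiled bytecode already contains the folded constant 4;
# no exponentiation happens inside the loop at all.
dis.dis(compile("x = 2**2", "<example>", "exec"))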
base**exponent is a whole different story. Nothing about it is constant, so there's actually going to be a calculation every iteration.
Now, the ** operator (__pow__ in the Python data model) for Python's built-in float type is specified to perform floating-point exponentiation (which is implemented in highly optimized C and assembly), as exponentiation can be done elegantly on floating-point numbers; look for nb_power in CPython's floatobject.c. So, for the float case, the actual calculation is "free" for all that matters - again, because your loop is limited by how much effort it takes to resolve all the names, types and functions to call in your loop, not by doing the actual math, which is trivial.
The ** operator on Python's built-in int type is not as neatly optimized. It's a lot more complicated: it needs to do checks like "if the exponent is negative, return a float", and it does not do elementary math that your computer can do with a single instruction. It handles arbitrary-length integers (remember, a Python integer takes as many bytes as it needs - you can store numbers larger than 64 bits in one!), which comes with allocations and deallocations. (I encourage you to read long_pow in CPython's longobject.c; it is about 200 lines.)
All in all, integer exponentiation is expensive in Python because of Python's type system.
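If you want to measure the operator itself without the pitfalls above, timeit with variables in the setup defeats constant folding. A sketch of that measurement (not your original code):

import timeit

# Variables force a real evaluation every pass, so the int/float gap shows up.
print(timeit.timeit("base ** exponent", setup="base, exponent = 2, 2"))
print(timeit.timeit("base ** exponent", setup="base, exponent = 2.0, 2.0"))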
This function works fine, but it takes too much time to solve. Please suggest how I can improve the solving time.
from sympy.solvers import solve
from sympy import Symbol
QD = 25.45
CDI = 0.65
AIN = 33.6
GTL = 10
GTSELV = 2300.1
CDGT = 1.9
def fun(HWE):
TWE = Symbol('TWE')
expression = (CDI*AIN*(2*9.81*(HWE-TWE))**0.5) - (CDGT*GTL*(TWE-GTSELV)**1.5)-QD
solution = solve(expression)
return solution
Calling fun(2303) gives [2302.23386564786], which is correct, but the solve takes about 30 seconds. I need to run this for many arguments.
The dReal system can handle these sorts of problems, using the notion of delta-satisfiability. (See http://dreal.github.io for details.)
This is how your program is coded using dReal's Python interface (To install, see the notes at https://github.com/dreal/dreal4#python-binding):
from dreal import *
QD = 25.45
CDI = 0.65
AIN = 33.6
GTL = 10
GTSELV = 2300.1
CDGT = 1.9
def fun(HWE):
TWE = Variable("TWE")
expression = (CDI*AIN*(2*9.81*(HWE-TWE))**0.5) - (CDGT*GTL*(TWE-GTSELV)**1.5)-QD
return (expression == 0)
result = CheckSatisfiability(fun(2303), 0.001)
print(result)
When I run it on my now 3-year-old computer, I get:
$ time python a.py
TWE : [2302.2338656478555, 2302.2338656478582]
python3 a.py 0.03s user 0.01s system 92% cpu 0.044 total
So, it takes about 0.044 seconds to run, and that includes loading the entire Python ecosystem. (So, if you run many problems one after another, each instance should go even faster.)
Note that dReal shows you an interval for the acceptable solution, within a user-specified numerical error bound. The bound is the second argument to CheckSatisfiability, which we set at 0.001 for this problem. You can increase this precision at the cost of potentially more computation time, but 0.001 seems to be doing quite well in this case. Also note that you get an "interval" for the solution for each variable. If you increase the precision, this interval might get smaller. For instance, when I change the call to:
result = CheckSatisfiability(fun(2303), 0.0000000000001)
I get:
$ time python a.py
TWE : [2302.2338656478569, 2302.2338656478569]
python3 a.py 0.03s user 0.01s system 84% cpu 0.050 total
where the interval has been reduced to a single point, but the program took slightly longer to run. For each problem, you should experiment with an appropriate delta to make sure the interval you get for the results is reasonable.
Use solve when you want a solution in terms of symbols. Use nsolve when you want a numerical solution. In your case, replace solve with nsolve and add the argument HWE to the call statement (i.e. nsolve(expression, HWE)). That 2nd argument is a guess at where the solution is near. Alternatively, give fun a 2nd arg guess and use that as the 2nd arg for nsolve. If you slowly change some parameter, using the last solution as the guess for the next solution will speed up the process (which is already quite fast).
If you know that the solution is real then you might want to take the real part of it with re(solution) since the solution comes back with a small imaginary component.
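Putting that together, a sketch of the suggested change (same constants as the question; the guess argument and the re() call are the additions):

from sympy import Symbol, nsolve, re

QD = 25.45
CDI = 0.65
AIN = 33.6
GTL = 10
GTSELV = 2300.1
CDGT = 1.9

def fun(HWE, guess=None):
    TWE = Symbol('TWE')
    expression = (CDI*AIN*(2*9.81*(HWE-TWE))**0.5) - (CDGT*GTL*(TWE-GTSELV)**1.5) - QD
    # Use HWE itself as the starting guess when none is supplied.
    solution = nsolve(expression, TWE, guess if guess is not None else HWE)
    return re(solution)  # drop the tiny spurious imaginary component

print(fun(2303))  # ~2302.2338..., in a fraction of a second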
My goal is to perform a parameter estimation (model calibration) using PyGMO. My model will be an external "black box" model (C code) outputting the objective function J to be minimized (J in this case will be the "Normalized Root Mean Square Error" (NRMSE) between model outputs and measured data). To speed up the optimization (calibration), I would like to run my models/simulations on multiple cores/threads in parallel. Therefore I would like to use a batch fitness evaluator (bfe) in PyGMO. I prepared a minimal example using a simple problem class, in pure Python (no external model), based on the Rosenbrock problem:
#!/usr/bin/env python
# coding: utf-8
import numpy as np
from fmpy import read_model_description, extract, simulate_fmu, freeLibrary
from fmpy.fmi2 import FMU2Slave
import pygmo as pg
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from matplotlib import cm
import time
#-------------------------------------------------------
def main():
# Optimization
# Define problem
class my_problem:
def __init__(self, dim):
self.dim = dim
def fitness(self, x):
J = np.zeros((1,))
for i in range(len(x) - 1):
J[0] += 100.*(x[i + 1]-x[i]**2)**2+(1.-x[i])**2
return J
def get_bounds(self):
return (np.full((self.dim,),-5.),np.full((self.dim,),10.))
def get_name(self):
return "My implementation of the Rosenbrock problem"
def get_extra_info(self):
return "\nDimensions: " + str(self.dim)
def batch_fitness(self, dvs):
J = [123] * len(dvs)
return J
prob = pg.problem(my_problem(30))
print('\n----------------------------------------------')
print('\nProblem description: \n')
print(prob)
#-------------------------------------------------------
dvs = pg.batch_random_decision_vector(prob, 1)
print('\n----------------------------------------------')
print('\nBatch fitness evaluation:')
print('\ndvs length:' + str(len(dvs)))
print('\ndvs:')
print(dvs)
udbfe = pg.default_bfe()
b = pg.bfe(udbfe=udbfe)
print('\nudbfe:')
print(udbfe)
print('\nbfe:')
print(b)
fvs = b(prob, dvs)
print(fvs)
#-------------------------------------------------------
pop_size = 50
gen_size = 1000
algo = pg.algorithm(pg.sade(gen = gen_size)) # The algorithm (a self-adaptive form of Differential Evolution (sade - jDE variant)
algo.set_verbosity(int(gen_size/10)) # We set the verbosity to 100 (i.e. each 100 gen there will be a log line)
print('\n----------------------------------------------')
print('\nOptimization:')
start = time.time()
pop = pg.population(prob, size = pop_size) # The initial population
pop = algo.evolve(pop) # The actual optimization process
best_fitness = pop.get_f()[pop.best_idx()] # Getting the best individual in the population
print('\n----------------------------------------------')
print('\nResult:')
print('\nBest fitness: ', best_fitness) # Get the best parameter set
best_parameterset = pop.get_x()[pop.best_idx()]
print('\nBest parameter set: ',best_parameterset)
print('\nTime elapsed for optimization: ', time.time() - start, ' seconds\n')
if __name__ == '__main__':
main()
When I try to run this code I get the following error:
Exception has occurred: ValueError
function: bfe_check_output_fvs
where: C:\projects\pagmo2\src\detail\bfe_impl.cpp, 103
what: An invalid result was produced by a batch fitness evaluation: the number of produced fitness vectors, 30, differs from the number of input decision vectors, 1
By deleting or commenting out these two lines:
fvs = b(prob, dvs)
print(fvs)
the script can be run without errors.
My questions:
How to use the batch fitness evaluation? (I know this is a new
capability of PyGMO and they are still working on the
documentation...) Can anybody give a minimal example on how to implement this?
Is this the right way to go to speed up my model calibration problem? Or should I use islands and archipelagos? If I got it right, the islands in an archipelago are not communicating with each other, right? So if one performs e.g. a Particle Swarm Optimization and wants to evaluate several objective function calls simultaneously (in parallel), then the batch fitness evaluator is the right choice?
Do I need to care about archipelagos and islands in this example? What exactly are they meant for? Is it worth running several optimizations with different initial x (inputs to the objective function) and then taking the best solution? Is this a common approach in optimization with GAs?
I am very new to the field of optimization and PyGMO, so thanks for helping!
Is this the right way to go to speed up my model calibration problem? Or should I use islands and archipelagos? If I got it right, the islands in an archipelago are not communicating with each other, right? So if one performs e.g. a Particle Swarm Optimization and wants to evaluate several objective function calls simultaneously (in parallel), then the batch fitness evaluator is the right choice?
There are 2 modes of parallelization in pagmo, the island model (i.e., coarse-grained parallelization) and the BFE machinery (i.e., fine-grained parallelization).
The island model works on any problem/algorithm combination, and it is based on the idea that multiple optimisations are run in parallel while exchanging information to accelerate the global convergence to a solution.
The BFE machinery, instead, parallelizes a single optimisation, and it requires explicit support in the solver to work. Currently in pagmo only a handful of solvers are able to take advantage of the BFE machinery. The BFE machinery can also be used to parallelise the initialisation of a population of individuals, which can be useful if your fitness function is particularly heavyweight.
Which parallelisation method is best for you depends on the properties of your problem. In my experience, users tend to prefer the BFE machinery (fine-grained parallelisation) if the fitness function is very heavy (e.g., it takes minutes or more to compute), because in such a situation fitness evaluations are so costly that in order to take advantage of the island model one would have to wait too long. The BFE is also in some sense easier to understand because you don't have to delve into the details of archipelagos, topologies, etc. On the other hand, the BFE works only with certain solvers (although we are trying to extend BFE support to other solvers as time goes by).
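For illustration, the island model route looks roughly like this (a sketch with pygmo's built-in Rosenbrock problem; the parameters are arbitrary):

import pygmo as pg

prob = pg.problem(pg.rosenbrock(30))
algo = pg.algorithm(pg.sade(gen=100))

# 8 islands, each evolving its own population in a parallel process.
archi = pg.archipelago(n=8, algo=algo, prob=prob, pop_size=20)
archi.evolve()
archi.wait_check()

print(min(archi.get_champions_f()))  # best fitness found across all islands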
How to use the batch fitness evaluation? (I know this is a new capability of PyGMO and they are still working on the documentation...) Can anybody give a minimal example on how to implement this?
One way of using the BFE is what you did in your example, i.e., via the implementation of a batch_fitness() method in your problem. However, my suggestion would be to comment out the batch_fitness() method and try using one of the general-purpose batch fitness evaluators provided with pagmo. The easiest thing to do is to just default-construct an instance of the bfe class and then pass it to one of the algorithms that can use the BFE machinery. One such algorithm is nspso:
https://esa.github.io/pygmo2/algorithms.html#pygmo.nspso
So, something like this:
b = pg.bfe() # Construct a default BFE
uda = pg.nspso(gen = gen_size) # Construct the algorithm
uda.set_bfe(b) # Tell the UDA to use the BFE machinery
algo = pg.algorithm(uda) # Construct a pg.algorithm from the UDA
new_pop = algo.evolve(pop) # Evolve the population
This should use multiple processes to evaluate your fitness functions in parallel within the loop of the nspso algorithm.
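As for why your original batch_fitness() failed: as far as I understand the BFE contract, dvs arrives as a single flat array with all decision vectors concatenated, and the method must return all fitness vectors concatenated the same way. Your version returned len(dvs) = 30 values for a single 30-dimensional decision vector, hence the error. A sketch of a conforming implementation for your problem class (untested; reuses the np import from your script):

def batch_fitness(self, dvs):
    # dvs: all decision vectors concatenated into one flat 1-D array.
    ndvs = len(dvs) // self.dim
    fvs = []
    for i in range(ndvs):
        x = dvs[i * self.dim:(i + 1) * self.dim]
        fvs.extend(self.fitness(x))
    # Return all fitness vectors concatenated the same way.
    return np.array(fvs)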
If you need more help, please come over to our public users/devs chat room, where you should get assistance rather quickly (normally):
https://gitter.im/pagmo2/Lobby
I have looked at both random and secrets and found that secrets is "cryptographically secure". Every Stack Overflow source says it's the closest to true random. So I thought to use it for generating a population. However, it didn't give very random results at all; rather, it gave predictable results.
The first characteristic I tested was gender, 4 options to be exact, and I mapped it all out...
# code may not function as it's typed on mobile without a computer to test on
import secrets
import multiprocessing
def gen(*args):
gender = ["Male", "Female", "X", "XXY"]
rng = secrets.choice(gender)
return rng
with multiprocessing.Pool(processes=4) as pool:
    id_ = [i for i in range(2000000000)]
    out = pool.map(gen, id_)
# Do stuff with the data
When I process the data through other functions that determine the percentage of one gender relative to the others, it is always 25 ± 1%. I was expecting to occasionally see 100% of one gender and 0% of the others, but that never happened.
I also tried the same thing with random, it produced similar results but somehow took twice as long.
I also changed the gender list to have one each of X and XXY, while having 49 of each of the other two, and it gave the predictable result of 1% X and 1% XXY.
I don't have much experience with RNG in computers aside from the term entropy... Does Python have any native or PYPI packages that produce entropy or chaotic numbers?
Is the secrets module supposed to act in a somewhat predictable way?
I think you might be conflating some different ideas here.
The secrets.choice function is going to randomly select 1 of the 4 gender options you have provided every time it is called, which in your example is 2000000000 times. The likelihood of getting 100% of any option after randomly selecting from a list of 4 options 2000000000 times is practically zero in any reasonably implemented randomness generator.
If I am understanding your question correctly, this is actually pretty strong evidence that the secrets.choice function is behaving as expected and providing an even distribution of the options provided to it. The variance should drop to zero as your N approaches infinity.
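You can see this directly by shrinking N (a small demo sketch; exact percentages will vary run to run):

import secrets
from collections import Counter

genders = ["Male", "Female", "X", "XXY"]

for n in (8, 1_000, 1_000_000):
    counts = Counter(secrets.choice(genders) for _ in range(n))
    # Tiny samples can be very lopsided; by a million draws every
    # option sits within a fraction of a percent of 25%.
    print(n, {g: round(100 * counts[g] / n, 2) for g in genders})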
Question: What is the maximum value of kernel statistic counters, and how can I handle it in Python code?
Context: I calculate some statistics based on kernel statistics (e.g. /proc/partitions - it'll be a customized Python iostat version). But I have a problem with overflowed values - negative values. The original iostat code https://github.com/sysstat/sysstat/blob/master/iostat.c comments:
* Counters overflows are possible, but don't need to be handled in
* a special way: The difference is still properly calculated if the
* result is of the same type as the two values.
My language is Python, and I do need to care about overflow in my case. It probably also depends on the architecture (32/64-bit). I've tried 2^64-1 (on a 64-bit system), but with no success.
The following function will work for 32-bit counters:
def calc_32bit_diff(old, new):
return (new - old + 0x100000000) % 0x100000000
print(calc_32bit_diff(1, 42))
print(calc_32bit_diff(2147483647, -2147483648))
print(calc_32bit_diff(-2147483648, 2147483647))
This obviously won't work if the counter wraps around more than once between two consecutive reads (but then no other method would work either, since the information has been irrecoverably lost).
Writing a 64-bit version of this is left as an exercise for the reader. :)
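(For the impatient reader, the 64-bit version is the same trick with a wider modulus - a sketch:)

def calc_64bit_diff(old, new):
    # Same wrap-around arithmetic, modulo 2**64 instead of 2**32.
    return (new - old + 0x10000000000000000) % 0x10000000000000000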