Length of float integer in Python [duplicate]

This question already has an answer here:
"sys.getsizeof(int)" returns an unreasonably large value?
(1 answer)
I am unable to understand the actual memory space that is allocated to integers and floats in Python.
From what I know, the sizes of int and float variables in Python are 32 bits and 64 bits respectively by default.
But from the results of the following code, it looks like 28 bits and 24 bits are allocated.
import sys

i = 2
print(sys.getsizeof(i))
Output: 28
i = 0.02
print(sys.getsizeof(i))
Output: 24
Please let me know what I have misunderstood here.

getsizeof actually returns the number of bytes an object occupies - in Python, floats, ints, and every other value are full-fledged objects: there are no "naked" CPU primitives.
That accounts for much of the performance hit Python takes when benchmarked against similar bytecode-interpreted languages.
That said, Python floats are indeed 64-bit floats internally, and Python ints are special unlimited-size objects: they are stored internally as a variable-length array of digits (30-bit digits in CPython) and feature arbitrary precision.
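A quick way to see both the per-object overhead and the variable-length int representation is to watch sys.getsizeof grow as the integer needs more internal digits (the exact byte counts are CPython- and version-specific, so treat the numbers as illustrative):

```python
import sys

# Each CPython int stores its value as a variable-length array of
# internal digits, so larger magnitudes mean larger objects:
for n in (0, 1, 2**30, 2**60, 2**600):
    print(n.bit_length(), sys.getsizeof(n))

# A float, by contrast, is a fixed-size object wrapping a C double:
print(sys.getsizeof(0.0))
```

Every value printed includes the fixed object header (reference count, type pointer); only the int's digit array portion grows.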

Related

Is there a difference between an int of 5 and a float of 5.0?

I am confused on whether there is or is not any difference between an int of 5 and a float of 5.0, besides the float having a decimal point.
What are some of the things I can do with an int that I can't with a float? What is the point of having two separate types, instead of just letting everything be a float in the first place?
They are different data types:
type(5) # int
type(5.0) # float
And therefore they are not, strictly speaking, the same.
However, they are equal:
5 == 5.0 # True
They are different types.
>>> type(5)
<type 'int'>
>>> type(5.0)
<type 'float'>
Internally, they are stored differently.
5 and 5.0 are different objects in Python, so 5 is 5.0 is False.
But in most cases they behave the same: for example, 5 == 5.0 is True.
As your question focuses on the difference between the two types and the need for having two different data types, I will try to focus on that in my answer.
Need for different data types (why not put everything in a float?)
Different data types have different memory usage. In C, for example, an int may use 2 or 4 bytes whereas a float uses 4 bytes. Using the correct data type in the correct place saves memory.
What are some of the things I can do with an int that I can't with a float?
One of the most important things to know when using these two data types is that "integer division truncates": any fractional part is discarded. To get the desired result you should use the correct type.
A nice example is given in "The C Programming Language" by Brian Kernighan and Dennis Ritchie, and it applies regardless of the language used.
This statement converts a temperature from Fahrenheit to Celsius:
float celsius = (5 / 9) * (Fahrenheit - 32);
This code will always give you the answer 0, because 5/9 is integer division: its true value, 0.5556, is truncated to 0.
Now look at this code:
float celsius = (5.0 / 9.0) * (Fahrenheit - 32);
This code gives the correct answer, since 5.0/9.0 evaluates to 0.5556. Because the operands are floats, the compiler does not truncate the value: the fractional part is preserved and we get the desired answer.
I think this shows how important the difference between 5 and 5.0 can be.
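The same pitfall exists in Python 2, where / between two ints truncates. Python 3 changed this: / always performs true division and // is the truncating operator, so the choice is explicit in the code. A small sketch:

```python
# In Python 3, / is true division; // is floor (truncating) division.
print(5 / 9)    # 0.5555... - true division keeps the fraction
print(5 // 9)   # 0 - floor division discards it

fahrenheit = 100
celsius = (5 / 9) * (fahrenheit - 32)       # correct conversion
celsius_bad = (5 // 9) * (fahrenheit - 32)  # always 0, like the C example
print(celsius, celsius_bad)
```

In Python 3 the operator, not the operand types, determines whether truncation happens.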
This question is already answered: they have different types.
But what does that mean?
One must think in terms of objects: they are objects of different classes, and the class dictates the object's behavior.
Thus they will behave differently.
It's easier to grasp such things in a purely object-oriented language like Smalltalk, because you can browse the Float and Integer classes and learn how they differ through their implementations. In Python, it's more complex because the computation model is multi-layered, with notions of types, operators, and functions, and this complexity somewhat obscures the basic object-oriented principles. But from a behavioural point of view, it ends up being the same: Python : terminology 'class' VS 'type'
So what are these differences in behavior?
They are thin, because language designers make their best effort to provide uniform, unsurprising arithmetic (including mixed arithmetic) that matches the laws of mathematics, whatever the programming language.
Floating-point numbers behave differently because they keep a limited number of significand bits. That is a necessary trade-off for keeping computations simple and fast. Small integers require few significand bits, so they will behave mostly the same as floating point. But when they grow larger, they won't. Here is an arithmetic example:
print(5.0**3 == 5**3)
print(5.0**23 == 5**23)
The former expression will print True, the latter False... because 5**23 requires 54 bits to be represented, and the Python VM will in most cases depend on IEEE 754 double-precision floating point, which provides only a 53-bit significand.
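You can check the bit requirement directly with int.bit_length() (a double has at most 53 significand bits, so a 54-bit odd integer cannot round-trip through float):

```python
print((5**23).bit_length())   # 54: one bit too many for a double
print(float(5**23) == 5**23)  # False: precision was lost
print((5**3).bit_length())    # 7: fits easily
print(float(5**3) == 5**3)    # True
```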

getting size of primitive data types in python

I am having a lot of confusion using the sys.getsizeof function in Python. All I want to find out is, for say a floating-point value, whether the system is using 4 or 8 bytes (i.e. single or double precision in C terms).
I do the following:
import sys
x = 0.0
sys.getsizeof(x) # Returns 24
type(x) # returns float
sys.getsizeof(float) # Returns 400.
How can I simply find out how many bytes are actually used for the floating-point representation? I know it should be 8 bytes, but how can I verify this (something like the sizeof operator in C++)?
Running
sys.getsizeof(float)
does not return the size of any individual float, it returns the size of the float class. That class contains a lot more data than just any single float, so the returned size will also be much bigger.
If you just want to know the size of a single float, the easiest way is to simply instantiate some arbitrary float. For example:
sys.getsizeof(float())
Note that
float()
simply returns 0.0, so this is actually equivalent to:
sys.getsizeof(0.0)
This returns 24 bytes in your case (and probably for most other people as well). In CPython (the most common Python implementation), every float object contains a reference counter and a pointer to the type (a pointer to the float class), which are each 8 bytes on 64-bit CPython or 4 bytes each on 32-bit CPython. The remaining bytes (24 - 8 - 8 = 8 in your case, which is very likely 64-bit CPython) are used for the actual float value itself.
This is not guaranteed to work out the same way for other Python implementations though. The language reference says:
These represent machine-level double precision floating point numbers. You are at the mercy of the underlying machine architecture (and C or Java implementation) for the accepted range and handling of overflow. Python does not support single-precision floating point numbers; the savings in processor and memory usage that are usually the reason for using these are dwarfed by the overhead of using objects in Python, so there is no reason to complicate the language with two kinds of floating point numbers.
and I'm not aware of any runtime methods to accurately tell you the number of bytes used. However, note that the quote above from the language reference does say that Python only supports double precision floats, so in most cases (depending on how critical it is for you to always be 100% right) it should be comparable to double precision in C.
import ctypes
ctypes.sizeof(ctypes.c_double)
From the docs:
getsizeof() calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
sys.getsizeof is not about the byte size of the underlying value as in C.
For int there is int.bit_length(), which reports how many bits are needed to represent the value.
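Putting the pieces together, a few sanity checks (ctypes and struct both report C-level sizes; on essentially every platform CPython runs on, a C double is 8 bytes):

```python
import ctypes
import struct

print(ctypes.sizeof(ctypes.c_double))  # 8: size of the raw C double
print(struct.calcsize('d'))            # 8: same answer via the struct module
print((255).bit_length())              # 8: bits needed to represent 255
```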

Is integer comparison in Python constant time?

Is integer comparison in Python constant time? Can I use it to compare a user-provided int token with a server-stored int for crypto, in the way I would compare strings with constant_time_compare from django.utils.crypto, i.e. without suffering timing attacks?
Alternatively, is it more secure to convert to a string and then use the above function?
The answer is yes for a given size of integer. By default, Python integers that get big become arbitrary-precision values with potentially unlimited length, and the comparison time then grows with the size. If you restrict the size of the integer to a ctypes.c_uint64 or ctypes.c_uint32, this will not be the case.
Note that comparison with 0 is a special case and is normally much faster due to hardware behavior (many CPUs have a special flag for zero), so if you are using or allowing seeds or tokens with a value of 0, you are asking for trouble.
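Rather than relying on int comparison timing, one common approach is to encode both values as fixed-length bytes and compare them with hmac.compare_digest, which is designed to resist timing attacks. The helper name and the 32-byte width below are my own illustrative choices, not from the original answer:

```python
import hmac

def ct_int_compare(a: int, b: int, nbytes: int = 32) -> bool:
    """Compare two non-negative ints in a timing-resistant way (sketch)."""
    try:
        # Fixed-width big-endian encoding: the compared length never
        # depends on the magnitudes of the values themselves.
        return hmac.compare_digest(
            a.to_bytes(nbytes, "big"), b.to_bytes(nbytes, "big")
        )
    except OverflowError:
        # Value does not fit the agreed width; treat as a mismatch.
        return False
```

This also answers the "convert to a string" alternative: a fixed-width byte encoding avoids leaking the value's length, which a plain decimal string would not.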

Why isn't Python throwing an overflow error? [duplicate]

This question already has answers here:
Handling very large numbers in Python
(6 answers)
I'm learning Python and I have a question about the range of the data types.
This program:
print("8 bits:", pow(2, 8)-1)
print("16 bits:", pow(2, 16)-1)
print("32 bits:", pow(2, 32)-1)
print("64 bits:", pow(2, 64)-1)
print( pow(18446744073709551615+18446744073709551615+2, 9) )
Produces the following output:
8 bits: 255
16 bits: 65535
32 bits: 4294967295
64 bits: 18446744073709551615
12663316555422952143897729076205936129798725073982046203600028471956337925454431
59912019973433564390346740077701202633417478988975650566195033836314121693019733
02667340133957632
My question is: how can Python calculate the result of the last call to pow()? My CPU cannot handle integers with more than 64 bits, so I expected the operation to produce an overflow.
The Python long integer type is only limited by your available memory. Until you run out of memory, the digits will just keep on coming.
Quoting from the numeric types documentation:
Long integers have unlimited precision.
Python will transparently use long integers when you need the unlimited precision. In Python 3, all integers are long integers; there is no distinction.
Python 2 knows two data types for integers: int and long. If a number is too large for an int (which matches the platform's C long, typically 32 or 64 bits), a long is automatically used. Likewise, if the result of a computation is too large for an int, a long is used instead.
You can explicitly declare a literal to be a long; just append an L (a lowercase l also works but is discouraged because in many fonts it is indistinguishable from, or at least very similar to, the character 1 (one)). So, 5L is a long.
Typically this distinction is not important; knowing the difference becomes necessary, though, if you compare the types of values (because type(5) ≠ type(5L)).
Longs aren't limited to any particular number of bits. At ridiculously high values, memory consumption and computation times pose a practical limit, though.
Also keep in mind that computing with these longs might be faster than printing them, because converting them to a string for printing requires conversion to the decimal system.
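For reference, the big result in the question is exactly (2**65)**9 == 2**585, since each of the two added constants is 2**64 - 1. You can verify this without printing all 177 digits:

```python
# (2**64 - 1) + (2**64 - 1) + 2 == 2**65, so the result is (2**65)**9.
n = pow(18446744073709551615 + 18446744073709551615 + 2, 9)
print(n == 2**585)     # True
print(n.bit_length())  # 586
print(len(str(n)))     # 177 decimal digits, matching the question's output
```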

Are python int's architecture specific?

We can define variables as integer values, e.g.
x = 3
y = -2
and then operate on bits with the binary operators &, |, ^ and ~. The question is whether we always get the same result on every architecture, or whether the behavior is architecture-specific.
Can we always assume a two's complement representation of integers?
Python 2.x supports two integer types: int and long. int is based on the underlying C long type and long is an arbitrary-precision type. Very early versions of Python (pre-2.2) treated them as two separate types, but they were mostly unified in Python 2.2.
Python 3.x only uses the arbitrary precision type.
Bit operations behave as if applied to arbitrary-precision 2's complement numbers. If required, an int will be automatically promoted to a long in Python 2.x.
The behavior is consistent across platforms.
From the python 2 documentation (emphasis mine):
Plain integers: These represent numbers in the range -2147483648 through 2147483647. (The range may be larger on machines with a larger natural word size, but not smaller.) When the result of an operation would fall outside this range, the result is normally returned as a long integer (in some cases, the exception OverflowError is raised instead). For the purpose of shift and mask operations, integers are assumed to have a binary, 2’s complement notation using 32 or more bits, and hiding no bits from the user (i.e., all 4294967296 different bit patterns correspond to different values).
So yes: the integers are architecture specific for Python 2.
From the Python 3 documentation:
Plain integers: These represent numbers in an unlimited range, subject to available (virtual) memory only. For the purpose of shift and mask operations, a binary representation is assumed, and negative numbers are represented in a variant of 2’s complement which gives the illusion of an infinite string of sign bits extending to the left.
So no: the integers are not architecture specific for Python 3.
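The "infinite string of sign bits" model means bit operations on negative numbers behave identically everywhere, regardless of the machine's word size:

```python
# Negative ints act as if they carried infinitely many sign bits:
print(-1 & 0xFF)    # 255: masking extracts the low byte of ...11111111
print(~0)           # -1: flipping all bits of 0
print(-2 >> 1)      # -1: right shift is arithmetic (sign-preserving)
print(-1 & (2**100 - 1) == 2**100 - 1)  # True: holds at any width
```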
