How do arithmetic operators work in Python?

I am wondering how the "+" operator works in Python, or indeed how any of the basic arithmetic operators work. My knowledge of this topic is very limited, so I hope this isn't a repeat of a question already here.
More specifically, I would like to know how this code:
a = 5
b = 2
c = a + b
print (c)
produces the result of c = 7 when run. How does the computer perform this operation? I found a thread on Reddit explaining how the computer performs the calculation in binary (https://www.reddit.com/r/askscience/comments/1oqxfr/how_do_computers_do_math/), which I can understand. What I fail to comprehend, however, is how the computer knows how to convert the values of 5 and 2 into binary and then perform the calculation. Is there a set formula for doing this for all integers or base-10 numbers? Or is there something else happening at a deeper hardware level here?
Again, I'm sorry if this is a repeat or if the question seems completely silly; I just can't seem to understand how Python can take any two numbers and then add them, subtract them, divide them or multiply them. Cheers.

The numbers are always in binary. The computer simply isn't capable of keeping them in a different numeral system (well, there are ternary computers, but they are a rare exception). The decimal system is only used for "human representation", so that it is easier to read, and all the symbols (including the character "5" in the file; it's just a character) are mapped to numbers through some encoding (e.g. ASCII). These numbers are, of course, in binary; the computer just knows (through the specification of the encoding) that if a 1000001 occurs in the context of a string of characters, it has to display the symbol "A" (in the case of ASCII). That's it. The computer doesn't know the number 58; to it, these are just two symbols that are kept in memory as ones and zeros.
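To make this concrete, here is a small sketch showing the difference between the symbol "5" (a character, stored as its code point, which is itself held in binary) and the value 5:
print(ord("5"))             # 53 -> the code point of the *symbol* "5"
print(bin(ord("5")))        # 0b110101 -> that code point as the machine holds it
print(int("5") + int("2"))  # 7 -> arithmetic happens on the *values*, not the symbols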
Now, memory. This is where it gets interesting. All the instructions and the data are kept in one place as a large buffer of ones and zeros. These are passed to the CPU, which (using its instruction set) knows what the first chunk of ones and zeros (what we call a "word") means. The first word is an instruction, then the argument(s) follow. Depending on the instruction, different things happen. OK, so what happens if the instruction means "add these two numbers and store the result here"?
Well, now it's a hardware job. Adding binary numbers isn't that complicated; it's explained in the link you provided. But how does the CPU know that this is the algorithm and how to execute it? Well, it uses a bunch of "full adders". What is a "full adder"? It is a hardware circuit that, given two inputs (each one a single bit, i.e. either one or zero), "adds" them and outputs the result to two other bits (one of which it uses for the carry). But how does a full adder work? Well, it is constructed (physically) from half adders, which are in turn built from standard AND and XOR gates. If you're familiar with the analogous operators (& and ^ in Python), you probably know how they work. These gates are designed to behave as expected using the physical properties of the elements (the most important being silicon) used in electronic components. And I think this is where I'll stop.
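If you want to play with the idea, here is a minimal software sketch of a ripple-carry adder built from Python's &, ^ and | operators, mirroring the gate-level description above (the function names are mine, purely for illustration):
def full_adder(a, b, carry_in):
    # sum bit is the XOR of all three inputs; carry is built from AND/OR
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out

def add_bits(x, y, width=8):
    # ripple-carry addition: each carry feeds the next full adder
    result, carry = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result

print(add_bits(5, 2))  # 7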

Related

Getting the most accurate precision with a function combining factorial, division and squaring

I'm using the Decimal class for operations that require precision.
I would like to use the 'largest possible' precision. By this I mean as precise as the system on which the program runs can handle.
To set a certain precision it's simple:
import decimal
decimal.getcontext().prec = 123  # 123 significant digits of precision
I tried to figure out the maximum precision the 'Decimal' class can compute:
print(decimal.MAX_PREC)
>> 999999999999999999
So I tried to set the precision to the maximum precision (knowing it probably wouldn't work...):
decimal.getcontext().prec = decimal.MAX_PREC
But, of course, this throws a MemoryError (on division).
So my question is: How do I figure out the maximum precision the current system can handle?
Extra info:
import sys
print(sys.maxsize)
>> 9223372036854775807
Trying to do this is a mistake. Throwing more precision at a problem is a tempting trap for newcomers to floating-point, but it's not that useful, especially to this extreme.
Your operations wouldn't actually require the "largest possible" precision even if that was a well-defined notion. Either they require exact arithmetic, in which case decimal.Decimal is the wrong tool entirely and you should look into something like fractions.Fraction or symbolic computation, or they don't require that much precision, and you should determine how much precision you actually need and use that.
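For example, if exact arithmetic is what you actually need, the standard library's fractions module gives it to you with no precision knob at all:
from fractions import Fraction

# Exact rational arithmetic: no context, no precision setting, no rounding.
x = Fraction(1, 7) + Fraction(2, 7)
print(x)         # 3/7
print(float(x))  # convert to float only at the very end, if at all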
If you still want to throw all the precision you can at your problem, then how much precision that actually is will depend on what kind of math you're doing, and how many absurdly precise numbers you're attempting to store in memory at once. This can be determined by analyzing your program and the memory requirements of Decimal objects, or you can instead take the precision as a parameter and binary search for the largest precision that doesn't cause a crash.
I'd like to suggest a function that allows you to estimate your maximum precision for a given operation in a brute force way:
import decimal

def find_optimum(a, b, max_iter):
    # binary search between a (known to work) and b (known to fail)
    for i in range(max_iter):
        print(i)
        c = int((a + b) / 2)
        decimal.getcontext().prec = c
        try:
            dummy = decimal.Decimal(1) / decimal.Decimal(7)  # your operation
            a = c
            print("no fail")
        except MemoryError:
            print("fail")
            dummy = 1
            b = c
        print(c)
    del dummy
This just halves the interval one step at a time and checks whether an error occurs. Calling it with max_iter=10, a=int(1e9) and b=int(1e11) gives:
>>> find_optimum(int(1e9), int(1e11), 10)
0
fail
50500000000
1
no fail
25750000000
2
no fail
38125000000
3
no fail
44312500000
4
fail
47406250000
5
fail
45859375000
6
no fail
45085937500
7
no fail
45472656250
8
no fail
45666015625
9
no fail
45762695312
This may give a rough idea of what you are dealing with. It took approximately half an hour on an i5-3470 with 16 GB RAM, so you would really only use it for testing purposes.
I don't think there is an exact way of getting the maximum precision for your operation, as you'd need exact knowledge of how your operation's memory consumption depends on the precision. I hope this helps you at least a bit, and I would really like to know what you need that kind of precision for.
EDIT: I feel like this really needs to be added, since I read your comments under the top-rated post here. Using arbitrarily high precision in this manner is not how people calculate constants. You would program something that utilizes disk space in a smart way (for example, calculating a bunch of digits in RAM and writing that bunch to a text file), but never rely on RAM/swap alone, because that will always limit your results. With modern algorithms for calculating pi, you don't need infinite RAM; you just put another 4 TB hard drive in the machine and let it write the next digits. So much for mathematical constants.
Now for physical constants: they are not precise. They rely on measurement. I'm not quite sure at the moment (will edit), but I think the most exact physical constant has a relative error on the order of 10**(-8). Throwing more precision at it doesn't make it more exact; you just calculate more wrong digits.
As an experiment though, this was a fun idea, which is why I even posted the answer in the first place.
The maximum precision of the Decimal class is a function of the memory on the device, so there's no good way to set it for the general case. Basically, you're allocating all of the memory on the machine to one variable to get the maximum precision.
If the mathematical operation supports it, long integers will give you unlimited precision. However, you are limited to whole numbers.
Addition, subtraction, multiplication, and simple exponents can be performed exactly with long integers.
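A quick demonstration that Python's integers stay exact at arbitrary size:
x = 2 ** 1000        # a 302-digit integer, computed exactly
y = x * x + 12345
print(len(str(y)))   # 603 digits, with no overflow and no rounding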
Prior to Python 3, the built-in long data type would perform arbitrary precision calculations.
https://docs.python.org/2/library/functions.html#long
In Python >=3, the int data type now represents long integers.
https://docs.python.org/3/library/functions.html#int
One example of 64-bit integer math is the implementation of bitcoind, where transaction calculations require exact values. However, the precision of Bitcoin transactions is limited to 1 "satoshi"; each Bitcoin is defined as 10^8 (integer) satoshi.
The Decimal class works similarly under the hood. A Decimal precision of 10^-8 is similar to the Bitcoin-Satoshi paradigm.
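As a rough illustration of that scaled-integer paradigm (the names here are mine, not taken from bitcoind): represent every amount as an integer count of the smallest unit, and ordinary int arithmetic stays exact:
SATOSHI_PER_COIN = 10 ** 8  # the smallest-unit scale, as in Bitcoin

def to_satoshi(amount: str) -> int:
    # parse a decimal string into an exact integer number of satoshi
    whole, _, frac = amount.partition(".")
    return int(whole) * SATOSHI_PER_COIN + int((frac + "00000000")[:8])

print(to_satoshi("0.1") + to_satoshi("0.2"))  # 30000000, i.e. exactly 0.3 coins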
From your reply above:
What if I just wanted to find more digits in pi than already found? what if I wanted to test the irrationality of e or mill's constant.
I get it. I really do. My one SO question, several years old, is about arbitrary-precision floating point libraries for Python. If those are the types of numerical representations you want to generate, be prepared for the deep dive. Decimal/FP arithmetic is notoriously tricky in Computer Science.
Some programmers, when confronted with a problem, think “I know, I’ll use floating point arithmetic.” Now they have 1.999999999997 problems. – #tomscott
I think when others have said it's a "mistake" or "it depends" to wonder what the max precision is for a Python Decimal type on a given platform, they're taking your question more literally than I'm guessing it was intended. You asked about the Python Decimal type, but if you're interested in FP arithmetic for educational purposes -- "to find more digits in pi" -- you're going to need more powerful, more flexible tools than Decimal or float. These built-in Python types don't even come close. Those are good enough for NASA maybe, but they have limits... in fact, the very limits you are asking about.
That's what multiple-precision (or arbitrary-precision) floating point libraries are for: arbitrarily-precise representations. Want to compute pi for the next 20 years? Python's Decimal type won't even get you through the day.
The fact is, multi-precision binary FP arithmetic is still kinda fringe science. For Python, you'll need to install the GNU MPFR library on your Linux box, then you can use the Python library gmpy2 to dive as deep as you like.
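A minimal sketch of what that looks like (assuming gmpy2 is installed; note that its precision is measured in bits, not decimal digits):
import gmpy2

gmpy2.get_context().precision = 1000  # working precision, in bits
pi = gmpy2.const_pi()                 # pi computed to the current precision
print(pi)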
Then, the question isn't, "What's the max precision my program can use?"
It's, "How do I write my program so that it'll run until the electricity goes out?"
And that's a whole other problem, but at least it's restricted by your algorithm, not the hardware it runs on.

Does every simple mathematical operation use the same amount of power (as in, battery power)?

Recently I have been revising some of my old Python code, which is essentially loops of algebra, in order to have it execute faster, generally by eliminating unnecessary operations. Often this meant something like changing the value of an entry in a list from 0 (as a Python float, which I believe is a double by default) to the same value, which is obviously not necessary. Or checking whether a float is equal to something when it MUST be that thing, because a preceding "if" would not have triggered otherwise, or some other extraneous operation. This got me wondering about what will preserve my battery more, as I do some of my coding on the bus where I can't plug my laptop in.
For example, which of the following two operations would be expected to use less battery power?
if b != 0:  # b was assigned previously, and I know it is zero already
    b = 0
or:
b = 0
The first one checks whether b is nonzero; it isn't, so it skips the assignment. The second one just assigns b to zero without bothering to check. I believe the first one is more time-efficient, as you don't have to change anything in memory. Is that correct, and if so, would it also be more power-efficient? Does "more time efficient" always imply "more power efficient"?
I suggest watching this talk by Chandler Carruth: "Efficiency with Algorithms, Performance with Data Structures"
He addresses the idea of "power-efficient instructions" at 4m 49s in the video. I agree with him: thinking about how many watts a particular piece of code consumes is useless. As he put it:
Q: "How to save battery life?"
A: "Finish running the program".
Also, in Python you do not have the low-level control to even be thinking about low-level problems like this. Use appropriate data structures and algorithms, and pray that the Python interpreter will give you well-optimized bytecode.
Does every simple mathematical operation use the same amount of power (as in, battery power)?
No. Computing the sum of two numbers is not the same as computing the Fourier transform of a 20-megapixel photo.
I believe the first one is more time-efficient, as you don't have to change anything in memory. Is that correct, and if so, would it also be more power-efficient? Does "more time efficient" always imply "more power efficient"?
Yes. Your intuitions are right, but these are very trivial examples. If you dig deeper you will get into the uncharted territory of weird optimizations that are quite difficult to grasp (e.g., see this question: Times two faster than bit shift?).
In general, the more your code utilizes system resources, the more power those resources will draw. However, it is more useful to optimize your code based on time or size constraints instead of thinking about high-level code in terms of power draw.
One way of doing this is Big O notation. In essence, Big O notation is a way of comparing the space and/or runtime complexity of algorithms. https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
A computer at its lowest level is a large quantity of transistors, which require power to change and maintain their state.
It would be extremely difficult to anticipate how much power any one line of python code would draw.
I once had questions like this. Still do sometimes. Here's the answer I wish someone told me earlier.
Summary
You are correct that generally, if your computer does less work, it'll use less power.
But we have to be really careful in figuring out which logical operations involve more work and which ones involve less work - in this case:
Reading vs writing memory is usually the same amount of work.
if and any other conditional execution also costs work.
Python's "simple operations" are not "simple operations" for the CPU.
But the idea you had is probably correct for some cases you had in mind.
If you're concerned about power consumption, measure where power is being used.
For some perspective: You're asking about which Python code costs you one more drop of water, but really in Python every operation costs a bucket and your whole Python program is using a river and your computer as a whole is using an ocean.
Direct Answers
Don't apply these answers to Python yet. Read the rest of the answer first, because there's so much indirection between Python and the CPU that you'll mislead yourself about how they're connected if you don't take that into account.
I believe the first one is more time-efficient, as you don't have to change anything in memory.
As a general rule, reading memory is just as slow as writing to memory, or even slower depending on exactly what your computer is doing. For further reading you'll want to look into CPU memory cache levels, memory access times, and how out-of-order execution and data dependencies factor into modern CPU architectures.
As a general rule, the if statement in a language is itself an operation which can have a non-negligible cost. For further reading you should look into how CPU pipelining relates to branch prediction and branch penalties. Also look into how if statements are implemented in typical CPU instruction sets.
Does "more time efficient" always imply "more power efficient"?
As a general rule, more work-efficient (doing less work: fewer machine instructions, for example) implies more power-efficient, because on modern hardware (this wasn't always the case) your hardware will use less power when it's not doing anything.
You should be careful about the idea of "more time efficient" though, because modern hardware doesn't always execute the same amount of work in the same amount of time: for further reading you'll want to look into CPU frequency scaling, ARM's big.LITTLE architectures, and discussions about the "Race to Idle" concept as a starting point.
"One Simple Operation" - CPU vs. Python
Your question is about Python, so it's important to realize that Python's x != 0, if, and x = 0 do not map directly to simple operations in the CPU.
For further reading, especially if you're familiar with C, I would recommend taking a long look at how Python is implemented. There are many implementations; the main one is CPython, which is a C program that reads Python source, converts it into Python "bytecode", and then interprets that bytecode one instruction at a time.
As a baseline, if you're using Python, any one "simple" operation is actually a lot of CPU operations, as each step in the Python interpreter is multiple CPU operations, but which ones cost more might be surprising.
Let's break down the three operations used in our example (I'm primarily describing this from the perspective of CPython, which I am most familiar with, but in general this explanation is roughly applicable to all implementations, though some will be able to optimize out certain steps):
x != 0
It looks like a simple operation, and if this was C and x was an int it would be just one machine instruction - but Python allows for operator overloading, so first Python has to:
look up x (at least one memory read, but may involve one or more hashmap lookups in Python's internals, which is many machine operations),
check the type of x (more memory reading),
based on the type, look up a function pointer that implements the not-equal operation (one to arbitrarily many memory reads and one to arbitrarily many conditional branches, with data dependencies between them),
only then can it finally call that function with references to the Python objects holding the values of x and 0 (which is also not "free": look up function calling ABIs for more on that).
All that and more has to be done by the CPU even if x is a Python int or float mapping closely to the CPU's native numerical data types.
x = 0
An assignment is actually far cheaper in Python (though still not trivial): it only has to get as far as the first step above, because once it knows "where" x lives, it can just overwrite that pointer with the pointer to the Python object representing 0.
if
Abstractly speaking, the Python if statement has to be able to handle "truthy" and "falsey" values, which in the most naive implementation involves running through more CPU instructions to evaluate what the result of the condition is according to Python's semantics of what's true and what's false.
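You can see some of this machinery yourself with the standard library's dis module, which prints the bytecode CPython walks through; each bytecode instruction is itself many machine instructions:
import dis

def reset(x):
    # the two "simple" operations from the question
    if x != 0:
        x = 0
    return x

dis.dis(reset)  # shows COMPARE_OP, a conditional jump, STORE_FAST, ...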
Sidenote About Optimizations
Different Python implementations go to different lengths to reduce Python operations to as few CPU operations as possible. For example, an optimizing JIT (Just-In-Time) compiler might notice that, inside some loop over an array, all elements of the array are native integers, and actually reduce the if x != 0 and x = 0 parts to their respective minimal machine instructions; but that only happens in very specific circumstances, when the optimizing logic can prove it can safely bypass a lot of the behavior it would normally need to perform.
The biggest thing here is this: a high-level language like Python is so removed from the hardware that "simple" operations are often complex "under the covers".
What You Asked vs. What I Think You Wanted To Ask
Correct me if I'm wrong, but I suspect the use-case you actually had in mind was this:
if x != 0:
    # some code
    x = 0

vs. this:

if x != 0:
    # some code
x = 0
In that case, the first option is superior to the second, because you are already paying the cost of if x != 0 anyway.
Last Point of Emphasis
The hardest breakthrough for me was moving away from trying to reason about individual instructions in my head, and instead looking at how things actually work and measuring real systems.
Looking at how things work will teach you how to optimize, but measuring will show you where to optimize.
This question is great for exploring the former, but for your stated motivation of reducing power consumption on your laptop, you would benefit more from the latter.
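As a starting point for the measuring half, here is a minimal sketch using the standard library's timeit; it measures time rather than watts, but on modern race-to-idle hardware less time at the same clock is a reasonable first proxy for less energy:
import timeit

# Compare the two variants from the question; b is already 0 in both cases.
print(timeit.timeit("b = 0", setup="b = 0"))
print(timeit.timeit("if b != 0:\n    b = 0", setup="b = 0"))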

Should I convert a long binary string before operating on it?

In a problem I have to deal with long inputs given as binary strings, like:
"1000101011111101010100100101010101010101"
I am required to use the bitwise OR operator | in this question. I have researched the use of this operator, and it seems to work on regular integers, not binary strings. So I call int(thing, 2) on the input, and after that I use the bitwise operator. However, something troubles me: doesn't the Python interpreter change it back to binary again to apply the bitwise OR? So isn't that a repeated step?
Is there no other way to use this string directly? Maybe iterating over all the characters is a better approach? There is also another problem about integer precision: sometimes the input is longer than 500 characters, so I don't think I can store it as an integer.
I tried something like this (imagine a and b are two binary strings):
for comparison in zip(a, b):
    if any(int(bit) for bit in comparison):
        pass  # Do stuff if OR gives 1
This proved to be very slow indeed. Please enlighten me.
Thanks in advance.
Firstly, definitely use int(binary_string, 2); any other method will take longer.
(Although the for loop using zip and any is quite clever, it is not optimal.)
The Python interpreter will not change your number back to binary: the computer already stores the number as binary in memory, so it will use the CPU's OR instruction on the two numbers without converting them first. There is no repeated step.
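A small sketch of that approach; since Python's ints are arbitrary precision, the 500+ character inputs you mention are not a problem (these sample strings are just illustrations):
a = "1000101011111101010100100101010101010101"
b = "0000000000000000000000000000000000000111"

# Parse each string once, OR the integers, format back only if you need the string.
result = int(a, 2) | int(b, 2)
print(format(result, "0%db" % len(a)))  # zero-padded back to the input width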

How can I use python to calculate very large numbers?

Like, really, really large numbers...
I'm trying out a variation of the Fibonacci series (the most significant variation being that it squares each term before feeding it back in, although there are a few other modifications as well), and I need to obtain a particular term whose value is too large for Python to handle. I'm talking well over a thousand digits, probably more. The program just starts and does nothing at all.
Is there any way I can use Python to print such massive numbers, or can it be done with JavaScript (preferred) or any other language?
Program in question:
g = [0 for y in range(31)]
g[0] = 0
g[1] = 1
for x in range(2, 31):
    g[x] = pow((g[x-1] + g[x-2]), 2)
print(g[30])
Your program appears to do nothing because it has probably consumed all the memory. As for Python itself, it can handle very large numbers. Check this link:
https://www.python.org/dev/peps/pep-0237/
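To see why your program stalls, here is a small sketch of how fast the terms grow: the digit count roughly doubles each step, so g[30] would have on the order of a hundred million digits:
# run the same recurrence but only track digit counts, stopping early
g0, g1 = 0, 1
for x in range(2, 16):  # by x = 30 the numbers become enormous
    g0, g1 = g1, (g0 + g1) ** 2
    print(x, len(str(g1)), "digits")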

Emulating Matlab fixed point number behavior in Python

I have a Matlab script that is old and decrepit, and I am trying to rewrite parts of it in Python. Unfortunately, not only am I unfamiliar with Matlab, but the script was written some 5-6 years ago, so there is no one who fully understands what it does or how it does it. For now the line I am interested in is this:
% Matlab
[rawData,count,errorMsg] = fscanf(serialStream, '%f')
Now, I incorrectly tried to do that as:
# Python
import struct

rawData = []
count = 0
while True:
    rawData.append(struct.unpack('f', ser.read(4))[0])  # ser: the open serial port
    count += 1
However, this prints out complete garbage values. Upon further research, I learned that, in Matlab, %f does not mean float like it does in any sensible language, but fixed-point number. As such, it makes sense that my data looked like garbage.
Through trial and error, I have determined that I should be getting blocks of 156 bytes from the serial port. However, I am unsure of how many values that translates to, as I can't find documentation that explains how large fixed-point numbers are in Matlab (this says they can be up to 128 bits, but that's not very helpful). I have also found the Python library decimal, and it seems like I would want to form the numbers from their constituent parts (i.e. provide sign, digits and exponent), but I'm not sure how those are stored in the stream of data I'm getting.
Is there a good way of making a fixed point number from a binary stream in Python? Or do I have to look up the implementation in Matlab and recreate it? Perhaps there's a better way of doing what I want to do?
