Like, really, really large numbers.
I'm trying out a variation of the Fibonacci series (the most significant change being that it squares each term before feeding it back in, along with a few other modifications), and I need to obtain a particular term whose value seems too large for Python to handle. I'm talking well over a thousand digits, probably more. The program just starts and does nothing at all.
Is there any way I can use Python to print such massive numbers, or can it be done with JavaScript (preferred) or any other language?
Program in question:
g = [0 for y in range(31)]
g[0] = 0
g[1] = 1
for x in range(2, 31):
    g[x] = pow(g[x-1] + g[x-2], 2)
print(g[30])
Your program does nothing because it has probably consumed all the memory. As for Python itself, it can handle arbitrarily large integers. Check this link:
https://www.python.org/dev/peps/pep-0237/
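For what it's worth, the limit here is time and memory rather than representability. As a rough sketch (my own estimate, not part of the original program), you can track only the digit count instead of the values themselves, since each term is approximately the square of the previous one:

import math

# Compute the first few terms exactly, then track only log10 of the value,
# since g[x] = (g[x-1] + g[x-2])**2 is approximately g[x-1]**2 once the
# terms get large.
g = [0, 1]
for x in range(2, 8):
    g.append((g[x - 1] + g[x - 2]) ** 2)

log10_g = math.log10(g[7])   # g[7] already has 12 digits
for x in range(8, 31):
    log10_g *= 2             # squaring roughly doubles the digit count

print(f"g[30] has roughly {int(log10_g):,} digits")
# On the order of 10**8 digits: tens of megabytes just to store the value,
# and a long time to compute, which is why the program seems to hang.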
I am using scipy.stats.mode to calculate the mode of a list of numbers. The mode gets calculated very quickly when the numbers are a bunch of floats (using the built-in float) but is much slower when the numbers are a bunch of decimal.Decimals. What's a fast (or faster) way to calculate the mode of a bunch of Decimals?
First, Decimals are inherently slower than floats, because all of the logic is implemented in Python with a bit of C acceleration, instead of in custom circuitry on your CPU.
Plus, if you've put them in a NumPy array, NumPy doesn't know anything about the Decimal type, so it has to store them as dtype=object, which means references to normal Python objects that have to be unboxed for every operation. By contrast, float values can be stored with dtype=float, which means they're just raw IEEE doubles that can be used directly (or even CPU-vectorized to process multiple values at a time).
So, I'd expect it to be about an order of magnitude or so slower. And when I run a quick test, it takes a bit over 10x as long.
Second, scipy.stats.mode is inherently slow if you have a lot of unique elements.
And, even if you don't, it's still doing extra work for extra features you may not need.
Anyway, you don't need to do any math to calculate the mode; you just compare values for equality. And we're not getting any benefit out of NumPy anywhere else. So simpler, less powerful, non-NumPy solutions might actually be faster.
If you actually need any of the features of scipy.stats.mode, that doesn't help you. For example, it can return multiple results if there are equally common values; it gives the mode's count as well as the mode; it knows how to skip over NaN values instead of just telling you the mode is NaN; etc.
If you need any of the scipy features, you might want to consider building a mode replacement out of find_repeats, as described here. This seems to be roughly 5x as fast as mode even in a fair case, and to not degenerate when there are tons of uniques. So, even adding the 10x cost for using Decimal, it still ends up pretty fast.
But if you don't need them?
statistics.mode(a) is actually slightly faster than scipy.stats.mode in the fair case, even on an array of floats. And, instead of taking 10x as long if you give it Decimals, it takes about the same amount of time.
collections.Counter(a).most_common(1) is only about 50% slower than statistics.mode, and again doesn't slow down with Decimals.
The point is, either of the obvious stdlib solutions outperforms scipy on Decimal values, by about 10x or 7x, on what seemed like a fair test. And if I craft a worst-case-for-scipy test where almost all of the values are unique, scipy.stats.mode becomes roughly 10x slower again, while the plain Python solutions don't slow down at all.
Anyway, for this case, the times are much more sensitive to the details of the input data than in most cases. So, instead of posting benchmarks with a caveat that you really want to test against your actual data (knowing that half the readers aren't going to, and are just going to take my benchmarks as meaningful), I'm going to keep my benchmarks to myself and insist that you really, really, really want to test against your own actual data.
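In that spirit, here is a minimal timing harness rather than results; the array contents below are placeholders, so swap in your own Decimals before drawing conclusions:

import statistics
import timeit
from collections import Counter
from decimal import Decimal

# Placeholder data with one clearly most-common value.
a = [Decimal(i % 50) / 7 for i in range(10_000)] + [Decimal(3) / 7] * 10

def counter_mode(values):
    # most_common(1) returns [(value, count)]; we only want the value.
    return Counter(values).most_common(1)[0][0]

for name, func in [("statistics.mode", lambda: statistics.mode(a)),
                   ("Counter.most_common", lambda: counter_mode(a))]:
    print(f"{name:20s} {timeit.timeit(func, number=100):.3f}s")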
Recently I have been revising some of my old Python code, which is essentially loops of algebra, to make it execute faster, generally by eliminating unnecessary operations. One example: setting an entry in a list to 0 (as a Python float, which I believe is a double by default) when it is already 0, which is obviously not necessary. Another: checking whether a float is equal to something when it MUST be that thing, because a preceding "if" would not have triggered otherwise, or some other extraneous operation. This got me wondering what will preserve my battery more, as I do some of my coding on the bus where I can't plug my laptop in.
For example, which of the following two operations would be expected to use less battery power?
if b != 0:  # b was assigned previously, and I know it is zero already
    b = 0
or:
b = 0
The first one checks if b is zero, and it is, so it doesn't do the next part. The second one just assigns b to zero without bothering to check. I believe the first one is more time-efficient, as you don't have to change anything in memory. Is that correct, and if so, would it also be more power-efficient? Does "more time efficient" always imply "more power efficient"?
I suggest watching this talk by Chandler Carruth: "Efficiency with Algorithms, Performance with Data Structures"
He addresses the idea of "power-efficient instructions" at 4m 49s in the video. I agree with him: thinking about how many watts a particular piece of code consumes is useless. As he put it:
Q: "How to save battery life?"
A: "Finish ruining the program".
Also, in Python you do not have low-level control to even be thinking about low-level problems like this. Use appropriate data structures and algorithms, and pray that the Python interpreter will give you well-optimized bytecode.
Does every simple mathematical operation use the same amount of power (as in, battery power)?
No. Computing the sum of two numbers is not the same as computing the Fourier transform of a 20-megapixel photo.
I believe the first one is more time-efficient, as you don't have to change anything in memory. Is that correct, and if so, would it also be more power-efficient? Does "more time efficient" always imply "more power efficient"?
Yes. Your intuitions are right, but these are very trivial examples. If you dig deeper, you will get into the uncharted territory of weird optimizations that are quite difficult to grasp (e.g., see this question: Times two faster than bit shift?)
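If you want to poke at micro-differences like that yourself, a crude but easy tool is timeit; this mirrors the multiply-versus-shift comparison from the linked question:

import timeit

# Micro-benchmark: multiply by two vs. shift left by one.
# Results vary by CPython version and machine; don't read too much into them.
print(timeit.timeit("x * 2", setup="x = 123456789"))
print(timeit.timeit("x << 1", setup="x = 123456789"))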
In general, the more your code utilizes system resources, the more power those resources will use. However, it is more useful to optimize your code against time or size constraints than to think about high-level code in terms of power draw.
One way of doing this is Big O notation. In essence, Big O notation is a way of comparing the size and/or runtime complexity of an algorithm: https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
A computer at its lowest level is a large quantity of transistors, which require power to change and maintain their state. It would be extremely difficult to anticipate how much power any one line of Python code would draw.
I once had questions like this. Still do sometimes. Here's the answer I wish someone told me earlier.
Summary
You are correct that generally, if your computer does less work, it'll use less power.
But we have to be really careful in figuring out which logical operations involve more work and which ones involve less work - in this case:
Reading vs writing memory is usually the same amount of work.
if and any other conditional execution also costs work.
Python's "simple operations" are not "simple operations" for the CPU.
But the idea you had is probably correct for some cases you had in mind.
If you're concerned about power consumption, measure where power is being used.
For some perspective: You're asking about which Python code costs you one more drop of water, but really in Python every operation costs a bucket and your whole Python program is using a river and your computer as a whole is using an ocean.
Direct Answers
Don't apply these answers to Python yet. Read the rest of the answer first, because there's so much indirection between Python and the CPU that you'll mislead yourself about how they're connected if you don't take that into account.
I believe the first one is more time-efficient, as you don't have to change anything in memory.
As a general rule, reading memory is just as slow as writing to memory, or even slower depending on exactly what your computer is doing. For further reading you'll want to look into CPU memory cache levels, memory access times, and how out-of-order execution and data dependencies factor into modern CPU architectures.
As a general rule, the if statement in a language is itself an operation which can have a non-negligible cost. For further reading you should look into how CPU pipelining relates to branch prediction and branch penalties. Also look into how if statements are implemented in typical CPU instruction sets.
Does "more time efficient" always imply "more power efficient"?
As a general rule, more work-efficient (doing less work: fewer machine instructions, for example) implies more power-efficient, because on modern hardware (this wasn't always the case) your hardware will use less power when it's not doing anything.
You should be careful about the idea of "more time efficient" though, because modern hardware doesn't always execute the same amount of work in the same amount of time: for further reading you'll want to look into CPU frequency scaling, ARM's big.LITTLE architectures, and discussions about the "Race to Idle" concept as a starting point.
"One Simple Operation" - CPU vs. Python
Your question is about Python, so it's important to realize that Python's x != 0, if, and x = 0 do not map directly to simple operations in the CPU.
For further reading, especially if you're familiar with C, I would recommend taking a long look at how Python is implemented. There are many implementations; the main one is CPython, a C program that reads Python source, converts it into Python "bytecode", and then interprets that bytecode one instruction at a time.
As a baseline, if you're using Python, any one "simple" operation is actually a lot of CPU operations, as each step in the Python interpreter is multiple CPU operations, but which ones cost more might be surprising.
Let's break down the three operations used in our example (I'm primarily describing this from the perspective of the main Python implementation written in C, called "CPython", which I am most familiar with, but in general this explanation is roughly applicable to all of them, though some will be able to optimize out certain steps):
x != 0
It looks like a simple operation, and if this was C and x was an int it would be just one machine instruction - but Python allows for operator overloading, so first Python has to:
1. look up x (at least one memory read, though it may involve one or more hashmap lookups in Python's internals, which is many machine operations),
2. check the type of x (more memory reading),
3. based on the type, look up a function pointer that implements the not-equality operation (one or arbitrarily many memory reads, and arbitrarily many conditional branches, with data dependencies between them),
4. only then can it finally call that function with references to the Python objects holding the values of x and 0 (which is also not "free": look up function-calling ABIs for more on that).
All that and more has to be done by the CPU even if x is a Python int or float mapping closely to the CPU's native numerical data types.
x = 0
An assignment is actually far cheaper in Python (though still not trivial): it only has to get as far as step 1 above, because once it knows "where" x is, it can just overwrite that pointer with the pointer to the Python object representing 0.
if
Abstractly speaking, the Python if statement has to be able to handle "truthy" and "falsey" values, which in the most naive implementation involves running through more CPU instructions to evaluate what the result of the condition is according to Python's semantics of what's true and what's false.
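You can see some of this indirection for yourself with the dis module, which prints the bytecode CPython actually interprets (exact opcode names vary between CPython versions):

import dis

def f(x):
    if x != 0:
        x = 0
    return x

# Depending on your version, expect a load of x, a COMPARE_OP for !=,
# a conditional jump for the if, and a store for x = 0. Each of these
# opcodes is itself many machine instructions inside the interpreter loop.
dis.dis(f)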
Sidenote About Optimizations
Different Python implementations go to different lengths to get Python operations down to as few CPU operations as possible. For example, an optimizing JIT (Just-In-Time) compiler might notice that, inside some loop over an array, all elements of the array are native integers, and actually reduce the if x != 0 and x = 0 parts to their respective minimal machine instructions; but that only happens in very specific circumstances, when the optimizing logic can prove that it can safely bypass a lot of the behavior it would normally need to perform.
The biggest thing here is this: a high-level language like Python is so removed from the hardware that "simple" operations are often complex "under the covers".
What You Asked vs. What I Think You Wanted To Ask
Correct me if I'm wrong, but I suspect the use-case you actually had in mind was this:
if x != 0:
    # some code
    x = 0
vs. this:
if x != 0:
    # some code
x = 0
In that case, the first option is superior to the second, because you are already paying the cost of if x != 0 anyway.
Last Point of Emphasis
The hardest breakthrough for me was moving away from trying to reason about individual instructions in my head, and instead switching to looking at how things work and measuring real systems.
Looking at how things work will teach you how to optimize, but measuring will show you where to optimize.
This question is great for exploring the former, but for your stated motivation of reducing power consumption on your laptop, you would benefit more from the latter.
I am wondering how the "+" operator works in Python, or indeed how any of the basic arithmetic operators work. My knowledge is very limited with regard to this topic, so I hope this isn't a repeat of a question already here.
More specifically, I would like to know how this code:
a = 5
b = 2
c = a + b
print(c)
produces the result c = 7 when run. How does the computer perform this operation? I found a thread on Reddit explaining how the computer performs the calculation in binary (https://www.reddit.com/r/askscience/comments/1oqxfr/how_do_computers_do_math/), which I can understand. What I fail to comprehend, however, is how the computer knows how to convert the values of 5 and 2 into binary and then perform the calculation. Is there a set formula for doing this for all integers or base-10 numbers? Or is there something else happening at a deeper hardware level here?
Again, I'm sorry if this is a repeat or if the question seems completely silly; I just can't seem to understand how Python can take any two numbers and then add them, subtract them, divide them, or multiply them. Cheers.
The numbers are always in binary. The computer just isn't capable of keeping them in a different numerical system (well, there are ternary computers, but these are a rare exception). The decimal system is used only for "human representation", so that it is easier to read, but all the symbols (including the symbol "5" in a file; it's just a character) are mapped to numbers through some encoding (e.g., ASCII). These numbers are, of course, in binary; the computer just knows (through the specification of the encoding) that if there is a 1000001 in the context of some string of characters, it has to display the symbol A (in the case of ASCII). That's it; the computer doesn't know the number 58; to it, these are just two symbols that are kept in memory as ones and zeros.
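You can see the character-versus-number distinction directly in Python:

print(ord("5"))       # 53: the code point the encoding assigns to the character "5"
print(bin(ord("5")))  # 0b110101: how that code point is actually stored
print(int("5"))       # 5: parsing the character into the number it names
print(bin(5))         # 0b101: the number five itself, in binary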
Now, memory. This is where it gets interesting. All the instructions and the data are kept in one place as a large buffer of ones and zeros. These are passed to the CPU, which (using its instruction set) knows what the first chunk of ones and zeros (what we call a "word") means. The first word is an instruction, then the argument(s) follow. Depending on the instruction, different things happen. OK, so what happens if the instruction means "add these two numbers and store the result here"?
Well, now it's a hardware job. Adding binary numbers isn't that complicated; it's explained in the link you provided. But how does the CPU know the algorithm and how to execute it? Well, it uses a bunch of "full adders". What is a "full adder"? It is a hardware circuit that, given two inputs (each of them one bit, i.e., either one or zero), "adds" them and outputs the result to two other bits (one of which it uses for carry). And how does a full adder work? Well, it is constructed (physically) from half adders, which are constructed from standard AND and XOR gates. If you're familiar with the similar operators in Python (& and ^), you probably know how they work. These gates are designed to work as expected using the physical properties of the elements used in the electronic components (the most important of them being silicon). And I think this is where I'll stop.
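To make the gate construction concrete, here is a toy model of a full adder using Python's & and ^ operators (real hardware does this with physical gates, of course):

def half_adder(a, b):
    return a ^ b, a & b            # (sum, carry)

def full_adder(a, b, carry_in):
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2             # (sum, carry_out)

def add4(x, y):
    # Chain full adders to add two 4-bit numbers, least significant bit first.
    carry, result = 0, 0
    for i in range(4):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result

print(add4(5, 2))  # 7, the same sum the CPU computes with real gates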
I have a Matlab script that is old and decrepit, and I am trying to rewrite parts of it in Python. Unfortunately, not only am I unfamiliar with Matlab, but the script was written some 5-6 years ago, so there is no one that fully understands what it does or how it does it. For now the line I am interested in is this:
% Matlab
[rawData,count,errorMsg] = fscanf(serialStream, '%f')
Now, I incorrectly tried to do that as:
# Python
import struct

rawData = []
count = 0
while True:
    # ser is the already-open serial port (e.g., a pyserial Serial object)
    rawData.append(struct.unpack('f', ser.read(4))[0])
    count += 1
However, this prints out completely garbage values. Upon further research, I learned that, in Matlab, %f does not mean float like it does in any sensible language, but fixed point number. As such, it makes sense that my data looked like garbage.
Through trial and error, I have determined that I should be getting blocks of 156 bytes from the serial port. However, I am unsure how many values that translates to, as I can't find documentation that explains how large fixed-point numbers are in Matlab (this says they can be up to 128 bits, but that's not very helpful). I have also found the Python library decimal, and it seems like I would want to form the numbers from their constituent parts (i.e., provide sign, digits, and exponent), but I'm not sure how those are stored in the stream of data I'm getting.
Is there a good way of making a fixed point number from a binary stream in Python? Or do I have to look up the implementation in Matlab and recreate it? Perhaps there's a better way of doing what I want to do?
I find myself constantly having to change and adapt old code back and forth repeatedly for different purposes, but occasionally to implement the same purpose it had two versions ago.
One example of this is a function which deals with prime numbers. Sometimes what I need from it is a list of n primes. Sometimes what I need is the nth prime. Maybe I'll come across a third need from the function down the road.
Any way I do it, though, I have to run through the same process and just return different values. I thought there must be a better way to do this than constantly changing the same code. The possible alternatives I have come up with are:
Return a tuple or a list, but this seems kind of messy, since there will be all kinds of data types within, including lists of thousands of items.
Use input statements to direct the code, though I would rather just have it do everything for me when I click run.
Figure out how to utilize class features to return class properties and access them where I need them. This seems to be the cleanest solution to me, but I am not sure since I am still new to this.
Just make five versions of every reusable function.
I don't want to be a bad programmer, so which choice is the correct choice? Or maybe there is something I could do which I have not thought of.
Modular, reusable code
Your question is indeed important. It's important in a programmer's everyday life. It is the question:
Is my code reusable?
If it's not, you will run into code redundancy: the same lines of code in more than one place. This is the best starting point for bugs. Imagine you want to change the behavior somehow, e.g., because you discovered a potential problem. You change it in one place, but you will forget the second location, especially once your code reaches dimensions like 1,000, 10,000 or 100,000 lines.
This is summarized in the SRP, the Single-Responsibility Principle. It states that every class (it's also applicable to functions) should have only one responsibility, that it "should do just one thing". If a function does more than one thing, you should break it apart into smaller chunks, smaller tasks.
Every time you come across (or write) a function with more than 10 or 20 lines of (real) code, you should be skeptical. Such functions rarely stick to this principle.
For your example, you could identify these individual tasks:
1. Generate prime numbers, one by one ("generate" implies using yield, for me).
2. Collect n prime numbers. Uses 1. and puts them into a list.
3. Get the nth prime number. Uses 1., but does not save every number, just waits for the nth. Does not consume as much memory as 2. does.
4. Find pairs of primes: uses 1., remembers the previous number and, if the difference to the current number is two, yields this pair.
5. Collect all pairs of primes: uses 4. and puts them into a list.
...
The list is extensible, and you can reuse it at any level. No function will have more than 10 lines of code, and you will not be reinventing the wheel every time.
Put them all into a module, and use it from every script for a Project Euler problem related to primes.
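A minimal sketch of such a module might look like this (the function names are my own, and trial division stands in for a real sieve to keep the sketch short):

from itertools import islice

def primes():
    # 1. Generate prime numbers one by one (trial division for brevity).
    candidate = 2
    while True:
        if all(candidate % p for p in range(2, int(candidate ** 0.5) + 1)):
            yield candidate
        candidate += 1

def first_n_primes(n):
    # 2. Collect n primes into a list, reusing the generator.
    return list(islice(primes(), n))

def nth_prime(n):
    # 3. Get the nth prime (1-based) without storing the others.
    return next(islice(primes(), n - 1, None))

def prime_pairs():
    # 4. Yield pairs of primes whose difference is two.
    gen = primes()
    prev = next(gen)
    for p in gen:
        if p - prev == 2:
            yield (prev, p)
        prev = p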
Personally, I started a small library like this for my Project Euler scripts. You really can get used to writing reusable code through "Project Euler".
Keyword arguments
Another option you didn't mention (as far as I can tell) is the use of optional keyword arguments. If you regard small, atomic functions as too complicated (though I really insist you should get used to them), you could add a keyword argument to control the return value. E.g., some scipy functions have a parameter full_output that takes a bool. If it's False (the default), only the most important information is returned (e.g., an optimized value); if it's True, some supplementary information is returned as well, e.g., how well the optimization performed and how many iterations it took to converge.
You could define a parameter output_mode, with possible values "list", "last" or whatever.
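As a sketch (the parameter name and values are just the ones suggested above, and first_n_primes is the hypothetical helper from the module sketch earlier):

def compute_primes(n, output_mode="list"):
    # output_mode="list" -> the first n primes; "last" -> only the nth prime.
    found = first_n_primes(n)
    if output_mode == "list":
        return found
    if output_mode == "last":
        return found[-1]
    raise ValueError(f"unknown output_mode: {output_mode!r}")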
Recommendation
Stick to small, reusable chunks of code. Getting used to this is one of the most valuable things you can pick up at "Project Euler".
Remark
If you try to implement the pattern I propose for reusable functions, you might run into a problem immediately at point 1: how do you create a generator-style function for this, e.g., if you use the sieve method? But it's not too bad.
My suggestion: create a module that contains:
a private core function (example: returns a list of the first n primes, or even something more general)
several wrapper/utility functions that use the core one and prepare the output in different ways (example: the nth prime number)
Try to reduce your functions as much as possible, and reuse them.
For example, you might have a function next_prime which is called repeatedly by n_primes and n_th_prime.
This also makes your code more maintainable: if you come up with a more efficient way to find primes, all you have to do is change the code in next_prime.
Furthermore, you should make your output as neutral as possible. If your function returns several values, it should return a list or a generator, not a comma-separated string.
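Here is a sketch of that layering, assuming a trial-division next_prime (any faster primality test could be swapped in later without touching the callers):

def next_prime(n):
    # Smallest prime strictly greater than n.
    candidate = max(n + 1, 2)
    while any(candidate % d == 0 for d in range(2, int(candidate ** 0.5) + 1)):
        candidate += 1
    return candidate

def n_primes(count):
    # The first `count` primes, built entirely on next_prime.
    found, p = [], 1
    for _ in range(count):
        p = next_prime(p)
        found.append(p)
    return found

def n_th_prime(n):
    # The nth prime (1-based), reusing the same helper.
    return n_primes(n)[-1]

print(n_primes(5))    # [2, 3, 5, 7, 11]
print(n_th_prime(5))  # 11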