sum = 0
n = int(input())
x = list(map(int, input().rstrip().split()))
for i in range(n):
    str1 = str(2**x[i])
    if len(str1) > 2:
        str1 = str1[len(str1)-2:]   # keep only the last two digits
    sum += int(str1)
print(sum % 100)
Input
3
1 2 3
Output
14
Constraints
1<=n<=10^7
0<=x<=10^18
This code works fine for small values (for example, n=4 and x=8,7,6,4 gives 64), but not for the given constraints.
Well... since you do
2**x[i]
which results in an integer value that will most definitely fail when x gets large enough.
You can do the math:
Assuming your computer has 8 GB of main memory, that would be 8*8*1024^3 = 68,719,476,736 bits of memory. So the single largest value that could span your entire memory (not that x can actually get this large; I just want to highlight the physical limit) would be 2**68719476736, which lies somewhere between 2^(10^10) and 2^(10^11).
So as it stands, I think your code cannot work for numbers this large.
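Since only the last two digits of each power matter, one way around this is to never build the full power at all. Here is a minimal sketch, assuming the task is exactly what the posted code implies (sum the last two digits of each 2**x and print the sum mod 100), using Python's three-argument pow:

n = int(input())
xs = list(map(int, input().rstrip().split()))
total = 0
for xi in xs[:n]:
    total = (total + pow(2, xi, 100)) % 100   # 2**xi mod 100 without the huge intermediate
print(total)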
Recently I took part in a competition for middle school girls. I ran across this problem and I have been working on it for a few weeks. Here is the problem:
I. Ventilator Shipments
At the local hospital, Gabriela keeps track of all the ventilator shipments. Recently, a new factory has been established to produce ventilators. She knows that the new factory is almost extraordinary in its production, as on a certain day Di, it produces the same amount of ventilators as the product of the previous K days' production. However, the hospital's computer can only handle non-negative numbers less than P, a prime number. Gabriela knows the production value, Di, for each of the first K days. Accordingly, Gabriela wants to know how many ventilators are produced after N days. If this number is greater than or equal to P, the computer displays the remainder of the number of ventilators produced divided by P.
Input
Line 1: Three space-separated integers N, K, P
Lines 2...K+1: A single integer Di
Output
Line 1: Number of ventilators produced after N days as displayed by the computer
Example Input:
5 2 7
1
3
Output:
6
Note:
2 ≤ N ≤ 1000000
1 ≤ K ≤ N
2 ≤ P ≤ 1000003 (where P is guaranteed to be prime)
1 ≤ Di ≤ P−1
The time limit for this problem has been extended to 2000 ms.
I have tried 3 different methods
Here is the first:
import math
import sys

string = sys.stdin.readline()
string = string.rstrip()
arr = [0]*3
arr = string.split(' ')
n = int(arr[0])
k = int(arr[1])
p = int(arr[2])
mylist = [0]*k
for i in range(k):
    a = int(sys.stdin.readline())
    mylist[i] = a % p
product = math.prod(mylist)
for start in range(n-k):
    smallest = mylist[start % k]
    mylist[start % k] = product % p
    product = product*(product % p)
    product = product//smallest
sys.stdout.write(str(mylist[start % k]))
In another method I used a queue:
import math
from collections import deque
import sys

string = sys.stdin.readline()
string = string.rstrip()
arr = [0]*3
arr = string.split(' ')
n = int(arr[0])
k = int(arr[1])
p = int(arr[2])
q = deque()
for i in range(k):
    a = int(sys.stdin.readline())
    q.append(a % p)
product = math.prod(q)
for i in range(n-k):
    q.append(product % p)
    product = product*(product % p)
    smallest = q.popleft()
    product = product//smallest
sys.stdout.write(str(q.pop()) + '\n')
However, I'm still getting time limit exceeded on test case 8. Given the time and space constraints, I don't think I can use any kind of structure (list, queue, etc.) to solve this problem. Can someone give me an idea on how to solve this problem?
The problem is not with your data structures so much as your algorithmic overhead. Your first attempt includes a multiplication and five divisions in each loop, plus two list accesses and four assignments. Your second attempt has three divisions, three assignments, and two list-changing operations.
You might want to experiment a little to determine roughly how many operations you can perform in 2 seconds. How long does it take you to run 10**6 loop iterations with a trivial body? I suspect that you're not going to be able to carry out an iterative solution.
Instead of carrying out each iterative computation individually, try focusing on the problem as given. You do not need each day's output; you only need to compute the final day's output, modulo p. That production is a high-order product of the input production sequence (the "seed" days of production). How many times does each of those days appear in that final product? For large n, what is the cycle of values produced? Most importantly, what exponent gives you a modular residue of 1? (By Fermat's little theorem, it's p-1 for any base not divisible by p.)
Compute how many times each factor appears in the final product; call it use. Reduce that mod p-1. Writing d[i] for the i-th seed day's production, you now have an expression such as
product = d[0] ** (use[0] % (p-1)) *
          d[1] ** (use[1] % (p-1)) *
          ...
print(product % p)
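To make that hint concrete, here is a rough sketch (the function name ventilators and the brute-force counting loop are mine, not part of the answer above): it builds the count vector for each day with the recurrence "day i multiplies the previous k days", reduces the counts mod p-1, and combines them with Python's three-argument pow. The counting loop is still the straightforward O((n-k)*k) version, so treat it as an illustration of the idea rather than something that meets the time limit as-is.

def ventilators(n, k, p, seeds):
    if n <= k:
        return seeds[n - 1] % p
    # window[j][m] = how many times seeds[m] appears in one of the last k
    # days' production; day i (> k) multiplies the previous k days, so its
    # count vector is the sum of the previous k count vectors (taken mod p-1,
    # which is safe because 1 <= D_i <= p-1).
    window = [[1 if j == m else 0 for m in range(k)] for j in range(k)]
    for i in range(k, n):
        new = [sum(col) % (p - 1) for col in zip(*window)]
        window.pop(0)
        window.append(new)
    use = window[-1]                      # exponents for day n's product
    product = 1
    for d, e in zip(seeds, use):
        product = product * pow(d, e, p) % p
    return product

print(ventilators(5, 2, 7, [1, 3]))       # 6, matching the sample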
I was going through problems on www.spoj.com and the prime generator problem caught my attention as a good warmup problem to get started.
The problem is straightforward and easy to code. But every time I submit my code I get a runtime error. I'm guessing it's a code efficiency problem, and I would love to expand my knowledge on what is memory efficient and what is not.
https://www.spoj.com/problems/PRIME1/
import numpy as np

run_times = int(input())
for i in range(run_times):
    search_range = list(map(int, input().split()))
    for j in range(search_range[0], search_range[1]):
        a = j % np.arange(2, j)
        if 0 in a:
            pass
        elif 0 not in a and j > 0:
            print(j)
This is far from optimal, but it is a simple rework that should improve your code's speed significantly and fix a couple of bugs:
import numpy as np

run_times = int(input())
for _ in range(run_times):
    start, stop = map(int, input().split())
    for number in range(max(start, 2), stop + 1):
        array = number % np.arange(2, int(number ** 0.5) + 1)
        if 0 not in array:
            print(number)
Bugs-wise, your code outputs 1 as a prime but it shouldn't. Also, given the range '3 5' your code outputs '3' but the example shows '3 5' so you don't have your end point set correctly. Finally, your j > 0 comes late in your code (after you've done work you'll end up throwing away) and isn't necessary as the problem indicates a minimum of 1 as input.
Speed-wise, let's consider a worst case based on the problem description:
1
999900000 1000000000
My revised code should solve this in just under a minute. I don't believe yours would have even found its first prime in that amount of time. If this isn't fast enough, there's more that could be done (e.g., treat 2 as a special case and only compute with odd numbers; reuse calculations instead of tossing them; etc.).
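For what it's worth, here is a hedged sketch of those two follow-ups (treat 2 as a special case, test only odd divisors); the function name primes_between is mine, not part of the problem or the code above:

import numpy as np

def primes_between(start, stop):
    if start <= 2 <= stop:
        yield 2
    first_odd = max(start, 3) | 1                      # smallest odd >= max(start, 3)
    for number in range(first_odd, stop + 1, 2):       # odd candidates only
        divisors = np.arange(3, int(number ** 0.5) + 1, 2)
        if not np.any(number % divisors == 0):
            yield number

print(*primes_between(3, 5))                           # 3 5, matching the example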
None of this is about memory efficiency -- it's all about using a correct algorithm.
I'm looking for a pseudo-random number generator (an algorithm where you input a seed number and it outputs a different 'random-looking' number, and the same seed will always generate the same output) for numbers between 1 and 95^1,312,000.
I would use the Linear Feedback Shift Register (LFSR) PRNG, but if I did, I would have to convert the seed number (which could be up to 1.2 million digits long in base-10) into a binary number, which would be so massive that I think it would take too long to compute.
In response to a similar question, the Feistel cipher was recommended, but I didn't understand the vocabulary of the wiki page for that method (I'm going into 10th grade so I don't have a degree in encryption), so if you could use layman's terms, I would strongly appreciate it.
Is there an efficient way of doing this which won't take until the end of time, or is this problem impossible?
Edit: I forgot to mention that the prng sequence needs to have a full period. My mistake.
A simple way to do this is to use a linear congruential generator with modulus m = 95^1312000.
The formula for the generator is x_(n+1) = a*x_n + c (mod m). By the Hull-Dobell Theorem, it will have full period if and only if gcd(m,c) = 1 and 95 divides a-1. Furthermore, if you want good second values (right after the seed) even for very small seeds, a and c should be fairly large. Also, your code can't store these values as literals (they would be much too big). Instead, you need to be able to reliably produce them on the fly. After a bit of trial and error to make sure gcd(m,c) = 1, I hit upon:
import random

def get_book(n):
    random.seed(1941)  # Borges' Library of Babel was published in 1941
    m = 95**1312000
    a = 1 + 95 * random.randint(1, m//100)
    c = random.randint(1, m - 1)  # math.gcd(c, m) = 1
    return (a*n + c) % m
For example:
>>> book = get_book(42)
>>> book % 10**100
4779746919502753142323572698478137996323206967194197332998517828771427155582287891935067701239737874
shows the last 100 digits of "book" number 42. Given Python's built-in support for large integers, the code runs surprisingly fast (it takes less than 1 second to grab a book on my machine)
If you have a method that can produce a pseudo-random digit, then you can concatenate as many together as you want. It will be just as repeatable as the underlying prng.
However, you'll probably run out of memory scaling that up to millions of digits and attempting to do arithmetic. Normally stuff on that scale isn't done on "numbers". It's done on byte vectors, or something similar.
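For modest sizes, the digit-concatenation idea might look like this sketch (the helper name pseudo_random_digits is mine):

import random

def pseudo_random_digits(seed, n_digits):
    rng = random.Random(seed)                 # same seed -> same digit stream
    return ''.join(str(rng.randrange(10)) for _ in range(n_digits))

print(pseudo_random_digits(42, 20))           # repeatable 20-digit string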
I'm trying to gather some statistics on prime numbers, among which is the distribution of factors for the number (prime-1)/2. I know there are general formulas for the size of factors of uniformly selected numbers, but I haven't seen anything about the distribution of factors of one less than a prime.
I've written a program to iterate through primes starting at the first prime after 2^63, and then factor the (prime - 1)/2 using trial division by all primes up to 2^32. However, this is extremely slow because that is a lot of primes (and a lot of memory) to iterate through. I store the primes as a single byte each (by storing the increment from one prime to the next). I also use a deterministic variant of the Miller-Rabin primality test for numbers up to 2^64, so I can easily detect when the remaining value (after a successful division) is prime.
I've experimented with variants of Pollard rho and elliptic curve factorization, but it is hard to find the right balance between trial division and switching to these more complicated methods. Also, I'm not sure I've implemented them correctly, because sometimes they seem to take a very long time to find a factor, and based on their asymptotic behavior, I'd expect them to be quite quick for such small numbers.
I have not found any information on factoring many numbers (vs just trying to factor one), but it seems like there should be some way to speed up the task by taking advantage of this.
Any suggestions, pointers to alternate approaches, or other guidance on this problem is greatly appreciated.
Edit:
The way I store the primes is by storing an 8-bit offset to the next prime, with the implicit first prime being 3. Thus, in my algorithms, I have a separate check for division by 2, then I start a loop:
factorCounts = collections.Counter()
while N % 2 == 0:
    factorCounts[2] += 1
    N //= 2
pp = 3
for gg in smallPrimeGaps:
    if pp*pp > N:
        break
    if N % pp == 0:
        while N % pp == 0:
            factorCounts[pp] += 1
            N //= pp
    pp += gg
Also, I used a wheel sieve to calculate the primes for trial division, and I use an algorithm based on the remainder by several primes to get the next prime after the given starting point.
I use the following for testing if a given number is prime (porting code to c++ now):
bool IsPrime(uint64_t n)
{
    if(n < 341531)
        return MillerRabinMulti(n, {9345883071009581737ull});
    else if(n < 1050535501)
        return MillerRabinMulti(n, {336781006125ull, 9639812373923155ull});
    else if(n < 350269456337)
        return MillerRabinMulti(n, {4230279247111683200ull, 14694767155120705706ull, 1664113952636775035ull});
    else if(n < 55245642489451)
        return MillerRabinMulti(n, {2ull, 141889084524735ull, 1199124725622454117ull, 11096072698276303650ull});
    else if(n < 7999252175582851)
        return MillerRabinMulti(n, {2ull, 4130806001517ull, 149795463772692060ull, 186635894390467037ull, 3967304179347715805ull});
    else if(n < 585226005592931977)
        return MillerRabinMulti(n, {2ull, 123635709730000ull, 9233062284813009ull, 43835965440333360ull, 761179012939631437ull, 1263739024124850375ull});
    else
        return MillerRabinMulti(n, {2ull, 325ull, 9375ull, 28178ull, 450775ull, 9780504ull, 1795265022ull});
}
I don't have a definitive answer, but I do have some observations and some suggestions.
There are about 2*10^17 primes between 2^63 and 2^64, so any program you write is going to run for a while.
Let's talk about a primality test for numbers in the range 2^63 to 2^64. Any general-purpose test will do more work than you need, so you can speed things up by writing a special-purpose test. I suggest strong-pseudoprime tests (as in Miller-Rabin) to bases 2 and 3. If either of those tests shows the number is composite, you're done. Otherwise, look up the number (binary search) in a table of strong-pseudoprimes to bases 2 and 3 (ask Google to find those tables for you). Two strong pseudoprime tests followed by a table lookup will certainly be faster than the deterministic Miller-Rabin test you are currently performing, which probably uses six or seven bases.
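A sketch of that filter, assuming odd n > 3 (the follow-up table lookup for the rare composites that pass both bases is omitted here):

def is_sprp(n, base):
    # strong-pseudoprime (Miller-Rabin) test for a single base
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(base, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False

def passes_bases_2_and_3(n):
    return is_sprp(n, 2) and is_sprp(n, 3)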
For factoring, trial division to 1000 followed by Brent-Rho until the product of the known prime factors exceeds the cube root of the number being factored ought to be fairly fast, a few milliseconds. Then, if the remaining cofactor is composite, it will necessarily have only two factors, so SQUFOF would be a good algorithm to split them. It is faster than the other methods here because all its arithmetic is done with numbers less than the square root of the number being factored, which in your case means the factorization can be done with 32-bit instead of 64-bit arithmetic.
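As a reference point for the rho step, here is a plain Pollard rho sketch (Brent's variant reorganizes the cycle detection and batches the gcds, but the idea is the same); it assumes n is an odd composite:

import math
import random

def pollard_rho(n):
    # returns a nontrivial (not necessarily prime) factor of a composite n
    while True:
        c = random.randrange(1, n)
        x = y = random.randrange(2, n)
        d = 1
        while d == 1:
            x = (x * x + c) % n
            y = (y * y + c) % n
            y = (y * y + c) % n
            d = math.gcd(abs(x - y), n)
        if d != n:
            return d
        # d == n means an unlucky cycle; retry with a new constant c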
Instead of factoring and primality tests, a better method uses a variant of the Sieve of Eratosthenes to factor large blocks of numbers. That will still be slow, as there are 203 million sieving primes less than 2^32, and you will need to deal with the bookkeeping of a segmented sieve, but considering that you factor lots of numbers at once, it's probably the best approach to your task.
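A sketch of that block idea (the names here are mine; small_primes would come from an ordinary sieve, and a real version needs the segmented-sieve bookkeeping mentioned above):

def factor_block(lo, hi, small_primes):
    # trial-divide every number in [lo, hi) by each small prime, but walk the
    # multiples of each prime instead of testing every (number, prime) pair
    remaining = list(range(lo, hi))           # remaining[i] = unfactored part of lo+i
    factors = [[] for _ in range(hi - lo)]
    for p in small_primes:
        start = ((lo + p - 1) // p) * p       # first multiple of p in [lo, hi)
        for m in range(start, hi, p):
            i = m - lo
            while remaining[i] % p == 0:
                factors[i].append(p)
                remaining[i] //= p
    return factors, remaining                 # any remaining[i] > 1 still needs a primality test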
I have code for everything mentioned above at my blog.
This is how I store primes for later:
(I'm going to assume you want the factors of the number, and not just a primality test).
Copied from my website http://chemicaldevelopment.us/programming/2016/10/03/PGS.html
I’m going to assume you know the binary number system for this part. If not, just think of 1 as a “yes” and 0 as a “no”.
So, there are plenty of algorithms to generate the first few primes. I use the Sieve of Eratosthenes to compute a list.
But, if we stored the primes as an array, like [2, 3, 5, 7] this would take up too much space. How much space exactly?
Well, 32-bit integers (which can store values up to 2^32 - 1) each take up 4 bytes, because each byte is 8 bits and 32 / 8 = 4.
If we wanted to store each prime under 2,000,000,000, we would have to store over 98,000,000 of them. This takes up more space, and is slower at runtime than a bitset, which is explained below.
This approach will take 98,000,000 integers of space (each is 32 bits, which is 4 bytes), and when we check at runtime, we will need to check every integer in the array until we find it, or we find a number that is greater than it.
For example, say I give you a small list of primes: [2, 3, 5, 7, 11, 13, 17, 19]. I ask you if 15 is prime. How do you tell me?
Well, you would go through the list and compare each to 15.
Is 2 = 15?
Is 3 = 15?
. . .
Is 17 = 15?
At this point, you can stop because you have passed where 15 would be, so you know it isn’t prime.
Now then, let’s say we use a list of bits to tell you if the number is prime. The list above would look like:
001101010001010001010
This starts at index 0 and goes to index 20.
The 1s mean that the index is prime. Count from the left to index 2: that bit is 1, which indicates that 2 is prime.
In this case, if I asked you to check whether 15 is prime, you don't need to go through all the numbers in the list; all you need to do is jump straight to the bit at index 15 and check it.
And for memory usage, the first approach uses 98000000 integers, whereas this one can store 32 numbers in a single integer (using the list of 1s and 0s), so we would need
2000000000/32=62500000 integers.
So it uses roughly 64% as much memory as the first approach, and is much faster to use.
We store the array of integers from the second approach in a file, then read it back when you run.
This uses 250 MB of RAM to store primality data for every number up to 2,000,000,000.
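Here is a hedged sketch of such a bitset, using a plain bytearray (one bit per number) rather than the bitarray package linked at the end of this answer:

def sieve_bits(limit):
    bits = bytearray([0xFF]) * (limit // 8 + 1)          # start with every bit set
    def clear(n): bits[n >> 3] &= ~(1 << (n & 7))
    clear(0); clear(1)
    for p in range(2, int(limit ** 0.5) + 1):
        if bits[p >> 3] >> (p & 7) & 1:                  # p is still marked prime
            for m in range(p * p, limit + 1, p):
                clear(m)
    return bits

bits = sieve_bits(100)
is_prime = lambda n: bits[n >> 3] >> (n & 7) & 1
print([n for n in range(20) if is_prime(n)])             # [2, 3, 5, 7, 11, 13, 17, 19]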
You can further reduce this with wheel sieving (like what you did storing (prime-1)/2)
I'll go a little bit more into wheel sieve.
You got it right by storing (prime - 1)/2, and 2 being a special case.
You can extend this to p# (the product of the first p primes).
For example, with 1# = 2 you only store information for numbers of the form (1#)*k + 1 = 2k + 1.
More generally, you can use the set of linear forms (n#)*k + L, where L is the set of residues less than n# that are coprime to n# (for small wheels, that is 1 together with the primes below n# other than the first n primes).
So with 2# = 6 you only need to store info for 6*k+1 and 6*k+5, and you can push this further with larger wheels, because L = {1, 2, 3, 5} \ {2, 3} = {1, 5}.
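A quick way to see which residues a given wheel keeps (they are exactly the residues coprime to the primorial) is a snippet like this:

from math import gcd
primorial = 2 * 3                  # try 2 * 3 * 5 = 30 for the next bigger wheel
print([r for r in range(primorial) if gcd(r, primorial) == 1])     # [1, 5]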
These methods should give you an understanding of some of the methods behind it.
You will need some way to implement this bitset, such as a list of 32-bit integers, or a string.
Look at: https://pypi.python.org/pypi/bitarray for a possible abstraction
I was running a procedure to be like one of those games where people try to guess a number between 0 and 100, with 100 people guessing. I then averaged how many different guesses there are.
import random

def averager(times):
    tests = []
    for i in range(times):
        l = []
        for i in range(0, 100):
            l.append(random.randint(0, 100))
        tests.append(len(set(l)))
    return (sum(tests))/len(tests)

print(averager(1000))
For some reason, the number of different guesses averages out to 63.6
Why is this? Is it due to a flaw in the Python random library?
In a scenario where people were guessing a number between 1 and 10
The first person has a 100% chance to guess a previously unguessed number
The second person has a 90% chance to guess a previously unguessed number
The third person has an 80% chance to guess a previously unguessed number
and so on...
The average chance of guessing a new number (by my reasoning) is 55%.
But the data doesn't reflect this.
Your code is for finding the average number of unique guesses made by 100 people, each guessing a number from 0 to 100.
As for why it converges to a number around 63... you should post your question to the math Stack Exchange.
If this was a completely flat distribution, you would expect the average to come out as 100, meaning everybody's guess was different. However, you know that such a scenario is much less random than a scenario where you have duplication. The fact that you get repeated numbers during a random sequence should be comforting.
All you are doing here is measuring some kind of uniqueness within very small sets: i.e., 1000 repeats of an experiment involving 100 random values. You might get a better appreciation of this if you use some sort of bootstrapping algorithm to resample from your data.
Also, if you scale up the number of repeats to millions, and perhaps measure the sample distribution (not just the mean), you'll have a little more confidence in the results you're getting.
It may be that the pseudo-random generator has a characteristic which yields approximately 60-70% non-repeated values inside a sequence the same length as the range. However, you would need to experiment with far more samples, as well as different random seeds. Otherwise your results are meaningless.
I modified your code so it would take an already generated sequence as input, rather than calculating random numbers:
def averager(seqs):
    tests = []
    for s in seqs:
        tests.append(len(set(s)))
    return float(sum(tests))/len(tests)
Then I made a function to return all possible choices for any given number of people and guess range:
import itertools

def combos(n, limit):
    return itertools.product(*((range(limit),) * n))
(One of the things I love about Python is that it's so easy to break apart a function into trivial pieces.)
Then I started testing with increasing numbers:
for n in range(2, 100):
    x = averager(combos(n, n))
    print n, x, x/n
2 1.5 0.75
3 2.11111111111 0.703703703704
4 2.734375 0.68359375
5 3.3616 0.67232
6 3.99061213992 0.66510202332
7 4.62058326038 0.660083322911
8 5.25112867355 0.656391084194
This algorithm has a horrible complexity, so at this point I got a MemoryError. As you can see, the percentage of unique results keeps dropping as the number of people and guess range keeps increasing.
Repeating the test with random numbers:
def rands(repeats, n, limit):
    for i in range(repeats):
        yield [random.randint(0, limit) for j in range(n)]

for n in range(10, 101, 10):
    x = averager(rands(10000, n, n))
    print n, x, x/n
10 6.7752 0.67752
20 13.0751 0.653755
30 19.4131 0.647103333333
40 25.7309 0.6432725
50 32.0471 0.640942
60 38.3333 0.638888333333
70 44.6882 0.638402857143
80 50.948 0.63685
90 57.3525 0.63725
100 63.6322 0.636322
As you can see the results are consistent with what we saw earlier and with your own observation. I'm sure a bit of combinatorial math could explain it all.
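The combinatorial math is short, for what it's worth: by linearity of expectation, the expected number of distinct values when n people pick uniformly from m values is m * (1 - (1 - 1/m)**n). A quick check against the simulations above:

n, m = 100, 101                        # randint(0, 100) has 101 possible values
print(m * (1 - (1 - 1.0/m) ** n))      # about 63.66, in line with the ~63.6 seen above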