Sum of a large list in python


Find the index of an integer in a given large list whose removal does not affect the mean of the list.
I have tried a linear-time approach, but it fails test cases where the numbers are above 10^9 and the list has more than 10^5 elements.
Please suggest a better approach to this problem, or a more efficient way to sum a large list of large values.
Here is my code:
for _ in range(int(input())):
    n=int(input())
    #ar=list(map(int,input().split()))
    ar=[int(x) for x in input().split()]
    me=sum(ar)/n
    for j in range(n):
        #arr2=deepcopy(ar)
        arr2=ar[:]
        #arr2=[]
        #for _ in ar:
        #    arr2.append(_)
        arr2.remove(ar[j])
        if (sum(arr2)/(n-1))==me:
            print(j+1)
            break
    else:
        print("Impossible")
The code fails two of the 10 test cases, apparently because of the larger list length and integer size.

You copy the entire list on every iteration, which is expensive (note that ar[:] is a shallow copy, not a deep one). Why not just check whether an item equals the mean?
for _ in range(int(input())):
    n = int(input())
    ar = [int(x) for x in input().split()]
    mean = sum(ar) / n
    found = False
    for j in range(n):
        if ar[j] == mean:
            print(j, " is the result.")
            found = True
            break
    if not found:
        print("Impossible")

This looks more like a math problem than an algorithm challenge:
given that mean = sum(list)/len(list), we need a value k such that:
mean = (sum(list)-k)/(len(list)-1)
Which a little algebra tells us is:
k = sum(list) - mean * (len(list)-1)
k = sum(list) - mean * len(list) + mean
k = sum(list) - sum(list) + mean <-- because mean*len(list) = sum(list)
k = mean
for example:
a = [1,2,3,4,5,6,7]
k = sum(a)/len(a) # 4
ik = a.index(k) # 3
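
Since k must equal the mean, the check can stay entirely in integers, which avoids comparing floats when the values are large. A minimal sketch, assuming the list holds Python ints and that the expected answer is the 1-based index of the first match (as in the question's print(j+1)); the name mean_preserving_index is made up for illustration:

```python
def mean_preserving_index(values):
    """Return the 1-based index of the first element equal to the mean, or None."""
    n = len(values)
    total = sum(values)          # Python ints are arbitrary precision, so no overflow
    if n == 0 or total % n != 0:
        return None              # the mean is not an integer, so no element can equal it
    mean = total // n
    try:
        return values.index(mean) + 1
    except ValueError:
        return None              # the mean is an integer but does not occur in the list


print(mean_preserving_index([1, 2, 3, 4, 5, 6, 7]))   # 4
print(mean_preserving_index([1, 2, 4]))               # None
```

This does one pass for the sum and one for the search, so it stays linear in the length of the list.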

Related

All possible way of adding up number in a sequence so that it becomes a given number

I was given a range n and a number k. Count the ways two distinct numbers in the range 1..n can add up to k. Can this be done without nested loops?
Here's my approach. The only problem is that it uses a nested loop, which is slow. Mirrored pairs like (A, B) and (B, A) count as one.
n, k = int(input()), int(input())
cnt = 0
for i in range(1, n+1):
    for s in range(1, n+1):
        if i == 1 and s == 1 or i == n+1 and s==n+1:
            pass
        else:
            if i+s==k:
                cnt += 1
print(int(cnt/2))
Example input (first line is n, second is k):
8
5
Explanation: the pairs are (1, 4) and (2, 3), so I should print 2.

You only need a single loop for this:
n = int(input('N: ')) # range 1 to n
k = int(input('K: '))
r = set(range(1, n+1))
c = 0
while r:
    if k - r.pop() in r:
        c += 1
print(c)

If I understood you correctly, a single while loop counting up towards k is enough:
counter = 0
while counter<min(n,k)/2:
    if counter+(k-counter) == k: # This is always true actually...
        print(counter,k-counter)
    counter+=1
Starting from 0 and going up towards k, the pairs are counter and k - counter (the complement to k, i.e. the result of subtracting counter from k).
We only need to count up to the smaller of n and k, because numbers bigger than k cannot add up to k.
In fact we only need to count up to half of that, because the results are symmetric after that point.
So, if you don't want to print each pair, it's simply:
count = int(min(n,k)//2)

Why iterate and check number combinations when you can derive the count of valid pairs directly from n and k?
Every number i in the range 1..n pairs with k-i, and the pair is valid only when both i and k-i lie within 1..n and are not equal.
Which of n and k is larger determines the range of valid i:
for n >= k, the valid range is from 1 to k-1,
and for n < k, the valid range is from k-n to n.
The count of a range a to b is b-a+1.
Since the pairs are symmetric in both cases, this range count is halved (rounding down, which also discards the case i = k-i).
So the entire code becomes
n = int(input())
k = int(input())
if n >= k:
    print(max(0, (k - 1) // 2))
else:
    # the range k-n..n is empty when k > 2*n, so clamp at zero
    print(max(0, (2 * n - k + 1) // 2))
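
The two cases above can also be folded into a single range and checked against a direct count. This is only a sketch: the function names count_pairs and count_pairs_brute are made up for illustration, and it assumes the range is 1..n as in the question.

```python
def count_pairs(n, k):
    """Count unordered pairs (a, b) with 1 <= a < b <= n and a + b == k."""
    lo = max(1, k - n)      # smallest a whose partner k - a is still <= n
    hi = min(n, k - 1)      # largest a whose partner k - a is still >= 1
    if lo > hi:
        return 0            # no value in 1..n has its partner in 1..n
    # lo..hi is symmetric around k/2, so half of it (rounded down) has a < k - a
    return (hi - lo + 1) // 2


def count_pairs_brute(n, k):
    """Reference implementation: test each a directly with a single loop."""
    return sum(1 for a in range(1, n + 1) if a < k - a <= n)


print(count_pairs(8, 5))   # 2, the pairs (1, 4) and (2, 3)
```

The brute-force version is there only to cross-check the formula on small inputs; the closed form does no looping at all.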

A problem of combinatorics. The following code uses Python's built-in itertools library to generate all possible combinations of every size and keeps those whose sum is k:
from itertools import combinations
n = 10
k = 5
n_range = [i for i in range(1, n+1)]
result = []
for i in n_range:
    n_comb = combinations(n_range, i)
    for comb in n_comb:
        if sum(comb) == k:
            result.append(comb)
print(result)

How many numbers have n as their smallest prime factor within 10^6?

for _ in range(int(input())):
    n=int(input())
    least_prime = [0] * (1000001)
    count=0
    for i in range(2, int(1000001**0.5 + 1)):
        if least_prime[i] == 0:
            least_prime[i] = i
            for j in range(i * i, 1000000, 2*i) :
                if (least_prime[j] == 0) :
                    least_prime[j] = i
    for i in range(2, 1000001) :
        if least_prime[i] == n:
            count+=1
    print(count)
Can anyone reduce the time complexity of this solution? I tried for an hour or so but can't simplify it further.
The first for loop runs over the test cases. The question is: how many numbers up to 10^6 have n as their smallest prime factor?

There are three problems with your code:
You repeat the same work for every test case, which wastes time and resources.
You ignore the primes above int(1000001**0.5 + 1), so they are left unmarked.
By stepping 2*i in your j range you mislabel a bunch of numbers. For example 6, which should have 2 as its least prime, gets marked as itself because the inner loop skips it: with i=2 the step is 4, so you go 4, 8, 12, ... instead of 4, 6, 8, ...
How to fix it? Build the sieve first and only once; there is no need to repeat the exact same work for every test case (if n is always prime, there are only 78498 possible values of n, the number of primes below 10**6). The other two points are simple fixes.
I would put the sieve inside its own function, which makes it reusable and easy to play with. There is nothing wrong with how you count, but a Counter is more convenient because it does all the counting at once.
from collections import Counter
def make_sieve(MAX):
    MAX += 1
    least_prime = [0] * MAX
    for i in range(2, MAX):
        if least_prime[i] == 0:
            least_prime[i] = i
            for j in range(i * i, MAX, i) :
                if (least_prime[j] == 0) :
                    least_prime[j] = i
    return least_prime
result = Counter(make_sieve(10**6))
print("n=2->",result[2])# n=2-> 500000
print("n=3->",result[3])# n=3-> 166667
So now your test loop can be as simple as
for _ in range(int(input())):
    n = int(input())
    print(result[n])
For completeness, here is how you can do it without the Counter
least_prime = make_sieve(10**6)
for _ in range(int(input())):
    n = int(input())
    count = 0
    for p in least_prime:
        if p==n:
            count+=1
    print(count)
but that is also longer than necessary; a list already does this for us with .count
least_prime = make_sieve(10**6)
for _ in range(int(input())):
    n = int(input())
    count = least_prime.count(n)
    print(count)
The Counter is still better, because you get all the answers in a single pass. If needed you can build your own with a regular dictionary, but I leave that as an exercise for the reader.
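
For readers who want to compare notes on the dictionary version, here is one possible sketch. It uses a small limit of 100 so the result is easy to check by hand, restates a sieve so the block runs on its own, and the helper name count_by_least_prime is made up for illustration.

```python
def make_sieve(limit):
    """least_prime[i] is the smallest prime factor of i (0 for i < 2)."""
    least_prime = [0] * (limit + 1)
    for i in range(2, limit + 1):
        if least_prime[i] == 0:                  # i is prime
            for j in range(i, limit + 1, i):     # starts at i, so i labels itself too
                if least_prime[j] == 0:
                    least_prime[j] = i
    return least_prime


def count_by_least_prime(limit):
    """Map each prime p to how many numbers in 2..limit have p as least prime factor."""
    counts = {}
    for p in make_sieve(limit)[2:]:              # skip the placeholders for 0 and 1
        counts[p] = counts.get(p, 0) + 1
    return counts


counts = count_by_least_prime(100)
print(counts[2], counts[3], counts[5], counts[7])   # 50 17 7 4
```

A query for a value that is not prime should use counts.get(n, 0), since such keys never appear in the dictionary.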

Python: find top k-biggest numbers in an array

What's wrong with this code? Please do not use built-in functions.
Also, could you add a check that int(k) < int(size)?
numbers = list()
size = input("Enter the size of an array: ")
for i in range(int(size)):
    n = input("number: ")
    numbers.append(int(n))
print(numbers)
k = input("k = ")
max = numbers[0]
top = list()
for j in range(int(k)):
    for x in numbers:
        if x > max:
            max = x
    top.append(max)
    del numbers[numbers.index(max)]
print(top)

Here is the corrected version of your code:
size = int(input("Enter the size of an array: ")) # Ask the user for the amount of values that will be entered
numbers = [int(input("number: ")) for _ in range(size)] # Use a list comprehension to store each input
k = int(input("k = ")) # The number of greatest values to be printed
top = list() # List to store the greatest values
for i in range(0, k): # For every unit in the number of greatest values
    max = numbers[0] # Set the maximum to any value, I chose index 0 to avoid IndexError
    for j in numbers: # For every number in the list of numbers
        if j > max: # If that number is greater than max
            max = j # Set max to that number and repeat
    top.append(max) # Add the greatest to the top list
    numbers.remove(max) # Now remove the greatest so we can proceed to find the next greatest
print(top) # Print our result!
Please note that it is bad practice to give your variables the names of built-in functions, so consider renaming max to mx.
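
The question also asks for a check that k is smaller than the size of the list. A minimal sketch of the same repeated-selection idea wrapped in a function with that check; the name top_k and the choice to raise ValueError are assumptions made for illustration, and "no built-in functions" is taken to mean no max() or sort():

```python
def top_k(numbers, k):
    """Return the k largest values, largest first, without max() or sort()."""
    if not 0 < k < len(numbers):
        raise ValueError("k must satisfy 0 < k < len(numbers)")
    remaining = list(numbers)            # work on a copy so the caller's list is untouched
    top = []
    for _ in range(k):
        best = 0                         # index of the largest value seen so far
        for i in range(1, len(remaining)):
            if remaining[i] > remaining[best]:
                best = i
        top.append(remaining.pop(best))  # move the largest value into the result
    return top


print(top_k([4, 9, 1, 7, 9, 3], 3))   # [9, 9, 7]
```

Tracking the index instead of the value lets pop() remove the element directly, without a second search through the list.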

You want the k largest numbers in the list, right?
You could just sort the array and take the last k elements by iterating backwards, because after sorting the largest numbers are at the end.
The algorithm would be as follows:
0. Read the array in a for loop # You seem to have done this
1. Sort the array (Python has a native sort function, but let's define a simple bubble sort since you asked for no built-ins)
2. Initialize a list or a k-tuple to hold the largest numbers
3. Starting from index = array.length - 1, iterate down to index = array.length - k
4. Insert each element into the list from step 2.
In Python, it would be:
def kLargestNumbers(array,k):
    ### Assuming the input array has already been read
    bubbleSort(array) ## if the native sort is allowed, change this to array.sort()
    largest_k_numbers = list()
    for i in range(len(array)-1, len(array)-1-k,-1):
        largest_k_numbers.append(array[i])
    return largest_k_numbers
def bubbleSort(array):
    n = len(array)
    for i in range(n-1):
        for j in range(0, n-i-1):
            if array[j] > array[j+1] :
                array[j], array[j+1] = array[j+1], array[j]

Sum of all prime numbers between 1 and N in Python

I'm new to programming. While trying to solve this problem, I'm getting the wrong answer. I checked my code a number of times but was not able to figure out the mistake. Please, help me on this simple problem. The problem is as follows:
Given a positive integer N, calculate the sum of all prime numbers between 1 and N (inclusive). The first line of input contains an integer T denoting the number of test cases. T testcases follow. Each testcase contains one line of input containing N. For each testcase, in a new line, print the sum of all prime numbers between 1 and N.
And my code is:
from math import sqrt
sum = 0
test = int(input())
for i in range(test):
    max = int(input())
    if max==1:
        sum = 0
    elif max==2:
        sum += 2
    else:
        sum = sum + 2
        for x in range(3,max+1):
            half = int(sqrt(max)) + 1
            for y in range(2,half):
                res = x%y
                if res==0:
                    sum = sum + x
                    break
    print(sum)
For inputs 5 and 10, my code outputs 6 and 48 respectively, while the correct answers are 10 and 17. Please help me find the mistake in my code.

Here is a simple program to find the sum of all prime numbers between 1 and n.
primeAddition() is a function and ip is its input parameter. It may help you solve your problem. Try it.
Code snippet:
def primeAddition(ip):
    # list of flags; prime[i] stays True while i is still considered prime...
    prime = [True] * (ip + 1)
    p = 2
    while p * p <= ip:
        # If prime[p] is not changed, then it is a prime...
        if prime[p] == True:
            # Update all multiples of p...
            i = p * 2
            while i <= ip:
                prime[i] = False
                i += p
        p += 1
    # Return sum of prime numbers...
    sum = 0
    for i in range (2, ip + 1):
        if(prime[i]):
            sum += i
    return sum
#The program is ready... Now call the primeAddition() function with any argument... Here I pass 5 as an argument...
#Function call...
print(primeAddition(5))

This is the most broken part of your code; it does the opposite of what you want:
res = x%y
if res==0:
    sum = sum + x
    break
You should only increment sum if you get through the entire loop without breaking. (And don't use sum as a variable name, as you're redefining a Python built-in.) This can be done using the else clause of a for loop, aka "no break". I've made that change below and also corrected some inefficiencies:
from math import sqrt
T = int(input())
for _ in range(T):
    N = int(input())
    sum_of_primes = 0
    if N < 2:
        pass
    elif N == 2:
        sum_of_primes = 2
    else:
        sum_of_primes = 2
        for number in range(3, N + 1, 2):
            for odd in range(3, int(sqrt(number)) + 1, 2):
                if (number % odd) == 0:
                    break
            else: # no break
                sum_of_primes += number
    print(sum_of_primes)
OUTPUT
> python3 test.py
3
5
10
10
17
23
100
>

A slight modification to what you have:
from math import sqrt
sum = 0
test = int(input())
max = int(input())
for x in range(test,max+1):
    if x == 1:
        pass
    else:
        half = int(sqrt(x)) + 1
        for y in range(2,half):
            res = x%y
            if res==0:
                break
        else:
            sum = sum + x
print(sum)
Your biggest error was doing sum = sum + x before the break rather than in an else clause on the loop.
PS: Although you can, I'd recommend not using variable names like max and sum in your code. They are built-in functions, and your variables now shadow them.

Because your logic is not correct.
for y in range(2,half):
    res = x%y
    if res==0:
        sum = sum + x
        break
Here you check for factors and, if there is one, add the number to sum, which is the opposite of what makes a prime. Instead, add the numbers that have no factors (other than 1 and themselves).
from math import sqrt
test = int(input())
for i in range(test):
    sum = 0
    max = int(input())
    if max==1:
        sum = 0
    elif max==2:
        sum += 2
    else:
        sum = sum + 2
        for x in range(3,max+1):
            half = int(sqrt(x)) + 1
            if all(x%y!=0 for y in range(2,half)):
                sum = sum + x
    print(sum)

First of all, declare sum to be zero at the beginning of the for i loop.
The problem lies in the if statement near the very end of the code: you add x to the sum if res equals zero, which means the number is not prime. You can see this from the output of 6 for input 5: the only composite number up to and including 5 is 4, and you already added 2 to the sum at the beginning.
Last but not least, you should change the
half = int(sqrt(max)) + 1
line to
half = int(sqrt(x)) + 1
Try to work with my information provided and fix the code yourself. You learn the most by not copying other people's code.
Happy coding!

I believe the mistake in your code might be coming from the following lines of code:
for x in range(3,max+1):
    half = int(sqrt(max)) + 1
Since you are looping over x, you should change int(sqrt(max)) to int(sqrt(x)), like this:
for x in range(3,max+1):
    half = int(sqrt(x)) + 1
Your code uses the square root of max as the trial-division limit for every x, where it should use the square root of the number actually being tested.
This is my first time answering a question so if you need more help just let me know.

I can't find where I did wrong :(

I was working on Project Euler problem 23 in Python. For this problem, I have to find the sum of all numbers < 28124 that cannot be written as the sum of two abundant numbers. Abundant numbers are numbers that are smaller than the sum of their proper divisors.
My approach was: https://gist.github.com/anonymous/373f23098aeb5fea3b12fdc45142e8f7
from math import sqrt
def dSum(n): #find sum of proper divisors
    lst = set([])
    if n%2 == 0:
        step = 1
    else:
        step = 2
    for i in range(1, int(sqrt(n))+1, step):
        if n % i == 0:
            lst.add(i)
            lst.add(int(n/i))
    llst = list(lst)
    lst.remove(n)
    sum = 0
    for j in lst:
        sum += j
    return sum
#any numbers greater than 28123 can be written as the sum of two abundant numbers.
#thus, only have to find abundant numbers up to 28124 / 2 = 14062
abnum = [] #list of abundant numbers
sum = 0
can = set([])
for i in range(1,14062):
    if i < dSum(i):
        abnum.append(i)
for i in abnum:
    for j in abnum:
        can.add(i + j)
print (abnum)
print (can)
cannot = set(range(1,28124))
cannot = cannot - can
cannot = list(cannot)
cannot.sort ()
result = 0
print (cannot)
for i in cannot:
    result += i
print (result)
This gave me an answer of 31531501, which is wrong.
I googled it, and the answer should be 4179871.
There is a huge difference between the two answers, so I must not be removing every number that can be written as the sum of two abundant numbers. But when I re-read the code it looks logically fine...
Please save me from this despair.

For some practice, you really should look at comprehensions and at leveraging the built-ins (instead of shadowing them):
Your loops outside of dSum() (which can also be simplified) could look like:
import itertools as it
abnum = [i for i in range(1,28124) if i < dSum(i)]
can = {i+j for i, j in it.product(abnum, repeat=2)}
cannot = set(range(1,28124)) - can
print(sum(cannot)) # 4179871

There are a few ways to improve your code.
Firstly, here's a more compact version of dSum that's fairly close to your code. Operators are generally faster than function calls, so I use ** .5 instead of calling math.sqrt. I use a conditional expression instead of an if...else block to compute the step size. I use the built-in sum function instead of a for loop to add up the divisors; also, I use integer subtraction to remove n from the total because that's more efficient than calling the set.remove method.
def dSum(n):
    lst = set()
    for i in range(1, int(n ** .5) + 1, 2 if n % 2 else 1):
        if n % i == 0:
            lst.add(i)
            lst.add(n // i)
    return sum(lst) - n
However, we don't really need to use a set here. We can just add the divisor pairs as we find them, if we're careful not to add any divisor twice.
def dSum(n):
    total = 0
    for i in range(1, int(n ** .5) + 1, 2 if n % 2 else 1):
        if n % i == 0:
            j = n // i
            if i < j:
                total += i + j
            else:
                if i == j:
                    total += i
                break
    return total - n
This is slightly faster, and uses less RAM, at the expense of added code complexity. However, there's a more efficient approach to this problem.
Instead of finding the divisors (and hence the divisor sum) of each number individually, it's better to use a sieving approach that finds the divisors of all the numbers in the required range. Here's a simple example.
num = 28124
# Build a table of divisor sums.
table = [1] * num
for i in range(2, num):
    for j in range(2 * i, num, i):
        table[j] += i
# Collect abundant numbers
abnum = [i for i in range(2, num) if i < table[i]]
print(len(abnum), abnum[0], abnum[-1])
output
6965 12 28122
If we need to find divisor sums for a very large num a good approach is to find the prime power factors of each number, since there's an efficient way to compute the sum of the divisors from the prime power factorization. However, for numbers this small the minor time saving doesn't warrant the extra code complexity. (But I can add some prime power sieve code if you're curious; for finding divisor sums for all numbers < 28124, the prime power sieve technique is about twice as fast as the above code).
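
To illustrate the formula behind that remark: if n = p1^a1 * p2^a2 * ..., the sum of all divisors is the product of the terms (1 + p + p^2 + ... + p^a) over the prime powers. The sketch below applies it to a single number using trial division. It is not the prime power sieve mentioned above, only the per-number formula, and the function names are made up for illustration.

```python
def divisor_sum(n):
    """Sum of all divisors of n (including n), computed from its prime factorization."""
    total = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            term = 1          # accumulates 1 + p + p^2 + ... + p^a
            power = 1
            while n % p == 0:
                n //= p
                power *= p
                term += power
            total *= term
        p += 1
    if n > 1:                 # a single prime factor above the square root remains
        total *= 1 + n
    return total


def proper_divisor_sum(n):
    """Sum of the divisors of n excluding n itself."""
    return divisor_sum(n) - n


print(proper_divisor_sum(12))   # 16 = 1 + 2 + 3 + 4 + 6, so 12 is abundant
```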
AChampion's answer shows a very compact way to find the sum of the numbers that cannot be written as the sum of two abundant numbers. However, it's a bit slow, mostly because it loops over all pairs of abundant numbers in abnum. Here's a faster way.
def main():
    num = 28124
    # Build a table of divisor sums. table[0] should be 0, but we ignore it.
    table = [1] * num
    for i in range(2, num):
        for j in range(2 * i, num, i):
            table[j] += i
    # Collect abundant numbers
    abnum = [i for i in range(2, num) if i < table[i]]
    del table
    # Create a set for fast searching
    abset = set(abnum)
    print(len(abset), abnum[0], abnum[-1])
    total = 0
    for i in range(1, num):
        # Search for pairs of abundant numbers j <= d: j + d == i
        for j in abnum:
            d = i - j
            if d < j:
                # No pairs were found
                total += i
                break
            if d in abset:
                break
    print(total)
if __name__ == "__main__":
    main()
output
6965 12 28122
4179871
This code runs in around 2.7 seconds on my old 32-bit single-core 2GHz machine running Python 3.6.0. On Python 2, it's about 10% faster; I think that's because list comprehensions have less overhead in Python 2 (they run in the current scope rather than creating a new scope).
