Python Get Access to locals() Back In 2.7 to Prevent Duplicates - python

So I am creating a list of primes using the "sieve" method and a Python comprehension.
no_primes = [j for i in range(2,sqrt_n) for j in range(i*2, n, i)]
Problem is the Sieve method generates tons of duplicates in the 'no_primes' list. It was recommended to use locals()['_[1]'] to gain access to the list as it is being built and then removing the dups as they occur:
no_primes = [j for i in range(2,sqrt_n) for j in range(i*2, n, i) if j not in locals()['_[1]']]
Problem is, this ability has been removed as of 2.7 and so does not work.
I understand that this method may be "evil" (Dr. Evil with his pinky at his lips); however, I need to remove dups before they affect memory with a massive list. Yes, I can use filter or set to remove dups, but by then the list will have taken over my computer's memory, and filter or set will have a massive task ahead.
So how do I get this ability back? I promise not to take over the world with it.
Thanks.

You can use a set-comprehension (which automatically prevents duplicates):
no_primes = {j for i in range(2,sqrt_n) for j in range(i*2, n, i)}
You could then sort it into a list if necessary:
no_primes = sorted(no_primes)
For a further optimization, you can use xrange instead of range:
no_primes = {j for i in xrange(2,sqrt_n) for j in xrange(i*2, n, i)}
Unlike range, which builds a full list in Python 2, xrange returns a lazy sequence object that produces its values on demand.
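If the real concern is peak memory, another option is to skip collecting the multiples altogether and mark them in a fixed-size flag array instead. The following is only a sketch under that assumption: the function name composites_up_to is made up for illustration, and it uses one byte per number below n, so duplicates never take extra space. It is written to run on both Python 2.7 and Python 3.

```python
def composites_up_to(n):
    """Return the composite numbers below n, in order, each exactly once."""
    is_composite = bytearray(n)  # one zero byte per number 0 .. n-1
    i = 2
    while i * i < n:
        if not is_composite[i]:
            # Marking a number twice just rewrites the same byte.
            for j in range(i * i, n, i):
                is_composite[j] = 1
        i += 1
    return [k for k in range(4, n) if is_composite[k]]

print(composites_up_to(20))  # [4, 6, 8, 9, 10, 12, 14, 15, 16, 18]
```

The flag array has a fixed size of n bytes no matter how many times each multiple is hit, which is the property the question was after.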

A simple and readable approach would be:
def get_primes(n):
    # A set gives fast membership tests and stores each multiple only once.
    multiples = set()
    primes = []
    for i in xrange(2, n+1):
        if i not in multiples:
            primes.append(i)
            for j in xrange(i*i, n+1, i):
                multiples.add(j)
    return primes
m = get_primes(100)
print m

Related

How do I return a list containing multiple lists?

I want my code to get multiple lists(multiple_list) and append them to one single list(single_list). I know there may be easier ways but I'm a beginner and I want to solve it with what I can do.
It returns the list (single_list) filled with empty lists(multiple_list), and not with numbers that are in the if statement.
def list_squared(m, n):
    multiple_list=[]
    single_list=[]
    for i in range (m, n+1):
        for j in range(1, i+1):
            if i%j == 0:
                multiple_list.append(j)
        single_list.append(multiple_list)
        multiple_list.clear()
    return single_list
print(list_squared(1,250))
You're close! Instead of appending and then immediately clearing the original list, just create a new list to append and leave the lists that go into the higher order list alone:
def list_squared(m, n):
    single_list=[]
    for i in range (m, n+1):
        multiple_list=[]
        for j in range(1, i+1):
            if i%j == 0:
                multiple_list.append(j)
        single_list.append(multiple_list)
    return single_list
print(list_squared(1,250))
To explain why you need to do this: in Python, appending a list to another list stores a reference to that list object, not a copy. So when you called multiple_list.clear(), you emptied the very list you had just appended, and every iteration of your outer loop added that same list (multiple_list) to single_list again instead of a new list for each iteration.
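A small illustration of that reference behaviour, using throwaway names (inner, outer) that are not from the question:

```python
inner = [1, 2]
outer = []

outer.append(inner)        # stores a reference to the same list object
outer.append(list(inner))  # stores an independent copy

inner.clear()              # empties the object that outer[0] refers to

print(outer)               # [[], [1, 2]]
print(outer[0] is inner)   # True
```

Only the copy keeps its contents, which is why creating a fresh list on each outer iteration fixes the original function.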
Here is a much more compact way to do what your question asks:
m=1
n=250
new_l=[[j for j in range(1,i+1) if i%j==0] for i in range(m,n+1)]
This is a list comprehension; it works the same as your main code.
This is the traditional method:
def list_squared(m, n):
    single_list=[]
    for i in range (m, n+1):
        multiple_list=[]
        for j in range(1, i+1):
            if i%j == 0:
                multiple_list.append(j)
        single_list.append(multiple_list)
    return single_list
m=1
n=250
print(list_squared(m, n))

How to go through a double for loop randomly in python

Consider the following code:
for i in range(size-1):
    for j in range(i+1,size):
        print((i,j))
I need to go through this for-loop in a random fashion. I attempt to write a generator to do such a thing
def Neighborhood(size):
    for i in shuffle(range(size-1)):
        for j in shuffle(range(i+1), size):
            yield i, j
for i,j in Neighborhood(size):
    print((i,j))
However, shuffle cannot be applied to a range object. I do not know how to remedy the situation, and any help is much appreciated. I would prefer a solution that avoids converting range to a list, since I need speed. For example, size could be on the order of 30,000, and I will perform this for loop around 30,000 times.
I also plan to escape the for loop early, so I want to avoid solutions that incorporate shuffle(list(range(size))).
You can use random.sample.
The advantage of random.sample over random.shuffle is that it works on any sequence and does not modify it in place, so:
In Python 3.x, you don't need to convert range() to a list.
In Python 2.x, you can use xrange.
The same code works in Python 2.x and 3.x.
Sample code :
from random import sample
n=10
l1=range(n)
for i in sample(l1,len(l1)):
    l2=range(i,n)
    for j in sample(l2,len(l2)):
        print(i,j)
Edit :
As to why I put in this edit, go through the comments.
from random import sample
def Neighborhood(size):
    range1 = range(size-1)
    for i in sample(range1, len(range1)):
        range2 = range(i+1, size)  # j runs over i+1 .. size-1, as in the question
        for j in sample(range2, len(range2)):
            yield i, j
A simple way to go really random, not row-by-row:
import random
def Neighborhood(size):
    yielded = set()
    while True:
        i = random.randrange(size)
        j = random.randrange(size)
        if i < j and (i, j) not in yielded:
            yield i, j
            yielded.add((i, j))
Demo:
for i, j in Neighborhood(30000):
    print(i, j)
Prints something like:
2045 5990
224 5588
1577 16076
11498 15640
15219 28006
8066 10142
7856 8248
17830 26616
...
Note: I assume you're indeed going to "escape the for loop early". Then this won't have problems with slowing down due to pairs being produced repeatedly.
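If the loop is not always escaped early, the generator above keeps spinning forever once every pair has been produced. A minimal sketch of a variant that stops by itself, assuming that is wanted; the name neighborhood_all and the optional rng parameter are illustrative and not part of the answer above:

```python
import random

def neighborhood_all(size, rng=random):
    """Yield each pair (i, j) with i < j exactly once, in random order."""
    total = size * (size - 1) // 2   # number of distinct pairs
    seen = set()
    while len(seen) < total:
        # Two distinct indices, sorted so that i < j.
        i, j = sorted(rng.sample(range(size), 2))
        if (i, j) not in seen:
            seen.add((i, j))
            yield i, j

pairs = list(neighborhood_all(5))
print(len(pairs))  # 10
```

Like the original, this slows down as the set of already-seen pairs fills up, so it is still best suited to loops that usually exit early.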
I don't think you can randomly traverse an Iterator. You can predefine the shuffled lists, though
random iteration in Python
import random
L1 = list(range(size-1))
random.shuffle(L1)
for i in L1:
    L2 = list(range(i+1, size))
    random.shuffle(L2)
    for j in L2:
        print((i,j))
Of course, not optimal for large lists

does anyone know how to write a list of primes up to n, using a list comprehension?

I have this code so far, which is efficient, but I want code that is just as efficient in a list comprehension. Thanks for the help!
My code:
primes = [2]
for i in range(3, n + 1, 2):
    isprime = True
    for j in primes:
        if i % j == 0:
            isprime = False
            break
    if isprime:
        primes.append(i)
I don't believe that there is any way to use the previously-generated contents of a list while it's still being constructed during a list comprehension. The first usable reference to that list object appears only at the end of construction as the expression result.
You can use a similar idea, a generator expression, to remove the inner loop, though:
if all(i%j != 0 for j in primes):
    primes.append(i)
That's efficient in that all accepts an iterable, and stops at the first false result, just as your for loop does.
I haven't timed this to see whether this is faster or not. It can't be "algorithmically" faster, since it does the very same operations in the same order. One time should be approximately a fixed fraction of the other for large operations.
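Put together, the loop with the generator expression might look like the sketch below. The wrapper name primes_up_to is hypothetical, added only so that the snippet runs on its own.

```python
def primes_up_to(n):
    """Trial division by previously found primes, using all() as the inner loop."""
    if n < 2:
        return []
    primes = [2]
    for i in range(3, n + 1, 2):
        # all() stops at the first prime that divides i, like the break did.
        if all(i % p != 0 for p in primes):
            primes.append(i)
    return primes

print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```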
The Sieve of Eratosthenes is usually much faster than trial division (your method) for generating all primes from 2 through n.
I think you can't do it with your algorithm, but you could check whether each number i is divisible by any number between 2 and the square root of i (your code tests prime divisors only):
from math import sqrt
primes = [i for i in range(2, n+1) if all(i%i2 for i2 in range(2, int(sqrt(i)) + 1))]
primes is now a list of all primes between 2 and n.
This is probably less efficient than your approach, but it uses list comprehension and is therefore much less code.
EDIT:
I just figured out the efficiency of my solution with the timeit module and my solution is more than 10 times slower than yours. You shouldn't use it.

alternative (faster) way to 3 nested for loop python

How can I make this function faster? (I call it a lot of times and it could result in some speed improvements)
def vectorr(I, J, K):
    vect = []
    for k in range(0, K):
        for j in range(0, J):
            for i in range(0, I):
                vect.append([i, j, k])
    return vect
You can try to take a look at itertools.product
Equivalent to nested for-loops in a generator expression. For example,
product(A, B) returns the same as ((x,y) for x in A for y in B).
The nested loops cycle like an odometer with the rightmost element
advancing on every iteration. This pattern creates a lexicographic
ordering so that if the input’s iterables are sorted, the product
tuples are emitted in sorted order.
Also, there is no need to pass 0 when calling range(0, I) and so on; just use range(I).
So in your case it can be:
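import itertools
def vectorr(I, J, K):
    # product() yields (k, j, i) tuples with i varying fastest; reorder into [i, j, k] lists
    return [[i, j, k] for k, j, i in itertools.product(range(K), range(J), range(I))]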
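The odometer ordering described in the quoted documentation is easy to check on a tiny input. The snippet below is only an illustration with made-up variable names; it also shows that product() is lazy, so a few tuples can be taken without generating the rest.

```python
import itertools

# The rightmost iterable advances fastest, like the innermost for loop.
pairs = list(itertools.product(range(2), range(3)))
print(pairs)  # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]

# islice() pulls only the first four tuples; the remaining ones are never generated.
first = list(itertools.islice(itertools.product(range(1000), repeat=3), 4))
print(first)  # [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3)]
```

This is why range(K) is passed first and range(I) last when the goal is to have i change fastest.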
You said you want it to be faster. Let's use NumPy!
import numpy as np
def vectorr(I, J, K):
    arr = np.empty((I*J*K, 3), int)
    arr[:,0] = np.tile(np.arange(I), J*K)
    arr[:,1] = np.tile(np.repeat(np.arange(J), I), K)
    arr[:,2] = np.repeat(np.arange(K), I*J)
    return arr
There may be even more elegant tweaks possible here, but that's a basic tiling that gives the same result (but as a 2D array rather than a list of lists). The code for this is all implemented in C, so it's very, very fast--this may be important if the input values may get somewhat large.
The other answers are more thorough and, in this specific case at least, better, but in general, if you're using Python 2, and for large values of I, J, or K, use xrange() instead of range(). xrange gives a generator-like object, instead of constructing a list, so you don't have to allocate memory for the entire list.
In Python 3, range works like Python 2's xrange.
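A rough way to see the difference in memory, sketched for Python 3 (where range is the lazy object); the exact byte counts depend on the interpreter build, so none are shown:

```python
import sys

lazy = range(10**6)          # Python 3 range (xrange in Python 2): no list is built
eager = list(range(10**6))   # roughly what Python 2's range() would allocate

print(sys.getsizeof(lazy))   # small and independent of the length
print(sys.getsizeof(eager))  # grows with the number of elements
```

Note that sys.getsizeof counts only the list's own pointer array, not the integer objects it refers to, so the real gap is larger still.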
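import numpy
def vectorr(I, J, K):
    # indices() returns an array of shape (3, I, J, K); flatten the three grid axes.
    val = numpy.indices((I, J, K)).reshape(3, -1)
    # Rows are [i, j, k] with k varying fastest, unlike the original loop order.
    return val.transpose()  # or val.transpose().tolist()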

Rewriting nested if-statements in a more Pythonic fashion

I'm working on a function that, given a sequence, tries to find said sequence within a list and should then return the list item immediately after that sequence terminates.
Currently this code does return the list item immediately after the end of the sequence. However, I'm not too happy with having this many nested if-statements and would love to rewrite it, but I can't figure out how to go about it: it is quite unlike anything I've written in the past, and I feel a bit out of practice.
def sequence_in_list(seq, lst):
    m, n = len(lst), len(seq)
    for i in xrange(m):
        for j in xrange(n):
            if lst[i] == seq[j]:
                if lst[i+1] == seq[j+1]:
                    if lst[i+2] == seq[j+2]:
                        return lst[i+3]
(My intention is to then extend this function so that if that sequence occurs more than once throughout the list it should return the subsequent item that has happened the most often after the sequence)
I would do this with a generator and slicing:
sequence = [1, 2, 3, 5, 1, 2, 3, 6, 1, 2, 3]
pattern = [1, 2, 3]
def find_item_after_pattern(sequence, pattern):
    n = len(pattern)
    for index in range(0, len(sequence) - n):
        if pattern == sequence[index:index + n]:
            yield sequence[index + n]
for item in find_item_after_pattern(sequence, pattern):
    print(item)
And you'll get:
5
6
The function isn't too efficient and won't work for infinite sequences, but it's short and generic.
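For the extension mentioned in the question (returning the item that most often follows the pattern), the same slicing idea can feed a collections.Counter. This is a sketch only; the function name most_common_follower and the choice to return None when there is no match are assumptions made for illustration.

```python
from collections import Counter

def most_common_follower(sequence, pattern):
    """Return the item that most often follows pattern in sequence, or None."""
    n = len(pattern)
    followers = Counter(
        sequence[i + n]
        for i in range(len(sequence) - n)
        if sequence[i:i + n] == pattern
    )
    if not followers:
        return None
    return followers.most_common(1)[0][0]

data = [1, 2, 3, 5, 1, 2, 3, 6, 1, 2, 3, 5]
print(most_common_follower(data, [1, 2, 3]))  # 5
```

Here 5 follows the pattern twice and 6 once, so 5 is returned.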
Since you are comparing consecutive indexes, and assuming lst and seq are of the same type, you can use slicing:
def sequence_in_list(seq, lst):
    m, n = len(lst), len(seq)
    for i in xrange(m):
        for j in xrange(n):
            if lst[i:i+3] == seq[j:j+3]:
                return lst[i+3]
If the sequences are of different kinds, you should convert them to a common type before doing the comparison (e.g. lst[i:i+3] == list(seq[j:j+3]) would work if seq is a string and lst is a list).
Alternatively, if the sequences do not support slicing, you can use the built-in all to check for more conditions:
def sequence_in_list(seq, lst):
    m, n = len(lst), len(seq)
    for i in xrange(m):
        for j in xrange(n):
            if all(lst[i+k] == seq[j+k] for k in range(3)):
                return lst[i+3]
If you want to extend the check over 10 indices instead of 3, simply change range(3) to range(10).
Side note: your original code would raise an IndexError at some point, since you access lst[i+1] where i may be len(lst) - 1. The slice comparison above cannot raise, since slicing may produce a slice shorter than the difference of the indices, meaning that seq[j:j+3] can have fewer than 3 elements. Note that lst[i+3] can still raise an IndexError when a match ends at the end of the list. If this is a problem, you should adjust the indices over which you iterate.
Last remark: don't use the name list since it shadows a built-in name.
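The difference between slicing and indexing that the side note relies on can be seen in a few lines (the variable names here are illustrative):

```python
lst = [1, 2, 3]

tail = lst[2:5]      # slicing past the end just returns a shorter list
print(tail)          # [3]

try:
    lst[3]           # indexing past the end raises
    raised = False
except IndexError:
    raised = True
print(raised)        # True
```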
You can combine a list comprehension with slicing to make the comparison more readable:
n, m = len(lst), len(seq)
[lst[j+3] for i in range(m-2) for j in range(n-3) if seq[i:i+3] == lst[j:j+3]]
Of course there are more efficient ways to do it, but this is simple, short, and Pythonic.
