Is this loop method correct? Python 3.4.3

So, I wanted a set of XY positions that would all be different from each other. To do this, I used a list to store the randomly generated XY positions. If a position was not already in the list, it was added; if it was in the list, a new position was generated instead.
I am unsure whether this will work in all instances, and wonder if there is a better way of doing it.
import random

positionList = []
for i in range(6):
    position = [random.randint(0, 5), random.randint(0, 5)]
    print("Original", position)
    while position in positionList:
        position = [random.randint(0, 5), random.randint(0, 5)]
    positionList.append(position)
    print(position)
Could the remade position be the same as other positions in the list?

Could the remade position be the same as other positions in the list?
Yes, because you are using random. If you want to be sure you keep only unique items, you can use a set object, which preserves uniqueness for you. But note that since lists are not hashable, you should use a hashable container for the pairs (like a tuple):
>>> position_set = set()
>>>
>>> while len(position_set) != 6:
...     position = (random.randint(0, 5), random.randint(0, 5))
...     position_set.add(position)
...
>>> position_set
{(3, 2), (5, 0), (2, 5), (5, 2), (1, 0), (3, 5)}

If you really need lists you can convert; if not, just leave the code as is:
import random

position_set = set()
for i in range(6):
    position = random.randint(0, 5), random.randint(0, 5)
    print("Original", position)
    while position in position_set:
        position = random.randint(0, 5), random.randint(0, 5)
    position_set.add(position)
    print(position)
print(position_set)
A set lookup is O(1) versus O(n) for a list, and since order seems irrelevant here, just using a set throughout is probably sufficient.
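If you want to see the difference for yourself, a rough, illustrative timeit comparison could look like this (absolute numbers depend on the machine, but the gap grows with the container size):
import timeit

setup = "items = list(range(10000)); s = set(items)"
print(timeit.timeit("9999 in items", setup=setup, number=10000))  # linear scan of the list
print(timeit.timeit("9999 in s", setup=setup, number=10000))      # hash lookup in the set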

To be sure of 6 different elements, you can use random.shuffle:
from random import shuffle

# all 36 positions with coordinates 0..5 (range(6) matches randint(0, 5) above)
all_positions = [(x, y) for x in range(6) for y in range(6)]
shuffle(all_positions)
print(all_positions[:6])
"""
[(0, 1), (3, 4), (1, 1), (1, 3), (4, 3), (0, 0)]
"""

I just ran your code, and it seems to work fine. I believe it is correct. Let's consider your while loop.
You check whether the randomly generated position is already in the positionList list. The expression:
position in positionList
evaluates to either True or False. As long as position already appears in your list, the while loop keeps executing and simply generates another random position.
The only advice I could give is to add a loop counter: when you run out of possible XY positions, the loop runs forever.
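For example, a minimal sketch of such a guard, with max_attempts as an illustrative cap:
import random

positionList = []
max_attempts = 100  # illustrative cap; pick something suited to the 6x6 grid
for i in range(6):
    position = [random.randint(0, 5), random.randint(0, 5)]
    attempts = 0
    while position in positionList and attempts < max_attempts:
        position = [random.randint(0, 5), random.randint(0, 5)]
        attempts += 1
    positionList.append(position)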

This is one way of doing it:
import random

my_list = []
num_of_points = 6
while True:
    position = [random.randint(0, 5), random.randint(0, 5)]
    if position not in my_list:
        my_list.append(position)
        print(num_of_points)
        num_of_points -= 1
        if num_of_points == 0:
            break
print(my_list)
Of course, you just need to make sure that the number of possible random pairs exceeds the num_of_points value.
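For instance, since both coordinates come from randint(0, 5) there are only 36 distinct pairs, so a quick sanity check to add near the top of the script above could be:
assert num_of_points <= 36, "more unique points requested than the grid can provide"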

Related

How to circumvent slow search with the index() method for a large list

I have a large list myList containing tuples.
I need to remove the duplicates in this list (that is, tuples with the same elements in the same order). I also need to keep track of this list's indices in a separate list, indexList. If I remove a duplicate, I need to change its index in indexList to the first identical value's index.
To demonstrate what I mean, if myList looks like this:
myList = [(6, 2), (4, 3), (6, 2), (8, 1), (5, 4), (4, 3), (2, 1)]
Then I need to construct indexList like this:
indexList = [0, 1, 0, 2, 3, 1, 4]
Here the third value is identical to the first, so it gets index 0. The subsequent value then gets an updated index of 2, and so on.
Here is how I achieved this:
unique = set()
indexList = []
i = 0
for v in myList[:]:
    if v not in unique:
        unique.add(v)
        indexList.append(i)
        i = i + 1
    else:
        myList.pop(i)
        indexList.append(myList.index(v))
This does what I need. However, the index() method makes the script very slow when myList contains hundreds of thousands of elements. As I understand it, this is because it's an O(n) operation.
So what changes could I make to achieve the same result but make it faster?
If you make a dict to store the first index of each value, you can do the lookup in O(1) instead of O(n). In this case, create indexes = {} before the for loop; then in the if block do indexes[v] = i, and in the else block use indexes[v] instead of myList.index(v).
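Applied to the code above, that change could look roughly like this (a sketch keeping the original variable names):
myList = [(6, 2), (4, 3), (6, 2), (8, 1), (5, 4), (4, 3), (2, 1)]
indexList = []
indexes = {}   # maps each value to the index at which it was first kept
unique = set()
i = 0
for v in myList[:]:
    if v not in unique:
        unique.add(v)
        indexes[v] = i
        indexList.append(i)
        i = i + 1
    else:
        myList.pop(i)
        indexList.append(indexes[v])
# myList    -> [(6, 2), (4, 3), (8, 1), (5, 4), (2, 1)]
# indexList -> [0, 1, 0, 2, 3, 1, 4]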

Python- For Loop - Increment each index position for each tuple in list

I've searched around for a possible way to do this. I'm trying to make a loop that will go through my list of tuple pairs. Each index contains data that I will calculate and append to a list on each loop run until the end of the list of tuples is reached. Currently I am using a for loop, but I might use a while loop.
index_tuple = [(1, 2), (2, 3), (3, 4)]
total_list = []
for index_pairs in index_tuple:
    total_list.append(index_tuple[0][1] - index_tuple[0][0])
What I'm trying to get the loop to do:
(index_tuple[0][1] - index_tuple[0][0])#increment
(index_tuple[1][1] - index_tuple[1][0])#increment
(index_tuple[2][1] - index_tuple[2][0])#increment
Then I guess my final question is: is it possible to increment the index position with a while loop?
Use a list comprehension. This iterates the list, unpacks each tuple into two values a and b, subtracts the first item from the second, and inserts the difference into the new list.
totals = [b - a for a, b in index_tuple]
A list comprehension is the best solution for this problem, and Malik Brahimi's answer is the way to go.
Nevertheless, sticking with your for loop, you need to reference index_pairs in the body of the loop because this variable is assigned each tuple from index_tuple as the loop iterates. You do not need to maintain an index variable. A corrected version would be this:
index_tuple = [(1, 2), (2, 3), (3, 4)]
total_list = []
for index_pairs in index_tuple:
    total_list.append(index_pairs[1] - index_pairs[0])

>>> print(total_list)
[1, 1, 1]
A cleaner version which unpacks the tuples from the list directly into 2 variables would be:
index_tuples = [(1, 2), (2, 3), (3, 4)]
total_list = []
for a, b in index_tuples:
    total_list.append(b - a)

>>> print(total_list)
[1, 1, 1]
You also asked about using a while loop to achieve the same. Use an integer to keep track of the current index and increment it by one on each iteration of the loop:
index_tuples = [(1, 2), (2, 3), (3, 4)]
total_list = []
index = 0
while index < len(index_tuples):
    total_list.append(index_tuples[index][1] - index_tuples[index][0])
    index += 1

>>> print(total_list)
[1, 1, 1]

How to check for common elements between two lists in python

I am having a bit of trouble when I try and check for overlapping elements in list.
This means that I will have to check for common elements between two lists.
The way in which my programme works is that the player enters the two end coordinates for a certain ship, and it then creates a list of all of the ship's coordinates (i.e. if they enter (1,1) and (1,5), it would create [(1,1),(1,2),(1,3),(1,4),(1,5)]).
I have also tried using the following code, but it doesn't work the way I want it to:
ListA = [(1,1),(1,2),(1,3),(1,4),(1,5)]
ListB = [(1,1),(2,1),(3,1)]
for i in ListB:
    if i in ListA:
        print("There is an overlap")
        # choose coordinates again
    else:
        print("There is no overlap")
        # add to ListA and next ship's coordinate chosen
I would like the program to check whether any of the elements in ListA are in ListB by considering them collectively, instead of checking them individually.
set.intersection will find any common elements:
ListA = [(1,1),(1,2),(1,3),(1,4),(1,5)]
ListB = [(1,1),(2,1),(3,1)]
print(set(ListA).intersection(ListB))
{(1, 1)}
Unless order matters it may be just as well to store the tuples in sets:
st_a = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5)}
st_b = {(1, 1), (2, 1), (3, 1)}
print(st_a.intersection(st_b))
add it to your code with:
if st_a.intersection(st_b):
    print("There is an overlap")
else:
    print("There is no overlap")
If there is an overlap, you want to choose coordinates again:
for i in ListB:
    if i in ListA:
        print("There is an overlap")
        i = (yourcoordinateshere)
Otherwise, you want to add it to ListA:
    else:
        print("There is no overlap")
        ListA.append(i)
Not sure if this helps...
In [1]: from collections import Counter
In [2]: import random
In [3]: lst = [random.randrange(0, 9) for i in xrange(1000)]
In [4]: counted = Counter(lst)
In [7]: counted.most_common(10)
Out[7]:
[(2, 125),
(0, 123),
(5, 120),
(8, 118),
(7, 111),
(1, 107),
(4, 104),
(6, 102),
(3, 90)]
Dictionary
If in practice:
len(ListA) * len(ListB) * ExpectedNumberOfCollisionChecks
is significant then it may make sense to use a dictionary for the longer list because the time complexity for a dictionary lookup is:
Average: O(1)
Worst case: O(n)
Where the average is expected and the worst case happens only when a bad hash function has been selected.
Intersecting Sets
The accepted answer proposes using set.intersection. The time complexity of set.intersection is:
Average: O(n)
Worst case: O(n^2)
Modified Code
The only change to your original code is conversion of ListA to MapA.
MapA = {(1,1): True, (1,2): True, (1,3): True, (1,4): True, (1,5): True}
ListB = [(1,1),(2,1),(3,1)]
for i in ListB:
    if MapA.get(i):
        print("There is an overlap")
        # choose coordinates again
    else:
        print("There is no overlap")
        # add to ListA and next ship's coordinate chosen
Comments
Going further, the entire intersection operation has to be run each time the user inputs new coordinates (i.e. ListB changes). On the other hand, the most expensive operation, hashing ListA, only occurs once. Insertions and deletions from a dictionary have good time complexity:
Average: O(1)
Worst case: O(n)
For lists, appending is O(1) regardless of whether order matters, while deletions are always O(n).
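To make that concrete, here is a small sketch with illustrative names (it uses a set, which gives the same average O(1) membership tests as the dict above): the occupied coordinates are hashed once and then reused for every candidate ship the player enters.
occupied = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5)}   # built once from ListA

def has_overlap(candidate, occupied):
    # O(len(candidate)) thanks to O(1) average-case membership tests
    return any(coord in occupied for coord in candidate)

print(has_overlap([(1, 1), (2, 1), (3, 1)], occupied))  # True  -> choose again
print(has_overlap([(4, 1), (4, 2), (4, 3)], occupied))  # False -> accept
occupied.update([(4, 1), (4, 2), (4, 3)])               # add the accepted ship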

Manually enumerating a list

I'm trying to figure out how to manually enumerate a list, however I'm stuck as I cannot figure out how to split up the data list. This is the code I have so far:
enumerated_list = []
data = [5, 10, 15]
for x in data:
    print(x)
for i in range(len(data)):
    enumerate_rule = (i, x)
    enumerated_list.append(enumerate_rule)
print(enumerated_list)
This prints out..
5
10
15
[(0, 15), (1, 15), (2, 15)]
When what I'm after is [(0, 5), (1, 10), (2, 15)]. How would I go about this?
Use the enumerate() built-in:
>>> list(enumerate([5, 10, 15]))
[(0, 5), (1, 10), (2, 15)]
Your original code's fault lies in the fact that you use x in your second loop; x doesn't change in that loop, it's simply left over from the previous loop where you printed the values.
However, this method is a bad way of doing it. Fixing it would require looping by index, something Python isn't designed to do - it's slow and hard to read. Instead, we loop by value. The enumerate() built-in is there to do this job for us, as it's a reasonably common task.
If you really don't want to use enumerate() (which doesn't ever really make sense, but maybe as an arbitrary restriction trying to teach you about something else, at a stretch), then there are still better ways:
>>> from itertools import count
>>> list(zip(count(), [5, 10, 15]))
[(0, 5), (1, 10), (2, 15)]
Here we use zip(), the Python function used to loop over two sets of data at once. It returns tuples of the first value from each iterable, then the second from each, and so on. This gives us the result we want when combined with itertools.count(), which does what it says on the tin.
If you really feel the need to build a list manually, the more pythonic way of doing something rather unpythonic would be:
enumerated_list = []
count = 0
for item in data:
    enumerated_list.append((count, item))
    count += 1
Note, however, that generally, one would use a list comprehension to build a list like this - in this case, as soon as one would do that, it makes more sense to use one of the earlier methods. This kind of production of a list is inefficient and hard to read.
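For reference, the comprehension version of that manual build (still avoiding enumerate()) would be something like:
data = [5, 10, 15]
enumerated_list = [(i, data[i]) for i in range(len(data))]  # -> [(0, 5), (1, 10), (2, 15)]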
Since x goes through every element in data, at the end of:
for x in data:
    print(x)
x will be the last element. Which is why you get 15 as the second element in each tuple:
[(0, 15), (1, 15), (2, 15)]
You only need one loop:
for i in range(len(data)):
    enumerate_rule = (i, data[i])  # data[i] gets the ith element of data
    enumerated_list.append(enumerate_rule)
enumerate_rule = (i, x) is the problem. You are using the same value (x, the last item in the list) each time. Change it to enumerate_rule = (i, data[i]).
I would use a normal "for loop" but with enumerate(), so you can use an index i in the loop:
enumerated_list = []
data = [5, 10, 15]
for i, f in enumerate(data):
    enumerated_list.append((i, f))
print(enumerated_list)
Result:
[(0, 5), (1, 10), (2, 15)]

Python: fast dictionary of big int keys

I have a list of more than 10,000 int items. The values of the items can be very high, up to 10^27. Now I want to create all pairs of the items and calculate their sums. Then I want to look for different pairs with the same sum.
For example:
l[0] = 4
l[1] = 3
l[2] = 6
l[3] = 1
...
pairs[10] = [(0,2)] # 10 is the sum of the values of l[0] and l[2]
pairs[7] = [(0,1), (2,3)] # 7 is the sum of the values of l[0] and l[1] or l[2] and l[3]
pairs[5] = [(0,3)]
pairs[9] = [(1,2)]
...
The content of pairs[7] is what I am looking for: it gives me two pairs with the same sum.
I have implemented it as follows, and I wonder if it can be done faster. Currently, for 10,000 items it takes more than 6 hours on a fast machine. (As I said, the values of l, and thus the keys of pairs, are ints up to 10^27.)
l = [4, 3, 6, 1]
pairs = {}
for i in range(len(l)):
    for j in range(i + 1, len(l)):
        s = l[i] + l[j]
        if s not in pairs:
            pairs[s] = []
        pairs[s].append((i, j))
# pairs = {9: [(1, 2)], 10: [(0, 2)], 4: [(1, 3)], 5: [(0, 3)], 7: [(0, 1), (2, 3)]}
Edit: I want to add some background, as asked by Simon Stelling.
The goal is to find Formal Analogies like
lays : laid :: says : said
within a list of words like
[ lays, lay, laid, says, said, foo, bar ... ]
I already have a function analogy(a,b,c,d) giving True if a : b :: c : d. However, I would need to check all possible quadruples created from the list, which would be a complexity of around O((n^4)/2).
As a pre-filter, I want to use the char-count property. It says that every char has the same count in (a,d) and in (b,c). For instance, in "layssaid" we have 2 a's, and so does "laidsays".
So the idea until now was
for every word to create a "char count vector" and represent it as an integer (the items in the list l)
create all pairings in pairs and see if there are "pair clusters", i.e. more than one pair for a particular char count vector sum.
And it works, it's just slow. The complexity is down to around O((n^2)/2), but that is still a lot, especially since the dictionary lookup and insert are done that often.
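For illustration, here is one possible sketch of such a char-count key (it uses a tuple of per-letter counts rather than the integer encoding from the original code, which isn't shown here):
import string

def char_count_key(word):
    # one count per lowercase letter; two strings built from the same multiset
    # of characters get identical keys
    return tuple(word.count(c) for c in string.ascii_lowercase)

# the analogy pre-filter: (a, d) and (b, c) must have the same character counts
print(char_count_key("lays" + "said") == char_count_key("laid" + "says"))  # True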
There are trivial optimizations like caching constant values in a local variable and using xrange instead of range:
pairs = {}
len_l = len(l)
for i in xrange(len_l):
    for j in xrange(i + 1, len_l):
        s = l[i] + l[j]
        res = pairs.setdefault(s, [])
        res.append((i, j))
However, it is probably far wiser not to pre-calculate the list and instead to optimize the method on a conceptual level. What is the intrinsic goal you want to achieve? Do you really just want to calculate what you do? Or are you going to use that result for something else? What is that something else?
Just a hint: have a look at itertools.combinations.
This is not exactly what you are looking for (because it stores pairs of values, not of indexes), but it can be a starting point:
from itertools import combinations

for (a, b) in combinations(l, 2):
    pairs.setdefault(a + b, []).append((a, b))
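If you need index pairs instead, the same idea works by combining over the indices (a small sketch based on the example data):
from itertools import combinations

l = [4, 3, 6, 1]
pairs = {}
for i, j in combinations(range(len(l)), 2):
    pairs.setdefault(l[i] + l[j], []).append((i, j))
# pairs == {7: [(0, 1), (2, 3)], 10: [(0, 2)], 5: [(0, 3)], 9: [(1, 2)], 4: [(1, 3)]}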
The above comment from SimonStelling is correct; generating all possible pairs is just fundamentally slow, and there's nothing you can do about it aside from altering your algorithm. The correct function to use from itertools is product; and you can get some minor improvements from not creating extra variables or doing unnecessary list indexes, but underneath the hood these are still all O(n^2). Here's how I would do it:
from itertools import product

l = [4, 3, 6, 1]
pairs = {}
for (m, n) in product(l, repeat=2):
    pairs.setdefault(m + n, []).append((m, n))
Finally, I came up with my own solution, taking only half of the calculation time on average.
The basic idea: instead of reading and writing into the growing dictionary n^2 times, I first collect all the sums in a list. Then I sort the list. Within the sorted list, I then look for identical neighbouring items.
This is the code:
from operator import itemgetter

def getPairClusters(l):
    # first, we just store all possible pairs sequentially
    # clustering will happen later
    pairs = []
    for i in xrange(len(l)):
        for j in xrange(i + 1, len(l)):
            pair = l[i] + l[j]
            pairs.append((pair, i, j))
    pairs.sort(key=itemgetter(0))
    # pairs = [(4, 1, 3), (5, 0, 3), (7, 0, 1), (7, 2, 3), (9, 1, 2), (10, 0, 2)]
    # a list item of pairs now contains a tuple (like (4, 1, 3)) with
    # * the sum of two l items: 4
    # * the indexes of the two l items: 1, 3
    # now clustering starts
    # we want to find neighbouring items such as
    # (7, 0, 1), (7, 2, 3)
    # (since 7 == 7)
    pairClusters = []
    # flag if we are within a cluster while iterating over the pairs list
    withinCluster = False
    # iterate over the pairs list
    for i in xrange(len(pairs) - 1):
        if not withinCluster:
            if pairs[i][0] == pairs[i + 1][0]:
                # if not within a cluster and we found 2 neighbouring equal sums:
                # init a new cluster
                pairCluster = [(pairs[i][1], pairs[i][2])]
                withinCluster = True
        else:
            # if still within a cluster
            if pairs[i][0] == pairs[i + 1][0]:
                pairCluster.append((pairs[i][1], pairs[i][2]))
            # else the cluster has ended
            # (the next neighbouring item has a different sum)
            else:
                pairCluster.append((pairs[i][1], pairs[i][2]))
                pairClusters.append(pairCluster)
                withinCluster = False
    # close a cluster that extends to the very end of the sorted list
    if withinCluster:
        pairCluster.append((pairs[-1][1], pairs[-1][2]))
        pairClusters.append(pairCluster)
    return pairClusters

l = [4, 3, 6, 1]
print getPairClusters(l)
