Troubles with matrix in python

Yesterday, I was trying to solve a problem with python, and I encountered something really odd:
# create matrix
for i in range(N):
    tmp.append(0)
for i in range(N):
    marker.append(tmp)
# create sum of 2 first cards
for i in range(N):
    for j in range(N):
        if i != j and marker[i][j] == 0:
            comCard.append(deck[i]+deck[j])
            taken[deck[i]+deck[j]] = [i,j]
            marker[i][j] = 1
            marker[j][i] = 1
The idea is to calculate the sum of every pair of distinct cards in the deck. I thought that with a marker matrix I could avoid calculating the same 2 cards twice, for example deck[1]+deck[2] and deck[2]+deck[1]. But these lines didn't work as they were supposed to:
marker[i][j] = 1
marker[j][i] = 1

I can recommend another way using standard python modules:
# suppose this is a deck - list of cards (I don't know how to mark them :)
>>> deck = ['Ax', 'Dy', '8p', '6a']
>>> from itertools import combinations
# here's a list of all possible combinations of 2 different cards
>>> list(combinations(deck, 2))
[('Ax', 'Dy'), ('Ax', '8p'), ('Ax', '6a'), ('Dy', '8p'), ('Dy', '6a'), ('8p', '6a')]
You can then work with this list: check particular combinations and so on.
I recommend paying attention to the itertools library; it's really awesome for this type of computation!

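You're appending the same instance of tmp every time, so every row of marker is a reference to one list, not a copy. That's why it doesn't work: setting marker[i][j] changes column j in every row. You need a new copy of the row for each line you add to your matrix.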
This can be done with:
marker.append(list(tmp))
Also, you may benefit from using True and False instead of 0 and 1 someday. So the initialisation of your matrix could rather look like this:
tmp = list()
marker = list()
for i in range(N):
    tmp.append(False)
for j in range(N):
    marker.append(list(tmp))
This way, when you try marker[i][j] = True, only the index (i, j) will be affected.
That being said, Eugene Lisitsky's answer gives you a far better suited tool for this kind of problem (listing combinations).
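To make the aliasing concrete, here is a small sketch that puts the two ways of building the matrix side by side, followed by the pair sums done with combinations. The deck values and the names `aliased` and `pair_sums` are made up for illustration; they are not from the question.

```python
from itertools import combinations

N = 3

# Aliased matrix: every row is the same list object.
row = [0] * N
aliased = [row for _ in range(N)]
aliased[0][1] = 1
print(aliased)      # column 1 is set in every row

# Independent rows: the inner list is rebuilt for each row.
marker = [[0] * N for _ in range(N)]
marker[0][1] = 1
print(marker)       # only row 0 changes

# Pair sums without a marker matrix (deck values are made up).
deck = [2, 7, 10]
pair_sums = {(i, j): deck[i] + deck[j]
             for i, j in combinations(range(len(deck)), 2)}
print(pair_sums)
```

Since combinations only yields each index pair once, with i < j, the marker matrix is not needed at all in this version.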

Related

Constraining random number generation in Python

I am trying to create a loop in Python with numpy that will give me a variable "times" with 5 numbers generated randomly between 0 and 20. However, I want there to be one condition: that none of the differences between two adjacent elements in that list are less than 1. What is the best way to achieve this? I tried with the last two lines of code, but this is most likely wrong.
for j in range(1,6):
    times = np.random.rand(1, 5) * 20
    times.sort()
    print times
    da = np.diff(times)
    if da.sum < 1: break
For instance, for one iteration, this would not be good:
4.25230915 4.36463992 10.35915732 12.39446368 18.46893283
But something like this would be perfect:
1.47166904 6.85610453 10.81431629 12.10176092 15.53569052

Since you are using numpy, you might as well use the built-in functions for uniform random numbers.
import numpy as np

def uniform_min_range(a, b, n, min_dist):
    while True:
        x = np.random.uniform(a, b, size=n)
        x.sort()  # sorts in place; np.sort(x) would return a sorted copy and leave x unchanged
        if np.all(np.diff(x) >= min_dist):
            return x
It uses the same trial-and-error approach as the previous answer, so depending on the parameters the time to find a solution can be large.
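If the retry time becomes a problem, one alternative (not from the answers here, so treat it as a sketch) is to avoid rejection altogether: draw the points in a shortened interval, sort them, then shift the i-th point right by i times the minimum distance. As far as I can tell this keeps the draw uniform over the feasible sorted configurations, but check that for your use case. The function name `spaced_points` is hypothetical.

```python
import random

def spaced_points(n, a, b, min_dist):
    """Draw n sorted points in [a, b] with adjacent gaps >= min_dist, without retries."""
    slack = (b - a) - (n - 1) * min_dist
    if slack < 0:
        raise ValueError("no feasible solution for these parameters")
    # Sorted uniform draws in the shortened interval [0, slack] ...
    base = sorted(random.uniform(0, slack) for _ in range(n))
    # ... then push the i-th point right by i * min_dist to open up the gaps.
    return [a + x + i * min_dist for i, x in enumerate(base)]

points = spaced_points(5, 0, 20, 1)
print(points)
```

The running time is fixed regardless of how tight the constraint is, and an infeasible request fails immediately instead of looping.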

Use a hit and miss approach to guarantee uniform distribution. Here is a straight-Python implementation which should be tweakable for numpy:
import random

def randSpacedPoints(n,a,b,minDist):
    # draws n random numbers in [a,b]
    # with the property that adjacent points are >= minDist apart
    # uses a hit-miss approach
    while True:
        nums = [a + (b-a)*random.random() for i in range(n)]
        nums.sort()
        if all(nums[i] + minDist <= nums[i+1] for i in range(n-1)):
            return nums
For example,
>>> randSpacedPoints(5,0,20,1)
[0.6681336968970486, 6.882374558960349, 9.73325447748434, 11.774594560239493, 16.009157676493903]
If there is no feasible solution this will hang in an infinite loop (so you might want to add a safety parameter which controls the number of trials).
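A minimal sketch of that safety parameter, assuming it is acceptable to return None when the trial budget runs out. The name `max_trials` and its default value are my own choices, not part of the answer above.

```python
import random

def rand_spaced_points(n, a, b, min_dist, max_trials=100000):
    """Hit-and-miss sampling that gives up after max_trials attempts."""
    for _ in range(max_trials):
        nums = sorted(random.uniform(a, b) for _ in range(n))
        if all(nums[i + 1] - nums[i] >= min_dist for i in range(n - 1)):
            return nums
    return None  # no acceptable draw found within the trial budget

ok = rand_spaced_points(5, 0, 20, 1)
print(ok)                                   # a list of 5 floats
bad = rand_spaced_points(5, 0, 3, 1, max_trials=1000)
print(bad)                                  # None: 5 points with gaps >= 1 cannot fit in [0, 3]
```

Raising an exception instead of returning None would work just as well; that is a matter of taste.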

(Python) Checking the 3x3 in a Sudoku, are there better ways to do this?

My partner for a high-school summative gave me this algorithm. I was hoping somebody could tell me if there is a more elegant way of coding it.
CB is the current board position (a global); it's a list of lists.
for a in xrange(0, 3):
    for b in xrange(0, 3):
        for j in xrange(1, 4):
            for k in xrange(1, 4):
                boxsum += CB[3a + j][3b + k]
                if not(boxsum == 45):
                    return False
                boxsum = 0

First, the following code is not indented correctly:
if not(boxsum == 45):
    return False
boxsum = 0
(with the current indentation it will always fail on the first time this code is executed)
Second, in the following line:
boxsum += CB[3a + j][3b + k]
you probably meant to do:
boxsum += CB[3*a + j][3*b + k]
And last, in order to check a 3x3 part of sudoku game it is not enough to check the sum - you should also check that every number between 1-9 is present (or in other words, that all the numbers are in the range 1-9 and there is no number that appears more than once).

There are dozens of "cleaner" ways to do so.
First of all, why not use numpy for matrices, since you are obviously working with a matrix? I am assuming your numbering (which is a bit odd: why do you start numbering from 1?).
import numpy as np
CB = np.array(CB)
def constraint3x3check(CB):
    # built-in all() over a generator; each slice covers 3 rows and 3 columns,
    # keeping the question's 1-based offsets
    return all(np.sum(CB[3*a+1:3*a+4, 3*b+1:3*b+4]) == 45 for a in range(3) for b in range(3))

Given the sum of the box equals 45, that doesn't mean there are all 1-9 numbers present.
You could for example add your numbers to set and check if the length of the set is always 9.
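A sketch of the set idea, assuming a 0-indexed 9x9 board stored as a list of lists (the question's board appears to be 1-indexed, so the offsets would need adjusting there). The function names and the test boards are made up for illustration.

```python
def box_is_valid(board, box_row, box_col):
    """Check one 3x3 box of a 0-indexed 9x9 board (list of lists)."""
    cells = {board[3 * box_row + r][3 * box_col + c]
             for r in range(3) for c in range(3)}
    return cells == set(range(1, 10))

def all_boxes_valid(board):
    return all(box_is_valid(board, a, b) for a in range(3) for b in range(3))

# A valid grid built from a standard shifting pattern (illustrative data).
good = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
# Every box of this grid sums to 45, yet it contains nothing but fives.
fives = [[5] * 9 for _ in range(9)]

print(all_boxes_valid(good))    # True
print(all_boxes_valid(fives))   # False
```

Comparing against set(range(1, 10)) checks both conditions at once: no duplicates and no values outside 1-9.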

Since the sum 45 does not mean the answer is correct, necessarily, a different way is needed. Personally, I would join the rows into a single list and compare them to the list (1,2,...9), e.g.
#assuming this is your format...
box = [[4,2,3],[1,5,9],[8,7,6]]
def valid_box(box):
    check_list = []
    for row in box:
        check_list += row
    return list(range(1,10)) == sorted(check_list)
Although the code creating the list could also be done with a list comprehension (I have no idea which one is more efficient, processor-wise):
def valid_box2(box):
    return list(range(1,10)) == sorted( [item for row in box for item in row ] )
Merge list code taken from Making a flat list out of list of lists in Python

Python creating variables with names from range [duplicate]

This question already has answers here:
How do I create variable variables?
(17 answers)
Closed 8 years ago.
I want to use some code similar to what follows that actually works:
P = 20
n = 1
for x in range(1, P+1):
    Ax = n #Hoping that you can name the variable from the current element in the range
    n = n+1
I want to make variables A1, A2, A3 ... A20; they would have the values 1, 2, 3 ... 20 in this example.
Is this possible at all, and what coding does it require?
Cheers

You don't actually want to do this. Instead, you want something like this:
P = 20
n = 1
A = [] # Or `A = list()`
for x in range(1, P+1):
    A.append(n)
    n += 1
Then, instead of A1, you do A[0] and instead of A5 you do A[4] (list indices start at 0).
Here is the Python 3.x list documentation (I presume you are using Python 3.x since you use range rather than xrange).
Also, as I understand it, your code could just be this:
P = 20
A = []
for x in range(1, P+1):
    A.append(x)
Or this:
P = 20
A = [i for i in range(1, P+1)]
(See the documentation for list comprehensions, a very useful feature of Python.)
Or even:
P = 20
A = list(range(1, P+1))

Do not try to dynamically name variables. That way madness lies.
Instead, leverage python's data structures to do what you want. In most cases, people really want to be using a dict or a list.
a = {}
for x in range(1,21):
    a[x] = x**2
b = []
for x in range(1,21):
    b.append(x**2)
You will get a feel for when you want to use one over the other. For example, in the above if I needed to quickly look up the square of a given integer, I would use a dict. If I instead just needed to do something to the collection of squares between 1 and 20, that's when I use a list.
Trivial example, but this scales up as far as you need it to. Any hashable data type can be a key in a dictionary, so you're no longer restricted to naming your variables with clunky letters and numbers: any hashable object will do!
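Applied to the question, a sketch could look like this. The names `values` and `grid` are just illustrative.

```python
P = 20

# Dictionary keyed by the names the question wanted as variables.
values = {"A{}".format(x): x for x in range(1, P + 1)}
print(values["A1"], values["A20"])   # 1 20

# Keys need not be strings: any hashable object works, e.g. a tuple.
grid = {(row, col): row * col for row in range(3) for col in range(3)}
print(grid[(2, 2)])                  # 4
```

You still get to write "A5" when you look a value up, but the names are data, so they can be built, looped over and counted.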

I agree with almost all the answers and comments you got so far:
99.99% of the time, you don't want to do this. It's dangerous, ugly and bad.
However there is a way to do it, using exec:
P = 20
n = 1
for x in range(1, P+1):
    exec("A{} = n".format(x))
    n = n+1
Again, you probably shouldn't use this.

memoization in python, off by one errors

I'm currently taking an algorithms class. I'm testing a lot of the algorithms out in python, including dynamic programming. Here is an implementation of bottom-up rod cutting.
It doesn't work because of an off-by-one error. Is there a global setting in python where I can change the default array index to be 1 and not 0? Or can someone please provide me with a better strategy for overcoming the off-by-one errors, which I encounter a million times? It's super annoying.
def bottom_up_memo_cut_rod(p,n):
    r = [ 0 for i in range(n) ]
    r[0] = 0
    for j in range(n):
        q = -1
        for i in range(j):
            q = max(q, p[i] + r[j-i])
        r[j] = q
    return r[n]
bottom_up_memo_cut_rod([1,5,8,9], 4)
The answer should be 10 in this case: cutting 4 into (2,2) yields the max price of 10.

There are a couple of things in Python that may help you. The built-in enumerate is a great one.
for idx, val_at_idx in enumerate(aList):
    # idx is the 0-indexed position, val_at_idx is the actual value.
    pass
You can also use list slicing with enumerate to shift indices if absolutely necessary:
for idxOffBy1, val_at_wrong_idx in enumerate(aList[1:]):
    # idxOffBy1 starts at 0 here, but the value comes from position 1 in the original list.
    pass
Realistically though, you don't want to try to change the interpreter so that lists start at index 1. You want to adjust your algorithm to work with the language.
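For 1-based counting without touching the data, enumerate also accepts a start argument, and range can express the 1-based loop directly. A small sketch with made-up prices:

```python
prices = [1, 5, 8, 9]   # illustrative: prices[i] is the price of a rod of length i + 1

# enumerate's start argument shifts the counter, not the data.
for length, price in enumerate(prices, start=1):
    print(length, price)

# range can also express 1-based loops directly.
lengths = list(range(1, len(prices) + 1))
print(lengths)           # [1, 2, 3, 4]
```

This keeps the textbook's 1-based "length" in the loop variable while the list itself stays 0-indexed.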

In Python, you can often avoid working with the indices altogether. That algorithm can be written like this:
def bottom_up_memo_cut_rod(p,n):
    r = [0]
    for dummy in p:
        r.append(max(a + b for a, b in zip(reversed(r),p)))
    return r[-1]
print(bottom_up_memo_cut_rod([1,5,8,9], 4))
#10

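In your case, the off-by-one is a result of r[n] where len(r)==n, which raises an IndexError. The last element is r[n-1] or, preferably, r[-1], which means "the last element of r", the same way r[-2] means "second to last", etc. Alternatively, allocate n+1 entries (one per length from 0 to n) so that r[n] is a valid index.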
Unrelated, but useful: [ 0 for i in range(n) ] can be written as [0] * n
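For reference, here is one way the index-based version from the question could be repaired. This is a sketch that assumes p[i - 1] holds the price of a piece of length i, which is how the example call reads; the function name is my own.

```python
def bottom_up_cut_rod(p, n):
    """p[i - 1] is the price of a piece of length i; returns the best revenue for length n."""
    r = [0] * (n + 1)              # r[j] = best revenue for a rod of length j, j = 0..n
    for j in range(1, n + 1):
        # The first piece has length i; the remainder j - i is already solved.
        r[j] = max(p[i - 1] + r[j - i] for i in range(1, j + 1))
    return r[n]

print(bottom_up_cut_rod([1, 5, 8, 9], 4))   # 10
```

The two changes compared to the question are the n + 1 table entries and the 1-based loop ranges, with the shift to 0-based confined to the single p[i - 1] lookup.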

Generate 4000 unique pseudo-random cartesian coordinates FASTER?

The range for x and y is from 0 to 99.
I am currently doing it like this:
excludeFromTrainingSet = []
while len(excludeFromTrainingSet) < 4000:
    tempX = random.randint(0, 99)
    tempY = random.randint(0, 99)
    if [tempX, tempY] not in excludeFromTrainingSet:
        excludeFromTrainingSet.append([tempX, tempY])
But it takes ages and I really need to speed this up.
Any ideas?

Vincent Savard has an answer that's almost twice as fast as the first solution offered here.
Here's my take on it. It requires tuples instead of lists for hashability:
def method2(size):
    ret = set()
    while len(ret) < size:
        ret.add((random.randint(0, 99), random.randint(0, 99)))
    return ret
Just make sure that the limit is sane, as other answerers have pointed out. For sane input, this is better algorithmically: O(n) as opposed to O(n^2), because a membership test on a set takes constant time on average while on a list it is linear. Also, python is much more efficient at loading locals than globals, so always put this stuff in a function.
EDIT: Actually, I'm not sure that they're O(n) and O(n^2) respectively because of the probabilistic component, but the estimates are correct if n is taken as the number of unique elements that they see. Both get slower as the number of points approaches the total number of available spaces. If you want an amount of points which approaches the total number available, then you might be better off using:
import random
import itertools
def method2(size, min_, max_):
    range_ = range(min_, max_)
    points = itertools.product(range_, range_)
    return random.sample(list(points), size)
This will be a memory hog but is sure to be faster as the density of points increases because it avoids looking at the same point more than once. Another option worth profiling (probably better than last one) would be
def method3(size, min_, max_):
    range_ = range(min_, max_)
    points = list(itertools.product(range_, range_))
    N = (max_ - min_)**2
    L = N - size
    i = 1
    while i <= L:
        del points[random.randint(0, N - i)]
        i += 1
    return points
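A lighter variant of the sampling idea, sketched here for Python 3: sample cell numbers from a range instead of building the full list of pairs, then decode each number with divmod. The function name `sample_points` and the `side` parameter are my own.

```python
import random

def sample_points(size, side=100):
    """Pick `size` distinct (x, y) pairs with 0 <= x, y < side."""
    # Each integer in range(side * side) encodes exactly one cell of the grid.
    codes = random.sample(range(side * side), size)
    # divmod decodes a cell number back into its (x, y) pair.
    return [divmod(code, side) for code in codes]

points = sample_points(4000)
print(len(points), len(set(points)))   # 4000 4000
```

random.sample never returns the same element twice, so no collision handling is needed, and it raises ValueError if you ask for more points than the grid has cells.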

My suggestion :
def method2(size):
    randints = range(0, 100)
    excludeFromTrainingSet = set()
    while len(excludeFromTrainingSet) < size:
        excludeFromTrainingSet.add((random.choice(randints), random.choice(randints)))
    return excludeFromTrainingSet
Instead of generating 2 random numbers every time, you first generate the list of numbers from 0 to 99, then you choose 2 of them and add the pair to the set. As others pointed out, there are only 10 000 possibilities, so you can't loop until you get 40 000, but you get the point.

I'm sure someone is going to come in here with a usage of numpy, but how about using a set and tuple?
E.g.:
excludeFromTrainingSet = set()
while len(excludeFromTrainingSet) < 40000:
    temp = (random.randint(0, 99), random.randint(0, 99))
    if temp not in excludeFromTrainingSet:
        excludeFromTrainingSet.add(temp)
EDIT: Isn't this an infinite loop since there are only 100^2 = 10000 POSSIBLE results, and you're waiting until you get 40000?

Make a list of all possible (x,y) values:
allpairs = list((x,y) for x in xrange(100) for y in xrange(100))
# or with Py2.6 or later:
from itertools import product
allpairs = list(product(xrange(100),xrange(100)))
# or even taking DRY to the extreme
allpairs = list(product(*[xrange(100)]*2))
Shuffle the list:
from random import shuffle
shuffle(allpairs)
Read off the first 'n' values:
n = 4000
trainingset = allpairs[:n]
This runs pretty snappily on my laptop.

You could make a lookup table of random values... make a random index into that lookup table, and then step through it with a static increment counter...

Generating 40 thousand numbers inevitably will take a while. But you are performing an O(n) linear search on the excludeFromTrainingSet, which takes quite a while especially later in the process. Use a set instead. You could also consider generating a number of coordinate sets e.g. over night and pickle them, so you don't have to generate new data for each test run (dunno what you're doing, so this might or might not help). Using tuples, as someone noted, is not only the semantically correct choice, it might also help with performance (tuple creation is faster than list creation). Edit: Silly me, using tuples is required when using sets, since set members must be hashable and lists are unhashable.
But in your case, your loop isn't terminating because 0..99 is 100 numbers and two-tuples of them have only 100^2 = 10000 unique combinations. Fix that, then apply the above.

Taking Vince Savard's code:
>>> from random import choice
>>> def method2(size):
...     randints = range(0, 100)
...     excludeFromTrainingSet = set()
...     while True:
...         x = size - len(excludeFromTrainingSet)
...         if not x:
...             break
...         else:
...             excludeFromTrainingSet.update((choice(randints), choice(randints)) for _ in range(x))
...     return excludeFromTrainingSet
...
>>> s = method2(4000)
>>> len(s)
4000
This is not a great algorithm because it has to deal with collisions, but the tuple-generation makes it tolerable. This runs in about a second on my laptop.

## for py 3.0+
## generate 4000 points in 2D
##
import random
maxn = 10000
goodguys = 0
excluded = [0 for excl in range(0, maxn)]
for ntimes in range(0, maxn):
    alea = random.randint(0, maxn - 1)
    excluded[alea] += 1
    if(excluded[alea] > 1): continue
    goodguys += 1
    if goodguys > 4000: break
    two_num = divmod(alea, 100) ## Unfold the 2 numbers
    print(two_num)
