I'm trying to build a python script that generates a 9x9 block with numbers 1-9 that are unique along the rows, columns and within the 3x3 blocks - you know, Sudoku!
So, I thought I would start simple and get more complicated as I went. First I made it randomly populate each array value with a number 1-9. Then I made sure numbers weren't repeated along rows. Next, I wanted to do the same for both rows and columns. I think my code is OK - it's certainly not fast, but I don't know why it jams up.
import numpy as np
import random
#import pdb
#pdb.set_trace()
#Soduku solver!
#Number input
soduku = np.zeros(shape=(9,9))
for i in range(0,9,1):
    for j in range(0,9,1):
        while True:
            x = random.randint(1,9)
            if x not in soduku[i,:] and x not in soduku[:,j]:
                soduku[i,j] = x
                if j == 8: print(soduku[i,:])
                break
So it moves across the columns populating cells with random ints, drops a row and repeats. The most the code should really need to do is generate 9 numbers for each square if it's really unlucky - I think if we worked it out it would be fewer than 9*9*9 generated values in total. Something is breaking it!
Any ideas?!
I think what's happening is that your code is getting stuck in your while-loop. You test for the condition if x not in soduku[i,:] and x not in soduku[:,j], but what happens if this condition is not met? It's very likely that your code is running into a dead-end sudoku board (can't be solved with any values), and it's getting stuck inside the while-loop because the condition to break can never be met.
Generating it like this is very unlikely to work. There are many ways you can fill 8 of the 9 3x3 squares that make it impossible to fill in the last square at all, making it hang forever.
Another approach would be to fill in all the numbers one at a time (so, all the 1s first, then all the 2s, etc.). It would be like the eight queens puzzle, but with 9 queens. When you get to a position where it is impossible to place a number, restart.
Another approach would be to start all the squares at 9 and strategically decrement them somehow, e.g. first decrement all the ones that cannot be 9, excluding the 9s in the current row/column/square, then if they are all impossible or all possible, randomly decrement one.
You could also try to enumerate all sudoku boards and then invert the enumeration function with a random integer. I don't know how practical that is, but it is the only method here that would pick a board with uniform randomness.
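As a concrete illustration of the restart idea above, here is a minimal sketch that keeps the original cell-by-cell fill with row/column checks only (so it produces a Latin square rather than a full Sudoku), but abandons the board and starts over whenever a cell has no legal candidate; the function name is mine, and it may take a number of restarts before it succeeds:
import numpy as np
import random

def generate_latin_square():
    """Fill a 9x9 grid with digits 1-9 unique per row and column,
    restarting from scratch whenever a cell has no legal candidate."""
    while True:
        board = np.zeros((9, 9), dtype=int)
        dead_end = False
        for i in range(9):
            for j in range(9):
                # candidates not yet used in this row or column
                used = set(board[i, :]) | set(board[:, j])
                candidates = [x for x in range(1, 10) if x not in used]
                if not candidates:
                    dead_end = True      # no digit fits here: give up on this board
                    break
                board[i, j] = random.choice(candidates)
            if dead_end:
                break
        if not dead_end:
            return board

print(generate_latin_square())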
You are coming at the problem from a difficult direction. It is much easier to start with a valid Sudoku board and play with it to make a different valid Sudoku board.
An easy valid board is:
1 2 3 | 4 5 6 | 7 8 9
4 5 6 | 7 8 9 | 1 2 3
7 8 9 | 1 2 3 | 4 5 6
---------------------
2 3 4 | 5 6 7 | 8 9 1
5 6 7 | 8 9 1 | 2 3 4
8 9 1 | 2 3 4 | 5 6 7
---------------------
3 4 5 | 6 7 8 | 9 1 2
6 7 8 | 9 1 2 | 3 4 5
9 1 2 | 3 4 5 | 6 7 8
Having found a valid board you can make a new valid board by playing with your original.
You can swap any row of three 3x3 blocks with any other block row. You can swap any column of three 3x3 blocks with another block column. Within each block row you can swap single cell rows; within each block column you can swap single cell columns. Finally you can permute the digits so there are different digits in the cells as long as the permutation is consistent across the whole board.
None of these changes will make a valid board invalid.
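A minimal sketch of a few of those moves in NumPy; the function names are mine, and a real shuffler would typically chain several of them in random order:
import numpy as np

def swap_rows_in_band(board, band, r1, r2):
    # Swap two single rows (r1, r2 in 0..2) inside the same band of three rows.
    b = board.copy()
    b[[3*band + r1, 3*band + r2], :] = b[[3*band + r2, 3*band + r1], :]
    return b

def swap_bands(board, b1, b2):
    # Swap two bands of three rows (block rows b1 and b2, each in 0..2).
    b = board.copy()
    b[3*b1:3*b1+3, :], b[3*b2:3*b2+3, :] = b[3*b2:3*b2+3, :].copy(), b[3*b1:3*b1+3, :].copy()
    return b

def permute_digits(board, perm):
    # Relabel digits 1..9 according to perm, a permutation of 1..9.
    mapping = np.concatenate(([0], perm))   # index 0 is unused
    return mapping[board]

# Start from the easy valid board above and shuffle it a little.
base = np.array([[(3*i + i//3 + j) % 9 + 1 for j in range(9)] for i in range(9)])
board = swap_bands(swap_rows_in_band(base, band=0, r1=0, r2=2), 1, 2)
board = permute_digits(board, np.random.permutation(np.arange(1, 10)))
print(board)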
I use permutations(range(1,10)) from itertools to create a list of all possible rows. Then I put rows from the list into the sudoku from top to bottom, one by one. If a contradiction occurs, I try another row from the list. With this approach I can find valid completed sudoku boards in a short time - it keeps generating completed boards within a minute.
Then I remove numbers from the valid completed board one by one, at random positions. After removing each number, I check whether the puzzle still has a unique solution. If not, I restore the original number and move on to the next random position. Usually I can remove 55-60 numbers from the board. This also takes less than a minute, so it is workable.
However, the first few completed boards generated this way all have the numbers 1,2,3,4,5,6,7,8,9 in the first row, so I shuffled the whole list. After shuffling the list, it becomes difficult to generate a completed board at all. Mission fails.
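For the removal step in the second paragraph, a rough sketch of the dig-and-check loop might look like this; the board is a 9x9 list of lists with 0 marking an empty cell, and the solution-counting backtracker is my own simple version, not necessarily the fastest way to do it:
import random

def is_valid(board, r, c, v):
    # True if value v can be placed at (r, c) without clashing.
    if v in board[r]:
        return False
    if any(board[i][c] == v for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(board[br + i][bc + j] != v for i in range(3) for j in range(3))

def count_solutions(board, limit=2):
    # Count completions by backtracking, stopping as soon as `limit` is reached.
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                count = 0
                for v in range(1, 10):
                    if is_valid(board, r, c, v):
                        board[r][c] = v
                        count += count_solutions(board, limit - count)
                        board[r][c] = 0
                        if count >= limit:
                            return count
                return count
    return 1   # no empty cells left: exactly one completed board

def dig_holes(solved, target_removals=55):
    # Remove values at random positions while the puzzle stays uniquely solvable.
    puzzle = [row[:] for row in solved]
    cells = [(r, c) for r in range(9) for c in range(9)]
    random.shuffle(cells)
    removed = 0
    for r, c in cells:
        if removed >= target_removals:
            break
        saved = puzzle[r][c]
        puzzle[r][c] = 0
        if count_solutions(puzzle) != 1:
            puzzle[r][c] = saved       # restore and try the next position
        else:
            removed += 1
    return puzzle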
A better approach may be this: collect some sudokus from the internet and complete them, so that they can be used as seeds. Remove numbers from them as described in the second paragraph above, and you get playable puzzles. You can then generate more from these sudokus by any of the following methods (a short sketch of some of them follows the list):
swap row 1 and row 3, or row 4 and row 6, or row 7 and row 9
similar method for columns
swap 3x3 blocks 1,4,7 with 3,6,9 or 1,2,3 with 7,8,9 correspondingly.
mirror the sudoku vertically or horizontally
rotate the sudoku by 90, 180 or 270 degrees
randomly permute the numbers on the board. For example, 1->2, 2->3, 3->4, ..., 8->9, 9->1. Or you can swap just two of them, e.g. 1->2, 2->1. This also works.
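A short sketch of the last three items with NumPy (mirror, rotate, and relabel digits), assuming the board is a 9x9 integer array; the function name is mine:
import numpy as np

def random_symmetry(board):
    # Apply a random mirror, a random rotation, and a random digit relabelling.
    b = np.asarray(board)
    if np.random.rand() < 0.5:
        b = np.fliplr(b)                          # horizontal mirror
    b = np.rot90(b, k=np.random.randint(4))       # rotate by 0/90/180/270 degrees
    relabel = np.concatenate(([0], np.random.permutation(np.arange(1, 10))))
    return relabel[b]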
Need to start off by saying that this is essentially a homework question and it’s really a tough one for me.
From the print output of a dynamic matrix (or anything that can be formatted and printed to be visually similar), I need to use indexing from a user’s input to change values in that “matrix”. It doesn't have to be a matrix, something that can be formatted to be similar would also work.
What makes this problem hard for me is that only un-nested lists, strings, or dictionaries can be used, and importing packages is not allowed. So, list comprehension is out of the question.
One thing I’ve tried so far is to print separate lists and index based on separate lists but I got stuck.
You can use a 1D list, take the x, y coordinates as user input, and convert the y coordinate into a whole-row offset.
For example, say you want to represent a 5x3 array and the user wants the second column (x=2) and third row (y=3). Let's assume our matrix displays with 1,1 being top left corner.
Multiply the y coordinate minus 1 by the width of the matrix to obtain the number of cell offsets in your 1D-list, then further offset by x - 1 (remember, Python lists are 0-based) to position correctly on the x-axis.
Example of matrix with 1D-based indices:
0 | 1 | 2 | 3 | 4
5 | 6 | 7 | 8 | 9
10 | 11 | 12 | 13 | 14
Taking the algorithm above:
index = (y - 1) * 5 + x - 1 # ((3 - 1) * 5 + 2 - 1) = 11
As you can see, index 11 is indeed the second column and third row of our matrix, matching the example user input.
You can then display the matrix by way of a simple for loop, knowing the size of the matrix and inserting a new line as appropriate.
You may simplify the above a bit if the user is asked to input 0-based indices as well: you will not need to subtract 1 from x and y.
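Putting that together, here is a small sketch without any imports or comprehensions (matching the constraints in the question); the grid size and helper names are mine:
WIDTH = 5
HEIGHT = 3
grid = [0] * (WIDTH * HEIGHT)     # flat list standing in for the 5x3 "matrix"

def set_cell(grid, x, y, value):
    # x, y are the 1-based column/row numbers given by the user
    grid[(y - 1) * WIDTH + (x - 1)] = value

def show(grid):
    for row in range(HEIGHT):
        line = ""
        for col in range(WIDTH):
            line = line + str(grid[row * WIDTH + col]) + " "
        print(line)

set_cell(grid, 2, 3, 99)    # second column, third row
show(grid)
# 0 0 0 0 0
# 0 0 0 0 0
# 0 99 0 0 0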
At the request of #mypetition I am editing my question, although I think the astronomy details here are unimportant.
I have a file of the form:
a e q Q i lasc aper M H dist comment i_f i_h i_free
45.23710 0.1394 38.93105 51.54315 5.0300 19.9336 286.2554 164.9683 8.41 51.3773 warm 0.000 62.000 4.796
46.78620 0.1404 40.21742 53.35498 3.1061 148.9657 192.3009 337.5967 7.37 40.8789 cold 0.000 42.000 2.473
45.79450 0.1230 40.16178 51.42722 8.0695 104.6470 348.5004 32.9457 8.45 41.3089 warm 0.000 47.000 6.451
42.95280 0.0145 42.32998 43.57562 2.9273 126.3988 262.8777 163.4198 7.36 43.5518 cold 0.000 161.000 2.186
There are 1.6e6 lines in total. These are orbital elements. I need to compute the Minimum Orbit Intersection Distance (MOID) between each pair of orbits, e.g. line 1 with line 2, line 1 with line 3, and so forth until I reach the end of the file. Then I start from the second line and go to the end of the file, then from the third line, and so on. Since I have 1.6e6 orbits, that is ~1e12 orbit pairs.
I don't want to load all these 1e12 calculations onto one CPU and wait forever, so I am planning to use a cluster and launch multiple serial jobs.
So I need to iterate over the 1.6e6 elements, starting with the first element and going to the end of the file, then starting from the second and going to the end, and so on, until I finally pair element T-1 with T. This results in ~10^12 iterations, which I plan to split into jobs of C=10^7 calculations each, so I can run them on a computer cluster.
I came up with the following nested loop:
for i in range( M, N):
    for j in range( i+1, T):
where M=1 and changes according to the number of jobs that I will have. T=1.6e6 is constant (the number of lines to iterate over). I want to find the index N such that the total number of operations is C=10^7. Here is how I approached the problem:
[T-(N+1) + T-(M+1)]*(N-M+1)/2 = C - because the number of operations is just the sum of the arithmetic series above. So I solve the quadratic equation and take the roots. Here is the Python code for that:
import numpy as np
import math
C=1.0e7 # How many calculations per job do you want?
T=1.6e6 # How many orbits do you have?
M=1 # what is the starting index of outer loop?
# N = end index of outer loop (this is to be calculated!)
P=1
l=0
with open('indx.txt','w') as f:
    while P<T:
        l=l+1
        K=np.roots([-1,2*T,M**2-2*T*(M-1)-2*C])
        N=int(round(K[1]))
        f.write("%s %s\n" % (P,P+N))
        M=K[1]+1
        P=P+N+1
However, keeping the above solution and updating M = M+N, I noticed that the condition C=10^7 is not satisfied. Here is a list of the first few index pairs:
M N
1 7
8 21
22 41
42 67
68 99
100 138
139 183
184 234
235 291
....
....
1583930 1588385
1588386 1592847
1592848 1597316
1597317 1601791
But if you look at the pair before the last, the loop over i = 1592848 to 1597316 with j = i+1, ..., T will produce more calculations than C=10^7, i.e. roughly (2685+7153)*4468/2 ~ 2.2e7.
Any idea how to solve this problem, keeping C=1e7 constant? That would give the number of jobs (with similar running times) I need to run in order to cover all 1.6e6 lines.
Hopefully this explanation is enough by #mypetition's standards, and I am hoping to resolve the problem.
Your help will be highly appreciated!
I don't know if the nature of each job lends itself to a different kind of split, but if it does, you could try to use the same trick that Gauss used to come up with ∑1..n = n(n+1)/2.
The trick was to line up the sequence with a reversed copy of it:
1 2 3 4 5 6 7 8
8 7 6 5 4 3 2 1
-- -- -- -- -- -- -- --
9 9 9 9 9 9 9 9 = 8 * 9 is twice the sum so (8*9)/2 = ∑1..8 = 36
Based on this, if you split the pair of series down the middle, you will get 4 pairs of runs that will process the same number of elements:
1 2 3 4
8 7 6 5
-- -- -- --
9 9 9 9
So you would have 8 runs separated into 4 jobs. Each job would process n+1 (9) elements and compute two runs that have complementary numbers of elements:
Job 1 would do run 8..8 and run 1..8 (length 1 and 8)
Job 2 would do run 7..8 and run 2..8
Job 3 would do run 6..8 and run 3..8
Job 4 would do run 5..8 and run 4..8
In more general terms:
Job i of (N+1)/2 does runs (N-i+1)..N and i..N
If individual runs can't be parallelized further, this should give you the optimal spread (practically the square root of the total process time)
in Python (pseudo code):
size = len(array)
for index in range((size+1)//2):
    launchJob(array, run1Start=index, run2Start=size-index-1)
note: you may want to adjust the starting points if you're not using zero based indexes.
note2: if you're not processing the last element on its own (i.e. N..N is excluded), one of your jobs will have N elements to process instead of N+1 and you will have to make an exception for that one
Adding more jobs will not significantly improve total processing time but if you want fewer parallel jobs, you can still keep them fairly equal by grouping pairs.
e.g. 2 jobs : [ 1,8,2,7 ] and [3,6,4,5] = 18 per job
Ideally your number of jobs should be a divisor of the number of pairs. If not, you will still get a relatively balanced processing time by spreading the extra pairs (or runs) evenly over the other jobs. If you choose to spread runs, select the ones in the middle of the list of pairs (because their individual processing times are closer to each other).
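A rough sketch of that grouping in Python, assuming zero-based run start indices (run i covers the pairs (i, j) for j > i) and dealing the complementary pairs out round-robin; the function name is mine:
def make_jobs(n_rows, n_jobs):
    # Pair the longest remaining run with the shortest one, then deal the
    # pairs round-robin over the jobs so each job gets a similar workload.
    pairs = [(i, n_rows - 1 - i) for i in range((n_rows + 1) // 2)]
    jobs = [[] for _ in range(n_jobs)]
    for k, (a, b) in enumerate(pairs):
        jobs[k % n_jobs].extend([a] if a == b else [a, b])
    return jobs

print(make_jobs(8, 2))   # [[0, 7, 2, 5], [1, 6, 3, 4]] -> 18 elements per job, like [1,8,2,7] and [3,6,4,5] above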
I'm currently working on Project Euler 18 which involves a triangle of numbers and finding the value of the maximum path from top to bottom. It says you can do this project either by brute forcing it or by figuring out a trick to it. I think I've figured out the trick, but I can't even begin to solve this because I don't know how to start manipulating this triangle in Python.
https://projecteuler.net/problem=18
Here's a smaller example triangle:
3
7 4
2 4 6
8 5 9 3
In this case, the maximum route would be 3 -> 7 -> 4 -> 9 for a value of 23.
Some approaches I considered:
I've used NumPy quite a lot for other tasks, so I wondered if an array would work. For that 4 number base triangle, I could maybe do a 4x4 array and fill up the rest with zeros, but aside from not knowing how to import the data in that way, it also doesn't seem very efficient. I also considered a list of lists, where each sublist was a row of the triangle, but I don't know how I'd separate out the terms without going through and adding commas after each term.
Just to emphasise, I'm not looking for a method or a solution to the problem, just a way I can start to manipulate the numbers of the triangle in python.
Here is a little snippet that should help you with reading the data:
rows = []
with open('problem-18-data') as f:
    for line in f:
        rows.append([int(i) for i in line.rstrip('\n').split(" ")])
I'm very new at this and have to do this for a project so keep that in mind.
I need to write a function sumOfDiagonal that has one parameter of type list.
The list is a 4x4 2-dimensional array of integers (4 rows and 4 columns of integers).
The function must return the sum of the integers in the diagonal positions from top right to bottom left.
I have not tried anything because I have no idea where to begin, so would appreciate some guidance.
Since you haven't specified a language (and this is probably classwork anyway), I'll have to provide pseudo-code. Given the 4x4 2d array, the basic idea is to use a loop specifying the index, and use that index to get the correct elements in both dimensions. Say we had the array:
[][0] [][1] [][2] [][3]
----- ----- ----- -----
[0][] 1 2 3 4
[1][] 5 6 7 8
[2][] 9 10 11 12
[3][] 13 14 15 16
and we wanted to sum the top-left-to-bottom-right diagonal (1+6+11+16)(1). That would be something like:
def sumOfDiagonal (arr, sz):
    sum = 0
    for i = 0 to sz - 1 inclusive:
        sum = sum + arr[i][i]
    return sum
That's using the normal means of accessing an array. If, as may be given the ambiguity in the question, your array is actually a list of some description (such as a linked list of sixteen elements), you'll just need to adjust how you get the "array" elements.
For example, a 16-element list would need to get nodes 0, 5, 10 and 15 so you could run through the list skipping four nodes after each accumulation.
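To illustrate the flat-list case, a small sketch (assuming a 16-element list in row-major order; the names are mine):
def sum_of_diagonal_flat(flat, size=4):
    # Sum the main diagonal of a size*size matrix stored as one flat list.
    total = 0
    for i in range(0, size * size, size + 1):   # indices 0, 5, 10, 15
        total += flat[i]
    return total

print(sum_of_diagonal_flat(list(range(1, 17))))   # 1 + 6 + 11 + 16 = 34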
By way of example, here's some Python code(2) for doing the top-left-to-bottom-right variant, which outputs 34 (1+6+11+16) as expected:
def sumOfDiagonals(arr):
    sum = 0
    for i in range(len(arr)):
        sum += arr[i][i]
    return sum
print(sumOfDiagonals([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]))
(1) To do top right to bottom left simply requires you to change the second term into sz - i - 1.
(2) Python is the ideal pseudo-code language when you want to be able to test your pseudo-code, provided you stay away from its more complex corners :-)
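And since the question actually asked for the top-right-to-bottom-left diagonal, here is the same Python code with footnote (1) applied:
def sumOfDiagonal(arr):
    total = 0
    sz = len(arr)
    for i in range(sz):
        total += arr[i][sz - i - 1]    # top right to bottom left
    return total

print(sumOfDiagonal([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]))
# 4 + 7 + 10 + 13 = 34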
Suppose you have a 10x10 numpy array of intensity values extracted from an image. The exact numbers do not matter right now. I would like to take this matrix, and generate vertices for a graph using only the vertex locations of the upper half of the matrix. More specifically, if our matrix dimensions are defined as (MxN), we could possibly write something like this:
for x in range(M1,M2,M3...M10):
    for y in range(N1,N2,N3...N10):
        if (x-y) >=0:
            graph.addVertex(x,y)
The graph class definition and addVertex definition are NOT important to me as I already have them written. I am only concerned about a method in which I can only consider vertices which are above the diagonal. Open to any and all suggestions, my suggestion above is merely a starting point that may possibly be helpful. Thanks!
EDIT: SOLUTION
Sorry if my clarity issues were atrocious, as I'm somewhat new to coding in Python, but this is the solution to my issue:
g=Graph()
L=4
outerindex=np.arange(L**2*L**2).reshape((L**2,L**2))
outerindex=np.triu(outerindex,k=0)
for i in range(len(outerindex)):
    if outerindex.any()>0:
        g.addVertex(i)
In this manner, when adding vertices to our newly formed graph, the only new vertices formed will be those that reside in locations above the main diagonal.
I think what you want is something like this:
import numpy as np
a = np.arange(16).reshape((4,4))
print(a)
for i in range(4):
    for j in range(i, 4):
        print(a[i,j], end=' ')
# [[ 0 1 2 3]
# [ 4 5 6 7]
# [ 8 9 10 11]
# [12 13 14 15]]
# 0 1 2 3 5 6 7 10 11 15
That is, the key point here is to make the index of the inner loop dependent on the outer loop.
If you don't want to include the diagonal, use the inner loop with range(i+1,4).
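If you later want to drop the explicit Python loops, NumPy can also hand you the upper-triangle indices directly via np.triu_indices; a small sketch (the graph call is just the question's own addVertex, shown commented out):
import numpy as np

a = np.arange(16).reshape((4, 4))
rows, cols = np.triu_indices(4)      # includes the diagonal; pass k=1 to exclude it
print(a[rows, cols])                 # [ 0  1  2  3  5  6  7 10 11 15]
# for r, c in zip(rows, cols):
#     graph.addVertex(r, c)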