Generation of values above diagonal numpy array

Suppose you have a 10x10 numpy array of intensity values extracted from an image. The exact numbers do not matter right now. I would like to take this matrix, and generate vertices for a graph using only the vertex locations of the upper half of the matrix. More specifically, if our matrix dimensions are defined as (MxN), we could possibly write something like this:
for x in range(M):
    for y in range(N):
        if y >= x:  # column index >= row index: on or above the diagonal
            graph.addVertex(x, y)
The graph class definition and addVertex definition are NOT important to me as I already have them written. I am only concerned about a method in which I can only consider vertices which are above the diagonal. Open to any and all suggestions, my suggestion above is merely a starting point that may possibly be helpful. Thanks!
EDIT: SOLUTION
Sorry if my clarity issues were atrocious, as I'm somewhat new to coding in Python, but this is the solution to my issue:
g=Graph()
L=4
outerindex=np.arange(L**2*L**2).reshape((L**2,L**2))
outerindex=np.triu(outerindex,k=0)
for i in range(len(outerindex)):
    if outerindex.any()>0:
        g.addVertex(i)
In this manner, when adding vertices to our newly formed graph, the only new vertices formed will be those that reside in locations above the main diagonal.

I think what you want is something like this:
import numpy as np
a = np.arange(16).reshape((4,4))
print(a)
for i in range(4):
    for j in range(i, 4):
        print(a[i,j], end=" ")
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]
#  [12 13 14 15]]
# 0 1 2 3 5 6 7 10 11 15
That is, the key point here is to make the index of the inner loop dependent on the outer loop.
If you don't want to include the diagonal, use the inner loop with range(i+1,4).
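If you would rather not write the nested loops by hand, NumPy can return the upper-triangle indices directly. This is a minimal sketch, assuming the same 4x4 array `a` as above; `np.triu_indices` takes the matrix size and an offset `k` (0 keeps the diagonal, 1 drops it). The `pairs` list is just an illustrative way to get (row, column) tuples that could be passed one at a time to a vertex-adding call.

```python
import numpy as np

a = np.arange(16).reshape((4, 4))

# Row and column indices of every element on or above the main diagonal.
rows, cols = np.triu_indices(4, k=0)
upper_with_diag = a[rows, cols]

# k=1 starts one step to the right of the diagonal, so the diagonal is excluded.
rows1, cols1 = np.triu_indices(4, k=1)
upper_without_diag = a[rows1, cols1]

# (row, col) pairs as plain Python ints, e.g. for a per-vertex call.
pairs = list(zip(rows.tolist(), cols.tolist()))

print(upper_with_diag)
print(upper_without_diag)
```

The values come out in the same row-by-row order as the nested loops above.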

Related

Extending and optimizing 2D grid search code to N-dimensions (using itertools)

I have code for a 2D grid search and it works perfectly. Here is the sample code:
chinp = np.zeros((N,N))
O_M_De = []
for x,y in list(((x,y) for x in range(len(omega_M)) for y in range(len(omega_D)))):
    Omin = omega_M[x]
    Odin = omega_D[y]
    print(Omin, Odin)
    chi = np.sum((dist_data - dist_theo)**2/(chi_err))
    chinp[y,x] = chi
    chi_values1.append(chi)
    O_M_De.append((x,y))
At some point in the future, I may want to perform a grid search over more dimensions. For 3 dimensions, it would be as simple as adding another variable 'z' to my 'for' statement (line 3). This would keep working as I add more dimensions (I have tried it and it works).
However, if I wanted to grid search over a large number of dimensions, it would get tedious and inefficient to keep adding variables to my 'for' statement (e.g. for 5D it would go something like 'for v,w,x,y,z in list(((v,w,x,y,z)...').
From various Google searches, I am under the impression that itertools is very helpful for grid searches, but I am still fairly new to programming and unfamiliar with it.
Does anyone know a way (using itertools or some other method I am not aware of) to extend this code to N dimensions more efficiently, i.e. to change the 'for' statement so I can grid search over N dimensions without adding another 'for z in range' each time?
Thank you in advance for your help.

You want to take a look at the product function from itertools:
import itertools
x_list = [0, 1, 2]
y_list = [10, 11, 12]
z_list = [20, 21, 22]
for x, y, z in itertools.product(x_list, y_list, z_list):
    print(x, y, z)
0 10 20
0 10 21
0 10 22
0 11 20
0 11 21
(...)
2 11 21
2 11 22
2 12 20
2 12 21
2 12 22
Note that this will not be the most efficient way. You will get the best results if you add some vectorization (for example using numpy or numba) and parallelism (using multiprocessing or numba).
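To avoid naming one loop variable per dimension, the axes can be kept in a list and unpacked into `itertools.product`. The following is only a sketch: `axes`, `chi_squared`, and the parameter values are hypothetical placeholders, not the asker's data or model.

```python
import itertools

# Hypothetical parameter axes; add or remove lists to change the dimensionality.
axes = [
    [0.1, 0.2, 0.3],   # values for the first parameter
    [0.6, 0.7],        # values for the second parameter
    [-1.0, -0.9],      # values for the third parameter
]

def chi_squared(params):
    # Placeholder objective for illustration only.
    return sum((p - 0.5) ** 2 for p in params)

results = {}
# product(*ranges) yields one index tuple per grid point, whatever N is.
for idx in itertools.product(*(range(len(axis)) for axis in axes)):
    params = tuple(axis[i] for axis, i in zip(axes, idx))
    results[idx] = chi_squared(params)

# Index tuple of the grid point with the smallest objective value.
best_idx = min(results, key=results.get)
print(len(results), best_idx)
```

Keying the results by index tuple keeps the code independent of the number of dimensions; the same tuples could index an N-dimensional numpy array instead of a dict.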

Why does a GraphView omit the 0th vertex?

I am using the latest version of graph_tool installed in its own conda environment, as per the installation guide.
I ran into some perplexing behavior with this library recently. When I run the following code:
import graph_tool
graph = graph_tool.Graph(directed=False)
graph.add_vertex(10)
subgraph = graph_tool.GraphView(graph, graph.get_vertices())
print(graph.get_vertices())
print(subgraph.get_vertices())
The output is:
[0 1 2 3 4 5 6 7 8 9]
[1 2 3 4 5 6 7 8 9]
I thought a GraphView was supposed to act like a subgraph induced on the specified vertices (so in the case of my sample code, the entire set of vertices). So why does a GraphView omit the 0th vertex?
Or, if this is actually a bug in graph_tool, what would be a good way to work around it, provided I wanted to work with subgraphs that include the 0th vertex?

You posted the documentation in your answer, but it seems you did not read it carefully enough (emphasis added):
The argument g must be an instance of a Graph class. If specified, vfilt and efilt select which vertices and edges are filtered, respectively. These parameters can either be a boolean-valued PropertyMap or a ndarray, which specify which vertices/edges are selected, or an unary function that returns True if a given vertex/edge is to be selected, or False otherwise.
If you pass a property map or an array, it must be boolean valued, not a list of vertices. This means it must have the form [True, False, False, True, ... ], where True means the corresponding vertex is kept, otherwise it's filtered out. That is why the vertex with index 0 (i.e. False) is removed from your example, and all remaining ones are kept.
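The effect can be reproduced with numpy alone. This sketch only illustrates the boolean interpretation described above, without calling graph_tool; `wanted` is a hypothetical selection of vertex indices.

```python
import numpy as np

n_vertices = 10
vertex_indices = np.arange(n_vertices)   # [0 1 2 ... 9], like the get_vertices() output

# Read as booleans, index 0 becomes False and every other index becomes True.
as_bool = vertex_indices.astype(bool)

# A proper mask has one entry per vertex, True where the vertex should be kept.
wanted = [0, 3, 7]                        # hypothetical selection
mask = np.zeros(n_vertices, dtype=bool)
mask[wanted] = True

print(as_bool)
print(mask)
```

According to the documentation quoted above, a boolean array shaped like `mask` is the kind of value the vertex filter expects, rather than a list of vertex indices.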

So I found a workaround. From the documentation of GraphView:
The argument g must be an instance of a Graph class. If specified,
vfilt and efilt select which vertices and edges are filtered,
respectively. These parameters can either be a boolean-valued
PropertyMap or a ndarray, which specify which vertices/edges are
selected, or an unary function that returns True if a given
vertex/edge is to be selected, or False otherwise.
So the vertex mask can also be specified by a unary function that says whether or not the vertex is part of the subgraph:
def get_subgraph(graph, vertices):
    f = lambda x: x in vertices
    return graph_tool.GraphView(graph, f)
And, somehow, this version actually works!
subgraph = get_subgraph(graph, graph.get_vertices())
print(graph.get_vertices())
print(subgraph.get_vertices())
Output:
[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]
So it's not actually impossible to make a GraphView that includes 0 as a vertex, it just apparently doesn't work if you try to do it with a numpy array.
This answer works for me, but I would still be interested if anyone has a better workaround (especially since this one makes it much slower to return the subgraph for a large graph), or if somebody knows why this odd behavior arises in the first place.
EDIT:
This implementation leverages numpy to calculate the vertex mask instead of the native python "in" operation, and is therefore much faster for larger graphs:
import numpy as np
def get_subgraph(graph, vertices):
    property_map = graph.new_vertex_property("bool")
    property_map.a = np.isin(graph.get_vertices(), vertices)
    return graph_tool.GraphView(graph, property_map)

How to import my data into python

I'm currently working on Project Euler 18 which involves a triangle of numbers and finding the value of the maximum path from top to bottom. It says you can do this project either by brute forcing it or by figuring out a trick to it. I think I've figured out the trick, but I can't even begin to solve this because I don't know how to start manipulating this triangle in Python.
https://projecteuler.net/problem=18
Here's a smaller example triangle:
3
7 4
2 4 6
8 5 9 3
In this case, the maximum route would be 3 -> 7 -> 4 -> 9 for a value of 23.
Some approaches I considered:
I've used NumPy quite a lot for other tasks, so I wondered if an array would work. For that 4 number base triangle, I could maybe do a 4x4 array and fill up the rest with zeros, but aside from not knowing how to import the data in that way, it also doesn't seem very efficient. I also considered a list of lists, where each sublist was a row of the triangle, but I don't know how I'd separate out the terms without going through and adding commas after each term.
Just to emphasise, I'm not looking for a method or a solution to the problem, just a way I can start to manipulate the numbers of the triangle in python.

Here is a little snippet that should help you with reading the data:
rows = []
with open('problem-18-data') as f:
    for line in f:
        rows.append([int(i) for i in line.rstrip('\n').split(" ")])
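The same idea works without a file, which is handy for trying things out. Here is a small sketch using the example triangle from the question pasted into a string (`triangle_text` is just an illustrative name); it produces the list of lists the question mentions, with no commas needed in the input.

```python
triangle_text = """\
3
7 4
2 4 6
8 5 9 3
"""

# split() with no argument splits on any run of whitespace, and the
# `if line.strip()` filter skips blank lines.
rows = [[int(tok) for tok in line.split()]
        for line in triangle_text.splitlines() if line.strip()]

print(rows)          # [[3], [7, 4], [2, 4, 6], [8, 5, 9, 3]]
print(rows[3][2])    # 9
```

Each row keeps its natural length, so `rows[r][c]` is entry `c` of row `r` and there is no zero padding to work around.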

Random Sudoku Generator

I'm trying to build a python script that generates a 9x9 block with numbers 1-9 that are unique along the rows, columns and within the 3x3 blocks - you know, Sudoku!
So, I thought I would start simple and get more complicated as I went. First I made it so it randomly populated each array value with a number 1-9. Then I made sure numbers along rows weren't replicated. Next, I wanted to do the same for rows & columns. I think my code is OK - it's certainly not fast, but I don't know why it jams up.
import numpy as np
import random
#import pdb
#pdb.set_trace()
#Soduku solver!
#Number input
soduku = np.zeros(shape=(9,9))
for i in range(0,9,1):
    for j in range(0,9,1):
        while True:
            x = random.randint(1,9)
            if x not in soduku[i,:] and x not in soduku[:,j]:
                soduku[i,j] = x
                if j == 8: print(soduku[i,:])
                break
So it moves across the columns populating with random ints, drops a row and repeats. The most the code should really need to do is generate 9 numbers for each square if it's really unlucky - I think if we worked it out it would be less than 9*9*9 values needing generating. Something is breaking it!
Any ideas?!

I think what's happening is that your code is getting stuck in your while-loop. You test for the condition if x not in soduku[i,:] and x not in soduku[:,j], but what happens if this condition is not met? It's very likely that your code is running into a dead-end sudoku board (can't be solved with any values), and it's getting stuck inside the while-loop because the condition to break can never be met.
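A dead end is easy to construct by hand. The sketch below uses the row/column-only rule from the question on a hand-picked partial board (the board and the `candidates` helper are illustrative, not taken from the question's code) and shows a cell for which no value is left.

```python
# Partial board under the row/column-only rule (0 = empty cell).
row0 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
row1 = [2, 3, 4, 5, 6, 7, 8, 1, 0]   # each entry differs from the one above it
board = [row0, row1]

def candidates(board, i, j):
    """Values 1-9 not yet used in row i or column j."""
    used = set(board[i]) | {row[j] for row in board}
    return [v for v in range(1, 10) if v not in used]

# Row 1 already holds 1-8 and column 8 already holds 9, so nothing fits.
print(candidates(board, 1, 8))   # []
```

With an empty candidate list, a `while True` loop that keeps drawing random values can never reach its `break`. Checking for an empty list gives the code a point at which to restart or backtrack instead.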

Generating it like this is very unlikely to work. There are many ways to fill 8 of the nine 3x3 squares that make it impossible to fill in the last square at all, which makes the loop hang forever.
Another approach would be to fill in the numbers one at a time (so, all the 1s first, then all the 2s, etc.). It would be like the Eight queens puzzle, but with 9 queens. When you get to a position where it is impossible to place a number, restart.
Another approach would be to start all the squares at 9 and strategically decrement them somehow, e.g. first decrement all the ones that cannot be 9, excluding the 9s in the current row/column/square, then if they are all impossible or all possible, randomly decrement one.
You can also try to enumerate all sudoku boards and then invert the enumeration function with a random integer. I don't know how successful this would be, but it is the only method here where boards could be chosen with uniform randomness.

You are coming at the problem from a difficult direction. It is much easier to start with a valid Sudoku board and play with it to make a different valid Sudoku board.
An easy valid board is:
1 2 3 | 4 5 6 | 7 8 9
4 5 6 | 7 8 9 | 1 2 3
7 8 9 | 1 2 3 | 4 5 6
---------------------
2 3 4 | 5 6 7 | 8 9 1
5 6 7 | 8 9 1 | 2 3 4
8 9 1 | 2 3 4 | 5 6 7
---------------------
3 4 5 | 6 7 8 | 9 1 2
6 7 8 | 9 1 2 | 3 4 5
9 1 2 | 3 4 5 | 6 7 8
Having found a valid board you can make a new valid board by playing with your original.
You can swap any row of three 3x3 blocks with any other block row. You can swap any column of three 3x3 blocks with another block column. Within each block row you can swap single cell rows; within each block column you can swap single cell columns. Finally you can permute the digits so there are different digits in the cells as long as the permutation is consistent across the whole board.
None of these changes will make a valid board invalid.
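The transformations described above can be sketched as follows. All function names here are illustrative. The base board is generated by a formula that reproduces the easy board shown earlier, and `is_valid` is included only to check the result.

```python
import random

def base_board():
    # Same pattern as the easy board above: each row is shifted by 3 within a
    # block row, and by 1 from one block row to the next.
    return [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]

def shuffled_axis(rng):
    # Shuffle the three block rows (or columns), then the three lines inside each.
    bands = [0, 1, 2]
    rng.shuffle(bands)
    order = []
    for b in bands:
        lines = [3 * b, 3 * b + 1, 3 * b + 2]
        rng.shuffle(lines)
        order.extend(lines)
    return order

def random_board(seed=None):
    rng = random.Random(seed)
    board = base_board()
    row_order = shuffled_axis(rng)
    col_order = shuffled_axis(rng)
    digits = list(range(1, 10))
    rng.shuffle(digits)   # digits[d - 1] is the new label for digit d
    return [[digits[board[r][c] - 1] for c in col_order] for r in row_order]

def is_valid(board):
    full = set(range(1, 10))
    rows = all(set(row) == full for row in board)
    cols = all({board[r][c] for r in range(9)} == full for c in range(9))
    boxes = all(
        {board[br + r][bc + c] for r in range(3) for c in range(3)} == full
        for br in (0, 3, 6) for bc in (0, 3, 6)
    )
    return rows and cols and boxes

board = random_board(seed=0)
for row in board:
    print(*row)
print(is_valid(board))
```

Boards produced this way are all relabelings and rearrangements of the one starting board, so they cover only part of the space of valid boards and are not drawn uniformly from it.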

I use permutations(range(1,10)) from itertools to create a list of all possible rows. Then I put rows into a sudoku from top to bottom, one by one. If a contradiction occurs, I use another row from the list. With this approach, I can find a valid completed sudoku board in a short time, and it keeps generating completed boards within a minute.
Then I remove numbers from the valid completed board one by one at random positions. After removing each number, I check whether the puzzle still has a unique solution. If not, I restore the original number and move on to the next random position. Usually I can remove 55~60 numbers from the board. This also takes less than a minute, so it is workable.
However, the first few completed boards generated this way all have 1,2,3,4,5,6,7,8,9 in the first row, so I shuffled the whole list. After shuffling the list, it becomes difficult to generate a completed board, and the approach fails.
A better approach may be the following. Collect some sudokus from the internet and complete them so that they can be used as seeds. Remove numbers from them as described in the second paragraph to get some puzzles. You can then use these sudokus to generate more by any of the following methods:
swap row 1 and row 3, or row 4 and row 6, or row 7 and row 9
use the same method for columns
swap 3x3 blocks 1,4,7 with 3,6,9, or 1,2,3 with 7,8,9, correspondingly
mirror the sudoku vertically or horizontally
rotate the sudoku by 90, 180, or 270 degrees
randomly permute the numbers on the board, for example 1->2, 2->3, 3->4, ..., 8->9, 9->1. You can also swap just two of them, e.g. 1->2, 2->1.

Sum of array diagonal

I'm very new at this and have to do this for a project so keep that in mind.
I need to write a function sumOfDiagonal that has one parameter of type list.
The list is a 4x4 2-dimensional array of integers (4 rows and 4 columns of integers).
The function must return the sum of the integers in the diagonal positions from top right to bottom left.
I have not tried anything because I have no idea where to begin, so would appreciate some guidance.

Since you haven't specified a language (and this is probably classwork anyway), I'll have to provide pseudo-code. Given the 4x4 2d array, the basic idea is to use a loop specifying the index, and use that index to get the correct elements in both dimensions. Say we had the array:
        [][0]  [][1]  [][2]  [][3]
        -----  -----  -----  -----
[0][]     1      2      3      4
[1][]     5      6      7      8
[2][]     9     10     11     12
[3][]    13     14     15     16
and we wanted to sum the top-left-to-bottom-right diagonal (1+6+11+16)(1). That would be something like:
def sumOfDiagonal (arr, sz):
    sum = 0
    for i = 0 to sz - 1 inclusive:
        sum = sum + arr[i][i]
    return sum
That's using the normal means of accessing an array. If, as may be the case given the ambiguity in the question, your array is actually a list of some description (such as a linked list of sixteen elements), you'll just need to adjust how you get the "array" elements.
For example, a 16-element list would need to get nodes 0, 5, 10 and 15 so you could run through the list skipping four nodes after each accumulation.
By way of example, here's some Python code(2) for doing the top-left-to-bottom-right variant, which outputs 34 (1+6+11+16) as expected:
def sumOfDiagonals(arr):
    sum = 0
    for i in range(len(arr)):
        sum += arr[i][i]
    return sum
print(sumOfDiagonals([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]))
(1) To do top right to bottom left simply requires you to change the second term into sz - i - 1.
(2) Python is the ideal pseudo-code language when you want to be able to test your pseudo-code, provided you stay away from its more complex corners :-)
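For completeness, here is a sketch of the top-right-to-bottom-left variant that footnote (1) describes, with the second index changed to `n - 1 - i`. The function name `sum_of_anti_diagonal` is illustrative, and the code assumes a square list of lists.

```python
def sum_of_anti_diagonal(arr):
    """Sum the top-right-to-bottom-left diagonal of a square list of lists."""
    n = len(arr)
    total = 0
    for i in range(n):
        # Row i pairs with column n - 1 - i: (0, 3), (1, 2), (2, 1), (3, 0).
        total += arr[i][n - 1 - i]
    return total

grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(sum_of_anti_diagonal(grid))   # 34  (4 + 7 + 10 + 13)
```

For this particular grid both diagonals happen to sum to 34, so a smaller asymmetric input such as `[[1, 2], [3, 4]]` (which gives 5) is a better check that the right diagonal is being read.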
