Extending and optimizing 2D grid search code to N-dimensions (using itertools) - python

I have code for a 2D grid search and it works perfectly. Here is the sample code:
chinp = np.zeros((N,N))
O_M_De = []
for x,y in list(((x,y) for x in range(len(omega_M)) for y in range(len(omega_D)))):
    Omin = omega_M[x]
    Odin = omega_D[y]
    print(Omin, Odin)
    chi = np.sum((dist_data - dist_theo)**2/(chi_err))
    chinp[y,x] = chi
    chi_values1.append(chi)
    O_M_De.append((x,y))
My question is this: at some point in the future, I may want to perform a grid search over more dimensions. For 3 dimensions, it would be as simple as adding another variable 'z' to my 'for' statement (line 3). This approach keeps working as I add more dimensions (I have tried it and it works).
However, for a large number of dimensions it gets tedious and inefficient to keep adding variables to the 'for' statement (e.g. for 5D it would be something like 'for v,w,x,y,z in list(((v,w,x,y,z)...').
From various Google searches, I am under the impression that itertools is very helpful for grid searches, but I am still fairly new to programming and unfamiliar with it.
Does anyone know a way (using itertools or some other method I am not aware of) to extend this code to N dimensions more efficiently, i.e. to change the 'for' statement so I can grid search over N dimensions without adding another 'for z in range...' each time?
Thank you in advance for your help.

Take a look at the product function from itertools:
import itertools
x_list = [0, 1, 2]
y_list = [10, 11, 12]
z_list = [20, 21, 22]
for x, y, z in itertools.product(x_list, y_list, z_list):
    print(x, y, z)
0 10 20
0 10 21
0 10 22
0 11 20
0 11 21
(...)
2 11 21
2 11 22
2 12 20
2 12 21
2 12 22
Note that this is not the most efficient approach. You will get better performance by adding vectorization (for example with numpy or numba) and parallelism (with multiprocessing or numba).
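To connect this back to the grid search in the question, here is a minimal sketch of an N-dimensional version. The names grid_search and toy_chi2 and the third axis h are hypothetical, made up for illustration; the real chi-squared calculation from the question would take the place of toy_chi2. Unlike the chinp[y,x] layout in the question, the result array here is indexed in the same order as the axes.

```python
import itertools
import numpy as np

def grid_search(axes, objective):
    """Evaluate objective at every point of the grid spanned by axes.

    axes is a list of 1D sequences, one per dimension (hypothetical helper).
    """
    shape = tuple(len(axis) for axis in axes)
    values = np.empty(shape)
    # product over index ranges yields index tuples (i, j, k, ...) for any N
    for idx in itertools.product(*(range(n) for n in shape)):
        point = tuple(axis[i] for axis, i in zip(axes, idx))
        values[idx] = objective(*point)
    # convert the flat position of the minimum back into an index tuple
    best_idx = np.unravel_index(np.argmin(values), shape)
    best_point = tuple(axis[i] for axis, i in zip(axes, best_idx))
    return values, best_point

# Toy stand-in for the chi-squared of the question (an assumption)
def toy_chi2(om, od, h):
    return (om - 0.3) ** 2 + (od - 0.7) ** 2 + (h - 0.7) ** 2

omega_M = np.linspace(0.0, 1.0, 11)
omega_D = np.linspace(0.0, 1.0, 11)
h = np.linspace(0.5, 0.9, 5)

values, best = grid_search([omega_M, omega_D, h], toy_chi2)
print(values.shape, best)
```

Adding a dimension then only means appending another axis to the list, with no change to the loop itself.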

Related

How to compare specific elements in 2D array in Python

I'm trying to compare these specific elements to find the highest number:
Q_2 = [[5,0,41],[6,3,5],[7,4,3],[8,5,40]]
This is my 2D array in Python. I want to compare the Q_2[i][2] values with each other: for example, 41 gets compared to 5, 3 and 40, and the result is the highest number.
I came up with 2 ways:
I store Q_2[i][2] of every item in a new list (which doesn't work, and I don't know why)
Or I do a loop to compare them
from array import *
#These 2 are used to define columns and rows for other 2d arrays (all arrays have the same columns and rows)
n = int(3)
m = int(input("Enter number of processes \n")) #I always type 4 for this variable
Q_2 = [[5,0,41],[6,3,5],[7,4,3],[8,5,40]]
for i in range(m):
    for j in range(1,3,1):
        if(Q_2[i][2]>=Q_2[j][2]):
            Max_Timers = Q_2[i]
print(Max_Timers) #to check if the true value is returned or not
The result it returns is 40.
This worked when the value at index 0 was lower than the others, but once I changed the first one to 41 it no longer works.
There is no need for two 'for' loops, as you are after just one element from the 2D array rather than a complete 1D array.
This is working code for a 2D array that gets the 1D array (row) with the highest element at index 2:
max_index = 0
Q_2 = [[5,0,41],[6,3,5],[7,4,3],[8,5,40]]
for i in range(len(Q_2)):
    if(Q_2[i][2]>=Q_2[max_index][2]):
        max_index = i
print(Q_2[max_index])
The reason your code doesn't work is easy to find by doing a dry run with the values you are using.
According to your logic,
number 41 gets compared to 5 and 3 and 40 and the result is the highest number
and this is achieved by the two for loops with i and j. But what you overlooked is that each comparison only involves the current pair of values, so Max_Timers holds the maximum for the current iteration only. Instead of the global maximum, only the local/current maximum is stored.
A quick dry-run of your code (I modified a few lines for the dry-run but with no change in logic) will work like this:
41 is compared to 5 (Max_Timers = 41)
41 is compared to 3 (Max_Timers = 41)
5 is compared to 3 (Max_Timers = 5)
40 is compared to 5 (Max_Timers = 40)
40 is compared to 3 (Max_Timers = 40)
>>> print(Max_Timers)
40
So, this is the reason you are getting 40 as a result.
The solution is well covered in Rajani B's answer, which performs a global comparison by storing the running maximum and comparing against it.
Note: I didn't mention this above, but even with the second for loop, which was not required in the first place, there was even less reason to use range(1,3,1) in it. As you can see in the dry run, this skips some of the checks you probably intended.
You can use numpy to significantly simplify this task and improve performance as well. Convert your list into an np.array(), then select the column of interest, Q_2[:,2], and apply numpy's .max() method.
import numpy as np
Q_2 = [[5,0,41],[6,3,5],[7,4,3],[8,5,40]]
Q_2 = np.array(Q_2)
mx2 = Q_2[:,2].max()
which gives the desired output:
print(mx2)
41
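If pulling in numpy is not desirable, the same global comparison can be sketched with Python's built-in max and a key function. This is just an alternative illustration and not part of the answers above; the names max_row and max_value are made up here.

```python
Q_2 = [[5, 0, 41], [6, 3, 5], [7, 4, 3], [8, 5, 40]]

# key tells max() to compare rows by their element at index 2
max_row = max(Q_2, key=lambda row: row[2])
max_value = max_row[2]

print(max_row, max_value)
```

One difference from the explicit loop with >=: when several rows share the highest value, max() returns the first of them, while the loop keeps the last.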

How to import my data into python

I'm currently working on Project Euler 18 which involves a triangle of numbers and finding the value of the maximum path from top to bottom. It says you can do this project either by brute forcing it or by figuring out a trick to it. I think I've figured out the trick, but I can't even begin to solve this because I don't know how to start manipulating this triangle in Python.
https://projecteuler.net/problem=18
Here's a smaller example triangle:
3
7 4
2 4 6
8 5 9 3
In this case, the maximum route would be 3 -> 7 -> 4 -> 9 for a value of 23.
Some approaches I considered:
I've used NumPy quite a lot for other tasks, so I wondered if an array would work. For that 4 number base triangle, I could maybe do a 4x4 array and fill up the rest with zeros, but aside from not knowing how to import the data in that way, it also doesn't seem very efficient. I also considered a list of lists, where each sublist was a row of the triangle, but I don't know how I'd separate out the terms without going through and adding commas after each term.
Just to emphasise, I'm not looking for a method or a solution to the problem, just a way I can start to manipulate the numbers of the triangle in python.
Here is a little snippet that should help you with reading the data:
rows = []
with open('problem-18-data') as f:
    for line in f:
        rows.append([int(i) for i in line.rstrip('\n').split(" ")])
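To try the parsing without a data file, here is a small sketch that reads the example triangle from a string instead. The variable triangle_text is an assumption for illustration. It uses split() without arguments, which also tolerates repeated spaces and trailing whitespace, and it skips blank lines.

```python
# The small example triangle from the question, as a multi-line string
triangle_text = """\
3
7 4
2 4 6
8 5 9 3
"""

# One sublist per row; split() with no argument splits on any whitespace
rows = [
    [int(token) for token in line.split()]
    for line in triangle_text.splitlines()
    if line.strip()
]

print(rows)
print(rows[3][2])  # row index 3, position 2
```

The result is a list of lists, so rows[r][c] addresses the c-th number in row r without padding the triangle with zeros.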

Sum of array diagonal

I'm very new at this and have to do this for a project so keep that in mind.
I need to write a function sumOfDiagonal that has one parameter of type list.
The list is a 4x4 2-dimensional array of integers (4 rows and 4 columns of integers).
The function must return the sum of the integers in the diagonal positions from top right to bottom left.
I have not tried anything because I have no idea where to begin, so would appreciate some guidance.
Since you haven't specified a language (and this is probably classwork anyway), I'll have to provide pseudo-code. Given the 4x4 2d array, the basic idea is to use a loop specifying the index, and use that index to get the correct elements in both dimensions. Say we had the array:
        [][0]  [][1]  [][2]  [][3]
        -----  -----  -----  -----
[0][]     1      2      3      4
[1][]     5      6      7      8
[2][]     9     10     11     12
[3][]    13     14     15     16
and we wanted to sum the top-left-to-bottom-right diagonal (1+6+11+16)(1). That would be something like:
def sumOfDiagonal (arr, sz):
    sum = 0
    for i = 0 to sz - 1 inclusive:
        sum = sum + arr[i][i]
    return sum
That's using the normal means of accessing an array. If, as may be the case given the ambiguity in the question, your array is actually a list of some description (such as a linked list of sixteen elements), you'll just need to adjust how you get the "array" elements.
For example, a 16-element list would need to get nodes 0, 5, 10 and 15 so you could run through the list skipping four nodes after each accumulation.
By way of example, here's some Python code(2) for doing the top-left-to-bottom-right variant, which outputs 34 (1+6+11+16) as expected:
def sumOfDiagonals(arr):
    sum = 0
    for i in range(len(arr)):
        sum += arr[i][i]
    return sum
print(sumOfDiagonals([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]))
(1) To do top right to bottom left simply requires you to change the second term into sz - i - 1.
(2) Python is the ideal pseudo-code language when you want to be able to test your pseudo-code, provided you stay away from its more complex corners :-)
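For the direction the question actually asks about (top right to bottom left), footnote (1) can be sketched in Python as follows. The function name sum_of_anti_diagonal and the variable grid are made up for this example; the only change from the code above is the second index, size - i - 1.

```python
def sum_of_anti_diagonal(arr):
    """Sum the elements from the top-right corner to the bottom-left corner."""
    size = len(arr)
    total = 0
    for i in range(size):
        # row i, counting columns from the right-hand side
        total += arr[i][size - i - 1]
    return total

grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(sum_of_anti_diagonal(grid))  # 4 + 7 + 10 + 13 = 34
```

For this particular sample array both diagonals happen to sum to 34, so a different test array is a better check that the right diagonal is being used.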

Generation of values above diagonal numpy array

Suppose you have a 10x10 numpy array of intensity values extracted from an image. The exact numbers do not matter right now. I would like to take this matrix, and generate vertices for a graph using only the vertex locations of the upper half of the matrix. More specifically, if our matrix dimensions are defined as (MxN), we could possibly write something like this:
for x in range(M1,M2,M3...M10):
    for y in range(N1,N2,N3...N10):
        if (x-y) >=0:
            graph.addVertex(x,y)
The graph class definition and addVertex definition are NOT important to me as I already have them written. I am only concerned about a method in which I can only consider vertices which are above the diagonal. Open to any and all suggestions, my suggestion above is merely a starting point that may possibly be helpful. Thanks!
EDIT: SOLUTION
Sorry if my clarity issues were atrocious, as I'm somewhat new to coding in Python, but this is the solution to my issue:
g=Graph()
L=4
outerindex=np.arange(L**2*L**2).reshape((L**2,L**2))
outerindex=np.triu(outerindex,k=0)
for i in range(len(outerindex)):
    if outerindex.any()>0:
        g.addVertex(i)
In this manner, when adding vertices to our newly formed graph, the only new vertices formed will be those that reside in locations above the main diagonal.
I think what you want is something like this:
import numpy as np
a = np.arange(16).reshape((4,4))
print(a)
for i in range(4):
    for j in range(i, 4):
        print(a[i,j], end=" ")
# [[ 0 1 2 3]
# [ 4 5 6 7]
# [ 8 9 10 11]
# [12 13 14 15]]
# 0 1 2 3 5 6 7 10 11 15
That is, the key point here is to make the index of the inner loop dependent on the outer loop.
If you don't want to include the diagonal, use the inner loop with range(i+1,4).
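As an alternative to the nested loops, numpy can produce the index pairs directly with np.triu_indices. This is a sketch of that variant, not the method used in the answer above; the variable names are chosen for illustration. The k=1 offset excludes the main diagonal.

```python
import numpy as np

a = np.arange(16).reshape((4, 4))

# Row and column indices of the upper triangle, diagonal included
rows, cols = np.triu_indices(4)
upper = a[rows, cols]

# Same, but strictly above the diagonal
strict_rows, strict_cols = np.triu_indices(4, k=1)
strict_upper = a[strict_rows, strict_cols]

# (row, col) pairs, e.g. to pass to a vertex-adding routine
pairs = list(zip(rows.tolist(), cols.tolist()))

print(upper)
print(strict_upper)
print(pairs[:4])
```

The pairs come out in the same row-by-row order as the nested loops, so they can be fed to something like the question's addVertex one at a time.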

Matlab spline function in python

I am in the process of converting some MATLAB code to Python and ran into MATLAB's spline function. I assumed that numpy would have something similar, but all I can find on Google is scipy.interpolate, which has so many options I don't even know where to start. http://docs.scipy.org/doc/scipy/reference/interpolate.html Is there an exact equivalent to the MATLAB spline? Since I need it to run for various cases, there is no single test case, so in the worst case I would need to recode the function, and that would take an unnecessary amount of time.
Thanks
Edit:
So I have tried the examples from the answers so far, but I don't see how they are similar. For example, spline(x,y) in MATLAB returns:
>> spline(x,y)
ans =
form: 'pp'
breaks: [0 1 2 3 4 5 6 7 8 9]
coefs: [9x4 double]
pieces: 9
order: 4
dim: 1
SciPy:
scipy.interpolate.UnivariateSpline
http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.UnivariateSpline.html#scipy.interpolate.UnivariateSpline
Note that it returns an interpolator (a function), not interpolated values. You have to call the resulting function:
from scipy.interpolate import UnivariateSpline
spline = UnivariateSpline(x, y)
yy = spline(xx)
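Regarding the edit in the question: UnivariateSpline applies smoothing by default, so it does not necessarily pass through the data points unless s=0 is given. A possibly closer match to MATLAB's spline(x, y) is scipy.interpolate.CubicSpline, whose default boundary condition is "not-a-knot"; as far as I know that is also what MATLAB uses when x and y have the same length, but treat that correspondence as an assumption and check it against your own data. The sample x and y below are made up to mirror the ten breaks shown in the question.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Made-up sample data with ten breakpoints, 0 through 9
x = np.arange(10)
y = np.sin(x)

cs = CubicSpline(x, y)  # bc_type defaults to "not-a-knot"

breaks = cs.x    # breakpoints, comparable to the pp 'breaks' field
coefs = cs.c.T   # one row per piece, highest power first

# Evaluate the piecewise polynomial on a finer grid
xx = np.linspace(0, 9, 50)
yy = cs(xx)

print(breaks)
print(coefs.shape)
```

Here cs.c has shape (4, 9), so its transpose lines up with the 9x4 coefs array that MATLAB reports, and calling cs(xx) plays the role of ppval.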
