Maximize Function for Adjacent Crops; Garden Optimization Problem - python

Imagine a small garden, divided into 8 equal parts, each a square foot. The garden is 4 ft x 2 ft, so the "bins" are in two rows. Let's number them as:
0 1 2 3
4 5 6 7
We want to arrange different plants in each one. Each plant has some buddies that they like to be near. For example, basil likes to be near tomatoes. I want to find an arrangement for the garden that maximizes the number of positive relationships.
Using python, it's easy to shove the different crops in a list. It's also easy to make a scoring function to find the total score for a particular arrangement. My problem is reducing the problem size. In this setup, there are 8! (40,320) possible permutations, different arrangements of plants in the garden. In the real one I'm trying to solve, I'm using a 16-bin garden, twice the size. That's 16! possible permutations to go through, over 20 trillion. It's taking too long. (I've described the problem here with 8 bins instead of 16 to simplify.)
I've used itertools.permutations to run through all the possible permutations of 8 items. However, it doesn't know enough to skip arrangements that are essentially duplicates. If I rotate a garden arrangement by 180 degrees, it's really the same solution. If I mirror left-to-right or up-and-down, they're also the same solutions. How can I set this up to reduce the total problem set?
In other problems, I've used lookups to check through a list of solutions already checked. With this large number of solutions, that would consume more time than simply going through all of them. Please help me reduce the problem set!
# maximize the number of good relationships in a garden
import itertools

# each crop has 2 items: the name of the crop and a list of all the good friends
crops = []
crops.append(['basil',['tomato','pepper','lettuce']]) # basil likes to be near tomato, pepper or lettuce
crops.append(['strawberry',['beans','lettuce']])
crops.append(['beans',['beet','marigold','cucumber','potato','strawberry','radish']])
crops.append(['beet',['beans']])
crops.append(['cucumber',['lettuce','radish','tomato','dill','marigold']])
crops.append(['marigold',['tomato','cucumber','potato','beans']])
crops.append(['tomato',['cucumber','chives','marigold','basil','dill']])
crops.append(['bok_choy',['dill']])

# 0 1 2 3   This is what the garden looks like, with 8 bins
# 4 5 6 7
mates = [ [0,1], [1,2], [2,3], [4,5], [5,6], [6,7], [0,4], [1,5], [2,6], [3,7] ] # the pairs of bins that directly border one another

def score(c): # a scoring function that returns the number of good relationships
    s = 0
    for pair in mates:
        for j in c[pair[1]][1]:
            if c[pair[0]][0] == j:
                s = s + 1
        for j in c[pair[0]][1]: # and the reverse, 1-0
            if c[pair[1]][0] == j:
                s = s + 1
    return s

scoremax = 0
for x in itertools.permutations(crops, 8):
    s = score(x)
    if s >= scoremax: # show the arrangement
        for i in range(0, 4):
            print(x[i][0] + ' ' * (12 - len(x[i][0])) + x[i+4][0] + ' ' * (12 - len(x[i+4][0]))) # print to screen
        print(s)
        print('')
        if s > scoremax:
            scoremax = s
EDIT: These are the symmetric and rotated arrangements I'm trying to skip. For clarity, I'll use bin numbers instead of the plant name strings.
0 1 2 3 is same when mirrored 3 2 1 0
4 5 6 7 7 6 5 4
0 1 2 3 is same when mirrored 4 5 6 7
4 5 6 7 0 1 2 3
0 1 2 3 is same when rotated 7 6 5 4
4 5 6 7 3 2 1 0
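These three transformations can be written as permutations of the bin indices. A small sketch (the names are just for illustration) also shows that the 180° rotation is the composition of the two mirrors, so there are really only two independent symmetries:

```python
MIRROR_LR = (3, 2, 1, 0, 7, 6, 5, 4)   # left-right mirror
MIRROR_UD = (4, 5, 6, 7, 0, 1, 2, 3)   # up-down mirror
ROTATE_180 = (7, 6, 5, 4, 3, 2, 1, 0)  # 180-degree rotation

def apply(perm, arrangement):
    # bin i of the transformed garden holds whatever was in bin perm[i]
    return tuple(arrangement[p] for p in perm)

g = tuple(range(8))
# the 180-degree rotation is the two mirrors composed, so each arrangement
# has (generically) 4 equivalent forms: identity, LR, UD, and both
assert apply(MIRROR_LR, apply(MIRROR_UD, g)) == apply(ROTATE_180, g)
```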

In general, it is often very difficult to efficiently break symmetries for this kind of problem.
In this case, there seem to be just 2 symmetries:
right to left is the same as left to right
up to down is the same as down to up
We can break both of them by adding three conditions:
the crop at plot 0 must be smaller than the crop at plot 3, the crop at plot 4, and the crop at plot 7
For 'smaller' we can use any measure that gives a strict ordering. In this case we can simply compare the name strings.
The main loop would then look as follows. Once the optimal scoremax is reached, only solutions without a symmetric twin are printed: every possible solution is either printed directly or represented by its canonical form (i.e. mirrored horizontally and/or vertically).
# maximize the number of good relationships in a garden
import itertools

# each crop has 2 items: the name of the crop and a list of all the good friends
crops = []
crops.append(['basil',['tomato','pepper','lettuce']]) # basil likes to be near tomato, pepper or lettuce
crops.append(['strawberry',['beans','lettuce']])
crops.append(['beans',['beet','marigold','cucumber','potato','strawberry','radish']])
crops.append(['beet',['beans']])
crops.append(['cucumber',['lettuce','radish','tomato','dill','marigold']])
crops.append(['marigold',['tomato','cucumber','potato','beans']])
crops.append(['tomato',['cucumber','chives','marigold','basil','dill']])
crops.append(['bok_choy',['dill']])

# 0 1 2 3   This is what the garden looks like, with 8 bins
# 4 5 6 7
mates = [ [0,1], [1,2], [2,3], [4,5], [5,6], [6,7], [0,4], [1,5], [2,6], [3,7] ] # the pairs of bins that directly border one another

def score(c): # a scoring function that returns the number of good relationships
    s = 0
    for pair in mates:
        for j in c[pair[1]][1]:
            if c[pair[0]][0] == j:
                s = s + 1
        for j in c[pair[0]][1]: # and the reverse, 1-0
            if c[pair[1]][0] == j:
                s = s + 1
    return s

scoremax = 0
for x in itertools.permutations(crops, 8):
    if x[0][0] < x[3][0] and x[0][0] < x[4][0] and x[0][0] < x[7][0]:
        s = score(x)
        if s >= scoremax: # show the arrangement
            for i in range(0, 4):
                print(x[i][0] + ' ' * (12 - len(x[i][0])) + x[i+4][0] + ' ' * (12 - len(x[i+4][0]))) # print to screen
            print(s)
            print('')
            if s > scoremax:
                scoremax = s
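Independent of the symmetry breaking, the scoring itself can be sped up by precomputing the friendships as a set of (name, friend) pairs, replacing the two inner loops with O(1) membership tests. A sketch on a trimmed crop list (score_fast is a hypothetical name):

```python
crops = [
    ['basil', ['tomato', 'pepper', 'lettuce']],
    ['tomato', ['cucumber', 'chives', 'marigold', 'basil', 'dill']],
    ['beet', ['beans']],
    ['beans', ['beet', 'marigold']],
]
mates = [[0, 1], [2, 3], [0, 2], [1, 3]]  # a small 2 x 2 garden

# build the lookup set once, outside the permutation loop
friends = {(name, f) for name, flist in crops for f in flist}

def score_fast(c):
    s = 0
    for a, b in mates:
        s += (c[a][0], c[b][0]) in friends  # True counts as 1
        s += (c[b][0], c[a][0]) in friends
    return s

print(score_fast(crops))  # basil-tomato and beet-beans count in both directions: 4
```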


Assign list as new columns based on a condition

I have a dataframe df that looks like this:
ID Sequence
0 A->A
1 C->C->A
2 C->B->A
3 B->A
4 A->C->A
5 A->C->C
6 A->C
7 A->C->C
8 B->B
9 C->C
and so on ....
I want to create a column called 'Outcome', which is binomial in nature.
Its value essentially depends on three lists that I am generating below:
Whenever 'A' occurs in a sequence, the probability of "Outcome" being 1 is 2%
Whenever 'B' occurs in a sequence, the probability of "Outcome" being 1 is 6%
Whenever 'C' occurs in a sequence, the probability of "Outcome" being 1 is 1%
so here is the code which generates these 3 lists (bi_A, bi_B, bi_C) -
A = 0.02
B = 0.06
C = 0.01
count_A = 0
count_B = 0
count_C = 0
for i in range(0, len(df)):
    if 'A' in df.sequence[i]:
        count_A += 1
    if 'B' in df.sequence[i]:
        count_B += 1
    if 'C' in df.sequence[i]:
        count_C += 1
bi_A = np.random.binomial(1, A, count_A)
bi_B = np.random.binomial(1, B, count_B)
bi_C = np.random.binomial(1, C, count_C)
What I am trying to do is combine these 3 lists into an "Outcome" column so that the probability of Outcome being 1 when "A" is in the sequence is 2%, and so on. How do I solve this? As I understand it there would be data overlap, where bi_A says one sequence is 0 but bi_B says it's 1, so how would we resolve that?
End data should look like -
ID Sequence Output
0 A->A 0
1 C->C->A 1
2 C->B->A 0
3 B->A 0
4 A->C->A 0
5 A->C->C 1
6 A->C 0
7 A->C->C 0
8 B->B 0
9 C->C 0
and so on ....
Such that when I compute the probability of Outcome = 1 when A is in the string, it should be 2%.
EDIT -
you can generate the sequence data using this code-
import pandas as pd
import itertools
import numpy as np
import random

alphabets = ['A', 'B', 'C']
combinations = []
for i in range(1, len(alphabets) + 1):
    combinations.append(['->'.join(i) for i in itertools.product(alphabets, repeat=i)])
combinations = sum(combinations, [])
weights = np.random.normal(100, 30, len(combinations))
weights /= sum(weights)
weights = weights.tolist()
#weights=np.random.dirichlet(np.ones(len(combinations))*1000.,size=1)
'''n = len(combinations)
weights = [random.random() for _ in range(n)]
sum_weights = sum(weights)
weights = [w/sum_weights for w in weights]'''
df = pd.DataFrame(random.choices(population=combinations, weights=weights, k=10000),
                  columns=['sequence'])
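One way to resolve the overlap (a sketch of one possible interpretation, not the only one) is to draw a single binomial per row, combining the letters present as independent risk factors. Note this makes the per-letter rates inputs to a combined probability rather than exact marginal frequencies, so rows containing 'A' will only approximately hit 2% when other letters co-occur; row_probability is a hypothetical helper:

```python
import numpy as np
import pandas as pd

probs = {'A': 0.02, 'B': 0.06, 'C': 0.01}

def row_probability(seq):
    # treat the letters present as independent risk factors:
    # P(Outcome = 1) = 1 - product over present letters of (1 - p_letter)
    p_miss = 1.0
    for letter, p in probs.items():
        if letter in seq:
            p_miss *= 1.0 - p
    return 1.0 - p_miss

df = pd.DataFrame({'sequence': ['A->A', 'C->C->A', 'C->B->A', 'B->A']})
rng = np.random.default_rng(0)
df['Outcome'] = [rng.binomial(1, row_probability(s)) for s in df['sequence']]
```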

Running perfectly in my IDE, but the line "if mat[j] == mat[colindex]:" gives an index out of range error when submitting on GeeksforGeeks

t = int(input())
lis = []
for i in range(t):
    col = list(map(int, input()))
    colindex = col[0] - 1
    count = 0
    matsize = col[0] * col[0]
    mat = list(map(int, input().split()))
    while len(lis) != matsize:
        for j in range(len(mat)):
            if colindex < len(mat):
                if mat[j] == mat[colindex]:
                    lis.append(mat[j])
                    colindex += col[0]
        count += 1
        colindex = col[0] - 1
        colindex -= count
    for i in lis:
        print(i, end=' ')
Given a square matrix mat[][] of size N x N. The task is to rotate it by 90 degrees in anti-clockwise direction without using any extra space.
Input:
The first line of input contains a single integer T denoting the number of test cases. Then T test cases follow. Each test case consists of two lines. The first line of each test case consists of an integer N, where N is the size of the square matrix. The second line of each test case contains N x N space-separated values of the matrix mat.
Output:
Corresponding to each test case, in a new line, print the rotated array.
Constraints:
1 ≤ T ≤ 50
1 ≤ N ≤ 50
1 <= mat[][] <= 100
Example:
Input:
2
3
1 2 3 4 5 6 7 8 9
2
5 7 10 9
Output:
3 6 9 2 5 8 1 4 7
7 9 5 10
Explanation:
Testcase 1: Matrix is as below:
1 2 3
4 5 6
7 8 9
Rotating it by 90 degrees in anticlockwise directions will result as below matrix:
3 6 9
2 5 8
1 4 7
https://practice.geeksforgeeks.org/problems/rotate-by-90-degree/0
It doesn't look like there is a problem with j. Can colindex ever be below 0? One way to identify this would be to simply keep track of the counters. For example, you can add an extra condition if colindex >= 0: before if mat[j] == mat[colindex]:.
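A quick sketch of that check (safe_lookup is a hypothetical helper) shows why a very negative colindex is dangerous: Python accepts negative indices down to -len(mat), but anything below that raises IndexError just like an index at or above len(mat):

```python
mat = [5, 7, 10, 9]

def safe_lookup(seq, idx):
    # index only when idx is in the valid non-negative range
    if 0 <= idx < len(seq):
        return seq[idx]
    return None

print(safe_lookup(mat, 2))   # in range: returns 10
print(safe_lookup(mat, -6))  # -6 < -len(mat): mat[-6] directly would raise IndexError
```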
Rather than using a one-dimensional list, we can use a two-dimensional list to solve this challenge. From the given statement and sample test case, we get the following information:
Print the rotated matrix in a single line.
If the given matrix has n columns, the rotated matrix will have the sequential elements of the (n-1)th column, (n-2)th column, ..., 0th column.
Here is my accepted solution of this challenge:
def get_rotated_matrix(ar, n):
    ar_2d = []
    for i in range(0, len(ar) - n + 1, n):
        ar_2d.append(ar[i:i + n])
    result = []
    for i in range(n - 1, -1, -1):
        for j in range(n):
            result.append(str(ar_2d[j][i]))
    return result

cas = int(input())
for t in range(cas):
    n = int(input())
    ar = list(map(int, input().split()))
    result = get_rotated_matrix(ar, n)
    print(" ".join(result))
Explanation:
To keep the solution simple, I created a two-dimensional list, ar_2d, to store the input data as a 2D matrix.
Then I traversed the matrix column-wise, from the last column to the first, and appended the values to the result list as strings.
Finally, I printed the result with spaces between elements using the join method.
Disclaimer:
My solution uses an extra 1D list to store the rotated matrix elements and thus uses extra space.
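For comparison, the same anticlockwise rotation can be sketched more compactly with zip, which transposes the 2D list; taking the transposed rows in reverse order yields exactly the columns n-1 .. 0 (rotate_anticlockwise is a hypothetical name):

```python
def rotate_anticlockwise(ar, n):
    # rebuild the flat input as an n x n 2D list
    ar_2d = [ar[i:i + n] for i in range(0, n * n, n)]
    # zip(*ar_2d) transposes; reversing the rows gives columns n-1 .. 0
    rotated = list(zip(*ar_2d))[::-1]
    # flatten again for the single-line output format
    return [v for row in rotated for v in row]

print(rotate_anticlockwise([1, 2, 3, 4, 5, 6, 7, 8, 9], 3))  # [3, 6, 9, 2, 5, 8, 1, 4, 7]
```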

Join elements by iterating through the data

I have some data in the form:
ID A B VALUE EXPECTED RESULT
1 1 2 5 GROUP1
2 2 3 5 GROUP1
3 3 4 6 GROUP2
4 3 5 5 GROUP1
5 6 4 5 GROUP3
What I want to do is iterate through the data (thousands of rows) and create a common field so I will be able to join the data easily (A -> start node, B -> end node, Value -> order... the data form something like a chain where only neighbors share a common A or B).
Rules for joining:
equal Value for all elements of a group
A of element one equal to B of element two (or the opposite, but NOT A=A' or B=B')
the most difficult one: assign to the same group all sequential data that form a series of intersecting nodes
That is, the first element [1 1 2 5] has to be joined with [2 2 3 5] and then with [4 3 5 5].
Any idea how to accomplish this robustly when iterating through a large amount of data? I have a problem with rule number 3; the others are easily applied. For limited data I have some success, but this depends on the order in which I start examining the data, and it doesn't work for the large dataset.
I can use ArcPy (preferably) or even Python or R or MATLAB to solve this. I have tried arcpy with no success, so I am checking alternatives.
In ArcPy this code works, but only to a limited extent (i.e. in large features with many segments I get 3-4 groups instead of 1):
import arcpy

TheShapefile = "c:/Temp/temp.shp"
desc = arcpy.Describe(TheShapefile)
flds = desc.fields
fldin = 'no'
for fld in flds:  # check whether the 'new' field exists
    if fld.name == 'new':
        fldin = 'yes'
if fldin != 'yes':  # if not, create it
    arcpy.AddField_management(TheShapefile, "new", "SHORT")
arcpy.CalculateField_management(TheShapefile, "new", '!FID!', "PYTHON_9.3")  # copy FID to new
with arcpy.da.SearchCursor(TheShapefile, ["FID", "NODE_A", "NODE_B", "ORDER_", "new"]) as TheSearch:
    for SearchRow in TheSearch:
        if SearchRow[1] == SearchRow[4]:
            Outer_FID = SearchRow[0]
        else:
            Outer_FID = SearchRow[4]
        Outer_NODEA = SearchRow[1]
        Outer_NODEB = SearchRow[2]
        Outer_ORDER = SearchRow[3]
        Outer_NEW = SearchRow[4]
        with arcpy.da.UpdateCursor(TheShapefile, ["FID", "NODE_A", "NODE_B", "ORDER_", "new"]) as TheUpdate:
            for UpdateRow in TheUpdate:
                Inner_FID = UpdateRow[0]
                Inner_NODEA = UpdateRow[1]
                Inner_NODEB = UpdateRow[2]
                Inner_ORDER = UpdateRow[3]
                if Inner_ORDER == Outer_ORDER and (Inner_NODEA == Outer_NODEB or Inner_NODEB == Outer_NODEA):
                    UpdateRow[4] = Outer_FID
                    TheUpdate.updateRow(UpdateRow)
And some data in shapefile form and dbf form
Using MATLAB:
A = [1 1 2 5
     2 2 3 5
     3 3 4 6
     4 3 5 5
     5 6 4 5]
%% Initialization
% index of the matrix lines sharing the same group
ind = 1
% length of the index
len = length(ind)
% the group array
g = []
% group counter
c = 1
% Start the small algorithm
while 1
    % Check if another line with the same "Value" shares a common node
    ind = find(any(ismember(A(:,2:3), A(ind,2:3)) & A(:,4) == A(ind(end),4), 2));
    % If there is no new line, we create a group with the discovered lines
    if length(ind) == len
        % group assignment
        g(A(ind,1)) = c
        c = c+1
        % delete the already discovered lines (or nodes...)
        A(ind,:) = []
        % break if no more nodes remain
        if isempty(A)
            break
        end
        % reset the index for the next group
        ind = 1;
    end
    len = length(ind);
end
And here is the output:
g =
1 1 2 1 3
As expected
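The same grouping can be sketched in Python with union-find (disjoint sets): every pair of rows that satisfies rules 1 and 2 is merged, and chains of merges implement rule 3 automatically, regardless of the order in which the rows are examined. The tiny dataset and helper names are just for illustration:

```python
# rows of (ID, A, B, VALUE) from the example data
rows = [(1, 1, 2, 5), (2, 2, 3, 5), (3, 3, 4, 6), (4, 3, 5, 5), (5, 6, 4, 5)]

parent = {r[0]: r[0] for r in rows}

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x

def union(x, y):
    parent[find(x)] = find(y)

# merge every pair with equal VALUE sharing A->B or B->A (rules 1 and 2);
# transitive chains of merges give rule 3 for free
for i1, a1, b1, v1 in rows:
    for i2, a2, b2, v2 in rows:
        if i1 < i2 and v1 == v2 and (a1 == b2 or b1 == a2):
            union(i1, i2)

# number the groups in order of first appearance
groups, labels = {}, {}
for rid, *_ in rows:
    labels[rid] = groups.setdefault(find(rid), len(groups) + 1)

print(labels)  # {1: 1, 2: 1, 3: 2, 4: 1, 5: 3} -> GROUP1/GROUP2/GROUP3
```

For thousands of rows the O(n²) pair scan can be replaced by indexing the rows by their node values, but the union-find grouping itself stays the same.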

Finding items in an image which have the same intensity

How can I find all the points (or the region) in an image with 3 dimensions, where the first two dimensions give the resolution and the third gives the density? I can use MATLAB or Python. I wonder if there is a native function for finding those points that is least computationally expensive.
UPDATE:
Imagine I have the following:
A= [1,2,3; 4,6,6; 7,6,6]
A =
1 2 3
4 6 6
7 6 6
>> B=[7,8,9; 10,11,11; 1, 11,11]
B =
7 8 9
10 11 11
1 11 11
>> C=[0,1,2; 3, 7, 7; 5,7,7]
C =
0 1 2
3 7 7
5 7 7
How can I find the lower-right square in which all the values of A are equal, all the values of B are equal, and all the values of C are equal? If this is too much, how can I find the lower square in A wherein all the values in A are equal?
*The shown values are the intensity of the image.
UPDATE: I tried the provided answer and got this error:
>> c=conv2(M,T, 'full');
Warning: CONV2 on values of class UINT8 is obsolete.
Use CONV2(DOUBLE(A),DOUBLE(B)) or CONV2(SINGLE(A),SINGLE(B)) instead.
> In uint8/conv2 (line 10)
Undefined function 'conv2' for input arguments of type 'double' and attributes 'full 3d real'.
Error in uint8/conv2 (line 17)
y = conv2(varargin{:});
*Also tried convn and it took forever so I just stopped it!
Basically how to do this for a 2D array as described above?
A possible solution:
A = [1,2,3; 4,6,6; 7,6,6];
B = [7,8,9; 10,11,11; 1, 11,11];
C = [0,1,2; 3, 7, 7; 5,7,7];
%create a 3D array
D = cat(3,A,B,C)
%reshape the 3D array to 2D
%its columns represent the third dimension
%and its rows represent resolution
E = reshape(D,[],size(D,3));
%third output of the unique function applied row-wise to the data
%represents the label of each pixel a [m*n, 1] vector created
[~,~,F] = unique(E,'rows');
%reshape the vector to a [m, n] matrix of labels
result = reshape(F, size(D,1), size(D,2));
You can reshape the 3D matrix to a 2D matrix (E) whose columns represent the third dimension and whose rows represent the resolution.
Then, using the unique function, you can label the image.
We have a 3D matrix:
A =
1 2 3
4 6 6
7 6 6
B =
7 8 9
10 11 11
1 11 11
C =
0 1 2
3 7 7
5 7 7
When we reshape the 3D matrix to a 2D matrix E we get:
E =
1 7 0
4 10 3
7 1 5
2 8 1
6 11 7
6 11 7
3 9 2
6 11 7
6 11 7
So we need to classify the rows based on their values.
The unique function is capable of extracting unique rows and assigning the same label to rows that are equal to each other.
Here the variable F captures the third output of the unique function, which is the label of each row.
F =
1
4
6
2
5
5
3
5
5
which should be reshaped to 2D:
result =
1 2 3
4 5 5
6 5 5
so each region has a different label.
If you want to segment distinct regions (based on both their values and their spatial positions) you need to label the image in a loop:
numcolors = max(F);
N = 0;
segment = zeros(size(result));
for c = 1 : numcolors
    [label, n] = bwlabel(result == c);
    segment = segment + label + logical(label)*N;
    N = N + n;
end
Here you need to mark disconnected regions that have the same value with different labels. Since MATLAB doesn't have a function for gray-level segmentation, you can apply bwlabel multiple times, adding the result of the previous iteration to the result of the current iteration. The segment variable contains the segmented image.
*Note: this result was obtained from GNU Octave, whose labeling differs from MATLAB's. If you use unique(E,'rows','last'); the results of MATLAB and Octave will be the same.
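For readers working in Python, the same labeling can be sketched with NumPy's unique, whose return_inverse output plays the role of the third output of MATLAB's unique. Note that NumPy reshapes row-major while MATLAB reshapes column-major, so the pixel order differs, but equal pixels still get equal labels (for this example the labels happen to come out the same as above):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 6, 6], [7, 6, 6]])
B = np.array([[7, 8, 9], [10, 11, 11], [1, 11, 11]])
C = np.array([[0, 1, 2], [3, 7, 7], [5, 7, 7]])

D = np.stack([A, B, C], axis=2)   # m x n x 3 stack of channels
E = D.reshape(-1, D.shape[2])     # one row per pixel
_, F = np.unique(E, axis=0, return_inverse=True)
result = np.asarray(F).reshape(D.shape[0], D.shape[1]) + 1  # 1-based labels

print(result)
```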
You can use a pair of horizontal and vertical 1D filters, where the horizontal filter has a kernel of [1 -1] and the vertical filter has a kernel of [1; -1]. The effect is to take the pairwise difference between neighbouring elements along each dimension separately, for each channel. You then perform image filtering or convolution with these two kernels, replicating the borders. Regions that map to 0 in both results are uniform in each channel independently.
To find them, first take the logical opposite of both filtering results, so that uniform regions that were 0 become 1 and vice versa. Then take the logical AND of the two results, and finally require that, for each spatial location, the value is true across all channels. A location that passes means every channel sees the same uniformity there.
In MATLAB, assuming you have the Image Processing Toolbox, use imfilter to filter the image, then use all to check across the channels of the two filtering results, and then use regionprops to find the coordinates of the regions you seek. So do something like this:
%# Reproducing your data
A = [1,2,3; 4,6,6; 7,6,6];
B = [7,8,9; 10,11,11; 1, 11,11];
C = [0,1,2; 3, 7, 7; 5,7,7];
%# Create a 3D matrix to allow for efficient filtering
D = cat(3, A, B, C);
%# Filter using the kernels
ker = [1 -1];
ker2 = ker.'; %# the vertical kernel is the transpose
out = imfilter(D, ker, 'replicate');
out2 = imfilter(D, ker2, 'replicate');
%# Find uniform regions
regions = all(~out & ~out2, 3);
%# Determine the locations of the uniform areas
R = regionprops(regions, 'BoundingBox');
%# Round to ensure pixel accuracy and reshape into a matrix
coords = round(reshape([R.BoundingBox], 4, [])).';
coords would be an N x 4 matrix, with each row giving the upper-left coordinates of a bounding box together with its width and height. The first and second elements in a row are the column and row coordinates, while the third and fourth elements are the width and height of the bounding box.
The regions we have detected can be found in the regions variable. Both of these show:
>> regions
regions =
3×3 logical array
0 0 0
0 1 1
0 1 1
>> coords
coords =
2 2 2 2
This tells us that we have localised the region of "uniformity" to the bottom-right corner: the top-left corner of the bounding box is at row 2, column 2, with a width and height of 2 each.
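The same uniformity test can be sketched in NumPy, where a forward difference with the last row/column re-appended stands in for imfilter with a [1 -1] kernel and replicated borders (an assumed equivalence for this example, not a general drop-in):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 6, 6], [7, 6, 6]])
B = np.array([[7, 8, 9], [10, 11, 11], [1, 11, 11]])
C = np.array([[0, 1, 2], [3, 7, 7], [5, 7, 7]])
D = np.stack([A, B, C], axis=2)

# forward differences toward the right/bottom neighbour; re-appending the
# last column/row makes the border differences zero, like a replicated border
dh = np.diff(D, axis=1, append=D[:, -1:, :])
dv = np.diff(D, axis=0, append=D[-1:, :, :])

# a pixel is uniform only if both differences vanish in every channel
regions = np.all((dh == 0) & (dv == 0), axis=2)
print(regions)
```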
Check out scipy.signal.correlate2d: https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.signal.correlate2d.html
2D correlation basically "slides" one image across the other and adds up the dot product of the overlap at each offset.
More reading: http://www.cs.umd.edu/~djacobs/CMSC426/Convolution.pdf
https://en.wikipedia.org/wiki/Two-dimensional_correlation_analysis
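A minimal example of that sliding dot product (the values here are chosen arbitrarily): with mode='valid' the kernel only sits fully inside the image, so the single output element is the dot product of the overlap.

```python
import numpy as np
from scipy.signal import correlate2d

image = np.array([[1, 2],
                  [3, 4]])
kernel = np.array([[1, 0],
                   [0, 1]])

# mode='valid' keeps only positions where the kernel fits entirely inside
# the image; here that is one position, whose value is the overlap's
# dot product: 1*1 + 2*0 + 3*0 + 4*1 = 5
out = correlate2d(image, kernel, mode='valid')
print(out)  # [[5]]
```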

finding best combination of date sets - given some constraints

I am looking for the right approach to solve the following task (using Python):
I have a dataset which is a 2D matrix. Lets say:
1 2 3
5 4 7
8 3 9
0 7 2
From each row I need to pick one number which is not 0 (I can also make it NaN if that's easier).
I need to find the combination with the lowest total sum.
So far, so easy: I take the lowest value of each row.
The solution would be:
1 x x
x 4 x
x 3 x
x x 2
Sum: 10
But: there is a variable minimum and maximum sum allowed for each column, so just choosing the minimum of each row may lead to an invalid combination.
Let's say the minimum is defined as 2 in this example, and no maximum is defined. Then the solution would be:
1 x x
5 x x
x 3 x
x x 2
Sum: 11
I need to choose 5 in row two, as otherwise column one would be below the minimum (2).
I could use brute force and test all possible combinations, but due to the amount of data to analyze (the number of datasets, not the size of each dataset) that's not possible.
Is this a common problem with a known mathematical/statistical or other solution?
Thanks
Robert
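For the small example above, the constraint logic can at least be sketched with brute force (assuming the column minimum applies to every column's sum of picked values); for the real data size this is exactly a per-row assignment problem with side constraints, which is the territory of integer linear programming solvers:

```python
from itertools import product

matrix = [[1, 2, 3],
          [5, 4, 7],
          [8, 3, 9],
          [0, 7, 2]]
col_min = [2, 2, 2]  # assumed: each column's sum of picked values must reach 2

best_total, best_choice = None, None
for choice in product(range(3), repeat=len(matrix)):
    picks = [matrix[r][c] for r, c in enumerate(choice)]
    if 0 in picks:
        continue  # zeros may not be picked
    col_sums = [0, 0, 0]
    for r, c in enumerate(choice):
        col_sums[c] += matrix[r][c]
    if any(s < m for s, m in zip(col_sums, col_min)):
        continue  # violates a column minimum
    total = sum(picks)
    if best_total is None or total < best_total:
        best_total, best_choice = total, choice

print(best_total, best_choice)  # 11, picking columns (0, 0, 1, 2) row by row
```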
