How can I find all the points (or the region) in a 3-dimensional image, where the first two dimensions are the spatial resolution and the third is the density (intensity)? I can use MATLAB or Python. I wonder if there is a native function for finding those points that is the least computationally expensive.
UPDATE:
Imagine I have the following:
A= [1,2,3; 4,6,6; 7,6,6]
A =
1 2 3
4 6 6
7 6 6
>> B=[7,8,9; 10,11,11; 1, 11,11]
B =
7 8 9
10 11 11
1 11 11
>> C=[0,1,2; 3, 7, 7; 5,7,7]
C =
0 1 2
3 7 7
5 7 7
How can I find the lower-right square in which all the values of A are equal to each other, all the values of B are equal to each other, and all the values of C are equal to each other? If this is too much, how can I find the lower-right square in A wherein all the values of A are equal?
*The shown values are the intensity of the image.
UPDATE: tried the provided answer and got this error:
>> c=conv2(M,T, 'full');
Warning: CONV2 on values of class UINT8 is obsolete.
Use CONV2(DOUBLE(A),DOUBLE(B)) or CONV2(SINGLE(A),SINGLE(B)) instead.
> In uint8/conv2 (line 10)
Undefined function 'conv2' for input arguments of type 'double' and attributes 'full 3d real'.
Error in uint8/conv2 (line 17)
y = conv2(varargin{:});
*Also tried convn, but it took forever, so I just stopped it!
Basically, how do I do this for a 2D array as described above?
A possible solution:
A = [1,2,3; 4,6,6; 7,6,6];
B = [7,8,9; 10,11,11; 1, 11,11];
C = [0,1,2; 3, 7, 7; 5,7,7];
%create a 3D array
D = cat(3,A,B,C)
%reshape the 3D array to 2D
%its columns represent the third dimension
%and its rows represent resolution
E = reshape(D,[],size(D,3));
%the third output of unique, applied row-wise to the data,
%is the label of each pixel: an [m*n, 1] vector
[~,~,F] = unique(E,'rows');
%reshape the vector to a [m, n] matrix of labels
result = reshape(F, size(D,1), size(D,2));
You can reshape the 3D matrix to a 2D matrix (E) whose columns represent the third dimension and whose rows represent the resolution (one row per pixel).
Then, using the unique function, you can label the image.
We have a 3D matrix:
A =
1 2 3
4 6 6
7 6 6
B =
7 8 9
10 11 11
1 11 11
C =
0 1 2
3 7 7
5 7 7
When we reshape the 3D matrix to a 2D matrix E we get:
E =
1 7 0
4 10 3
7 1 5
2 8 1
6 11 7
6 11 7
3 9 2
6 11 7
6 11 7
So we need to classify the rows based on their values.
The unique function can extract the unique rows and assign the same label to rows that are equal to each other.
Here the variable F captures the third output of unique, which is the label of each row.
F =
1
4
6
2
5
5
3
5
5
which is then reshaped back to a 2D matrix of labels:
result =
1 2 3
4 5 5
6 5 5
so each region has a different label.
If you want to segment distinct regions (based on both their values and their spatial positions), you need to label the image in a loop:
numcolors = max(F);
N = 0;
segment = zeros(size(result));
for c = 1 : numcolors
    % label the connected components of the current colour on their own
    [label, n] = bwlabel(result == c);
    % offset the new labels so they do not clash with labels already assigned
    segment = segment + label + logical(label) * N;
    N = N + n;
end
So here you need to mark disconnected regions that have the same values with different labels. Since MATLAB doesn't have a function for gray-level segmentation, you can call the bwlabel function once per label value and add the result of the previous iteration to the result of the current iteration. The segment variable contains the segmented image.
*Note: this result was obtained from GNU Octave, whose labeling order differs from MATLAB's. If you use unique(E,'rows','last'); the results of MATLAB and Octave will be the same.
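Since the question allows Python too, here is a rough numpy equivalent of the same reshape + unique labelling (a sketch; for this 3x3 example it happens to reproduce the label matrix above):
import numpy as np

A = np.array([[1, 2, 3], [4, 6, 6], [7, 6, 6]])
B = np.array([[7, 8, 9], [10, 11, 11], [1, 11, 11]])
C = np.array([[0, 1, 2], [3, 7, 7], [5, 7, 7]])
D = np.stack([A, B, C], axis=2)   # shape (rows, cols, channels)

E = D.reshape(-1, D.shape[2])     # one row per pixel
# return_inverse assigns each pixel the index of its unique colour row
_, F = np.unique(E, axis=0, return_inverse=True)
result = F.reshape(D.shape[0], D.shape[1]) + 1   # 1-based labels
print(result)
# [[1 2 3]
#  [4 5 5]
#  [6 5 5]]
Keep in mind that numpy reshapes row-major while MATLAB reshapes column-major, so the label numbers are generally assigned in a different order even though the regions themselves are identical.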
You can use a pair of horizontal and vertical 1D filters where the horizontal filter has a kernel of [1 -1] and the vertical filter has a kernel of [1; -1]. The effect is that each kernel takes pairwise differences between neighbouring elements along its own dimension. You can then perform image filtering or convolution with these two kernels, making sure to replicate the borders. Regions that are uniform in every channel are exactly those where both filter responses are 0 across all channels.
To find them, first take the logical negation of both filtering results, so that uniform regions that would be 0 become 1 and vice versa. Then take the logical AND of the two results, and finally require that, for each pixel, all values along the third dimension are true. This means that at such a spatial location, every channel exhibits the same uniformity, as you expect.
In MATLAB, assuming you have the Image Processing Toolbox, use imfilter to filter the image, all to collapse the combined filtering results along the third dimension, and regionprops to find the coordinates of the regions you seek. So, do something like this:
%# Reproducing your data
A = [1,2,3; 4,6,6; 7,6,6];
B = [7,8,9; 10,11,11; 1, 11,11];
C = [0,1,2; 3, 7, 7; 5,7,7];
%# Create a 3D matrix to allow for efficient filtering
D = cat(3, A, B, C);
%# Filter using the kernels
ker = [1 -1];
ker2 = ker.'; %# Vertical kernel: the transpose of the horizontal one
out = imfilter(D, ker, 'replicate');
out2 = imfilter(D, ker2, 'replicate');
%# Find uniform regions
regions = all(~out & ~out2, 3);
%# Determine the locations of the uniform areas
R = regionprops(regions, 'BoundingBox');
%# Round to ensure pixel accuracy and reshape into a matrix
coords = round(reshape([R.BoundingBox], 4, [])).';
coords would be an N x 4 matrix in which each row gives the upper-left corner of a bounding box together with its width and height. The first and second elements of a row are the column and row coordinates, while the third and fourth elements are the width and height of the bounding box.
The regions we have detected can be found in the regions variable. Both of these show:
>> regions
regions =
3×3 logical array
0 0 0
0 1 1
0 1 1
>> coords
coords =
2 2 2 2
This tells us that we have localised the region of "uniformity" to the bottom-right corner: the top-left corner of the bounding box is at row 2, column 2, and its width and height are 2 and 2 respectively.
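If you would rather stay in Python, the same idea can be sketched with numpy's diff, where padding with the edge value plays the role of the replicated border (a rough sketch, not a drop-in port of imfilter):
import numpy as np

A = np.array([[1, 2, 3], [4, 6, 6], [7, 6, 6]])
B = np.array([[7, 8, 9], [10, 11, 11], [1, 11, 11]])
C = np.array([[0, 1, 2], [3, 7, 7], [5, 7, 7]])
D = np.stack([A, B, C], axis=2)   # shape (rows, cols, channels)

# forward differences along each spatial axis; appending the edge value
# makes the last difference 0, mimicking the replicated border
dh = np.diff(D, axis=1, append=D[:, -1:, :])
dv = np.diff(D, axis=0, append=D[-1:, :, :])

# a pixel is uniform when both differences vanish in every channel
regions = np.all((dh == 0) & (dv == 0), axis=2)
print(regions.astype(int))
# [[0 0 0]
#  [0 1 1]
#  [0 1 1]]
From here, scipy.ndimage.label and scipy.ndimage.find_objects can play the role of regionprops.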
Check out scipy.signal.correlate2d: https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.signal.correlate2d.html
2D correlation basically "slides" one image across the other and sums the element-wise products over the overlapping region at each offset.
more reading: http://www.cs.umd.edu/~djacobs/CMSC426/Convolution.pdf
https://en.wikipedia.org/wiki/Two-dimensional_correlation_analysis
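For instance, here is a minimal sketch using matrix A from the question and a hypothetical 2x2 template of sixes (the uniform patch we want to locate); note that raw correlation favours bright areas, so in practice you would normalise the response:
import numpy as np
from scipy.signal import correlate2d

A = np.array([[1, 2, 3], [4, 6, 6], [7, 6, 6]])
template = np.full((2, 2), 6)

# 'valid' keeps only offsets where the template fully overlaps the image
response = correlate2d(A, template, mode='valid')
print(response)
# [[ 78 102]
#  [138 144]]

# the strongest response marks the best match (the bottom-right 2x2 block)
print(np.unravel_index(np.argmax(response), response.shape))   # (1, 1)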
Related
I'm trying to mark regions of an image array (224x224) to be ignored based on the value of a segmentation network's class mask (16x16). Previous processing means that unwanted regions will be labelled as -1. I want to be able to set the value of all regions of the image where the class mask reports -1 to some nonsense value (say, 999), but maintain the shape of the array (224x224). A concise example of what I mean is below, using a 4x4 image and 2x2 mask.
# prefilter
image = 1 2 3 4
        5 6 7 8
        9 1 2 3
        4 5 6 7
mask = -1 4
       -1 5
# postfilter
image_filtered = 999 999 3 4
                 999 999 7 8
                 999 999 2 3
                 999 999 6 7
Is there an efficient way to do this modification?
Here's some code I got to work. It does require the mask and the image to have the same aspect ratio and to have sizes that are integer multiples of each other.
import numpy as np
image = np.array([[1,2,3,4],
                  [5,6,7,8],
                  [9,1,2,3],
                  [4,5,6,7]])
mask = np.array([[-1,4],
                 [-1,5]])
def mask_image(image, mask, masked_value):
    # how much the mask needs to be scaled up by to match the image's size
    scale_factor = image.shape[0] // mask.shape[0]
    # upscale the mask by expanding each entry into a scale_factor x scale_factor block
    resized_mask = np.kron(mask, np.ones((scale_factor, scale_factor)))
    # where the mask == -1, return the masked value, else the original pixel
    return np.where(resized_mask == -1, masked_value, image)
print(mask_image(image, mask, 999))
I used np.kron to resize an array after seeing this answer.
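For a feel of what the "magic" does: np.kron(mask, np.ones((k, k))) expands every mask entry into a k x k block, which is exactly the nearest-neighbour upscaling needed here. A tiny demonstration:
import numpy as np

mask = np.array([[-1, 4],
                 [-1, 5]])
print(np.kron(mask, np.ones((2, 2), dtype=int)))
# [[-1 -1  4  4]
#  [-1 -1  4  4]
#  [-1 -1  5  5]
#  [-1 -1  5  5]]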
This function is extremely fast, it took ~2 sec to mask a 1920x1080 image with a 1920x1080 mask.
EDIT: Use what @Jérôme Richard said in their comment.
above25percentile=df.loc[df["order_amount"]>np.percentile(df["order_amount"],25)]
below75percentile=df.loc[df["order_amount"]<np.percentile(df["order_amount"],75)]
interquartile=above25percentile & below75percentile
print(interquartile.mean())
Can't seem to get the mean here. Any thoughts?
You attempt to compute interquartile as a boolean mask with the & operator, but its operands are Series containing the actual order amounts from those ranges, not booleans. While the two Series are likely to be of similar size, & will not give you an intersection of their indices. And even if they were boolean masks, with your subsequent usage you would be taking the mean of a bunch of zeros and ones, which would come out around 0.5 (in fact, the fraction of the data that falls within the IQR).
First, compute interquartile as a proper mask. Pandas has its own quantile method, which, like np.percentile and siblings, accepts multiple percentiles simultaneously. You can combine that with between to get your mask more efficiently:
interquartile = df['order_amount'].between(*df['order_amount'].quantile([0.25, 0.75]))
You can apply the mask to the column and take the mean like this:
df.loc[interquartile, 'order_amount'].mean()
Try:
above25percentile = df["order_amount"]>np.percentile(df['order_amount'],25)
below75percentile = df['order_amount']<np.percentile(df['order_amount'],75)
print(df.loc[above25percentile & below75percentile, 'order_amount'].mean())
Or you can use between:
df.loc[df['order_amount'].between(np.percentile(df['order_amount'], 25),
                                  np.percentile(df['order_amount'], 75),
                                  inclusive='neither'), 'order_amount'].mean()
Suppose the following dataframe:
df = pd.DataFrame({'order_amount': range(0, 10)})
print(df)
# Output
order_amount
0 0 # Excluded
1 1 # "
2 2 # "
3 3
4 4 # mean <- (3 + 4 + 5 + 6) / 4 = 4.5
5 5
6 6
7 7 # Excluded
8 8 # "
9 9 # "
Output:
>>> df.loc[df['order_amount'].between(np.percentile(df['order_amount'], 25),
                                      np.percentile(df['order_amount'], 75),
                                      inclusive='neither'), 'order_amount'].mean()
4.5
I have a (N, 9, 9) shape tensorflow tensor T, and permutations Px, Py which might look like this: [3 4 5 6 7 8 2 1 0], [6 8 2 0 3 7 4 1 5].
I want to apply the permutation Px to the 1st axis of T, and Py to the 2nd axis. That is, I want to compute a tensor S defined by
S[i, j, k] = T[i, Px[j], Py[k]]
To use tf.gather_nd to construct S I need to construct an indices tensor such that
indices[i,j,k,0] = i
indices[i,j,k,1] = Px(j)
indices[i,j,k,2] = Py(k)
What's the cleanest way to construct indices (in Python)?
If I understand your problem statement correctly, I believe this is what you need (treating Px and Py as integer index arrays and broadcasting them into place):
indices[..., 0] = np.arange(indices.shape[0])[:, None, None]   # i
indices[..., 1] = Px[None, :, None]                            # Px[j]
indices[..., 2] = Py[None, None, :]                            # Py[k]
Hard to tell without a minimal reproducible example, though.
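For completeness, here is a minimal self-contained sketch of the idea (assuming Px and Py are numpy integer arrays; the last lines show a simpler equivalent built from two tf.gather calls):
import numpy as np
import tensorflow as tf

N, H, W = 2, 9, 9
T = tf.random.uniform((N, H, W))
Px = np.array([3, 4, 5, 6, 7, 8, 2, 1, 0])
Py = np.array([6, 8, 2, 0, 3, 7, 4, 1, 5])

# build indices[i, j, k] = (i, Px[j], Py[k]) by broadcasting
indices = np.empty((N, H, W, 3), dtype=np.int64)
indices[..., 0] = np.arange(N)[:, None, None]
indices[..., 1] = Px[None, :, None]
indices[..., 2] = Py[None, None, :]

S = tf.gather_nd(T, indices)

# simpler equivalent: permute one axis at a time
S2 = tf.gather(tf.gather(T, Px, axis=1), Py, axis=2)
print(bool(tf.reduce_all(S == S2)))   # True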
Imagine a small garden, divided into 8 equal parts, each a square foot. The garden is 4 ft x 2 ft, so the "bins" are in two rows. Let's number them as:
0 1 2 3
4 5 6 7
We want to arrange different plants in each one. Each plant has some buddies that they like to be near. For example, basil likes to be near tomatoes. I want to find an arrangement for the garden that maximizes the number of positive relationships.
Using python, it's easy to shove the different crops in a list. It's also easy to make a scoring function to find the total score for a particular arrangement. My problem is reducing the problem size. In this setup, there are 8! (40,320) possible permutations, different arrangements of plants in the garden. In the real one I'm trying to solve, I'm using a 16-bin garden, twice the size. That's 16! possible permutations to go through, over 20 trillion. It's taking too long. (I've described the problem here with 8 bins instead of 16 to simplify.)
I've used itertools.permutations to run through all the possible permutations of 8 items. However, it doesn't know enough to skip arrangements that are essentially duplicates. If I rotate a garden arrangement by 180 degrees, it's really the same solution. If I mirror left-to-right or up-and-down, they're also the same solutions. How can I set this up to reduce the total problem set?
In other problems, I've used lookups to check through a list of solutions already checked. With this large number of solutions, that would consume more time than simply going through all of them. Please help me reduce the problem set!
# maximize the number of good relationships in a garden
import itertools
# each crop has 2 items: the name of the crop and a list of all the good friends
crops = []
crops.append(['basil',['tomato','pepper','lettuce']]) # basil likes to be near tomato, pepper or lettuce
crops.append(['strawberry',['beans','lettuce']])
crops.append(['beans',['beet','marigold','cucumber','potato','strawberry','radish']])
crops.append(['beet',['beans']])
crops.append(['cucumber',['lettuce','radish','tomato','dill','marigold']])
crops.append(['marigold',['tomato','cucumber','potato','beans']])
crops.append(['tomato',['cucumber','chives','marigold','basil','dill']])
crops.append(['bok_choy',['dill']])
# 0 1 2 3 This is what the garden looks like, with 8 bins
# 4 5 6 7
mates = [ [0,1], [1,2], [2,3], [4,5], [5,6], [6,7], [0,4], [1,5], [2,6], [3,7] ] # these are the relationships that directly border one another
def score(c): # A scoring function that returns the number of good relationships
    s = 0
    for pair in mates:
        for j in c[pair[1]][1]:
            if c[pair[0]][0] == j:
                s = s + 1
        for j in c[pair[0]][1]: # and the reverse, 1-0
            if c[pair[1]][0] == j:
                s = s + 1
    return s
scoremax = 0
for x in itertools.permutations(crops, 8):
    s = score(x)
    if s >= scoremax: # show the arrangement
        for i in range(0, 4):
            print(x[i][0] + ' ' * (12 - len(x[i][0])) + x[i + 4][0] + ' ' * (12 - len(x[i + 4][0]))) # print to screen
        print(s)
        print('')
        if s > scoremax:
            scoremax = s
EDIT: To clarify, these are the mirrored and rotated arrangements I'm trying to skip. I'll use numbers instead of the plant name strings.
0 1 2 3    is the same when mirrored:    3 2 1 0
4 5 6 7                                  7 6 5 4

0 1 2 3    is the same when mirrored:    4 5 6 7
4 5 6 7                                  0 1 2 3

0 1 2 3    is the same when rotated:     7 6 5 4
4 5 6 7                                  3 2 1 0
In general, it is often very difficult to break symmetries efficiently in this kind of problem.
In this case, there seem to be just 2 symmetries:
right to left is the same as left to right
up to down is the same as down to up
We can break both of them by adding three conditions:
the crop at plot 0 should be smaller than the crop at plot 3, smaller than the one at plot 4, and smaller than the one at plot 7
For 'smaller' we can use any measure that gives a strict ordering. In this case we can simply compare the name strings.
The main loop would then look as follows. Once the optimal scoremax has been reached, only one representative of each family of symmetric solutions is printed. Every possible solution is either printed directly or represented by its canonical form (i.e. mirrored horizontally and/or vertically).
# maximize the number of good relationships in a garden
import itertools
# each crop has 2 items: the name of the crop and a list of all the good friends
crops = []
crops.append(['basil',['tomato','pepper','lettuce']]) # basil likes to be near tomato, pepper or lettuce
crops.append(['strawberry',['beans','lettuce']])
crops.append(['beans',['beet','marigold','cucumber','potato','strawberry','radish']])
crops.append(['beet',['beans']])
crops.append(['cucumber',['lettuce','radish','tomato','dill','marigold']])
crops.append(['marigold',['tomato','cucumber','potato','beans']])
crops.append(['tomato',['cucumber','chives','marigold','basil','dill']])
crops.append(['bok_choy',['dill']])
# 0 1 2 3 This is what the garden looks like, with 8 bins
# 4 5 6 7
mates = [ [0,1], [1,2], [2,3], [4,5], [5,6], [6,7], [0,4], [1,5], [2,6], [3,7] ] # these are the relationships that directly border one another
def score(c): # A scoring function that returns the number of good relationships
    s = 0
    for pair in mates:
        for j in c[pair[1]][1]:
            if c[pair[0]][0] == j:
                s = s + 1
        for j in c[pair[0]][1]: # and the reverse, 1-0
            if c[pair[1]][0] == j:
                s = s + 1
    return s
scoremax = 0
for x in itertools.permutations(crops, 8):
    # only consider the canonical representative of each symmetric family
    if x[0][0] < x[3][0] and x[0][0] < x[4][0] and x[0][0] < x[7][0]:
        s = score(x)
        if s >= scoremax: # show the arrangement
            for i in range(0, 4):
                print(x[i][0] + ' ' * (12 - len(x[i][0])) + x[i + 4][0] + ' ' * (12 - len(x[i + 4][0]))) # print to screen
            print(s)
            print('')
            if s > scoremax:
                scoremax = s
I have a large dataframe with a list of edges in a bipartite graph, and I want to transform it into a Python sparse transition matrix.
The dataframe lists edges linking nodes from part 1 (a, b, c) with nodes from part 2 (x, y, z). Edges have multiplicity: in the example, there are two edges from b to y.
start  end  multiplicity
a      x    1
a      y    1
b      y    2
b      z    1
c      x    1
c      z    1
The result I want is a sparse matrix, 3x3 in this case. I have dictionaries for part 1 and 2, indicating which node corresponds to which row and columns of the resulting transition matrix:
dic1 = {'a':0,'b':1,'c':2}
dic2 = {'x':1,'y':0,'z':2}
So I want the matrix
   y  x  z
a  1  1  0
b  2  0  1
c  0  1  1
...but in sparse form (csr_matrix, lil_matrix, or coo_matrix). I have tried iterating over the list of edges, but it is too slow for long lists.
Also, approaches based on pivot will generate full (dense) matrices, which will be slow and memory-consuming.
Is there an efficient way to obtain the sparse matrix I want?
From what I understand, you can try pivot + reindex with Index.map (I have added two variables, m and final, for readability; you can merge them into one after testing):
m = df.pivot(*df).fillna(0).rename_axis(index=None,columns=None)
final = m.reindex(index=m.index[m.index.map(dic1)],columns=m.columns[m.columns.map(dic2)])
print(final)
y x z
a 1.0 1.0 0.0
b 2.0 0.0 1.0
c 0.0 1.0 1.0
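If you want to skip the dense intermediate entirely, here is a sketch of a fully sparse construction with scipy's coo_matrix: map the node labels through your two dictionaries and hand the triplets straight to the constructor:
import pandas as pd
from scipy.sparse import coo_matrix

df = pd.DataFrame({'start': list('aabbcc'),
                   'end': list('xyyzxz'),
                   'multiplicity': [1, 1, 2, 1, 1, 1]})
dic1 = {'a': 0, 'b': 1, 'c': 2}
dic2 = {'x': 1, 'y': 0, 'z': 2}

rows = df['start'].map(dic1).to_numpy()
cols = df['end'].map(dic2).to_numpy()
mat = coo_matrix((df['multiplicity'].to_numpy(), (rows, cols)),
                 shape=(len(dic1), len(dic2)))
print(mat.toarray())
# [[1 1 0]
#  [2 0 1]
#  [0 1 1]]
As a bonus, coo_matrix sums duplicate (row, column) pairs, so a multiplicity split over several edge rows would be accumulated automatically.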