I am doing a project on image processing, and I asked here about how to solve a very large overdetermined system of linear equations. Until I can figure out a better approach, I simply split the image into four equal parts and solve the systems of equations separately. The result is shown in the attached file.
The image represents the surface height of the pictured object. You can think of the two axes as the x and y axes, with the z-axis coming out of the screen. I solved the very large system of equations to get z(x,y), which is displayed in this intensity plot. I have the following questions:
The lower-left part is not shown because, when I solved the equations for that region, the intensity plot was thrown off by a few extreme values. One or two pixels have an intensity (which represents the height) as high as 60, and because of the colourbar scaling the rest of the image (whose height ranges only from -15 to 9) appears as largely the same colour. I am still figuring out why those one or two pixels give such abnormal results, but if I do get them, how can I eliminate/ignore them so the rest of the image can be seen properly?
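For reference, one way to keep a couple of extreme pixels from dominating the display (a minimal sketch; the percentile cutoffs are arbitrary placeholders):

import numpy as np
import matplotlib.pyplot as plt

# Stand-in for the computed height map z(x, y), with a couple of outliers
z = np.random.uniform(-15, 9, size=(500, 500))
z[10, 10] = 60.0

# Clip the displayed range to robust percentiles so the outliers just saturate
lo, hi = np.percentile(z, [1, 99])
plt.imshow(z, vmin=lo, vmax=hi)
plt.colorbar()
plt.show()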
I am using imshow() from matplotlib. I also tried a 3D plot, with the z-axis representing the surface height, but the result is not good. Are there any other visualization tools that can display the results nicely (preferably in 3D), given that I have obtained z(x,y) for many pairs of (x,y)?
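For reference, the kind of 3D surface plot I tried looks roughly like this (a minimal sketch with stand-in data):

import numpy as np
import matplotlib.pyplot as plt

# Stand-in for the recovered surface z(x, y)
x, y = np.meshgrid(np.arange(500), np.arange(500))
z = np.sin(x / 50.0) * np.cos(y / 50.0)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(x, y, z, cmap='viridis', rstride=5, cstride=5)
plt.show()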
The four separate parts are clearly visible. Are there any ways to merge the separate parts together? My first thought is to share the central column and row. E.g. the top-left region spans from column 0 to 250, and the top-right region spans from column 250 to the right edge. In this case, the values in column 250 are calculated twice, and the values from the two regions will almost certainly differ slightly. How do I reconcile the two slightly different values to combine the regions? Just take the average of the two, do some kind of curve fitting to merge the two regions, or something else? Or should I stick to columns 0 to 250, then columns 251 to the rightmost?
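For concreteness, the simplest of these options (averaging the shared column) would look like this (a minimal sketch with stand-in arrays):

import numpy as np

# Stand-in solutions for two horizontally adjacent regions sharing column 250
left = np.random.rand(251, 251)    # columns 0..250
right = np.random.rand(251, 251)   # columns 250..500

merged = np.empty((251, 501))
merged[:, :250] = left[:, :250]
merged[:, 251:] = right[:, 1:]
# The shared column is computed twice; reconcile by averaging the two estimates
merged[:, 250] = 0.5 * (left[:, 250] + right[:, 0])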
thanks
About point 2: you could try hill shading. See the matplotlib example and/or the Novitsky blog.
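A minimal hill-shading sketch with matplotlib's LightSource (stand-in data; the light azimuth/altitude values are arbitrary):

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LightSource

# Stand-in for the height map z(x, y)
x, y = np.meshgrid(np.linspace(0, 10, 500), np.linspace(0, 10, 500))
z = np.sin(x) * np.cos(y)

ls = LightSource(azdeg=315, altdeg=45)
shaded = ls.shade(z, cmap=plt.cm.viridis, vert_exag=1, blend_mode='overlay')
plt.imshow(shaded)
plt.show()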
I have two coordinate systems for each record in my dataset: lat-lon coordinates and what I suppose are UTM x-y coordinates.
50% of my dataset only has x-y data without lat-lon; the reverse case is 6%.
A good portion of the dataset (33%) has both coordinate systems for each single record.
I wanted to know if there is a way to take advantage of the intersection (and maybe the x-y-only part, since it is the biggest) to obtain a full dataset with a single coordinate system that makes sense. The problem is that after a bit of preprocessing, the two systems look "relaxed" in different ways and the intersection doesn't really match. The scatter plot shows what I believe to be a non-linear, warped relationship from one coordinate system to the other. By this I mean that normalizing both to [0, 1] and centering them at (0, 0) (by subtracting the mean) gives two slightly different point distributions, and multiplying by a coefficient to scale one down to look like the other is not enough to make them match completely. There seems to be some more complicated relationship between the two.
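For reference, the normalization I describe is roughly this (a minimal sketch; the arrays stand in for the ~33% of records that have both systems):

import numpy as np

def normalize(points):
    # Scale each axis of an (N, 2) point set to [0, 1], then centre at the mean
    p = (points - points.min(axis=0)) / (points.max(axis=0) - points.min(axis=0))
    return p - p.mean(axis=0)

# Stand-in data for the records that have both coordinate systems
latlon = np.column_stack([np.random.uniform(45.0, 45.5, 100),
                          np.random.uniform(7.0, 7.5, 100)])
xy = np.column_stack([np.random.uniform(3.5e5, 4.0e5, 100),
                      np.random.uniform(4.98e6, 5.04e6, 100)])

norm_latlon = normalize(latlon)
norm_xy = normalize(xy)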
I also tried using an external library called utm to convert the lat-lon coordinates to x-y, to get a third pair of attributes (let's call it my_xy), only to find that it doesn't match either of the first two systems; instead it shows yet another slight warp.
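The utm conversion I mention is essentially this (a sketch; the coordinates are placeholders):

import utm

# Convert a lat-lon pair to UTM easting/northing (what I call my_xy)
easting, northing, zone_number, zone_letter = utm.from_latlon(45.07, 7.69)
print(easting, northing, zone_number, zone_letter)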
Notes: When I say I do not have data from one coordinate system, assume NaN.
Furthermore, I know the warping could be a result of the fundamental geometric differences between lat-lon and x-y systems, but I still do not know what else I could try, given that the utm conversion and the scaling did not work.
Scatter plot legend: blue = latlon, red = original xy, green = my_xy calculated from latlon.
I have a series of many thousands of (1D) spectra corresponding to different repetitions of an experiment. For each repetition, the same data has been recorded by two different instruments, so I have two very similar spectra, each consisting of a few hundred individual peaks/events. The instruments have different resolutions, precisions and likely detection efficiencies, so each pair of spectra is non-identical but similar; looking at them closely by eye, one can confidently match many peaks in each spectrum. I want to be able to automatically and reliably match the two spectra in each pair, i.e. confidently say which peak corresponds to which. This will likely involve throwing away some data that can't be confidently matched (e.g. when only one of the two instruments detects an event).
I've attached an image of what the data look like over an entire spectrum, and zoomed into a relatively sparse region. The red spectrum has essentially already been peak-found, such that it is 0 everywhere apart from where a real event is. I have used scipy.signal.find_peaks() on the blue trace and plotted the found peaks, which seems to work well.
Now I just need a reliable method to match peaks between the spectra. I have tried pairing the peaks that are closest to each other, but this runs into significant issues because some peaks are not present in both spectra. I could add constraints on how close peaks must be to be matched, but I think there are probably better ways out there. There are also issues arising from the red trace having a lower resolution than the blue. I expect there are pattern-matching algorithms/Python packages out there that would be well suited for this, but this is far from my area of expertise, so I don't really know where to start. Thanks in advance.
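For concreteness, the constrained nearest-neighbour matching I have in mind would look roughly like this (a sketch; the tolerance is an arbitrary placeholder):

import numpy as np

def match_peaks(pos_a, pos_b, tol=3.0):
    # Greedily pair each peak in pos_a with the nearest unused peak in pos_b,
    # discarding pairs that are farther apart than tol (1-D position arrays)
    pos_a = np.asarray(pos_a, dtype=float)
    pos_b = np.asarray(pos_b, dtype=float)
    used = np.zeros(len(pos_b), dtype=bool)
    pairs = []
    for i, p in enumerate(pos_a):
        d = np.abs(pos_b - p)
        d[used] = np.inf
        j = int(np.argmin(d))
        if d[j] <= tol:
            pairs.append((i, j))
            used[j] = True
    return pairs  # unmatched peaks in either spectrum are simply dropped

print(match_peaks([10, 25, 40, 80], [11, 41, 60]))  # [(0, 0), (2, 1)]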
Zoom-in of a relatively sparse region of an example pair of spectra:
An entire example pair of spectra, showing some very dense regions:
Example code used to plot the spectra:
import matplotlib.pyplot as plt
from scipy.signal import find_peaks

for i in range(0, 10):
    spectra1 = spectra1_list[i]
    spectra2 = spectra2_list[i]
    fig, ax1 = plt.subplots(1, 1, figsize=(12, 8))
    # Find peaks in the blue trace
    peaks, properties = find_peaks(spectra1, height=(6, None), threshold=(None, None),
                                   distance=2, prominence=(5, None))
    plt.plot(spectra1)
    plt.plot(spectra2_axis, spectra2, color='red')
    plt.plot(peaks, spectra1[peaks], "x")
    plt.show()
Deep learning perspective: you could train a pair of neural networks using a cycle-consistency loss, where mapping from signal A to signal B and back again should bring you back to the initial point on your signal.
A good starting point would be to read about CycleGAN, which uses this idea to change the style of images.
Admittedly, this would be a bit of a research project and might take some time until it works robustly.
I'm working on a heatmap-generation program which will hopefully fill in the colors based on value samples provided from a building layout (this is not GPS-based).
If I have only a few known data points, such as these, in a large matrix of unknowns, how do I interpolate the values in between in Python?
0,0,0,0,1,0,0,0,0,0,5,0,0,0,0,9
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,0,0,2,0,0,0,0,0,0,0,0,8,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,8,0,0,0,0,0,0,0,6,0,0,0,0,0,0
0,0,0,0,0,3,0,0,0,0,0,0,0,0,7,0
I understand that bilinear interpolation won't do it, and a Gaussian filter will bring all the peaks down to low values due to the sheer number of surrounding zeros. This is obviously a matrix-handling proposition, and I don't need it to be Bezier-curve smooth; just close enough to be a graphic representation would be fine. My matrix will end up being about 1500×900 cells in size, with approximately 100 known points.
Once the values are interpolated, I have written code to convert it all to colors, no problem. It's just that right now I'm getting single colored pixels sprinkled over a black background.
Proposing a naive solution:
Step 1: interpolate and extrapolate existing data points onto surroundings.
This can be done using a "wave propagation" type algorithm.
The known points "spread out" their values onto their surroundings until the whole grid is "flooded" with known values. At the end of this stage you have a number of intersecting "disks", and no zeroes left.
Step 2: smooth the result (using bilinear filtering or some other filter).
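One possible realisation of these two steps (a sketch, not necessarily what was meant by "wave propagation") is a nearest-known-point fill via SciPy's distance transform, followed by Gaussian smoothing:

import numpy as np
from scipy.ndimage import distance_transform_edt, gaussian_filter

# Sparse grid in the same format as the question: zero means unknown
grid = np.zeros((6, 16))
grid[0, 4], grid[0, 10], grid[0, 15] = 1, 5, 9
grid[2, 3], grid[2, 12] = 2, 8
grid[4, 1], grid[4, 9] = 8, 6
grid[5, 5], grid[5, 14] = 3, 7

# Step 1: every unknown cell takes the value of its nearest known cell
idx = distance_transform_edt(grid == 0, return_distances=False, return_indices=True)
filled = grid[tuple(idx)]

# Step 2: smooth the piecewise-constant result
smoothed = gaussian_filter(filled, sigma=2)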
If you are able to use SciPy, then interp2d does exactly what you want. A possible problem with it is that it seems not to extrapolate smoothly, according to this issue. This means that all values near the walls are going to be the same as their closest neighbouring points. This can be solved by putting thermometers in all 4 corners :)
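An alternative sketch uses scipy.interpolate.griddata, which handles exactly this scattered-points-to-grid case (note that interp2d has been deprecated in newer SciPy releases); here the nearest-neighbour pass only fills the cells outside the convex hull of the known points:

import numpy as np
from scipy.interpolate import griddata

# Known sample locations (row, col) and values, as in the question's grid
points = np.array([(0, 4), (0, 10), (0, 15), (2, 3), (2, 12),
                   (4, 1), (4, 9), (5, 5), (5, 14)])
values = np.array([1, 5, 9, 2, 8, 8, 6, 3, 7], dtype=float)

rows, cols = np.mgrid[0:6, 0:16]
heat = griddata(points, values, (rows, cols), method='cubic')
nearest = griddata(points, values, (rows, cols), method='nearest')
heat[np.isnan(heat)] = nearest[np.isnan(heat)]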
My software should judge spectrum bands and, given the location of the bands, find the peak point and width of each band.
I have learned to take the projection of the image and to find the width of each peak.
But I need a better way to compute the projection.
The method I used reduces a 1600-pixel-wide image (e.g. 1600×40) to a 1600-point sequence. Ideally I would want to reduce the image to a 10000-point sequence using the same image.
I want a longer sequence because 1600 points give too low a resolution. A single point causes a large difference in the measurement (there is a 4% difference if a band is judged from 18 to 19).
How do I get a longer projection from the same image?
Code I used: https://stackoverflow.com/a/9771560/604511
from PIL import Image
import numpy as np
from scipy.optimize import leastsq  # not used in this snippet

# Load the picture with PIL, process if needed
pic = np.asarray(Image.open("band2.png"))

# Average across the colour channels
pic_avg = pic.mean(axis=2)
# Sum along the vertical axis to get the 1600-point projection
projection = pic_avg.sum(axis=0)

# Normalize and set the min value to zero for a nice fit
projection /= projection.mean()
projection -= projection.min()
What you want to do is called interpolation. SciPy has an interpolate module with a whole bunch of different functions for different situations; take a look here, or specifically for images here.
Here is a recently asked question that has some example code, and a graph that shows what happens.
But it is really important to realise that interpolating will not make your data more accurate, so it will not help you in this situation.
If you want more accurate results, you need more accurate data. There is no other way: you need to start with a higher-resolution image. (If you resample or interpolate, your results will actually be less accurate!)
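For completeness, a minimal upsampling sketch (with the caveat above that it adds no real information):

import numpy as np

# Stand-in for the 1600-point projection
projection = np.random.rand(1600)

# Resample to 10000 points by linear interpolation
x_old = np.arange(projection.size)
x_new = np.linspace(0, projection.size - 1, 10000)
projection_10000 = np.interp(x_new, x_old, projection)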
Update - as the question has changed
#Hooked has made a nice point. Another way to think about it: instead of immediately averaging (which throws away the variance in the data), you can produce 40 graphs (like the lower one in your posted image), one from each horizontal row of your spectrum image. All these graphs will be pretty similar, but with some variation in peak position, height and width. You should calculate the position, height and width of each peak in each of these 40 graphs, then combine this data (matching peaks across the 40 graphs) and use the appropriate variance as an error estimate (for peak position, height and width), by way of the central limit theorem. That way you can get the most out of your data. However, this assumes some independence between the rows of the spectrogram, which may or may not be the case.
I'd like to offer some more detail on #fraxel's answer (too long for a comment). He's right that you can't get any more information out than you put in, but I think it needs some elaboration...
You are projecting your data from 1600x40 -> 1600, which looks like you are throwing some data away. While technically correct, the whole point of a projection is to bring higher-dimensional data down to a lower dimension. This only makes sense if...
Your data can be adequately represented in the lower dimension. Correct me if I'm wrong, but it looks like your data is indeed one-dimensional: the vertical axis is a measure of the variability of that particular point on the x-axis (wavelength?).
Given that the projection makes sense, how can we best summarize the data for each particular wavelength point? In my previous answer, you can see I took the average for each point. In the absence of other information about the particular properties of the system, this is a reasonable first-order approximation.
You can keep more of the information if you like. Below I've plotted the variance along the y-axis. This tells me that your measurements have more variability when the signal is higher, and low variability when the signal is lower (which seems useful!):
What you need to do then, is decide what you are going to do with those extra 40 pixels of data before the projection. They mean something physically, and your job as a researcher is to interpret and project that data in a meaningful way!
The code to produce the image is below; the spectral data was taken from the screencap of your original post:
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import leastsq  # not used in this snippet

# Load the picture with PIL, process if needed
pic = np.asarray(Image.open("spec2.png"))

# Average across the colour channels
pic_avg = pic.mean(axis=2)
# Sum along the vertical axis to get the projection
projection = pic_avg.sum(axis=0)

# Compute the variance along the vertical axis
variance = pic_avg.var(axis=0)

scale = 1 / 40.
X_val = range(projection.shape[0])
plt.errorbar(X_val, projection * scale, yerr=variance * scale)
plt.imshow(pic, origin='lower', alpha=.8)
plt.axis('tight')
plt.show()
Currently, if I want to compare pressure under each of the paws of a dog, I only compare the pressure underneath each of the toes. But I want to try and compare the pressures underneath the entire paw.
But to do so I have to rotate them so that the toes overlap (better). Most of the time the left and right paws are slightly rotated externally, so you can't simply project one on top of the other. Therefore, I want to rotate the paws so they are all aligned the same way.
Currently, I calculate the angle of rotation by looking up the two middle toes and the rear one using the toe detection, then I calculate the angle between the yellow line (the axis between the green and red toes) and the green line (the neutral axis).
Now I want to rotate the array around the rear toe, such that the yellow and green lines are aligned. But how do I do this?
Note that while this image is just 2D (only the maximal values of each sensor), I want to calculate this on a 3D array (10x10x50 on average). Also, a downside of my angle calculation is that it's very sensitive to the toe detection, so if somebody has a more mathematically correct proposal for calculating it, I'm all ears.
I have seen one study with pressure measurements on humans where they used the local geometric inertial axis method, which at least was very reliable. But that still doesn't tell me how to rotate the array!
If someone feels the need to experiment, here's a file with all the sliced arrays that contain the pressure data of each paw. To clarify: walk_sliced_data is a dictionary that contains ['ser_3', 'ser_2', 'sel_1', 'sel_2', 'ser_1', 'sel_3'], which are the names of the measurements. Each measurement contains another dictionary, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] (example from 'sel_1'), which represents the impacts that were extracted.
Why would you do it that way? Why not simply integrate the whole region and compare? In this case you'll get a magnitude of the force and you can simply compare scalars which would be much easier.
If you need to compare regions (and hence need to align them), then maybe attempt feature extraction and alignment. But this would seem to fail if the pressure maps are not similar (say, someone is not putting much weight on one foot).
I suppose you can get really complex but it sounds like simply calculating the force is what you want?
BTW, you can use a simple correlation test to find the optimal angle and translation if the images are similar.
To do this you simply compute the correlation between the two different images for various translations and rotations.
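A minimal sketch of that rotation search (translations omitted; the angle range and step are arbitrary placeholders):

import numpy as np
from scipy.ndimage import rotate

def best_rotation(ref, img, angles=np.arange(-30, 31, 1)):
    # Return the angle (in degrees) that maximises the correlation between
    # ref and img rotated by that angle (2-D arrays of equal shape)
    best_angle, best_corr = 0.0, -np.inf
    for a in angles:
        r = rotate(img, a, reshape=False)
        corr = np.corrcoef(ref.ravel(), r.ravel())[0, 1]
        if corr > best_corr:
            best_angle, best_corr = a, corr
    return best_angle

# Stand-in data: recover a known 12-degree rotation of an off-centre blob
yy, xx = np.mgrid[0:50, 0:50]
ref = np.exp(-((xx - 30.0) ** 2 + (yy - 20.0) ** 2) / 60.0)
img = rotate(ref, -12, reshape=False)
print(best_rotation(ref, img))  # should be close to 12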
Using the Python Imaging Library, you can rotate an array with for example:
array(Image.fromarray(<data>).rotate(<angle>, resample=Image.BICUBIC))
From there, you can just create a for loop over the different layers of your 3D array.
If you have your first dimension as the layers, then array[<layer>] would return a 2D layer, thus:
for i in range(<number of layers>):
    layer = <array>[i]
    <array>[i] = array(Image.fromarray(layer).rotate(<angle>, resample=Image.BICUBIC))
Results by #IvoFlipse, with a conversation suggesting:
Putting the array in a bigger array to remedy the darker background.
Look into resampling, perhaps scale the array first.
Moving the rear toe towards the middle allows you to rotate around that instead.
A smaller image can be determined by finding the borders and positioning them in a 15x15 again.