I want to implement a photo editor in python using flask. So far, I managed to apply an s curve to a photo, like this:
import cv2
import numpy as np

image = cv2.imread('apple.jpg')

def sToneCurve(frame):
    look_up_table = np.zeros((256, 1), dtype='uint8')
    for i in range(256):
        look_up_table[i][0] = 255 * (np.sin(np.pi * (i / 255 - 1 / 2)) + 1) / 2
    return cv2.LUT(frame, look_up_table)

image_contrasted = sToneCurve(image)
cv2.imwrite('apple_dark.jpg', image_contrasted)
How could I implement an interactive tone curve, so that the user can choose how to edit the photo (for example by dragging points on a tone-curve plot) rather than having a predefined formula applied to it, as in the code above? What would be the best approach, and which libraries and visualizations should I use for the curve plots?
You implement this using "standard" polynomial fitting: you have N points that you need a curve through, so you find the N-1st order polynomial that does that, then use that polynomial as your mapping function.
You're already using numpy, so use numpy.polynomial.polynomial.polyfit with:
x: all your points' x coordinates, including your black and white points (which, in a proper tone curve, users should be able to move off of (0,0) and (1,1) respectively),
y: all your points' y coordinates,
deg: if the polynomial has to pass through all points, which it should, this should be equal to len(x) - 1, since two points define a line (a first-degree polynomial), three points define a quadratic curve (a second-degree polynomial), etc. "The" polynomial through N points is an (N-1)-degree polynomial,
the rest of the args shouldn't particularly matter.
This gives you a numpy array of polynomial coefficients (let's call that array c) that you can then use for mapping: any pixel with lightness/intensity value i should get mapped to:
mapped = f(i) = c[0] * i**0 + c[1] * i**1 + c[2] * i**2 + ...
Which thankfully numpy can do for you by simply using the corresponding polyval function.
And of course, to make that fast, what you really want to do is build a LUT that you can just directly consult, every time the user changes a coordinate in the tone curve UI, so:
from numpy.polynomial.polynomial import polyfit, polyval

# How big of a LUT you actually need depends entirely
# on the bit depth you're working with, of course...
BIT_DEPTH = 2**16
TONE_LUT = list(range(0, BIT_DEPTH))

def update_from_tone_ui(coordinates):
    """
    Called on user value update, with coordinates being
    a list-of-lists a la [[0,0], [0.1,0.1], ...]
    """
    global TONE_LUT
    x, y = zip(*coordinates)
    coefficients = polyfit(x, y, len(x) - 1)
    f = lambda i: clamp(polyval(i, coefficients), 0, 1)
    # And remember to make sure the input range to f() matches
    # the actual x/y domain that we used for the polyfit:
    divisor = BIT_DEPTH - 1
    TONE_LUT = [divisor * f(i / divisor) for i in range(0, BIT_DEPTH)]
with clamp coming from "somewhere", but if you don't already have one then it's trivially implemented with some shortcut returns:
def clamp(n, floor, ceiling):
    if n < floor: return floor
    if n > ceiling: return ceiling
    return n
(And of course make sure to adjust your clamping values if you don't want your tone curve x and y coordinates in [0,1])
Now, rather than running the mapping function every time, you just directly look up the mapped value. Note that you get a bit of freedom in terms of precision: you could use a tone curve in which the x and y values run from 0 to 1, or you could have them run from 0 to whatever bit depth you use (2^8, 2^16, what have you), but whatever you pick, make sure you scale your actual pixel intensities accordingly when you generate your LUT. Otherwise things will look really interesting.
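For an 8-bit image like the one in the question, a minimal sketch of tying this back to OpenCV could look like the following (the control-point values, the update_lut helper, and the choice of 256 levels are illustrative assumptions, not something prescribed above):

import cv2
import numpy as np
from numpy.polynomial.polynomial import polyfit, polyval

def update_lut(coordinates, levels=256):
    # Hypothetical helper: fit the user's control points (x and y in [0, 1])
    # and bake the resulting curve into an 8-bit lookup table.
    x, y = zip(*coordinates)
    c = polyfit(x, y, len(x) - 1)
    i = np.linspace(0, 1, levels)
    mapped = np.clip(polyval(i, c), 0, 1)
    return (mapped * (levels - 1)).astype('uint8')

# Example control points: lift the shadows slightly, compress the highlights.
lut = update_lut([[0, 0], [0.25, 0.2], [0.75, 0.85], [1, 1]])
image = cv2.imread('apple.jpg')
cv2.imwrite('apple_curved.jpg', cv2.LUT(image, lut))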
I am writing some simple scripts to plot a graph given a trigonometric function (in this example, a sine).
My issue is that I'd like to plot JUST two periods of the given trig function. To clarify, in trigonometry a Period is the length (on a graph) that ONE wave takes up. For sin and cos, one period is 2pi.
I'd like to take my existing code, and (preferably) using matplotlib, plot two periods of my given trig function, and line up a couple of points on the graph with a couple of points on my function.
If it's possible, I would like to be able to plot my function so that the start of the first period lines up with my first label, the highest point of the first period lines up with the second label, the point where my function crosses the x-axis with the third label, the lowest point with the fourth label, and the end of my first period/beginning of my second period with the fifth label. This pattern would then repeat for the second period. From here on, I'm going to refer to the x labels as the "Period Markings".
I've come up with three possible solutions for this:
I could set the borders of my graph (in this case x = -4 and x = 4) to be labeled as the first and ninth Period Markings respectively, then constrain my function to just be within the graph somehow.
I could somehow set a parameter in matplotlib to only plot 4pi (the length of two periods) units worth of line, although in that case I don't think that the Period Markings would match up with their desired points.
If matplotlib supports it, I could find the low points, x-intercepts, and high points of the graph, then assign my Period Markings to each one from left to right. This would have the advantage of removing the necessity to plot ONLY two periods, as the Period Markings would dictate the beginning and end of the two periods.
Below I've inserted a couple of things:
A copy of the plotting part of my code, containing a sample equation and some sample Period Markings
A screenshot of the graph of the given sample equation
A visual representation of where each Period Marking would ideally line up, as well as a line demarcating an estimation of two full periods.
The standard form of a sin function is y = a*sin(b*x - c) + d. The equation here is just sin(x), but you can see how variables c and d play a role in determining the graph. Usually, the xlabels array would be filled in with variables that are determined earlier in the script, as would all the variables at the top (func, a, b, c, d).
import math
import matplotlib.pyplot as plt
import numpy as np

func = 'sin'
a = 1
b = 1
c = 0
d = 0
xlabels = np.array(['-2pi', '-3pi/2', '-pi', '-pi/2',
                    '0', 'pi/2', 'pi', '3pi/2', '2pi'])
xlabelcount = -4, -3, -2, -1, 0, 1, 2, 3, 4
x = np.arange(-4, 4, 0.01)

if func == 'sin':
    ypoints = a*np.sin(b*x - c) + d
if func == 'cos':
    ypoints = a*np.cos(b*x + c) + d
if b < 0:
    plt.gca().invert_yaxis()

plt.title('Wave Function')
plt.xlabel('Period (Not to Scale)')
plt.ylabel('Amplitude')
plt.grid(True, which='both')
plt.axhline(y=0, color='k')

plt.plot(x, ypoints)
plt.xticks(ticks=xlabelcount, labels=xlabels)
plt.show()
Plot of sin(x)
Preferred Period Marking placements
I hope this can provide a comprehensive understanding of the issue I face, and any help would be greatly appreciated. I feel that I've done a fair amount of Googling around, but nothing has yielded a good answer. I apologize in advance if I'm missing something really obvious.
Thanks,
dreadlearner
If I understand this correctly, you would like to add points on the curve at certain predefined locations on x-axis (period markings). If this is correct, the best way is to evaluate the value of the function at those particular "period markings" and plot this as a single point. Something like:
fn = "sin"
if fn == "sin":
fn = np.sin
elif fn == "cos":
fn = np.cos
# if required, the next three statements can be
# customized for each function by shifting them
# inside the if ... else blocks
x = np.linspace(-2*np.pi, 2*np.pi, 1000)
points = [i * np.pi/2 for i in range(-4, 5)]
labels = ["-2π", "-3π/2", "-π", "-π/2", "0", "π/2", "π", "3π/2", "2π"]
fig, ax = plt.subplots()
ax.plot(x, fn(x))
ax.set_xticks(points)
ax.set_xticklabels(labels)
# the next line is what you probably want
for pt in points:
ax.plot(pt, fn(pt), "ok")
ax.hlines(0, x[0], x[-1], "r")
plt.show()
Looks like this:
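If the a, b, c, d parameters from the question vary, the tick positions don't need to be hard-coded either; here is a rough sketch of deriving them (the marker_positions helper is my own illustration, assuming the y = a*sin(b*x - c) + d form):

import numpy as np

def marker_positions(b, c, periods=2):
    # Quarter-period tick positions for y = a*sin(b*x - c) + d:
    # one period is 2*pi/|b|, and the first period "starts" where
    # b*x - c = 0, i.e. at x = c/b (y = d, curve rising).
    start = c / b
    quarter = (2 * np.pi / abs(b)) / 4
    return [start + k * quarter for k in range(4 * periods + 1)]

print(marker_positions(b=1, c=0))  # 0, pi/2, pi, ..., 4*pi

The same positions can then be passed to ax.set_xticks() and to fn() to place the black dots.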
I have a large array of elements that I call RelDist (which, dimensionally, is a unit of distance) in a simulated volume. I am attempting to determine the distribution of the "number of values per unit volume", i.e. the number density. It should be similar to this diagram:
I am aware that the axes are scaled log base 10; the plot of this set should definitely drop off.
Mathematically, I set it up as two equivalent equations:

n(r) = (1/V) dN/d(ln r),

where N is the number of elements in the array, differentiated with respect to the natural log of the distances, and V is the volume. It can be equivalently re-written in the form of a regular derivative by introducing another factor of r.
Equivalently,

n(r) = (r/V) dN/dr.

So, for ever-increasing r, I want to count the change in N per logarithmic bin of r.
As of now, I have trouble setting up the frequency counting in the histogram while accommodating the volume alongside it.
Attempt 1
This is using the dN/dlnr/volume equations
def n(dist, numbins):
    logdist = np.log(dist)
    hist, r_array = np.histogram(logdist, numbins)
    dlogR = r_array[1] - r_array[0]
    x_array = r_array[1:] - dlogR/2
    ## I am confident the above part of this code is correct.
    ## The succeeding portion does not work.
    dR = r_array[1:] - r_array[0:numbins]
    dN_dlogR = hist * x_array/dR
    volume = 4*np.pi*dist*dist*dist
    ## The included volume is incorrect
    return [x_array, dN_dlogR/volume]
Plotting this does not even properly show a distribution like the first plot I posted above, and it only works when I choose the number of bins to match the shape of my input array. The bin number should be arbitrary, should it not?
Attempt 2
This is using the equivalent dN/dr/volume equation.
numbins = np.linspace(min(RelDist),max(RelDist), 100)
hist, r_array = np.histogram(RelDist, numbins)
volume = 4*float(1000**2)
dR = r_array[1]-r_array[0]
x_array = r_array[1:] - dR/2
y = hist/dR
A little bit easier, but without including the volume term, I get a sort of histogram distribution, which is at least a start.
With this attempt, how would I include the volume term with the array?
Example
Start at a distance value R of something like 10 and count the change in number with respect to R, then increase to a distance value R of 20 and count the change, increase to 30 and count the change, and so on and so forth.
Here is a txt file of my array if you are interested in re-creating it
https://www.dropbox.com/s/g40gp88k2p6pp6y/RelDist.txt?dl=0
Since no one was able to answer, I will provide my result in case someone wants to use it in the future:
import numpy as np

def n_ln(dist, numbins):
    log_dist = np.log10(dist)
    bins = np.linspace(min(log_dist), max(log_dist), numbins)
    hist, r_array = np.histogram(log_dist, bins)
    dR = r_array[1] - r_array[0]
    x_array = r_array[1:] - dR/2
    volume = [4.*np.pi*i**3. for i in 10**x_array[:]]
    return [10**x_array, hist/dR/volume]
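For completeness, a quick usage sketch under my assumptions (RelDist loaded from the linked .txt file, a log-log plot of the result; the loadtxt call and plotting choices are illustrative, not part of the function above):

import numpy as np
import matplotlib.pyplot as plt

# Illustrative usage only: load the array from the linked file
# and plot the resulting number density on log-log axes.
RelDist = np.loadtxt('RelDist.txt')
r, n_density = n_ln(RelDist, numbins=30)

plt.loglog(r, n_density)
plt.xlabel('r')
plt.ylabel('dN / dln(r) / V')
plt.show()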
I have a function (f : black line) which varies sharply in a specific, small region (derivative f' : blue line, and second derivative f'' : red line). I would like to integrate this function numerically, and if I distribute points evenly (in log-space) I end up with fairly large errors in the sharply varying region (near 2E15 in the plot).
How can I construct an array spacing such that it is very well sampled in the area where the second derivative is large (i.e. a sampling frequency proportional to the second derivative)?
I happen to be using python, but I'm interested in a general algorithm.
Edit:
1) It would be nice to be able to still control the number of sampling points (at least roughly).
2) I've considered constructing a probability distribution function shaped like the second derivative and drawing randomly from that --- but I think this will offer poor convergence, and in general, it seems like a more deterministic approach should be feasible.
Assuming the values of f'' are in a NumPy array (call it d2f below, since f'' isn't a valid Python name), you could do the following:

import numpy as np

# Scale these deltas as you see fit
deltas = 1 / d2f
domain = deltas.cumsum()

To account only for order-of-magnitude swings, this could be adjusted as follows:

deltas = 1 / (-np.log10(1 / d2f))
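To make that concrete, here is one possible self-contained sketch of turning a coarse estimate of |f''| into a sample grid with roughly a fixed number of points (the second_derivative_spacing name, the quantile-inversion step, and the example function are my own illustrative choices, not part of the answer above):

import numpy as np

def second_derivative_spacing(f, coarse_x, target_n=200, eps=1e-12):
    # Rough sketch: estimate |f''| on a coarse grid, treat it as a sampling
    # density, and place target_n points so regions with large |f''| get
    # proportionally more of them.
    y = f(coarse_x)
    d2f = np.abs(np.gradient(np.gradient(y, coarse_x), coarse_x)) + eps
    # Cumulative "mass" of |f''| (trapezoid rule), normalized to [0, 1]
    mass = np.concatenate(([0.0],
                           np.cumsum(0.5 * (d2f[1:] + d2f[:-1]) * np.diff(coarse_x))))
    mass /= mass[-1]
    # Invert at evenly spaced quantiles: dense where |f''| is large
    return np.interp(np.linspace(0, 1, target_n), mass, coarse_x)

# Example: a function with a sharp transition near x = 3
xs = second_derivative_spacing(lambda x: np.tanh(20 * (x - 3)), np.linspace(0, 6, 400))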
I'm just spitballing here ... (as I don't have time to try this out for real)...
Your data looks (roughly) linear on a log-log plot (at least, each segment seems to be), so I might consider doing a sort-of integration in log-space.
import numpy as np

log_x = np.log(x)
log_y = np.log(y)

Now, for each of your points, you can get the slope (and intercept) in log-log space:

rise = np.diff(log_y)
run = np.diff(log_x)
slopes = rise / run

And, similarly, the intercepts can be calculated:

# y = mx + b
# :. b = y - mx
intercepts = log_y[:-1] - slopes * log_x[:-1]
Alright, now we have a bunch of (straight) lines in log-log space. But a straight line in log-log space, log(y) = k*log(x) + b, corresponds to y = exp(b) * x^k in real space. We can integrate that easily enough: the antiderivative of a*x^k is a/(k+1) * x^(k+1), with a = exp(b), so...

def _eval_log_log_integrate(b, k, x):
    # b is the log-space intercept, so exp(b) is the real-space coefficient
    return np.exp(b)/(k+1) * x ** (k+1)

def log_log_integrate(b, k, x1, x2):
    return _eval_log_log_integrate(b, k, x2) - _eval_log_log_integrate(b, k, x1)

partial_integrals = []
for b, k, x_lower, x_upper in zip(intercepts, slopes, x[:-1], x[1:]):
    partial_integrals.append(log_log_integrate(b, k, x_lower, x_upper))

total_integral = sum(partial_integrals)
You'll want to check my math -- It's been a while since I've done this sort of thing :-)
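As a quick sanity check of the segment-wise formula (my own test, not from the answer above): for y = 3*x**2 the exact integral from 1 to 10 is 10**3 - 1 = 999, and since a pure power law is a single straight line in log-log space, the piecewise integration should reproduce it almost exactly:

import numpy as np

x = np.geomspace(1, 10, 50)
y = 3 * x**2

log_x, log_y = np.log(x), np.log(y)
slopes = np.diff(log_y) / np.diff(log_x)
intercepts = log_y[:-1] - slopes * log_x[:-1]

total = sum(np.exp(b) / (k + 1) * (x2**(k + 1) - x1**(k + 1))
            for b, k, x1, x2 in zip(intercepts, slopes, x[:-1], x[1:]))
print(total)  # ~999.0, matching the analytic result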
1) The Cool Approach
At the moment I have implemented an 'adaptive refinement' approach inspired by hydrodynamics techniques. I have a function which I want to sample, f, and I choose some initial array of sample points x_i. I construct a "sampling" function g, which determines where to insert new sample points.
In this case I chose g as the slope of log(f) --- since I want to resolve rapid changes in log space. I then divide the span of g into L=3 refinement levels. If g(x_i) exceeds a refinement level, that span is subdivided into N=2 pieces; those subdivisions are added into the samples and checked against the next level (a rough code sketch follows after the figure description below). This yields something like this:
The solid grey line is the function I want to sample, and the black crosses are my initial sampling points.
The dashed grey line is the derivative of the log of my function.
The colored dashed lines are my 'refinement levels'
The colored crosses are my refined sampling points.
This is all shown in log-space.
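In code, the refinement step looks roughly like the following simplified sketch (the single-pass geometric subdivision here is a simplification of the level-by-level checking described above, and the function and argument names are illustrative):

import numpy as np

def refine_samples(f, x, n_levels=3, n_sub=2):
    # Simplified sketch: use the slope of log(f) vs log(x) as the
    # "sampling" function g, split its range into n_levels thresholds,
    # and subdivide an interval into n_sub pieces per threshold that
    # its g value exceeds.  Assumes x > 0 and f(x) > 0 so the logs exist.
    x = np.sort(np.asarray(x, dtype=float))
    g = np.abs(np.diff(np.log(f(x))) / np.diff(np.log(x)))
    thresholds = np.linspace(g.min(), g.max(), n_levels + 2)[1:-1]
    refined = list(x)
    for lo, hi, gi in zip(x[:-1], x[1:], g):
        depth = int(np.searchsorted(thresholds, gi))  # levels exceeded
        if depth > 0:
            # subdivide geometrically, since everything is viewed in log-space
            refined.extend(np.geomspace(lo, hi, n_sub**depth + 1)[1:-1])
    return np.unique(refined)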
2) The Simple Approach
After I finished (1), I realized that I probably could have just chosen a maximum spacing in y, and chosen x-spacings to achieve that. Similarly, just divide the function evenly in y and find the corresponding x points. The results of this are shown below:
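A minimal sketch of that 'divide evenly in y' idea, assuming the function is monotonic on the interval so the inversion is well-defined (the even_in_y helper and the np.interp-based inversion are illustrative, not my exact code):

import numpy as np

def even_in_y(f, x_min, x_max, n_points=100, n_fine=10000):
    # Sketch: evaluate f on a fine grid, pick evenly spaced y values,
    # and interpolate back to the x values that produce them.
    # Assumes f is monotonic on [x_min, x_max].
    x_fine = np.linspace(x_min, x_max, n_fine)
    y_fine = f(x_fine)
    y_targets = np.linspace(y_fine.min(), y_fine.max(), n_points)
    order = np.argsort(y_fine)  # np.interp needs increasing sample points
    return np.interp(y_targets, y_fine[order], x_fine[order])

x_samples = even_in_y(lambda x: np.tanh(20 * (x - 3)), 0.0, 6.0)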
A simple approach would be to split the x-axis array into three parts and use different spacing for each of them. It would allow you to maintain the total number of points and also the required spacing in different regions of the plot. For example:

import numpy as np

x = np.linspace(10**13, 10**15, 100)
x = np.append(x, np.linspace(10**15, 10**16, 100))
x = np.append(x, np.linspace(10**16, 10**18, 100))
You may want to choose a better spacing based on your data, but you get the idea.
Does anyone know a good method to calculate the empirical/sample covariogram, if possible in Python?
This is a screenshot of a book which contains a good definition of covariogram:
If I understood it correctly, for a given lag/width h, I'm supposed to get all the pairs of points that are separated by h (or less than h), multiply their values, and, for each of these points, calculate its mean, which in this case is defined as m(x_i). However, according to the definition of m(x_i), if I want to compute m(x1), I need to obtain the average of the values located within distance h from x1. This looks like a very intensive computation.
First of all, am I understanding this correctly? If so, what is a good way to compute this assuming a two dimensional space? I tried to code this in Python (using numpy and pandas), but it takes a couple of seconds and I'm not even sure it is correct, that is why I will refrain from posting the code here. Here is another attempt of a very naive implementation:
import numpy as np
from scipy.spatial.distance import pdist, squareform

distances = squareform(pdist(np.array(coordinates)))  # coordinates is a nx2 array
z = np.array(z)  # z are the values
cutoff = np.max(distances)/3.0  # somewhat arbitrary cutoff
width = cutoff/15.0
widths = np.arange(0, cutoff + width, width)

Z = []
Cov = []

for w in np.arange(len(widths)-1):  # for each width
    # for each pairwise distance
    for i in np.arange(distances.shape[0]):
        for j in np.arange(distances.shape[1]):
            if distances[i, j] <= widths[w+1] and distances[i, j] > widths[w]:
                m1 = []
                m2 = []
                # when a distance is within a given width, calculate the means of
                # the points involved
                for x in np.arange(distances.shape[1]):
                    if distances[i, x] <= widths[w+1] and distances[i, x] > widths[w]:
                        m1.append(z[x])
                for y in np.arange(distances.shape[1]):
                    if distances[j, y] <= widths[w+1] and distances[j, y] > widths[w]:
                        m2.append(z[y])
                mean_m1 = np.array(m1).mean()
                mean_m2 = np.array(m2).mean()
                Z.append(z[i]*z[j] - mean_m1*mean_m2)
    Z_mean = np.array(Z).mean()  # calculate covariogram for width w
    Cov.append(Z_mean)  # collect covariances for all widths
However, now I have confirmed that there is an error in my code. I know that because I used the variogram to calculate the covariogram (covariogram(h) = covariogram(0) - variogram(h)) and I get a different plot:
And it is supposed to look like this:
Finally, if you know a Python/R/MATLAB library to calculate empirical covariograms, let me know. At least, that way I can verify what I did.
One could use scipy.cov, but if one does the calculation directly (which is very easy), there are more ways to speed this up.
First, make some fake data that has some spatial correlations. I'll do this by first making a spatial correlation map, then generating random data points that are positioned according to the underlying map and also take on its values.
Edit 1:
I changed the data point generator so positions are purely random, but z-values are proportional to the spatial map. And, I changed the map so that the left and right sides are shifted relative to each other to create negative correlation at large h.
from numpy import *
import random
import math
import matplotlib.pyplot as plt

S = 1000
N = 900

# first, make some fake data, with correlations on two spatial scales
# density map
x = linspace(0, 2*pi, S)
sx = sin(3*x)*sin(10*x)
density = .8 * abs(outer(sx, sx))
density[:, :S//2] += .2

# make a point cloud motivated by this density
random.seed(10)  # so this can be repeated
points = []
while len(points) < N:
    v, ix, iy = random.random(), random.randint(0, S-1), random.randint(0, S-1)
    if True:  # v < density[ix, iy]:
        points.append([ix, iy, density[ix, iy]])
locations = array(points).transpose()
print(locations.shape)

plt.imshow(density, alpha=.3, origin='lower')
plt.plot(locations[1, :], locations[0, :], '.k')
plt.xlim((0, S))
plt.ylim((0, S))
plt.show()

# build these into the main data: all pairs into distances and z0 z1 values
L = locations
m = array([[math.sqrt((L[0, i]-L[0, j])**2 + (L[1, i]-L[1, j])**2), L[2, i], L[2, j]]
           for i in range(N) for j in range(N) if i > j])
Which gives:
The above is just the simulated data, and I made no attempt to optimize its production, etc. I assume this is where the OP starts, with the task below, since the data already exists in a real situation.
Now calculate the "covariogram" (which is much easier than generating the fake data, btw). The idea here is to sort all the pairs and associated values by h, and then index into these using ihvals. That is, summing up to index ihval is the sum over N(h) in the equation, since this includes all pairs with h below the desired values.
Edit 2:
As suggested in the comments below, N(h) is now only the pairs that are between h-dh and h, rather than all pairs between 0 and h (where dh is the spacing of the h-values in hvals -- i.e., S/1000 was used below).
# now do the real calculations for the covariogram
# sort by h and give clear names
i = argsort(m[:, 0])  # h sorting
h = m[i, 0]
zh = m[i, 1]
zsh = m[i, 2]
zz = zh*zsh

hvals = linspace(0, S, 1000)  # the values of h to use (S should be in units of distance; here I just used ints)
ihvals = searchsorted(h, hvals)

result = []
for i, ihval in enumerate(ihvals[1:]):
    start, stop = ihvals[i], ihval  # pairs with h between hvals[i] and hvals[i+1]
    N = stop - start
    if N > 0:
        mnh = sum(zh[start:stop])/N
        mph = sum(zsh[start:stop])/N
        szz = sum(zz[start:stop])/N
        C = szz - mnh*mph
        result.append([h[ihval], C])
result = array(result)

plt.plot(result[:, 0], result[:, 1])
plt.grid()
plt.show()
which looks reasonable to me, as one can see bumps or troughs at the expected h values, but I haven't done a careful check.
The main speedup here over scipy.cov is that one can precalculate all of the products, zz. Otherwise, one would feed zh and zsh into cov for every new h, and all the products would be recalculated. This calculation could be sped up even more by doing partial sums, i.e. from ihvals[n-1] to ihvals[n] at each step n, but I doubt that will be necessary.
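If that extra speedup ever became necessary, one possible sketch (not part of the answer above; it assumes the sorted h, zh, zsh, zz arrays and the ihvals indices from the code above, all of which point inside those arrays) is to replace the Python loop with np.add.reduceat, which sums each slice ihvals[n]:ihvals[n+1] in a single vectorized call:

import numpy as np

counts = np.diff(ihvals)                    # pairs per h-bin
sum_zh = np.add.reduceat(zh, ihvals[:-1])
sum_zsh = np.add.reduceat(zsh, ihvals[:-1])
sum_zz = np.add.reduceat(zz, ihvals[:-1])

# reduceat quirk: an empty slice returns the single element at its start
# index, so mask out bins with no pairs instead of trusting those sums.
valid = counts > 0
C = (sum_zz[valid] - sum_zh[valid] * sum_zsh[valid] / counts[valid]) / counts[valid]
h_at_bin = hvals[1:][valid]                 # upper edge of each non-empty bin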