I want to compute the discrete difference of an identity matrix.
The code below uses NumPy and SciPy.
import numpy as np
from scipy.sparse import identity
from scipy.sparse import csc_matrix
x = identity(4).toarray()
y = csc_matrix(np.diff(x, n=2))
print(y)
I would like to improve performance or memory usage.
Since an identity matrix consists mostly of zeros, performing the calculation in compressed sparse column (CSC) format would reduce memory usage. However, np.diff() does not accept CSC input, so converting between CSC and dense format with csc_matrix slows things down a bit.
Normal format
x = identity(4).toarray()
print(x)
[[1. 0. 0. 0.]
[0. 1. 0. 0.]
[0. 0. 1. 0.]
[0. 0. 0. 1.]]
csc format
x = identity(4)
print(x)
(0, 0) 1.0
(1, 1) 1.0
(2, 2) 1.0
(3, 3) 1.0
Thanks
Here is my hacky solution to get the sparse matrix as you want.
L - the length of the original identity matrix,
n - the parameter of np.diff.
In your question they are:
L = 4
n = 2
My code produces the same y as your code, but without the conversions between csc and normal formats.
Your code:
import numpy as np
from scipy.sparse import identity, csc_matrix
x = identity(L).toarray()
y = csc_matrix(np.diff(x, n=n))
My code:
from scipy.linalg import pascal
def get_data(n, L):
    # last row of the lower-triangular Pascal matrix: binomial coefficients C(n, k)
    nums = pascal(n + 1, kind='lower')[-1].astype(float)
    # signs alternate and the last coefficient is always +1,
    # so the negative entries start at index (n + 1) % 2
    minuses_from = (n + 1) % 2
    nums[minuses_from::2] *= -1
    return np.tile(nums, L - n)
data = get_data(n, L)
row_ind = (np.arange(n + 1) + np.arange(L - n).reshape(-1, 1)).flatten()
col_ind = np.repeat(np.arange(L - n), n + 1)
y = csc_matrix((data, (row_ind, col_ind)), shape=(L, L - n))
I have noticed that after applying np.diff to the identity matrix n times, the values of the columns are the binomial coefficients with their signs alternating. This is my variable data.
Then I am just constructing the csc_matrix.
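The binomial pattern described above can be checked numerically. The following is a minimal sketch, assuming Python 3.8+ for math.comb; the helper name diff_column is hypothetical and only used for this illustration.

```python
import numpy as np
from math import comb

def diff_column(n):
    """Nonzero entries of one column of np.diff(np.eye(L), n=n), top to bottom."""
    # Entry k is (-1)**(n - k) * C(n, k): alternating signs, the last one positive.
    return np.array([(-1) ** (n - k) * comb(n, k) for k in range(n + 1)], dtype=float)

L = 6
for n in range(1, 5):
    dense = np.diff(np.eye(L), n=n)          # shape (L, L - n)
    for j in range(L - n):
        # column j is nonzero only in rows j .. j + n
        assert np.array_equal(dense[j:j + n + 1, j], diff_column(n))
```

For n = 2 the column is [1, -2, 1], and for n = 3 it is [-1, 3, -3, 1], so the sign of the first entry depends on the parity of n.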
Unfortunately, it does not seem that SciPy provides any tools for this kind of sparse matrix manipulation. Regardless, by cleverly manipulating the indices and data of the entries one can emulate np.diff(x,n) in a straightforward fashion.
Given a 2D NumPy array (matrix) of dimension MxN, np.diff() multiplies each column (of column index y) by -1 and adds the next column (column index y+1) to it. A difference of order k is just k successive applications of the difference of order 1. A difference of order 0 just returns the input matrix.
The method below makes use of this, iteratively eliminating duplicate entries by addition through sum_duplicates(), reducing the number of columns by one, and filtering out invalid indices.
def csc_diff(x, n):
    '''Emulates np.diff(x,n) for a sparse matrix by iteratively taking difference of order 1'''
    # use `and`, not `&`: `&` binds tighter than `==` and `>=`, which breaks these checks
    assert isinstance(x, csc_matrix) or (isinstance(x, np.ndarray) and len(x.shape) == 2), "Input matrix must be a 2D np.ndarray or csc_matrix."
    assert isinstance(n, int) and n >= 0, "Integer n must be larger or equal to 0."
    if n >= x.shape[1]:
        return csc_matrix(([], ([], [])), shape=(x.shape[0], 0))
    if isinstance(x, np.ndarray):
        x = csc_matrix(x)
    # set-up of data/indices via column-wise difference
    for k in range(n):
        # extract data/indices of the stored entries of the (current) sparse matrix
        M, N = x.shape
        coo = x.tocoo()  # keeps rows, columns and data aligned, even with stored zeros
        idx, idy, dat = coo.row, coo.col, coo.data
        # difference: this column (y) * (-1) + next column (y+1)
        idx = np.concatenate((idx, idx))
        idy = np.concatenate((idy, idy-1))
        dat = np.concatenate(((-1)*dat, dat))
        # filter valid indices
        validInd = (0<=idy) & (idy<N-1)
        # x_diff: csc_matrix emulating np.diff(x,1)'s output
        x_diff = csc_matrix((dat[validInd], (idx[validInd], idy[validInd])), shape=(M, N-1))
        x_diff.sum_duplicates()
        x = x_diff
    return x
Moreover, the method returns an empty csc_matrix of dimension Mx0 when the difference order is larger than or equal to the number of columns of the input matrix, which matches what np.diff() does. The output is therefore identical, see
csc_diff(x, 2).toarray()
> array([[ 1., 0.],
[-2., 1.],
[ 1., -2.],
[ 0., 1.]])
which is identical to
np.diff(x.toarray(), 2)
> array([[ 1., 0.],
[-2., 1.],
[ 1., -2.],
[ 0., 1.]])
This identity holds for other difference orders, too
(csc_diff(x, 0).toarray() == np.diff(x.toarray(), 0)).all()
>True
(csc_diff(x, 3).toarray() == np.diff(x.toarray(), 3)).all()
>True
(csc_diff(x, 13).toarray() == np.diff(x.toarray(), 13)).all()
>True
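Another way to look at the same computation: taking the n-th difference along the columns is a linear map, so it can be written as a product with a banded sparse operator. The sketch below is an assumption-laden illustration rather than a SciPy feature; the name diff_operator is hypothetical, and it relies on scipy.sparse.diags broadcasting scalar diagonals when shape is given.

```python
import numpy as np
from math import comb
from scipy.sparse import csc_matrix, diags, identity

def diff_operator(num_cols, n):
    """Sparse (num_cols, num_cols - n) matrix D with x @ D == np.diff(x, n)."""
    if not 0 <= n < num_cols:
        raise ValueError("n must satisfy 0 <= n < num_cols")
    # Column j of D holds (-1)**(n - k) * C(n, k) at row j + k, for k = 0..n,
    # i.e. a constant value on the diagonal with offset -k.
    coeffs = [(-1) ** (n - k) * comb(n, k) for k in range(n + 1)]
    offsets = [-k for k in range(n + 1)]
    return diags(coeffs, offsets, shape=(num_cols, num_cols - n),
                 format="csc", dtype=float)

x = identity(6, format="csc")
y = x @ diff_operator(6, 3)      # stays sparse, no dense intermediate

m = np.arange(12.0).reshape(3, 4) ** 2
m_diff = csc_matrix(m) @ diff_operator(4, 2)
```

The operator only depends on the number of columns and on n, so it can be built once and reused for several sparse inputs of the same width.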
Q1.
I'm trying to write a custom autograd function in PyTorch.
But I ran into a problem deriving the analytical backpropagation for y = x / sum(x, dim=0),
where the tensor x has size (Height, Width) (x is 2-dimensional).
Here's my code
import torch
class MyFunc(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        input = input / torch.sum(input, dim=0)
        return input
    @staticmethod
    def backward(ctx, grad_output):
        input = ctx.saved_tensors[0]
        H, W = input.size()
        sum = torch.sum(input, dim=0)
        grad_input = grad_output * (1/sum - input*1/sum**2)
        return grad_input
I used gradcheck (from torch.autograd) to compare the Jacobian matrices,
from torch.autograd import gradcheck
func = MyFunc.apply
input = (torch.randn(3,3,dtype=torch.double,requires_grad=True))
test = gradcheck(func, input)
and the result was
Could someone please help me get the correct backpropagation result?
Thanks!
Q2.
Thanks for the answers!
Thanks to your help, I could implement backpropagation for an (H,W) tensor.
However, when I implemented backpropagation for an (N,H,W) tensor, I ran into a problem.
I think the problem might be in how I initialize the new tensor.
Here's my new code
import torch
import torch.nn as nn
import torch.nn.functional as F
class MyFunc(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        N = input.size(0)
        for n in range(N):
            input[n] /= torch.sum(input[n], dim=0)
        return input
    @staticmethod
    def backward(ctx, grad_output):
        input = ctx.saved_tensors[0]
        N, H, W = input.size()
        I = torch.eye(H).unsqueeze(-1)
        sum = input.sum(1)
        grad_input = torch.zeros((N,H,W), dtype = torch.double, requires_grad=True)
        for n in range(N):
            grad_input[n] = ((sum[n] * I - input[n]) * grad_output[n] / sum[n]**2).sum(1)
        return grad_input
Gradcheck code is
from torch.autograd import gradcheck
func = MyFunc.apply
input = (torch.rand(2,2,2,dtype=torch.double,requires_grad=True))
test = gradcheck(func, input)
print(test)
and result is
I don't know why the error occurs...
Your help will be very helpful for me to implement my own convolutional network.
Thanks! Have a nice day.
Let's look at an example with a single column, for instance: [[x1], [x2], [x3]].
Let sum be x1 + x2 + x3, then normalizing x will give y = [[y1], [y2], [y3]] = [[x1/sum], [x2/sum], [x3/sum]]. You're looking for dL/dx1, dL/dx2, and dL/dx3 - we'll just write them as: dx1, dx2, and dx3. Same for all dL/dyi.
So dx1 is equal to dL/dy1*dy1/dx1 + dL/dy2*dy2/dx1 + dL/dy3*dy3/dx1. That's because x1 contributes to all output elements in the corresponding column: y1, y2, and y3.
We have:
dy1/dx1 = d(x1/sum)/dx1 = (sum - x1)/sum²
dy2/dx1 = d(x2/sum)/dx1 = -x2/sum²
similarly, dy3/dx1 = d(x3/sum)/dx1 = -x3/sum²
Therefore dx1 = (sum - x1)/sum²*dy1 - x2/sum²*dy2 - x3/sum²*dy3. Same for dx2 and dx3. As a result, the Jacobian is [dxi]_i = (sum - xi)/sum² and [dxi]_j = -xj/sum² (for all j different to i).
In your implementation, you seem to be missing all non-diagonal components.
Keeping the same one-column example, with x1=2, x2=3, and x3=5:
>>> x = torch.tensor([[2.], [3.], [5.]])
>>> sum = x.sum(0)
tensor([10.])
The Jacobian will be:
>>> J = (sum*torch.eye(x.size(0)) - x)/sum**2
tensor([[ 0.0800, -0.0200, -0.0200],
[-0.0300, 0.0700, -0.0300],
[-0.0500, -0.0500, 0.0500]])
For an implementation with multiple columns, it's a bit trickier, more specifically for the shape of the diagonal matrix. It's easier to keep the column axis last so we don't have to bother with broadcastings:
>>> x = torch.tensor([[2., 1], [3., 3], [5., 5]])
>>> sum = x.sum(0)
tensor([10., 9.])
>>> diag = sum*torch.eye(3).unsqueeze(-1).repeat(1, 1, len(sum))
tensor([[[10., 9.],
[ 0., 0.],
[ 0., 0.]],
[[ 0., 0.],
[10., 9.],
[ 0., 0.]],
[[ 0., 0.],
[ 0., 0.],
[10., 9.]]])
Above diag has a shape of (3, 3, 2) where the two columns are on the last axis. Notice how we didn't need to broadcast sum.
What I wouldn't have done is: torch.eye(3).unsqueeze(0).repeat(len(sum), 1, 1). Since with this kind of shape - (2, 3, 3) - you will have to use sum[:, None, None], and will need further broadcasting down the road...
The Jacobian is simply:
>>> J = (diag - x)/sum**2
tensor([[[ 0.0800, 0.0988],
[-0.0300, -0.0370],
[-0.0500, -0.0617]],
[[-0.0200, -0.0123],
[ 0.0700, 0.0741],
[-0.0500, -0.0617]],
[[-0.0200, -0.0123],
[-0.0300, -0.0370],
[ 0.0500, 0.0494]]])
You can check the results by backpropagating through the operation using an arbitrary dy vector (not with torch.ones though, you'll get 0s because of J!). After backpropagating, x.grad should be equal to torch.einsum('abc,bc->ac', J, dy).
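The same check can be done without any autograd framework. The following is a NumPy sketch, not the PyTorch implementation itself: contracting J with dy collapses to the closed form dx_a = dy_a/sum - (x . dy)/sum², which is then compared against central finite differences. The function names are hypothetical and only serve the illustration.

```python
import numpy as np

def normalize_forward(x, axis=0):
    """y = x / sum(x) along `axis`."""
    return x / x.sum(axis=axis, keepdims=True)

def normalize_backward(x, grad_out, axis=0):
    """Vector-Jacobian product: dx_a = dy_a / s - sum_b(x_b * dy_b) / s**2."""
    s = x.sum(axis=axis, keepdims=True)
    return grad_out / s - (x * grad_out).sum(axis=axis, keepdims=True) / s**2

def numeric_vjp(x, grad_out, axis=0, eps=1e-6):
    """Central finite differences of L = sum(grad_out * y) with respect to x."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        plus = (grad_out * normalize_forward(x + step, axis)).sum()
        minus = (grad_out * normalize_forward(x - step, axis)).sum()
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
x = rng.random((3, 2)) + 0.5            # positive entries keep the sums away from 0
dy = rng.standard_normal((3, 2))
analytic = normalize_backward(x, dy, axis=0)
numeric = numeric_vjp(x, dy, axis=0)
```

Because the closed form takes an axis argument, the same two functions also cover a batched (N, H, W) input normalized along axis 1.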
Your Jacobian is not accurate: It is a 4d tensor, you only computed a 2D slice of it.
You neglected the second row of the Jacobian:
Answer for Q2.
I implemented the backpropagation myself for the batched case.
I used the unsqueeze function and it worked.
Size of input: (N,H,W) (N is the batch size)
forward:
out = input / torch.sum(input, dim=1).unsqueeze(1)
backward:
diag = torch.eye(input.size(1), dtype=torch.double, requires_grad=True).unsqueeze(-1)
sum = input.sum(1)
grad_input = ((sum.unsqueeze(1).unsqueeze(1) * diag - input.unsqueeze(1)) * grad_out.unsqueeze(1) / (sum**2).unsqueeze(1).unsqueeze(1)).sum(2)
I have two numpy arrays of different shapes, but with the same length (leading dimension). I want to shuffle each of them, such that corresponding elements continue to correspond -- i.e. shuffle them in unison with respect to their leading indices.
This code works, and illustrates my goals:
def shuffle_in_unison(a, b):
    assert len(a) == len(b)
    shuffled_a = numpy.empty(a.shape, dtype=a.dtype)
    shuffled_b = numpy.empty(b.shape, dtype=b.dtype)
    permutation = numpy.random.permutation(len(a))
    for old_index, new_index in enumerate(permutation):
        shuffled_a[new_index] = a[old_index]
        shuffled_b[new_index] = b[old_index]
    return shuffled_a, shuffled_b
For example:
>>> a = numpy.asarray([[1, 1], [2, 2], [3, 3]])
>>> b = numpy.asarray([1, 2, 3])
>>> shuffle_in_unison(a, b)
(array([[2, 2],
[1, 1],
[3, 3]]), array([2, 1, 3]))
However, this feels clunky, inefficient, and slow, and it requires making a copy of the arrays -- I'd rather shuffle them in-place, since they'll be quite large.
Is there a better way to go about this? Faster execution and lower memory usage are my primary goals, but elegant code would be nice, too.
One other thought I had was this:
def shuffle_in_unison_scary(a, b):
    rng_state = numpy.random.get_state()
    numpy.random.shuffle(a)
    numpy.random.set_state(rng_state)
    numpy.random.shuffle(b)
This works...but it's a little scary, as I see little guarantee it'll continue to work -- it doesn't look like the sort of thing that's guaranteed to survive across numpy versions, for example.
You can use NumPy's array indexing:
def unison_shuffled_copies(a, b):
    assert len(a) == len(b)
    p = numpy.random.permutation(len(a))
    return a[p], b[p]
This will result in creation of separate unison-shuffled arrays.
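For newer code, the same idea can be written with NumPy's Generator API. This is a minimal sketch under the assumption that a NumPy version providing np.random.default_rng is available; the optional rng parameter is an addition for reproducibility and not part of the snippet above.

```python
import numpy as np

def unison_shuffled_copies(a, b, rng=None):
    """Return shuffled copies of a and b that share one permutation of axis 0."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    rng = np.random.default_rng() if rng is None else rng
    p = rng.permutation(len(a))
    # Indexing with an integer array copies, so a and b themselves stay untouched.
    return a[p], b[p]

a = np.array([[1, 1], [2, 2], [3, 3]])
b = np.array([1, 2, 3])
sa, sb = unison_shuffled_copies(a, b, rng=np.random.default_rng(42))
```

Passing a seeded generator makes the shuffle repeatable without touching NumPy's global random state.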
X = np.array([[1., 0.], [2., 1.], [0., 0.]])
y = np.array([0, 1, 2])
from sklearn.utils import shuffle
X, y = shuffle(X, y, random_state=0)
To learn more, see http://scikit-learn.org/stable/modules/generated/sklearn.utils.shuffle.html
Your "scary" solution does not appear scary to me. Calling shuffle() for two sequences of the same length results in the same number of calls to the random number generator, and these are the only "random" elements in the shuffle algorithm. By resetting the state, you ensure that the calls to the random number generator will give the same results in the second call to shuffle(), so the whole algorithm will generate the same permutation.
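The argument above can be illustrated with two arrays of different shape but equal length. This sketch uses two np.random.RandomState instances created from the same seed instead of saving and restoring the global state, which is an assumption made for the sake of a self-contained example; the principle is the same.

```python
import numpy as np

a = np.arange(5)                               # shape (5,)
b = np.arange(5).reshape(5, 1) * [10, 100]     # shape (5, 2), row i is [10*i, 100*i]

seed = 1234
# Both generators start from the same state, so both shuffles draw
# the same random numbers and apply the same permutation.
np.random.RandomState(seed).shuffle(a)
np.random.RandomState(seed).shuffle(b)
```

After the two calls, row i of b still belongs to element i of a.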
If you don't like this, a different solution would be to store your data in one array instead of two right from the beginning, and create two views into this single array simulating the two arrays you have now. You can use the single array for shuffling and the views for all other purposes.
Example: Let's assume the arrays a and b look like this:
a = numpy.array([[[ 0., 1., 2.],
[ 3., 4., 5.]],
[[ 6., 7., 8.],
[ 9., 10., 11.]],
[[ 12., 13., 14.],
[ 15., 16., 17.]]])
b = numpy.array([[ 0., 1.],
[ 2., 3.],
[ 4., 5.]])
We can now construct a single array containing all the data:
c = numpy.c_[a.reshape(len(a), -1), b.reshape(len(b), -1)]
# array([[ 0., 1., 2., 3., 4., 5., 0., 1.],
# [ 6., 7., 8., 9., 10., 11., 2., 3.],
# [ 12., 13., 14., 15., 16., 17., 4., 5.]])
Now we create views simulating the original a and b:
a2 = c[:, :a.size//len(a)].reshape(a.shape)
b2 = c[:, a.size//len(a):].reshape(b.shape)
The data of a2 and b2 is shared with c. To shuffle both arrays simultaneously, use numpy.random.shuffle(c).
In production code, you would of course try to avoid creating the original a and b at all and right away create c, a2 and b2.
This solution could be adapted to the case that a and b have different dtypes.
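One possible adaptation to different dtypes, sketched here as an assumption rather than as the method the answer had in mind, is a structured array: each record holds one sample of every array, and the fields act as views. The field names "a" and "b" and the shapes are made up for the illustration.

```python
import numpy as np

n = 4
# One record per sample: field "a" is a (2, 3) float block, field "b" two integers.
c = np.zeros(n, dtype=[("a", np.float64, (2, 3)), ("b", np.int64, (2,))])
a2 = c["a"]          # view of shape (4, 2, 3), dtype float64
b2 = c["b"]          # view of shape (4, 2), dtype int64
for i in range(n):
    a2[i] = i        # fill through the views so the rows are recognisable
    b2[i] = 10 * i

np.random.default_rng(0).shuffle(c)   # shuffles whole records in place
```

Since a2 and b2 share memory with c, both see the shuffled order without any extra copy.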
Very simple solution:
randomize = np.arange(len(x))
np.random.shuffle(randomize)
x = x[randomize]
y = y[randomize]
The two arrays x and y are now both randomly shuffled in the same way.
James wrote in 2015 an sklearn solution which is helpful. But he added a random state variable, which is not needed. In the below code, the random state from numpy is automatically assumed.
X = np.array([[1., 0.], [2., 1.], [0., 0.]])
y = np.array([0, 1, 2])
from sklearn.utils import shuffle
X, y = shuffle(X, y)
from numpy.random import permutation
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data #numpy array
y = iris.target #numpy array
# Data is currently unshuffled; we should shuffle
# each X[i] with its corresponding y[i]
perm = permutation(len(X))
X = X[perm]
y = y[perm]
Shuffle any number of arrays together, in-place, using only NumPy.
import numpy as np
def shuffle_arrays(arrays, set_seed=-1):
    """Shuffles arrays in-place, in the same order, along axis=0
    Parameters:
    -----------
    arrays : List of NumPy arrays.
    set_seed : Seed value if int >= 0, else seed is random.
    """
    assert all(len(arr) == len(arrays[0]) for arr in arrays)
    seed = np.random.randint(0, 2**(32 - 1) - 1) if set_seed < 0 else set_seed
    for arr in arrays:
        rstate = np.random.RandomState(seed)
        rstate.shuffle(arr)
And can be used like this
a = np.array([1, 2, 3, 4, 5])
b = np.array([10,20,30,40,50])
c = np.array([[1,10,11], [2,20,22], [3,30,33], [4,40,44], [5,50,55]])
shuffle_arrays([a, b, c])
A few things to note:
The assert ensures that all input arrays have the same length along their first dimension.
Arrays shuffled in-place by their first dimension - nothing returned.
Random seed within positive int32 range.
If a repeatable shuffle is needed, seed value can be set.
After the shuffle, the data can be split using np.split or referenced using slices - depending on the application.
you can make an array like:
s = np.arange(0, len(a), 1)
then shuffle it:
np.random.shuffle(s)
Now use s to index your arrays. Indexing with the same shuffled indices returns identically shuffled arrays.
x_data = x_data[s]
x_label = x_label[s]
There is a well-known function that can handle this:
from sklearn.model_selection import train_test_split
X, _, Y, _ = train_test_split(X,Y, test_size=0.0)
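Just setting test_size to 0 will avoid splitting and give you shuffled data. Note that this relies on older scikit-learn behaviour: current releases validate test_size and raise a ValueError for 0, in which case sklearn.utils.shuffle is the more direct choice.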
Though it is usually used to split train and test data, it does shuffle them too.
From documentation
Split arrays or matrices into random train and test subsets
Quick utility that wraps input validation and
next(ShuffleSplit().split(X, y)) and application to input data into a
single call for splitting (and optionally subsampling) data in a
oneliner.
This seems like a very simple solution:
import numpy as np
def shuffle_in_unison(a,b):
    assert len(a)==len(b)
    c = np.arange(len(a))
    np.random.shuffle(c)
    return a[c],b[c]
a = np.asarray([[1, 1], [2, 2], [3, 3]])
b = np.asarray([11, 22, 33])
shuffle_in_unison(a,b)
Out[94]:
(array([[3, 3],
[2, 2],
[1, 1]]),
array([33, 22, 11]))
One way in which in-place shuffling can be done for connected lists is using a seed (it could be random) and using numpy.random.shuffle to do the shuffling.
# Set seed to a random number if you want the shuffling to be non-deterministic.
def shuffle(a, b, seed):
    np.random.seed(seed)
    np.random.shuffle(a)
    np.random.seed(seed)
    np.random.shuffle(b)
That's it. This will shuffle both a and b in the exact same way. This is also done in-place which is always a plus.
EDIT: don't use np.random.seed(); use np.random.RandomState instead.
def shuffle(a, b, seed):
    rand_state = np.random.RandomState(seed)
    rand_state.shuffle(a)
    rand_state.seed(seed)
    rand_state.shuffle(b)
When calling it just pass in any seed to feed the random state:
a = [1,2,3,4]
b = [11, 22, 33, 44]
shuffle(a, b, 12345)
Output:
>>> a
[1, 4, 2, 3]
>>> b
[11, 44, 22, 33]
Edit: Fixed code to re-seed the random state
Say we have two arrays: a and b.
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
b = np.array([[9,1,1],[6,6,6],[4,2,0]])
We can first obtain row indices by permuting the first dimension
indices = np.random.permutation(a.shape[0])
[1 2 0]
Then use advanced indexing.
Here we are using the same indices to shuffle both arrays in unison.
a_shuffled = a[indices[:,np.newaxis], np.arange(a.shape[1])]
b_shuffled = b[indices[:,np.newaxis], np.arange(b.shape[1])]
This is equivalent to
np.take(a, indices, axis=0)
[[4 5 6]
[7 8 9]
[1 2 3]]
np.take(b, indices, axis=0)
[[6 6 6]
[4 2 0]
[9 1 1]]
If you want to avoid copying arrays, then I would suggest that instead of generating a permutation list, you go through every element in the array and randomly swap it with another position in the array
for old_index in range(len(a)):
    new_index = numpy.random.randint(old_index+1)
    # index with lists so the swap copies rows; a plain tuple swap of NumPy row views would overwrite one row
    a[[old_index, new_index]] = a[[new_index, old_index]]
    b[[old_index, new_index]] = b[[new_index, old_index]]
This implements the Knuth-Fisher-Yates shuffle algorithm.
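A self-contained sketch of such a swap loop is given below. It assumes NumPy arrays as input and uses a hypothetical function name; the detail worth noting is that rows of a multi-dimensional array are views, so the swap is done through list indexing, which copies the right-hand side before assigning.

```python
import numpy as np

def shuffle_in_place(a, b, rng=None):
    """Fisher-Yates shuffle of a and b along axis 0, using the same swaps for both."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    rng = np.random.default_rng() if rng is None else rng
    for i in range(len(a) - 1, 0, -1):
        j = int(rng.integers(0, i + 1))      # uniform in [0, i]
        if i != j:
            # `a[i], a[j] = a[j], a[i]` would swap views and lose a row for
            # 2-D input; fancy indexing avoids that.
            a[[i, j]] = a[[j, i]]
            b[[i, j]] = b[[j, i]]

a = np.array([[1, 1], [2, 2], [3, 3], [4, 4]])
b = np.array([1, 2, 3, 4])
shuffle_in_place(a, b, rng=np.random.default_rng(7))
```

The extra memory is limited to the two rows being swapped, but the Python-level loop is much slower than a vectorized shuffle, so this trade-off mainly pays off when memory is the binding constraint.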
Shortest and easiest way in my opinion, use seed:
random.seed(seed)
random.shuffle(x_data)
# reset the same seed to get the identical random sequence and shuffle the y
random.seed(seed)
random.shuffle(y_data)
Most solutions above work; however, if you have column vectors you have to transpose them first. Here is an example
def shuffle(self) -> None:
    """
    Shuffles X and Y
    """
    x = self.X.T
    y = self.Y.T
    p = np.random.permutation(len(x))
    self.X = x[p].T
    self.Y = y[p].T
With an example, this is what I'm doing:
combo = []
for i in range(60000):
    combo.append((images[i], labels[i]))
shuffle(combo)
im = []
lab = []
for c in combo:
    im.append(c[0])
    lab.append(c[1])
images = np.asarray(im)
labels = np.asarray(lab)
I extended python's random.shuffle() to take a second arg:
import random
def shuffle_together(x, y):
    assert len(x) == len(y)
    for i in reversed(range(1, len(x))):
        # pick an element in x[:i+1] with which to exchange x[i]
        j = int(random.random() * (i+1))
        x[i], x[j] = x[j], x[i]
        y[i], y[j] = y[j], y[i]
That way I can be sure that the shuffling happens in-place, and the function is not all too long or complicated.
Just use numpy...
First merge the two input arrays, where the 1D array holds the labels (y) and the 2D array holds the data (x), and shuffle the merged array with NumPy's shuffle method. Finally split them and return.
import numpy as np
def shuffle_2d(a, b):
    rows = a.shape[0]
    if b.shape != (rows,1):
        b = b.reshape((rows,1))
    S = np.hstack((b,a))
    np.random.shuffle(S)
    b, a = S[:,0], S[:,1:]
    return a,b
features, samples = 2, 5
x, y = np.random.random((samples, features)), np.arange(samples)
x, y = shuffle_2d(x, y)
OK, so basically my problem is shifting my frame of mind from solving math problems "on paper" to solving them by programming. Let me explain: I want to know whether it is possible to perform operations on a variable before assigning it a value. For example, if I have something like (1-x)**n, can I first assign n a value, then expand the expression into the form specific to that degree, and only then give x a value or values? In case I wasn't clear enough: if n=2, can I first turn the expression into the form 1-2x+x**2 and then, in the next step, take care of the value of x?
I want to write code for calculating and drawing an n-th degree Bezier curve. I am using Bernstein polynomials for this, so I realized that the equation consists of 3 parts: the first part is the polynomial coefficients, which all come from Pascal's triangle; I am calculating those and putting them in one list. The second part is the coordinates of the control points, which are also a kind of coefficient, and I put them in a separate list. Now comes the hard part: the part of the equation that contains the variable. Bernstein polynomials work with barycentric coordinates (meaning u and 1-u). The n-th degree formula for this part of the equation is:
u**i *(1-u)**(n-i)
where n is the curve degree, i goes from 0 to n, and u is the variable. u is actually a normalised variable, meaning that its value can range from 0 to 1, and I want to iterate over it later in a certain number of steps (like 1000). But the problem is that if I try to use the equation above I keep getting an error, because Python doesn't know what to do with u. I thought about nested loops in which the first one would iterate the value of u from 0 to 1 and the second would take care of the equation above for i from 0 to n, but I'm not sure it is the right solution, and I have no idea how to check the results. What do you think?
PS: I have not uploaded the code because I cannot even start the part I'm having a problem with, and I think (but could be wrong) that it is separate from the rest of the code; but if you think it can help solve the problem I can upload it.
You can do this with higher-order functions, that is, functions that return functions, like in
def Bernstein(n,i):
    def f(t):
        return t**i*(1.0-t)**(n-i)
    return f
that you could use like this
b52 = Bernstein(5,2)
val = b52(0.74)
but instead you'll rather use lists
Bernstein_ni = [Bernstein(n,i) for i in range(n+1)]
to be used in a higher order function to build the Bezier curve function
def mk_bezier(Px,Py):
    "Input, lists of control points, output a function of t that returns (x,y)"
    n = len(Px)
    binomials = {0:[1], 1:[1,1], 2:[1,2,1],
                 3:[1,3,3,1], 4:[1,4,6,4,1], 5:[1,5,10,10,5,1]}
    binomial = binomials[n-1]
    bPx = [b*x for b,x in zip(binomial,Px)]
    bPy = [b*y for b,y in zip(binomial,Py)]
    bns = [Bernstein(n-1,i) for i in range(n)]
    def f(t):
        x = 0 ; y = 0
        for i in range(n):
            berns = bns[i](t)
            x = x + bPx[i]*berns
            y = y + bPy[i]*berns
        return x, y
    return f
eventually, in your program, you can use the function factory like this
linear = mk_bezier([0.0,1.0],[1.0,0.0])
quadra = mk_bezier([0.0,1.0,2.0],[1.0,3.0,1.0])
for t in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0):
    l = linear(t) ; q = quadra(t)
    print("%3.1f (%6.4f,%6.4f) (%6.4f,%6.4f)" % (t, l[0],l[1], q[0],q[1]))
and this is the testing output
0.0 (0.0000,1.0000) (0.0000,1.0000)
0.1 (0.1000,0.9000) (0.2000,1.3600)
0.2 (0.2000,0.8000) (0.4000,1.6400)
0.3 (0.3000,0.7000) (0.6000,1.8400)
0.4 (0.4000,0.6000) (0.8000,1.9600)
0.5 (0.5000,0.5000) (1.0000,2.0000)
0.6 (0.6000,0.4000) (1.2000,1.9600)
0.7 (0.7000,0.3000) (1.4000,1.8400)
0.8 (0.8000,0.2000) (1.6000,1.6400)
0.9 (0.9000,0.1000) (1.8000,1.3600)
1.0 (1.0000,0.0000) (2.0000,1.0000)
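If NumPy is available, the nested loops over u and i can be replaced by broadcasting. The following is a sketch under that assumption; the function name bezier_points and the num parameter are made up here, and math.comb (Python 3.8+) supplies the binomial coefficients for any degree.

```python
import numpy as np
from math import comb

def bezier_points(control_points, num=1000):
    """Evaluate a Bezier curve at `num` evenly spaced parameters in [0, 1]."""
    P = np.asarray(control_points, dtype=float)    # shape (n + 1, dim)
    n = len(P) - 1                                  # curve degree
    u = np.linspace(0.0, 1.0, num)[:, None]         # shape (num, 1)
    i = np.arange(n + 1)                            # shape (n + 1,)
    binom = np.array([comb(n, k) for k in i], dtype=float)
    # Bernstein basis C(n, i) * u**i * (1 - u)**(n - i), shape (num, n + 1)
    basis = binom * u**i * (1.0 - u)**(n - i)
    return basis @ P                                # shape (num, dim)

# Same quadratic control points as in the test above, sampled at t = 0.0, 0.1, ..., 1.0
quadra = bezier_points([(0.0, 1.0), (1.0, 3.0), (2.0, 1.0)], num=11)
```

Each row of the result is one (x, y) point, so the array can be passed directly to a plotting call as quadra[:, 0] and quadra[:, 1].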
Edit
I think that the right way to do it is at the module level, with a top level sort-of-defaultdictionary that memoizes all the different lists required to perform the actual computations, but defaultdict doesn't pass a variable to its default_factory and I don't feel like subclassing dict (not now) for the sake of this answer, the main reason being that I've never subclassed before...
In response to OP comment
You say that the function degree is the main parameter? But it is implicitly defined by the length of the list of control points...
N = user_input()
P0x = user_input()
P0y = user_input()
PNx = user_input()
PNy = user_input()
# code that computes P1, ..., PNminus1
orderN = mk_bezier([P0x,P1x,...,PNminus1x,PNx],
[P0y,P1y,...,PNminus1y,PNy])
x077, y077 = orderN(0.77)
But the customer is always right, so I'll never try again to convince you that my solution works for you if you state that it does things differently from your expectations.
There are Python packages for doing symbolic math, but it might be easier to use some of the polynomial functions available in Numpy. These functions use the convention that a polynomial is represented as an array of coefficients, starting with the lowest order coefficient. So a polynomial a*x^2 + b*x + c would be represented as array([c, b, a]).
Some examples:
In [49]: import numpy.polynomial.polynomial as poly
In [50]: p = [-1, 1] # -x + 1
In [51]: p = poly.polypow(p, 2)
In [52]: p # should be 1 - 2x + x^2
Out[52]: array([ 1., -2., 1.])
In [53]: x = np.arange(10)
In [54]: poly.polyval(x, p) # evaluate polynomial at points x
Out[54]: array([ 1., 0., 1., 4., 9., 16., 25., 36., 49., 64.])
And you could calculate your Bernstein polynomial in a way similar to this (there is still a binomial coefficient missing):
In [55]: def Bernstein(n, i):
...: part1 = poly.polypow([0, 1], i) # (0 + u)**i
...: part2 = poly.polypow([1, -1], n - i) # (1 - u)**(n - i)
...: return poly.polymul(part1, part2)
In [56]: p = Bernstein(3, 2)
In [57]: p
Out[57]: array([ 0., 0., 1., -1.])
In [58]: poly.polyval(x, p) # evaluate polynomial at points x
Out[58]: array([ 0., 0., -4., -18., ..., -448., -648.])
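To include the missing binomial coefficient, the product can simply be scaled. The sketch below assumes math.comb (Python 3.8+) and uses a hypothetical function name; it also checks a known property of the Bernstein basis, namely that the n + 1 basis polynomials of degree n sum to the constant 1.

```python
import numpy as np
import numpy.polynomial.polynomial as poly
from math import comb

def bernstein_coeffs(n, i):
    """Coefficients (lowest order first) of C(n, i) * u**i * (1 - u)**(n - i)."""
    part1 = poly.polypow([0, 1], i)         # u**i
    part2 = poly.polypow([1, -1], n - i)    # (1 - u)**(n - i)
    return comb(n, i) * poly.polymul(part1, part2)

n = 3
basis = [bernstein_coeffs(n, i) for i in range(n + 1)]
total = np.sum(basis, axis=0)               # coefficients of the sum of all basis polynomials
value = poly.polyval(0.5, basis[2])         # B_{2,3}(0.5) = 3 * 0.25 * 0.5
```

Once the coefficient arrays exist, poly.polyval evaluates them for a whole array of u values at once, which answers the "expand first, substitute later" part of the question.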
I would like students to solve a quadratic program in an assignment without them having to install extra software like cvxopt etc. Is there a python implementation available that only depends on NumPy/SciPy?
I'm not very familiar with quadratic programming, but I think you can solve this sort of problem just using scipy.optimize's constrained minimization algorithms. Here's an example:
import numpy as np
from scipy import optimize
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d.axes3d import Axes3D
# minimize
# F = x[1]^2 + 4x[2]^2 -32x[2] + 64
# subject to:
# x[1] + x[2] <= 7
# -x[1] + 2x[2] <= 4
# x[1] >= 0
# x[2] >= 0
# x[2] <= 4
# in matrix notation:
# F = (1/2)*x.T*H*x + c*x + c0
# subject to:
# Ax <= b
# where:
# H = [[2, 0],
# [0, 8]]
# c = [0, -32]
# c0 = 64
# A = [[ 1, 1],
# [-1, 2],
# [-1, 0],
# [0, -1],
# [0, 1]]
# b = [7,4,0,0,4]
H = np.array([[2., 0.],
[0., 8.]])
c = np.array([0, -32])
c0 = 64
A = np.array([[ 1., 1.],
[-1., 2.],
[-1., 0.],
[0., -1.],
[0., 1.]])
b = np.array([7., 4., 0., 0., 4.])
x0 = np.random.randn(2)
def loss(x, sign=1.):
    return sign * (0.5 * np.dot(x.T, np.dot(H, x))+ np.dot(c, x) + c0)
def jac(x, sign=1.):
    return sign * (np.dot(x.T, H) + c)
cons = {'type':'ineq',
        'fun':lambda x: b - np.dot(A,x),
        'jac':lambda x: -A}
opt = {'disp':False}
def solve():
    res_cons = optimize.minimize(loss, x0, jac=jac,constraints=cons,
                                 method='SLSQP', options=opt)
    res_uncons = optimize.minimize(loss, x0, jac=jac, method='SLSQP',
                                   options=opt)
    print('\nConstrained:')
    print(res_cons)
    print('\nUnconstrained:')
    print(res_uncons)
    x1, x2 = res_cons['x']
    f = res_cons['fun']
    x1_unc, x2_unc = res_uncons['x']
    f_unc = res_uncons['fun']
    # plotting
    xgrid = np.mgrid[-2:4:0.1, 1.5:5.5:0.1]
    xvec = xgrid.reshape(2, -1).T
    F = np.vstack([loss(xi) for xi in xvec]).reshape(xgrid.shape[1:])
    ax = plt.axes(projection='3d')
    # ax.hold(True) is dropped: it was removed in Matplotlib 3.0 and holding is the default
    ax.plot_surface(xgrid[0], xgrid[1], F, rstride=1, cstride=1,
                    cmap=plt.cm.jet, shade=True, alpha=0.9, linewidth=0)
    ax.plot3D([x1], [x2], [f], 'og', mec='w', label='Constrained minimum')
    ax.plot3D([x1_unc], [x2_unc], [f_unc], 'oy', mec='w',
              label='Unconstrained minimum')
    ax.legend(fancybox=True, numpoints=1)
    ax.set_xlabel('x1')
    ax.set_ylabel('x2')
    ax.set_zlabel('F')
Output:
Constrained:
status: 0
success: True
njev: 4
nfev: 4
fun: 7.9999999999997584
x: array([ 2., 3.])
message: 'Optimization terminated successfully.'
jac: array([ 4., -8., 0.])
nit: 4
Unconstrained:
status: 0
success: True
njev: 3
nfev: 5
fun: 0.0
x: array([ -2.66453526e-15, 4.00000000e+00])
message: 'Optimization terminated successfully.'
jac: array([ -5.32907052e-15, -3.55271368e-15, 0.00000000e+00])
nit: 3
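The reported constrained minimum can be sanity-checked with NumPy alone by testing the KKT conditions at x = (2, 3). This is a sketch for this specific problem, not a general QP solver: because H is positive definite, a feasible point with non-negative multipliers on the active constraints is the global minimum.

```python
import numpy as np

H = np.array([[2., 0.], [0., 8.]])
c = np.array([0., -32.])
c0 = 64.
A = np.array([[1., 1.], [-1., 2.], [-1., 0.], [0., -1.], [0., 1.]])
b = np.array([7., 4., 0., 0., 4.])

x_star = np.array([2., 3.])             # constrained minimum reported above
grad = H @ x_star + c                   # gradient of the objective at x_star
slack = b - A @ x_star                  # all entries >= 0 means feasible
active = np.isclose(slack, 0.0)         # constraints that hold with equality

# Stationarity: grad + A_active.T @ lam = 0, with lam >= 0
lam, *_ = np.linalg.lstsq(A[active].T, -grad, rcond=None)
f_star = 0.5 * x_star @ H @ x_star + c @ x_star + c0
```

Only the constraint -x[1] + 2x[2] <= 4 is active here, with multiplier 4, and the objective value 8 agrees with the fun entry in the solver output.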
This might be a late answer, but I found CVXOPT - http://cvxopt.org/ - as the commonly used free python library for Quadratic Programming. However, it is not easy to install, as it requires the installation of other dependencies.
I ran across a good solution and wanted to get it out there. There is a python implementation of LOQO in the ELEFANT machine learning toolkit out of NICTA (http://elefant.forge.nicta.com.au as of this posting). Have a look at optimization.intpointsolver. This was coded by Alex Smola, and I've used a C-version of the same code with great success.
mystic provides a pure python implementation of nonlinear/non-convex optimization algorithms with advanced constraints functionality that typically is only found in QP solvers. mystic actually provides more robust constraints than most QP solvers. However, if you are looking for optimization algorithmic speed, then the following is not for you. mystic is not slow, but it's pure python as opposed to python bindings to C. If you are looking for flexibility and QP constraints functionality in a nonlinear solver, then you might be interested.
"""
Maximize: f = 2*x[0]*x[1] + 2*x[0] - x[0]**2 - 2*x[1]**2
Subject to: -2*x[0] + 2*x[1] <= -2
2*x[0] - 4*x[1] <= 0
x[0]**3 -x[1] == 0
where: 0 <= x[0] <= inf
1 <= x[1] <= inf
"""
import numpy as np
import mystic.symbolic as ms
import mystic.solvers as my
import mystic.math as mm
# generate constraints and penalty for a nonlinear system of equations
ieqn = '''
-2*x0 + 2*x1 <= -2
2*x0 - 4*x1 <= 0'''
eqn = '''
x0**3 - x1 == 0'''
cons = ms.generate_constraint(ms.generate_solvers(ms.simplify(eqn,target='x1')))
pens = ms.generate_penalty(ms.generate_conditions(ieqn), k=1e3)
bounds = [(0., None), (1., None)]
# get the objective
def objective(x, sign=1):
x = np.asarray(x)
return sign * (2*x[0]*x[1] + 2*x[0] - x[0]**2 - 2*x[1]**2)
# solve
x0 = np.random.rand(2)
sol = my.fmin_powell(objective, x0, constraint=cons, penalty=pens, disp=True,
bounds=bounds, gtol=3, ftol=1e-6, full_output=True,
args=(-1,))
print('x* = %s; f(x*) = %s' % (sol[0], -sol[1]))
One thing to note is that mystic can generically apply LP, QP, and higher order equality and inequality constraints to any given optimizer, not just a special QP solver. Secondly, mystic can digest symbolic math, so defining and entering the constraints is a bit nicer than working with the matrices and derivatives of functions. mystic depends on numpy, and will use scipy if it is installed (however, scipy is not required). mystic utilizes sympy to handle symbolic constraints, but it's also not required for optimization in general.
Output:
Optimization terminated successfully.
Current function value: -2.000000
Iterations: 3
Function evaluations: 103
x* = [ 2. 1.]; f(x*) = 2.0
Get mystic here: https://github.com/uqfoundation
The qpsolvers package also seems to fit the bill. It only depends on NumPy and can be installed by pip install qpsolvers. Then, you can do:
from numpy import array, dot
from qpsolvers import solve_qp
M = array([[1., 2., 0.], [-8., 3., 2.], [0., 1., 1.]])
P = dot(M.T, M) # quick way to build a symmetric matrix
q = dot(array([3., 2., 3.]), M).reshape((3,))
G = array([[1., 2., 1.], [2., 0., 1.], [-1., 2., -1.]])
h = array([3., 2., -2.]).reshape((3,))
# min. 1/2 x^T P x + q^T x with G x <= h
print("QP solution:", solve_qp(P, q, G, h))
You can also try different QP solvers (such as CVXOPT mentioned by Curious) by changing the solver keyword argument, for example solver='cvxopt' or solver='osqp'.