# imports
import numpy as np
import matplotlib.pyplot as plt
Problem
I have two arrays representing data points on a line. For example:
x = np.array([0,1,2])
y = np.array([5,6,4])
Plot the data:
plt.plot(x,y)
plt.scatter(x,y,c='r')
Get:
I'd like to have arrays whose data points trace the same lines as above, but with the midpoint of every segment added, so the point density is doubled. The desired result is:
new_x = np.array([0,0.5,1,1.5,2]) # this is the result I am looking for
new_y = np.array([5,5.5,6,5,4]) # this is the result I am looking for
Plot:
plt.plot(new_x,new_y)
plt.scatter(new_x,new_y,c='r')
Result:
My attempt
My method of doing this:
x = np.array([0,1,2])
y = np.array([5,6,4])
new_x=[]
for index, each in enumerate(x):
    if index!=0:
        new_x.append((each+x[index-1])/2)
new_y=[]
for index, each in enumerate(y):
    if index!=0:
        new_y.append((each+y[index-1])/2)
new_x.extend(x)
new_y.extend(y)
Plot:
plt.plot(new_x,new_y)
plt.scatter(new_x,new_y,c='r')
Result:
Points are in the right place, the new points added are the correct points, but they are put into the new_x & new_y arrays in the wrong order. It is fixable by paying more attention to the .extend, but this does not seem a good method to me anyway, because of the iteration.
My research
linear interpolation between two data points: doesn't use 2D arrays
How to write a function that returns an interpolated value (pandas dataframe)?: deals with interpolated values of functions, not resampling lines as above
There are some other questions with the [interpolation] tag, but I haven't found one describing the problem above.
Question
How do I get the above described new_x and new_y, preferably without iteration?
You could use these one-liner list comprehensions:
# For the last element the "midpoint" is the element itself, so the trailing duplicate is dropped with [:-1]
new_x = [j for i, v in enumerate(x) for j in [v, (x[i] + x[(i + 1 if (len(x) - 1) != i else i)]) / 2]][:-1]
new_y = [j for i, v in enumerate(y) for j in [v, (y[i] + y[(i + 1 if (len(y) - 1) != i else i)]) / 2]][:-1]
plt.plot(new_x, new_y)
plt.scatter(new_x, new_y, c='r')
plt.show()
Output:
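If you would rather avoid the Python-level loop entirely, the same interleaving can probably be done with NumPy slicing. This is only a sketch: the helper name with_midpoints is made up for illustration, and it assumes non-empty 1D inputs.

```python
import numpy as np

def with_midpoints(a):
    """Return `a` with the midpoint of each consecutive pair inserted (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    out = np.empty(2 * a.size - 1)
    out[0::2] = a                      # original points go to the even positions
    out[1::2] = (a[:-1] + a[1:]) / 2   # midpoints go to the odd positions
    return out

x = np.array([0, 1, 2])
y = np.array([5, 6, 4])
new_x = with_midpoints(x)
new_y = with_midpoints(y)
print(new_x)
print(new_y)
```

Because x and y are handled independently, this does not need x to be sorted. When x is increasing, np.interp(new_x, x, y) should give the same new_y.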
This is an extension of my previous project, where the equation X[i+1] = R*X[i]*(1-X[i]) is used to demonstrate a chaotic system (depending on R). Now I'm trying to construct the bifurcation diagram.
About the code: I defined a function that does the actual calculations and returns the last 100 calculated values (to ensure equilibrium has been reached). To plot R against x[i], I append each R value to an empty X-data list and the multiple x[i] values (i.e. the 100 returned values) to a Y-data list (so it is actually a nested list).
The thing is, depending on the R value, x[i] can be either a single value (once equilibrium is reached) or multiple values. So I was thinking of "purifying" the nested Y-data list with numpy.unique() to remove all the duplicate values.
Weirdly, when I don't make the extra "purification" step, the code actually works.
But when I use x = np.unique(logistic_calc(R,N)), it throws an error: ValueError: setting an array element with a sequence.
Below is the code that works...
import numpy as np
import matplotlib.pyplot as plt
R = 0.2
N = 10_000
x0 = 0.5
def logistic_calc(R,N):
    x = np.empty(N)
    x[0] = x0
    for i in range(1, N):
        x[i] = R* x[i-1] * (1 - x[i-1])
    return x[-100:]
x_lst = []
y_lst = []
for r in np.linspace(0.1,4,100):
    R = r
    x = logistic_calc(R,N)
    x_lst.append(r)
    y_lst.append(x)
plt.figure(figsize=(7, 4))
plt.plot(x_lst, y_lst, ls='', marker='.',ms='0.5', c="royalblue")
plt.ylim(0, 1)
plt.grid(c="lightgray")
plt.xlabel(r"$r$")
plt.ylabel(r"$x_n$")
plt.show()
From matplotlib documentation, paragraph "Plotting multiple sets of data":
"If x and/or y are 2D arrays a separate data set will be drawn for every column. If both x and y are 2D, they must have the same shape. If only one of them is 2D with shape (N, m) the other must have length N and will be used for every data set m."
It is not explicitly written that all sublists must have the same length. But it only refers to 2D arrays and not ragged nested sequences. To understand the behavior of plt.plot, just imagine that x and y will be cast into numpy arrays. In your second case, since y_lst contains lists with different lengths, this conversion cannot be made.
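To see the conversion problem in isolation, here is a small sketch with made-up numbers (the lists equal and ragged are only illustrations, not the data from the question). Passing dtype=float forces the same kind of numeric conversion and should raise the error quoted in the question for the ragged case:

```python
import numpy as np

equal = [[0.1, 0.2], [0.3, 0.4]]   # every sublist has the same length
ragged = [[0.1, 0.2], [0.3]]       # sublists of different lengths

# Equal-length sublists convert cleanly to a 2D array.
print(np.asarray(equal, dtype=float).shape)

# Ragged sublists cannot form a rectangular float array.
try:
    np.asarray(ragged, dtype=float)
    ragged_ok = True
except ValueError as exc:
    ragged_ok = False
    print("ValueError:", exc)
```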
So I would go for something like this:
plt.figure(figsize=(7, 4))
for r in np.linspace(1, 4, 100):
    x = np.unique(logistic_calc(r, N))
    plt.plot([r], [x], '.', ms=.5, c="royalblue") # a little bit tricky!
    # OR
    # plt.plot([r] * len(x), x, '.', ms=.5, c="royalblue")
...
plt.show()
When I run your example with np.unique, I get ...
...
Traceback (most recent call last):
File "test.py", line 29, in <module>
plt.plot(x_lst, y_lst, ls='', marker='.',ms='0.5', c="royalblue")
... more stack trace
ValueError: setting an array element with a sequence.
So the error is clearly happening at line ...
plt.plot(x_lst, y_lst, ls='', marker='.',ms='0.5', c="royalblue")
because the shapes of x_lst and y_lst no longer match up when you use np.unique.
You can get the code to work by looping over each index of x_lst and y_lst and plotting them separately ...
import numpy as np
import matplotlib.pyplot as plt
R = 0.2
N = 10_000
x0 = 0.5
def logistic_calc(R,N):
    x = np.empty(N)
    x[0] = x0
    for i in range(1, N):
        x[i] = R* x[i-1] * (1 - x[i-1])
    return x[-100:]
x_lst = []
y_lst = []
for r in np.linspace(0.1,4,100):
    R = r
    x = logistic_calc(R,N)
    x = x.reshape(100)
    x_lst.append(r)
    y_lst.append(np.unique(x.round(decimals=4)))
plt.figure(figsize=(7, 4))
for x, y in zip(x_lst, y_lst):
    plt.plot([x]*len(y), y, ls='', marker='.',ms='0.5', c="royalblue")
plt.ylim(0, 1)
plt.grid(c="lightgray")
plt.xlabel(r"$r$")
plt.ylabel(r"$x_n$")
plt.show()
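As a possible variation on the loop above, the ragged lists can be flattened into two equally long 1D arrays so that a single plot call is enough. The values below are invented stand-ins for x_lst and y_lst, just to show the shapes involved:

```python
import numpy as np

# Hypothetical stand-ins: one r value per array of unique x values (lengths differ).
x_lst = [0.5, 3.2, 3.5]
y_lst = [np.array([0.0]),
         np.array([0.51, 0.80]),
         np.array([0.38, 0.50, 0.83, 0.87])]

counts = [len(y) for y in y_lst]
xs = np.repeat(x_lst, counts)   # each r repeated once per y value
ys = np.concatenate(y_lst)      # all y values in one flat array

print(xs)
print(ys)
```

With the real data, plt.plot(xs, ys, ls='', marker='.') would then draw all points in one call.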
I'm new to Python so please be patient. I appreciate any help!
What I have: three 1D lists (xr, yr, zr), one containing x-values, the other two y- and z-values
What I want to do: create a 3D contour plot in matplotlib
I realized that I need to convert the three 1D lists into three 2D lists, by using the meshgrid function.
Here's what I have so far:
xr = np.asarray(xr)
yr = np.asarray(yr)
zr = np.asarray(zr)
X, Y = np.meshgrid(xr,yr)
znew = np.array([zr for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = znew.reshape(X.shape)
Running this gives me the following error (for the last line I entered above):
total size of new array must be unchanged
I went digging around stackoverflow, and tried using suggestions from people having similar problems. Here are the errors I get from each of those suggestions:
Changing the last line to:
Z = znew.reshape(X.shape[0])
Gives the same error.
Changing the last line to:
Z = znew.reshape(X.shape[0], len(znew))
Gives the error:
Shape of x does not match that of z: found (294, 294) instead of (294, 86436).
Changing it to:
Z = znew.reshape(X.shape, len(znew))
Gives the error:
an integer is required
Any ideas?
Well, the sample code below works for me:
import numpy as np
import matplotlib.pyplot as plt
xr = np.linspace(-20, 20, 100)
yr = np.linspace(-25, 25, 110)
X, Y = np.meshgrid(xr, yr)
#Z = 4*X**2 + Y**2
zr = []
for i in range(0, 110):
    y = -25.0 + (50./110.)*float(i)
    for k in range(0, 100):
        x = -20.0 + (40./100.)*float(k)
        v = 4.0*x*x + y*y
        zr.append(v)
Z = np.reshape(zr, X.shape)
print(X.shape)
print(Y.shape)
print(Z.shape)
plt.contour(X, Y, Z)
plt.show()
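For comparison, the size mismatch from the question can be reproduced on a small case. This is a sketch with n = 4 standing in for the 294 points, and placeholder z-values: the list comprehension copies the whole of zr once per grid point, so znew ends up with n times as many elements as X.

```python
import numpy as np

n = 4  # small stand-in for the 294 points in the question
xr = np.arange(n, dtype=float)
yr = np.arange(n, dtype=float)
zr = xr + yr  # placeholder z-values, one per (x, y) sample

X, Y = np.meshgrid(xr, yr)
# Same construction as in the question: one full copy of zr per grid point.
znew = np.array([zr for x, y in zip(np.ravel(X), np.ravel(Y))])

print(X.shape)     # shape of the grid
print(znew.shape)  # one row per grid point, each row a copy of zr

try:
    znew.reshape(X.shape)
    reshape_failed = False
except ValueError as exc:
    reshape_failed = True
    print("ValueError:", exc)
```

In other words, reshape only works when there is exactly one z-value per grid point, which is what the loop in the sample above produces.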
TL;DR
import matplotlib.pyplot as plt
import numpy as np
def get_data_for_mpl(X, Y, Z):
    result_x = np.unique(X)
    result_y = np.unique(Y)
    # contour/contourf expect Z with shape (len(y), len(x)): rows follow y
    result_z = np.zeros((len(result_y), len(result_x)))
    # result_z[:] = np.nan
    for x, y, z in zip(X, Y, Z):
        i = np.searchsorted(result_x, x)
        j = np.searchsorted(result_y, y)
        result_z[j, i] = z
    return result_x, result_y, result_z
xr, yr, zr = np.genfromtxt('data.txt', unpack=True)
plt.contourf(*get_data_for_mpl(xr, yr, zr), 100)
plt.show()
Detailed answer
First, you need to find out for which values of x and y the graph is being plotted. This can be done using the numpy.unique function:
result_x = numpy.unique(X)
result_y = numpy.unique(Y)
Next, you need to create a numpy.ndarray with the function values for each point (x, y) from zip(X, Y). Rows correspond to y and columns to x, because that is the layout contour and contourf expect for Z:
result_z = numpy.zeros((len(result_y), len(result_x)))
for x, y, z in zip(X, Y, Z):
    i = search(result_x, x)
    j = search(result_y, y)
    result_z[j, i] = z
Here search is only a placeholder given as an intermediate step; it is not implemented anywhere. If an array is sorted, searching it takes logarithmic rather than linear time, so the numpy.searchsorted function is enough. To use it, the arrays result_x and result_y must be sorted. Fortunately, numpy.unique returns sorted output, so no additional work is needed: it is enough to replace search with np.searchsorted.
Finally, to get the desired image, it is enough to call the matplotlib.pyplot.contour or matplotlib.pyplot.contourf method.
If a function value does not exist for every combination of x from result_x and y from result_y, and you simply want nothing drawn there, it is enough to replace the missing values with NaN. Or, more simply, create result_z as a numpy.ndarray filled with NaN and then fill it in:
result_z = numpy.zeros((len(result_y), len(result_x)))
result_z[:] = numpy.nan
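As a possible shortcut, numpy.unique can also return the index of every input value in the sorted unique array (return_inverse=True), which makes the explicit search loop unnecessary. The sketch below uses a few made-up sample points and lays the grid out with rows following y, i.e. shape (len(y), len(x)), which is the layout contour documents for Z:

```python
import numpy as np

# Made-up scattered samples; the point (x=2, y=0) is deliberately missing.
X = np.array([0.0, 1.0, 0.0, 1.0, 2.0])
Y = np.array([0.0, 0.0, 1.0, 1.0, 1.0])
Z = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# ix[k] is the position of X[k] in ux; iy[k] is the position of Y[k] in uy.
ux, ix = np.unique(X, return_inverse=True)
uy, iy = np.unique(Y, return_inverse=True)

grid = np.full((len(uy), len(ux)), np.nan)  # start from NaN so gaps stay empty
grid[iy, ix] = Z                            # rows follow y, columns follow x
print(grid)
```

Calling plt.contourf(ux, uy, grid) should then leave the missing cell undrawn.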
I have a problem using matplotlib's streamplot. I want to take a 3D vector field in coordinates (x,y,z), stored in numpy arrays, and plot slices of it with streamplot.
To test it I wanted to use a vector field with arrows pointed up in the z>0 region and pointed down in the z<0 region.
So I tried this:
import numpy as np
import matplotlib.pyplot as plt
from math import *
max = 100
min = -100
X = np.linspace(min, max, num=100)
Y = np.linspace(min, max, num=100)
Z = np.linspace(min, max, num=100)
N = X.size
#single components in the 3D matrix
Bxa = np.zeros((N, N, N))
Bya = np.zeros((N, N, N))
Bza = np.zeros((N, N, N))
for i, x in enumerate(X):
    for j, y in enumerate(Y):
        for k, z in enumerate(Z):
            Bxa[ i, j, k] = 0.0 #x
            Bya[ i, j, k] = 0.0 #y
            Bza[ i, j, k] = z
#I take a slice close to Y=0 (integer division, since array indices must be integers)
Bx_sec = Bxa[:,N//2,:]
By_sec = Bya[:,N//2,:]
Bz_sec = Bza[:,N//2,:]
fig = plt.figure()
ax = fig.add_subplot(111)
ax.streamplot(X, Z, Bx_sec, Bz_sec, color='b')
ax.set_xlim([X.min(), X.max()])
ax.set_ylim([Z.min(), Z.max()])
plt.show()
But I obtain something that looks as if I had set Bza = x! I tried inverting the order of the vectors, but it did not help.
Does anyone of you understand the problem? Thanks
Gabriele
A friend told me:
The documentation for streamplot:
    x, y : 1d arrays
        an *evenly spaced* grid.
    u, v : 2d arrays
        x and y-velocities. Number of rows should match length of y, and
        the number of columns should match x.
Note that the rows in u and v should match y, and the columns should match x. I think your u and v are transposed.
So I used numpy.transpose() and everything worked!
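For anyone hitting the same thing, here is a small sketch of why the transpose is needed. It uses deliberately different (made-up) grid sizes so the axis order is visible in the shapes, and it builds only the Bz component by broadcasting instead of the triple loop:

```python
import numpy as np

X = np.linspace(-100, 100, 5)
Y = np.linspace(-100, 100, 4)
Z = np.linspace(-100, 100, 3)

# Bz = z everywhere, indexed as [i, j, k] <-> (x, y, z), like Bza in the question.
Bza = np.broadcast_to(Z[np.newaxis, np.newaxis, :], (X.size, Y.size, Z.size))

Bz_sec = Bza[:, Y.size // 2, :]   # slice at fixed y: rows follow x, columns follow z
Bz_plot = Bz_sec.T                # transposed: rows follow z, columns follow x

print(Bz_sec.shape)   # (len(X), len(Z))
print(Bz_plot.shape)  # (len(Z), len(X)), the layout streamplot(X, Z, u, v) expects
```

With the arrays from the question this corresponds to ax.streamplot(X, Z, Bx_sec.T, Bz_sec.T, color='b').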
Sorry for the seemingly elementary question. What I'm trying to implement is summarized in the following steps:
Generate input variables: x, y.
Let z = F(x,y).
Plot z's for particular combinations of x and y.
For example:
zlist = []
for _ in range(100):
    x = np.random.random()*1.
    y = np.random.random()*.5
    if x < .5:
        z = y / 2
    else:
        z = y * 2
    zlist.append(z)
Now if I want to plot z for all the x between (0, 0.3), I presumably would need some marker on each element in zlist indicating its inputs variables. How would I attach such marker and then access it from the list when plotting?
I don't actually know anything about Numpy, so someone please comment and tell me if I'm making a fool out of myself. It seems like vanilla python behavior, though.
Rather than appending z, let's append (z,x) instead. Now zlist is a list of tuples, and you can loop through and plot by checking zlist[i][1].
zlist = []
for _ in range(100):
    x = np.random.random()*1.
    y = np.random.random()*.5
    if x < .5:
        z = y / 2
    else:
        z = y * 2
    zlist.append((z,x))
for value in zlist:
    if value[1] > 0 and value[1] < 0.3:
        pass  # Plot value[0] here
# Or if you prefer list comprehensions:
# [value[0] for value in zlist if value[1] >0 and value[1] < 0.3]
# that will return a list with only the z values in zlist.
With numpy it's almost always much more efficient to perform operations on vectors and
arrays rather than on built-in Python sequence types such as lists. Here's one
way you can quickly find F(x, y) for every combination of two sets of random x
and y values without looping in Python. The result is going to be an nx-by-ny
array Z, where Z[i, j] = F(x[i], y[j]).
First of all, you can generate all of your x, y inputs as vectors:
nx = 100
ny = 200
x = np.random.random(size=nx) * 1.
y = np.random.random(size=ny) * 5.
For the result to be an nx-by-ny array, you could take these two vectors and
multiply them by ones to get two 2D nx-by-ny arrays containing the x and y
values in the rows and columns respectively. You can do this by taking advantage of numpy's
broadcasting rules:
x_arr = x[:,np.newaxis] * np.ones((nx,ny))
y_arr = y[np.newaxis,:] * np.ones((nx,ny))
The function you will apply to each x,y pair depends on the x value.
Fortunately, you can use np.where(<condition>, <do_this>, <do_that>) to apply
different operations to the values in your input depending on some condition:
Z = np.where(x_arr < 0.5, y_arr / 2., y_arr * 2.)
We can check that all the results are correct:
for i in range(nx):
    for j in range(ny):
        if x[i] < 0.5:
            assert Z[i, j] == y[j] / 2.
        else:
            assert Z[i, j] == y[j] * 2
There's actually an even cleaner way to compute Z without expanding x and y into 2D arrays. Using the same broadcasting trick we used to get
x_arr and y_arr, you can pass x and y directly to np.where():
x2 = x[:,np.newaxis]
y2 = y[np.newaxis,:]
Z2 = np.where(x2 < 0.5, y2 / 2., y2 * 2.)
assert np.all(Z == Z2)
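To get back to the original question (plotting z only for x between 0 and 0.3), a boolean mask on x can be used to pick the matching rows of Z. This is a sketch under a few assumptions: the seeded generator is there only to make the run repeatable, and the names mask and Z_selected are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
nx, ny = 100, 200
x = rng.random(nx) * 1.
y = rng.random(ny) * .5

# Same broadcasting construction as above: Z[i, j] = F(x[i], y[j]).
Z = np.where(x[:, np.newaxis] < 0.5, y[np.newaxis, :] / 2., y[np.newaxis, :] * 2.)

mask = (x > 0) & (x < 0.3)   # True for the x inputs inside (0, 0.3)
Z_selected = Z[mask]         # one row of z values per selected x

print(Z_selected.shape)
```

Because x is kept alongside Z, no marker needs to be attached to the individual z values; x[mask] gives the inputs that belong to the selected rows.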
I would like to filter the values of a numpy meshgrid:
X,Y = np.mgrid[-10:10,-10:10]
In this case, I would like to remove all coordinates for which x**2 + y**2 <= 2. However, when I try to filter the array directly, for example
filter(lambda x,y: x**2 + y**2 >= 2, np.meshgrid[-10:10,-10:10])
I get errors because I'm not properly dealing with the array's structure.
Any tips for doing this right would be appreciated!
I was able to achieve the result that I needed using numpy.where, by filtering each array individually, but referencing both in the where condition:
X,Y = np.mgrid[-10:10,-10:10]
X,Y = np.where(X**2 + Y**2 > 2, X, 0), np.where(X**2 + Y**2 > 2, Y, 0)
This results in new 2D arrays, which is what I needed for matplotlib. Thanks to everyone who took the time to look at this question!
X,Y = np.mgrid[-10:10,-10:10]
idx = (X**2 + Y**2 > 2)
X, Y = X[idx], Y[idx]
The problem is that you no longer have 2D arrays, which may be an issue for things like matplotlib.
Seeing your own answer, and that you basically want to replace with 0 entries not fulfilling the condition, it is probably going to be cleaner and more efficient to do:
idx = X**2 + Y**2 > 2
X[~idx] = 0
Y[~idx] = 0
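The difference between the approaches shows up in the shapes. A quick sketch on the same grid (flat_X and kept_2d are names made up for this check):

```python
import numpy as np

X, Y = np.mgrid[-10:10, -10:10]
idx = X**2 + Y**2 > 2

flat_X = X[idx]                 # boolean indexing returns a flattened 1D array
kept_2d = np.where(idx, X, 0)   # np.where keeps the original 2D grid shape

print(X.shape, flat_X.shape, kept_2d.shape)
print((~idx).sum())             # number of grid points that fail the condition
```

The in-place assignment X[~idx] = 0 gives the same values as kept_2d but overwrites X instead of creating a new array.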