Ok, so I feel like there should be an easy way to create a 3-dimensional scatter plot using matplotlib. I have a 3D numpy array (dset) with 0's where I don't want a point and 1's where I do, basically to plot it now I have to step through three for: loops as such:
for i in range(30):
for x in range(60):
for y in range(60):
if dset[i, x, y] == 1:
ax.scatter(x, y, -i, zdir='z', c= 'red')
Any suggestions on how I could accomplish this more efficiently? Any ideas would be greatly appreciated.
If you have a dset like that, and you want to just get the 1 values, you could use nonzero, which "returns a tuple of arrays, one for each dimension of a, containing the indices of the non-zero elements in that dimension.".
For example, we can make a simple 3d array:
>>> import numpy
>>> numpy.random.seed(29)
>>> d = numpy.random.randint(0, 2, size=(3,3,3))
>>> d
array([[[1, 1, 0],
[1, 0, 0],
[0, 1, 1]],
[[0, 1, 1],
[1, 0, 0],
[0, 1, 1]],
[[1, 1, 0],
[0, 1, 0],
[0, 0, 1]]])
and find where the nonzero elements are located:
>>> d.nonzero()
(array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2]), array([0, 0, 1, 2, 2, 0, 0, 1, 2, 2, 0, 0, 1, 2]), array([0, 1, 0, 1, 2, 1, 2, 0, 1, 2, 0, 1, 1, 2]))
>>> z,x,y = d.nonzero()
If we wanted a more complicated cut, we could have done something like (d > 3.4).nonzero() or something, as True has an integer value of 1 and counts as nonzero.
Finally, we plot:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x, y, -z, zdir='z', c= 'red')
plt.savefig("demo.png")
giving
If you wanted to avoid using the nonzero option (for example, if you had a 3D numpy array whose values were supposed to be the color values of the data points), you could do what you do, but save some lines of code by using ndenumerate.
Your example might become:
for index, x in np.ndenumerate(dset):
if x == 1:
ax.scatter(*index, c = 'red')
I guess the point is just that you dont need to have nested for loops to iterate through multidimensional numpy arrays.
Related
I use a data set called: arr0 with dimensions: (x, y, t) = (151, 151, 600). To create a 2D image I took a specific y-coordinate with slicing i.e. arr0[:,0], now the dimensions are: (x, t) = (151, 600). Within the dataset arr0[:,0] there are x-values for a time greater than t = 0.00173 (below the blue line in the image and last line of code, both shown below). What I would like to do is zero out all x-values for t > 0.00173.
The problem is that I don't know how I should do this, could someone please help me?
Best,
FR
plt.figure(figsize = (10,10))
plt.subplots_adjust(wspace=0.3)
plt.subplot(121)
plt.imshow((np.rot90(arr0[:,0],3)),cmap='Greys',extent=[xr1,xr2,taxis[nt-1],taxis[0]],aspect='auto')
plt.xlabel("Xr location [m]")
plt.ylabel("Time [s]")
plt.title("Direct wave after extrap (arr0)")
plt.hlines(0.00173,-0.3,0.3)
2D graph with values I want to turn zero below the blue line
If I understand you correctly, we can regard x, y, t as index variables along the axes of your 3-dimensional array, which contains the actual data values (represented by grayscale pixels in the plot). Here is a simplified example:
import numpy as np
import matplotlib.pyplot as plt
arr0 = np.array([[[0, 2, 1, 1, 2, 2],
[0, 2, 1, 1, 2, 2]],
[[0, 3, 1, 1, 2, 2],
[0, 3, 1, 1, 2, 2]]])
arr0.shape
(2, 2, 6)
Now I suppose you don't want to set the x values to zero, but rather the actual data values. Since NumPy works with integer indices, you'll have to convert your t-threshold to the appropriate integer index, based on the range of t values and the length of the array in dimension t:
plt.figure(figsize = (10, 4))
plt.subplots_adjust(wspace=0.3)
# plot original data slice
plt.subplot(121)
plt.imshow((np.rot90(arr0[:,0],3)), cmap='Greys', aspect='auto')
# define t-threshold integer index based on t range
t_threshold = int(arr0.shape[2] * 0.00173 / 0.006) + 1
# clean up data by setting values above t-threshold to zero
arr0[:, 0, t_threshold:] = 0
# plot cleaned up data slice
plt.subplot(122)
plt.imshow((np.rot90(arr0[:,0],3)), cmap='Greys', aspect='auto');
Note that assigning new values to the slice has changed the full array accordingly:
arr0
array([[[0, 2, 0, 0, 0, 0],
[0, 2, 1, 1, 2, 2]],
[[0, 3, 0, 0, 0, 0],
[0, 3, 1, 1, 2, 2]]])
I'm working with numpy and I got a problem with index, I have a numpy array of zeros, and a 2D array of indexes, what I need is to use this indexes to change the values of the array of zeros by the value of 1, I tried something, but it's not working, here is what I tried.
import numpy as np
idx = np.array([0, 3, 4],
[1, 3, 5],
[0, 4, 5]]) #Array of index
zeros = np.zeros(6) #Array of zeros [0, 0, 0, 0, 0, 0]
repeat = np.tile(zeros, (idx.shape[0], 1)) #This repeats the array of zeros to match the number of rows of the index array
res = []
for i, j in zip(repeat, idx):
res.append(i[j] = 1) #Here I try to replace the matching index by the value of 1
output = np.array(res)
but I get the syntax error
expression cannot contain assignment, perhaps you meant "=="?
my desired output should be
output = [[1, 0, 0, 1, 1, 0],
[0, 1, 0, 1, 0, 1],
[1, 0, 0, 0, 1, 1]]
This is just an example, the idx array can be bigger, I think the problem is the indexing, and I believe there is a much simple way of doing this without repeating the array of zeros and using the zip function, but I can't figure it out, any help would be aprecciated, thank you!
EDIT: When I change the = by == I get a boolean array which I don't need, so I don't know what's happening there either.
You can use np.put_along_axis to assign values into the array repeat based on indices in idx. This is more efficient than a loop (and easier).
import numpy as np
idx = np.array([[0, 3, 4],
[1, 3, 5],
[0, 4, 5]]) #Array of index
zeros = np.zeros(6).astype(int) #Array of zeros [0, 0, 0, 0, 0, 0]
repeat = np.tile(zeros, (idx.shape[0], 1))
np.put_along_axis(repeat, idx, 1, 1)
repeat will then be:
array([[1, 0, 0, 1, 1, 0],
[0, 1, 0, 1, 0, 1],
[1, 0, 0, 0, 1, 1]])
FWIW, you can also make the array of zeros directly by passing in the shape:
np.zeros([idx.shape[0], 6])
I have an example array that looks like array = np.array([[1,1,0,1], [0,1,0,0], [1,1,1,0], [0,0,1,2], [0,1,3,2], [1,1,0,1], [0,1,0,0]]) ...
array([[1, 1, 0, 1],
[0, 1, 0, 0],
[1, 1, 1, 0],
[0, 0, 1, 2],
[0, 1, 3, 2],
[1, 1, 0, 1],
[0, 1, 0, 0]])
With this in mind I want reformat this array into subarrays based off of the first two columns. Using How to split a numpy array based on a column? as a reference, I made this array into a list of arrays with ...
df = pd.DataFrame(array)
df['4'] = df[0].astype(str) + df[1].astype(str)
df['4'] = df['4'].astype(int)
arr = df.to_numpy()
y = [arr[arr[:,4]==k] for k in np.unique(arr[:,4])]
where y is ...
[array([[0, 0, 1, 2, 0]]),
array([[0, 1, 0, 0, 1],
[0, 1, 3, 2, 1],
[0, 1, 0, 0, 1]]),
array([[ 1, 1, 0, 1, 11],
[ 1, 1, 1, 0, 11],
[ 1, 1, 0, 1, 11]])]
This works fine but it takes far too long for y to run. The amount of time it takes increases exponentially with every row. I am playing around with hundreds of millions of rows and y = [arr[arr[:,4]==k] for k in np.unique(arr[:,4])] is not practical from a time standpoint.
Any ideas on how to speed this up?
What about using the numpy_indexed library:
import numpy as np
import numpy_indexed as npi
a = np.array([[1, 1, 0, 1],
[0, 1, 0, 0],
[1, 1, 1, 0],
[0, 0, 1, 2],
[0, 1, 3, 2],
[1, 1, 0, 1],
[0, 1, 0, 0]])
key = np.dot(a[:,:2], [1, 10])
y = npi.group_by(key).split_array_as_list(arr)
Output
y
[array([[0, 0, 1, 2]]),
array([[0, 1, 0, 0],
[0, 1, 3, 2],
[0, 1, 0, 0]]),
array([[ 1, 1, 0, 1],
[ 1, 1, 1, 0],
[ 1, 1, 0, 1]])]
You can easily install the library with:
> pip install numpy-indexed
Let me know if this performs better,
from collections import defaultdict
import numpy as np
outgen = defaultdict(lambda: [])
# arr: The input numpy array, :type: np.ndarray.
c = map(lambda x: ((x[0], x[1]), x), arr)
for key, val in c:
outgen[key].append(val)
# outgen: The required output, :type: list[np.ndarray].
outgen = [np.array(x) for x in outgen.values()]
You can use np.unique directly here.
unique, indexer = np.unique(arr[:, :2], axis=0, return_inverse=True)
{i: arr[indexer == k, :] for i, k in enumerate(unique)}
This is probably about as good as it gets for your desired output. However, instead of splitting it into a list of subarrays you could sort it by the unique key and then work with slices. This might be helpful if there are many unique values leading to a long list.
arr[:] = arr[np.argsort(indexer), :] # not sure if this is guaranteed to preserve the order within each group
EDIT:
Here is a powerful solution which I have been using for a sort of 2-D factorization. It takes 8ms for 1 million rows of single digit integers (vs > 100ms for np.unique).
columns = x[:, 0], x[:, 1]
factored = map(pd.factorize, columns)
codes, unique_values = map(list, zip(*factored))
group_index = get_group_index(codes, map(len, unique_values), sort=False, xnull=False)
It uses the internal algorithm of Dataframe.drop_duplicates.
Note that the ordering of the keys is not the sort order of the unique tuples.
There is also a new open source library, riptable which emulates numpy and pandas in some ways but is can be a lot more powerful. The creation of th takes around 4ms
import riptable as rt
columns = [x[:, 0], x[:, 1]]
unique_values, key = rt.unique(columns, return_inverse=True)
Here, unique_values is a tuple containing two arrays which can be zipped to get the unique tuples
In the code that I am writing, I have three 2D numpy arrays with the same dimensions (m x n), with each 2D array containing info about a specific trait, but each corresponding cell (with a specific row/col value) across all three 2D arrays corresponding to a specific person. The three 2D arrays are trait1, trait2, and trait3. As an example, person (0, 0) will have traits 1, 2, but not three, if only trait1 and trait2 have a value of 1 at location (0,0), but trait3 does not.
What would be an efficient method of updating a 2D array at a specific location based on the values of other corresponding 2D arrays of the same dimension at the same location? That is, how can I efficiently update a 2D array at a specific location such that the other 2D arrays at this same location fulfill specific conditions?
I am currently trying to update the values of the 2D array trait1 and trait2 according to the current values of trait1 and trait2 (such that the corresponding trait1 value == 1, and the corresponding trait2 value == 0); I am also trying to update the values of trait3 according to the current values of trait1, and trait2 (under the same conditions as the previous). However, I am having trouble doing this without using nested for loops, which greatly slows down my program.
Below is my current approach, which works, but is much too slow for my purposes:
for i in range (0, m):
for j in range (0, n):
if trait1[i][j] == 1:
if trait2[i][j] == 0:
trait1[i][j] = 0
trait2[i][j] = 1
new_color(i, j, 1) #updates the color of the specific person on a grid
trait3[i][j] = 0
elif trait1[i][j] == 0:
if trait2[i][j] <= 0:
trait1[i][j] = 1
trait2[i][j] = 0
new_color(i, j, 0)
Numpy array are really slow if you use loop indeed. If you can use matrices operations / numpy function for everything, it will go much faster.
In your case, you could first extract the indices you're interested about, and then update your matrices like this:
import numpy as np
np.random.seed(1)
# Generate some sample data
trait1, trait2, trait3 = ( np.random.randint(0,2, [4,4]) for _ in range(3) )
In [4]: trait1
Out[4]:
array([[1, 1, 0, 0],
[1, 1, 1, 1],
[1, 0, 0, 1],
[0, 1, 1, 0]])
In [5]: trait2
Out[5]:
array([[0, 1, 0, 0],
[0, 1, 0, 0],
[1, 0, 0, 0],
[1, 0, 0, 0]])
In [6]: trait3
Out[6]:
array([[1, 1, 1, 1],
[1, 0, 0, 0],
[1, 1, 1, 1],
[1, 1, 0, 1]])
And then:
cond1_idx = np.where((trait1 == 1) & (trait2==0))
cond2_idx = np.where((trait1 == 0) & (trait2<=0))
trait1[cond1_idx] = 0
trait2[cond1_idx] = 1
trait3[cond1_idx] = 0
[ new_color(i, j, 1) for i,j in zip(*cond1_idx) ]
trait1[cond2_idx] = 1
trait2[cond2_idx] = 0
[ new_color(i, j, 0) for i,j in zip(*cond2_idx) ]
Result:
In [2]: trait1
Out[2]:
array([[0, 1, 1, 1],
[0, 1, 0, 0],
[1, 1, 1, 0],
[0, 0, 0, 1]])
In [3]: trait2
Out[3]:
array([[1, 1, 0, 0],
[1, 1, 1, 1],
[1, 0, 0, 1],
[1, 1, 1, 0]])
In [4]: trait3
Out[4]:
array([[0, 1, 1, 1],
[0, 0, 0, 0],
[1, 1, 1, 0],
[1, 0, 0, 1]])
I cannot really test the new_color though since I don't have the function
I am working on a piece of python code that will take in an image in grey scale, scale it, and output a 3d model with the height of each pixel being determined by the value of the grey scale. I have everything working except the output of the 3d model. I am using numpy-stl to create it based on an array of values derived from the image. Using the numpy-stl library I create a box and then copy it as many times as i need for the image. then I translate each one to the position and height corresponding with the image. This all works. The problem comes when I try to save it all as one .stl file. I cant figure out how to combine all the individual meshes of the cubes into one.
Here is just the code dealing with the creation of the 3d array. I can plot the created meshes but not save them.
from stl import mesh
import math
import numpy
test = [[1,2],[2,1]]
a = [[1,2,3,4],
[5,6,7,8],
[9,10,11,12],
[13,14,15,16]]
# Create 6 faces of a cube, 2 triagles per face
data = numpy.zeros(12, dtype=mesh.Mesh.dtype)
#cube defined in stl format
# Top of the cube
data['vectors'][0] = numpy.array([[0, 1, 1],
[1, 0, 1],
[0, 0, 1]])
data['vectors'][1] = numpy.array([[1, 0, 1],
[0, 1, 1],
[1, 1, 1]])
# Right face
data['vectors'][2] = numpy.array([[1, 0, 0],
[1, 0, 1],
[1, 1, 0]])
data['vectors'][3] = numpy.array([[1, 1, 1],
[1, 0, 1],
[1, 1, 0]])
# Left face
data['vectors'][4] = numpy.array([[0, 0, 0],
[1, 0, 0],
[1, 0, 1]])
data['vectors'][5] = numpy.array([[0, 0, 0],
[0, 0, 1],
[1, 0, 1]])
# Bottem of the cube
data['vectors'][6] = numpy.array([[0, 1, 0],
[1, 0, 0],
[0, 0, 0]])
data['vectors'][7] = numpy.array([[1, 0, 0],
[0, 1, 0],
[1, 1, 0]])
# Right back
data['vectors'][8] = numpy.array([[0, 0, 0],
[0, 0, 1],
[0, 1, 0]])
data['vectors'][9] = numpy.array([[0, 1, 1],
[0, 0, 1],
[0, 1, 0]])
# Left back
data['vectors'][10] = numpy.array([[0, 1, 0],
[1, 1, 0],
[1, 1, 1]])
data['vectors'][11] = numpy.array([[0, 1, 0],
[0, 1, 1],
[1, 1, 1]])
# Generate 4 different meshes so we can rotate them later
meshes = [mesh.Mesh(data.copy()) for _ in range(16)]
#iterates through the array and translates cube in the x and y direction according
#to position in array and in the z direction according to eh value stored in the array
def ArrayToSTL(array, STLmesh):
y_count = 0
x_count = 0
count = 0
for row in array:
x_count = 0
for item in row:
meshes[count].x += x_count
meshes[count].y += y_count
meshes[count].z += item
x_count +=1
count += 1
y_count += 1
ArrayToSTL(a, meshes)
# Optionally render the rotated cube faces
from matplotlib import pyplot
from mpl_toolkits import mplot3d
# Create a new plot
figure = pyplot.figure()
axes = mplot3d.Axes3D(figure)
# Render the cube faces
for m in meshes:
axes.add_collection3d(mplot3d.art3d.Poly3DCollection(m.vectors))
# Auto scale to the mesh size
scale = numpy.concatenate([m.points for m in meshes]).flatten(-1)
axes.auto_scale_xyz(scale, scale, scale)
# Show the plot to the screen
pyplot.show()
This works well:
import numpy as np
import stl
from stl import mesh
import os
def combined_stl(meshes, save_path="./combined.stl"):
combined = mesh.Mesh(np.concatenate([m.data for m in meshes]))
combined.save(save_path, mode=stl.Mode.ASCII)
loading stored stl files and meshing them, use this.
direc = "path_of_directory"
paths = [os.path.join(direc, i) for i in os.listdir(direc)]
meshes = [mesh.Mesh.from_file(path) for path in paths]
combined_stl(meshes)