subtracting arrays in numpy and plotting with pylab - python

I have a simple text file with 8 columns, which I read with the loadtxt function. I want to plot column6 - column7 on the x-axis and column7 - column8 on the y-axis, so I ran this command:
>>> pl.plot(np.subtract(data2[:,6], data2[:7]), np.subtract(data2[:,7], data2[:,8]))
and it gave this error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (59427) (7,9)
What is the problem, and how can I do this?

data2[:7] should be data2[:,7]: you missed a comma.
data2[:7] selects the first 7 rows, so it has shape (7,9), while data2[:,6] is a single column with shape (59427,). The error message says that the two arrays cannot be broadcast to a common shape on which np.subtract can be applied. Plain arithmetic on the column slices works:
x = data2[:,6] - data2[:,7]
y = data2[:,7] - data2[:,8]
pl.plot(x, y)
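To see the difference between the two slices, here is a minimal sketch on a small stand-in array. The real data2 comes from loadtxt; the (20, 9) shape and the arange values below are assumptions made only for illustration.

```python
import numpy as np

# Stand-in for the loaded file: 20 rows, 9 columns (hypothetical shape).
data2 = np.arange(20 * 9, dtype=float).reshape(20, 9)

rows = data2[:7]     # no comma: the first 7 rows, all columns
col6 = data2[:, 6]   # with comma: column 6 of every row
col7 = data2[:, 7]

print(rows.shape)    # (7, 9)
print(col6.shape)    # (20,)

# Subtracting a (20,) array and a (7, 9) array cannot be broadcast.
try:
    np.subtract(col6, rows)
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# Two column slices have the same shape, so plain subtraction works.
x = col6 - col7
```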

Related

Bar graph for a python dictionary with key as labels and the values as the values to be plotted without adjusting the dimensions

I have a Python dictionary like this. The value lists have different lengths from key to key. The same values also repeat, both within a single list and across lists. For example, 0 appears more than once in some lists and occurs in almost all of them.
pd_plot = {'MinMaxScalerLogisticRegression': [0,0,1,6,150,200], 'StandardScalerHoeffdingTreeClassifier': [2,0,50,100], 'MaxAbsScalerKNNClassifier': [23,45,56,0,0], 'MinMaxScalerGaussianNB': [43,56,76,87,35], 'MinMaxScalerKNNClassifier': [50,2,78,135,74,0,1,5,0], 'StandardScalerKNNClassifier': [1,200], 'MinMaxScalerHoeffdingTreeClassifier': [34,35,76,87,90,98,43,32,32,4,5], 'StandardScalerLogisticRegression': [6,7,0,1,2,5], 'MaxAbsScalerHoeffdingTreeClassifier': [2]}
When I try to plot a bar graph using the code below:
import matplotlib.pylab as plt
myList = pd_plot.items()
myList = sorted(pd_plot)
x, y = zip(*myList)
plt.plot(x, y)
plt.show()
I get this error
ValueError Traceback (most recent call last)
<ipython-input-23-916d14b0fed0> in <module>
2 myList = pd_plot.items()
3 myList = sorted(pd_plot)
----> 4 x, y = zip(*myList)
5
6 plt.plot(x, y)
ValueError: too many values to unpack (expected 2)
When I tried another approach:
import matplotlib.pyplot as plt
names = list(pd_plot.keys())
values = list(pd_plot.values())
plt.bar(range(len(pd_plot)), values, tick_label=names)
plt.show()
I get this error
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-24-319c3c2e7d40> in <module>
4 values = list(pd_plot.values())
5
----> 6 plt.bar(range(len(pd_plot)), values, tick_label=names)
7 plt.show()
7 frames
/usr/local/lib/python3.7/dist-packages/matplotlib/transforms.py in from_extents(*args)
787 The *y*-axis increases upwards.
788 """
--> 789 points = np.array(args, dtype=float).reshape(2, 2)
790 return Bbox(points)
791
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part.
All the solutions for this error suggest making the dimensions equal by padding with zeros. But I cannot pad with zeros, since zeros already occur as real values in all the arrays.
Any suggestion for plotting a bar graph without changing the dimensions would be appreciated.
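Two observations that may help. In the first attempt, sorted(pd_plot) returns only the sorted keys, so zip(*myList) unpacks the characters of each key string rather than (key, values) pairs. In the second, plt.bar expects one scalar height per bar, not a list. One possible reading of the goal is one bar per value, grouped under its key, which needs no padding. The sketch below assumes that reading and uses three entries from the dictionary above; the one-slot gap between groups is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Three entries taken from the dictionary in the question.
pd_plot = {
    'StandardScalerHoeffdingTreeClassifier': [2, 0, 50, 100],
    'StandardScalerKNNClassifier': [1, 200],
    'MaxAbsScalerHoeffdingTreeClassifier': [2],
}

positions, heights = [], []      # one entry per individual value
tick_pos, tick_labels = [], []   # one entry per key

pos = 0
for name, values in pd_plot.items():
    start = pos
    for v in values:
        positions.append(pos)
        heights.append(v)
        pos += 1
    # Put the key's label under the middle of its group of bars.
    tick_pos.append((start + pos - 1) / 2)
    tick_labels.append(name)
    pos += 1  # leave an empty slot between groups

fig, ax = plt.subplots()
ax.bar(positions, heights)
ax.set_xticks(tick_pos)
ax.set_xticklabels(tick_labels, rotation=90)
fig.tight_layout()
```

If a single bar per key is wanted instead, each list would first have to be reduced to one number (a sum, a mean, a count), and which reduction is appropriate depends on what the values represent.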

Iterating a Numpy Array for random.normal generator shape conflict

I am trying to generate n (in this case n=57) random numbers from a normal distribution for each of a number of sampled means and standard deviations from a PyMC3 model (in this case 350). So I ultimately want to end up with 350 distributions of length 57 each. I'm sure this is something straightforward and I just lack the conceptual understanding. Input is:
prior_pc5 = pm.sample_prior_predictive(samples=350, model=model_5,
                                       var_names=['μ','σ'], random_seed=21)
n=57
prpc5_μ = np.asarray(prior_pc5['μ'])
prpc5_σ = np.asarray(prior_pc5['σ'])
for x,y in np.nditer([prpc5_μ,prpc5_σ]):
    y_prpc5 = np.random.normal(prpc5[:,0],prpc5[:,1], size=n)
Output is:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-180-90195f458d14> in <module>
1 for x,y in np.nditer([prpc5_μ,prpc5_σ]):
----> 2 y_prpc5 = np.random.normal(prpc5[:,0],prpc5[:,1], size=n)
mtrand.pyx in numpy.random.mtrand.RandomState.normal()
_common.pyx in numpy.random._common.cont()
_common.pyx in numpy.random._common.cont_broadcast_2()
__init__.pxd in numpy.PyArray_MultiIterNew3()
ValueError: shape mismatch: objects cannot be broadcast to a single shape
Appreciate any edification you can provide.
Your nditer loop does nothing for you. You don't even use the x,y variables. The prpc5 variable is undefined. And there's no attempt to accumulate the y_prpc5 values.
If you need to iterate on something, start with a plain iteration. Don't attempt to use nditer (unless you can read and understand all of its docs). It's not any faster, and it's harder to use correctly.
But the error has nothing to do with nditer.
np.random.normal(prpc5[:,0],prpc5[:,1], size=n)
doesn't use any iteration variables.
Size with scalar arguments:
In [63]: np.random.normal(1,2,size=57)
Out[63]:
array([-0.15095176, 0.68354153, 0.64270214, 1.71539129, 3.82930345,
-0.93888021, 0.34013012, 4.7390541 , 1.95647694, -0.02787572,
0.53790931, 3.64859278, -2.66455301, -1.81567149, 2.62141742,
-0.22887411, -0.36284743, 2.92298403, 1.87943503, 2.12060767,
-1.10172555, 0.04234386, 0.48707306, 5.66358341, 0.70659526,
-0.74210809, -2.04678512, -0.16496427, -0.46041457, 0.50505178,
1.66497518, 2.20486689, 1.83034991, -1.73740446, -3.117619 ,
1.12649528, 2.58059286, 1.42897399, 2.37256695, -2.34670202,
3.00318398, 2.78164509, -1.1329154 , 4.06859674, 3.13266299,
-0.35481326, 1.79429889, 1.71617491, 1.41543611, 0.9476942 ,
-0.79856396, -0.83121952, -2.63145461, 0.13941223, 0.18895024,
3.21956521, -2.75348353])
Array/list arguments with size:
In [64]: np.random.normal([1,2],[1,1],size=57)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-64-9a493d59b8d2> in <module>
----> 1 np.random.normal([1,2],[1,1],size=57)
mtrand.pyx in numpy.random.mtrand.RandomState.normal()
_common.pyx in numpy.random._common.cont()
_common.pyx in numpy.random._common.cont_broadcast_2()
__init__.pxd in numpy.PyArray_MultiIterNew3()
ValueError: shape mismatch: objects cannot be broadcast to a single shape
If size matches the shape of the first two arguments, it works:
In [65]: np.random.normal([1,2],[1,1],size=2)
Out[65]: array([1.91404732, 2.79305575])
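Putting this together for the 350 x 57 case, here is a minimal sketch of two ways to get one row of n draws per (mean, standard deviation) pair. The mu and sigma arrays are made-up stand-ins for the sampled prpc5_μ and prpc5_σ values, which are assumed here to be 1-D arrays of length 350.

```python
import numpy as np

np.random.seed(21)
n_draws, n = 350, 57

# Hypothetical stand-ins for the sampled means and standard deviations.
mu = np.random.normal(0.0, 1.0, size=n_draws)
sigma = np.abs(np.random.normal(1.0, 0.3, size=n_draws)) + 0.1  # keep positive

# Option 1: plain iteration, one scalar pair at a time.
samples_loop = np.array(
    [np.random.normal(m, s, size=n) for m, s in zip(mu, sigma)]
)

# Option 2: broadcasting. Reshape the parameters to (350, 1) so they
# broadcast against the requested output shape (350, 57).
samples_bcast = np.random.normal(mu[:, None], sigma[:, None], size=(n_draws, n))

print(samples_loop.shape, samples_bcast.shape)
```

Both results have one row per parameter pair; the broadcasting version simply lets NumPy do the looping.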

ValueError: not enough values to unpack (expected 3, got 2) with matplotlib

I am trying to plot a temperature gradient on a map of the Gulf of Mexico using matplotlib. I defined lat,long,and temp as x,y,z. Ideally I'd like to have points for each coordinate plotted on top of a map with a colored temperature gradient. Here is the code I used (I shortened the data sets for simplicity):
poslat=[29.50736,24.50824,28.99417,27.0074,27.00416,26.51342,26.00732,25.99168,26.49908]
poslon=[-86.50889,-84.49289,-87.00188,-89.99935,-91.00347,-90.99752,-91.50267,-92.00522]
postemp = [24.73,24.753,24.753,24.756,24.778,24.859,24.859,24.859,24.867,24.867,24.867,25.224]
x,y,z=m(poslon,poslat,postemp)
m.plot(x,y,'ro',markersize=5)
plt.contour(x,y,z)
plt.colorbar()
plt.show()
and here are the errors I am getting:
Warning (from warnings module):
File "/Users/sydneyharned/anaconda3/lib/python3.6/site-packages/mpl_toolkits/basemap/__init__.py", line 1711
if limb is not ax.axesPatch:
MatplotlibDeprecationWarning: The axesPatch function was deprecated in version 2.1. Use Axes.patch instead.
Traceback (most recent call last):
File "/Users/sydneyharned/praccompproject.py", line 31, in <module>
x,y,z=m(poslon,poslat,postemp)
ValueError: not enough values to unpack (expected 3, got 2)
I can't seem to figure out which values I'm missing. Does anyone know how to fix either of these? Any help is appreciated.
You should check what the function "m" returns.
As the traceback says, "m" returns only two values, while the assignment tries to unpack three.
(x,y,z) = m(poslon,poslat,postemp)
# should return (x,y) maybe
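A minimal sketch of the fix, assuming the map object only projects the two coordinate arrays and the temperatures are kept as a separate array. The project function below is a hypothetical stand-in for m so the example runs without Basemap. The shortened lists in the question have 9, 8, and 12 entries, so the sketch keeps only the first 8 of each; the three sequences need equal lengths to describe the same points.

```python
import numpy as np

def project(lon, lat):
    """Hypothetical stand-in for the map object `m`: takes two coordinate
    sequences and returns two arrays (here an identity 'projection')."""
    return np.asarray(lon, dtype=float), np.asarray(lat, dtype=float)

# First 8 entries of each list from the question.
poslat = [29.50736, 24.50824, 28.99417, 27.0074,
          27.00416, 26.51342, 26.00732, 25.99168]
poslon = [-86.50889, -84.49289, -87.00188, -89.99935,
          -91.00347, -90.99752, -91.50267, -92.00522]
postemp = [24.73, 24.753, 24.753, 24.756,
           24.778, 24.859, 24.859, 24.859]

# Unpack only the two values that come back; z is not projected.
x, y = project(poslon, poslat)
z = np.asarray(postemp)

print(x.shape, y.shape, z.shape)
```

Note also that plt.contour expects z on a 2-D grid. For scattered points like these, something along the lines of plt.scatter(x, y, c=z) followed by plt.colorbar() may be closer to the colored-points result described in the question.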

TypeError: only length-1 arrays can be converted to Python scalars in a dataset loading

I am sorry, but I am quite new to the community, so this question might be trivial.
Anyway, I have created a numpy matrix. Now I would like to evaluate the density points by using the mean shift algorithm.
Unfortunately I am currently facing the following error:
TypeError: only length-1 arrays can be converted to Python scalars
nygrid=np.zeros((2501,901), dtype=int)
for x in range(0,39):
    in_file = "C:\Users\User\Desktop\Master en BIGDATA\Trabajo Fin de master\Practica\Data Records\part-m-000" + '{:02d}'.format(x)
    for line in open(in_file):
        passen, forigen, corigen, fdest, cdest = line.split('\t')
        vPass=int(passen)
        vFOrigen=int(forigen)
        vCOrigen=int(corigen)
        vFDest=int(fdest)
        vCDest=int(cdest)
        nygrid[vFOrigen][vCOrigen]=nygrid[vFOrigen][vCOrigen]+vPass
        nygrid[vFDest][vCDest]=nygrid[vFDest][vCDest]+vPass
Now the matrix nygrid is loaded
from sklearn import datasets
import mean_shift as ms
model = ms.MeanShift(kernel_func=ms.gaussian_kernel, bandwidth=50,
                     seeds=500, n_jobs=-1)
Creation of variable columns and rows
columns=nygrid[:,:901]
rows=nygrid[:2501,:]
Now I have to create X and y. The idea would be to pass all the rows and all the columns of the matrix as n_samples and centers:
X, y = datasets.make_blobs(n_samples=rows, centers=columns,
                           cluster_std=np.random.normal(1, .3, n_clusters))
Now I get the following error, which tells me that I cannot pass the variables rows and columns as n_samples and centers.
X, y = datasets.make_blobs(n_samples=rows, centers=columns)
File "C:\Users\User\Anaconda2\lib\site-packages\sklearn\datasets\samples_generator.py", line 752, in make_blobs
n_samples_per_center = [int(n_samples // n_centers)] * n_centers
TypeError: only length-1 arrays can be converted to Python scalars
It might be that my logic for launching the mean shift is not right. But as I said, I am brand new in this area.
Thanks in advance for the help.
Andrea
I can reproduce your error with:
In [29]: int(np.array([1,2]))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-29-26c1a90e530a> in <module>()
----> 1 int(np.array([1,2]))
TypeError: only length-1 arrays can be converted to Python scalars
In
int(n_samples // n_centers)
one or both of n_samples and n_centers is an array. The // operator is floor (integer) division, and with array operands the result is an integer array. It's an error to try to convert that to one integer (which is what the Python int function does). And there's no need to attempt this conversion. Plus, astype(int) is the correct way to convert a float array to integer.
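The difference between int() and astype(int) can be sketched as follows. The small arrays are made up for illustration; the last part assumes that what the function wants for n_samples and n_centers are plain integer counts, such as the grid's dimensions, rather than slices of the grid.

```python
import numpy as np

# Floor division of two arrays gives an array, not a scalar.
arr = np.array([7.0, 9.0]) // np.array([2.0, 2.0])   # array([3., 4.])

# int() needs a single value, so a 2-element array raises TypeError.
try:
    int(arr)
    converted = True
except TypeError:
    converted = False

# Elementwise conversion of a float array to integers.
as_ints = arr.astype(int)                            # array([3, 4])

# With scalar counts, the expression from the traceback works as intended.
nygrid = np.zeros((2501, 901), dtype=int)
n_rows, n_cols = nygrid.shape                        # two plain Python ints
per_center = int(n_rows // n_cols)

print(converted, as_ints, per_center)
```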

(matplotlib) ValueError: too many values to unpack

I am getting the following error when trying to display data values instead of markers:
Complete Traceback:
Traceback (most recent call last):
File "plotpoints.py", line 45, in <module>
plt.annotate(grid_x,grid_y)
File "/usr/lib/pymodules/python2.7/matplotlib/pyplot.py", line 3405, in annotate
ret = gca().annotate(*args, **kwargs)
File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 3404, in annotate
a = mtext.Annotation(*args, **kwargs)
File "/usr/lib/pymodules/python2.7/matplotlib/text.py", line 1813, in __init__
annotation_clip=annotation_clip)
File "/usr/lib/pymodules/python2.7/matplotlib/text.py", line 1442, in __init__
x, y = self.xytext = xytext
ValueError: too many values to unpack
Code:
m = mapformat()
dx = 0.25
grid_x, grid_y = np.mgrid[-85:64:dx, 34:49:dx]
grid_z = griddata((data[:,1],data[:,0]), data[:,2], (grid_x,grid_y), method='linear')
x,y = m(data[:,1], data[:,0]) # flip lat/lon
grid_x,grid_y = m(grid_x,grid_y)
plt.annotate(grid_x,grid_y)
#m.plot(grid_x,grid_y, 'ko', markersize=2)
What am I doing wrong?
I don't think you are calling annotate correctly:
plt.annotate(grid_x,grid_y)
That looks like 2 arrays or lists of points (I haven't fully deduced how you define those 2 variables).
But the documentation is:
ax.annotate('local max', xy=(3, 1), ...)
The 1st argument is the text and the second a tuple with the coordinates.
I'm guessing that the calling sequence treats your grid_x argument as the text, and grid_y as its xytext:
x, y = self.xytext = xytext
If there are more than 2 values in grid_y, this unpacking will produce your error.
annotate adds text at a specific point on the plot; it can't be used to label the coordinates of a bunch of data points (at least not in one call).
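To label several points with their values, one option is to call annotate once per point in a loop. The sketch below assumes flat arrays of x, y, and value; the numbers are made up, and a 2-D grid such as grid_x, grid_y, grid_z from the question would need flattening first (for example with .ravel()).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical point coordinates and the data values to display.
xs = np.array([0.0, 1.0, 2.0])
ys = np.array([1.0, 3.0, 2.0])
zs = np.array([10.5, 11.2, 9.8])

fig, ax = plt.subplots()
ax.plot(xs, ys, "ko", markersize=2)

# One annotate call per point: the text first, then xy as an (x, y) tuple.
labels = []
for x, y, z in zip(xs, ys, zs):
    labels.append(ax.annotate("{:.1f}".format(z), xy=(x, y)))

print(len(labels))
```

For a dense grid this produces a lot of overlapping text, so labeling only every k-th point may be more readable.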
One of the functions that you're calling on the right is returning more values than there are variables to assign to on the left.
For example, if you do the following in a REPL:
a,b = [1,2,3]
You'll get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
It'll help to see which line the code is failing at - this way, you'll know which function is returning too many variables.
