x and y must have same first dimension numpy - python

I'm trying to plot a graph that converts dates into floats for use in a linear regression algorithm and then uses the original dates as strings for the x-axis labels. When I plot the actual values from the CSV file, the program runs fine, but when I plot the regression values I get this error:
raise ValueError(f"x and y must have same first dimension, but "
ValueError: x and y must have same first dimension, but have shapes (60,) and (1,)
Here is my code:
scaler = StandardScaler()
data = pd.read_csv('food.csv')
X = data['Date'].values
X = pd.to_datetime(X, errors="coerce")
X = X.values.astype("float64").reshape(-1,1)
Y = data['TOTAL'].values.reshape(-1,1)
mean_x = np.mean(X)
mean_y = np.mean(Y)
m = len(X)
numer = 0
denom = 0
for i in range(m):
    numer += (X[i] - mean_x) * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2
m = numer / denom
c = mean_y - (m * mean_x)
print (f'm = {m} \nc = {c}')
max_x = np.max(X) + 100
min_x = np.min(X) - 100
x = np.linspace (min_x, max_x, 100)
y = c + m * x
X= data['Date'].astype('str')
x= data['Date'].astype('str')
print(X.shape)
print(y.shape)
newY = Y.transpose()[0]
newy = y.transpose()[0]
plt.scatter(X, newY, c='#ef5423', label='data points')
plt.plot(x, newy, color='#58b970', label='Regression Line')
plt.show()

It's difficult to answer without the data, but judging from your code, x (and therefore y) has shape (100,), from the x = np.linspace(...) call.
So you probably don't want to pick out just the 0th element in the line
newy = y.transpose()[0]
because newy is then only a scalar value. What happens if you omit the [0] and just do this?
newy = y.transpose()
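Note that dropping the [0] alone may not be enough: in the question, x is later rebound to the 60 date strings, while y was computed from the 100 linspace points, so the two lengths could still differ. A minimal sketch, assuming synthetic data in place of food.csv (the names x_num, y_obs and y_fit are illustrative and not from the original post), that evaluates the regression line at the same x positions as the data:

```python
import numpy as np

# Stand-in for the 60 rows of food.csv: x_num plays the role of the dates
# converted to floats (here simply 0..59), y_obs the TOTAL column.
# Both are 1-D arrays of shape (60,).
x_num = np.arange(60, dtype="float64")
y_obs = 3.0 * x_num + 10.0 + np.sin(x_num)

# Same least-squares formulas as the loop in the question, vectorised.
mean_x, mean_y = x_num.mean(), y_obs.mean()
m = np.sum((x_num - mean_x) * (y_obs - mean_y)) / np.sum((x_num - mean_x) ** 2)
c = mean_y - m * mean_x

# Evaluate the line at the SAME x positions as the data, so the fitted
# values have exactly one entry per date label.
y_fit = c + m * x_num
print(x_num.shape, y_fit.shape)
```

With arrays built this way, the date strings and y_fit have the same first dimension, so passing them together to plt.plot should no longer raise the shape error.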

Related

Remove the intersection between two curves

I have a curve (a parabola) running from 0 to 1 on both axes, as follows:
I generate another curve by shifting the original curve along the x-axis and combine the two to get the following graph:
How can I remove the intersecting section so that only the double-bottom pattern remains, like this:
The code I use for the graph:
import numpy as np
import matplotlib.pyplot as plt
def get_parabol(start=-1, end=1, steps=100, normalized=True):
    x = np.linspace(start, end, steps)
    y = x**2
    if normalized:
        x = np.array(x)
        x = (x - x.min())/(x.max() - x.min())
        y = np.array(y)
        y = (y - y.min())/(y.max() - y.min())
    return x, y
def curve_after(x, y, x_ratio=1/3, y_ratio=1/2, normalized=False):
    x = x*x_ratio + x.max() - x[0]*x_ratio
    y = y*y_ratio + y.max() - y.max()*y_ratio
    if normalized:
        x = np.array(x)
        x = (x - x.min())/(x.max() - x.min())
        y = np.array(y)
        y = (y - y.min())/(y.max() - y.min())
    return x, y
def concat_arrays(*arr, axis=0, normalized=True):
    arr = np.concatenate([*arr], axis=axis).tolist()
    if normalized:
        arr = np.array(arr)
        arr = (arr - arr.min())/(arr.max() - arr.min())
    return arr
x, y = get_parabol()
new_x, new_y = curve_after(x, y, x_ratio=1, y_ratio=1, normalized=False)
new_x = np.add(x, 0.5)
# new_y = np.add(y, 0.2)
xx = concat_arrays(x, new_x, normalized=True)
yy = concat_arrays(y, new_y, normalized=True)
# plt.plot(x, y, '-')
plt.plot(xx, yy, '--')
I'm doing research on pattern analysis that requires me to generate patterns with mathematical functions.
Could you show me a way to achieve this? Thank you!
First off, I would define two separate parabolas:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-1, 1, 100)
y1 = np.add(x, 0.3)**2 # Parabola centered at -0.3
y2 = np.add(x, -0.3)**2 # Parabola centered at 0.3
You can choose your own offsets for y1 and y2 depending on your needs.
Then simply take the element-wise minimum of the two arrays:
y_final = np.minimum(y1, y2)
plt.plot(x, y_final, '--')
Another approach uses curve fitting. You need to find where the two curves intersect before you drop any values. Since x and y have been normalized, we first have to determine where the second dataset begins in the combined arrays: that is the index where x[i] > x[i+1]. Using your combined xx and yy, we can do the following:
data_intersect = np.where(np.r_[0, np.diff(xx)] < 0)[0][0]
x1 = xx[:data_intersect]
x2 = xx[data_intersect:]
y1 = yy[:data_intersect]
y2 = yy[data_intersect:]
difference = np.polyfit(x1, y1, 2) - np.polyfit(x2,y2,2)
meet = np.roots(difference) # all points where the two curves meet
meet = meet[(meet < max(x1)) & (meet >min(x1))] # only point curve meet
xxx = np.r_[x1[x1<meet], x2[x2>meet]]
yyy = np.r_[y1[x1<meet], y2[x2>meet]]
plt.plot(xxx, yyy, '--')

Blank areas while drawing Heighway Dragon in Python

So I've been doing some exercises from the matura exam, and one of them was to draw a Heighway dragon.
The program should use two pairs of equations:
x' = -0.4*x - 1
y' = -0.4*y + 0.1
and
x' = 0.76*x - 0.4*y
y' = 0.4*x + 0.76*y
At the start, x = 1 and y = 1. Then compute a new x and y using one of the two pairs above, chosen at random (50/50 chance), and mark the point (x, y) on the chart. This is repeated 5000 times.
So I tried it in Python, but when I finally drew the dragon, I could see on the chart that it was not one continuous drawing: it had blank areas, as in the image below. Is this still acceptable, or did I make a mistake? Is there any way to make it look like the correct one?
My chart
The correct one to compare
My code:
import matplotlib.pyplot as plt
import random
x = 1
y = 1
sumx = 0
sumy = 0
max_x = 0
max_y = 0
for i in range(5000):
    rand = random.randint(0, 1)
    if rand == 0:
        x = (-0.4 * x) - 1
        y = (-0.4 * y) + 0.1
    else:
        x = (0.76 * x) - (0.4 * y)
        y = (0.4 * x) + (0.76 * y)
    if i >= 100:
        sumx += x
        sumy += y
        plt.plot(x, y, c='black', marker='P', markersize=6)
    if x > max_x:
        max_x = x
    if y > max_y:
        max_y = y
plt.show()
avg_x = sumx / 5000
avg_y = sumy / 5000
print(round(avg_x, 1), round(avg_y, 1))
print('maximum x: ' + str(max_x) + ', maximum y: ' + str(max_y))
If the coordinates x' and y' are definitely calculated the way you have written them above, then your code is OK (although I'm not sure why you only start plotting after 100 iterations).
However, calling pyplot's plot function is computationally expensive, so I would suggest the following:
- calculate all your x and y values and store them in lists
- make one call to plt.scatter outside your for loop
This way the execution time of your code is drastically reduced. I've done that in the following code and also removed the condition that i >= 100. I also changed the way the random number is generated to see whether this had an effect, but it still produces very similar output to the original code (see image below).
import matplotlib.pyplot as plt
import random
import sys
x = 1
y = 1
sumx = 0
sumy = 0
max_x = 0
max_y = 0
x_values = []
y_values = []
for i in range(5000):
    rand = random.uniform(0,1)
    if rand <= 0.5:
        x = ((-0.4 * x) - 1)
        y = ((-0.4 * y) + 0.1)
        x_values.append(x)
        y_values.append(y)
    else:
        x = ((0.76 * x) - (0.4 * y))
        y = ((0.4 * x) + (0.76 * y))
        x_values.append(x)
        y_values.append(y)
    sumx += x
    sumy += y
    if x > max_x:
        max_x = x
    if y > max_y:
        max_y = y
plt.scatter(x_values, y_values, c='black', marker='P')
plt.ylim([-1, 0.4])
plt.xlim([-1.5, 0.5])
plt.show()
avg_x = sumx / 5000
avg_y = sumy / 5000
print(round(avg_x, 1), round(avg_y, 1))
print('maximum x: ' + str(max_x) + ', maximum y: ' + str(max_y))
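One possible contributor to the blank areas, offered as a hypothesis rather than a confirmed diagnosis: in both listings, the second pair of equations updates x first and then uses the already-updated x when computing y, whereas the formulas as stated read the old x and y on the right-hand side. A minimal sketch that applies each pair simultaneously via tuple assignment follows; the names step and dragon_points and the seed parameter are illustrative and do not come from the original posts.

```python
import random

def step(x, y, use_first):
    """Apply one of the two affine maps, always reading the OLD x and y."""
    if use_first:
        return -0.4 * x - 1, -0.4 * y + 0.1
    return 0.76 * x - 0.4 * y, 0.4 * x + 0.76 * y

def dragon_points(n=5000, seed=None):
    """Iterate the randomly chosen maps n times, starting from (1, 1)."""
    rng = random.Random(seed)
    x, y = 1.0, 1.0
    xs, ys = [], []
    for _ in range(n):
        # Tuple assignment evaluates both right-hand sides before rebinding,
        # so y is computed from the previous x, not the freshly updated one.
        x, y = step(x, y, rng.random() < 0.5)
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = dragon_points(5000, seed=0)
```

The two lists can then be passed to a single plt.scatter(xs, ys, s=1) call, in line with the suggestion above to plot once outside the loop.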

CORDIC algorithm returning bad numbers

I started implementing the CORDIC algorithm from scratch and I don't know what I'm missing. Here's what I have so far:
from __future__ import division
import math
# angles
n = 5
angles = []
for i in range(0, n):
    angles.append(math.atan(1/math.pow(2,i)))
# constants
kn = []
fator = 1.0
for i in range(0, n):
    fator = fator * (1 / math.pow(1 + (2**(-i))**2, (1/2)))
    kn.append(fator)
# taking an initial point p = (x,y) = (1,0)
z = math.pi/2 # Angle to be calculated
x = 1
y = 0
for i in range(0, n):
    if (z < 0):
        x = x + y*(2**(-1*i))
        y = y - x*(2**(-1*i))
        z = z + angles[i]
    else:
        x = x - y*(2**(-1*i))
        y = y + x*(2**(-1*i))
        z = z - angles[i]
x = x * kn[n-1]
y = y * kn[n-1]
print x, y
When I plug in z = π/2, it returns 0.00883479322917 and 0.107149125055, which makes no sense.
Any help would be great!
Edit: I made some changes, and my code now has these lines instead of the loop above:
for i in range(0, n):
    if (z < 0):
        x = x0 + y0*(2**(-1*i))
        y = y0 - x0*(2**(-1*i))
        z = z + angles[i]
    else:
        x = x0 - y0*(2**(-1*i))
        y = y0 + x0*(2**(-1*i))
        z = z - angles[i]
    x0 = x
    y0 = y
x = x * kn[n-1]
y = y * kn[n-1]
It's working much better now. The problem was that I wasn't using temporary variables such as x0 and y0. Now when I plug in z = pi/2, it gives me better numbers: (4.28270993661e-13, 1.0) :)
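The same fix can be written without explicit temporaries. Below is a minimal Python 3 sketch of rotation-mode CORDIC, assuming the input angle lies within the algorithm's convergence range (roughly |z| < 1.74 rad); the function name cordic and the default of 40 iterations are illustrative choices, not part of the original post.

```python
import math

def cordic(z, n=40):
    """Approximate (cos z, sin z) with n CORDIC micro-rotations."""
    # Elementary rotation angles atan(2^-i).
    angles = [math.atan(2.0 ** -i) for i in range(n)]
    # Accumulated scale factor that undoes the growth of each micro-rotation.
    gain = 1.0
    for i in range(n):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y = 1.0, 0.0
    for i in range(n):
        d = 1.0 if z >= 0 else -1.0
        # Tuple assignment uses the old x and y on the right-hand side,
        # which is what the temporaries x0 and y0 achieve in the edit above.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x * gain, y * gain

cos_z, sin_z = cordic(math.pi / 2)
print(cos_z, sin_z)
```

With only n = 5 iterations, as in the question, the residual angle stays fairly large, so more iterations are likely needed for results close to math.cos and math.sin.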

TypeError: can only concatenate list (not "int") to list 4

I'm required to take a Python module for my course and I get this error for my script. It's plotting the trajectory of a projectile and calculating a few other variables. I've typed the script exactly as in the booklet we are given.
Because I am an absolute beginner I can't understand other answers to this error. I would appreciate it an awful lot if someone could give me a quick fix, I don't have time at the moment to learn enough to fix it myself.
Code:
import matplotlib.pyplot as plt
import numpy as np
import math # need math module for trigonometric functions
g = 9.81 #gravitational constant
dt = 1e-3 #integration time step (delta t)
v0 = 40 # initial speed at t = 0
angle = math.pi/4 #math.pi = 3.14, launch angle in radians
time = np.arange(0,10,dt) #time axis
vx0 = math.cos(angle)*v0 # starting velocity along x axis
vy0 = math.sin(angle)*v0 # starting velocity along y axis
xa = vx0*time # compute x coordinates
ya = -0.5*g*time**2 + vy0*time # compute y coordinates
fig1 = plt.figure()
plt.plot(xa, ya) # plot y versus x
plt.xlabel ("x")
plt.ylabel ("y")
plt.ylim(0, 50)
plt.show()
def traj(angle, v0): # function for trajectory
    vx0 = math.cos(angle) * v0 # for some launch angle and starting velocity
    vy0 = math.sin(angle) * v0 # compute x and y component of starting velocity
    x = np.zeros(len(time)) #initialise x and y arrays
    y = np.zeros(len(time))
    x[0], y[0], 0 #projecitle starts at 0,0
    x[1], y[1] = x[0] + vx0 * dt, y[0] + vy0 * dt # second elements of x and
                                                  # y are determined by initial
                                                  # velocity
    i = 1
    while y[i] >= 0: # conditional loop continuous until
                     # projectile hits ground
        x[i+1] = (2 * x[i] - x[i - 1]) # numerical integration to find x[i + 1]
        y[i+1] = (2 * y[i] - y[i - 1]) - g * dt ** 2 # and y[i + 1]
        i = [i + 1] # increment i for next loop
    x = x[0:i+1] # truncate x and y arrays
    y = y[0:i+1]
    return x, y, (dt*i), x[i] # return x, y, flight time, range of projectile
x, y, duration, distance = traj(angle, v0)
print "Distance:" ,distance
print "Duration:" ,duration
n = 5
angles = np.linspace(0, math.pi/2, n)
maxrange = np.zeros(n)
for i in range(n):
    x,y, duration, maxrange [i] = traj(angles[i], v0)
angles = angles/2/math.pi*360 #convert rad to degress
print "Optimum angle:", angles[np.where(maxrange==np.max(maxrange))]
The error explicitly:
File "C:/Users/***** at *****", line 52, in traj
x = x[0:i+1] # truncate x and y arrays
TypeError: can only concatenate list (not "int") to list
As is pointed out in the comments, this is the offending line
i = [i + 1] # increment i for next loop
Here, i is not actually being incremented as the comment suggests. When i is 1, it's being set to [1 + 1], which evaluates to [2], the list containing only the number 2. Remove the brackets.
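For reference, here is a minimal Python 3 sketch of the function with the brackets removed. The constants G and DT mirror g and dt from the question; the t_max parameter and the extra bounds check in the loop are assumptions added so that the sketch stands alone.

```python
import math
import numpy as np

G = 9.81   # gravitational acceleration, as in the question
DT = 1e-3  # integration time step, as in the question

def traj(angle, v0, t_max=10.0):
    """Return x, y, flight time and range for a launch angle (rad) and speed."""
    n = int(t_max / DT)
    vx0 = math.cos(angle) * v0
    vy0 = math.sin(angle) * v0
    x = np.zeros(n)
    y = np.zeros(n)
    # The projectile starts at (0, 0); the second sample follows from the
    # initial velocity.
    x[1], y[1] = vx0 * DT, vy0 * DT
    i = 1
    # Step forward until the projectile drops below ground level
    # (or the arrays run out, which is an added safety check).
    while y[i] >= 0 and i < n - 1:
        x[i + 1] = 2 * x[i] - x[i - 1]
        y[i + 1] = 2 * y[i] - y[i - 1] - G * DT ** 2
        i += 1  # plain integer increment, no list brackets
    return x[:i + 1], y[:i + 1], DT * i, x[i]

x, y, duration, distance = traj(math.pi / 4, 40)
print("Distance:", distance)
print("Duration:", duration)
```

Because i stays an integer, the slices x[:i + 1] and y[:i + 1] work as intended, and the reported range can be compared with the analytical value v0**2 / g for a 45 degree launch.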

Python: heat density plot in a disk

My goal is to make a density heat map plot of a sphere in 2D. The plotting code further below works when I use rectangular domains; however, I am trying to use it for a circular domain. The radius of the sphere is 1. The code I have so far is:
from pylab import *
import numpy as np
from matplotlib.colors import LightSource
from numpy.polynomial.legendre import leggauss, legval
xi = 0.0
xf = 1.0
numx = 500
yi = 0.0
yf = 1.0
numy = 500
def f(x):
    if 0 <= x <= 1:
        return 100
    if -1 <= x <= 0:
        return 0
deg = 1000
xx, w = leggauss(deg)
L = np.polynomial.legendre.legval(xx, np.identity(deg))
integral = (L * (f(x) * w)[None,:]).sum(axis = 1)
c = (np.arange(1, 500) + 0.5) * integral[1:500]
def r(x, y):
    return np.sqrt(x ** 2 + y ** 2)
theta = np.arctan2(y, x)
x, y = np.linspace(0, 1, 500000)
def T(x, y):
    return (sum(r(x, y) ** l * c[:,None] *
                np.polynomial.legendre.legval(xx, identity(deg)) for l in range(1, 500)))
T(x, y) should equal the sum over l of the coefficient c_l, times the radius (as a function of x and y) raised to the power l, times the Legendre polynomial of degree l evaluated at cos(theta).
In the question "python: integrating a piecewise function", I learned how to use the Legendre polynomials in a summation, but that method is slightly different, and for the plotting I need a function T(x, y).
This is the plotting code:
densityinterpolation = 'bilinear'
densitycolormap = cm.jet
densityshadedflag = False
densitybarflag = True
gridflag = True
plotfilename = 'laplacesphere.eps'
x = arange(xi, xf, (xf - xi) / (numx - 1))
y = arange(yi, yf, (yf - yi) / (numy - 1))
X, Y = meshgrid(x, y)
z = T(X, Y)
if densityshadedflag:
    ls = LightSource(azdeg = 120, altdeg = 65)
    rgb = ls.shade(z, densitycolormap)
    im = imshow(rgb, extent = [xi, xf, yi, yf], cmap = densitycolormap)
else:
    im = imshow(z, extent = [xi, xf, yi, yf], cmap = densitycolormap)
    im.set_interpolation(densityinterpolation)
if densitybarflag:
    colorbar(im)
grid(gridflag)
show()
I made the plot in Mathematica to show what my end goal is.
If you set the values outside the disk domain (or whichever domain you want) to float('nan'), those points will be ignored when plotting and left blank (white).
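A minimal sketch of that masking step, assuming a unit disk centred at the origin and a placeholder field in place of T(X, Y); the names Z and Z_masked and the grid size of 201 points are illustrative only.

```python
import numpy as np

# Square grid that encloses the unit disk.
x = np.linspace(-1.0, 1.0, 201)
y = np.linspace(-1.0, 1.0, 201)
X, Y = np.meshgrid(x, y)

# Placeholder field standing in for T(X, Y) from the question.
Z = X ** 2 + Y ** 2

# Keep values inside the disk of radius 1; set everything outside to NaN.
inside = X ** 2 + Y ** 2 <= 1.0
Z_masked = np.where(inside, Z, np.nan)
```

Passing Z_masked to imshow with extent=[-1, 1, -1, 1] should then leave the corners of the square blank, as described above.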
