plot to show large data points on x axis using python - python

How can I show the variance of these data points over time? I used this plot to show them, but the time axis runs from 0 to 20,000 seconds, so it is difficult to see all the points well enough to judge the variance or invariance. The problem is that the points overlap each other.
(Figure: the same plot after zooming in.)

I finally solved this problem by subtracting each subject's minimum time from all of that subject's timestamps. Now every subject's times start from 0, and the variance between subjects is easy to see.
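A minimal sketch of that per-subject shift, assuming the timestamps are held in a dict keyed by subject. The function name, subject names and values are made up for illustration and are not from the original data:

```python
import numpy as np

def shift_to_zero(times_by_subject):
    """Return a new dict in which each subject's timestamps start at 0."""
    return {
        subject: np.asarray(times, dtype=float) - np.min(times)
        for subject, times in times_by_subject.items()
    }

# Hypothetical timestamps (in seconds) for two subjects
raw = {
    "subject_a": [12000.0, 12003.5, 12010.0],
    "subject_b": [19500.0, 19501.0, 19507.5],
}

shifted = shift_to_zero(raw)
for subject, times in shifted.items():
    print(subject, times)
```

After the shift, all subjects share the same origin, so they can be drawn on one short time axis instead of being spread over 20,000 seconds.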

Normalize your axis to a maximum of 1 by dividing by the maximum value. Afterwards you can rescale the axis by any factor X.

Related

Fourier Transform Time Series in Python

I've got a time series of sunspot numbers, where the mean number of sunspots is counted per month, and I'm trying to use a Fourier Transform to convert from the time domain to the frequency domain. The data used is from https://wwwbis.sidc.be/silso/infosnmtot.
The first thing I'm confused about is how to express a sampling frequency of once per month. Do I need to convert it to seconds, e.g. 1/(seconds in 30 days)? Here's what I've got so far:
fs = 1/2592000
#the sampling frequency is 1/(seconds in a month)
fourier = np.fft.fft(sn_value)
#sn_value is the mean number of sunspots measured each month
freqs = np.fft.fftfreq(sn_value.size,d=fs)
power_spectrum = np.abs(fourier)
plt.plot(freqs,power_spectrum)
plt.xlim(0,max(freqs))
plt.title("Power Spectral Density of the Sunspot Number Time Series")
plt.grid(True)
I don't think this is correct, mainly because I don't know what the scale of the x-axis is. However, I do know that there should be a peak at (11 years)^-1.
The second thing I'm wondering about is why there seem to be two lines in this graph, one of them a horizontal line just above y=0. It is clearer when I change the x-axis bounds with plt.xlim(0,1).
Am I using the Fourier transform functions incorrectly?
You can use any units you want. Feel free to express your sampling frequency as fs=12 (samples/year); the x-axis will then be in units of 1/year. Or use fs=1 (sample/month); the units will then be 1/month.
The extra line you spotted comes from the way you plot your data. Look at the output of the np.fft.fftfreq call: the first half of that array contains positive values from 0 to about 1.2e6, and the second half contains negative values from about -1.2e6 to almost 0. By plotting all of it, you get a data line from 0 to the right, then a straight line from the rightmost point to the leftmost point, then the rest of the data line back to zero. Your xlim call hides the negative half of the plot.
Typically you'd plot only the first half of your data: just crop the freqs and power_spectrum arrays.
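As a rough sketch of both points, here is a version that works in units of years and keeps only the non-negative frequencies. It uses a synthetic 11-year sine wave as a stand-in for sn_value (an assumption for illustration, not the real SILSO data). Note that the d argument of np.fft.fftfreq and np.fft.rfftfreq is the sample spacing, i.e. 1/fs, not the sampling frequency itself:

```python
import numpy as np

fs = 12.0                    # assumed sampling rate: 12 samples per year
n_months = 12 * 110          # 110 years of synthetic monthly values
t_years = np.arange(n_months) / fs

# Synthetic stand-in for sn_value: an 11-year cycle on top of a constant level
sn_value = 80.0 + 60.0 * np.sin(2 * np.pi * t_years / 11.0)

# Remove the mean so the zero-frequency bin does not dominate the spectrum
detrended = sn_value - sn_value.mean()

# rfft/rfftfreq return only the non-negative frequencies of a real signal,
# which avoids the line drawn back across the plot
amplitude = np.abs(np.fft.rfft(detrended))
freqs = np.fft.rfftfreq(n_months, d=1.0 / fs)   # d = sample spacing in years

peak_freq = freqs[np.argmax(amplitude)]
print(f"peak at {peak_freq:.4f} cycles/year, period {1.0 / peak_freq:.1f} years")
```

Plotting freqs against amplitude then gives a single curve with the x-axis in cycles per year. np.abs of the transform is an amplitude spectrum; square it if you want power.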

dealing with line equations in not proportional scales

I am calculating trendlines for stock market, and want to know the angle between 2 lines.
The X-axis is epoch timestamp (in ms) and Y-axis is the price.
The problem is that the epoch timestamps are very large numbers (let's say 1,591,205,309,000 ms) while the price per share can vary from $0.078 to $10,000, so the scales are not proportional.
I am also a trader, and when I trade I see charts like the one in the picture below:
The plotting is probably scaling the axes to fit in some way (compressing the X axis and stretching the Y axis).
This scaling is also generic: whether I am looking at a 5-minute chart or a 1-day chart, when I draw lines (in the same timeframe) I see them at comfortable angles.
If you take those lines and plot them on a raw ts/price graph, you will probably see what look like 2 parallel lines.
I also must keep the line equation in ts, because I need to forecast where the trade will be in the future (I give it the ts, and it returns the price at that time).
Right now, when calculating this angle I get around 0.0003 degrees. I want to get the angles of the lines as they appear in the chart above.
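One possible approach, sketched under the assumption that the "chart angle" is simply the angle of the line after the visible time window and price window have been mapped onto the canvas. The function name, its parameters and the example numbers are hypothetical. The line equation itself stays in ms timestamps; only the angle is computed in canvas units:

```python
import math

def chart_angle_deg(slope, x_span, y_span, width=1.0, height=1.0):
    """Angle (degrees) of a line as drawn on a chart.

    slope  : price change per ms (from the line equation in ts)
    x_span : visible time window in ms
    y_span : visible price window
    width, height : canvas size in any common unit (assumed square by default)
    """
    # Convert price-per-ms into canvas-height per canvas-width
    visual_slope = slope * (x_span / width) / (y_span / height)
    return math.degrees(math.atan(visual_slope))

# Hypothetical 5-minute window (300,000 ms) showing a price range of 2.0
x_span, y_span = 300_000.0, 2.0
rising = 2.0 / 300_000.0     # climbs the full price window across the time window
flat = 0.0

raw_angle = math.degrees(math.atan(rising))            # tiny, like in the question
angle_rising = chart_angle_deg(rising, x_span, y_span)
angle_between = angle_rising - chart_angle_deg(flat, x_span, y_span)
print(raw_angle, angle_rising, angle_between)
```

The result depends on the chosen window and canvas aspect ratio, which is exactly why the same pair of lines looks different on different charts.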

Draw a smoother line and figure out the sharp changes(slope) point from a time series data where value will be in between 0 to 100

I have scattered time series data where x is a date-time and y is a value ranging from 0 to 100. I want to draw a smoother line that is the best fit for the given data. I also want to find the points where a substantial change occurs.
You can find the sample data set here.
Note that there might be some null values of y for a given x. I would appreciate any example using Python.
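A minimal sketch of one way to do this, not a definitive method: a centered moving average that skips null (NaN) values, followed by a threshold on the step between consecutive smoothed points. It assumes the samples are sorted by time and roughly evenly spaced. The function name, window size, threshold and sample values are all assumptions for illustration:

```python
import numpy as np

def smooth_and_find_jumps(y, window=5, jump_threshold=10.0):
    """Return (smoothed values, indices where the smoothed line jumps)."""
    y = np.asarray(y, dtype=float)
    valid = ~np.isnan(y)
    kernel = np.ones(window)

    # Sum and count of the valid samples inside each centered window
    sums = np.convolve(np.where(valid, y, 0.0), kernel, mode="same")
    counts = np.convolve(valid.astype(float), kernel, mode="same")

    # Mean of the valid samples; stays NaN only if a window has no valid data
    smooth = np.divide(sums, counts, out=np.full_like(y, np.nan), where=counts > 0)

    # Index i is flagged when smooth[i] differs from smooth[i-1] by more than the threshold
    jumps = np.flatnonzero(np.abs(np.diff(smooth)) > jump_threshold) + 1
    return smooth, jumps

# Hypothetical values between 0 and 100, with two missing samples
y = [10, 11, np.nan, 10, 12, 60, 61, 59, np.nan, 60]
smooth, jumps = smooth_and_find_jumps(y, window=3, jump_threshold=10.0)
print(smooth)
print(jumps)
```

A larger window gives a smoother line but spreads each sharp change over more points, so the window and threshold need tuning against the real data.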

What is xscale and yscale?

I am reading some code, and I noticed a big change when pyplot.xscale('log') and pyplot.yscale('log') are used. The basic scatter plot looks like this:
and after adding:
plt.xscale('log')
plt.yscale('log')
the graph looks like this:
I looked at the documentation, but there was not enough explanation in it.
I am completely new to graphs and matplotlib. Can you please help me out and explain what xscale and yscale are, and what their respective parameters such as log, linear, symlog and logit do?
Thank you for your help.
You are setting both your x-axis and your y-axis to a logarithmic scale.
On a normal (linear) axis, equal distances along the axis correspond to equal increments in value, for example:
0 at 0cm
1 at 1cm
2 at 2cm
...
1000 at 10m
On a logarithmic axis, equal distances correspond to equal multiplication factors. Example for powers of 10:
1 at 0cm
10 at 1cm
100 at 2cm
1000 at 3cm etc.
(A logarithmic axis has no position for 0, because log(0) is undefined.)
It is a way to display widely spread data in a more compact format.
See "Logarithmic scale" on Wikipedia.
Your data has a cluster of values and an outlier. Plotted on a logarithmic scale, the cluster is spread out over a larger part of the axis, while the large gap between the cluster and the outlier takes up less screen area.
Other examples and references for log plots:
https://plot.ly/python/log-plot/
matplotlib.pyplot.xscale (matplotlib documentation)
matplotlib.pyplot.yscale (matplotlib documentation)
matplotlib.scale.LogScale (matplotlib documentation)
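To make the scale names a bit more concrete, here is a small sketch of where a value lands along the axis under three of them. This is an illustration of the idea only, not matplotlib's own implementation, and axis_position is a made-up helper name. symlog is left out of the code because it takes extra parameters: it is linear in a small range around zero and logarithmic beyond it, so unlike log it can show zero and negative values.

```python
import math

def axis_position(value, scale):
    """Relative position of a value along an axis (arbitrary units)."""
    if scale == "linear":
        return value                       # equal steps for equal differences
    if scale == "log":
        return math.log10(value)           # equal steps for equal factors; needs value > 0
    if scale == "logit":
        # log of the odds; meant for probabilities, needs 0 < value < 1
        return math.log10(value / (1.0 - value))
    raise ValueError(f"unknown scale: {scale}")

for v in [1, 10, 100, 1000]:
    print(v, axis_position(v, "linear"), axis_position(v, "log"))

for p in [0.01, 0.5, 0.99]:
    print(p, axis_position(p, "logit"))
```

On the log scale the four values are evenly spaced, while on the linear scale the first three are crowded near the origin. The logit scale is symmetric around 0.5 and stretches the regions close to 0 and 1.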

Normalizing time series measurements

I have read the following sentence:
Figure 3 depicts how the pressure develops during a touch event. It
shows the mean over all button touches from all users. To account for
the different hold times of the touch events, the time axis has been
normalized before averaging the pressure values.
They measured the touch pressure over touch events and made a plot. I think normalizing the time axis means scaling the time axis to 1 s, for example. But how can this be done? Let's say I have a measurement which spans 3.34 seconds (1000 timestamps and 1000 measurements). How can I normalize this measurement?
If you want to normalize your data, you can do as you suggest and simply calculate:
z_i = \frac{x_i - \min(x)}{\max(x) - \min(x)}
where z_i is your i-th normalized time value and x_i is the corresponding absolute time.
An example using numpy:
import numpy as np

x = np.random.rand(10)  # generate 10 random values
# min-max normalization: the smallest value maps to 0, the largest to 1
normalized = (x - x.min()) / (x.max() - x.min())
print(x, normalized)
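To average several touch events as the quoted paper describes, the normalized curves also need to share the same time points. The following is a sketch of one way that could be done, assuming each event has sorted timestamps, using np.interp to resample onto a common 0..1 grid. The function name, the grid size and the synthetic pressure curves are assumptions for illustration:

```python
import numpy as np

def normalize_and_resample(timestamps, values, n_points=100):
    """Map timestamps to 0..1 and resample values onto a common grid."""
    t = np.asarray(timestamps, dtype=float)
    v = np.asarray(values, dtype=float)
    t_norm = (t - t.min()) / (t.max() - t.min())   # min-max normalization of time
    grid = np.linspace(0.0, 1.0, n_points)
    return grid, np.interp(grid, t_norm, v)        # linear interpolation

# Two hypothetical touch events with different hold times and sample counts
t_a = np.linspace(0.0, 3.34, 1000)
t_b = np.linspace(5.0, 6.2, 400)
p_a = np.sin(np.pi * (t_a - t_a[0]) / (t_a[-1] - t_a[0]))   # made-up pressure curve
p_b = np.sin(np.pi * (t_b - t_b[0]) / (t_b[-1] - t_b[0]))

grid, r_a = normalize_and_resample(t_a, p_a)
_, r_b = normalize_and_resample(t_b, p_b)

# Both curves now live on the same grid, so they can be averaged point by point
mean_pressure = np.mean([r_a, r_b], axis=0)
print(mean_pressure[:5])
```

The normalized time axis then runs from 0 (touch down) to 1 (release) for every event, regardless of how long the button was held.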
