Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 6 years ago.
Hello dear Python experts:)
From a simulation I got data (the course of energy over time) which I have to fit. When I plot the energy, it follows a non-periodic oscillating course. There are a bunch of helper functions such as curve_fit from scipy, but you always have to specify the function to fit with, and I don't know a suitable function a priori.
I need something like a Fourier fit (as is possible in MATLAB) to get a function representing the data, so that I can later use this function to determine its maxima. Does anyone have an idea how to deal with such a problem?
Here is an example course (image not included).
If you like, you can have a look at the data in a .csv-file: https://1drv.ms/u/s!AuQAmr8-QRJSdzNTzyvWPhUaEnw
I would be very delighted to get some help:-)
Many thanks:-)
Using the Fourier fit in MATLAB you also specify a model (how many sin/cos you want).
For instance "Fourier 2" is:
f(x) = a0 + a1*cos(x*w) + b1*sin(x*w) +
a2*cos(2*x*w) + b2*sin(2*x*w)
Check http://exnumerus.blogspot.nl/2010/04/how-to-fit-sine-wave-example-in-python.html to see how to fit for "Fourier1".
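For "Fourier 2" the same idea can be sketched with scipy's curve_fit. This is only a rough example under assumptions: the data below is synthetic (I don't have your .csv loaded here), the names fourier2, true_params and w0 are made up for illustration, and the starting value for w is taken from the strongest FFT bin, which may not be good enough for strongly non-periodic data.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.signal import find_peaks


def fourier2(x, a0, a1, b1, a2, b2, w):
    """MATLAB-style 'Fourier 2' model: a constant plus two harmonics of w."""
    return (a0 + a1 * np.cos(x * w) + b1 * np.sin(x * w)
            + a2 * np.cos(2 * x * w) + b2 * np.sin(2 * x * w))


# Synthetic stand-in for the simulation data (all values are made up).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 400, endpoint=False)
true_params = (1.0, 0.5, -0.3, 0.2, 0.1, 1.9)
y = fourier2(x, *true_params) + 0.01 * rng.normal(size=x.size)

# The fit is sensitive to the starting value of w, so estimate it
# from the strongest non-constant FFT component.
spectrum = np.abs(np.fft.rfft(y - y.mean()))
freqs = np.fft.rfftfreq(x.size, d=x[1] - x[0])
w0 = 2 * np.pi * freqs[np.argmax(spectrum)]
p0 = (y.mean(), 0.1, 0.1, 0.1, 0.1, w0)

popt, pcov = curve_fit(fourier2, x, y, p0=p0)

# Locate the maxima of the fitted function on a dense grid.
x_dense = np.linspace(x.min(), x.max(), 5000)
peaks, _ = find_peaks(fourier2(x_dense, *popt))
maxima_x = x_dense[peaks]
```

With real data you would replace x and y by the columns of your file and check the fit visually before trusting the maxima.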
If you really want no model you need to use something like "eureqa", which is free for academic use (http://www.nutonian.com/download/eureqa-desktop-download/).
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
I'm doing a project where I have data from 100 sensors, each recorded over its cycles until it breaks. The data shows many characteristics up to the failure, and then continues with the replacement sensor. With this data, I have to build a model that predicts how long a sensor will work until it fails, using only a small part of the data, not the full cycle. I have no idea which machine learning model is suitable for this.
The type of problem you are describing is known as survival analysis. A wide range of both statistical and machine learning methods is available to help you solve this type of problem.
What is great about these methods is that they also allow you to use data points where the event you are interested in has not occurred yet. In your example, this means you can possibly extend your dataset by including data from sensors which have not failed yet.
When you look at the methods, I suggest you also spend some time examining how to evaluate these types of models, since the evaluation methods are also slightly different than in typical machine learning problems.
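To make the point about not-yet-failed sensors concrete, here is a minimal sketch of the Kaplan-Meier estimator, one of the basic survival analysis tools, written directly in NumPy. The cycle counts and failure flags are invented for illustration, and kaplan_meier is a hypothetical helper, not a library function.

```python
import numpy as np


def kaplan_meier(durations, observed):
    """Return event times and the estimated survival probability after each.

    durations: cycles each sensor has run so far.
    observed:  1 if the sensor failed at that point, 0 if it is still running
               (a censored observation).
    """
    durations = np.asarray(durations, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    event_times = np.unique(durations[observed])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(durations >= t)              # sensors still running just before t
        failures = np.sum((durations == t) & observed)
        s *= 1.0 - failures / at_risk
        survival.append(s)
    return event_times, np.array(survival)


# Made-up data: two sensors (8 and 15 cycles) have not failed yet.
cycles = [5, 8, 8, 12, 15, 20]
failed = [1, 1, 0, 1, 0, 1]
times, surv = kaplan_meier(cycles, failed)
```

The censored sensors never count as failures, but they do stay in the at-risk group until their last observed cycle, which is how they still contribute information.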
A comprehensive range of techniques is available at: http://dmkd.cs.vt.edu/TUTORIAL/Survival/Slides.pdf
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
all.
I have a question on how to add missing values to a dataset object.
I'm currently working on crop growth modeling, and use the NASA Power API as my weather dataset.
However, the NASA Power dataset has missing days.
I used the pcse library to extract the NASA Power dataset.
My question is: how can I add the data for the missing days?
I tried
wdp(date) = wdp(date-timedelta(days=1))
but it gives me back 'can't assign to function call'
Anyhow, it seems that the data for the missing date does not exist in the object and I am not allowed to create it.
You have the right idea, but the wrong syntax. In Python, list and dict access uses square brackets ([]), see the docs.
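As a small illustration of the syntax point only, here is a sketch that uses a plain dict as a stand-in for the weather data (this is not the pcse object; the dates and values are made up). It copies the previous day's value into each missing day using square-bracket assignment.

```python
from datetime import date, timedelta

# Hypothetical stand-in for weather data: 3 January is missing.
weather = {
    date(2020, 1, 1): 10.0,
    date(2020, 1, 2): 11.5,
    date(2020, 1, 4): 9.0,
}

# weather(day) = ...  would raise "can't assign to function call";
# assignment to a dict entry needs square brackets instead.
day = date(2020, 1, 1)
end = date(2020, 1, 4)
while day <= end:
    if day not in weather:
        # Forward-fill: reuse the value of the previous day.
        weather[day] = weather[day - timedelta(days=1)]
    day += timedelta(days=1)
```

The loop assumes the first day is present, otherwise there is no previous value to copy.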
To add to that, pcse’s WeatherDataProvider object does not support this style of access. Checking out the code in this link, it appears there is a method you can call named _store_WeatherDataContainer. The leading _ indicates it is not intended for public use, but that doesn’t mean you can’t :-)
It should look like this:
wdp._store_WeatherDataContainer(wdp(date-timedelta(days=1)), date)
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I have 20 parameters that can each take a binary value. They are passed to a function that returns a score, like this:
score = fmin(para1, para2, para3, ..., para20)
Which algorithm would be best for optimizing this scenario?
I read about genetic algorithms, where chromosomes undergo mutation and crossover to select the best combination out of 2^20 search points.
I also read about hyperopt, which optimises the function in a smaller number of trials.
Which one is better? Are there any pros or cons of using these algorithms?
It really depends on the properties you expect your function to have. If you have reason to believe that similar parameter sets have similar scores, then you can try simulated annealing or genetic algorithms.
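For the "similar parameters, similar scores" case, a bare-bones simulated annealing loop over bit vectors could look roughly like this. It is only a sketch: the score function is a hypothetical stand-in (number of mismatches against a hidden pattern), and the step count and temperature schedule are arbitrary choices, not tuned values.

```python
import math
import random

N_BITS = 20
TARGET = [1, 0] * (N_BITS // 2)  # hidden pattern, for the toy score only


def score(bits):
    """Hypothetical smooth score to minimize: mismatches against TARGET."""
    return sum(b != t for b, t in zip(bits, TARGET))


def simulated_annealing(score, n_bits, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n_bits)]
    current_score = score(current)
    best, best_score = current[:], current_score
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temperature = t_start * (t_end / t_start) ** (step / steps)
        candidate = current[:]
        candidate[rng.randrange(n_bits)] ^= 1  # flip one random bit
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current[:], current_score
    return best, best_score


best, best_score = simulated_annealing(score, N_BITS)
```

Flipping a single bit is what encodes the assumption that neighbouring parameter sets have similar scores.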
However, if you don't have reason to expect similar parameters will generate similar scores, those methods won't help: you would do just as well picking parameter sets at random. But (as mentioned in the comments), 2^20 isn't much more than a million trials: if your function isn't too expensive, you could just try them all.
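Trying them all takes only a few lines with itertools.product. The score function here is a made-up placeholder for your real one; with a cheap function like this the full 2^20 sweep should finish in a few seconds, but that obviously depends on how expensive your function is.

```python
from itertools import product

N_PARAMS = 20
TARGET = (1, 0) * (N_PARAMS // 2)  # hidden pattern, for the toy score only


def score(bits):
    """Hypothetical stand-in for the real function: mismatches against TARGET."""
    return sum(b != t for b, t in zip(bits, TARGET))


# Enumerate all 2**20 = 1,048,576 binary combinations and keep the lowest score.
best = min(product((0, 1), repeat=N_PARAMS), key=score)
best_score = score(best)
```

Since product yields the combinations lazily, memory use stays small even though over a million candidates are evaluated.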
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I am trying to analyse a time series(blue one) that looks like this
As you can see, it's not seasonal. I tried plotting the log of this series, and it looks like that is not seasonal either.
I wonder what can be done to forecast the future.
log of ts
Your question amounts to "Can I predict variable Y from variable X (X being time), assuming Y and X are independent?" The short answer is no, you can't.
Now, your assertion that your data is not cyclical seems like jumping to conclusions, in my opinion. You might have complex cycles, or hidden dependent variables that explain part of the variance and leave you with more clearly cyclical residuals.
You could maybe try using a periodogram (there are many Python packages for it) to find important parameters, for example the frequency of your signal.
For regularly sampled signals, try:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.periodogram.html
For irregularly sampled signals, on the other hand, I'd suggest:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.lombscargle.html
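A quick sketch of both functions on synthetic data, assuming a 5 Hz sine as the "hidden" cycle (the signal, sampling rate and frequency grid are all made up for the example). Note that lombscargle expects angular frequencies, so the candidate frequencies in Hz are multiplied by 2*pi.

```python
import numpy as np
from scipy.signal import periodogram, lombscargle

rng = np.random.default_rng(0)
true_freq = 5.0  # Hz, hidden cycle in the synthetic signal

# Regularly sampled: 10 seconds at 100 Hz, with noise.
fs = 100.0
t_regular = np.arange(1000) / fs
y_regular = (np.sin(2 * np.pi * true_freq * t_regular)
             + 0.5 * rng.normal(size=t_regular.size))
f, power = periodogram(y_regular, fs=fs)
peak_regular = f[np.argmax(power)]

# Irregularly sampled: 300 random time stamps in the same 10 seconds.
t_irregular = np.sort(rng.uniform(0.0, 10.0, size=300))
y_irregular = np.sin(2 * np.pi * true_freq * t_irregular)
y_irregular = y_irregular - y_irregular.mean()  # remove the mean before the transform
candidate_hz = np.linspace(0.1, 10.0, 1000)
pgram = lombscargle(t_irregular, y_irregular, 2 * np.pi * candidate_hz)
peak_irregular = candidate_hz[np.argmax(pgram)]
```

In both cases the location of the largest peak is the estimate of the dominant frequency; on real data you would look at the whole spectrum rather than just the maximum.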
Hope this helps!
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
This is most likely a dumb question but being a beginner in Python/Numpy I will ask it anyways. I have come across a lot of posts on how to Normalize an array/matrix in numpy. But I am not sure about the WHY. Why/When does an array/matrix need to be normalized in numpy? When is it used?
Normalize can have multiple meanings in different contexts. My question belongs to the field of Data Analytics/Data Science. What does Normalization mean in this context? Or more specifically, in what situation should I normalize an array?
The second part to this question is - What are the different methods of Normalization and can they be used interchangeably in all situations?
The third and final part - can Normalization be used for Arrays of any dimensions?
Links to any reference material (for beginners) will be appreciated.
Consider trying to cluster objects with two numerical attributes A and B. Both are equally important. Attribute A can range from 0 to 1000 and attribute B can range from 0 to 5.
If you did not normalize A and B you would end up with attribute A completely overpowering attribute B when applying any standard distance metric.
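A tiny NumPy sketch of that effect, using three invented points and min-max scaling to [0, 1] as one possible normalization (the ranges are the ones from the example above; the points themselves are made up):

```python
import numpy as np

# Columns: attribute A (range 0..1000), attribute B (range 0..5).
X = np.array([
    [100.0, 1.0],   # reference point
    [300.0, 1.0],   # same B, moderately different A
    [120.0, 4.8],   # almost same A, very different B
])

# Euclidean distance from the reference point on the raw values.
raw = np.linalg.norm(X[1:] - X[0], axis=1)

# Min-max scaling: map each attribute's known range onto [0, 1].
mins = np.array([0.0, 0.0])
maxs = np.array([1000.0, 5.0])
X_scaled = (X - mins) / (maxs - mins)
scaled = np.linalg.norm(X_scaled[1:] - X_scaled[0], axis=1)
```

On the raw values the third point looks closest to the reference because the large difference in B barely registers next to A. After scaling, the second point is the closer one, since both attributes now contribute on the same footing.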