Python time series databases

I will have hourly temperature data for each city in the US, which needs some statistical and plotting post-processing using grandfathered tools in Python. Right now the list has only a few cities, but it will grow to include more.
Each city's hourly temperature data is thus a two-column vector: the first column is the hourly date-time and the second is the temperature.
What open source database tool would you suggest? I am trying to avoid keeping this temperature data in CSV files.
Edit: I also receive a 10-day hourly temperature forecast for each city every day. I want to store the historical forecasts I receive from the vendor each day as well.
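Not from the thread, but a minimal sketch of one common setup for this: SQLite accessed through pandas, with one table for hourly observations and one for the daily vendor forecasts keyed by issue date. The file, table, and column names (weather.db, observations, forecasts, city, ts, temp_f, issued) are illustrative assumptions, not a prescribed schema.

import sqlite3
import pandas as pd

conn = sqlite3.connect("weather.db")  # hypothetical database file

# Hourly observations: one row per (city, timestamp).
obs = pd.DataFrame({
    "city": ["Boston", "Boston"],
    "ts": pd.to_datetime(["2023-01-01 00:00", "2023-01-01 01:00"]),
    "temp_f": [28.4, 27.9],
})
obs.to_sql("observations", conn, if_exists="append", index=False)

# Daily 10-day hourly forecasts: keeping the issue date preserves every
# vendor run, so historical forecasts can later be compared to observations.
fc = pd.DataFrame({
    "city": ["Boston"],
    "issued": pd.to_datetime(["2023-01-01"]),
    "ts": pd.to_datetime(["2023-01-02 00:00"]),
    "temp_f": [30.1],
})
fc.to_sql("forecasts", conn, if_exists="append", index=False)

# Read one city back as a time-indexed frame for plotting and statistics.
df = pd.read_sql("SELECT ts, temp_f FROM observations WHERE city = 'Boston'",
                 conn, parse_dates=["ts"], index_col="ts")

This scales from a few cities to many without changing the schema; adding a city is just more rows.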

Related

Use TensorFlow model to guess / predict values between 2 points

My question is something that I didn't encounter anywhere: I've been wondering whether it is possible for a TF model to determine values between 2 dates that have real / validated values assigned to them.
I have an example:
Let's take the price of Nickel; here's its chart for the last week:
There is no data for the two following dates: 19/11 and 20/11.
But we have the data points before and after.
So is it possible to use the data from before and after these 2 points to guess the values of the 2 missing dates?
Thank you!
It would be possible to create a machine learning model to predict the prices given a dataset of previous prices. Take a look at this post for instance. You would have to modify it slightly such that it predicts the prices in the gaps given previous and upcoming prices.
But for the example you gave, assuming the dates are from this year (2022), those are a Saturday and a Sunday; the stock market is closed on weekends, hence there is no price for the item. Also notice that there are other days of the year when no trading occurs, such as holidays, so there is no price then either.
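If the goal is only to estimate values for those gap dates rather than train a model, a simpler route is interpolation between the surrounding points. A minimal sketch with pandas; the nickel prices below are made-up placeholders, not real quotes:

import pandas as pd

# Made-up daily nickel prices with the weekend gap (19/11 and 20/11 missing).
prices = pd.Series(
    [25800.0, 26100.0, 25950.0],
    index=pd.to_datetime(["2022-11-17", "2022-11-18", "2022-11-21"]),
)

# Reindex to a continuous daily range, then fill the gap linearly
# between the known points on either side.
daily = prices.reindex(pd.date_range("2022-11-17", "2022-11-21", freq="D"))
filled = daily.interpolate(method="time")
print(filled)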

Approach to storing forecast time series data using Python

So I want to scrape a weather forecast table once a day and store my results for future analysis. I want to store the data, but I'm not sure how.
Example of the data: Forecast Table
My four variables of interest are wind speed, wind gusts, wave height, and wave period.
This is my first Python project involving time series data, and I'm fairly new to databases, so go easy on me and ELI5 please.
In the Python for Everybody course I recently took, I learned about relational databases and SQLite. The main idea was to store data efficiently and never store the same data twice. However, none of the examples involved time series data, so now I'm not sure what the best approach is.
Suppose I create a table for each variable, plus one for the date I scraped the forecast, with the scraping date serving as the primary key. A variable table such as wind speed would then have the scraping date as its first column, followed by columns holding the forecasted values for each timestamp. Although this would be more space-efficient than creating a new table every day, there are a few problems. The timestamps are not uniform (see image; forecast times run only from 3am to 9pm). Also, depending on the time of day the forecast is scraped, the date and time values of the timestamps shift, so the next timestamp is not always 2 hours later.
Seeing as each time I scrape the forecast I get a new table, should I create a new database table each time in SQLite? This seems like a rather rudimentary solution, and I'm sure there are better ways to store the data.
How would you go about this?
Summarizing my comments:
You may want to append forecast data from each new scrape to the existing data in the same database table.
From each web scrape you will get approx. 40 new records with the same scraping timestamp but different forecast timestamps.
e.g., these would be the columns of the table, using ID as the primary key with AUTOINCREMENT (a sketch of the schema in code follows the note below):
ID
Scraping_time
Forecast_hours
Wind_speed
Wind_gusts
Wind_direction
Wave_height
Wave_period
Wave_direction
Note: if you use SQLite, you could leave out the ID column, as SQLite adds a ROWID column by default when no other primary key is specified
(https://www.sqlite.org/autoinc.html)
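A minimal sketch of that schema with the sqlite3 module from the standard library; the table name forecasts, the database file name, and the sample row are assumptions for illustration:

import sqlite3

conn = sqlite3.connect("forecasts.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS forecasts (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        Scraping_time TEXT,
        Forecast_hours TEXT,
        Wind_speed REAL,
        Wind_gusts REAL,
        Wind_direction TEXT,
        Wave_height REAL,
        Wave_period REAL,
        Wave_direction TEXT
    )
""")

# Each scrape appends ~40 rows that share one Scraping_time but carry
# different forecast timestamps.
row = ("2022-11-21 09:00", "2022-11-21 15:00", 12.0, 18.0, "NW", 1.4, 9.0, "W")
conn.execute(
    "INSERT INTO forecasts (Scraping_time, Forecast_hours, Wind_speed, "
    "Wind_gusts, Wind_direction, Wave_height, Wave_period, Wave_direction) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?)", row)
conn.commit()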

Finding the price change between two dates in yfinance

I'm trying to find the price change between two dates using yfinance. I have my code set up right now to show me a graph of a selected stock from the start to the end of the Great Recession with matplotlib. But I can't figure out how to get just one day of data and store it in a variable. Is there any way to store the closing price of a certain date, or even get the price change between two dates?
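No answer was recorded here, but a hedged sketch of one way to do it with yfinance: pull daily bars for the window and take the closing price from the first and last rows. history only returns trading days, so these are the nearest trading days to the endpoints; the ticker and dates are arbitrary examples.

import yfinance as yf

# Daily bars spanning roughly the Great Recession window.
hist = yf.Ticker("AAPL").history(start="2007-10-09", end="2009-03-10")
close = hist["Close"]

# A single day's closing price, stored in a variable.
start_close = close.iloc[0]
end_close = close.iloc[-1]

# Price change between the two dates, absolute and in percent.
change = end_close - start_close
print(f"Change: {change:.2f} ({change / start_close * 100:.1f}%)")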

Calculating EWMA for athlete training load data with Python

Good day all!
I am new to the programming world and I'm struggling with the following.
Training load is calculated by multiplying the session rating (sRPE) by the duration of the session in minutes. The acute load of the past 7 days is then compared to the chronic load of the past 28 days. Below is an example of such a table:
Training load calculations
The challenge comes when the EWMA (exponentially weighted moving average) must be calculated. The formula is:
EWMA_today = λ × Load_today + (1 − λ) × EWMA_yesterday, where λ = 2/(N + 1) and N is the number of days (7 for acute, 28 for chronic).
The part I'm struggling with is the EWMA of the previous day. In Excel, I could select the individual cell, or use a VLOOKUP on the athlete name and training date.
In order to select the previous EWMA, the athlete name and training date must be used to select the correct value for each athlete's workload ratio.
How can I do this with Python? I tried to use Pandas (not very well), but to no avail :(
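No answer was included here, but the previous-day lookup is exactly what pandas' ewm does internally, so no VLOOKUP-style reference is needed: sort by date, group by athlete, and let the recursion carry yesterday's EWMA forward. A sketch under the assumption that the table has athlete, date, and load columns (names taken from the description, not the actual file):

import pandas as pd

# Illustrative training-load rows: one per athlete per day (load = sRPE * minutes).
df = pd.DataFrame({
    "athlete": ["A", "A", "A", "B", "B", "B"],
    "date": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"] * 2),
    "load": [300.0, 450.0, 0.0, 500.0, 200.0, 350.0],
})
df = df.sort_values(["athlete", "date"])

# ewm with span=N uses alpha = 2/(N + 1), i.e. the lambda in the formula,
# and adjust=False applies the recursive form that feeds in yesterday's EWMA.
df["acute"] = df.groupby("athlete")["load"].transform(
    lambda s: s.ewm(span=7, adjust=False).mean())
df["chronic"] = df.groupby("athlete")["load"].transform(
    lambda s: s.ewm(span=28, adjust=False).mean())

# Acute:chronic workload ratio per athlete per day.
df["acwr"] = df["acute"] / df["chronic"]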

What does "Aggregate the data to weekly level, so that there is one row per product-week combination" mean, and how can I do it using Python (pandas)?

I am working on a transactions data frame using Python (Anaconda), and I was told to aggregate the data to a weekly level so that there is one row per product-week combination.
I want to make sure the following code is correct, because I don't think I fully understood what I need to do:
dataset.groupby(['id', dataset['history_date'].dt.strftime('%W')])['sales'].sum()
Note: my dataset contains the following columns:
id history_date item_id price inventory sales category_id
Aggregating data means combining rows based on certain criteria in order to summarize them.
For example, it sounds like your dataset may be broken down by daily dates, where each row corresponds to a specific date.
What you need to do is aggregate the data into weekly segments, instead of having it broken down on a daily basis.
This is achieved by grouping your rows by the most granular pairing your requirement calls for: the product and the week, as in the sketch below.
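A sketch of that grouping with pd.Grouper, which buckets rows into calendar weeks; it assumes item_id identifies the product (swap in id if that is your product key) and that history_date is already a datetime column:

import pandas as pd

# Tiny stand-in for the transactions frame described in the question.
dataset = pd.DataFrame({
    "item_id": [1, 1, 1, 2],
    "history_date": pd.to_datetime(
        ["2022-11-14", "2022-11-15", "2022-11-21", "2022-11-14"]),
    "sales": [5, 3, 7, 2],
})

# One row per product-week combination.
weekly = (dataset
          .groupby(["item_id", pd.Grouper(key="history_date", freq="W")])
          ["sales"].sum()
          .reset_index())
print(weekly)

Unlike strftime('%W'), which would lump together the same week number from different years, pd.Grouper keeps the actual week-ending date in the result.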
