How to solve mysql daily analytics that happens when date changes

I have two separate programs: one counts the daily view stats and the other calculates earnings based on those stats.
The Counter runs first, followed by the Earning Calculator a few seconds later.
The Earning Calculator works by getting stats from the counter table using date(created_at) > date(now()).
The problem I'm facing is this: say the Counter adds 100 views at 23:59:59, and by the time the Earning Calculator runs it is already the next day.
Since I'm using date(created_at) > date(now()), I will miss the last 100 views added by the Counter.
One way to solve my problem is to summarise the previous day's report at 00:00:10 every day, but I do not like this.
Are there any other ways to solve this issue?
Thanks.

You have to store a date on the data itself and filter on that date instead of using now().
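A minimal sketch of that idea, assuming the Counter stamps every row with the date the views belong to and the Earning Calculator is handed that same date instead of reading the clock. It uses an in-memory SQLite database so it runs anywhere; the column `stat_date` and the helpers `record_views` and `views_for` are hypothetical names, and the same queries should carry over to MySQL with its own placeholder syntax.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
# stat_date is an assumed column: the day the views belong to,
# written by the Counter rather than derived from now() at read time.
conn.execute(
    "CREATE TABLE counter (stat_date TEXT NOT NULL, views INTEGER NOT NULL)"
)


def record_views(conn, stat_date, views):
    """Counter side: store the views together with the day they count for."""
    conn.execute(
        "INSERT INTO counter (stat_date, views) VALUES (?, ?)",
        (stat_date.isoformat(), views),
    )


def views_for(conn, stat_date):
    """Calculator side: sum the views for an explicit day, never for 'today'."""
    row = conn.execute(
        "SELECT COALESCE(SUM(views), 0) FROM counter WHERE stat_date = ?",
        (stat_date.isoformat(),),
    ).fetchone()
    return row[0]


run_date = date(2017, 1, 1)         # the day the Counter was counting
record_views(conn, run_date, 100)   # written at 23:59:59
total = views_for(conn, run_date)   # still found after midnight
```

Because the Earning Calculator asks for `run_date` explicitly, it does not matter whether it starts before or after midnight.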

Related

Approach to storing forecast time series data using python

I want to scrape a weather forecast table once a day and store the results for future analysis, but I'm not sure how to store the data.
Example of the data: Forecast Table
My four variables of interest are wind speed, wind gusts, wave height, and wave period.
This is my first Python project involving time series data and I'm fairly new to databases, so go easy on me and ELI5 please.
In the Python For Everyone course I recently took, I learned about relational databases and using SQLite. The main idea there was to be efficient in storing data and never store the same data twice. However, none of the examples involved time series data, so now I'm not sure what the best approach is.
One idea is to create a table for each variable and one more for the date I scraped the forecast, with the date of scraping serving as the primary key. In this design, the first column of a variable's table, such as wind speed, would be the date of scraping, and the following columns would be the forecasted values for each timestamp. Although this would make the storage more efficient than creating a new table every day, there are a few problems. The timestamps are not uniform (see image, forecast times are only from 3am to 9pm). Also, depending on the time of day that the forecast is scraped, the date and time values of the timestamps keep changing, so the next timestamp is not always 2 hours later.
Seeing as I get a new table each time I scrape the forecast, should I create a new database table each time in SQLite? This seems like a rather rudimentary solution, and I'm sure there are better ways to store the data.
How would you go about this?

Summarizing my comments:
You may want to append the forecast data from each new scrape to the existing data in the same database table.
Each new scrape gives you approx. 40 new records with the same scraping time stamp but different forecast time stamps.
For example, these would be the columns of the table, using ID as the primary key with AUTOINCREMENT:
ID
Scrapping_time
Forecast_hours
Wind_speed
Wind_gusts
Wind_direction
Wave
Wave_period
wave_direction
Note:
If you use SQLite, you could leave out the ID column, as SQLite gives every ordinary table an implicit ROWID column by default
(https://www.sqlite.org/autoinc.html)
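A minimal sketch of the single-table layout with Python's built-in sqlite3 module. The column names follow the list above; the table name `forecast`, the helper `append_scrape`, the column types, and the sample values are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path to keep the data
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS forecast (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        Scrapping_time TEXT NOT NULL,   -- when the page was scraped
        Forecast_hours TEXT NOT NULL,   -- the time the forecast is for
        Wind_speed REAL,
        Wind_gusts REAL,
        Wind_direction TEXT,
        Wave REAL,
        Wave_period REAL,
        wave_direction TEXT
    )
    """
)


def append_scrape(conn, scrape_time, rows):
    """Append all forecast rows from one scrape to the same table."""
    conn.executemany(
        "INSERT INTO forecast (Scrapping_time, Forecast_hours, Wind_speed, "
        "Wind_gusts, Wind_direction, Wave, Wave_period, wave_direction) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        [(scrape_time, *row) for row in rows],
    )
    conn.commit()


# Made-up sample values: two scrapes, two forecast times each.
append_scrape(conn, "2019-05-01T07:00", [
    ("2019-05-01T09:00", 12.0, 18.0, "NW", 1.2, 8.0, "W"),
    ("2019-05-01T12:00", 14.0, 20.0, "NW", 1.4, 9.0, "W"),
])
append_scrape(conn, "2019-05-02T07:00", [
    ("2019-05-02T09:00", 10.0, 15.0, "N", 1.0, 7.0, "NW"),
    ("2019-05-02T12:00", 11.0, 16.0, "N", 1.1, 7.0, "NW"),
])
row_count = conn.execute("SELECT COUNT(*) FROM forecast").fetchone()[0]
```

Because both time stamps are stored on every row, irregular forecast times are not a problem: each row simply says when it was scraped and which time it forecasts.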

How to Write a Limit Order Strategy with Backtrader?

I am new to Backtrader and I can't figure out how to write the following strategy:
Every morning it places a limit buy order at 80% of Open price. If the order is executed during the day (i.e. Low price < the limit price for that day), then sell the stock at Close.
I am using Yahoo's OHLC daily data.
Can anyone show me how to write the Strategy part of the code? I posted a similar question on BT's official forum but couldn't get an answer.
Thanks.
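This is not Backtrader code, but the rule itself can be checked in plain Python before it is wired into a Strategy. A minimal sketch, assuming one daily OHLC bar, a fill exactly at the limit price whenever Low < limit, and no commission or slippage; `limit_order_day` is a hypothetical helper, not part of any library.

```python
def limit_order_day(open_price, low, close, fraction=0.8):
    """Return the per-share profit for one day, or None if no fill.

    The limit buy sits at fraction * Open. It is treated as filled
    when the day's Low drops below the limit; the position is then
    sold at the Close of the same day.
    """
    limit = fraction * open_price
    if low < limit:
        return close - limit
    return None


filled = limit_order_day(open_price=100.0, low=75.0, close=90.0)
not_filled = limit_order_day(open_price=100.0, low=85.0, close=95.0)
```

Here `filled` is the Close minus the limit price (90 minus 80), and `not_filled` is None because the Low never reached the limit.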

Cumulative Time Spent in Specific States

I have a dataset that looks as follows:
What I would like to do with this data is calculate how much time was spent in specific states, per day. For example, if I wanted to know how long the unit was running today, I would just like to know the sum of the time the unit spent in each state: RUNNING: 45 minutes, NOT_RUNNING: 400 minutes, WARMING_UP: 10 minutes, etc.
I know how to summarize the column data on its own, but I'm looking to use the time stamps I have available: subtract the first time it was on from the last time it was on to get that difference. I haven't had any luck searching for this solution, but there's no way I'm the first to come across this, and I know it can be done somehow; I'm just looking to learn how. Anything helps, thanks!
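One way to approach this, as a minimal sketch: since the dataset itself is not shown, it assumes the data can be read as a time-ordered list of (timestamp, state) rows, where each state lasts until the next row. Each interval is credited to the day it starts on, and the last row is skipped because its end time is unknown. `minutes_per_state` and the sample events are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def minutes_per_state(events):
    """Sum minutes per (day, state) from time-ordered (timestamp, state) rows."""
    totals = defaultdict(float)
    # Pair every row with the following one: the next timestamp ends the state.
    for (start, state), (end, _next_state) in zip(events, events[1:]):
        totals[(start.date(), state)] += (end - start).total_seconds() / 60
    return dict(totals)


# Made-up sample rows for a single day.
events = [
    (datetime(2020, 1, 1, 8, 0), "WARMING_UP"),
    (datetime(2020, 1, 1, 8, 10), "RUNNING"),
    (datetime(2020, 1, 1, 8, 55), "NOT_RUNNING"),
    (datetime(2020, 1, 1, 9, 30), "RUNNING"),
]
totals = minutes_per_state(events)
```

If a state can run past midnight, the interval would need to be split at the day boundary first; this sketch does not do that.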

Iterating a custom date range over multiple years in Pandas

The short of it: How do I parse through yearly data when the year is non-standard? In my case it runs from Sept to Sept.
I've got a script to parse through years' worth of hourly temperature data and calculate the accumulated growing degree days (GDD) per year. Some demo data and the script are on this gist if you're curious where I'm at. The meat and potatoes though is getting the yearly cumulative sum with this:
df[col_name] = df.resample('Y')['dGDD'].cumsum()
and all works well. Each day will show the accumulated GDD in the proper column until Dec 31 when it starts from zero again.
My next goal is to calculate Chilling Degree Days, which work similarly to GDD but run from Sept to Sept each year, and I have no idea how to work that in (or what to properly google for help). I know I can set a date range to run it over, i.e. df['2012-9-1':'2013-9-1'], but I'm not sure how to automate it for the entirety of my data (2007-2018).
Thanks!

Solved it! Turns out the period of time I'm looking for is known as a 'Water Year'. Learning that led me to another question and answer. That, combined with a closer look at the .resample() docs, where I learned you can direct the resample to something other than the index, got me this:
df['WaterYear'] = df.index + pd.DateOffset(months=-8)
df[col_name] = df.resample('Y', on='WaterYear')['dCDD'].cumsum()
And everything seems to be working swimmingly now.
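The same bucketing can be checked without pandas. A minimal sketch in plain Python, assuming the season starts on Sept 1 and is labelled by the year it starts in, which is what shifting the dates back by 8 months achieves (Sept 2012 through Aug 2013 all land in calendar year 2012). `season_year` and `cumulative_by_season` are hypothetical helpers, not pandas functions, and the sample values are made up.

```python
from datetime import date


def season_year(d, start_month=9):
    """Label a date with the year in which its Sept-to-Sept season began."""
    return d.year if d.month >= start_month else d.year - 1


def cumulative_by_season(daily):
    """Running total of (date, value) pairs that resets at each new season."""
    out = []
    current, running = None, 0.0
    for d, value in daily:
        season = season_year(d)
        if season != current:       # first day of a new season: start over
            current, running = season, 0.0
        running += value
        out.append((d, season, running))
    return out


daily = [
    (date(2012, 8, 31), 1.0),   # last day of the 2011 season
    (date(2012, 9, 1), 2.0),    # 2012 season starts
    (date(2013, 1, 15), 3.0),
    (date(2013, 8, 31), 4.0),
    (date(2013, 9, 1), 5.0),    # 2013 season starts
]
result = cumulative_by_season(daily)
```

The running total carries across Dec 31 and only resets on Sept 1, which is the behaviour the shifted 'WaterYear' column gives the resample.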

Items movement daily collection database design system issue

I am building a very simple database in MySQL to track the movement of items. The current paper form looks like this:
Date      totalFromPreviousDay  NewToday  LeftToday  RemainAtEndOfDay
1.1.2017  5                     5         2          8  (5 + 5 - 2)
2.1.2017  8                     3         0          11 (8 + 3 - 0)
3.1.2017  11                    0         5          6  (11 + 0 - 5)
And so forth. In my table, I want to make totalFromPreviousDay and RemainAtEndOfDay calculated fields which I show in my front end only. That is mainly because we tend to erase entries on the paper form due to errors, and I want the totals to reflect changes to the other two fields. As such, I designed my table like this:
id
date
NewToday
LeftToday
Now the problem I am facing is that I want to select any date and be able to say "there were 5 items at the start of the day or from the previous day, then 5 were added, 0 left and the day ended with 10 items".
So far, I can't really think of a way of going about it. Theoretically, I want to try something like this: if the requested day is Feb. 1, 2017, start at 0 because that's the day we started collecting data. If not, loop through the records starting from 0 and do the math until the requested date is found.
But that is obviously inefficient because I have to go from the first date to the last every time.
Is my approach OK, or should I include the columns in the table? If the former, what would be the way to do it in Python/MySQL?

I think you have to step back a little and define the business needs first (it is worthwhile to talk to somebody who has worked with stocks before), because these determine your table structure.
A system always tracks the current level of stocks and the movements. It is a business decision how often you save your historical stock level, and this influences how you store the data.
You may save the current stock level along with all transactions. In this case you would store the stock level in the transactions table. You do not even have to sum up the transactions per day, because the last transaction of each day will have the daily closing stock level anyway.
You may choose to save the historic stock levels regularly (on a daily / weekly / monthly, etc. basis). In this case you will have a separate historic stock levels table with stock id, stock name (the name may change over time, so it may be a good idea to save it), date and the level. If you would like to know the historic stock level for a point in time that falls between your saved points, then you need to take the latest saved stock level before the point you are looking for and add up all transactions from that saved point up to the point you are looking for.
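A minimal sketch of the second option (latest saved level plus the transactions since then), using an in-memory SQLite database so it runs as-is; the SQL should carry over to MySQL with its own placeholder syntax. The `movements` table follows the question's columns; the `snapshots` table, the helper `day_summary`, and the ISO date strings are assumptions for illustration. A snapshot here means the closing level at the end of its date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movements (id INTEGER PRIMARY KEY, date TEXT UNIQUE, "
    "NewToday INTEGER, LeftToday INTEGER)"
)
conn.execute("CREATE TABLE snapshots (date TEXT PRIMARY KEY, level INTEGER)")
# Data from the paper form; the starting 5 items become a snapshot.
conn.execute("INSERT INTO snapshots VALUES ('2016-12-31', 5)")
conn.executemany(
    "INSERT INTO movements (date, NewToday, LeftToday) VALUES (?, ?, ?)",
    [("2017-01-01", 5, 2), ("2017-01-02", 3, 0), ("2017-01-03", 0, 5)],
)


def day_summary(conn, day):
    """Return (opening, new, left, closing) for one ISO date string."""
    snap = conn.execute(
        "SELECT date, level FROM snapshots WHERE date < ? "
        "ORDER BY date DESC LIMIT 1",
        (day,),
    ).fetchone()
    snap_date, level = snap if snap else ("", 0)  # no snapshot: start at 0
    # Net movement after the snapshot and before the requested day.
    moved = conn.execute(
        "SELECT COALESCE(SUM(NewToday - LeftToday), 0) FROM movements "
        "WHERE date > ? AND date < ?",
        (snap_date, day),
    ).fetchone()[0]
    opening = level + moved
    row = conn.execute(
        "SELECT NewToday, LeftToday FROM movements WHERE date = ?", (day,)
    ).fetchone()
    new, left = row if row else (0, 0)
    return opening, new, left, opening + new - left


summary = day_summary(conn, "2017-01-03")
```

The database does the summing in one query, so no Python loop over all records is needed, and saving a snapshot now and then keeps the summed range short.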
