Resetting a rolling sum


I am trying to create a rolling stock count from the quantity. The count should reset every time there is a real stock count (TypeOfMovement = 3). This should work for each ArticleNo and be grouped by date.
I can get a running stock count (lines < 123 in the image) and then take the real stock count when TypeOfMovement = 3 (line 123 of the image), but the count doesn't reset: it continues from the value before the real stock count (ResetRunningTotal in line 124 should be 6293).
The solution should run in SSMS. Alternatively, a Python solution would also work.
My query so far is:
WITH a
AS
(SELECT
DateOfMovement, Quantity, ArticleNo, TypeOfChange,
CASE
WHEN TypeOfChange = 3 Then 0
ELSE Quantity
END AS RunningTotal
FROM Stock_Table
Group by DateOfMovement, Quantity, ArticleNo, TypeOfChange)
SELECT
*
,CASE
WHEN TypeOfChange= 3 THEN Quantity
ELSE Sum(Quantity) OVER(ORDER BY ArticleNo, DateOfMovement)
END AS ResetRunningTotal
FROM a
WHEre ArticleNo = 9410
group by DateOfMovement, ArticleNo, Quantity, TypeOfChange, RunningTotal
order by DateOfMovement asc
[Image of the results table]

OK, so you want running totals for each ArticleNo, ordered by DateOfMovement, that reset whenever you encounter a TypeOfChange value of 3.
To do this, you need to create a group id (Grp) for each running total. You can do this with a CTE that calculates the group ids, and then compute the running totals over the CTE results:
with Groups as (
select st.*
, sum(case TypeOfChange when 3 then 1 else 0 end)
over (partition by ArticleNo order by DateOfMovement) Grp
from Stock_Table st
)
select Groups.*
, sum(Quantity) over (partition by ArticleNo, Grp order by DateOfMovement) RunningTotal
from Groups
order by ArticleNo, dateofmovement
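Since a Python solution is also acceptable, here is a minimal sketch that runs the same two-step query through Python's built-in sqlite3 module instead of SQL Server. The column types, the sample rows, and the helper name reset_running_totals are assumptions made for illustration, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Same idea as the query above: count the reset rows (TypeOfChange = 3)
# seen so far to get a group id, then sum Quantity within each group.
QUERY = """
WITH Groups AS (
    SELECT st.*,
           SUM(CASE TypeOfChange WHEN 3 THEN 1 ELSE 0 END)
               OVER (PARTITION BY ArticleNo ORDER BY DateOfMovement) AS Grp
    FROM Stock_Table st
)
SELECT DateOfMovement, Quantity,
       SUM(Quantity) OVER (PARTITION BY ArticleNo, Grp
                           ORDER BY DateOfMovement) AS RunningTotal
FROM Groups
ORDER BY ArticleNo, DateOfMovement
"""


def reset_running_totals(rows):
    """Load (date, article, type, quantity) rows and return the running totals."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Stock_Table ("
        "DateOfMovement TEXT, ArticleNo INTEGER, "
        "TypeOfChange INTEGER, Quantity INTEGER)"
    )
    conn.executemany("INSERT INTO Stock_Table VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


# Hypothetical sample data: the third row is a real stock count (type 3).
sample = [
    ("2021-01-01", 9410, 1, 100),
    ("2021-01-02", 9410, 2, -30),
    ("2021-01-03", 9410, 3, 60),
    ("2021-01-04", 9410, 1, 5),
]
totals = reset_running_totals(sample)
for row in totals:
    print(row)
```

With this sample the total restarts at 60 on the stock-count row and continues from there. If several movements share the same DateOfMovement, the default window frame treats them as ties, so a finer ordering column would be needed in that case.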

Related

SQL comparison operators in Python: I can't execute a query with comparison operators from a Python file

sql = "WITH comparing_price AS (SELECT CODE, DATE, OPEN, high, low, close, volume,"\
"LEAD(OPEN, 1) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd1_open',"\
"LEAD(OPEN, 2) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd2_open',"\
"LEAD(OPEN, 3) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd3_open',"\
"LEAD(high, 1) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd1_high',"\
"LEAD(high, 2) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd2_high',"\
"LEAD(high, 3) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd3_high',"\
"LEAD(high, 4) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd4_high',"\
"LEAD(low, 1) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd1_low',"\
"LEAD(low, 2) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd2_low',"\
"LEAD(low, 3) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd3_low',"\
"LEAD(close, 1) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd1_close',"\
"LEAD(close, 2) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd2_close',"\
"LEAD(close, 3) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd3_close',"\
"LEAD(volume, 1) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd1_volume',"\
"LEAD(volume, 2) OVER (PARTITION BY CODE ORDER BY DATE) AS 'd2_volume',"\
"AVG(close) OVER (PARTITION BY CODE ORDER BY DATE, DATE ROWS BETWEEN 2 preceding AND 2 following) AS 'd2_MA5',"\
"AVG(close) OVER (PARTITION BY CODE ORDER BY DATE, DATE ROWS BETWEEN 7 preceding AND 2 following) AS 'd2_MA10',"\
"AVG(close) OVER (PARTITION BY CODE ORDER BY DATE, DATE ROWS BETWEEN 17 preceding AND 2 following) AS 'd2_MA20',"\
"AVG(close) OVER (PARTITION BY CODE ORDER BY DATE, DATE ROWS BETWEEN 57 preceding AND 2 following) AS 'd2_MA60',"\
"AVG(close) OVER (PARTITION BY CODE ORDER BY DATE, DATE ROWS BETWEEN 117 preceding AND 2 following) AS 'd2_MA120',"\
"STD(close) OVER (PARTITION BY CODE ORDER BY DATE, DATE ROWS BETWEEN 17 preceding AND 2 following) AS 'd2_std'"\
"FROM daily_price)"\
"SELECT * "\
"FROM comparing_price"\
"WHERE "\
"volume > 1 AND d1_volume > 1 AND d2_volume > 1"
execute(sql)
And I get an error message:
pymysql.err.ProgrammingError:
(1064, "You have an error in your SQL syntax;
check the manual that corresponds to your MariaDB server version for the right syntax to use near
'> 1 AND d1_volume > 1 AND d2_volume > 1' at line 1")
I think the problem has to do with sending comparison operators through execute(), because the query works fine when I run it directly in HeidiSQL. Or are there any other ideas why it may not work?
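One thing worth checking, judging by where the error message points: Python joins adjacent string literals with nothing in between, so "FROM comparing_price" followed by "WHERE " becomes "FROM comparing_priceWHERE ". The server may then read volume as a table alias and fail at "> 1". The sketch below reproduces only the tail of the query; the names broken and fixed are illustrative.

```python
# Adjacent string literals are concatenated with no separator,
# so a missing trailing space silently glues two SQL keywords together.
broken = ("SELECT * "
          "FROM comparing_price"      # no trailing space here
          "WHERE "
          "volume > 1 AND d1_volume > 1 AND d2_volume > 1")

# A triple-quoted string keeps the line breaks as whitespace,
# which avoids the problem and also removes the need for backslashes.
fixed = """
SELECT *
FROM comparing_price
WHERE volume > 1 AND d1_volume > 1 AND d2_volume > 1
"""

print(broken)
print(fixed)
```

Printing the final SQL string before calling execute() is a quick way to spot this kind of issue.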

SQLite fetch until

I have a question about SQL.
I have the following sample table called Records:
record_id | subject | start_timestamp | end_timestamp | interval
2 | Start Product 2 | 2021-04-21T16:22:39 | 2021-04-21T16:23:40 | 0.97
3 | error 1 | 2021-04-21T16:25:44 | 2021-04-21T16:25:54 | 10.0
4 | End Product 2 | 2021-04-21T16:30:13 | 2021-04-21T16:30:14 | 0.97
5 | Start Product 1 | 2021-04-21T16:35:13 | 2021-04-21T16:35:13 | 0.6
6 | End Product 1 | 2021-04-21T16:36:13 | 2021-04-21T16:36:13 | 0.45
First I select all the items that have "start" in their subject and are not in the table BackupTO (for now the table BackupTO is not important):
SELECT Records.record_id, Records.start_timestamp, Records.interval FROM Records
LEFT JOIN BackupTO ON BackupTO.record_id = Records.record_id
WHERE BackupTO.record_id IS NULL AND Records.log_type = 1 AND Records.subject LIKE '%start%'
When I run this, I get:
record_id | start_timestamp | interval
2 | 2021-04-21T16:22:39 | 0.97
5 | 2021-04-21T16:35:13 | 0.6
OK, all good. Now comes my question. I fetch this in Python and loop through the data. First I calculate the product number based on the interval:
product = round(result[2] / 0.5)
So an interval of 0.97 is product 2, and an interval of 0.6 or 0.45 is product 1. All great!
So I know record_id 2 is product 2, and I want to execute an SQL query that returns all items after record_id 2 until it finds an item matching %end%2 in its subject (the 2 is for product 2; it could also be product 1).
For example, when it finds Start Product 2, I get a list with record_id 3 and 4.
I want to get all items from the start until the end.
So it gives me a list like the one below: these are all the items found after Start Product 2 until %end%2 was found. For product 1, it would return just record_id 6, because there is nothing between the start and the end.
record_id | start_timestamp | interval
3 | 2021-04-21T16:22:39 | 10.0
4 | 2021-04-21T16:35:13 | 0.97
I tried OFFSET and FETCH, but I couldn't get it to work. Could somebody help me out here?

Use your query as a CTE and join it to the table Records.
Then, with the MIN() window function, find the record_id up to which you want the rows returned:
WITH cte AS (
SELECT r.*
FROM Records r LEFT JOIN BackupTO b
ON b.record_id = r.record_id
WHERE b.record_id IS NULL AND r.log_type = 1 AND r.subject LIKE '%start%'
)
SELECT *
FROM (
SELECT r.*,
MIN(CASE WHEN r.subject LIKE '%end%' THEN r.record_id END) OVER () id_end
FROM Records r INNER JOIN cte c
ON r.record_id > c.record_id
WHERE c.record_id = ?
)
WHERE COALESCE(record_id <= id_end, 1)
Change ? to 2 or 5 for each case.
If you have the record_ids returned by your query, it is simpler:
SELECT *
FROM (
SELECT r.*,
MIN(CASE WHEN r.subject LIKE '%end%' THEN r.record_id END) OVER () id_end
FROM Records r
WHERE r.record_id > ?
)
WHERE COALESCE(record_id <= id_end, 1)
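Because the rows are fetched in Python anyway, the simpler query can be called once per start row. The following is a rough sketch using the built-in sqlite3 module with the sample rows from the question; the helper name rows_until_end is made up for illustration, and the window function needs SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Records (record_id INTEGER PRIMARY KEY, subject TEXT, "
    "start_timestamp TEXT, end_timestamp TEXT, interval REAL)"
)
conn.executemany(
    "INSERT INTO Records VALUES (?, ?, ?, ?, ?)",
    [
        (2, "Start Product 2", "2021-04-21T16:22:39", "2021-04-21T16:23:40", 0.97),
        (3, "error 1", "2021-04-21T16:25:44", "2021-04-21T16:25:54", 10.0),
        (4, "End Product 2", "2021-04-21T16:30:13", "2021-04-21T16:30:14", 0.97),
        (5, "Start Product 1", "2021-04-21T16:35:13", "2021-04-21T16:35:13", 0.6),
        (6, "End Product 1", "2021-04-21T16:36:13", "2021-04-21T16:36:13", 0.45),
    ],
)

# id_end is the first record_id after the start row whose subject contains
# "end"; rows are kept up to and including that id.
QUERY = """
SELECT record_id, subject
FROM (
    SELECT r.*,
           MIN(CASE WHEN r.subject LIKE '%end%' THEN r.record_id END) OVER () AS id_end
    FROM Records r
    WHERE r.record_id > ?
)
WHERE COALESCE(record_id <= id_end, 1)
ORDER BY record_id
"""


def rows_until_end(start_id):
    """Return the rows after start_id up to the first 'end' row."""
    return conn.execute(QUERY, (start_id,)).fetchall()


print(rows_until_end(2))
print(rows_until_end(5))
```

Note that this stops at the first row containing "end", whatever the product number; to match a specific product, the LIKE pattern could be built from the computed product number and passed as a second parameter.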

Update values in sqlite database when there are multiple with the same name

I'll do my best to explain my problem.
I'm working on CS50's C$50 Finance and am currently implementing a function called sell. The purpose of this function is to update the cash value of a specific user in the database and to update their portfolio.
I'm struggling with updating the portfolio table.
This is the table definition, for clarification:
CREATE TABLE portfolio(id INTEGER, username TEXT NOT NULL, symbol TEXT NOT NULL, shares INTEGER, PRIMARY KEY(id));
Let's say I have these values in it:
id | username | symbol | shares
1 | eminem | AAPL | 20
2 | eminem | NFLX | 5
3 | eminem | AAPL | 5
Now the user sells some of their stocks, and I have to update the shares.
For the NFLX symbol it is easy. A simple query like the one below is sufficient:
db.execute("UPDATE portfolio SET shares = shares - ? WHERE username = ? AND symbol = ?", int(shares), username, quote["symbol"])
However, the problem arises if I want to update the AAPL shares. Let's say the user sold 5 shares: the query above subtracts 5 from both AAPL rows (ids 1 and 3), so the total drops by 10 instead of 5.
Which approach should I consider? Should I group the shares by symbol before inserting them into the portfolio table? If so, how? Or is there a query that could solve my problem?

If your version of SQLite is 3.33.0+, then use the UPDATE...FROM syntax like this:
UPDATE portfolio AS p
SET shares = (p.id = t.id) * t.shares_left
FROM (
SELECT MIN(id) id, username, symbol, shares_left
FROM (
SELECT *, SUM(shares) OVER (ORDER BY id) - ? shares_left -- change ? to the number of stocks the user sold
FROM portfolio
WHERE username = ? AND symbol = ?
)
WHERE shares_left >= 0
) AS t
WHERE p.username = t.username AND p.symbol = t.symbol AND p.id <= t.id;
The window function SUM() returns a running sum of the shares, from which the number of shares sold is subtracted to get shares_left.
The UPDATE statement finds the first id whose running sum reaches or exceeds the number of shares sold. In all rows with a smaller id it sets the column shares to 0, and in the row with that id it sets shares to the difference between the running sum and the number of shares sold.
For prior versions you can use this:
WITH
cte AS (
SELECT MIN(id) id, username, symbol, shares_left
FROM (
SELECT *, SUM(shares) OVER (ORDER BY id) - ? shares_left -- change ? to the number of stocks the user sold
FROM portfolio
WHERE username = ? AND symbol = ?
)
WHERE shares_left >= 0
)
UPDATE portfolio
SET shares = (id = (SELECT id FROM cte)) * (SELECT shares_left FROM cte)
WHERE (username, symbol) = (SELECT username, symbol FROM cte) AND id <= (SELECT id FROM cte)
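To try the CTE version from Python, here is a minimal sketch using the built-in sqlite3 module rather than the CS50 library, with the sample rows from the question. The helper name sell and the constant SELL_SQL are illustrative, and the query relies on window functions (SQLite 3.25 or later).

```python
import sqlite3

# The CTE version of the statement above; parameters are, in order:
# number of shares sold, username, symbol.
SELL_SQL = """
WITH cte AS (
    SELECT MIN(id) AS id, username, symbol, shares_left
    FROM (
        SELECT *, SUM(shares) OVER (ORDER BY id) - ? AS shares_left
        FROM portfolio
        WHERE username = ? AND symbol = ?
    )
    WHERE shares_left >= 0
)
UPDATE portfolio
SET shares = (id = (SELECT id FROM cte)) * (SELECT shares_left FROM cte)
WHERE (username, symbol) = (SELECT username, symbol FROM cte)
  AND id <= (SELECT id FROM cte)
"""


def sell(conn, username, symbol, sold):
    """Deduct `sold` shares from the user's rows for `symbol`, oldest id first."""
    conn.execute(SELL_SQL, (sold, username, symbol))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE portfolio(id INTEGER, username TEXT NOT NULL, "
    "symbol TEXT NOT NULL, shares INTEGER, PRIMARY KEY(id))"
)
conn.executemany(
    "INSERT INTO portfolio VALUES (?, ?, ?, ?)",
    [(1, "eminem", "AAPL", 20), (2, "eminem", "NFLX", 5), (3, "eminem", "AAPL", 5)],
)

# Selling 22 AAPL shares empties row 1 and leaves 3 shares in row 3.
sell(conn, "eminem", "AAPL", 22)
remaining = conn.execute("SELECT id, shares FROM portfolio ORDER BY id").fetchall()
print(remaining)
```

If the user tries to sell more shares than they hold, no row satisfies shares_left >= 0 and nothing is updated, so that case would still need its own check in the application code.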

SQL Query, Getting Rid of Repeat Selects

I'm using sqlite3 in python and I have a table that has the following columns:
recordid(int), username(text), locations(text), types(text), occupancy(int), time_added(datetime), token(text) and (undo).
I have the following query, which selects data from the table depending on the occupancy and on whether time_added falls between the two times the user inputs (start_date and end_date):
('''SELECT locations, types,
(SELECT COUNT (occupancy) FROM traffic WHERE undo = 0 AND occupancy = 1 AND types = ? AND time_added BETWEEN ? AND ?),
(SELECT COUNT (occupancy) FROM traffic WHERE undo = 0 AND occupancy = 2 AND types = ? AND time_added BETWEEN ? AND ?),
(SELECT COUNT (occupancy) FROM traffic WHERE undo = 0 AND occupancy = 3 AND types = ? AND time_added BETWEEN ? AND ?),
(SELECT COUNT (occupancy) FROM traffic WHERE undo = 0 AND occupancy = 4 AND types = ? AND time_added BETWEEN ? AND ?)
FROM traffic WHERE types = ? GROUP BY types''',
(vehicle, start_date, end_date, vehicle, start_date, end_date, vehicle, start_date, end_date, vehicle, start_date, end_date, vehicle))
Is there any way to condense this so I don't have to copy and paste the same thing multiple times just to change the occupancy? I tried using a for loop, but that didn't really get me anywhere.
Cheers!

I'm pretty sure you can simplify the query considerably:
SELECT types,
SUM( occupancy = 1 ) as cnt_1,
SUM( occupancy = 2 ) as cnt_2,
SUM( occupancy = 3 ) as cnt_3,
SUM( occupancy = 4 ) as cnt_4
FROM traffic
WHERE undo = 0 AND
types = ? AND
time_added BETWEEN ? AND ?
GROUP BY types;
I'm not sure if that is exactly what your question has in mind, though.
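As a rough illustration of how the condensed query could be called from sqlite3, here is a small sketch. The trimmed table definition, the sample rows, and the date strings are assumptions made for the example; only three parameters are needed instead of thirteen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed-down version of the traffic table (assumed column types).
conn.execute(
    "CREATE TABLE traffic (recordid INTEGER PRIMARY KEY, types TEXT, "
    "occupancy INTEGER, time_added TEXT, undo INTEGER)"
)
conn.executemany(
    "INSERT INTO traffic (types, occupancy, time_added, undo) VALUES (?, ?, ?, ?)",
    [
        ("car", 1, "2021-05-01 08:00:00", 0),
        ("car", 1, "2021-05-01 09:00:00", 0),
        ("car", 2, "2021-05-01 10:00:00", 0),
        ("car", 4, "2021-05-01 11:00:00", 1),  # undone, so not counted
        ("car", 3, "2021-06-01 08:00:00", 0),  # outside the date range
        ("bus", 1, "2021-05-01 08:30:00", 0),  # different vehicle type
    ],
)

# In SQLite a comparison evaluates to 1 or 0, so SUM(occupancy = n)
# counts the rows where the comparison is true.
QUERY = """
SELECT types,
       SUM(occupancy = 1) AS cnt_1,
       SUM(occupancy = 2) AS cnt_2,
       SUM(occupancy = 3) AS cnt_3,
       SUM(occupancy = 4) AS cnt_4
FROM traffic
WHERE undo = 0 AND types = ? AND time_added BETWEEN ? AND ?
GROUP BY types
"""

vehicle, start_date, end_date = "car", "2021-05-01 00:00:00", "2021-05-31 23:59:59"
counts = conn.execute(QUERY, (vehicle, start_date, end_date)).fetchone()
print(counts)
```

One difference from the original query: if no row matches the filters, this version returns no row at all rather than a row of zeros.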

Find the number of users with negative total

I am trying to figure out a SQL query or Python pandas code for the following problem.
There are n USER_IDs with various transactions.
Every USER_ID has more than one transaction.
Example USER_IDs:
000e88bb-d302-4fdc-b757-2b1a2c33e7d6
001926be-3245-43fa-86dd-b40ee160b6f9
Every transaction has a TYPE:
TOPUP
Bank_Transaction
P2P
and a couple more.
I want to write a query that computes (TOPUP) - (total of every other type of transaction) and returns all the USER_IDs where TOPUP < the total of all the other transactions.
In other words, I want to find all the users who have topped up less than they have spent.
I hope I am making myself clear.

I believe that the following may produce the result that you want:
WITH counter AS (
SELECT user_id
FROM transactions AS a
WHERE
coalesce((SELECT sum(amount) FROM transactions WHERE transaction_type = 'TOPUP' AND user_id = a.user_id),0.0) -
coalesce((SELECT sum(amount) FROM transactions WHERE transaction_type <> 'TOPUP' AND user_id = a.user_id),0.0)
< 0
GROUP BY user_id
)
SELECT count(*) FROM counter;
This assumes that the table name is transactions.
If you consider the following data:
INSERT INTO transactions VALUES
('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','TOPUP',25.00)
,('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','P2P',125.00)
,('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','BANK-TRANSACTION',75.00)
,('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','TOPUP',25.00)
,('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','BANK-TRANSACTION',75.00)
,('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','TOPUP',25.00)
,('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','TOPUP',25.00)
,('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','BANK-TRANSACTION',75.00)
,('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','TOPUP',25.00)
,('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','BANK-TRANSACTION',75.00)
,('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','BANK-TRANSACTION',75.00)
,('000e88bb-d302-4fdc-b757-2b1a2c33e7d6','BANK-TRANSACTION',75.00)
,('001926be-3245-43fa-86dd-b40ee160b6f9','TOPUP',10.00)
,('001926be-3245-43fa-86dd-b40ee160b6f9','TOPUP',10.00)
,('001926be-3245-43fa-86dd-b40ee160b6f9','TOPUP',10.00)
,('001926be-3245-43fa-86dd-b40ee160b6f9','TOPUP',10.00)
,('001926be-3245-43fa-86dd-b40ee160b6f9','TOPUP',10.00)
,('XX1926be-3245-43fa-86dd-b40ee160b6f9','P2P',50.00)
,('XX1926be-3245-43fa-86dd-b40ee160b6f9','P2P',50.00)
,('XX1926be-3245-43fa-86dd-b40ee160b6f9','P2P',50.00)
,('XX1926be-3245-43fa-86dd-b40ee160b6f9','P2P',50.00)
,('XX1926be-3245-43fa-86dd-b40ee160b6f9','P2P',50.00)
;
Then the result of the above is 2, i.e. the first and third users have a negative account balance, whilst the second has a positive balance (and is therefore excluded from the count).

Think of the top-up amounts as positive and all other spending as negative; then we are simply looking for users with a negative balance.
select user_id
from transactions
group by user_id
having sum(case when transaction_type = 'TOPUP' then amount else -amount end) < 0
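Here is a small sketch of the signed-sum idea run through Python's sqlite3 module. The table name transactions, its three columns, and the short user ids are assumptions for illustration, not the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: one row per transaction.
conn.execute(
    "CREATE TABLE transactions (user_id TEXT, transaction_type TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("user-a", "TOPUP", 25.0),
        ("user-a", "P2P", 125.0),   # user-a spends more than the top-up
        ("user-b", "TOPUP", 10.0),
        ("user-b", "TOPUP", 10.0),  # user-b only tops up
        ("user-c", "P2P", 50.0),    # user-c spends without any top-up
    ],
)

# Top-ups count as positive, everything else as negative;
# a negative sum means the user spent more than they topped up.
QUERY = """
SELECT user_id
FROM transactions
GROUP BY user_id
HAVING SUM(CASE WHEN transaction_type = 'TOPUP' THEN amount ELSE -amount END) < 0
ORDER BY user_id
"""

negative_users = [row[0] for row in conn.execute(QUERY)]
print(negative_users)
print(len(negative_users))
```

Taking len() of the result gives the number of such users, which matches what the counting query in the first answer returns.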
