I have a Deephaven table with some data. I want to add timestamps to it and play it back in real-time in Python. What's the best way to do that?
Deephaven's TableReplayer class allows you to replay table data. In order to construct a TableReplayer, you need to specify a start time and end time in Deephaven's DateTime format. The start and end times correspond to those in the table you want replayed.
from deephaven import time as dhtu
start_time = dhtu.to_datetime("2022-01-01T00:00:00 NY")
end_time = dhtu.to_datetime("2022-01-01T00:10:00 NY")
replayer = TableReplayer(start_time, end_time)
To create the replayed table, use add_table. This method points at a pre-existing table and specifies the column name that contains timestamps:
replayed_table = replayer.add_table(some_table, "Timestamp")
From there, use start to start the replay:
replayer.start()
If some_table doesn't have a column of timestamps, here's a simple way to add one:
def create_timestamps(index):
return dhtu.plus_period(start_time, dhtu.to_period(f"T{index}S"))
some_table = some_table.update(["Timestamp = (DateTime)create_timestamps(i)"])
Note that the function above creates timestamps spaced one second apart.
I have the following table in my database, which represents the shifts of a working day.
When a new product is added to another table 'Products' I want to assign a shift to it based on the start_timestamp.
So when I insert into Products its takes start_timestamp and looks in table ProductionPlan and looks for a result (ProductionPlan.name) where it is between the start and end timestamp of that shift.
On that way I can assign a shift to the product.
I hope somebody can help me out with this!
Table ProductionPlan
name
start_timestamp
end_timestamp
shift 1
2021-05-10T07:00:00
2021-05-10T11:00:00
shift 2
2021-05-10T11:00:00
2021-05-10T15:00:00
shift 3
2021-05-10T15:00:00
2021-05-10T19:00:00
shift 1
2021-05-11T07:00:00
2021-05-11T11:00:00
shift 2
2021-05-11T11:00:00
2021-05-11T15:00:00
shift 3
2021-05-11T15:00:00
2021-05-11T19:00:00
Table Products
id
name
start_timestamp
end_timestamp
shift
1
Schroef
2021-05-10T08:09:05
2021-05-10T08:19:05
2
Bout
2021-05-10T08:20:08
2021-04-28T08:30:11
3
Schroef
2021-05-10T12:09:12
2021-04-28T12:30:15
I have the following code to insert into Products:
def insertNewProduct(self, log):
"""
This function is used to insert a new product into the database.
#param log : a object to log
#return None.
"""
debug("Class: SQLite, function: insertNewProduct")
self.__openDB()
timestampStart = datetime.fromtimestamp(int(log.startTime)).isoformat()
queryToExecute = "INSERT INTO Products (name, start_timestamp) VALUES('{0}','{1}')".format(log.summary,
timestampStart)
self.cur.execute(queryToExecute)
self.__closeDB()
return self.cur.lastrowid
It's just a simple INSERT INTO but I want to add a query or even extend this query to fill in the column shift.
You can use a SELECT inside an INSERT.
queryToExecute = """INSERT INTO Products (name, start_timestamp, shift)
SELECT :1, :2, name FROM ProductionPlan pp
WHERE :2 BETWEEN pp.start_timestamp and pp.end_timestamp"""
self.cur.execute(queryToExecute, (log.summary, timestampStart))
In above code I have used a parameterized query because I hate inserting parameters as strings inside a query. It was the cause of too many SQL injection attacks...
I'm pretty new to SQL and the Sqlite3 module and I want to edit the timestamps of all the records in my DB randomly.
import sqlite3
from time import time
import random
conn = sqlite3.connect('database.db')
c = sqlite3.Cursor(conn)
ts_new = round(time())
ts_old = 1537828957
difference = ts_new - ts_old
for i in range(1,309):
#getting a new, random timestamp
new_ts = ts_old + random.randint(0, difference)
t = (new_ts, i)
c.executemany("UPDATE questions SET timestamp = (?) WHERE rowid = (?)", t)
#conn.commit()
When run, I get a ValueError: parameters are of unsupported type.
To add the timestamp value originally I set t to a tuple and the current UNIX timestamp as the first value of a it e.g (1537828957, ). Is this error displaying because I've used two (?) unlike the single one I used in the statement to add the timestamps to begin with?
You're using executemany instead of execute. executemany takes an iterator of tuples and executes the query for each tuple.
You want to use execute instead, it executes the query once using your tuple.
c.execute('UPDATE questions SET timestamp = (?) where rowid = (?)', t)
I have a table in my SQL Server that is being updated every minute.
Currently, I get the data from my table using this lines of code:
conn = pymssql.connect(server, user, password, "tempdb")
def print_table():
cursor = conn.cursor(as_dict=True)
cursor.execute('SELECT * FROM EmotionDisturbances WHERE name=%s', 'John Doe')
for row in cursor:
#Show the data:
print("rate=%d, emotion=%s" % (row['rate'], row['emotion']))
conn.close()
In my application, I run this the function every 10 seconds.
How do I update the function so that I only print the last appended data from my table?
Thanks
Assuming you have an auto-incrementing index in column id you'd do:
SELECT * FROM EmotionDisturbances WHERE name = % ORDER BY id DESC LIMIT 1
EDIT: If you want all data that was added after a certain time, then you'll need to migrate your schema to have a created date column if it doesn't have one already, then you can do:
SELECT *
FROM EmotionDisturbances
WHERE name = % AND created >= DATEADD(second, -10, GETDATE())
This would get all of the records created over the last 10 seconds, since you said this function runs every 10 seconds.
So I found a great script over at QuantState that had a great walk-through on setting up my own securities database and loading in historical pricing information. However, I'm not trying to modify the script so that I can run it daily and have the latest stock quotes added.
I adjusted the initial data load to just download 1 week worth of historicals, but I've been having issues with writing the SQL statement to see if the row exists already before adding. Can anyone help me out with this. Here's what I have so far:
def insert_daily_data_into_db(data_vendor_id, symbol_id, daily_data):
"""Takes a list of tuples of daily data and adds it to the
database. Appends the vendor ID and symbol ID to the data.
daily_data: List of tuples of the OHLC data (with
adj_close and volume)"""
# Create the time now
now = datetime.datetime.utcnow()
# Amend the data to include the vendor ID and symbol ID
daily_data = [(data_vendor_id, symbol_id, d[0], now, now,
d[1], d[2], d[3], d[4], d[5], d[6]) for d in daily_data]
# Create the insert strings
column_str = """data_vendor_id, symbol_id, price_date, created_date,
last_updated_date, open_price, high_price, low_price,
close_price, volume, adj_close_price"""
insert_str = ("%s, " * 11)[:-2]
final_str = "INSERT INTO daily_price (%s) VALUES (%s) WHERE NOT EXISTS (SELECT 1 FROM daily_price WHERE symbol_id = symbol_id AND price_date = insert_str[2])" % (column_str, insert_str)
# Using the postgre connection, carry out an INSERT INTO for every symbol
with con:
cur = con.cursor()
cur.executemany(final_str, daily_data)
Some notes regarding your code above:
It's generally easier to defer to now() in pure SQL than to try in Python whenever possible. It avoids lots of potential pitfalls with timezones, library differences, etc.
If you construct a list of columns, you can dynamically generate a string of %s's based on its size, and don't need to hardcode the length into a repeated string with is then sliced.
Since it appears that insert_daily_data_into_db is meant to be called from within a loop on a per-stock basis, I don't believe you want to use executemany here, which would require a list of tuples and is very different semantically.
You were comparing symbol_id to itself in the sub select, instead of a particular value (which would mean it's always true).
To prevent possible SQL Injection, you should always interpolate values in the WHERE clause, including sub selects.
Note: I'm assuming that you're using psycopg2 to access Postgres, and that the primary key for the table is a tuple of (symbol_id, price_date). If not, the code below would need to be tweaked at least a bit.
With those points in mind, try something like this (untested, since I don't have your data, db, etc. but it is syntactically valid Python):
def insert_daily_data_into_db(data_vendor_id, symbol_id, daily_data):
"""Takes a list of tuples of daily data and adds it to the
database. Appends the vendor ID and symbol ID to the data.
daily_data: List of tuples of the OHLC data (with
adj_close and volume)"""
column_list = ["data_vendor_id", "symbol_id", "price_date", "created_date",
"last_updated_date", "open_price", "high_price", "low_price",
"close_price", "volume", "adj_close_price"]
insert_list = ['%s'] * len(column_str)
values_tuple = (data_vendor_id, symbol_id, daily_data[0], 'now()', 'now()', daily_data[1],
daily_data[2], daily_data[3], daily_data[4], daily_data[5], daily_data[6])
final_str = """INSERT INTO daily_price ({0})
VALUES ({1})
WHERE NOT EXISTS (SELECT 1
FROM daily_price
WHERE symbol_id = %s
AND price_date = %s)""".format(', '.join(column_list), ', '.join(insert_list))
# Using the postgre connection, carry out an INSERT INTO for every symbol
with con:
cur = con.cursor()
cur.execute(final_str, values_tuple, values_tuple[1], values_tuple[2])