SQLite 3, using unixEpoch and optimise time condition - python

I have about 15 days of data, of which I would like to query only the last 5 days. I would also like to use epochTime in my conditions.
Here is what I'm using right now:
To get the x values every 200 rows, a very fast method that works with rowid only:
connection = sql.connect("/home/pi/data.db")
cursor = connection.cursor()
cursor.execute("SELECT epochTime,x from (select rowid,epochTime,x from table1 where (rowid % 200 = 0)) order by rowid ASC")
x = cursor.fetchall()
First try, using a classic timestamp. However, I'm surprised that this takes much longer than solution 1 (about 2 seconds more when fetching all 15 days). Am I doing it wrong?
cursor.execute("SELECT timestamp, epochTime, x from (select rowid,timestamp,epochTime,x from table1 where (rowid % 200 = 0) and timestamp>(select datetime('now','-5 day'))) order by rowid ASC")
Finally, a first attempt by using epochTime but it's not working:
cursor.execute("SELECT epochTime, x from (select rowid,epochTime,x from table1 where (rowid % 200 = 0) and epochTime>(select datetime('now','-68400'))) order by rowid ASC")
Note: here is how epochTime is built. I had to use the format %s100 to work with Highcharts; could that be causing an issue for the SQLite conditions?
now = datetime.datetime.now()
epochTime = now.strftime("%s100")
Does adding a time condition automatically slow down the query?
How should I write this so that it works properly with the Unix epoch?
Thanks a lot!
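One possible way to write the epoch condition, as a minimal sketch rather than a definitive answer. datetime('now', ...) returns a text timestamp rather than an epoch number, so comparing it against a numeric epochTime does not filter by time. The sketch assumes epochTime is stored as an integer number of milliseconds (roughly what "%s100" produces) and computes the cutoff in Python. The helper name rows_since, the index name, the sample rows and the in-memory database are all made up for illustration.

```python
import sqlite3
import time

MS_PER_DAY = 86400 * 1000


def rows_since(connection, cutoff_ms, step=200):
    """Return (epochTime, x) for every `step`-th rowid newer than cutoff_ms."""
    cursor = connection.execute(
        "SELECT epochTime, x FROM table1 "
        "WHERE rowid % ? = 0 AND epochTime > ? "
        "ORDER BY rowid ASC",
        (step, cutoff_ms),
    )
    return cursor.fetchall()


# Illustrative data only: 3000 evenly spaced rows covering 15 days.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE table1 (epochTime INTEGER, x REAL)")
now_ms = int(time.time() * 1000)
interval_ms = 15 * MS_PER_DAY // 3000
connection.executemany(
    "INSERT INTO table1 (epochTime, x) VALUES (?, ?)",
    [(now_ms - (3000 - i) * interval_ms, float(i)) for i in range(1, 3001)],
)
# Assumption: an index on epochTime lets SQLite seek to the cutoff
# instead of scanning every row.
connection.execute("CREATE INDEX idx_table1_epoch ON table1 (epochTime)")

last_five_days = rows_since(connection, now_ms - 5 * MS_PER_DAY)
print(len(last_five_days))
```

The rowid % 200 test still has to be evaluated row by row, since a modulo expression cannot use an index. With a range condition on an indexed epochTime, though, it should only need to run over the rows inside the time window.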

Related

Python: cx_Oracle does not like how I am entering date

I am trying to do a simple select-all query in Python using the cx_Oracle module. When I select the first ten rows of a table, I am able to print out the output. However, when I select the first ten rows for a specific date in the table, all that gets printed is an empty list: [].
Here is the select-all query that prints out all the results:
sql_query = "select * from table_name fetch first 10 rows only"
cur = db_eng.OpenCursor()
db_eng.ExecuteQuery(cur, sql_query)
result = db_eng.FetchResults(cur)
print(result)
The above query works and is able to print out the results.
Here is the query that I am having trouble with; this same query works in SQL Developer:
sql_query = "select * from table_name where requested_time = '01-jul-2021' fetch first 10 rows only"
cur = db_eng.OpenCursor()
db_eng.ExecuteQuery(cur, sql_query)
result = db_eng.FetchResults(cur)
print(result)
I also tried this way where I define the date outside of the query.
specific_date = '01-jul-2021'
sql_query = "select * from table_name where requested_time = '{0}' fetch first 10 rows only".format(specific_date)
cur = db_eng.OpenCursor()
db_eng.ExecuteQuery(cur, sql_query)
result = db_eng.FetchResults(cur)
print(result)
Oracle dates have a time portion. The query
select * from table_name where requested_time = '01-jul-2021' fetch first 10 rows only
will only give you the rows for which the column requested_time is exactly 01-jul-2021 00:00. Chances are that your other rows have a non-zero time portion.
To cut off the time portion there are several options. Note that I explicitly added a TO_DATE call around the date literal: you're assuming that the database expects a dd-mon-yyyy format and will successfully do the implicit conversion, but it's safer to tell the database the format explicitly.
TRUNC the column, which removes the time portion:
SELECT *
FROM table_name
WHERE TRUNC(requested_time) = TO_DATE('01-jul-2021','DD-mon-YYYY')
FETCH FIRST 10 ROWS ONLY
Format the column date to the same format as the date you supplied and compare the resulting string:
SELECT *
FROM table_name
WHERE TO_CHAR(requested_time,'DD-mon-YYYY') = '01-jul-2021'
FETCH FIRST 10 ROWS ONLY
Example:
pdb1--KOEN>create table test_tab(requested_time DATE);
Table TEST_TAB created.
pdb1--KOEN>BEGIN
2 INSERT INTO test_tab(requested_time) VALUES (TO_DATE('08-AUG-2021 00:00','DD-MON-YYYY HH24:MI'));
3 INSERT INTO test_tab(requested_time) VALUES (TO_DATE('08-AUG-2021 01:00','DD-MON-YYYY HH24:MI'));
4 INSERT INTO test_tab(requested_time) VALUES (TO_DATE('08-AUG-2021 02:10','DD-MON-YYYY HH24:MI'));
5 END;
6 /
PL/SQL procedure successfully completed.
pdb1--KOEN>SELECT COUNT(*) FROM test_tab WHERE requested_time = TO_DATE('08-AUG-2021','DD-MON-YYYY');
COUNT(*)
----------
1
--only 1 row: the row with time 00:00. The other rows are ignored
pdb1--KOEN>SELECT COUNT(*) FROM test_tab WHERE TRUNC(requested_time) = TO_DATE('08-AUG-2021','DD-MON-YYYY');
-- all rows
COUNT(*)
----------
3
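From the Python side, an alternative that is not part of the answer above is to leave the column untouched and compare it against a half-open range covering the whole day, with the bounds passed as bind variables. This is only a sketch: day_range and rows_for_day are hypothetical helper names, the bind names :start_ts and :end_ts are arbitrary, and rows_for_day assumes a cx_Oracle cursor that accepts a dict of named binds.

```python
import datetime

SQL = (
    "SELECT * FROM table_name "
    "WHERE requested_time >= :start_ts AND requested_time < :end_ts "
    "FETCH FIRST 10 ROWS ONLY"
)


def day_range(day):
    """Return (start, end) datetimes covering `day` as a half-open interval."""
    start = datetime.datetime.combine(day, datetime.time.min)
    return start, start + datetime.timedelta(days=1)


def rows_for_day(cursor, day):
    """Fetch rows whose requested_time falls anywhere within `day`.

    Assumes `cursor` is an open cx_Oracle cursor (not created here).
    """
    start_ts, end_ts = day_range(day)
    cursor.execute(SQL, {"start_ts": start_ts, "end_ts": end_ts})
    return cursor.fetchall()


bounds = day_range(datetime.date(2021, 7, 1))
print(bounds)
```

Because no function is applied to requested_time, a regular index on that column can still be used, and binding datetime objects avoids relying on the session's date format.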

cx_Oracle: fetchall() stops working with big SELECT statements

I'm trying to read data from an oracle db.
I have to read in Python the results of a simple select that returns a million rows.
I use the fetchall() function, changing the arraysize property of the cursor.
select_qry = db_functions.read_sql_file('src/data/scripts/03_perimetro_select.sql')
dsn_tns = cx_Oracle.makedsn(ip, port, sid)
con = cx_Oracle.connect(user, pwd, dsn_tns)
start = time.time()
cur = con.cursor()
cur.arraysize = 1000
cur.execute('select * from bigtable where rownum < 10000')
res = cur.fetchall()
# print res # uncomment to display the query results
elapsed = (time.time() - start)
print(elapsed, " seconds")
cur.close()
con.close()
If I remove the condition where rownum < 10000, the Python environment freezes and the fetchall() function never returns.
After some trials I found a limit for this precise select: it works up to 50k rows, but it fails if I select 60k rows.
What is causing this problem? Do I have to find another way to fetch this amount of data, or is the problem the ODBC connection? How can I test it?
Consider running in batches using Oracle's ROWNUM. To combine the batches back into a single object, append to a growing list. The code below assumes the total row count of the table is 1 million; adjust as needed:
table_row_count = 1000000
batch_size = 10000
# PREPARED STATEMENT (no trailing semicolon: cx_Oracle rejects it)
sql = """SELECT t.* FROM
    (SELECT sub_t.*, ROWNUM AS row_num
     FROM
        (SELECT * FROM bigtable ORDER BY primary_id) sub_t
    ) t
WHERE t.row_num BETWEEN :LOWER_BOUND AND :UPPER_BOUND"""
data = []
# ROWNUM starts at 1, so the first batch covers rows 1..batch_size
for lower_bound in range(1, table_row_count + 1, batch_size):
    # BIND PARAMS WITH BOUND LIMITS
    cur.execute(sql, {'LOWER_BOUND': lower_bound,
                      'UPPER_BOUND': lower_bound + batch_size - 1})
    for row in cur.fetchall():
        data.append(row)
You are probably running out of memory on the computer running cx_Oracle. Don't use fetchall(), because it requires cx_Oracle to hold all results in memory. Use something like this to fetch batches of records:
cursor = connection.cursor()
cursor.execute("select employee_id from employees")
res = cursor.fetchmany(numRows=3)
print(res)
res = cursor.fetchmany(numRows=3)
print(res)
Stick the fetchmany() calls in a loop, process each batch of rows in your app before fetching the next set of rows, and exit the loop when there is no more data.
Whatever solution you use, tune cursor.arraysize for best performance.
The already given suggestion to repeat the query and select subsets of rows is also worth considering. If you are using Oracle DB 12c or later, there is a newer (easier) syntax like SELECT * FROM mytab ORDER BY id OFFSET 5 ROWS FETCH NEXT 5 ROWS ONLY.
PS cx_Oracle does not use ODBC.
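The fetchmany() loop described above could look roughly like the following sketch. It uses sqlite3 as a stand-in so that it runs without an Oracle database; the generator name iter_batches and the ten sample rows are made up, and with cx_Oracle the same loop should work on a cursor obtained from the real connection.

```python
import sqlite3


def iter_batches(cursor, batch_size):
    """Yield lists of at most batch_size rows until the cursor is exhausted."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# sqlite3 stands in for the Oracle connection so the sketch runs anywhere.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE employees (employee_id INTEGER)")
connection.executemany(
    "INSERT INTO employees (employee_id) VALUES (?)",
    [(i,) for i in range(1, 11)],
)

cursor = connection.cursor()
cursor.execute("SELECT employee_id FROM employees ORDER BY employee_id")
batch_sizes = []
for batch in iter_batches(cursor, 3):
    # Process each batch here before fetching the next one.
    batch_sizes.append(len(batch))
print(batch_sizes)
```

Only one batch is held in memory at a time, provided the processing step does not itself accumulate every row.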

How to get last record in sqlite3 db with where condition

I am trying to search records in a sqlite3 table to get the last record inserted that matches a WHERE condition. I can do it with only one condition, WHERE CODE = df with df = "DS3243". What I want is to use multiple WHERE conditions, jf = "QS2134", df = "DS3243", sf = "MS5787", so that I get the last record inserted for each of the codes provided.
DEMONSTRATION
CODE    POINT
QS2134  1000
DS3244  2000
MS5787  3000
QS2134  130
QS2134  200    # want this one: it is the last row with this code
DS3244  300
MS5787  4500
DS3244  860    # want this one: it is the last row with this code
MS5787  567
MS5787  45009  # want this one: it is the last row with this code
I am able to do it for only one variable with cur.execute("SELECT * FROM PROFILE WHERE CODE=? ORDER BY POINT ASC LIMIT 1 ",(df,)), but I want to do it for multiple variables.
import sqlite3
jf = "QS2134"
df = "DS3243"
sf = "MS5787"
con = sqlite3.connect("TEST.db")
cur = con.cursor()
cur.execute("SELECT * FROM PROFILE WHERE CODE=? ORDER BY POINT ASC LIMIT 1 ",(df,)) # limit one means last one
rows = cur.fetchall()
for row in rows:
    print(row)
con.commit()
con.close()
I'm not sure I understand your question, but is it possible that you meant that you want to group the results?
Is it "group by" clause that you're looking for?
Something like:
select CODE, MAX(POINT) from PROFILE group by CODE;
I think you are simply trying to extend your query, in which case, why don't you try string formatting?
x = "SELECT * FROM my_table where col1 = '{0}' or col2 ='{1}';".format(var_1, var_2)
cur.execute(x)
That way you can extend your query with as many conditions as you like.
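A hedged sketch of how the multi-code lookup might be written with bound parameters instead of formatting the values into the SQL string. It assumes that "last inserted" means the row with the highest rowid for each code, which holds for a plain table where rows are only appended. The function name last_rows_for_codes is made up, and the rows are the ones from the demonstration table in the question.

```python
import sqlite3


def last_rows_for_codes(con, codes):
    """Return {CODE: POINT} for the most recently inserted row of each code."""
    # Only the "?" placeholders are formatted in; the values stay bound.
    placeholders = ", ".join("?" for _ in codes)
    query = (
        "SELECT CODE, POINT FROM PROFILE WHERE rowid IN ("
        "SELECT MAX(rowid) FROM PROFILE "
        "WHERE CODE IN ({}) GROUP BY CODE)".format(placeholders)
    )
    return dict(con.execute(query, list(codes)).fetchall())


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE PROFILE (CODE TEXT, POINT INTEGER)")
con.executemany(
    "INSERT INTO PROFILE (CODE, POINT) VALUES (?, ?)",
    [
        ("QS2134", 1000), ("DS3244", 2000), ("MS5787", 3000),
        ("QS2134", 130), ("QS2134", 200), ("DS3244", 300),
        ("MS5787", 4500), ("DS3244", 860), ("MS5787", 567),
        ("MS5787", 45009),
    ],
)
latest = last_rows_for_codes(con, ["QS2134", "DS3244", "MS5787"])
print(latest)
```

Binding the codes keeps quoting correct and avoids SQL injection, and the list of codes can grow without changing the query-building code.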

How to stream/print the several last appended data from a table in SQL Server using python?

I have a table in my SQL Server that is being updated every minute.
Currently, I get the data from my table using these lines of code:
conn = pymssql.connect(server, user, password, "tempdb")
def print_table():
    cursor = conn.cursor(as_dict=True)
    cursor.execute('SELECT * FROM EmotionDisturbances WHERE name=%s', 'John Doe')
    for row in cursor:
        # Show the data:
        print("rate=%d, emotion=%s" % (row['rate'], row['emotion']))
conn.close()
In my application, I run this function every 10 seconds.
How do I update the function so that I only print the last appended data from my table?
Thanks
Assuming you have an auto-incrementing index in column id, you'd do (SQL Server uses TOP rather than LIMIT):
SELECT TOP 1 * FROM EmotionDisturbances WHERE name = %s ORDER BY id DESC
EDIT: If you want all data that was added after a certain time, then you'll need to migrate your schema to have a created date column if it doesn't have one already, then you can do:
SELECT *
FROM EmotionDisturbances
WHERE name = %s AND created >= DATEADD(second, -10, GETDATE())
This would get all of the records created over the last 10 seconds, since you said this function runs every 10 seconds.
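A different way to pick up only newly appended rows, sketched here as an assumption rather than as part of the answer above, is to remember the highest id already printed and ask only for rows beyond it on each poll. The sketch uses sqlite3 as a stand-in so that it runs without SQL Server (pymssql would use %s placeholders instead of ?). The function name fetch_new_rows, the id column and the sample rows are all hypothetical.

```python
import sqlite3


def fetch_new_rows(conn, name, last_seen_id):
    """Return rows for `name` with id greater than last_seen_id, oldest first."""
    cursor = conn.execute(
        "SELECT id, rate, emotion FROM EmotionDisturbances "
        "WHERE name = ? AND id > ? ORDER BY id",
        (name, last_seen_id),
    )
    return cursor.fetchall()


# Made-up schema and rows, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmotionDisturbances "
    "(id INTEGER PRIMARY KEY, name TEXT, rate INTEGER, emotion TEXT)"
)
insert = "INSERT INTO EmotionDisturbances (name, rate, emotion) VALUES (?, ?, ?)"
conn.executemany(insert, [("John Doe", 3, "calm"), ("John Doe", 7, "angry")])

last_seen_id = 0
first_poll = fetch_new_rows(conn, "John Doe", last_seen_id)
last_seen_id = first_poll[-1][0]  # remember the newest id already printed

conn.execute(insert, ("John Doe", 5, "sad"))
second_poll = fetch_new_rows(conn, "John Doe", last_seen_id)
print(second_poll)
```

Compared with a fixed 10-second window, this does not skip or repeat rows when the polling interval drifts, but it relies on ids increasing with insertion order.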

Python foreach from a MySQLdb

I'm trying to fetch a list of timestamps from MySQL with Python. Once I have the list, I check the current time and work out which timestamps are older than 15 minutes. Once I have those, I would really like a final total number. This seems more challenging to pull off than I had originally thought.
So, I'm using this to fetch the list from MySQL:
db = MySQLdb.connect(host=server, user=mysql_user, passwd=mysql_pwd, db=mysql_db, connect_timeout=10)
cur = db.cursor()
cur.execute("SELECT heartbeat_time FROM machines")
row = cur.fetchone()
print row
while row is not None:
    print ", ".join([str(c) for c in row])
    row = cur.fetchone()
cur.close()
db.close()
>> 2016-06-04 23:41:17
>> 2016-06-05 03:36:02
>> 2016-06-04 19:08:56
And this is the snippet I use to check if they are longer than 15min ago:
fmt = '%Y-%m-%d %H:%M:%S'
d2 = datetime.strptime('2016-06-05 07:51:48', fmt)
d1 = datetime.strptime('2016-06-04 23:41:17', fmt)
d1_ts = time.mktime(d1.timetuple())
d2_ts = time.mktime(d2.timetuple())
result = int(d2_ts-d1_ts) / 60
if str(result) >= 15:
    print "more than 15m ago"
I'm at a loss as to how I can combine these, though. Also, now that I put it in writing, there must be an easier/better way to filter these?
Thanks for the suggestions!
You could incorporate the 15min check directly into your SQL query. That way there is no need to mess around with timestamps and IMO it's far easier to read the code.
If you need some data from other columns of your table:
select * from machines where now() > heartbeat_time + INTERVAL 15 MINUTE;
If the total count is the only thing you are interested in:
SELECT count(*) FROM machines WHERE NOW() > heartbeat_time + INTERVAL 15 MINUTE;
That way you can do a cur.fetchone() and get either None or a tuple where the first value is the number of rows with a timestamp older than 15 minutes.
For iterating over a resultset it should be sufficient to write
cur.execute('SELECT * FROM machines')
for row in cur:
    print row
because the base cursor already behaves like an iterator using .fetchone().
(all assuming you have timestamps in your DB as you stated in the question)
@user5740843: if str(result) >= 15: will not work as intended. In Python 2 it is always True because of the str(): a string always compares greater than an integer.
I assume heartbeat_time field is a datetime field.
import datetime
import MySQLdb
import MySQLdb.cursors
db = MySQLdb.connect(host=server, user=mysql_user, passwd=mysql_pwd, db=mysql_db, connect_timeout=10,
cursorclass=MySQLdb.cursors.DictCursor)
cur = db.cursor()
ago = datetime.datetime.utcnow() - datetime.timedelta(minutes=15)
try:
    cur.execute("SELECT heartbeat_time FROM machines")
    for row in cur:
        if row['heartbeat_time'] <= ago:
            print row['heartbeat_time'], 'more than 15 minutes ago'
finally:
    cur.close()
    db.close()
If the data size is not that large, loading all of it into memory is good practice, since it releases the memory buffer on the MySQL server. For DictCursor, there is no real difference between
rows = cur.fetchall()
for r in rows:
and
for r in cur:
They both load data to the client. MySQLdb.SSCursor and SSDictCursor will try to transfer data as needed, while it requires MySQL server to support it.
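If the filtering is done in Python rather than in SQL, the age check and the final total can be combined by comparing timedelta values directly, which avoids the str() comparison pitfall. A minimal sketch: count_stale is a made-up helper name, and the timestamps are the ones shown in the question.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"


def count_stale(heartbeats, now, max_age=timedelta(minutes=15)):
    """Count heartbeats that are more than max_age older than now."""
    return sum(1 for hb in heartbeats if now - hb > max_age)


heartbeats = [
    datetime.strptime(s, FMT)
    for s in ("2016-06-04 23:41:17", "2016-06-05 03:36:02", "2016-06-04 19:08:56")
]
now = datetime.strptime("2016-06-05 07:51:48", FMT)
stale = count_stale(heartbeats, now)
print(stale)
```

Subtracting two datetime objects yields a timedelta, so no conversion to epoch seconds is needed. Rows fetched by MySQLdb from a DATETIME column should already arrive as datetime objects, in which case the strptime step can be dropped.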
