Extract timestamp data with jumps in MySQL - python

I'm trying to get information out of my MySQL database, but the data is stored every 10 seconds and I want to extract it every minute or every hour. I am using this SQL command to get the data from a sensor with tagid = 1186. Thanks!
SELECT t_stamp, floatvalue FROM database1.sqlt_data_1_2022_01 where tagid=1186;
console:
t_stamp floatvalue
1641013208360 86.2012939453125
1641013218361 86.32317352294922
1641013228362 86.3144760131836
1641013238365 86.53619384765625
1641013248366 86.37206268310547
1641013258367 86.31449890136719
1641013268368 86.36858367919922
1641013278369 86.26002502441406
1641013288370 86.34619903564453
1641013298375 86.14665985107422
1641013308372 86.06439971923828
1641013318373 86.54731750488281

It sounds to me like you're asking how to run a job on a schedule. You can create a record table (with the same t_stamp and floatvalue columns) and store the output of your query every minute or so. Try using an EVENT.
set @@global.event_scheduler=on; -- first of all, enable the event scheduler
delimiter //
CREATE EVENT get_data_every_minute
ON SCHEDULE EVERY 1 minute STARTS now() DO
BEGIN
insert into record_table SELECT t_stamp, floatvalue FROM database1.sqlt_data_1_2022_01 where tagid=1186;
END//
delimiter ;
On the other hand, if you intend to read the on-screen output rather than store the data, I would suggest using a procedure that runs the query periodically in an infinite loop.
delimiter //
create procedure show_result()
begin
while true do
SELECT t_stamp, floatvalue FROM database1.sqlt_data_1_2022_01 where tagid=1186;
select sleep(60); -- in seconds
end while;
end//
delimiter ;
call show_result; -- start the procedure
To terminate the procedure (the infinite loop), check the processlist and kill its id.
show processlist; -- the process of the infinite loop looks like below:
| 144306 | root | % | testdb | Query | 0 | User sleep | select sleep(10) |
kill 144306 ; -- kill the pid to stop it
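Since the question is tagged python, the same polling idea can also live on the client side instead of inside MySQL. Below is a minimal sketch, not a drop-in solution: it assumes the mysql-connector-python driver, and the connection details are placeholders.
import time
import mysql.connector  # pip install mysql-connector-python

conn = mysql.connector.connect(host="localhost", user="user",
                               password="password", database="database1")
cur = conn.cursor()

while True:
    # fetch only the most recent sample for the sensor, once per minute
    cur.execute("SELECT t_stamp, floatvalue FROM database1.sqlt_data_1_2022_01 "
                "WHERE tagid = 1186 ORDER BY t_stamp DESC LIMIT 1")
    row = cur.fetchone()
    if row is not None:
        print(row)
    time.sleep(60)  # wait one minute between reads
Stopping this is just Ctrl-C on the client, so there is no processlist cleanup to do.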

Related

SQLite3 export Data

I am currently working on a Python project with TensorFlow and I need to preprocess my data.
The data I want to use is stored in an sqlite3 database with the columns:
timestamp|dev|event
10:00 |01 | on
11:00 |02 | off
11:15 |01 | off
11:30 |02 | on
And I would like to export the data into a file (.csv) looking like this:
Timestamp|01 |02 |...
10:00 |on |0 |...
11:00 |on |off|...
11:15 |off|off|...
11:30 |off|on |...
Each row should always carry the latest information of every device for the current timestamp: with every new timestamp the old values should stay, and if there is an update only the changed value(s) should be updated.
The number of devices does not change and I can find it with
SELECT COUNT(DISTINCT dev) FROM table01;
Currently that number is 38 different devices and a total of 10000 entries.
Is there a way to do this computation with sqlite3 or do I have to write a program in Python to process the data? I am new to both topics.
~Fabian
You can do it in SQLite, something along these lines:
select
timestamp,
group_concat(case when dev="01" then event else "" end, "") as D01,
group_concat(case when dev="02" then event else "" end, "") as D02
from
table01
group by
timestamp;
Basically you are pivoting the table.
The challenge is that the pivot needs to be dynamic, i.e. the list of devices is not fixed in the query. You need to query the list of devices first and then build this query, i.e. the case ... when ... else parts, from that list.
Also, you generally need to group by the timestamp, since the statuses of different devices will sit in different rows for a single timestamp.
Also, if {timestamp, device} is not unique, you need to make it unique first.
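Since the device list has to be discovered at runtime, one option is to build the pivot query from Python and write the result straight to a CSV file. The following is a rough sketch under a few assumptions: the database file data.db and the output file export.csv are placeholders, dev is stored as text exactly as shown in the question, and it does no forward-filling of older values, it is purely the pivot described above.
import csv
import sqlite3

conn = sqlite3.connect("data.db")
cur = conn.cursor()

# discover the device list, then build one group_concat(...) column per device
devices = [row[0] for row in cur.execute("SELECT DISTINCT dev FROM table01 ORDER BY dev")]
cols = ", ".join(
    "group_concat(case when dev='{0}' then event else '' end, '') as \"{0}\"".format(d)
    for d in devices
)
query = "SELECT timestamp, {0} FROM table01 GROUP BY timestamp ORDER BY timestamp".format(cols)

with open("export.csv", "w", newline="") as f:
    writer = csv.writer(f, delimiter="|")
    writer.writerow(["timestamp"] + devices)  # header row
    writer.writerows(cur.execute(query))

conn.close()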

How to subtract timestamps from two separate columns, then enter this data into the table

I am attempting to create a sign-in, sign-out program using Python and MySQL on a Raspberry Pi. I have so far successfully created a system for registering, signing in, and signing out, but am having trouble with the final component. I am trying to subtract the two separate timestamps (signin, signout) using TIMESTAMPDIFF, and then enter this value into a separate column in the table. However, the statement I am using is not working.
Code:
employee_id = raw_input("Please enter your ID number")
minutes_calc = "INSERT INTO attendance (minutes) WHERE id = %s VALUES (TIMESTAMPDIFF(MINUTE, signin, signout))"
try:
    curs.execute(minutes_calc, (employee_id))
    db.commit()
except:
    db.rollback()
    print "Error"
Table Structure so far (apologies for formatting):
name | id | signin | signout | minutes |
Dr_Kerbal 123 2016-08-21 22:57:25 2016-08-21 22:59:58 NULL
Please note that I need a statement that can subtract the timestamps regardless of their values, rather than one tied to the specific timestamps in the example.
Additionally, the datatype for the minutes column is decimal(10,0), as I initially intended it to be in seconds (that was when I encountered other issues, which have since been solved); its null status is set to YES and the default value is NULL.
Thank You for your Assistance
You need an UPDATE query, not an INSERT:
UPDATE attendance
SET `minutes` = TIMESTAMPDIFF(MINUTE, signin, signout);
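On the Python side that could look roughly like the sketch below, written in the question's Python 2 style. The WHERE clause is an assumption (add it only if the update should be limited to the entered id), the MySQLdb driver and the credentials are placeholders, and note the trailing comma that turns the parameter into a one-element tuple.
import MySQLdb  # assumption: the MySQL-python driver; credentials are placeholders

db = MySQLdb.connect(host="localhost", user="user", passwd="password", db="attendance_db")
curs = db.cursor()

employee_id = raw_input("Please enter your ID number")
minutes_calc = ("UPDATE attendance "
                "SET minutes = TIMESTAMPDIFF(MINUTE, signin, signout) "
                "WHERE id = %s")
try:
    curs.execute(minutes_calc, (employee_id,))  # trailing comma makes a one-element tuple
    db.commit()
except Exception:
    db.rollback()
    print "Error"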

Apache Hive getting error while using Python UDF

I am using a Python user defined function in Apache Hive to change characters from lower case to upper case. I am getting the error "Hive Runtime Error while closing operators".
Below are the queries I tried:
describe table1;
OK
item string
count int
city string
select * from table1;
aaa 1 tokyo
aaa 2 london
bbb 3 washington
ccc 4 moscow
ddd 5 bejing
From the above table, the item and city fields should be converted from lower case to upper case and count should be incremented by 10.
Python script used:
cat caseconvert.py
import sys
import string
for line in sys.stdin:
    line = line.strip()
    item,count,city=line.split('\t')
    ITEM1=item.upper()
    COUNT1=count+10
    CITY1=city.upper()
    print '\t'.join([ITEM1,str(COUNT1),FRUIT1])
Inserting table1 data into table2:
create table table2(ITEM1 string, COUNT1 int, CITY1 string) row format delimited fields terminated by ',';
add FILE caseconvert.py
insert overwrite table table2 select TRANSFORM(item,count,city) using 'python caseconvert.py' as (ITEM1,COUNT1,CITY1) from table1;
If I execute this, I get the following error. I couldn't trace the issue. Can you tell me where it is going wrong?
Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201508151858_0014, Tracking URL = http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201508151858_0014
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201508151858_0014
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-08-15 22:24:06,212 Stage-1 map = 0%, reduce = 0%
2015-08-15 22:25:01,559 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201508151858_0014 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201508151858_0014
Examining task ID: task_201508151858_0014_m_000002 (and more) from job job_201508151858_0014
Task with the most failures(4):
-----
Task ID:
task_201508151858_0014_m_000000
URL:
http://localhost.localdomain:50030/taskdetails.jsp?jobid=job_201508151858_0014&tipid=task_201508151858_0014_m_000000
-----
Diagnostic Messages for this Task:
java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:224)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:488)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:570)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:5
FAILED: Execution Error, return code 20003 from org.apache.hadoop.hive.ql.exec.MapRedTask. An error occurred when trying to close the Operator running your custom script.
MapReduce Jobs Launched:
Job 0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
In the last line of your Python script, where you print the output to STDOUT, you call FRUIT1 without having defined it. This should be CITY1. Note also that count comes out of the split as a string, so it needs an int() conversion before you can add 10 to it. You have also imported string but not used it. I'd write the script a bit differently:
import sys
import string
while True:
    line = sys.stdin.readline()
    if not line:
        break
    line = string.strip(line, '\n ')
    item, count, city = string.split(line, '\t')
    ITEM1 = item.upper()
    COUNT1 = int(count) + 10  # count arrives as a string, so convert before adding
    CITY1 = city.upper()
    print '\t'.join([ITEM1, str(COUNT1), CITY1])
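Before wiring the script into Hive, it can help to check the streaming contract locally, i.e. that one tab-separated line in produces one tab-separated line out. A small illustrative check, assuming the python on your PATH is the same interpreter the cluster will use:
import subprocess

# feed one tab-separated row through the transform script, the way Hive would
proc = subprocess.Popen(["python", "caseconvert.py"],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        universal_newlines=True)
out, _ = proc.communicate("aaa\t1\ttokyo\n")
print(out)  # expected: AAA <tab> 11 <tab> TOKYO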
Then, I'd use a CREATE TABLE AS SELECT query (assuming both TABLE1 and your Python script live in HDFS):
create table TABLE2
as select transform(item, count, city)
using 'hdfs:///user/username/caseconvert.py'
as (item1 string, count1 string, city1 string)
FROM TABLE1;
This has worked for me. However, there is a much easier way to make the transformations you want using Hive built-in functions:
upper(string A) >>> returns the string resulting from converting all characters of A to upper case. For example, upper('fOoBaR') results in 'FOOBAR'.
And of course for count you could make it: (count + 10) AS count1.
So, TABLE2 could be created as follows:
CREATE TABLE TABLE2
AS SELECT
UPPER(ITEM) AS ITEM1,
COUNT + 10 AS COUNT1,
UPPER(CITY) AS CITY1
FROM TABLE1;
Much less trouble than writing your custom UDF.

Python sqlite3 never returns an inner join with 28 million+ rows

SQLite database with two tables, each over 28 million rows. Here's the schema:
CREATE TABLE MASTER (ID INTEGER PRIMARY KEY AUTOINCREMENT,PATH TEXT,FILE TEXT,FULLPATH TEXT,MODIFIED_TIME FLOAT);
CREATE TABLE INCREMENTAL (INC_ID INTEGER PRIMARY KEY AUTOINCREMENT,INC_PATH TEXT,INC_FILE TEXT,INC_FULLPATH TEXT,INC_MODIFIED_TIME FLOAT);
Here's an example row from MASTER:
ID PATH FILE FULLPATH MODIFIED_TIME
---------- --------------- ---------- ----------------------- -------------
1 e:\ae/BONDS/0/0 100.bin e:\ae/BONDS/0/0/100.bin 1213903192.5
The tables have mostly identical data, with some differences between MODIFIED_TIME in MASTER and INC_MODIFIED_TIME in INCREMENTAL.
If I execute the following query in sqlite, I get the results I expect:
select ID from MASTER inner join INCREMENTAL on FULLPATH = INC_FULLPATH and MODIFIED_TIME != INC_MODIFIED_TIME;
That query will pause for a minute or so, return a number of rows, pause again, return some more, etc., and finish without issue. Takes about 2 minutes to fully return everything.
However, if I execute the same query in Python:
changed_files = conn.execute("select ID from MASTER inner join INCREMENTAL on FULLPATH = INC_FULLPATH and MODIFIED_TIME != INC_MODIFIED_TIME;")
It will never return - I can leave it running for 24 hours and still have nothing. The python32.exe process doesn't start consuming a large amount of cpu or memory - it stays pretty static. And the process itself doesn't actually seem to go unresponsive - however, I can't Ctrl-C to break, and have to kill the process to actually stop the script.
I do not have these issues with a small test database - everything runs fine in Python.
I realize this is a large amount of data, but if sqlite is handling the actual queries, python shouldn't be choking on it, should it? I can do other large queries from python against this database. For instance, this works:
new_files = conn.execute("SELECT DISTINCT INC_FULLPATH, INC_PATH, INC_FILE from INCREMENTAL where INC_FULLPATH not in (SELECT DISTINCT FULLPATH from MASTER);")
Any ideas? Are the pauses in between sqlite returning data causing a problem for Python? Or is something never occurring at the end to signal the end of the query results (and if so, why does it work with small databases)?
Thanks. This is my first stackoverflow post and I hope I followed the appropriate etiquette.
Python tends to ship with older versions of the SQLite library, especially Python 2.x, where it is not updated.
However, your actual problem is that the query is slow.
Use the usual mechanisms to optimize it, such as creating a two-column index on INC_FULLPATH and INC_MODIFIED_TIME.
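Both points can be acted on from Python: check which SQLite library your interpreter is actually linked against, then create the suggested index before running the join. A minimal sketch; the database file name and index name are illustrative:
import sqlite3

print(sqlite3.sqlite_version)  # version of the bundled SQLite library
print(sqlite3.version)         # version of the Python sqlite3 module

conn = sqlite3.connect("files.db")  # placeholder path to the database
conn.execute("CREATE INDEX IF NOT EXISTS idx_inc_fullpath_mtime "
             "ON INCREMENTAL (INC_FULLPATH, INC_MODIFIED_TIME)")
conn.commit()

changed_files = conn.execute(
    "SELECT ID FROM MASTER INNER JOIN INCREMENTAL "
    "ON FULLPATH = INC_FULLPATH AND MODIFIED_TIME != INC_MODIFIED_TIME")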

Postgresql DROP TABLE doesn't work

I'm trying to drop a few tables with the "DROP TABLE" command but for an unknown reason, the program just "sits" and doesn't delete the table that I want it to in the database.
I have 3 tables in the database:
Product, Bill and Bill_Products, which is used for referencing products in bills.
I managed to delete/drop Product, but I can't do the same for bill and Bill_Products.
I'm issuing the same "DROP TABLE Bill CASCADE;" command but the command line just stalls. I've also used the simple version without the CASCADE option.
Do you have any idea why this is happening?
Update:
I've been thinking that it is possible for the databases to keep some references from products to bills and maybe that's why it won't delete the Bill table.
So, for that matter I issued a simple SELECT * from Bill_Products, and after a few (10-15) seconds (strangely, because I don't think it's normal for it to take that long on an empty table) it printed out the table and its contents, which are none (so apparently there are no references left from Products to Bill).
What is the output of
SELECT *
FROM pg_locks l
JOIN pg_class t ON l.relation = t.oid AND t.relkind = 'r'
WHERE t.relname = 'Bill';
It might be that there are other sessions using your table in parallel and you cannot obtain the ACCESS EXCLUSIVE lock needed to drop it.
Just do
SELECT pid, relname
FROM pg_locks l
JOIN pg_class t ON l.relation = t.oid AND t.relkind = 'r'
WHERE t.relname = 'Bill';
And then kill every pid by
kill 1234
Where 1234 is your actual pid from query results.
You can pipe it all together like this (so you don't have to copy-paste every pid manually):
psql -c "SELECT pid FROM pg_locks l
JOIN pg_class t ON l.relation = t.oid AND t.relkind = 'r'
WHERE t.relname = 'Bill';" | tail -n +3 | head -n -2 | xargs kill
So I was hitting my head against the wall for some hours trying to solve the same issue, and here is the solution that worked for me:
Check if PostgreSQL has a pending prepared transaction that's never been committed or rolled back:
SELECT database, gid FROM pg_prepared_xacts;
If you get a result, then for each transaction gid you must execute a ROLLBACK from the database having the problem:
ROLLBACK PREPARED 'the_gid';
For further information, click here.
If this is for AWS Postgres, run the first statement to get the PID (process ID) and then run the second query to terminate the process (it is very similar to doing kill -9, but since this is in the cloud, that's what AWS recommends):
-- gets you the PID
SELECT pid, relname FROM pg_locks l JOIN pg_class t ON l.relation = t.oid AND t.relkind = 'r' WHERE t.relname = 'YOUR_TABLE_NAME'
-- what actually kills the PID ...it is select statement but it kills the job!
SELECT pg_terminate_backend(YOUR_PID_FROM_PREVIOUS_QUERY);
source
I ran into this today, I was issuing a:
DROP TABLE TableNameHere
and getting ERROR: table "tablenamehere" does not exist. I realized that for case-sensitive tables (as was mine), you need to quote the table name:
DROP TABLE "TableNameHere"
Had the same problem.
There were not any locks on the table.
Reboot helped.
Old question, but I ran into a similar issue. I could not reboot the database, so I tested a few things until this sequence worked:
truncate table foo;
drop index concurrently foo_something; times 4-5x
alter table foo drop column whatever_foreign_key; times 3x
alter table foo drop column id;
drop table foo;
The same thing happened for me--except that it was because I forgot the semicolon. face palm
