Running Python directly is much faster than when Django runs the same code - python

I have a SQL query that takes a really long time under Django: about 30 seconds for 10,000 rows. If I run the exact same code directly with Python, it finishes in 2 seconds. For some reason, the loop I built takes really long to execute when Django runs the code. Does anyone know why that is? Can I do something to improve the performance and get rid of this slowdown?
import datetime
import psycopg2

def doQuery(conn):
    cur = conn.cursor()
    cur.execute("SELECT * FROM table WHERE substring(addr from 0 for 5) = '\\x82332355'::bytea")
    return cur.fetchall()

myConnection = psycopg2.connect(host=hostname, user=username,
                                password=password, dbname=database)
results = doQuery(myConnection)

def lists(t):
    if type(t) == list or type(t) == tuple:
        return [lists(i) for i in t]
    return t

results = lists(results)

for result in results:
    result[1] = str(result[1]).encode("hex")
    result[3] = datetime.datetime.fromtimestamp(int(result[3])).strftime('%Y-%m-%d %H:%M:%S')
    result[6] = "Not Available"
    print result

This for loop ^^^^^^^^ takes really long in Django, but is fast in plain Python.
myConnection.close()

Related

Validate list elements if they take X time to run in a loop

I have a list that contains SQL code that can be executed in an external Trino CLI. So for instance, my nested list would look like:
sql = []
sql = [['test 1', 'SELECT * FROM a.testtable1'],['test 2', 'SELECT * FROM a.testtable1']]
This simple loop detects if there's a syntax error:
sql_results = []
for l in sql:
    sql_code = l[1]
    try:
        cur.execute(sql_code)
        rows = cur.fetchall()
        if len(rows) > 1:
            status = 'OK'
    except Exception as e:
        status = str(e)
    sql_results.append([l[0], sql_code, status])
It works well, but sometimes the queries take too long and kill the process. Knowing that if a query runs for more than 3 seconds its syntax must be OK (and I'm only interested in checking the syntax, NOT in the result of the query), I'd like to add a time-based check. Something like:
If the SQL execution lasts more than 3 seconds, then kill it and status = 'OK'
I tried this using time:
import time

sql_results = []
for l in sql:
    sql_code = l[1]
    try:
        timeout = time.time() + 2
        cur.execute(sql_code)
        rows = cur.fetchall()
        if len(rows) > 1 or time.time() > timeout:
            status = 'OK'
    except Exception as e:
        status = str(e)
    sql_results.append([l[0], sql_code, status])
But it does not do much, and I keep getting occasional timeouts. Any idea?
Instead of actually running the query, you can ask Trino if the query syntax is valid. Just add the following to each of your queries:
EXPLAIN (TYPE VALIDATE)
https://trino.io/docs/current/sql/explain.html#explain-type-validate
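Applied to the loop in the question, that might look roughly like this (a sketch; it assumes the Trino client raises an exception when validation fails, and returns quickly otherwise since the query is never actually executed):

sql_results = []
for name, sql_code in sql:
    try:
        # Ask Trino to only validate the statement, not run it.
        cur.execute("EXPLAIN (TYPE VALIDATE) " + sql_code)
        cur.fetchall()
        status = 'OK'
    except Exception as e:
        status = str(e)
    sql_results.append([name, sql_code, status])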

Python + MySQL: Search function returning all entries

I'm putting together an inventory program using Python and MySQL. I want to implement a search function that returns entries based on user input (programmed in a separate GUI file). In the code below, I expected that the search function would return entries with the brand "UGreen". Instead, it returns all of the entries in the table.
I'm not sure what I'm doing wrong here. I have used a similar structure in another program with a sqlite database instead and the search worked fine.
Any and all help/suggestions would be greatly appreciated :)
import mysql.connector

equipdb = mysql.connector.connect(
    host = "localhost",
    user = "root",
    password = "REDACTED",
    database = "tel_inventory"
)

def view():
    cur = equipdb.cursor()
    cur.execute("SELECT * FROM equipment")
    result = cur.fetchall()
    return result

def search(name="", brand="", model="", consumables="", storage="", room="", photo=""):
    cur = equipdb.cursor()
    cur.execute("SELECT * FROM equipment WHERE name=%s OR brand=%s OR model=%s OR consumables=%s OR storage=%s OR room=%s OR photo=%s", (name, brand, model, consumables, storage, room, photo))
    result = cur.fetchall()
    return result

#print(view())
print(search(brand="UGreen"))
Try using the keyword arguments directly:
def search(**kwargs):
    cur = equipdb.cursor()
    key = str(list(kwargs.keys())[0])
    value = str(kwargs[key])
    cur.execute('SELECT * FROM equipment WHERE {} = "{}"'.format(key, value))
    result = cur.fetchall()
    return result
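If the column name can ever come from user input, formatting it straight into the SQL string is risky. A slightly safer sketch (the column whitelist below is an assumption based on the question's table) checks the key against known columns and lets the driver parameterize the value:

ALLOWED_COLUMNS = {"name", "brand", "model", "consumables", "storage", "room", "photo"}

def search(**kwargs):
    key, value = next(iter(kwargs.items()))
    if key not in ALLOWED_COLUMNS:
        raise ValueError("unknown column: {}".format(key))
    cur = equipdb.cursor()
    # Only the whitelisted column name is formatted in; the value stays parameterized.
    cur.execute("SELECT * FROM equipment WHERE {} = %s".format(key), (value,))
    return cur.fetchall()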

The code is not working in Jupyter and gives the error pictured below [duplicate]

I am trying to query a local MySQL database using Python's (3.4) MySQL module with the following code:
import mysql.connector

class databases():
    def externaldatabase(self):
        try:
            c = mysql.connector.connect(host="127.0.0.1", user="user",
                                        password="password", database="database")
            if c.is_connected():
                c.autocommit = True
            return(c)
        except:
            return(None)

d = databases().externaldatabase()
c = d.cursor()
r = c.execute('''select * from tbl_wiki''')
print(r)
> Returns: None
As far as I can tell, the connection is successful and the table contains several rows, but the query always returns None.
In what instances does the MySQL execute function return None?
Query executions have no return values.
The pattern you need to follow is:
create the cursor;
execute the query on the cursor;
fetch the rows from the cursor.
Or in python:
c = d.cursor()
c.execute(query)     # selected rows stored in cursor memory
rows = c.fetchall()  # get all selected rows, as Barmar mentioned
for r in rows:
    print(r)
Also, some DB modules allow you to iterate over the cursor directly using the for...in pattern, but double-check whether that applies to the MySQL module you are using.
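With mysql.connector, for example, the cursor is iterable once a query has been executed, so the explicit fetchall() can be skipped (a small sketch based on the names used above):

c = d.cursor()
c.execute('select * from tbl_wiki')
for r in c:  # rows are yielded one at a time from the cursor
    print(r)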
In my case, I return a value from the cursor because I specifically need a string; for instance, I return the password (a string) to check whether the user has used the same password twice. Here's how I did it:
def getUserPassword(metadata):
    cursorObject.execute("SELECT password FROM users WHERE email=%s AND password=%s LIMIT 1", (metadata['email'], metadata['password']))
    return cursorObject.fetchall()[0]['password']
I can then easily call it from another class:
assert getUserPassword({"email" : "email", "password" : "oldpass"}) is not None
getUserPassword itself returns a string.
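Note that indexing a row with ['password'] only works if the cursor returns rows as dictionaries; with mysql.connector that is an explicit option. A sketch of the setup, which the answer does not show (host and credentials are placeholders):

import mysql.connector

connection = mysql.connector.connect(host="localhost", user="user",
                                     password="password", database="mydb")
# dictionary=True makes fetchall() return dicts, so row['password'] works
cursorObject = connection.cursor(dictionary=True)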

Removing quotes from mysql query in Python

I know that this question has been asked in the past, but thorough searching hasn't seemed to fix my issue. I'm probably just missing something simple, as I'm new to the Python-mysql connector supplied by mysql.
I have a Python script which accesses a mysql database, but I'm having issues with removing quotes from my query. Here is my code:
import mysql.connector

try:
    db = mysql.connector.connect(user='root', password='somePassword', host='127.0.0.1', database='dbName')
    cursor = db.cursor()
    query = "select * from tags where %s = %s"
    a = 'tag_id'
    b = '0'
    cursor.execute(query, (a, b))
    print cursor
    data = cursor.fetchall()
    print data
except mysql.connector.Error as err:
    print "Exception tripped..."
    print "--------------------------------------"
    print err
cursor.close()
db.close()
My database is set up properly (as I'll prove shortly).
My output for this program is:
MySQLCursor: select * from tags where 'tag_id' = '0'
[]
Yet when I change my query to not use variables, for example:
cursor.execute("select * from tags where tag_id = 0")
Then my output becomes:
MySQLCursor: select * from tags where tag_id = 0
[(0, u'192.168.1.110')]
To me, this means that the only difference between my cursor queries is the quotes.
How do I remove them from the query?
Thanks in advance.
I personally believe this code is correct and safe, but you should be extremely skeptical of using code like this without carefully reviewing it yourself or (better yet) with the help of a security expert. I am not qualified to be such an expert.
Two important things I changed:
I changed b = '0' to b = 0 so it ends up as a number rather than a quoted string. (This part was an easy fix.)
I skipped the built-in parameterization for the column name and replaced it with my own slight modification to the escaping/quoting built in to mysql-connector. This is the scary part that should give you pause.
Full code below, but again, be careful with this if the column name is user input!
import mysql.connector

def escape_column_name(name):
    # This is meant to mostly do the same thing as the _process_params method
    # of mysql.connector.MySQLCursor, but instead of the final quoting step,
    # we escape any previously existing backticks and quote with backticks.
    converter = mysql.connector.conversion.MySQLConverter()
    return "`" + converter.escape(converter.to_mysql(name)).replace('`', '``') + "`"

try:
    db = mysql.connector.connect(user='root', password='somePassword', host='127.0.0.1', database='dbName')
    cursor = db.cursor()
    a = 'tag_id'
    b = 0
    cursor.execute(
        'select * from tags where {} = %s'.format(escape_column_name(a)),
        (b,)
    )
    print cursor
    data = cursor.fetchall()
    print data
except mysql.connector.Error as err:
    print "Exception tripped..."
    print "--------------------------------------"
    print err
cursor.close()
db.close()
I encountered a similar problem using pymysql and have shown my working code here; I hope it helps.
What I did is override the escape method of the class pymysql.connections.Connection, which by default adds "'" around your string.
It's easiest to just show the code:
from pymysql.connections import Connection, converters

class MyConnect(Connection):
    def escape(self, obj, mapping=None):
        """Escape whatever value you pass to it.

        Non-standard, for internal use; do not use this in your applications.
        """
        if isinstance(obj, str):
            return self.escape_string(obj)  # by default this is: return "'" + self.escape_string(obj) + "'"
        if isinstance(obj, (bytes, bytearray)):
            ret = self._quote_bytes(obj)
            if self._binary_prefix:
                ret = "_binary" + ret
            return ret
        return converters.escape_item(obj, self.charset, mapping=mapping)

config = {'host':'', 'user':'', ...}
conn = MyConnect(**config)
cur = conn.cursor()
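With that subclass in place, the original parameterized query from the question should be rendered without the extra quotes around the column name (a usage sketch; table and column names are taken from the question):

query = "select * from tags where %s = %s"
cur.execute(query, ('tag_id', 0))  # interpolates to: select * from tags where tag_id = 0
print(cur.fetchall())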

Process very large 900M row MySQL table line by line with Python

I often need to process several hundred million rows of a MySQL table on a line-by-line basis using Python. I want a script that is robust and does not need to be monitored.
Below I have pasted a script that classifies the language of the message field in each row. It utilizes the sqlalchemy and MySQLdb.cursors.SSCursor modules. Unfortunately, this script consistently throws a 'Lost connection to MySQL server during query' error after 4840 rows when I run it remotely and 42000 rows when I run it locally.
Also, I have checked that max_allowed_packet = 32M in my MySQL server's /etc/mysql/my.cnf file, as per the answers to this Stack Overflow question: Lost connection to MySQL server during query
Any advice on fixing this error, or on another robust approach to processing very large MySQL tables with Python, would be much appreciated!
import sqlalchemy
import MySQLdb.cursors
import langid

schema = "twitterstuff"
table = "messages_en" #900M row table
engine_url = "mysql://myserver/{}?charset=utf8mb4&read_default_file=~/.my.cnf".format(schema)
db_eng = sqlalchemy.create_engine(engine_url, connect_args={'cursorclass': MySQLdb.cursors.SSCursor})
langid.set_languages(['fr', 'de'])

print "Executing input query..."
data_iter = db_eng.execute("SELECT message_id, message FROM {} WHERE langid_lang IS NULL LIMIT 10000".format(table))

def process(inp_iter):
    for item in inp_iter:
        item = dict(item)
        (item['langid_lang'], item['langid_conf']) = langid.classify(item['message'])
        yield item

def update_table(update_iter):
    count = 0
    for item in update_iter:
        count += 1
        if count % 10 == 0:
            print "{} rows processed".format(count)
        lang = item['langid_lang']
        conf = item['langid_conf']
        message_id = item['message_id']
        db_eng.execute("UPDATE {} SET langid_lang = '{}', langid_conf = {} WHERE message_id = {}".format(table, lang, conf, message_id))

data_iter_upd = process(data_iter)

print "Begin processing..."
update_table(data_iter_upd)
According to MySQLdb developer Andy Dustman,
[When using SSCursor,] no new queries can be issued on the connection until
the entire result set has been fetched.
That post says that if you issue another query you will get a "commands out of sequence" error, which is not the error you are seeing. So I am not sure that the following will necessarily fix your problem. Nevertheless, it might be worth trying to remove SSCursor from your code and use the simpler default Cursor just to test if that is the source of the problem.
You could, for example, use LIMIT chunksize OFFSET n in your SELECT statement
to loop through the data set in chunks:
import sqlalchemy
import MySQLdb.cursors
import langid
import itertools as IT

chunksize = 1000

def process(inp_iter):
    for item in inp_iter:
        item = dict(item)
        (item['langid_lang'], item['langid_conf']) = langid.classify(item['message'])
        yield item

def update_table(update_iter, engine):
    for count, item in enumerate(update_iter):
        if count % 10 == 0:
            print "{} rows processed".format(count)
        lang = item['langid_lang']
        conf = item['langid_conf']
        message_id = item['message_id']
        engine.execute(
            "UPDATE {} SET langid_lang = '{}', langid_conf = {} WHERE message_id = {}"
            .format(table, lang, conf, message_id))

schema = "twitterstuff"
table = "messages_en" #900M row table
engine_url = ("mysql://myserver/{}?charset=utf8mb4&read_default_file=~/.my.cnf"
              .format(schema))
db_eng = sqlalchemy.create_engine(engine_url)
langid.set_languages(['fr', 'de'])

for offset in IT.count(start=0, step=chunksize):
    print "Executing input query..."
    result = db_eng.execute(
        "SELECT message_id, message FROM {} WHERE langid_lang IS NULL LIMIT {} OFFSET {}"
        .format(table, chunksize, offset))
    result = list(result)
    if not result:
        break
    data_iter_upd = process(result)
    print "Begin processing..."
    update_table(data_iter_upd, db_eng)
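One caveat: OFFSET gets progressively slower on a table this size, because MySQL still has to read past all of the skipped rows for every chunk. An alternative is keyset pagination, sketched below reusing db_eng, table, chunksize and process() from the script above, and assuming message_id is an indexed, monotonically increasing primary key:

last_id = 0
while True:
    print "Executing input query..."
    result = list(db_eng.execute(
        "SELECT message_id, message FROM {} "
        "WHERE langid_lang IS NULL AND message_id > {} "
        "ORDER BY message_id LIMIT {}".format(table, last_id, chunksize)))
    if not result:
        break
    for item in process(result):
        last_id = item['message_id']  # remember where this chunk ended
        db_eng.execute(
            "UPDATE {} SET langid_lang = '{}', langid_conf = {} WHERE message_id = {}"
            .format(table, item['langid_lang'], item['langid_conf'], item['message_id']))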
