How can I avoid this SQL error? - python

When I run my query I get this error: pyodbc.ProgrammingError: ('42000', '[42000] [IBM][System i Access ODBC Driver][DB2 for i5/OS]SQL0010 - String constant beginning " " not delimited. (-10) (SQLExecDirectW)')
Can someone help me, please?
cursor = db2_conn.cursor()
cursor.execute("select * from qs36f.DSTHSTP join qs36f.calendar on date_ccyymmd = dhindt join qs36f.itemp i on DHITM#=i.ZIITM# join qs36f.subcl2p on i.zisbcl = s2sbcl and i.ziclss = s2clss join qs36f.subclsp on sbclss=s2clss and SBSBCL=s2sbcl where date_iso between(current date - 10 day) and current date and DHCUS# in (" + open_stores + ") and dhqtss >= 1 and SBDEPT='MT' AND dhclss = " + class_nbr + " and ((dhqtss*dhrt5s)*DHPACK) > " + start_range + "")
I tried to run this query to get some data from the DB for testing. With this query I should get the store number, item number, and prices greater than 250.

This is a classic SQL injection scenario caused by concatenating unchecked input to build queries. String concatenation or interpolation is the root cause of this problem and can't be fixed with any amount of quoting or escaping.
The real and easy solution is to use parameterized queries. Another significant improvement is building the query string separately from the call; putting everything on a single line is neither cleaner, faster, nor easier.
sql="""select *
from qs36f.DSTHSTP
join qs36f.calendar on date_ccyymmd = dhindt
join qs36f.itemp i on DHITM#=i.ZIITM#
join qs36f.subcl2p on i.zisbcl = s2sbcl and i.ziclss = s2clss
join qs36f.subclsp on sbclss=s2clss and SBSBCL=s2sbcl
where date_iso between(current date - 10 day) and current date
and DHCUS# in (?,?,?)
and dhqtss >= 1
and SBDEPT='MT'
AND dhclss = ?
and ((dhqtss*dhrt5s)*DHPACK) > ?
"""
rows = cursor.execute(sql, shop1, shop2, shop3, class_nbr, start_range)
? specifies query parameters by position. Parameter values never become part of the query string; they are sent to the server as strongly typed values (integers, dates, floats) alongside it. The database compiles the query into a parameterized execution plan and then executes that plan with the parameter values.
This has several benefits:
Obviously, SQL injections are eliminated as the values never become part of the string.
Formatting errors are eliminated because numbers and dates are passed as binary values. You no longer have to worry about decimal separators (34,56 vs 34.56) or date formats (MM/DD/YYYY vs DD/MM/YYYY). You can compare date fields directly against date values and numeric fields against numbers.
The database can reuse already compiled execution plans for the same query string. That offers significant benefits for big, busy systems.
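If the list of store numbers varies in length, the placeholder list itself can be generated from the length of the Python list while the values stay outside the query string. A minimal sketch, assuming open_stores is a Python list of store numbers rather than the pre-built string from the question:
placeholders = ",".join("?" * len(open_stores))   # e.g. "?,?,?" for three stores

sql = f"""select *
from qs36f.DSTHSTP
where DHCUS# in ({placeholders})
and dhclss = ?
and ((dhqtss*dhrt5s)*DHPACK) > ?
"""

# Only the number of placeholders comes from Python; the values themselves
# are still sent as parameters, so injection and formatting issues are avoided.
params = list(open_stores) + [class_nbr, start_range]
rows = cursor.execute(sql, params)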

How to drop a table and remake it with more columns in SQLAlchemy

I've got this code. It does what I want, but it only works the first time I run it; the second time around it enters an infinite loop for some reason.
I am using FastAPI and SQLAlchemy with MySQL.
As an alternative to this, just to avoid the XY problem: what I actually intend to do is build a table that contains variable columns (as populated by the table "Attributes"). This could be a view, but I haven't really found a way to make a view that contains everything I want.
The end goal is to gather all relevant columns and rows of my database into a single view or table and connect that to a piece of software called Altium.
def get_altium_plugin():
    engine.execute("DROP TABLE IF EXISTS altium_plugin")
    aRows = session.query(Attributes).group_by(Attributes.name).all()
    cRows = session.query(Component_Attributes).all()
    concat = " "
    x = 0
    while x < len(aRows):
        concat = concat + (aRows[x].name + " VARCHAR(20), ")
        x = x + 1
    print(concat)
    concat = concat[:-2]
    engine.execute("CREATE TABLE IF NOT EXISTS altium_plugin (id INTEGER PRIMARY KEY AUTO_INCREMENT," + concat + ")")
    x = 0
    y = 0
    z = 1
    while x < len(aRows):
        while y < len(cRows):
            if cRows[y].attribute_id == aRows[x].id:
                dbstring = "INSERT INTO altium_plugin (" + aRows[x].name + ") VALUES ('" + cRows[y].value + "')"
                engine.execute(dbstring)
                print("X :" + str(x) + ", Y: " + str(y) + ", " + dbstring)
                z = z + 1
            y = y + 1
        y = 0
        x = x + 1
    rows = session.query(Altium_Plugin).all()
    return rows
I've got the above code. I've tried doing it with non-raw SQL and that throws errors; I've also run it directly on the SQL side as a script, and it works with that raw SQL. But the second time you run this def after starting the Python script, it gets stuck on the
engine.execute("DROP TABLE IF EXISTS altium_plugin")
and just loops infinitely, or at least it doesn't continue beyond that point while trying to process something.
EDIT: It seems that after running the code once, I'm no longer able to run scripts directly within MySQL Workbench until I stop my Python process. The engine is probably still holding on to something in MySQL without doing anything other than taking up space and processing power. How can I tell the engine to let go without killing the engine outright?
DROP TABLE ...
... just loops infinitely

I wouldn't be surprised to find the occasional lock outstanding if you query the locks table. An ALTER or DROP will have to acquire an exclusive lock before it can begin. It doesn't loop, it just hangs, patiently waiting for the lock to be granted. I didn't notice any COMMIT or ROLLBACK statements in your code.

Dropping the TCP connection, or bouncing the backend DB daemon, are other (more violent) ways of releasing locks, including reader locks. Notice that your interactive Workbench session can hold uncommitted transaction locks as well. I recommend you COMMIT before attempting DDL.
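A minimal sketch of that recommendation, assuming the session and engine objects from the question: commit any open work first, then run the DDL in its own short transaction rather than through a bare engine.execute().
from sqlalchemy import text

session.commit()   # release any locks still held by the ORM session

# engine.begin() opens a transaction and commits it when the block exits,
# so the DROP is not left sitting in an open, uncommitted transaction.
with engine.begin() as conn:
    conn.execute(text("DROP TABLE IF EXISTS altium_plugin"))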

Leading Zeroes on Char column are Not being Persisted (Python + SQL)

The following function assigns a unique key (derived from a SQL table) to files so they comply with a naming convention:
def assign(fList, p):
    for i in fList:
        p += 1
        lz = leadingZero(p)
        oldName = fileDirPath + fr'\{i}'
        if lz == 1:
            newName = fileDirPath + r'<prefix value>' + str(p) + '<suffix value>'
            print(newName)
        else:
            newName = fileDirPath + r'<prefix value>' + str(p) + '<suffix value>'
            print(newName)
        if leadingZero(p) == 1:
            sqlConnectWrite('0' + str(p))
        else:
            sqlConnectWrite(str(p))
In order to comply with the naming convention, the key p must always be 5 digits, with a leading zero if the key value is less than 10,000. The following function sets an integer lz to 1 if a leading zero needs to be added, and 0 if it does not:
def leadingZero(num):
    lz = 0
    if num < 10000:
        lz = 1
    elif num >= 10000:
        lz = 0
    else:
        logging.error("Leading Zero Boolean: something has gone terribly wrong")
        print("ERROR: Invalid Integer Passed, please email <email>")
    return lz
The first function (def assign) then passes the last key assigned to the following function, which updates the SQL table that stores the most recent key value so we can keep track of which key values have been assigned:
def sqlConnectWrite(pFinal):
    try:
        conn = pyodbc.connect('Driver={SQL Server};'
                              'Server=<server>;'
                              'Database=<database>;'
                              r'UID=<user>;'
                              'PWD=<pass>;'
                              'Trusted_Connection=yes;')
        cursor = conn.cursor()
        print("SQL Connection Successful!")
        print("Running Query....")
        print('SQL WRITE OPERATION INITIATED')
        print(pFinal)
        cursor.execute(f'UPDATE <SQL TABLE> SET [Last Used Number] = {str(pFinal)}')
        conn.commit()
    except pyodbc.Error:
        logging.exception('SQL: Exception thrown in write process')
    finally:
        print('SQL Operations Successful')
        conn.close()
Despite my best efforts, when I update the SQL table the p value seems to persistently revert to an integer, which removes the leading zero (shown below). The column is an nchar(5) data type, but I cannot find a way to update the table so that the leading zero is retained. I cannot determine why this is the case.
(screenshot of the SQL table)
String values in an SQL expression need to be surrounded with single quotes. Otherwise, they are interpreted as integers, and integers don't have leading zeros.
cursor.execute(f"UPDATE <SQL TABLE> SET [Last Used Number] = '{pFinal:05d}'")
Quoting is vitally important in an SQL context. Even better would be to get in the habit of allowing your database connector to do the substitution:
cursor.execute("UPDATE <SQL TABLE> SET [Last Used Number] = ?", (f"{pFinal:05d}",))
This assumes your SQL Server driver uses ? for substitution; some databases use %s.
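Building on that, the leadingZero() helper becomes unnecessary if the integer key is passed straight through and the format spec does the padding. A small sketch only, assuming p is the integer key and conn/cursor are the connection and cursor from sqlConnectWrite:
padded = f"{p:05d}"   # e.g. 7 -> "00007", 12345 -> "12345"
cursor.execute("UPDATE <SQL TABLE> SET [Last Used Number] = ?", (padded,))
conn.commit()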

SQLite3 How to Select first 100 rows from database, then the next 100

Currently I have a database filled with thousands of rows.
I want to SELECT the first 100 rows, and then select the next 100, then the next 100 and so on...
So far I have:
c.execute('SELECT words FROM testWords')
data = c.fetchmany(100)
This allows me to get the first 100 rows; however, I can't find the syntax for selecting the next 100 rows after that using another SELECT statement.
I've seen it is possible with other coding languages, but haven't found a solution with Python's SQLite3.
When you are using cursor.fetchmany() you don't have to issue another SELECT statement. The cursor keeps track of where you are in the result set, and all you need to do is call c.fetchmany(100) again until that produces an empty result:
c.execute('SELECT words FROM testWords')
while True:
    batch = c.fetchmany(100)
    if not batch:
        break
    # each batch contains up to 100 rows
or using the iter() function (which can be used to repeatedly call a function until a sentinel result is reached):
c.execute('SELECT words FROM testWords')
for batch in iter(lambda: c.fetchmany(100), []):
    ...  # each batch contains up to 100 rows
If you can't keep hold of the cursor (say, because you are serving web requests), then cursor.fetchmany() is the wrong interface. You'll instead have to tell the SELECT statement to return only a selected window of rows, using the LIMIT syntax. LIMIT has an optional OFFSET keyword; together these two keywords specify at what row to start and how many rows to return.
Note that you want to make sure that your SELECT statement is ordered so you get a stable result set you can then slice into batches.
batchsize = 1000
offset = 0
while True:
    c.execute(
        'SELECT words FROM testWords ORDER BY somecriteria LIMIT ? OFFSET ?',
        (batchsize, offset))
    batch = list(c)
    offset += batchsize
    if not batch:
        break
Pass the offset value on to the next call of your code if you need to send these batches elsewhere and resume later on.
sqlite3 has nothing to do with Python; it is a standalone database, and Python just supplies an interface to it.
As a normal database, sqlite supports standard SQL. In SQL, you can use LIMIT and OFFSET to determine the start and end for your query. Note that if you do this, you should really use an explicit ORDER BY clause, to ensure that your results are consistently ordered between queries.
c.execute('SELECT words FROM testWords ORDER BY ID LIMIT 100')
...
c.execute('SELECT words FROM testWords ORDER BY ID LIMIT 100 OFFSET 100')
You can create an iterator and call it multiple times:
def ResultIter(cursor, arraysize=100):
    while True:
        results = cursor.fetchmany(arraysize)
        if not results:
            break
        for result in results:
            yield result
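For example, a short usage sketch, reusing the cursor c from the question:
c.execute('SELECT words FROM testWords')
for row in ResultIter(c, arraysize=100):
    print(row[0])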
Or simply like this for returning the first 5 rows:
num_rows = 5
cursor = dbconn.execute("SELECT words FROM testWords")
for row in cursor.fetchmany(num_rows):
    print("Words= " + str(row[0]) + "\n")

Insert an indefinite number of values into sqlite

I have a list whose len() is, say, 199, but it can grow during runtime at the same pace as the columns of my table.
I want to INSERT these elements in a table in sqlite.
Code so far:
conn.execute('INSERT INTO Financial_data VALUES (?)',list)
I have read the docs and other S.O. questions, but I cannot place 199 question marks in there, and even if I could, maybe after 20 seconds of running there will be 200 columns; what then?
What I want to do is fill all the columns of a row with data.
Before the above code I have this, which appends to a list an element for every header that exists in the SQL database; if my other source has no value for one of the headers it appends None, so that I can dump an entire row at once.
if header in lista_header_tabel:
    lista_valori.append(valoare)
else:
    lista_valori.append(None)
I am doing this because the insert part of the program runs really slowly and I don't know why. I tried to wrap the statements inside a conn.execute("begin") and it improves performance, but...
Shouldn't sqlite be able to handle, say, 20-50K rows with 200 columns? At this point I get better performance using shelve or json than using sqlite.
Where am I going wrong?
Current working solution but slow after 2k rows:
conn.execute("begin")
for celula in rand:
if sheet_results.cell(row=1,column=celula.col_idx).value in lista_header_tabel and celula.value!=None:
header_coloana=sheet_results.cell(row=1,column=celula.col_idx).value
valoare_aferenta=str(celula.value).replace('"','^')
nume_fara_ghilimele=str(rand[1].value).replace('"','^')
query='UPDATE Financial_data SET "' + header_coloana + '" = "' + valoare_aferenta + '" WHERE `Company name`="' + nume_fara_ghilimele + '"'
#print(query)
conn.execute(query)
conn.commit()
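As a sketch of the placeholder-generation idea the question is circling around (not a tested answer; it assumes lista_valori holds exactly one value per column of Financial_data, in column order), the INSERT from the first snippet could be written as:
placeholders = ",".join("?" * len(lista_valori))   # as many "?" as there are columns
conn.execute(f"INSERT INTO Financial_data VALUES ({placeholders})", lista_valori)
conn.commit()   # commit once per batch, not once per row, to keep inserts fast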

Any recommendations to improve this function?

I am very new to working with SQL queries. Any suggestions to improve this bit of code:
(By the way, I really don't care about SQL security here; this is a bit of code that will be in a pyexe file connecting to a local sqlite file, so it doesn't make sense to worry about the security of the query.)
def InitBars(QA = "GDP1POP1_20091224_gdp", QB = "1 pork", reset = False):
    global heights, values
    D, heights, values, max, = [], {}, {}, 0.0001
    if reset: GHolder.remove()
    Q = "SELECT wbcode, Year, "+QA+" FROM DB WHERE commodity='"+QB+"' and "+QA+" IS NOT 'NULL'"
    for i in cursor.execute(Q):
        D.append((str(i[0]) + str(i[1]), float(i[2])))
        if float(i[2]) > max: max = float(i[2])
    for (i, n) in D: heights[i] = 5.0 / max * n; values[i] = n
    Gui["YRBox_Slider"].set(0.0)
    Gui["YRBox_Speed"].set(0.0)
After following the advice, this is what I got:
def InitBars(QA = "GDP1POP1_20091224_gdp", QB = "1 pork", reset = False):
    global heights, values; D, heights, values, max, = [], {}, {}, 0.0001
    if reset: GHolder.remove()
    Q = "SELECT wbcode||Year, %s FROM DB WHERE commodity='%s' and %s IS NOT 'NULL'" % (QA, QB, QA)
    for a, b in cursor.execute(Q):
        if float(b) > max: max = float(b)
        values[a] = float(b)
    for i in values: heights[i] = 5.0 / max * values[i]
    Gui["YRBox_Slider"].set(0.0); Gui["YRBox_Speed"].set(0.0)
If this is a one-off script where you totally trust all of the input data and you just need to get a job done, then fine.
If this is part of a system, and this is indicative of the kind of code in it, there are several problems:
Don't construct SQL queries by appending strings. You said that you don't care about security, but this is such a big problem and so easily solved that, really, you should do it right all of the time.
This function seems to use and manipulate global state. Again, if this is a small one-time-use script, then go for it; in anything that spans more than a few files, this becomes impossible to maintain.
Naming conventions: there is no consistency in capitalization.
Names of things are not helpful at all. QA, D, QB: QA and QB don't even seem to be the same kind of thing; one is a field, and the other is a value.
All kinds of questionable things are uncommented: why is max 0.0001? What the heck is GHolder? What could that loop be doing at the end? Really, the code should be clearer, but if not, throw the maintainer a bone.
Use more descriptive variable names than QA and QB.
Comment the code.
Don't put multiple statements in the same line
Try not to use globals. Use member variables instead.
If QA and QB may come from user input, don't use them to build SQL queries.
You should check for SQL injection. Make sure that there's no SQL statement in QA. Also you should probably add slashes if it applies.
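As a hedged sketch of what that could look like here: the value QB can be bound as a parameter, but a column name like QA cannot, so it is checked against a whitelist instead (the names below are illustrative, not part of the original code):
ALLOWED_COLUMNS = {"GDP1POP1_20091224_gdp"}   # extend with the real column names
if QA not in ALLOWED_COLUMNS:
    raise ValueError("unexpected column name: " + QA)

# the commodity value is bound as a parameter; the query keeps the original
# IS NOT 'NULL' comparison, which matches the literal string 'NULL'
Q = "SELECT wbcode, Year, " + QA + " FROM DB WHERE commodity=? and " + QA + " IS NOT 'NULL'"
for i in cursor.execute(Q, (QB,)):
    ...  # process rows as in the original function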
Use
Q = "SELECT wbcode, Year, %s FROM DB WHERE commodity='%s' and %s IS NOT 'NULL'" % (QA, QB, QA)
instead:
Q = "SELECT wbcode, Year, "+QA+" FROM DB WHERE commodity='"+QB+"' and "+QA+" IS NOT 'NULL'"
Care about security (SQL injection).
Look at any ORM (SQLAlchemy, for example). It makes things easy :)
