Why is SQL Inserting the Incorrect Value? - python

I'm writing a SQL query in Python that inserts new values to the end of a SQL table.
month = str(pd.to_datetime(datetime.datetime.strptime(current_month, "%B").replace(year = current_year)))[:7]
more_orders = int(total.iloc[-1][0] - total.iloc[-1][4] * total.iloc[-1][0])
more_sales = total.iloc[-1][1] - total.iloc[-1][5] * total.iloc[-1][1]
st.write(more_sales)
st.write(more_orders)
insertQuery = "insert into TABLE values ({}, 'First-time', {}, {}, 0.0)".format(month, more_orders, more_sales)
insertStmt = ibm_db.exec_immediate(connection, insertQuery)
customer_type = 'Returning'
more_orders = int(total.iloc[-1][0])
more_sales = total.iloc[-1][1]
insertQuery2 = "INSERT INTO TABLE VALUES ({}, 'Returning', {}, {}, 0.0)".format(month, more_orders, more_sales)
insertStmt2 = ibm_db.exec_immediate(connection, insertQuery2)
current_month and current_year are user-defined values.
There are two problems with the code that I don't understand.
When month = '2020-08', SQL records it as just '2012'.
Why does this happen? I even printed the variable to make sure that it holds '2020-08'. It does.
Also, SQL inserts these values to the head of the table. I want to insert the values to the end of the table.
I'm using the IBM DB2 database.

2020-08 isn't a string, it's an arithmetic expression - "two thousand and twenty minus eight", which is 2012.
You could surround this expression with quotes so it's treated as a string, but the proper solution would probably be to use bind variables.
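A minimal sketch of the difference, using Python's built-in sqlite3 module as a stand-in for DB2 (an assumption made only so the example runs anywhere; the orders table and its columns are hypothetical). The first insert formats the month into the SQL text unquoted, so the database evaluates it as a subtraction; the second passes it as a bound parameter, so it is stored as the string it is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, only for illustration
conn.execute("CREATE TABLE orders (month TEXT, customer_type TEXT, n_orders INTEGER)")

month = "2020-08"

# Formatted into the SQL text without quotes: the database sees the expression 2020-08
conn.execute("INSERT INTO orders VALUES ({}, 'First-time', {})".format(month, 5))

# Passed as a bound parameter: the value is never parsed as SQL
conn.execute("INSERT INTO orders VALUES (?, 'Returning', ?)", (month, 5))

stored = [row[0] for row in conn.execute("SELECT month FROM orders ORDER BY rowid")]
print(stored)
```

With ibm_db the same idea should carry over by using ? parameter markers together with ibm_db.prepare and ibm_db.execute instead of ibm_db.exec_immediate.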

Related

Illegal Variable Name/Number when Passing in Python List

I'm trying to run SQL statements through Python over a list.
I pass in a list, in this case of dates, because I want to run multiple SELECT queries and return their results.
I've tested this by passing in integers, but when I try to pass in a date I get an ORA-01036 error: illegal variable name/number. I'm using an Oracle DB.
cursor = connection.cursor()
date = ["'01-DEC-21'", "'02-DEC-21'"]
sql = "select * from table1 where datestamp = :date"
for item in date:
    cursor.execute(sql,id=item)
    res=cursor.fetchall()
    print(res)
Any suggestions to make this run?
You can't name a bind variable date, it's an illegal name. Also your named variable in cursor.execute should match the bind variable name. Try something like:
sql = "select * from table1 where datestamp = :date_input"
for item in date:
    cursor.execute(sql,date_input=item)
    res=cursor.fetchall()
    print(res)
Some recommendations and warnings about your approach:
You should not depend on your default NLS date setting when binding a string (e.g. "'01-DEC-21'") to a DATE column. (You probably also need to remove the inner pair of quotes.)
You should avoid fetching data in a loop if you can fetch it in one query (using an IN list).
Use a prepared statement.
Example
date = ['01-DEC-21', '02-DEC-21']
This generates the query that uses bind variables for your input list
in_list = ','.join([f" TO_DATE(:d{ind},'DD-MON-RR','NLS_DATE_LANGUAGE = American')" for ind, d in enumerate(date)])
sql_query = "select * from table1 where datestamp in ( " + in_list + " )"
The generated sql_query is
select * from table1 where datestamp in
( TO_DATE(:d0,'DD-MON-RR','NLS_DATE_LANGUAGE = American'), TO_DATE(:d1,'DD-MON-RR','NLS_DATE_LANGUAGE = American') )
Note that the IN list contains one bind variable for each member of your input list.
Note also the use of to_date with an explicit format mask and a fixed language, which avoids problems with the interpretation of the month abbreviation (e.g. ORA-01843: not a valid month).
Now you can use the query to fetch the data in one pass
cur.prepare(sql_query)
cur.execute(None, date)
res = cur.fetchall()
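The query-building step above can be wrapped in a small helper. This is only a sketch: build_in_list_query is a hypothetical name, and returning the values as a dict of named binds is an assumption made here, as an alternative to passing the list positionally.

```python
def build_in_list_query(values):
    """Return (sql, binds) with one named bind variable per input value."""
    placeholders = ", ".join(
        f"TO_DATE(:d{i}, 'DD-MON-RR', 'NLS_DATE_LANGUAGE = American')"
        for i in range(len(values))
    )
    sql = f"select * from table1 where datestamp in ({placeholders})"
    # Bind names d0, d1, ... match the placeholders generated above
    binds = {f"d{i}": value for i, value in enumerate(values)}
    return sql, binds


sql_query, binds = build_in_list_query(["01-DEC-21", "02-DEC-21"])
print(sql_query)
print(binds)
```

The resulting pair can then be handed to the cursor in one call, for example cur.execute(sql_query, binds).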

How to update an integer value in an SQL record?

I'm pretty new to SQL and the Sqlite3 module and I want to edit the timestamps of all the records in my DB randomly.
import sqlite3
from time import time
import random
conn = sqlite3.connect('database.db')
c = sqlite3.Cursor(conn)
ts_new = round(time())
ts_old = 1537828957
difference = ts_new - ts_old
for i in range(1,309):
    #getting a new, random timestamp
    new_ts = ts_old + random.randint(0, difference)
    t = (new_ts, i)
    c.executemany("UPDATE questions SET timestamp = (?) WHERE rowid = (?)", t)
#conn.commit()
When run, I get a ValueError: parameters are of unsupported type.
To add the timestamp value originally, I set t to a tuple with the current UNIX timestamp as its first value, e.g. (1537828957, ). Is this error displayed because I've used two (?) placeholders, unlike the single one I used in the statement that added the timestamps to begin with?
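You're using executemany instead of execute. executemany takes an iterable of parameter tuples and executes the query once for each of them; here it iterates over t itself and receives plain integers, which are not valid parameter sequences.
You want to use execute instead, which executes the query once using your tuple.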
c.execute('UPDATE questions SET timestamp = (?) where rowid = (?)', t)
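If the goal is still a single call for all rows, executemany can be given a list of (timestamp, rowid) tuples. A minimal sketch against an in-memory database; the five-row questions table and the 1000-second range are assumptions made only for the example.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (timestamp INTEGER)")
conn.executemany("INSERT INTO questions VALUES (?)", [(0,)] * 5)

ts_old = 1537828957
ts_new = ts_old + 1000  # stand-in for round(time())

# One (new_timestamp, rowid) tuple per record
rowids = [row[0] for row in conn.execute("SELECT rowid FROM questions")]
updates = [(ts_old + random.randint(0, ts_new - ts_old), rowid) for rowid in rowids]

# executemany runs the statement once for each tuple in the list
conn.executemany("UPDATE questions SET timestamp = ? WHERE rowid = ?", updates)
conn.commit()

stamps = [row[0] for row in conn.execute("SELECT timestamp FROM questions")]
print(stamps)
```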

Passing a column name in a SELECT statement in Python

if count == 1:
    cursor.execute("SELECT * FROM PacketManager WHERE ? = ?", filters[0], parameters[0])
    all_rows = cursor.fetchall()
elif count == 2:
    cursor.execute("SELECT * FROM PacketManager WHERE ? = ? AND ? = ?", filters[0], parameters[0], filters[1], parameters[1])
    all_rows = cursor.fetchall()
elif count == 3 :
    cursor.execute("SELECT * FROM PacketManager WHERE ? = ? AND ? = ? AND ? = ?", filters[0], parameters[0], filters[1], parameters[1], filters[2], parameters[2])
    all_rows = cursor.fetchall()
This is a code snippet from my program. What I'm planning to do is pass the column name and the parameter into the query.
The filters array contains the column names, and the parameters array contains the parameters. The count is the number of filters set by the user. The filters and parameters arrays are already populated and cause no problems. I just need to pass them to the query for it to execute. This gives me the error "TypeError: function takes at most 2 arguments".
You cannot use SQL parameters to interpolate column names. You'll have to use classic string formatting for those parts. That's the point of SQL parameters; they quote values so they cannot possibly be interpreted as SQL statements or object names.
The following, which uses string formatting for the column name, works, but be 100% certain that the filters[0] value doesn't come from user input:
cursor.execute("SELECT * FROM PacketManager WHERE {} = ?".format(filters[0]), (parameters[0],))
You probably want to validate the column name against a set of permissible column names, to ensure no injection can take place.
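One way to sketch that validation, for any number of filters: check every column name against a fixed set before formatting it into the statement, and keep the values as ? parameters. The helper name build_filter_query and the column names in ALLOWED_COLUMNS are hypothetical, since the real PacketManager schema isn't shown.

```python
import sqlite3

# Hypothetical column names; replace with the real PacketManager columns
ALLOWED_COLUMNS = {"source", "protocol", "length"}


def build_filter_query(filters, parameters):
    """Build a SELECT whose column names are whitelisted and whose values are bound."""
    if len(filters) != len(parameters):
        raise ValueError("filters and parameters must have the same length")
    for name in filters:
        if name not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {name!r}")
    sql = "SELECT * FROM PacketManager"
    if filters:
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in filters)
    return sql, tuple(parameters)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PacketManager (source TEXT, protocol TEXT, length INTEGER)")
conn.executemany(
    "INSERT INTO PacketManager VALUES (?, ?, ?)",
    [("10.0.0.1", "TCP", 60), ("10.0.0.2", "UDP", 42)],
)

sql, params = build_filter_query(["protocol", "length"], ["TCP", 60])
all_rows = conn.execute(sql, params).fetchall()
print(all_rows)
```

Because the WHERE clause is generated from the list, the separate count == 1, 2, 3 branches are no longer needed.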
You can only set parameters using ?, not table or column names.
You could build a dict with predefined queries.
queries = {
    "foo": "SELECT * FROM PacketManager WHERE foo = ?",
    "bar": "SELECT * FROM PacketManager WHERE bar = ?",
    "foo_bar": "SELECT * FROM PacketManager WHERE foo = ? AND bar = ?",
}
# count == 1
cursor.execute(queries[filters[0]], (parameters[0],))
# count == 2
cursor.execute(queries[filters[0] + "_" + filters[1]], (parameters[0], parameters[1]))
This approach protects you from SQL injection through filters[0], because only the predefined queries can ever run.

Update PostgreSQL database with daily stock prices in Python

So I found a great script over at QuantState that had a great walk-through on setting up my own securities database and loading in historical pricing information. However, I'm now trying to modify the script so that I can run it daily and have the latest stock quotes added.
I adjusted the initial data load to download just one week's worth of historical data, but I've been having issues writing the SQL statement that checks whether the row already exists before adding it. Can anyone help me out with this? Here's what I have so far:
def insert_daily_data_into_db(data_vendor_id, symbol_id, daily_data):
    """Takes a list of tuples of daily data and adds it to the
    database. Appends the vendor ID and symbol ID to the data.
    daily_data: List of tuples of the OHLC data (with
    adj_close and volume)"""
    # Create the time now
    now = datetime.datetime.utcnow()
    # Amend the data to include the vendor ID and symbol ID
    daily_data = [(data_vendor_id, symbol_id, d[0], now, now,
                   d[1], d[2], d[3], d[4], d[5], d[6]) for d in daily_data]
    # Create the insert strings
    column_str = """data_vendor_id, symbol_id, price_date, created_date,
    last_updated_date, open_price, high_price, low_price,
    close_price, volume, adj_close_price"""
    insert_str = ("%s, " * 11)[:-2]
    final_str = "INSERT INTO daily_price (%s) VALUES (%s) WHERE NOT EXISTS (SELECT 1 FROM daily_price WHERE symbol_id = symbol_id AND price_date = insert_str[2])" % (column_str, insert_str)
    # Using the postgre connection, carry out an INSERT INTO for every symbol
    with con:
        cur = con.cursor()
        cur.executemany(final_str, daily_data)
Some notes regarding your code above:
It's generally easier to defer to now() in pure SQL than to compute the time in Python whenever possible. It avoids lots of potential pitfalls with timezones, library differences, etc.
If you construct a list of columns, you can dynamically generate a string of %s's based on its size, and don't need to hardcode the length into a repeated string which is then sliced.
Since it appears that insert_daily_data_into_db is meant to be called from within a loop on a per-stock basis, I don't believe you want to use executemany here, which would require a list of tuples and is very different semantically.
You were comparing symbol_id to itself in the sub select, instead of to a particular value (which would mean it's always true).
To prevent possible SQL injection, you should always pass values in the WHERE clause as query parameters, including in sub selects.
Note: I'm assuming that you're using psycopg2 to access Postgres, and that the primary key for the table is a tuple of (symbol_id, price_date). If not, the code below would need to be tweaked at least a bit.
With those points in mind, try something like this (untested, since I don't have your data, db, etc. but it is syntactically valid Python):
def insert_daily_data_into_db(data_vendor_id, symbol_id, daily_data):
    """Takes one tuple of daily data and adds it to the database
    unless a row for that symbol and price date already exists.
    daily_data: Tuple of the OHLC data (with
    adj_close and volume)"""
    column_list = ["data_vendor_id", "symbol_id", "price_date", "created_date",
                   "last_updated_date", "open_price", "high_price", "low_price",
                   "close_price", "volume", "adj_close_price"]
    # The two date-stamp columns are filled by now() in SQL; every other column gets a %s
    insert_list = ['now()' if col in ("created_date", "last_updated_date") else '%s'
                   for col in column_list]
    values_tuple = (data_vendor_id, symbol_id, daily_data[0], daily_data[1],
                    daily_data[2], daily_data[3], daily_data[4], daily_data[5], daily_data[6])
    # INSERT ... SELECT is used because a VALUES list cannot take a WHERE clause
    final_str = """INSERT INTO daily_price ({0})
                   SELECT {1}
                   WHERE NOT EXISTS (SELECT 1
                                     FROM daily_price
                                     WHERE symbol_id = %s
                                     AND price_date = %s)""".format(', '.join(column_list), ', '.join(insert_list))
    # Using the postgre connection, carry out an INSERT INTO for every symbol
    with con:
        cur = con.cursor()
        # execute() takes all parameters as one sequence, in placeholder order
        cur.execute(final_str, values_tuple + (symbol_id, daily_data[0]))
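The insert-only-if-missing pattern can be tried out without a Postgres server. The sketch below uses Python's built-in sqlite3 as a stand-in (so the placeholders are ? rather than %s) and a hypothetical three-column version of daily_price; it illustrates the SQL shape and is not the psycopg2 code itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical version of the daily_price table
conn.execute(
    "CREATE TABLE daily_price (symbol_id INTEGER, price_date TEXT, close_price REAL)"
)

INSERT_IF_MISSING = """
INSERT INTO daily_price (symbol_id, price_date, close_price)
SELECT ?, ?, ?
WHERE NOT EXISTS (
    SELECT 1 FROM daily_price WHERE symbol_id = ? AND price_date = ?
)
"""


def insert_if_missing(symbol_id, price_date, close_price):
    # The last two parameters feed the NOT EXISTS sub select
    with conn:
        conn.execute(
            INSERT_IF_MISSING,
            (symbol_id, price_date, close_price, symbol_id, price_date),
        )


insert_if_missing(1, "2020-08-03", 10.5)
insert_if_missing(1, "2020-08-03", 10.5)  # same symbol and date: skipped
insert_if_missing(1, "2020-08-04", 11.0)

row_count = conn.execute("SELECT COUNT(*) FROM daily_price").fetchone()[0]
print(row_count)
```

Running the daily load twice for the same day then leaves the table unchanged the second time.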

Python Sqlite3 insert operation with a list of column names

Normally, if I want to insert values into a table, I do something like this (assuming that I know which columns the values I want to insert belong to):
conn = sqlite3.connect('mydatabase.db')
conn.execute("INSERT INTO MYTABLE (ID,COLUMN1,COLUMN2)\
VALUES(?,?,?)",[myid,value1,value2])
But now I have a list of columns (the length of the list may vary) and a list of values, one for each column in the list.
For example, say I have a table with 10 columns (namely column1, column2, ..., column10). I have a list of columns that I want to fill, let's say [column3, column4], and a list of values for those columns: [value for column3, value for column4].
How do I insert the values in the list into the columns they belong to?
As far as I know the parameter list in conn.execute works only for values, so we have to use string formatting like this:
import sqlite3
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (a integer, b integer, c integer)')
col_names = ['a', 'b', 'c']
values = [0, 1, 2]
conn.execute('INSERT INTO t (%s, %s, %s) values(?,?,?)'%tuple(col_names), values)
Note that this is risky, since any string formatted into an SQL statement must be checked for injection attacks. You could, however, pass the list of column names through a validation function before the insertion.
EDITED:
For lists of varying length you could try something like
exec_text = 'INSERT INTO t (' + ','.join(col_names) +') values(' + ','.join(['?'] * len(values)) + ')'
conn.execute(exec_text, values)
# as long as len(col_names) == len(values)
Of course string formatting will work, you just need to be a bit cleverer about it.
col_names = ','.join(col_list)
col_spaces = ','.join(['?'] * len(col_list))
# Format the joined column string, not the list itself, into the statement
sql = 'INSERT INTO t (%s) values(%s)' % (col_names, col_spaces)
conn.execute(sql, values)
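The injection concern raised above can be handled by checking the column names against the table's real columns before formatting them in. A minimal sketch; insert_row is a hypothetical helper, and reading the column names through PRAGMA table_info is one possible way to obtain the whitelist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")


def insert_row(conn, table_columns, col_names, values):
    """Insert values into the given columns of t after validating the names."""
    if len(col_names) != len(values):
        raise ValueError("col_names and values must have the same length")
    unknown = [name for name in col_names if name not in table_columns]
    if unknown:
        raise ValueError(f"unknown columns: {unknown}")
    columns = ", ".join(col_names)
    placeholders = ", ".join("?" * len(values))
    conn.execute(f"INSERT INTO t ({columns}) VALUES ({placeholders})", values)


# The second field of each PRAGMA table_info row is the column name
table_columns = {row[1] for row in conn.execute("PRAGMA table_info(t)")}

insert_row(conn, table_columns, ["b", "c"], [1, 2])
rows = conn.execute("SELECT a, b, c FROM t").fetchall()
print(rows)
```

Columns that are not named in col_names are simply left at their default, which is NULL here.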
I was looking for a solution to create columns based on a list of unknown / variable length and found this question. However, I managed to find a nicer solution (for me anyway), that's also a bit more modern, so thought I'd include it in case it helps someone:
import sqlite3
def create_sql_db(my_list):
    file = 'my_sql.db'
    table_name = 'table_1'
    init_col = 'id'
    col_type = 'TEXT'
    conn = sqlite3.connect(file)
    c = conn.cursor()
    # CREATE TABLE (IF IT DOESN'T ALREADY EXIST)
    c.execute('CREATE TABLE IF NOT EXISTS {tn} ({nf} {ft})'.format(
        tn=table_name, nf=init_col, ft=col_type))
    # CREATE A COLUMN FOR EACH ITEM IN THE LIST
    for new_column in my_list:
        c.execute('ALTER TABLE {tn} ADD COLUMN "{cn}" {ct}'.format(
            tn=table_name, cn=new_column, ct=col_type))
    conn.close()
my_list = ["Col1", "Col2", "Col3"]
create_sql_db(my_list)
All my data is of the type text, so I just have a single variable "col_type" - but you could for example feed in a list of tuples (or a tuple of tuples, if that's what you're into):
my_other_list = [("ColA", "TEXT"), ("ColB", "INTEGER"), ("ColC", "BLOB")]
and change the CREATE A COLUMN step to:
for tupl in my_other_list:
    new_column = tupl[0] # "ColA", "ColB", "ColC"
    col_type = tupl[1] # "TEXT", "INTEGER", "BLOB"
    c.execute('ALTER TABLE {tn} ADD COLUMN "{cn}" {ct}'.format(
        tn=table_name, cn=new_column, ct=col_type))
As a noob, I can't comment on the very succinct, updated solution @ron_g offered. While testing, though, I had to frequently delete the sample database itself, so for any other noobs using this to test, I would advise adding:
c.execute('DROP TABLE IF EXISTS {tn}'.format(
    tn=table_name))
prior to the 'CREATE TABLE ...' portion.
There appear to be multiple instances of
.format(
tn=table_name ....)
in both 'CREATE TABLE ...' and 'ALTER TABLE ...', so I'm trying to figure out whether it's possible to create a single instance (similar to, or included in, the def section).
