Adding parameters to SQLITE3 SELECT column queries python - python

I am trying to streamline my queries to SQLite3. I use it for financial price modelling, so I re-use the same basic query a lot, but I have to keep changing the hard-coded parts to pull different columns each time. I want a generic query where I name the columns I want once and it returns them as lists. Below is a basic version of what I want; it is still essentially hard-coded, but it shows what I am trying to create.
import sqlite3

dbName = 'NASDAQ_Equities'
ticker = 'AAPL'

def pullDataTest(dbPathName, ticker, *args):
    datep = []
    openp = []
    highp = []
    db = sqlite3.connect(dbPathName + '.mydb', detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES, timeout=3)
    cursor = db.cursor()
    # str(args) renders the tuple as "('datep', 'openp', 'highp')", brackets and quotes included
    cursor.execute('''SELECT ''' + str(args) + ''' FROM ''' + ticker)
    for row in cursor:
        datep.append(row[0])
        openp.append(row[1])
        highp.append(row[2])

pullDataTest(dbName, ticker, 'datep', 'openp', 'highp')
At the moment I am lost on how to pass *args into the SELECT statement; it is rejected because of the () brackets. Another issue will be creating the empty lists and appending to them based on *args. Would it be better to append to an ordered dict and then break that into lists at the end somehow? For returning values to use later on, I was thinking of making them globals. Any suggestions? Thanks
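One possible approach, sketched under assumptions rather than taken from an answer in this thread: join the column names into the SELECT list yourself and return a dict of lists instead of globals. Column and table names cannot be bound as ? parameters, so this sketch checks them against the schema first. The helper name pull_columns, the in-memory database and the sample values are all made up for illustration.

```python
import sqlite3

def pull_columns(conn, table, *columns):
    """Return {column_name: [values, ...]} for the requested columns of one table."""
    if not columns:
        raise ValueError("at least one column is required")
    # Identifiers cannot be passed as ? parameters, so validate them against the schema.
    tables = {r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")}
    if table not in tables:
        raise ValueError(f"unknown table: {table}")
    valid = {r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')}
    unknown = [c for c in columns if c not in valid]
    if unknown:
        raise ValueError(f"unknown columns: {unknown}")
    column_list = ", ".join(f'"{c}"' for c in columns)
    result = {c: [] for c in columns}
    for row in conn.execute(f'SELECT {column_list} FROM "{table}"'):
        for name, value in zip(columns, row):
            result[name].append(value)
    return result

# Demo with an in-memory table and made-up values
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AAPL (datep TEXT, openp REAL, highp REAL)")
conn.executemany(
    "INSERT INTO AAPL VALUES (?, ?, ?)",
    [("2020-01-02", 1.0, 2.0), ("2020-01-03", 3.0, 4.0)],
)
data = pull_columns(conn, "AAPL", "datep", "highp")
print(data["datep"])
print(data["highp"])
```

Returning the dict from the function avoids globals; the caller can pick out data["highp"] or any other requested column as a plain list.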

Related

Why are there brackets around the response from database?

I was trying to understand the different values that we can get from database
'''
tmppass = db.execute("SELECT * from users WHERE username=:username", {"username": session['user_id']}).fetchone()
tmppass_1 = db.execute("SELECT password from users WHERE username=:username", {"username": session['user_id']}).fetchone()
old_password = request.form.get('old_password')
newpass_1 = request.form.get('new_password_1')
newpass_2 = request.form.get('new_password_2')
hashOfNewPass = str(pbkdf2_sha256.hash(newpass_1))
oldPassHash = pbkdf2_sha256.hash(old_password)
print(tmppass['password'])
print(tmppass_1)
'''
I am getting different results from the database for tmppass and tmppass_1
tmppass_1 = ('$pbkdf2-sha256$29000$bs1Z6z0HYOw9R4hR6t37nw$.ZtoRLUsZCYmkbRVNTiZt1uLLQwuJ.iyxrNcHg43SYA',)
tmppass['password'] = $pbkdf2-sha256$29000$bs1Z6z0HYOw9R4hR6t37nw$.ZtoRLUsZCYmkbRVNTiZt1uLLQwuJ.iyxrNcHg43SYA
tmppass_1, which only gets the hash from the database, is printing the string with the brackets.
Generally speaking, (x,) in Python is a tuple with a single element.
fetchone() returns a whole row, so you get a tuple even when the query selects only one column, as SELECT password does here. It is a design choice in order to achieve consistency between all queries regardless of the actual number of columns (specifically, to make sure existing code does not break if a column is later added to a table used in a SELECT * query). To get the bare string, index into the row, for example tmppass_1[0].
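A minimal sketch of the same behaviour, using the standard-library sqlite3 module as a stand-in for whatever database wrapper the question uses (that substitution, the table contents and the placeholder hash are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "hash-placeholder"))

# Selecting a single column still yields a row, i.e. a one-element tuple
row = conn.execute(
    "SELECT password FROM users WHERE username = :username",
    {"username": "alice"},
).fetchone()
print(row)

# Index into the row to get the bare value
password = row[0]
print(password)
```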

%s variable in Query Execution Python 3.8 (pymssql)

I have a python script with a basic GUI that logs into a DB and executes a query.
The Python script also asks for one parameter called "Collection Name", which is read with the tkinter .get function and substituted for a %s inside the query text. As a result, I can execute the query with a different "Collection Name" each time. This works fine.
Now I want to pass a longer string of collection names through my .get function so that cursor.execute can run a query with multiple collection names and return more complex data. But I am having issues with inputting multiple "collection names" into my app.
Below is a piece of my Query1, which has the %s variable that it then gets from the input to tkinter.
From #Session1
Join vGSMRxLevRxQual On(#Session1.SessionId = vGSMRxLevRxQual.SessionId)
Where vGSMRxLevRxQual.RxLevSub<0 and vGSMRxLevRxQual.RxLevSub>-190
and #Session1.CollectionName in (%s)
Group by
#Session1.Operator
Order by #Session1.Operator ASC
IF OBJECT_ID('tempdb..#SelectedSession1') IS NOT NULL DROP TABLE #SelectedSession1
IF OBJECT_ID('tempdb..#Session1') IS NOT NULL DROP TABLE #Session1
Here, is where I try to execute the query
if Query == "GSMUERxLevelSub":
    result = cursor.execute(GSMUERxLevelSub, (CollectionName,))
    output = cursor.fetchmany
    df = DataFrame(cursor.fetchall())
    filename = "2021_H1 WEEK CDF GRAPHS().xlsx"
    df1 = DataFrame.transpose(df, copy=False)
Lastly, here is where I get the value for the Collection name:
CollectionName = f_CollectionName.get()
Your issues are due to a list/collection being an invalid parameter.
You'll need to transform collectionName:
collection_name: list[str] = ['collection1', 'collection2']
new_collection_name = ','.join(f'"{c}"' for c in collection_name)
cursor.execute(sql, (new_collection_name,))
Not sure if this approach will be susceptible to SQL injection if that's a concern.
Edit:
Forgot that the DBAPI would put another set of quotes around the parameters. Instead, you can do something like:
CollectionName = ["foo", "bar"]
sql = f"""
From #Session1
Join vGSMRxLevRxQual On(#Session1.SessionId = vGSMRxLevRxQual.SessionId)
Where vGSMRxLevRxQual.RxLevSub<0 and vGSMRxLevRxQual.RxLevSub>-190
and #Session1.CollectionName in ({",".join(["%s"] * len(CollectionName))})
"""
sql += """
Group by
#Session1.Operator
Order by #Session1.Operator ASC
"""
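# One value per %s placeholder, so pass the names themselves rather than a 1-tuple holding the list
cursor.execute(sql, tuple(CollectionName))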
EDIT: Update to F-string
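The placeholder-per-value idea can be sketched in a runnable form as follows. This is an illustration under assumptions: it uses the standard-library sqlite3 module (whose placeholder is ? rather than pymssql's %s) so that it runs without a server, and the helper name build_in_clause, the sessions table and its rows are made up.

```python
import sqlite3

def build_in_clause(values, placeholder="%s"):
    """Return (placeholder_list, params) with one placeholder per value."""
    values = tuple(values)
    if not values:
        raise ValueError("an IN clause needs at least one value")
    return ", ".join([placeholder] * len(values)), values

names = ["foo", "bar"]
# sqlite3 expects "?"; with pymssql the default "%s" would be used instead
marks, params = build_in_clause(names, placeholder="?")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (CollectionName TEXT, Operator TEXT)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("foo", "A"), ("bar", "B"), ("baz", "C")],
)

# Only the placeholders are formatted into the SQL text; the values stay bound parameters
sql = f"SELECT Operator FROM sessions WHERE CollectionName IN ({marks}) ORDER BY Operator"
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Because the values are still passed as parameters, the driver handles quoting, which also addresses the SQL injection concern raised above.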

Retrieving integers from MySQL using Python to perform math functions

I'm very new to Python but want to perform some mathematical functions using Python's libraries, taking integer values from a MySQL table I have running.
I've successfully established a connection using mysql.connector, but I'm at a loss.
I can select and print rows and columns, but I'm unsure of the syntax to define the values from my query as an "x" or "y" so that I can perform mathematical operations with the variable.
Any help would be greatly appreciated.
EDIT
sql_select_Query = "select * from ATABLE"
cursor = mySQLconnection.cursor()
cursor.execute(sql_select_Query)
records = cursor.fetchall()
and
for row in records:
    print("Name = ", row[1], )
    print("X_num = ", row[2])
    print("Y_num = ", row[3])
    print("Signal_Strength = ", row[4], "\n")
cursor.close()
gives me as an example
Name = X,
X_num = Y,
Y_num = Z,
SS = Q
What I would prefer in my selection operation is to assign X, Y, Z and Q to global names that I could then use in my application's math operations with the NumPy libraries, for example being able to compute
X*Y-Z+Q
I hope that is a bit clearer.
First off, I would recommend following the advice of this thread on the use of select *. Turning a field into an integer is possible in your SQL selection statement by way of CAST or CONVERT. Sort of like this (my daily language is SQL Server; check the MySQL documentation for the exact syntax, since MySQL's CAST expects SIGNED or UNSIGNED as the target type rather than INT):
sql_select_Query = "select Name, CAST(X as INT),CAST(Y as BIGINT) from ATABLE"
In my personal experience, SQL tends to age better than Python (tongue in cheek). Aside, if your SQL instance is on a server; I code to the workhorse as error catching is better.
But coming at it from the other direction, if you want these elements to be re-callable later, I suggest fetching your results into a dictionary.
Information about Python dictionaries can be found here. At least that way, you're pretty much working from a global but fairly structured set of captured data.
It is a bad idea to play with locals() and globals() if you don't exactly know what you're doing. Create a dictionary.
sql_select_Query = "select * from ATABLE"
cursor = mySQLconnection.cursor()
cursor.execute(sql_select_Query)
records = cursor.fetchall()
columns = [item[0] for item in cursor.description]  # Grab the table column names
for record in records:
    # Create a dictionary {column_name: value, ...} for each row
    variable_dict = dict(zip(columns, record))
    print("X variable is: ", variable_dict['X'])
    # <Calculation here>
You can also configure MySQL to return values as a dictionary but this is probably an easier starting point.
This way, your "variable X" value would just be variable_dict['X'] and there's no need to make any global values other than the dictionary.
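To make the calculation step concrete, here is a small runnable sketch of the same dictionary pattern. It assumes the standard-library sqlite3 module in place of mysql.connector (both expose cursor.description), and the column names Name, X, Y, Z, Q and the sample row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ATABLE (Name TEXT, X INTEGER, Y INTEGER, Z INTEGER, Q INTEGER)")
conn.execute("INSERT INTO ATABLE VALUES ('node-a', 2, 3, 4, 5)")

cursor = conn.cursor()
cursor.execute("SELECT Name, X, Y, Z, Q FROM ATABLE")
columns = [item[0] for item in cursor.description]  # column names, in SELECT order

results = {}
for record in cursor.fetchall():
    row = dict(zip(columns, record))  # {column_name: value, ...} for this row
    # The values are ordinary Python ints, so arithmetic works directly
    results[row["Name"]] = row["X"] * row["Y"] - row["Z"] + row["Q"]

print(results)
cursor.close()
```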

Using a string to define variable with tuples

I apologize if my question seems novice, but I have hit a roadblock when it comes to assigning variables based on a string for tuples in python (2.7).
In the past I have had no issues assigning a variable and using a string to give it a name (eg: rowId = '%sDays' %workoutMode). But in a tuple environment, I am having some issues.
I have three different tables I want to pull from a database and apply the same code to them. In this instance I want to pull the data and print it based on the string from the defined list. But I am having a problem assigning variables based on strings. Here is my code:
def workoutCycle():
    catagories = 'Legs', 'Arms', 'Back'
    for catagory in catagories:
        conn = sqlite3.connect('workoutData.db')
        c = conn.cursor()
        c.execute ('SELECT round(%sDays, 1), round(%sDaysTotal, 1) FROM Profile_%s' %(catagory, catagory, catagory))
        originalData = c.fetchall()
        ('%sDays' %catagory, '%sDaysTotal' %catagory) = tuple(originalData[0])
        print originalData
        print '%sDays' %catagory
        print '%sDaysTotal' %catagory
This code returns:
SyntaxError: invalid syntax
I've tried different modifications and I'm not having luck. Is there a specific format that I am missing for strings/tuples?
**********EDIT**************
It seems like I wasn't very clear about what I was trying to do. Essentially, I'm trying to create a loop so that I don't have to write out the code for each category. For example, if I want to print the data pertaining to "Arms", this code works:
def armCycle():
    conn = sqlite3.connect('workoutData.db')
    c = conn.cursor()
    c.execute ('SELECT round(ArmsDays, 1), round(ArmsDaysTotal, 1) FROM Profile_Arms')
    #originalData = c.fetchall()
    originalData = c.fetchall()
    (ArmsDays, ArmsDaysTotal) = tuple(originalData[0])
    print originalData
    print ArmsDays
    print ArmsDaysTotal
I'm trying to create a code that is a little more dynamic than just creating a function for each catagory. I'm sure I'm going about this the wrong way. I apologize, I'm new to programming.
It looks like you want to create new variables (ArmsDays, ArmsDaysTotal) that are named on the basis of the variable category that contains the value "Arms". The simplest and usual way to do this is not to create individual variables dynamically, but instead use a dict.
c.execute ('SELECT round(%sDays, 1), round(%sDaysTotal, 1) FROM Profile_%s' %(category, category, category))
originalData = c.fetchall()
my_dynamic_data = {}
my_dynamic_data['%sDays' %category] = originalData[0][0]
my_dynamic_data['%sDaysTotal' %category] = originalData[0][1]
The keys of the dict are formed on exactly the same principle as the way you construct the column names in the select statement. Then, instead of
print '%sDays' %category
print '%sDaysTotal' %category
do
print my_dynamic_data['%sDays' %category]
print my_dynamic_data['%sDaysTotal' %category]
You can create new variables dynamically in Python, but if you need to ask how, you shouldn't be doing that. You need a very good reason not to use a dict for this.
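Put together as a full loop, the dict approach might look like the sketch below. It is written for Python 3 (the question uses 2.7), runs against an in-memory database with made-up values, and keeps the category names in a fixed tuple because they are formatted into the SQL text as identifiers; all of these are assumptions for illustration.

```python
import sqlite3

# Fixed whitelist: these names are formatted into the SQL as identifiers,
# so they should never come from user input.
CATEGORIES = ("Legs", "Arms", "Back")

def workout_cycle(conn):
    """Collect the rounded values for every category into one dict."""
    data = {}
    for category in CATEGORIES:
        query = "SELECT round({0}Days, 1), round({0}DaysTotal, 1) FROM Profile_{0}".format(category)
        days, days_total = conn.execute(query).fetchone()
        data[category + "Days"] = days
        data[category + "DaysTotal"] = days_total
    return data

# Demo setup: one table per category, one row each
conn = sqlite3.connect(":memory:")
for i, category in enumerate(CATEGORIES, start=1):
    conn.execute("CREATE TABLE Profile_{0} ({0}Days REAL, {0}DaysTotal REAL)".format(category))
    conn.execute("INSERT INTO Profile_{0} VALUES (?, ?)".format(category), (i + 0.5, i * 10.0))

data = workout_cycle(conn)
print(data["ArmsDays"], data["ArmsDaysTotal"])
```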

Read optimisation cassandra using python

I have a table with the following model:
CREATE TABLE IF NOT EXISTS {} (
user_id bigint ,
pseudo text,
importance float,
is_friend_following bigint,
is_friend boolean,
is_following boolean,
PRIMARY KEY ((user_id), is_friend_following)
);
I also have a table containing my seeds. Those (20) users are the starting point of my graph. So I select their ID and search in the table above to get their Followers and friends, and from there I build my graph (networkX).
def build_seed_graph(cls, name):
    obj = cls()
    obj.name = name
    query = "SELECT twitter_id FROM {0};"
    seeds = obj.session.execute(query.format(obj.seed_data_table))
    obj.graph.add_nodes_from(obj.seeds)
    for seed in seeds:
        query = "SELECT friend_follower_id, is_friend, is_follower FROM {0} WHERE user_id={1}"
        statement = SimpleStatement(query.format(obj.network_table, seed), fetch_size=1000)
        friend_ids = []
        follower_ids = []
        for row in obj.session.execute(statement):
            if row.friend_follower_id in obj.seeds:
                if row.is_friend:
                    friend_ids.append(row.friend_follower_id)
                if row.is_follower:
                    follower_ids.append(row.friend_follower_id)
        if friend_ids:
            for friend_id in friend_ids:
                obj.graph.add_edge(seed, friend_id)
        if follower_ids:
            for follower_id in follower_ids:
                obj.graph.add_edge(follower_id, seed)
    return obj
The problem is that the time it takes to build the graph is too long and I would like to optimize it.
I've got approximately 5 million rows in my table 'network_table'.
I'm wondering whether it would be faster to run a single query on the whole table instead of one query with a WHERE clause per seed. Will it fit in memory? Is that a good idea? Are there better ways?
I suspect the real issue may not be the queries but rather the processing time.
I'm wondering whether it would be faster to run a single query on the whole table instead of one query with a WHERE clause per seed. Will it fit in memory? Is that a good idea? Are there better ways?
There should not be any problem with doing a single query on the whole table if you enable paging (https://datastax.github.io/python-driver/query_paging.html - using fetch_size). Cassandra will return up to the fetch_size and will fetch additional results as you read them from the result_set.
Please note that if many rows in the table are not seed-related, a full scan may be slower, as you will receive rows that do not include a seed.
Disclaimer - I am part of the team building ScyllaDB - a Cassandra compatible database.
ScyllaDB recently published a blog post on how to efficiently do a full scan in parallel: http://www.scylladb.com/2017/02/13/efficient-full-table-scans-with-scylla-1-6/ It applies to Cassandra as well; if a full scan is relevant and you can build the graph in parallel, then this may help you.
It seems like you can get rid of the last 2 if statements, since you're going through data that you already have looped through once:
def build_seed_graph(cls, name):
    obj = cls()
    obj.name = name
    query = "SELECT twitter_id FROM {0};"
    seeds = obj.session.execute(query.format(obj.seed_data_table))
    obj.graph.add_nodes_from(obj.seeds)
    for seed in seeds:
        query = "SELECT friend_follower_id, is_friend, is_follower FROM {0} WHERE user_id={1}"
        statement = SimpleStatement(query.format(obj.network_table, seed), fetch_size=1000)
        for row in obj.session.execute(statement):
            if row.friend_follower_id in obj.seeds:
                if row.is_friend:
                    obj.graph.add_edge(seed, row.friend_follower_id)
                # A second "if" rather than "elif", so a row flagged as both friend
                # and follower still gets both edges, as in the original code
                if row.is_follower:
                    obj.graph.add_edge(row.friend_follower_id, seed)
    return obj
This also gets rid of many append operations on lists that you're not using, and should speed up this function.
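The single-pass idea can be tried out without a Cassandra cluster. The sketch below is an illustration under assumptions: rows are plain (friend_follower_id, is_friend, is_follower) tuples grouped by seed, edges are collected in a set instead of a networkX graph, and the seeds are held in a Python set so each membership check is a hash lookup. The function name build_edges and the sample rows are made up.

```python
def build_edges(seed_ids, rows_by_seed):
    """Collect directed edges between seeds in one pass over each seed's rows."""
    seed_set = set(seed_ids)  # hash lookup instead of scanning a list or result set
    edges = set()
    for seed in seed_ids:
        for other_id, is_friend, is_follower in rows_by_seed.get(seed, ()):
            if other_id not in seed_set:
                continue  # keep only edges between seeds
            if is_friend:
                edges.add((seed, other_id))
            if is_follower:
                edges.add((other_id, seed))
    return edges

# Made-up rows: user 99 is not a seed and is skipped
rows_by_seed = {
    1: [(2, True, True), (99, True, False)],
    2: [(1, False, True)],
}
edges = build_edges([1, 2], rows_by_seed)
print(sorted(edges))
```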
