I need to pass a date, '20200303', into an SQL query template rendered with Jinja.
The Python script is as follows:
record_date = '20170303'
sql_data = {'date': record_date}

for file in files_sql:
    with open(file) as file_reader:
        sql_template = file_reader.read()
    sql_template = jinja2.Template(sql_template)
    sql_query = sql_template.render(data=sql_data)
    spark.sql(sql_query)
The SQL file it's reading (query_A.sql) looks like this:
SELECT * FROM table where date <= {{date}}
However, this is not working and returns 0 rows. What am I doing wrong here?
EDIT: fixed key from record_date to date, but still having issues
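For reference, a minimal sketch of how the pieces likely need to line up: since the template is rendered with render(data=sql_data), the placeholder has to go through the data name, i.e. {{ data.date }} rather than {{ date }}, and if the date column is a string type the rendered value also needs quotes. (The table and column names here are just the ones from the question.)

```python
import jinja2

record_date = '20170303'
sql_data = {'date': record_date}

# render(data=sql_data) binds the dict to the name 'data' inside the
# template, so the placeholder must be {{ data.date }}, not {{ date }}.
# The quotes around it matter if the column is a string type.
sql_template = jinja2.Template(
    "SELECT * FROM table WHERE date <= '{{ data.date }}'"
)
sql_query = sql_template.render(data=sql_data)
print(sql_query)  # SELECT * FROM table WHERE date <= '20170303'
```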
Python & MySQL
I am making a query on a MySQL database in a Python module, as follows:
qry = "select qtext, a1, a2, a3, a4, rightanswer from question where qno = 1"
mycursor.execute(qry)
myresult = mycursor.fetchone()
qtext.insert('1', myresult[0])
I access the fields by their index number (i.e. myresult[0]).
My question is: how can I access fields by their field name instead of their index in the query?
I had to add the following line before executing the query:
mycursor = mydb.cursor(dictionary=True)
This line makes the query return each row as a dictionary, which lets me access fields by their names instead of by index, as follows:
qtext.insert('1', myresult["qtext"])
qanswer1.insert('1',myresult["a1"]) # working
qanswer2.insert('1',myresult["a2"]) # working
qanswer3.insert('1',myresult["a3"]) # working
qanswer4.insert('1',myresult["a4"]) # working
r = int(myresult["rightanswer"])
Here is your answer: How to retrieve SQL result column value using column name in Python?
cursor = conn.cursor(dictionary=True)  # or a DictCursor, depending on the driver
cursor.execute("SELECT name, category FROM animal")
result_set = cursor.fetchall()
for row in result_set:
    print("%s, %s" % (row["name"], row["category"]))
First of all, I am trying to retrieve a list of all databases; that works fine.
In the second part, it executes a query for each database in the list, returning the name and create_date for each database whose create_date is on or after 01-01-2020.
So when I do print(row), it gives me exactly what I want.
But how do I write the result of the query to an Excel file? I already import pandas as pd.
cnxn = pyodbc.connect(
    'DRIVER={ODBC Driver 17 for SQL Server};'
    f'Server={server};'
    f'Database={db};'
    f'UID={username};'
    f'PWD={password};'
)
cursor = cnxn.cursor()
cursor.execute("SELECT name FROM master.dbo.sysdatabases")
result = cursor.fetchall()

ams_sql02 = []
for row in result:
    ams_sql02.append(row[0])
ams_sql02 = [databases.lower() for databases in ams_sql02]

cursor = cnxn.cursor()
for db in ams_sql02:
    cursor.execute(f"SELECT name, convert(varchar(10), create_date, 103) as dateCreated "
                   f"FROM sys.databases "
                   f"WHERE name = '{db}' AND create_date > '2020-01-01 10:13:03.290' "
                   f"ORDER BY create_date")
    result = cursor.fetchall()
    for row in result:
        print(row)
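Since the question already imports pandas, one sketch of the final step is to collect the fetched rows into a DataFrame and write it out with to_excel. The filename and the sample rows below are assumptions; the column names match the query.

```python
import pandas as pd

# rows as returned by cursor.fetchall(); sample stand-in data here
rows = [('master', '01/02/2020'), ('model', '15/03/2020')]

# pyodbc rows are not plain tuples, so convert each one, then build a
# DataFrame with explicit column names matching the SELECT list.
df = pd.DataFrame([tuple(row) for row in rows], columns=['name', 'dateCreated'])
df.to_excel('databases.xlsx', index=False)  # needs openpyxl installed
```

To append the results from every database into one sheet, accumulate all rows across the loop first and build a single DataFrame at the end.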
Why not pull the SQL query into Excel without Python? Excel works quite well with data sources like MS SQL Server.
I created a Python script in PyCharm that works like a charm... But these days I suspect I could have a problem with the size of the monthly .csv file, and I would also like to analyze the data using SQL from Python, so I can automate the process and build charts and pies from those queries.
So instead of exporting to csv, I would like to append to an SQL table.
Here is the part of the code that exports data to .csv:
for content in driver.find_elements_by_class_name('companychatroom'):
    people = content.find_element_by_xpath('.//div[@class="theid"]').text
    if people != "Monthy" and people != "Python":
        pass

mybook = open(r'C:\Users\Director\PycharmProjects\MyProjects\Employees\archive' + datetime.now().strftime("%m_%y") + '.csv', 'a')
mybook.write('%20s,%20s\n' % (people, datetime.now().strftime("%d/%m/%y %H:%M")))
mybook.close()
================== EDIT: ==============
I tried sqlite3, and this is what I've managed to write so far, and it works... But how do I append data to the SQL table? It always seems to overwrite the previous data with INSERT INTO, and it shouldn't, should it?
import sqlite3
from datetime import datetime

sqlite_file = r"C:\Users\Director\PycharmProjects\MyProjects\Employees\MyDatabase.db"
conn = sqlite3.connect(sqlite_file)
cursor = conn.cursor()

table_name = 'Archive' + datetime.now().strftime("%m_%y")

sql = 'CREATE TABLE IF NOT EXISTS ' + table_name + ' ("first_name" varchar NOT NULL, "Currdat" date NOT NULL)'
cursor.execute(sql)

sql = 'INSERT INTO ' + table_name + ' (first_name, Currdat) VALUES ("value1", CURRENT_TIMESTAMP);'
cursor.execute(sql)

sql = 'SELECT * FROM Archive06_16'
for row in cursor.execute(sql):
    print(row)

cursor.close()
Thanks in advance,
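For what it's worth, INSERT INTO does append rows rather than overwrite them; the likely reason earlier rows seem to disappear in the snippet above is the missing conn.commit() before the connection is closed. A minimal sketch (table and column names taken from the question, in-memory database used for illustration):

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(':memory:')  # in-memory DB for illustration
cursor = conn.cursor()

table_name = 'Archive' + datetime.now().strftime("%m_%y")
cursor.execute('CREATE TABLE IF NOT EXISTS ' + table_name +
               ' (first_name varchar NOT NULL, Currdat date NOT NULL)')

# Each INSERT adds a new row; nothing is overwritten.
for name in ('value1', 'value2'):
    cursor.execute('INSERT INTO ' + table_name +
                   ' (first_name, Currdat) VALUES (?, CURRENT_TIMESTAMP)', (name,))

conn.commit()  # without this, changes can be lost when the connection closes

rows = cursor.execute('SELECT first_name FROM ' + table_name).fetchall()
print(rows)  # [('value1',), ('value2',)]
```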
[Using Python3.x]
The basic idea is that I have to run a first query to pull a long list of text IDs (about a million) and use those IDs in an IN() clause of a WHERE statement in another query. I'm using Python string formatting to make this happen, which works well if the number of IDs is small, say 100k, but gives me an error when the set is indeed about a million IDs long: pyodbc.Error: ('08S01', '[08S01] [MySQL][ODBC 5.2(a) Driver][mysqld-5.5.31-MariaDB-log]MySQL server has gone away (2006) (SQLExecDirectW)').
I tried to read into it a bit and think it might have something to do with the default limits set by SQLite. I'm also wondering whether I'm approaching this the right way at all.
Here's my code:
Step 1: Getting the IDs
def get_device_ids(con_str, query, tb_name):
    local_con = lite.connect('temp.db')
    local_cur = local_con.cursor()
    local_cur.execute("DROP TABLE IF EXISTS {};".format(tb_name))
    local_cur.execute("CREATE TABLE {} (id TEXT PRIMARY KEY, \
                       lang TEXT, first_date DATETIME);".format(tb_name))
    data = create_external_con(con_str, query)
    device_id_set = set()
    with local_con:
        for row in data:
            device_id_set.update([row[0]])
            local_cur.execute("INSERT INTO srv(id, lang, \
                               first_date) VALUES (?,?,?);", (row))
    lid = local_cur.lastrowid
    print("Number of rows inserted into SRV: {}".format(lid))
    return device_id_set
Step 2: Generating the query with 'dynamic' IN() clause
def gen_queries(ids):
    ids_list = ', '.join("'" + id_ + "'" for id_ in ids)
    query = """
        SELECT e.id,
               e.field2,
               e.field3
        FROM table e
        WHERE e.id IN ({})
        """.format(ids_list)
    return query
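One way to keep each statement under the server's packet limit (an alternative sketch, not from the question; the names are illustrative) is to split the ID list into chunks and issue one IN() query per chunk:

```python
def gen_queries_chunked(ids, chunk_size=10000):
    """Yield one IN() query per chunk of IDs so no single statement
    grows past the server's packet limit (names are illustrative)."""
    ids = list(ids)
    for i in range(0, len(ids), chunk_size):
        ids_list = ', '.join("'" + id_ + "'" for id_ in ids[i:i + chunk_size])
        yield ("SELECT e.id, e.field2, e.field3 "
               "FROM table e WHERE e.id IN ({})".format(ids_list))

queries = list(gen_queries_chunked(['a', 'b', 'c'], chunk_size=2))
print(len(queries))  # 2
```

The results of the per-chunk queries then need to be concatenated on the Python side.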
Step 3: Using that query in another INSERT query
This is where things go wrong
def get_data(con_str, query, tb_name):
    local_con = lite.connect('temp.db')
    local_cur = local_con.cursor()
    local_cur.execute("DROP TABLE IF EXISTS {};".format(tb_name))
    local_cur.execute("CREATE TABLE {} (id TEXT, field1 INTEGER, \
                       field2 TEXT, field3 TEXT, field4 INTEGER, \
                       PRIMARY KEY(id, field1));".format(tb_name))
    data = create_external_con(con_str, query)  # <== THIS IS WHERE THAT QUERY IS INSERTED
    device_id_set = set()
    with local_con:
        for row in data:
            device_id_set.update(row[1])
            local_cur.execute("INSERT INTO table2(id, field1, field2, field3, \
                               field4) VALUES (?,?,?,?,?);", (row))
    lid = local_cur.lastrowid
    print("Number of rows inserted into table2: {}".format(lid))
Any help is very much appreciated!
Edit
This is probably the right solution to my problem; however, when I try to use SET SESSION max_allowed_packet=104857600, I get the error: SESSION variable 'max_allowed_packet' is read-only. Use SET GLOBAL to assign the value (1621). Then, when I try to change SESSION to GLOBAL, I get an access-denied message.
Insert the IDs into a (temporary) table in the same database, and then use:
... WHERE e.ID IN (SELECT ID FROM TempTable)
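A runnable sketch of that idea, using an in-memory sqlite3 database as a stand-in for the real server (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # stand-in for the real database
cur = conn.cursor()
cur.execute("CREATE TABLE events (id TEXT, field2 TEXT)")
cur.executemany("INSERT INTO events VALUES (?, ?)",
                [('a', 'x'), ('b', 'y'), ('c', 'z')])

# Load the (potentially huge) ID list into a temporary table...
cur.execute("CREATE TEMP TABLE temp_ids (id TEXT PRIMARY KEY)")
cur.executemany("INSERT INTO temp_ids VALUES (?)", [('a',), ('c',)])

# ...and let the database do the membership test, instead of building
# a giant IN ('...', '...') string that can blow past packet limits.
rows = cur.execute("""SELECT e.id, e.field2
                      FROM events e
                      WHERE e.id IN (SELECT id FROM temp_ids)
                      ORDER BY e.id""").fetchall()
print(rows)  # [('a', 'x'), ('c', 'z')]
```

With executemany the ID list is sent in many small statements, so no single statement exceeds max_allowed_packet.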
The code below is working fine, but I'm looking to improve the MySQL code into a more efficient format.
The first case is a function that receives a parameter and returns the customerID from the MySQL db:
def clean_table(self, customerName):
    getCustomerIDMySQL = """SELECT customerID
                            FROM customer
                            WHERE customerName = %s;"""
    self.cursorMySQL.execute(getCustomerIDMySQL, (customerName,))
    for getID_row in self.cursorMySQL:
        customerID = getID_row[0]
    return customerID
Given that we know beforehand that the result will be exactly one row, how can I get the same value into getID_row without using a "for" statement?
For the second case, the function runs with the table name ('customer') hard-coded in it...
def clean_tableCustomer(self):
    cleanTableQuery = """TRUNCATE TABLE customer;"""
    self.cursorMySQL.execute(cleanTableQuery)
    setIndexQuery = """ALTER TABLE customer AUTO_INCREMENT = 1;"""
    self.cursorMySQL.execute(setIndexQuery)
So how can the table name be passed as a parameter to the function? Here is how I tried to get this done:
def clean_table(self, tableName):
    cleanTableQuery = """TRUNCATE TABLE %s;"""
    self.cursorMySQL.execute(cleanTableQuery, (tableName,))
    setIndexQuery = """ALTER TABLE %s AUTO_INCREMENT = 1;"""
    self.cursorMySQL.execute(setIndexQuery, (tableName,))
But MySQL didn't work this time.
All comments and suggestions are highly appreciated.
For the first case (simple, but raises a TypeError when there is no row, since fetchone() returns None):
customerID = self.cursorMySQL.fetchone()[0]
More correct is to implement a new method for the cursor class:
def autofetch_value(self, sql, args=None):
    """Return a single value from a single row, or None if there is no row."""
    self.execute(sql, args)
    returned_val = None
    row = self.fetchone()
    if row is not None:
        returned_val = row[0]
    return returned_val
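A runnable sketch of the same idea, using sqlite3 (standing in for MySQL) and its cursor-factory hook; the class and table names here are illustrative:

```python
import sqlite3

class ValueCursor(sqlite3.Cursor):
    """Cursor with a helper that fetches a single scalar, or None."""
    def autofetch_value(self, sql, args=None):
        self.execute(sql, args or ())
        row = self.fetchone()
        return row[0] if row is not None else None

conn = sqlite3.connect(':memory:')
cur = conn.cursor(factory=ValueCursor)
cur.execute("CREATE TABLE customer (customerID INTEGER, customerName TEXT)")
cur.execute("INSERT INTO customer VALUES (42, 'Acme')")

customer_id = cur.autofetch_value(
    "SELECT customerID FROM customer WHERE customerName = ?", ('Acme',))
print(customer_id)  # 42
missing = cur.autofetch_value(
    "SELECT customerID FROM customer WHERE customerName = ?", ('Nobody',))
print(missing)  # None
```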
For the second case:
def clean_table(self, tableName):
    cleanTableQuery = """TRUNCATE TABLE %s;""" % (tableName,)
    self.cursorMySQL.execute(cleanTableQuery)
    setIndexQuery = """ALTER TABLE %s AUTO_INCREMENT = 1;""" % (tableName,)
    self.cursorMySQL.execute(setIndexQuery)
Make sure you sanitize the data, since the cursor won't.
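One simple way to sanitize is to whitelist what a table name may look like before interpolating it (a sketch; the function name and the exact pattern are assumptions):

```python
import re

def safe_table_name(name):
    """Whitelist simple identifiers, since table names cannot be passed
    as query parameters and raw interpolation risks SQL injection."""
    if not re.fullmatch(r'[A-Za-z_][A-Za-z0-9_]*', name):
        raise ValueError('unsafe table name: %r' % name)
    return '`%s`' % name  # MySQL-style backtick quoting

print(safe_table_name('customer'))  # `customer`
# safe_table_name("customer; DROP TABLE x") raises ValueError
```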
Unfortunately, you cannot parametrize the name of a table (see this post). You will have to use Python string operations to do what you are attempting here.
Hope this helps; it took me a while to figure this out when I ran into the same issue.