How to execute DBCC CHECKIDENT with Python?

I have a Python script that I am trying to use to execute this command below on my SQL Server:
DBCC CHECKIDENT('TableName', RESEED, 0)
My script looks like this:
qry = '''DBCC CHECKIDENT('TableName', RESEED, 0)'''

def mssql_cmd(qry, env):
    # Import dependencies
    import pyodbc
    import sqlalchemy as sa
    import urllib
    import pandas as pd
    import json

    try:
        # Read config json file into config dict
        with open("../parameters/config.json") as cf:
            config = json.load(cf)
        # Try to establish the connection to MSSQL
        params = urllib.parse.quote_plus(f'DRIVER={config[env][0]["driver"]};'
                                         f'Server={config[env][0]["server"]};'
                                         f'Database={config[env][0]["database"]};'
                                         f'User={config[env][0]["user"]};'
                                         f'Password={config[env][0]["password"]};'
                                         f'Trusted_connection={config[env][0]["Trusted_connection"]};')
        # Establish the engine
        engine = sa.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
        db = engine.connect()
        print("Connection to Data Warehouse -- SUCCESSFUL")
        if db.connect():
            try:
                db.execute(qry)
                db.close()
                engine.dispose()
            except Exception as e:
                print(e)
    except Exception as e:
        print(e)
I don't get any errors and the script executes, but it doesn't reset the auto-generated id on the table.
If I replace the line
db.execute(qry)
with
data = pd.read_sql(sql_qry, db)
then I am able to extract the data.
So the script works when I run a query, but I can't get it to run the DBCC command that resets my auto-generated id.
Does anyone have any clue as to what I am doing wrong here?
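For what it's worth, one likely culprit here is transactional behaviour rather than the DBCC call itself: SQLAlchemy runs a plain db.execute() inside an implicit transaction, and if that transaction is never committed the reseed is rolled back when the connection closes. A minimal sketch of one way to force autocommit, reusing the params string built in the question (the engine construction is an assumption carried over from above, not a tested fix for this exact setup):

import sqlalchemy as sa

# Build the engine the same way as in the question; params is assumed to be
# the odbc_connect string produced above.
engine = sa.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)

# Run the DBCC statement on a connection forced into autocommit so the
# reseed is not rolled back when the connection is closed.
with engine.connect().execution_options(isolation_level="AUTOCOMMIT") as conn:
    conn.execute(sa.text("DBCC CHECKIDENT('TableName', RESEED, 0)"))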

Related

Do I need to close pyodbc sql server connection when reading the data into the Pandas Dataframe?

I am confused about how to use a context manager with a pyodbc connection. As far as I know, it is usually necessary to close the database connection, and using a context manager is good practice for that (for pyodbc, I saw some examples which close the cursor only). Long story short, I am creating a Python app which pulls data from SQL Server and I want to read it into a Pandas DataFrame.
I did some searching on contextlib and wrote a script sql_server_connection:
import pyodbc
import contextlib

@contextlib.contextmanager
def open_db_connection(server, database):
    """
    Context manager to automatically close DB connection.
    """
    conn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';Trusted_Connection=yes;')
    try:
        yield
    except pyodbc.Error as e:
        print(e)
    finally:
        conn.close()
I then called this in another script:
from sql_server_connection import open_db_connection
with open_db_connection(server, database) as conn:
    df = pd.read_sql_query(query_string, conn)
which raises this error:
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 436, in read_sql_query
return pandas_sql.read_query(
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 2116, in read_query
cursor = self.execute(*args)
File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 2054, in execute
cur = self.con.cursor()
AttributeError: 'NoneType' object has no attribute 'cursor'
I didn't define a cursor here because I expected pandas to handle it, as it did before I started thinking about closing the connection. If the approach above is wrong, how should I close the connection? Or does pyodbc handle it?
Thanks!
You yield nothing (None) from your open_db_connection.
import pyodbc
import contextlib

@contextlib.contextmanager
def open_db_connection(server, database):
    """
    Context manager to automatically close DB connection.
    """
    conn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';Trusted_Connection=yes;')
    try:
        yield conn  # yield the connection, not None
    except pyodbc.Error as e:
        print(e)
    finally:
        conn.close()
Also, I should point out two things:
pyodbc does not expect the user to close the connections (docs):
Connections are automatically closed when they are deleted (typically when they go out of scope) so you should not normally need to call this, but you can explicitly close the connection if you wish.
pandas expects a SQLAlchemy connectable, a SQLAlchemy URL str, or a sqlite3 connection in its read_sql_* functions (docs), not a pyODBC connection, so your mileage may vary.
pandas.read_sql_query(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, chunksize=None, dtype=None)
con: SQLAlchemy connectable, str, or sqlite3 connection
Using SQLAlchemy makes it possible to use any DB supported by that library. If a DBAPI2 object, only sqlite3 is supported.
I would probably simplify to the following:
There is no need for error handling here; this function must return a connection (I also split the string over several lines and used f-strings rather than concatenation).
import pyodbc

def open_db_connection(server, database):
    """
    Open a pyodbc connection to the given SQL Server database.
    """
    return pyodbc.connect(
        "DRIVER={SQL Server};"
        f"SERVER={server};"
        f"DATABASE={database};"
        "Trusted_Connection=yes;"
    )
Call the above function directly in the argument list to scope the connection to inside read_sql_query. Might want to do error handling here as well, but that depends on what you're writing.
import pandas as pd
from sql_server_connection import open_db_connection

df = pd.read_sql_query(
    query_string,
    open_db_connection(server, database),
)
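Since pandas prefers a SQLAlchemy connectable over a raw DBAPI connection, a hedged variant of the helper that returns an engine instead of a pyodbc connection might look like this (open_db_engine is a made-up name; server, database and query_string are the same placeholders used above):

import urllib
import pandas as pd
import sqlalchemy as sa

def open_db_engine(server, database):
    """Build a SQLAlchemy engine for the given SQL Server database."""
    params = urllib.parse.quote_plus(
        "DRIVER={SQL Server};"
        f"SERVER={server};"
        f"DATABASE={database};"
        "Trusted_Connection=yes;"
    )
    return sa.create_engine(f"mssql+pyodbc:///?odbc_connect={params}")

# pandas checks connections out of (and returns them to) the engine's pool
# itself, so there is nothing to close explicitly here.
df = pd.read_sql_query(query_string, open_db_engine(server, database))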

Output of Python script differs depending on way of activation

Hi guys,
I have an executable Python script, say get_data.py (located in project_x/src/), which works properly when started with python get_data.py. It gets data (a list of id's which are necessary for further calculations) from a database via mysql.connector and then processes these data in parallel (via multiprocessing) using pool.map.
BUT it is supposed to be started by an .exe file (located in project_x/exec/) [EDIT: this .exe uses the PHP command exec() to call my Python script directly]. Started that way it does not work properly: it ends up in the try-except block (in wrapper_fun), catching an unknown error, and it does not terminate at all when I delete the try-except.
Do you have any idea what could be going wrong? I would appreciate any hint. I tried logging, but there seems to be a permission problem. My guess is that the connection to the db cannot be established and therefore there are no id's.
def calculations():
    do_something...

def wrapper_fun(id):
    try:
        calculations(id)
    except Exception:
        return(False)

if __name__ == "__main__":
    import multiprocessing
    import mysql.connector
    from mysql.connector import Error

    host_name = <secret_1>
    user_name = <secret_2>
    passt = <secret_3>

    connection = None
    try:
        connection = mysql.connector.connect(
            host=host_name,
            user=user_name,
            passwd=user_password
        )
    except Error as err:
        print(f"Error: '{err}'")

    d = pd.read_sql_query(query, connection, coerce_float=False)
    connection.close()
    id_s = list(d.ids)
    results = [pool.map(wrapper_fun, id_s)]
    ...
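The question was left unanswered here, but since the try-except in wrapper_fun swallows the real error, one obvious diagnostic step is to log the full traceback somewhere the web-server user can write to. A minimal sketch, assuming the calculations() function from the question and a log file placed next to the script so the working directory chosen by exec() does not matter:

import logging
import os

# Write the log next to the script itself; if the web-server user cannot
# write there, point LOG_PATH at /tmp or another writable location instead.
LOG_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), "get_data.log")
logging.basicConfig(filename=LOG_PATH, level=logging.DEBUG)

def wrapper_fun(id):
    try:
        calculations(id)  # calculations() comes from the question's script
    except Exception:
        # Record the full traceback instead of silently returning False.
        logging.exception("calculation failed for id=%s", id)
        return False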

Python SQLAlchemy - can I pass the connection object between functions?

I have a Python application that reads from MySQL/MariaDB, uses that data to fetch data from an API, and then inserts the results into another table.
I had set up a module with a function to connect to the database and return the connection object, which is then passed to other functions/modules. However, I believe this might not be the correct approach. The idea was to have a small module that I could just call whenever I needed to connect to the db.
Also note that I am using the same connection object during loops (and within the loop passing it to the db_update module) and call close() when all is done.
I am also getting some warnings from the db sometimes; those mostly happen at the point where I call db_conn.close(), so I guess I am not handling the connection or session/engine correctly. Also, the connection id's in the log warning keep increasing, so that is another hint that I am doing it wrong.
[Warning] Aborted connection 351 to db: 'some_db' user: 'some_user' host: '172.28.0.3' (Got an error reading communication packets)
Here is some pseudo code that represents the structure I currently have:
################
## db_connect.py
################
# imports ...
from sqlalchemy import create_engine
def db_connect():
# get env ...
db_string = f"mysql+pymysql://{db_user}:{db_pass}#{db_host}:{db_port}/{db_name}"
try:
engine = create_engine(db_string)
except Exception as e:
return None
db_conn = engine.connect()
return db_conn
################
## db_update.py
################
# imports ...
def db_insert(db_conn, api_result):
# ...
ins_qry = "INSERT INTO target_table (attr_a, attr_b) VALUES (:a, :b);"
ins_qry = text(ins_qry)
ins_qry = ins_qry.bindparams(a = value_a, b = value_b)
try:
db_conn.execute(ins_qry)
except Exception as e:
print(e)
return None
return True
################
## main.py
################
from sqlalchemy import text
from db_connect import db_connect
from db_update import db_insert
def run():
try:
db_conn = db_connect()
if not db_conn:
return False
except Exception as e:
print(e)
qry = "SELECT *
FROM some_table
WHERE some_attr IN (:some_value);"
qry = text(qry)
search_run_qry = qry.bindparams(
some_value = 'abc'
)
result_list = db_conn.execute(qry).fetchall()
for result_item in result_list:
## do stuff like fetching data from api for every record in the query result
api_result = get_api_data(...)
## insert into db:
db_ins_status = db_insert(db_conn, api_result)
## ...
db_conn.close
run()
EDIT: Another question:
a) Is it OK, in a loop that does an update on every iteration, to use the same connection, or would it be wiser to pass the engine to the run() function instead and call db_conn = engine.connect() and db_conn.close() just before and after each update?
b) I am thinking about using ThreadPoolExecutor instead of the loop for the API calls. Would this have implications on how to use the connection, i.e. can I use the same connection for multiple threads that are doing updates to the same table?
Note: I am not using the ORM feature mostly because I have a strong DWH/SQL background (though not so much as DBA) and I am used to writing even complex sql queries. I am thinking about switching to just using PyMySQL connector for that reason.
Thanks in advance!
Yes, you can return/pass the connection object as a parameter, but what is the aim of the db_connect method other than testing the connection? As I see it there is no real need for db_connect, so I would recommend doing it the way I have done it before.
I would like to share a code snippet from one of my projects.
def create_record(sql_query: str, data: tuple):
    try:
        connection = mysql_obj.connect()
        db_cursor = connection.cursor()
        db_cursor.execute(sql_query, data)
        connection.commit()
        return db_cursor, connection
    except Exception as error:
        print(f'Connection failed error message: {error}')
and then I use it like this for another of my needs:
db_cursor, connection, query_data = fetch_data(sql_query, query_data)
and when everything is done I close the connection with the following method:
def close_connection(connection, db_cursor):
    """
    This method is used to close the SQL Server connection.
    """
    db_cursor.close()
    connection.close()
and the method call:
close_connection(connection, db_cursor)
I am not sure whether I can share my GitHub, but please check the link: under model.py you can see the database methods, and main.py shows how they are called.
Best,
Hasan.
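As a side note on the OP's follow-up question (a): a common pattern is to create the engine once, pass the engine (not a connection) around, and open a short-lived connection per unit of work; the engine's connection pool makes this cheap. A minimal sketch under the same names as the question (db_user, db_pass, etc. come from the environment lookup that was elided), not the answerer's approach:

from sqlalchemy import create_engine, text

def db_engine():
    """Create the engine once; its connection pool is reused afterwards."""
    db_string = f"mysql+pymysql://{db_user}:{db_pass}@{db_host}:{db_port}/{db_name}"
    return create_engine(db_string)

def db_insert(engine, value_a, value_b):
    ins_qry = text(
        "INSERT INTO target_table (attr_a, attr_b) VALUES (:a, :b)"
    ).bindparams(a=value_a, b=value_b)
    # engine.begin() checks a connection out of the pool, commits on success,
    # rolls back on error, and returns the connection to the pool afterwards.
    with engine.begin() as conn:
        conn.execute(ins_qry)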

Unable to copy data into AWS RedShift

I have tried a lot, but I am unable to copy data available as a JSON file in an S3 bucket (I have read-only access to the bucket) to a Redshift table using Python boto3. Below is the Python code I am using to copy the data. Using the same code I was able to create the tables into which I am trying to copy.
import configparser
import psycopg2
from sql_queries import create_table_queries, drop_table_queries

def drop_tables(cur, conn):
    for query in drop_table_queries:
        cur.execute(query)
        conn.commit()

def create_tables(cur, conn):
    for query in create_table_queries:
        cur.execute(query)
        conn.commit()

def main():
    try:
        config = configparser.ConfigParser()
        config.read('dwh.cfg')

        # conn = psycopg2.connect("host={} dbname={} user={} password={} port={}".format(*config['CLUSTER'].values()))
        conn = psycopg2.connect(
            host=config.get('CLUSTER', 'HOST'),
            database=config.get('CLUSTER', 'DB_NAME'),
            user=config.get('CLUSTER', 'DB_USER'),
            password=config.get('CLUSTER', 'DB_PASSWORD'),
            port=config.get('CLUSTER', 'DB_PORT')
        )
        cur = conn.cursor()

        #drop_tables(cur, conn)
        #create_tables(cur, conn)

        qry = """copy DWH_STAGE_SONGS_TBL
                 from 's3://udacity-dend/song-data/A/A/A/TRAAACN128F9355673.json'
                 iam_role 'arn:aws:iam::xxxxxxx:role/MyRedShiftRole'
                 format as json 'auto';"""
        print(qry)
        cur.execute(qry)

        # execute a statement
        # print('PostgreSQL database version:')
        # cur.execute('SELECT version()')
        #
        # # display the PostgreSQL database server version
        # db_version = cur.fetchone()
        # print(db_version)

        print("Executed successfully")

        cur.close()
        conn.close()
        # close the communication with the PostgreSQL
    except Exception as error:
        print("Error while processing")
        print(error)

if __name__ == "__main__":
    main()
I don't see any error in the PyCharm console, but I see an Aborted status in the Redshift query console. I don't see any reason why it was aborted (or I don't know where to look for that).
Another thing I have noticed is that when I run the copy statement in the Redshift query editor, it runs fine and the data gets moved into the table. I tried deleting and recreating the cluster, but no luck. I am not able to figure out what I am doing wrong. Thank you.
Quick read - it looks like you haven't committed the transaction and the COPY is rolled back when the connection closes. You need to either change the connection configuration to be in "autocommit" or add an explicit "commit()".
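A minimal sketch of both options, using the conn and cur objects from the question's main():

# Option 1: enable autocommit right after psycopg2.connect(), so the COPY is
# committed as soon as cur.execute(qry) finishes.
conn.autocommit = True

# Option 2: keep the default transactional behaviour and commit explicitly
# after cur.execute(qry) and before conn.close().
conn.commit()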

Launch SQL stored procedures from python with sqlalchemy?

I can successfully connect to SQL Server Management Studio from my jupyter notebook with this script:
from sqlalchemy import create_engine
import pyodbc
import csv
import time
import urllib
params = urllib.parse.quote_plus('''DRIVER={SQL Server Native Client 11.0};
SERVER=SV;
DATABASE=DB;
TRUSTED_CONNECTION=YES;''')
engine = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
I managed to execute some SQL scripts like this:
engine.execute("delete from table_name_X")
However, I can't execute stored procedures. I tried the following scripts, based on what I've seen about stored procedures with SQLAlchemy. They return something like "sqlalchemy.engine.result.ResultProxy at 0x173ed18e470", but the procedure was never actually executed (nothing happened):
# test 1
engine.execute('stored_procedure_name')
# test 2
from sqlalchemy import func
from sqlalchemy.orm import sessionmaker
session = sessionmaker(bind=engine)()
session.execute(func.upper('stored_procedure_name'))
Could you please give me the correct way to execute stored procedures?
The way to call a stored procedure using pyodbc is:
cursor.execute("{CALL usp_StoreProcedure}")
I found a solution via this link: https://github.com/mkleehammer/pyodbc/wiki/Calling-Stored-Procedures
Here is an example:
import pyodbc
import urllib
import sqlalchemy as sa

params = urllib.parse.quote_plus("DRIVER={SQL Server Native Client 11.0};"
                                 "SERVER=xxx.xxx.xxx.xxx;"
                                 "DATABASE=DB;"
                                 "UID=user;"
                                 "PWD=pass")
engine = sa.create_engine("mssql+pyodbc:///?odbc_connect={}".format(params))

connection = engine.raw_connection()
try:
    cursor = connection.cursor()
    cursor.execute("{CALL stored_procedure_name}")
    result = cursor.fetchall()
    print(result)
    connection.commit()
finally:
    connection.close()
I finally solved my problem with the following function:
def execute_stored_procedure(engine, procedure_name):
    res = {}
    connection = engine.raw_connection()
    try:
        cursor = connection.cursor()
        cursor.execute("EXEC " + procedure_name)
        cursor.close()
        connection.commit()
        res['status'] = 'OK'
    except Exception as e:
        res['status'] = 'ERROR'
        res['error'] = e
    finally:
        connection.close()
    return res
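If you would rather stay at the SQLAlchemy level than drop to raw_connection(), a hedged alternative (SQLAlchemy 1.4+ style, sketched here rather than taken from the answer) is to run the EXEC inside engine.begin(), which commits the transaction when the block exits:

import sqlalchemy as sa

def execute_stored_procedure(engine, procedure_name):
    # engine.begin() opens a connection plus a transaction, commits on success
    # and rolls back if the EXEC raises. procedure_name should be a trusted
    # identifier, not user input, since it is interpolated into the SQL text.
    with engine.begin() as conn:
        conn.execute(sa.text(f"EXEC {procedure_name}"))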
