How to do df.to_sql using SQL Server in Azure

I can run df.to_sql against my local instance of SQL Server just fine. I am getting stuck when trying to do the same df.to_sql from Python against Azure SQL Server. I thought it would essentially be done like this.
import urllib.parse
params = urllib.parse.quote_plus(
    'Driver=%s;' % '{ODBC Driver 17 for SQL Server}' +
    'Server=%s,1433;' % 'ryan-server.database.windows.net' +
    'Database=%s;' % 'ryan_sql_db' +
    'Uid=%s;' % 'UN' +
    'Pwd={%s};' % 'PW' +
    'Encrypt=no;' +
    'TrustServerCertificate=no;'
)
from sqlalchemy.engine import create_engine
conn_str = 'mssql+pyodbc:///?odbc_connect=' + params
engine = create_engine(conn_str)
connection = engine.connect()
connection
all_data.to_sql('health', engine, if_exists='append', chunksize=100000, method=None,index=False)
That is giving me this error.
OperationalError: (pyodbc.OperationalError) ('08S01', '[08S01] [Microsoft][ODBC Driver 17 for SQL Server]TCP Provider: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.\r\n (10060) (SQLExecDirectW); [08S01] [Microsoft][ODBC Driver 17 for SQL Server]Communication link failure (10060)')
[SQL: INSERT INTO health ([0], [Facility_BU_ID], [Code_Type], [Code], [Description], [UB_Revenue_Code], [UB_Revenue_Description], [Gross_Charge], [Cash_Charge], [Min_Negotiated_Rate], [Max_Negotiated_Rate], etc., etc., etc.
I found this link today:
https://learn.microsoft.com/en-us/sql/machine-learning/data-exploration/python-dataframe-sql-server?view=sql-server-ver15
I tried to do something similar, like this.
import pyodbc
import pandas as pd
df = all_data
# server = 'myserver,port' # to specify an alternate port
server = 'ryan-server.database.windows.net'
database = 'ryan_sql_db'
username = 'UN'
password = 'PW'
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
cursor = cnxn.cursor()
# Insert Dataframe into SQL Server:
for index, row in df.iterrows():
    cursor.execute(all_data.to_sql('health', cnxn, if_exists='append', chunksize=100000, method=None,index=False))
cnxn.commit()
cursor.close()
When I run that, I get this error.
DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': ('42S02', "[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)")
What I'm really hoping to do is df.to_sql, not Insert Into. I am working in Spyder and trying to send the data from my local machine to the cloud.

I read the two links below, and got it working.
https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-set-database-firewall-rule-azure-sql-database?view=azuresqldb-current
https://www.virtual-dba.com/blog/firewalls-database-level-azure-sql/
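Basically, you need to open a command window on your local machine, run ipconfig, and note the two IP addresses that bound the range you want to allow. Then add that range as a database-level firewall rule on the SQL Server database in Azure. The arguments are the rule name, the start IP address, and the end IP address.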
EXECUTE sp_set_database_firewall_rule
    N'health',
    '192.0.0.5',
    '192.0.1.1';
Finally, run the small script below, in SQL Server, to confirm that the changes were made correctly.
USE [ryan_sql_db]
GO
SELECT * FROM sys.database_firewall_rules
ORDER BY modify_date DESC
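To avoid mixing up the start and end of the range, the rule statement can be generated from Python. The helper below is only a sketch: firewall_rule_sql is a hypothetical name, not part of any library. It just builds the T-SQL text, and it assumes the range is meant to run from a lower start address to a higher end address.

```python
import ipaddress


def firewall_rule_sql(rule_name: str, start_ip: str, end_ip: str) -> str:
    """Build the T-SQL call for a database-level firewall rule (hypothetical helper)."""
    # ipaddress validates the dotted-quad format and lets us compare addresses.
    start = ipaddress.IPv4Address(start_ip)
    end = ipaddress.IPv4Address(end_ip)
    if start > end:
        raise ValueError(f"start {start} is above end {end}")
    # Double any single quote so the rule name stays a valid T-SQL literal.
    safe_name = rule_name.replace("'", "''")
    return (
        "EXECUTE sp_set_database_firewall_rule "
        f"N'{safe_name}', '{start}', '{end}';"
    )


print(firewall_rule_sql("health", "192.0.0.5", "192.0.1.1"))
```

The printed statement can be pasted into a query window connected to the target database.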

Related

Python connect to Microsoft SQL Server with sqlalchemy for df.to_sql: target machine actively refused connection

Insert pandas df into local Microsoft SQL Server database table using df.to_sql
Created connection_url for sqlalchemy engine:
import sqlalchemy
from sqlalchemy.engine import URL
connection_url = URL.create(
    "mssql+pyodbc",
    username="",
    password="",
    host="localhost",
    port=1433,
    database="priority",
    query={
        "driver": "ODBC Driver 17 for SQL Server",
        "authentication": "ActiveDirectoryIntegrated",
    },
)
Used connection_url to create engine:
engine = sqlalchemy.create_engine(connection_url)
Attempted to insert df into SQL database table 'test1' with df.to_sql:
df.to_sql('test1', engine, if_exists='replace')
Received error
OperationalError: (pyodbc.OperationalError) ('08001', '[08001] [Microsoft][ODBC Driver 17 for SQL Server]TCP Provider: No connection could be made because the target machine actively refused it.\r\n (10061) (SQLDriverConnect); [08001] [Microsoft][ODBC Driver 17 for SQL Server]Login timeout expired (0); [08001] [Microsoft][ODBC Driver 17 for SQL Server]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. (10061)')
What I've tried
1. Opened the SQL Port 1433
Ran the following in the cmd (administrator mode):
netsh advfirewall firewall add rule name= "SQL Port" dir=in action=allow protocol=TCP localport=1433
Checked the port has been added:
netsh firewall show state
Which gave the following:
Ports currently open on all network interfaces:
Port Protocol Version Program
1433 TCP Any (null)
2. Replaced 'localhost' with '127.0.0.1'
3. Replaced 'localhost' with the hostname of SQL Server
I found the hostname of SQL Server by executing the following SQL query:
SELECT HOST_NAME() AS HostName
Which returned the following (changed the name for security):
LAPTOP-NumDig
4. [Edit update] Updated ODBC driver to driver 17 (https://learn.microsoft.com/en-us/sql/connect/odbc/download-odbc-driver-for-sql-server?view=sql-server-ver16).
5. [Edit update] Replaced "authentication": "ActiveDirectoryIntegrated" in query with "Trusted_Connection": "yes"
Prior connections have worked with pyodbc package
I have been able to connect to the SQL database 'priority' and created the table 'test1' using the pyodbc package in Python.
Connect to SQL
conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=LAPTOP-NumDig;'
                      'Database=priority;'
                      'UID=;'  # username
                      'PWD=;'  # password
                      )
cursor = conn.cursor()
Create table called 'test1' in 'priority' database
cursor.execute("""
    CREATE TABLE test1 (
        PersonID int,
        LastName varchar(255),
        FirstName varchar(255),
        Address varchar(255),
        City varchar(255)
    );""")
Let me know your thoughts
What I'm wanting to do is insert a df as a table in SQL. Any suggestions would be great :)
FYI - Software & package versions:
VSCode 1.71.2
Python 3.9.13
Microsoft SQL Server 2019 on Windows 10 Home 10.0 (Build 22000: )
sqlalchemy 1.4.41
pyodbc 4.0.34

I found a solution that uses an alternative approach that works: passing through the exact Pyodbc string to sqlalchemy (https://pydoc.dev/sqlalchemy/latest/sqlalchemy.dialects.mssql.pyodbc.html).
import sqlalchemy
from sqlalchemy.engine import URL
connection_string = "DRIVER={SQL Server};SERVER=LAPTOP-NumDig;DATABASE=priority;UID=;PWD="
connection_url = URL.create("mssql+pyodbc", query={"odbc_connect": connection_string})
engine = sqlalchemy.create_engine(connection_url)
df.to_sql('test1', engine, if_exists='replace')
Note, this way of inserting took a long time to upload a df into an SQL table.
[Edit update] To make the df.to_sql() run faster, include the chunksize parameter and method='multi'. (https://towardsdatascience.com/dramatically-improve-your-database-inserts-with-a-simple-upgrade-6dfa672f1424).
I used the following code to speed up the pandas df to SQL table upload:
df.to_sql('test1', engine, if_exists='replace', chunksize=20, method='multi')
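With method='multi', every cell in a chunk becomes one bound parameter. SQL Server rejects a statement with more than roughly 2100 parameters, and a single VALUES list is limited to 1000 rows, which is likely why a small chunksize was needed here. A rough way to pick the largest safe value is sketched below. safe_chunksize is a hypothetical helper, and the two limits are assumptions worth checking against your own server and driver.

```python
def safe_chunksize(n_columns: int, max_params: int = 2100, max_rows: int = 1000) -> int:
    """Largest chunksize that keeps one multi-row INSERT under both limits."""
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    # Stay strictly below max_params: each chunk binds rows * columns parameters.
    by_params = (max_params - 1) // n_columns
    return max(1, min(max_rows, by_params))


# Count the index columns too if to_sql is called with index=True (the default).
print(safe_chunksize(n_columns=6))
```

The result could then be passed along, for example df.to_sql('test1', engine, if_exists='replace', index=False, chunksize=safe_chunksize(len(df.columns)), method='multi'). SQLAlchemy's mssql+pyodbc dialect also accepts fast_executemany=True in create_engine, which is often suggested as an alternative to method='multi'.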

pyodbc connection failing in sqlalchemy but working through direct pyodbc connection

Why does this work (I get a result set back):
sql_server = 'myserver.database.windows.net'
sql_database = 'pv'
sql_username = 'sqladmin'
sql_password = 'password1'
sql_driver= '{ODBC Driver 17 for SQL Server}'
with pyodbc.connect('DRIVER='+sql_driver+';SERVER=tcp:'+sql_server+';DATABASE='+sql_database+';UID='+sql_username+';PWD='+ sql_password) as conn:
    with conn.cursor() as cursor:
        cursor.execute("SELECT TOP 3 SAPPHIRE_CASE_ID FROM PV_ALL_SUBMISSIONS_SL")
        row = cursor.fetchone()
        while row:
            print (str(row[0]))
            row = cursor.fetchone()
But this fails:
import pyodbc
import sqlalchemy
sql_engine = sqlalchemy.create_engine(f'mssql+pyodbc://{sql_username}:{sql_password}@{sql_server}/{sql_database}?driver=ODBC+Driver+17+for+SQL+Server')
df.to_sql('PV_ALL_CLOSED_CASES_SL', con=sql_engine, if_exists='append')
Error is:
OperationalError: (pyodbc.OperationalError) ('08001', '[08001]
[Microsoft][ODBC Driver 17 for SQL Server]Named Pipes Provider: Could
not open a connection to SQL Server [53]. (53) (SQLDriverConnect);
[08001] [Microsoft][ODBC Driver 17 for SQL Server]Login timeout
expired (0); [08001] [Microsoft][ODBC Driver 17 for SQL Server]A
network-related or instance-specific error has occurred while
establishing a connection to SQL Server. Server is not found or not
accessible. Check if instance name is correct and if SQL Server is
configured to allow remote connections. For more information see SQL
Server Books Online. (53)') (Background on this error at:
https://sqlalche.me/e/14/e3q8)
While I know one is doing a read and the other a write, my issue seems to be just establishing a connection one way versus the other when using the same connection details. It isn't an Azure firewall issue, as I am able to connect and run a SELECT statement via the first method. When using sqlalchemy's create_engine(), however, the connection fails, even though I am pretty sure the connection string is correct.
The same variables for server, user name and password are used in both connections.
I think the issue is that the real password has an "@" symbol in it, and so this interferes with the latter connection string.

Thanks to @Larnu, this worked:
from sqlalchemy.engine import URL
connection_string = f"DRIVER={sql_driver};SERVER={sql_server};DATABASE={sql_database};UID={sql_username};PWD={sql_password}"
connection_url = URL.create("mssql+pyodbc", query={"odbc_connect": connection_string})
sql_engine = sqlalchemy.create_engine(connection_url)
I don't have to URL-encode when I use a cx_Oracle connection, but hey, it works now.
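The hostname-style URL can also be kept if the password is percent-encoded first, so that a character such as "@" or "#" is not read as part of the URL syntax. A minimal sketch using only the standard library follows. The password value is made up for illustration.

```python
from urllib.parse import quote_plus

sql_server = "myserver.database.windows.net"
sql_database = "pv"
sql_username = "sqladmin"
sql_password = "p@ss/word#1"  # made-up example containing URL-special characters

# quote_plus percent-encodes '@', '/', '#' and similar characters.
encoded_password = quote_plus(sql_password)

connection_url = (
    f"mssql+pyodbc://{sql_username}:{encoded_password}@{sql_server}/{sql_database}"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)
print(connection_url)
# The string would then be passed to sqlalchemy.create_engine(connection_url).
```

After encoding, the only "@" left in the URL is the one separating the credentials from the host.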

How can we load data from a data frame to Azure SQL Server?

I am trying, for the first time ever, to send data from a data frame in Spyder to Azure SQL Server...I think it's called Synapse. I created a small table in the database and when I run the code below, I see the results I expect to see.
import pyodbc
server = 'ryan-server.database.windows.net'
database = 'ryan_sql_db'
username = 'UN'
password = 'PW'
driver= '{ODBC Driver 17 for SQL Server}'
with pyodbc.connect('DRIVER='+driver+';SERVER=tcp:'+server+';PORT=1433;DATABASE='+database+';UID='+username+';PWD='+ password) as conn:
    with conn.cursor() as cursor:
        cursor.execute("SELECT * From Order_Table")
        row = cursor.fetchone()
        while row:
            print (str(row[0]) + " " + str(row[1]))
            row = cursor.fetchone()
So, the connection is fine. I guess I am just stuck on the syntax to push a data frame to SQL Server in Azure. I tested the code below.
import pyodbc
server = 'ryan-server.database.windows.net'
database = 'ryan_sql_db'
username = 'UN'
password = 'PW'
driver= '{ODBC Driver 17 for SQL Server}'
conn = pyodbc.connect('DRIVER='+driver+';SERVER=tcp:'+server+';PORT=1433;DATABASE='+database+';UID='+username+';PWD='+ password)
all_data.to_sql('health', conn, if_exists='replace', index=True)
When I run that code, I get this error.
DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': ('42S02', "[42S02] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42S02] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Statement(s) could not be prepared. (8180)")

Try doing it with pandas and pyodbc.
Below are the basic steps we usually follow:
Connect to SQL Server.
Install the required Python packages on your local machine.
Load the data into a CSV file.
Then use the Python script below to read the CSV into a data frame and insert its rows into SQL Server.
import pyodbc
import pandas as pd
df = pd.read_csv(r"c:\user\username\department.csv")
server = 'yourservername'
database = 'AdventureWorks'
username = 'username'
password = 'yourpassword'
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
cursor = cnxn.cursor()
#Insert Dataframe into SQL Server:
for index, row in df.iterrows():
    cursor.execute("INSERT INTO HumanResources.DepartmentTest (DepartmentID,Name,GroupName) values(?,?,?)", row.DepartmentID, row.Name, row.GroupName)
cnxn.commit()
cursor.close()
After the script completes, you can run the command below to check the data in SQL Server:
SELECT count(*) from HumanResources.DepartmentTest;
Refer to this official doc for detailed explanation.
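The sqlite_master error in the question comes from passing a raw pyodbc connection to to_sql. pandas accepts only SQLAlchemy connectables or sqlite3 connections there, so anything else falls back to its SQLite code path. The loop above sidesteps that by issuing the INSERT statements directly. A batched variant of the same idea is sketched below with executemany. The standard-library sqlite3 module stands in for pyodbc so the example runs without a server (both use "?" placeholders), and the table and rows are made up.

```python
import sqlite3

# Made-up rows standing in for df.itertuples(index=False, name=None).
rows = [
    (1, "Engineering", "Research and Development"),
    (2, "Sales", "Sales and Marketing"),
    (3, "Finance", "Executive General and Administration"),
]

conn = sqlite3.connect(":memory:")  # stand-in for pyodbc.connect(...)
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE DepartmentTest (DepartmentID INTEGER, Name TEXT, GroupName TEXT)"
)

# One executemany call replaces the per-row execute loop.
cursor.executemany(
    "INSERT INTO DepartmentTest (DepartmentID, Name, GroupName) VALUES (?, ?, ?)",
    rows,
)
conn.commit()

cursor.execute("SELECT COUNT(*) FROM DepartmentTest")
row_count = cursor.fetchone()[0]
print(row_count)
cursor.close()
conn.close()
```

With pyodbc, setting cursor.fast_executemany = True before the executemany call is commonly used to speed up this kind of batch insert.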

SQL Alchemy with SQL Server

I wasn't sure what to title my post, if you have a better idea, feel free to edit the title.
I have not used SQL Alchemy before, and the documentation I have looked at, in the following places, has not been helpful:
Connecting to SQL Database Using SQL Alchemy in Python
Tutorial Point
Here is the code I am using:
import sqlalchemy as sal
from sqlalchemy import create_engine
#Here are the parameters I am using:
server = 'Q-20/fake_example'
database = 'AdventureWorks2017'
driver = 'ODBC Driver 17 for SQL Server'
trusted_connection='yes'
DATABASE_CONNECTION = 'mssql+pyodbc://@server = ' + server + '/database = ' + database + '?trusted_connection = ' + trusted_connection + '&driver=' + driver
engine = sal.create_engine(DATABASE_CONNECTION)
All of that seems to work fine without any problems; however, when I add this line:
connection=engine.connect()
I get the following error message:
sqlalchemy.exc.OperationalError: (pyodbc.OperationalError) ('08001',
'[08001] [Microsoft][ODBC Driver 17 for SQL Server]Named Pipes
Provider: Could not open a connection to SQL Server [53]. (53)
(SQLDriverConnect); [08001] [Microsoft][ODBC Driver 17 for SQL
Server]Login timeout expired (0); [08001] [Microsoft][ODBC Driver 17
for SQL Server]Invalid connection string attribute (0); [08001]
[Microsoft][ODBC Driver 17 for SQL Server]A network-related or
instance-specific error has occurred while establishing a connection
to SQL Server. Server is not found or not accessible. Check if
instance name is correct and if SQL Server is configured to allow
remote connections. For more information see SQL Server Books Online.
(53)')
I am not sure what is wrong with what I am doing, does anyone have any suggestions?
What I have tried so far:
I have confirmed that SQL Server is configured to allow remote connections. I did this check by following the instructions here
Removing the "@" sign before the server, but this just generated the same error message.

I figured out part of what I needed to do. I needed to change my parameters.
Old Parameters:
server = 'Q-20/fake_example'
database = 'AdventureWorks2017'
driver = 'ODBC Driver 17 for SQL Server'
trusted_connection='yes'
New Parameters:
server = 'Q-20'
database = 'AdventureWorks2017'
driver = 'SQL+SERVER+NATIVE+CLIENT+11.0'
trusted_connection='yes'
This is what my code ultimately looked like:
database_connection = 'mssql+pyodbc://Q-20/AdventureWorks2017?trusted_connection=yes&driver=SQL+SERVER+NATIVE+CLIENT+11.0'
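A connection URL like the one above can be assembled without hand-placed "+" signs by letting urllib.parse.urlencode encode the query part. This is only a sketch: trusted_connection_url is a hypothetical helper, and the driver name passed in is an example that has to match a driver actually installed on the machine.

```python
from urllib.parse import urlencode


def trusted_connection_url(server: str, database: str, driver: str) -> str:
    """Build a hostname-style mssql+pyodbc URL that uses Windows authentication."""
    # urlencode turns spaces into '+' and joins the pairs with '&'.
    query = urlencode({"driver": driver, "trusted_connection": "yes"})
    return f"mssql+pyodbc://{server}/{database}?{query}"


url = trusted_connection_url("Q-20", "AdventureWorks2017", "SQL Server Native Client 11.0")
print(url)
# The string would then be passed to sqlalchemy.create_engine(url).
```

Building the query from a dict also avoids the stray spaces and the "server =" and "database =" labels that ended up inside the URL in the question.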

pyodbc connection error when trying to connect to DB on localhost

I have a local DB on my machine called 'Test' which contains a table called 'Tags'. I am able to access this DB and query from this table through SQL Server management studio 2008.
However, when using pyodbc I keep running into problems.
Using this:
conn = pyodbc.connect('DRIVER={SQL Server};SERVER=localhost:1433;DATABASE=Test')
yields the error:
pyodbc.Error: ('08001', '[08001] [Microsoft][ODBC SQL Server Driver][DBNETLIB]Invalid connection. (14) (SQLDriverConnectW); [01000] [Microsoft][ODBC SQL Server Driver][DBNETLIB]ConnectionOpen (Invalid Instance()). (14)')
(with or without specifying the port)
Trying an alternative connection string:
conn = pyodbc.connect('DRIVER={SQL Server};SERVER=localhost\Test,1433')
yields no error, but then:
cur = conn.cursor()
cur.execute("SELECT * FROM Tags")
yields the error:
pyodbc.ProgrammingError: ('42S02', "[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'Tags'. (208) (SQLExecDirectW)")
Why could this be?

I tried changing your query to
SELECT * FROM Test.dbo.Tags
and it worked.

I don't see any authentication attributes in your connection strings. Try this (I'm using Windows authentication):
conn = pyodbc.connect('Trusted_Connection=yes', driver = '{SQL Server}',
                      server = 'localhost', database = 'Test')
cursor = conn.cursor()
# assuming that Tags table is in dbo schema
cursor.execute("SELECT * FROM dbo.Tags")

For me, apart from maintaining the connection details (user, server, driver, correct table name etc.),
I took these steps:
Checked the ODBC version here (Windows 10) ->
(search for) ODBC ->
Select 32/64 bit version ->
Drivers ->
Verify that the ODBC driver version is present there. If it is not, use this link to download the relevant driver: here
Reference Link: here

conn = pyodbc.connect('DRIVER={SQL Server};SERVER=localhost:1433;DATABASE=Test')
This connection string lacks the instance name, and the port shouldn't be written like this.
My connection looks like this:
cn=pyodbc.connect('DRIVER={SQL Server};SERVER=localhost\SQLEXPRESS;PORT=1433;DATABASE=ybdb;UID=sa;PWD=*****')

Try replacing 'localhost' with either '(local)' or '.'. This solution fixed the problem for me.
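The answers above differ mainly in how the SERVER value is written. As far as I know, the SQL Server ODBC drivers expect a port after a comma (host,port) and a named instance after a backslash (host\instance), not host:port. A hypothetical helper that assembles the string from its parts makes the pieces easier to see. odbc_connection_string is an illustrative name, not a pyodbc function.

```python
def odbc_connection_string(driver, host, database, port=None, instance=None, **extra):
    """Assemble an ODBC connection string for SQL Server (hypothetical helper)."""
    server = host
    if instance:
        server += "\\" + instance  # named instance: host\instance
    if port:
        server += f",{port}"  # port goes after a comma, not a colon
    parts = {"DRIVER": "{" + driver + "}", "SERVER": server, "DATABASE": database}
    parts.update(extra)  # e.g. Trusted_Connection="yes" or UID/PWD
    return ";".join(f"{key}={value}" for key, value in parts.items())


conn_str = odbc_connection_string(
    "SQL Server", "localhost", "Test", port=1433, Trusted_Connection="yes"
)
print(conn_str)
# The result would be passed to pyodbc.connect(conn_str).
```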
