I'm getting classes for the tables in the DB as follows:
import sqlalchemy as sa
import sqlalchemy.ext.automap
eng = sa.create_engine(CONNECTION_URL)
Base = sa.ext.automap.automap_base()
Base.prepare(eng, reflect=True)
Session = sa.orm.sessionmaker(bind=eng)
Table1 = Base.classes.Table1
In my case Table1 is system-versioned, which I understand SQLAlchemy doesn't explicitly support.
When running the following code:
t = Table1(field1=1, field2=3)
with Session() as session:
    session.add(t)
    session.commit()
I get the following error:
[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Cannot insert explicit value into a GENERATED ALWAYS column in table 'DBName.dbo.Table1'. Use INSERT with a column list to exclude the GENERATED ALWAYS column, or insert a DEFAULT into GENERATED ALWAYS column. (13536) (SQLExecDirectW);
I understand this probably has to do with the ValidTo and ValidFrom columns:
for col in Table1.__table__.columns:
    print(repr(col))
# ...
# Column('ValidFrom', DATETIME2(), table=<Table1>, nullable=False)
# Column('ValidTo', DATETIME2(), table=<Table1>, nullable=False)
How do I tell sqlalchemy to ignore those columns during the insert statement?
EDIT
I'm guessing the below is the relevant part of the create statement?
CREATE TABLE [dbo].[Table1] (
    [TableID] [int] NOT NULL IDENTITY,
    ...
    [ValidFrom] [datetime2](7) GENERATED ALWAYS AS ROW START NOT NULL,
    [ValidTo] [datetime2](7) GENERATED ALWAYS AS ROW END NOT NULL
I've got the code below working using SQLAlchemy, against this test table:
CREATE TABLE dbo.Customer2
(
    Id INT NOT NULL PRIMARY KEY CLUSTERED,
    Name NVARCHAR(100) NOT NULL,
    StartTime DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    EndTime DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (StartTime, EndTime)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.CustomerHistory2))
If the StartTime / EndTime columns were declared HIDDEN (which these aren't), no value would be needed for them in the INSERT statement and you could supply just the required columns. However, the period columns in my table are not hidden, so I pass DEFAULT for them.
sql = "INSERT INTO dbo.Customer2 VALUES (2,'Someone else', default,default)"
print(sql)
with engine.connect() as con:
    rs = con.execute(sql)
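A variant of the same workaround (just a sketch, not from the original post) is to list the insert columns explicitly so the GENERATED ALWAYS period columns are omitted entirely, and to bind the values rather than interpolating them:
import sqlalchemy as sa
# Sketch only: the explicit column list leaves out StartTime/EndTime, so
# SQL Server generates them. The Id/Name values are made up for illustration.
stmt = sa.text("INSERT INTO dbo.Customer2 (Id, Name) VALUES (:id, :name)")
with engine.begin() as con:  # begin() opens a transaction and commits on success
    con.execute(stmt, {"id": 3, "name": "Someone new"})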
For the schema below:
CREATE TABLE BatchData (
pk INTEGER PRIMARY KEY AUTOINCREMENT,
batchid TEXT NOT NULL,
status TEXT NOT NULL,
strategyname TEXT NOT NULL,
createdon DATETIME
);
I have been trying to update a column value based on a list of batchids.
Snapshot of data in db is:
pk,batchid,status,strategyname,createdon
1,a3eaa908-dbfc-4d9e-aa2a-2604ee3fdd95,FINISHED,OP_Ma,2023-02-15 06:20:21.924608
2,8813d314-4548-4c14-bd28-f2775fd7a1a7,INPROGRESS,OP_Ma,2023-02-16 06:01:19.335228
3,d7b0ef19-97a9-47b1-a885-925761755992,INPROGRESS,OP_CL,2023-02-16 06:20:52.748321
4,e30e2485-e62c-4d3c-9640-05e1b980654b,INPROGRESS,OP_In,2023-02-15 06:25:04.201072
While I'm able to update this table with the following query executed directly in the console:
UPDATE BatchData SET status = 'FINISHED' WHERE batchid in ('a3eaa908-dbfc-4d9e-aa2a-2604ee3fdd95',
'8813d314-4548-4c14-bd28-f2775fd7a1a7',
'd7b0ef19-97a9-47b1-a885-925761755992')
When I try to do the same using Sqlalchemy:
import sqlalchemy as sa
sqlite_eng = sa.create_engine('sqlite:///blah.db')
...
...
status = 'FINISHED'
tuple_data = tuple(batchids)
STMT = sa.text("""UPDATE BatchData SET status = :stat WHERE batchid IN (:bids)""")
STMT_proxy = sqlite_eng.execute(STMT, stat=status, bids=tuple_data)
I have also made sure status is of type <str> and bids of type tuple(<str>).
Still getting the following error:
InterfaceError: (sqlite3.InterfaceError) Error binding parameter 1 - probably unsupported type.
[SQL: UPDATE BatchData SET status = ? WHERE batchid IN (?)]
[parameters: ('FINISHED', ('e30e2485-e62c-4d3c-9640-05e1b980654b', 'ea5df18f-1610-4f45-a3ee-d27b7e3bd1b4',
'd226c86f-f0bc-4d0c-9f33-3514fbb675c2',
'4a6b53cd-e675-44a1-aea4-9ae0 ... (21900 characters truncated) ... -c3d9-430f-b06e-c660b8ed13d8',
'66ed5802-ad57-4192-8d76-54673bd5cf8d', 'e6a3a343-b2ca-4bc4-ad76-984ea4c55e7e', '647dc42d-eccc-4119-b060-9e5452c2e9e5'))]
Can someone please help me find the problem with parameter type mismatch or parameter binding mistake?
You cannot pass a tuple parameter to SQLite through SQLAlchemy like this.
Based on info in the comments: since I didn't have detailed knowledge of the table and it was created with raw SQL, I ended up going to this link and creating a class object as below:
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import sessionmaker
import sqlalchemy as sa
Session = sessionmaker(bind=sqlite_eng)
# these two lines perform the "database reflection" to analyze tables and relationships
Base = automap_base()
Base.prepare(sqlite_eng, reflect=True)
# there are many tables in the database but I only need `BatchData`,
# so I can leave the others out
BatchData = Base.classes.BatchData
# for debugging and passing the query results around
# I usually add as_dict method on the classes
def as_dict(obj):
    data = obj.__dict__.copy()   # copy so the ORM's instance state isn't removed from the object
    data.pop('_sa_instance_state', None)
    return data

# add the `as_dict` function to the classes
for c in [BatchData]:
    c.as_dict = as_dict
objs = ('a3eaa908-dbfc-4d9e-aa2a-2604ee3fdd95',
        '8813d314-4548-4c14-bd28-f2775fd7a1a7',
        'd7b0ef19-97a9-47b1-a885-925761755992')

with Session() as session:
    q = (session.query(BatchData)
         .filter(BatchData.batchid.in_(objs))
         .update({BatchData.status: 'FINISHED'}, synchronize_session=False))
    session.commit()

row_updated = q
print(row_updated)
It worked. So for others - Here's the complete way!
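As an alternative to mapping the table, the original IN (:bids) error can also be avoided with an expanding bind parameter, which renders one placeholder per list element. This is a sketch based on the question's variables, not part of the answer above:
import sqlalchemy as sa
# Sketch only: `batchids` is the list/tuple of ids from the question.
engine = sa.create_engine("sqlite:///blah.db")
stmt = sa.text(
    "UPDATE BatchData SET status = :stat WHERE batchid IN :bids"
).bindparams(sa.bindparam("bids", expanding=True))
with engine.begin() as conn:
    result = conn.execute(stmt, {"stat": "FINISHED", "bids": list(batchids)})
    print(result.rowcount)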
I'm trying to copy a couple of tables from one database ("db2", PostgreSQL) to another ("db1", SQL Server).
Unfortunately, I'm facing an issue because some fields in the PostgreSQL database are of type BOOLEAN, which is not recognized as a valid type by SQL Server.
Here is my code sample:
db2_engine = "postgresql+psycopg2://" + str(db2_user) + ":" + str(db2_password) + "@" + str(db2_host) + ":" + str(db2_port) + "/" + str(db2_database)
db2 = sqlalchemy.create_engine(db2_engine)
lst_tablename_totr = ["contract",
                      "subscription",
                      "contractdelivery",
                      "businesspartner"]

for table_name in lst_tablename_totr:
    table = Table(table_name, metadata, autoload=True, autoload_with=db2)
    table.create(bind=db1)

    query = """
        SELECT *
        FROM """ + str(table_name) + """
    """
    df_hg = pd.read_sql(query, db2_engine)
    df_hg.to_sql(table_name, db1, schema='dbo', index=False, if_exists='append')
For now, the issue is located in the table = Table(table_name, metadata, autoload=True, autoload_with=db2); table.create(bind=db1) part of the code.
Here is the error message:
ProgrammingError: (pyodbc.ProgrammingError) ('42000', '[42000] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Column, parameter or variable #8\xa0: data type BOOLEAN not found. (2715) (SQLExecDirectW)')
I couldn't find any way to force the conversion between PostgreSQL Boolean type and SQL Server Bit type.
You are seeing a difference between SQLAlchemy's dialect-specific BOOLEAN type and its generic Boolean type. For an existing PostgreSQL table
CREATE TABLE IF NOT EXISTS public.so68683260
(
id character varying(5) COLLATE pg_catalog."default" NOT NULL,
bool_col boolean NOT NULL,
CONSTRAINT so68683260_pkey PRIMARY KEY (id)
)
if we reflect the table then the boolean columns are defined as BOOLEAN
tbl = sa.Table(table_name, sa.MetaData(), autoload_with=pg_engine)
print(type(tbl.columns["bool_col"].type))
# <class 'sqlalchemy.sql.sqltypes.BOOLEAN'>
and then if we try to create the table in SQL Server we end up doing the equivalent of
tbl = sa.Table(
table_name,
sa.MetaData(),
sa.Column("id", sa.VARCHAR(5), primary_key=True),
sa.Column("bool_col", sa.BOOLEAN, nullable=False),
)
tbl.drop(ms_engine, checkfirst=True)
tbl.create(ms_engine)
and that fails with the error you cite because the DDL rendered is
CREATE TABLE so68683260 (
id VARCHAR(5) NOT NULL,
bool_col BOOLEAN NOT NULL,
PRIMARY KEY (id)
)
However, if we use the generic Boolean type
tbl = sa.Table(
table_name,
sa.MetaData(),
sa.Column("id", sa.VARCHAR(5), primary_key=True),
sa.Column("bool_col", sa.Boolean, nullable=False),
)
tbl.drop(ms_engine, checkfirst=True)
tbl.create(ms_engine)
we are successful because the DDL rendered is
CREATE TABLE so68683260 (
id VARCHAR(5) NOT NULL,
bool_col BIT NOT NULL,
PRIMARY KEY (id)
)
and BIT is the valid corresponding column type in T-SQL.
Feel free to open a SQLAlchemy issue if you believe that this behaviour should be changed.
[Note also that the text column is VARCHAR(5) because the table uses the default encoding for my PostgreSQL test database (UTF8), but creating the table in SQL Server will create a VARCHAR (non-Unicode) column instead of a NVARCHAR (Unicode) column.]
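If you want to keep the convenience of reflection rather than redefining each table by hand, one possible sketch (an assumption on my part, not part of the answer above) is to coerce the reflected dialect-specific BOOLEAN columns to the generic Boolean type before creating the table on SQL Server:
import sqlalchemy as sa
# Sketch only: reflect from PostgreSQL, swap BOOLEAN for the generic Boolean
# so the mssql dialect renders BIT, then create on the target engine.
tbl = sa.Table(table_name, sa.MetaData(), autoload_with=pg_engine)
for col in tbl.columns:
    if isinstance(col.type, sa.BOOLEAN):
        col.type = sa.Boolean()
tbl.create(ms_engine, checkfirst=True)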
I have a dataframe df created as follows:
df = pd.DataFrame(list(zip(product_urlList, nameList, priceList, picList)),
                  columns=['URL', 'NomProduit', 'Prix', "LienPic"])
df['IdUnique'] = df['NomProduit'] + df['Prix']
My target is to import it into a MySQL database.
I created an SQL Database (called "Sezane") and its table called "Robes" via Python with MySQL.connector.
import mysql.connector as mysql
db = mysql.connect(
    host="localhost",
    user="root",
    passwd="password",
    database="sezane"
)
cursor = db.cursor()
cursor.execute('CREATE TABLE Robes (id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY, Nom_Robes VARCHAR(255), Prix_Robes VARCHAR(255), liens_Robes VARCHAR(300), Images_robes VARCHAR(300), Id_Robes VARCHAR(255))')
Then, I try to insert this dataframe in the table :
from sqlalchemy import create_engine
engine = create_engine('mysql+mysqlconnector://root:password@localhost:3306/sezane', echo=True)
df.to_sql(name='Robes', con=engine, if_exists = 'append')
I have the following error :
ProgrammingError: (mysql.connector.errors.ProgrammingError) 1054 (42S22): Unknown column 'index' in 'field list'
I did some research on this error and found that it could come from a mix-up of quote characters (" vs ').
However, after many hours on it, I still don't understand where it comes from. Why is the error message about 'index'?
My target is to be able to make my df as a table.
By default to_sql tries to export the dataframe index as a column. You should be able to change this:
df.to_sql(name='Robes', con=engine, if_exists = 'append')
To this:
df.to_sql(name='Robes', con=engine, if_exists='append', index=False)
and you will no longer get the same error.
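Alternatively, if you do want the index stored, to_sql accepts an index_label so the exported index lands in a named column. This is just a sketch; it assumes the target table actually has such a column, which Robes as created above does not:
# Sketch only: write the dataframe index into a column named 'row_id'.
# 'row_id' is hypothetical; the Robes table would need it for this to succeed.
df.to_sql(name='Robes', con=engine, if_exists='append',
          index=True, index_label='row_id')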
I need to update a SQL Server database using a stored procedure and a table as a parameter using PYODBC. The stored procedure should be fine but I'm not sure about the syntax used in the Python script:
Python:
import pandas as pd
import pyodbc
# Create dataframe
data = pd.DataFrame({
    'STATENAME': [state1, state2],
    'COVID_Cases': [value1, value2],
})
data
conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=mydb;'
                      'Database=mydbname;'
                      'UID=username;'
                      'PWD=password;'
                      'Trusted_Connection=yes;')
cursor = conn.cursor()
params = ('@StateValues', data)
cursor.execute("{CALL spUpdateCases (?,?)}", params)
Stored procedure:
[dbo].[spUpdateCases]
    @StateValues tblTypeCOVID19 readonly,
    @Identity int out
AS
BEGIN
    INSERT INTO tblCOVID19
    SELECT * FROM @StateValues
    SET @Identity = SCOPE_IDENTITY()
END
Here is my user-defined type:
CREATE TYPE [dbo].[tblTypeCOVID19] AS TABLE
(
[ID] [int] NOT NULL,
[StateName] [varchar](50) NULL,
[COVID_Cases] [int] NULL,
[DateEntered] [datetime] NULL
)
I'm not getting any error when executing the Python script.
Using table-valued parameters directly requires client-side support, which I don't think pyodbc has implemented. But the best way to read and write tabular data from Python is with JSON.
You can send tabular results to SQL Server as JSON, and parse the doc on the server to load a TVP or regular table.
Here's a walk-through. Also, I've modified your stored procedure to return multiple rows as the result, instead of just a single SCOPE_IDENTITY value:
In SQL Server Run:
use tempdb
go
drop table if exists tblCOVID19
drop procedure [spUpdateCases]
drop type [dbo].[tblTypeCOVID19]
go
create table tblCOVID19
(
ID int identity not null primary key,
StateName varchar(200),
COVID_Cases int,
DateEntered datetime default getdate()
)
go
CREATE TYPE [dbo].[tblTypeCOVID19] AS TABLE(
[ID] [int] NULL,
[StateName] [varchar](50) NULL,
[COVID_Cases] [int] NULL,
[DateEntered] [datetime] NULL
)
go
create or alter procedure [dbo].[spUpdateCases]
    @StateValues tblTypeCOVID19 readonly
AS
BEGIN
    set nocount on;
    insert into tblCOVID19 (StateName, COVID_Cases)
    output inserted.*
    select StateName, COVID_Cases
    from @StateValues;
END
And verify that it's working in TSQL like this:
declare @json nvarchar(max) = '[{"STATENAME":"TX","COVID_Cases":212},{"STATENAME":"OK","COVID_Cases":41}]'
declare @tvp tblTypeCOVID19

insert into @tvp(StateName, COVID_Cases)
select StateName, COVID_Cases
from openjson(@json)
with
(
    StateName varchar(200) '$.STATENAME',
    COVID_Cases int '$.COVID_Cases'
)

exec [dbo].[spUpdateCases] @tvp
Then call it from python like this:
import pandas as pd
import pyodbc
import json
# Create dataframe
data = pd.DataFrame({
    'STATENAME': ["TX", "OK"],
    'COVID_Cases': [212, 41],
})
conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=localhost;'
                      'Database=tempdb;'
                      'Trusted_Connection=yes;')
cursor = conn.cursor()
jsonData = data.to_json(orient='records')
print(jsonData)
sql = """
set nocount on;
declare @tvp tblTypeCOVID19

insert into @tvp(StateName, COVID_Cases)
select StateName, COVID_Cases
from openjson(?)
with
(
    StateName varchar(200) '$.STATENAME',
    COVID_Cases int '$.COVID_Cases'
)

exec [dbo].[spUpdateCases] @tvp
"""
cursor.execute(sql, jsonData)
results = cursor.fetchall()
print(results)
conn.commit()  # pyodbc connections don't autocommit by default
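If you want those OUTPUT rows back in pandas, a small follow-on sketch (not part of the original answer) rebuilds a DataFrame from the cursor results:
# Sketch only: cursor.description still describes the result set returned
# by the procedure's OUTPUT clause.
cols = [d[0] for d in cursor.description]
inserted = pd.DataFrame.from_records([tuple(r) for r in results], columns=cols)
print(inserted)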
I have a Pandas dataframe which I'm trying to insert into a MySQL table using MySQLdb and to_sql. The table has 'allocationid' as an auto-incrementing primary key. I want to do this daily, deleting the day's previous data from the MySQL table and reinserting the updated data from the dataframe. Hence I'd like the primary key to auto-increment automatically (I won't be using it down the line, but may want to refer to it).
code is...
columns = ('date','tradeid','accountid','amount')
splitInput = pd.DataFrame(columns = columns)
splitInput['accountid'] = newHFfile['acctID']
splitInput['tradeid'] = newHFfile['Ref']
splitInput['amount'] = newHFfile['AMOUNT1']
splitInput['date'] = newHFfile['Trade Date']
db = MySQLdb.connect(host="(hostIP)", port=3306, user="user", passwd="(passwd)", db="(database)")
cursor = db.cursor()
query = """delete from splittrades where date = """ + runymdformat + """ """
cursor.execute(query)
db.commit()
splitInput.to_sql(con = db, name = 'splittrades',if_exists = 'append',flavor = 'mysql',index = False)
db.commit()
db.close()
The problem is that without adding a column for primary key, I get 'OperationalError: (1364, "Field 'allocationid' doesn't have a default value")'
If I add a primary key column and leave it blank, null, I get OperationalError: (1366, "Incorrect integer value: '' for column 'allocationid' at row 1")
If I use 1 or 0 in the allocationid column I get a duplicated value error msg.
MySQL usually auto-increments the primary key if you don't specify it - is there a way I can make this work from Python?
PS am not a Python expert so pls treat me gently - thanks :-)
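A hedged guess, assuming allocationid was simply created without AUTO_INCREMENT (error 1364 is what MySQL raises in strict mode when an INSERT omits a NOT NULL column that has no default and no auto-increment): fix the column definition once, then keep index=False in to_sql so the column stays out of the INSERT entirely.
# Sketch only: assumes the column merely lacks AUTO_INCREMENT; the table and
# column names are taken from the question.
cursor = db.cursor()
cursor.execute("ALTER TABLE splittrades MODIFY allocationid INT NOT NULL AUTO_INCREMENT")
db.commit()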