All I want is a simple upsert from a DataFrame to SQLite. However, since pd.to_sql() does not support upsert, I had to implement it with SQLAlchemy instead.
SQLite:
CREATE TABLE test (col1 INTEGER, col2 text, col3 REAL, PRIMARY KEY(col1, col2));
python:
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy import Table
from sqlalchemy.dialects.postgresql import insert
from sqlalchemy.ext.automap import automap_base
def test_upsert():
    df = pd.DataFrame({'col1': 1, 'col2': 'a', 'col3': 1.5}, index=[0])
    sql_url = 'sqlite:///testDB.db'
    table = 'test'
    engine = create_engine(sql_url)
    with engine.connect() as conn:
        base = automap_base()
        base.prepare(engine, reflect=True)
        target_table = Table(table, base.metadata, autoload=True, autoload_with=engine)
        stmt = insert(target_table).values(df.to_dict(orient='records'))
        update_dict = {c.name: c for c in stmt.excluded if not c.primary_key}
        conn.execute(stmt.on_conflict_do_update(constraint=f'{table}_pkey', set_=update_dict))
The script above worked with Postgres previously, but it keeps giving me the following error when used with SQLite:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) near "ON": syntax error
[SQL: INSERT INTO test (col1, col2, col3) VALUES (?, ?, ?) ON CONFLICT (test_pkey) DO UPDATE SET col3 = excluded.col3]
[parameters: (1, 'a', 1.5)]
(Background on this error at: http://sqlalche.me/e/14/e3q8)
I'm not sure what I did wrong, or if there's any better solution since it seems like a very common operation.
Any help is appreciated.
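For what it's worth, a likely culprit (my reading, not confirmed in the question itself): the code imports insert from sqlalchemy.dialects.postgresql, so the statement is compiled with Postgres conflict syntax, and SQLite has no constraint named test_pkey in any case. A sketch using the SQLite dialect's insert instead, naming the conflict columns via index_elements (an in-memory database stands in for testDB.db; column names are taken from the question):

```python
import pandas as pd
import sqlalchemy as sa
from sqlalchemy.dialects.sqlite import insert  # SQLite-flavoured INSERT ... ON CONFLICT

# Throwaway in-memory database standing in for testDB.db
engine = sa.create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(sa.text(
        "CREATE TABLE test (col1 INTEGER, col2 TEXT, col3 REAL, "
        "PRIMARY KEY (col1, col2))"))

df = pd.DataFrame({'col1': [1], 'col2': ['a'], 'col3': [1.5]})

# Reflect the existing table
target_table = sa.Table("test", sa.MetaData(), autoload_with=engine)

stmt = insert(target_table).values(df.to_dict(orient="records"))
# Update every non-key column from the proposed (excluded) row
update_dict = {c.name: c for c in stmt.excluded if c.name not in ("col1", "col2")}
# SQLite identifies the conflict target by its columns, not by a constraint name
stmt = stmt.on_conflict_do_update(index_elements=["col1", "col2"], set_=update_dict)

with engine.begin() as conn:
    conn.execute(stmt)
```

The key difference from the Postgres version is index_elements=["col1", "col2"] in place of constraint='test_pkey'.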
Related
I have multiple SQLite3 databases for which the models are not available.
import sqlite3

def index_db(name, tempdb):
    print(f'{name.ljust(padding)} Indexing file: {tempdb}')
    if tempdb.endswith('primary.sqlite'):
        conn = sqlite3.connect(tempdb)
        conn.execute('CREATE INDEX packageSource ON packages (rpm_sourcerpm)')
        conn.commit()
        conn.close()
How can I perform the same operation using SQLAlchemy?
I can come up with two ways to add that index through SQLAlchemy:
if you do not reflect, execute the SQL statement directly
if you reflect your table/model, add an index to it
Firstly, let's create the table to work on.
import sqlite3
con = sqlite3.connect("/tmp/73526761.db")
con.execute("CREATE TABLE t73526761 (id INT PRIMARY KEY, name VARCHAR)")
con.commit()
con.close()
Then, without reflecting, you can execute your raw SQL with the following.
import sqlalchemy as sa
engine = sa.create_engine("sqlite:////tmp/73526761.db", future=True)
with engine.begin() as con:
    con.execute(sa.text("CREATE INDEX t73526761_name_idx ON t73526761 (name)"))
    # no explicit commit needed: engine.begin() commits on exit
Or if you reflect the table only (SQLAlchemy core):
import sqlalchemy as sa
metadata_obj = sa.MetaData()
engine = sa.create_engine("sqlite:////tmp/73526761.db", future=True)
t73526761 = sa.Table("t73526761", metadata_obj, autoload_with=engine)
t73526761_name_idx = sa.Index("t73526761_name_idx", t73526761.c.name)
t73526761_name_idx.create(bind=engine) # emits CREATE INDEX t73526761_name_idx ON t73526761 (name)
Or if you reflect the model (SQLAlchemy orm):
import sqlalchemy as sa
from sqlalchemy import orm
Base = orm.declarative_base()
engine = sa.create_engine("sqlite:////tmp/73526761.db", future=True)
class K73526761(Base):
    __table__ = sa.Table("t73526761", Base.metadata, autoload_with=engine)
t73526761_name_idx = sa.Index("t73526761_name_idx", K73526761.name)
t73526761_name_idx.create(bind=engine) # emits CREATE INDEX t73526761_name_idx ON t73526761 (name)
I am new to Postgres. Can anyone tell me how to get this working?
What I want to do is write a Pandas dataframe to a PostgreSQL database. I have already created a database 'customer' and a table 'users'.
I am creating a simple Pandas dataframe as follows:
import pandas as pd

data = {'Col1': [1, 2, 3, 4, 5], 'Col2': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)
After that, I create a Postgres connection to my 'customer' database as follows:
import psycopg2

conn = psycopg2.connect(
    database="customer", user='postgres', password='password',
    host='127.0.0.1', port='5432')
Then, I am using the following command to insert records from dataframe into table 'users':
df.to_sql('users', conn, if_exists='replace')
conn.commit()
conn.close()
Error that I am getting is:
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': syntax error at or near ";"
LINE 1: ...ELECT name FROM sqlite_master WHERE type='table' AND name=?;
^
df.to_sql() does not work with a raw psycopg2 "conn"; it expects a SQLAlchemy "engine" (pandas falls back to treating an unrecognized DBAPI connection as SQLite, which is why the error mentions sqlite_master). For psycopg2, try an insert instead:
Step 1: Creation of an empty table
First you need to create a cursor and then create a table:
cursor = conn.cursor()
cursor.execute("CREATE TABLE users_table (col1 integer, col2 integer)")
conn.commit()
Step 2: Insert pandas df to the users_table
tuples = [tuple(x) for x in df.to_numpy()]
cols = ','.join(list(df.columns))
query = "INSERT INTO %s(%s) VALUES(%%s,%%s)" % ('users_table', cols)  # two columns
cursor.executemany(query, tuples)
conn.commit()
If you want to use df.to_sql():
from sqlalchemy import create_engine
engine = create_engine('postgresql+psycopg2://user:password@hostname/database_name')
df.to_sql('users', engine)
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html
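To sketch the engine-based call end to end (a throwaway SQLite engine stands in here so the example is self-contained; with Postgres only the URL changes):

```python
import pandas as pd
from sqlalchemy import create_engine, text

# In-memory SQLite stands in for the Postgres URL above
engine = create_engine('sqlite://')

df = pd.DataFrame({'Col1': [1, 2, 3, 4, 5], 'Col2': [1, 2, 3, 4, 5]})
# index=False keeps the dataframe index out of the table
df.to_sql('users', engine, if_exists='replace', index=False)

with engine.connect() as conn:
    n = conn.execute(text('SELECT COUNT(*) FROM users')).scalar()
# n is 5
```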
I have a dataframe df created as follow :
df = pd.DataFrame(list(zip(product_urlList, nameList, priceList, picList)),
                  columns=['URL', 'NomProduit', 'Prix', 'LienPic'])
df['IdUnique'] = df['NomProduit'] + df['Prix']
My target is to import it into a MySQL database.
I created a MySQL database (called "Sezane") and its table called "Robes" via Python with mysql.connector.
import mysql.connector as mysql

db = mysql.connect(
    host="localhost",
    user="root",
    passwd="password",
    database="sezane"
)
cursor = db.cursor()
cursor.execute('CREATE TABLE Robes (id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY, Nom_Robes VARCHAR(255), Prix_Robes VARCHAR(255), liens_Robes VARCHAR(300), Images_robes VARCHAR(300), Id_Robes VARCHAR(255))')
Then, I try to insert this dataframe in the table :
from sqlalchemy import create_engine
engine = create_engine('mysql+mysqlconnector://root:password@localhost:3306/sezane', echo=True)
df.to_sql(name='Robes', con=engine, if_exists = 'append')
I have the following error :
ProgrammingError: (mysql.connector.errors.ProgrammingError) 1054 (42S22): Unknown column 'index' in 'field list'
I did some research on this error and found that it could be a problem of quote/bracket mixing.
However, after many hours on it, I still don't understand where it comes from. Why does the error message mention 'index'?
My target is to be able to make my df as a table.
By default to_sql tries to export the dataframe index as a column, and your Robes table has no column named index. You should be able to change this:
df.to_sql(name='Robes', con=engine, if_exists='append')
To this:
df.to_sql(name='Robes', con=engine, if_exists='append', index=False)
and you will no longer get the same error.
I am struggling to get my Python and SQL datatypes talking to each other. I'm not sure what I am missing here.
My Code:
import pandas as pd
import sqlite3
from datetime import datetime
from sqlalchemy import create_engine
from sqlalchemy import MetaData
from sqlalchemy import Table
from sqlalchemy import Column
from sqlalchemy import Integer, String, Float, DATE, DECIMAL

def db_conn():
    global conn
    global engine
    db_uri = "sqlite:///db.sqlite"
    engine = create_engine(db_uri)
    conn = engine.connect()
    meta_db = MetaData(engine)
    print("Connected to Database at: " + str(datetime.today()))
    return conn

db_conn()
# create table
meta = MetaData(engine)
table = Table('MarketData2', meta,
              Column('DT', Integer, primary_key=True),
              Column('Currency', String),
              Column('Signal', String),
              Column('Value', Float))
meta.create_all()
def update_MarketData_db(dt, currency, sig, val):
    # insert multiple data
    conn.execute(table.insert(), [
        {'DT': str(dt),
         'Currency': str(currency),
         'Signal': str(sig),
         'Value': float(val)}])

update_MarketData_db('2014-05-04', 'test', 'test_blac', 420.0)
As you can see, I explicitly declare each datatype in the function. I have also tried the Float, FLOAT, Decimal and DECIMAL classes in SQLAlchemy, and that hasn't worked either.
The traceback:
IntegrityError: (sqlite3.IntegrityError) datatype mismatch [SQL: 'INSERT INTO "MarketData2" ("DT", "Currency", "Signal", "Value") VALUES (?, ?, ?, ?)'] [parameters: ('2014-05-04', 'test', 'test_blac', 420.0)] (Background on this error at: http://sqlalche.me/e/gkpj)
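One likely cause (my reading, not part of the original post): DT is declared Integer and is the primary key, so SQLite enforces integer values on that column, yet the code inserts the string '2014-05-04'. Declaring DT as String (or as Date and passing datetime.date values) lets the insert go through. A sketch with an in-memory database standing in for db.sqlite:

```python
import sqlalchemy as sa

# In-memory database standing in for db.sqlite
engine = sa.create_engine("sqlite://")
meta = sa.MetaData()

# DT holds strings like '2014-05-04', so declare it String, not Integer.
# (An INTEGER PRIMARY KEY in SQLite rejects non-integer values.)
table = sa.Table('MarketData2', meta,
                 sa.Column('DT', sa.String, primary_key=True),
                 sa.Column('Currency', sa.String),
                 sa.Column('Signal', sa.String),
                 sa.Column('Value', sa.Float))
meta.create_all(engine)

with engine.begin() as conn:
    conn.execute(table.insert(), [
        {'DT': '2014-05-04', 'Currency': 'test',
         'Signal': 'test_blac', 'Value': 420.0}])
```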
I have the following:
from sqlalchemy import create_engine
engine1 = create_engine('mysql://user:password@host1/schema', echo=True)
engine2 = create_engine('mysql://user:password@host2/schema')
connection1 = engine1.connect()
connection2 = engine2.connect()
table1 = connection1.execute("select * from table1")
table2 = connection2.execute("select * from table2")
Now I want to insert all entries from table1 into an identical, empty table2 on connection2.
How can I achieve that?
I could also build a dict out of table1 and then insert it into table2. As I learned from the SQLAlchemy documentation there is a way to do that, but the examples there assume that you create a whole new table in order to insert into it with new_table.insert(). That doesn't work for my existing tables.
Thanks
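A sketch of one way to do this, assuming both tables already exist with the same schema (two throwaway SQLite engines stand in for the two MySQL hosts): reflect each existing table with autoload_with instead of defining it anew, then feed the selected rows to table2.insert():

```python
import sqlalchemy as sa

# Two throwaway SQLite engines standing in for the two MySQL hosts
engine1 = sa.create_engine("sqlite://")
engine2 = sa.create_engine("sqlite://")

# The question assumes both tables already exist with identical schemas
with engine1.begin() as con:
    con.execute(sa.text("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT)"))
    con.execute(sa.text("INSERT INTO table1 (id, name) VALUES (1, 'a'), (2, 'b')"))
with engine2.begin() as con:
    con.execute(sa.text("CREATE TABLE table2 (id INTEGER PRIMARY KEY, name TEXT)"))

# Reflect the existing tables instead of creating new ones
table1 = sa.Table("table1", sa.MetaData(), autoload_with=engine1)
table2 = sa.Table("table2", sa.MetaData(), autoload_with=engine2)

# Fetch rows as dicts from the source, bulk-insert into the target
with engine1.connect() as src, engine2.begin() as dst:
    rows = [dict(r) for r in src.execute(sa.select(table1)).mappings()]
    if rows:
        dst.execute(table2.insert(), rows)
```

The row dicts are keyed by column name, so as long as the two tables share column names, no per-column mapping code is needed.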