This question already has answers here:
Executing multiple statements with Postgresql via SQLAlchemy does not persist changes
(3 answers)
Closed 3 years ago.
I am trying to execute a SQL script from a file using SQLAlchemy.
When I run it, the result says that x rows were affected, but when I check the database nothing has actually been inserted into the tables.
Here is my current code:
def import_to_db(df, table_name):
    df.to_sql(
        table_name,
        con=engine,
        schema='staging',
        if_exists='replace',
        index=False,
        method='multi'
    )
    print('imported data to staging.{}'.format(table_name))
    with open('/home/kyle/projects/data_pipelines/ahc/sql/etl_{}.sql'.format(table_name)) as fp:
        etl = fp.read()
    result = engine.execute(etl)
    print('moved {} rows to public.{}'.format(result.rowcount, table_name))
When I run the .sql scripts manually, they work fine. I even tried making stored procedures, but that didn't work either. Here is an example of one of the SQL files I'm executing:
--Delete Id's in prod table that are in current staging table
DELETE
FROM public.table
WHERE key IN
(SELECT key FROM staging.table);
--Insert new/old id's into prod table and do any cleaning
INSERT INTO
public.table
SELECT columna, columnb, columnc
FROM staging.table;
Found a solution, although I don't fully understand it.
I added BEGIN; at the top of my script and COMMIT; at the bottom.
This works, but my row count now says -1, so it doesn't help me much for logging.
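One way to keep a usable row count is to run the statements one at a time inside a single transaction and record each statement's rowcount. The following is only a sketch under assumptions: it uses the standard-library sqlite3 module as a stand-in for the real database, run_script is a hypothetical helper, the table names are made up, and the split on ";" is naive (it would break on semicolons inside string literals).

```python
import sqlite3


def run_script(conn, script):
    """Run each statement of `script` in one transaction.

    Returns the list of per-statement row counts.
    """
    # Drop full-line SQL comments, then split naively on ';'.
    body = "\n".join(
        line for line in script.splitlines()
        if not line.strip().startswith("--")
    )
    statements = [s.strip() for s in body.split(";") if s.strip()]

    counts = []
    with conn:  # commits on success, rolls back if a statement raises
        cur = conn.cursor()
        for stmt in statements:
            cur.execute(stmt)
            counts.append(cur.rowcount)
    return counts


# Hypothetical staging/production tables for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staging_t (key INTEGER, val TEXT);
CREATE TABLE public_t (key INTEGER, val TEXT);
INSERT INTO staging_t VALUES (1, 'a'), (2, 'b');
INSERT INTO public_t VALUES (1, 'old');
""")

etl = """
--Delete keys in the prod table that are in the staging table
DELETE FROM public_t WHERE key IN (SELECT key FROM staging_t);
--Insert the staging rows into the prod table
INSERT INTO public_t SELECT key, val FROM staging_t;
"""

counts = run_script(conn, etl)
print(counts)
```

Because each statement is executed separately, the counts can be logged per statement or summed, and the commit happens once at the end instead of relying on BEGIN/COMMIT inside the script.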
This question already has answers here:
Sqlite insert query not working with python?
(2 answers)
Closed 1 year ago.
I am currently learning SQL for one of my projects, and the site I am learning from advised me to use DB Browser to see my database content. However, I can't see the data inside the database. This is what my code looks like. I'm creating a table and then trying to write some values into it. It creates the DB successfully, but the data doesn't show up.
import sqlite3 as sql
connection = sql.connect("points.db")
cursor = connection.cursor()
cursor.execute("CREATE TABLE IF NOT EXISTS servers (server_id TEXT, name TEXT, exp INTEGER)")
cursor.execute("INSERT INTO servers VALUES ('848117357214040104', 'brknarsy', 20)")
Can you check that your data is inserted?
Something like this in the end:
cursor.execute("SELECT * FROM servers")
r = cursor.fetchall()
for i in r:
    print(i)
Perhaps SQLite browser just needs a refresh
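Another likely cause is that the INSERT is never committed: with sqlite3's default settings, changes made by an INSERT stay in an open transaction until connection.commit() is called, so another program reading the file will not see them. A minimal sketch, assuming a temporary file path instead of the original points.db so it can be run anywhere:

```python
import os
import sqlite3
import tempfile

# Temporary location used only for this sketch.
db_path = os.path.join(tempfile.mkdtemp(), "points.db")

connection = sqlite3.connect(db_path)
cursor = connection.cursor()
cursor.execute(
    "CREATE TABLE IF NOT EXISTS servers (server_id TEXT, name TEXT, exp INTEGER)"
)
cursor.execute(
    "INSERT INTO servers VALUES (?, ?, ?)",
    ("848117357214040104", "brknarsy", 20),
)
connection.commit()  # without this, the row is not written for other readers
connection.close()

# Reopen the file the way an external tool would and read the table back.
check = sqlite3.connect(db_path)
rows = check.execute("SELECT * FROM servers").fetchall()
check.close()
print(rows)
```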
I'm using SQL Server 2014, pandas 0.23.4, sqlalchemy 1.2.11, pyodbc 4.0.24, and Python 3.7.0. I have a very simple stored procedure that performs an UPDATE on a table and then a SELECT on it:
CREATE PROCEDURE my_proc_1
    @v2 INT
AS
BEGIN
    UPDATE my_table_1
    SET v2 = @v2
    ;
    SELECT * from my_table_1
    ;
END
GO
This runs fine in MS SQL Server Management Studio. However, when I try to invoke it via Python using this code:
import pandas as pd
from sqlalchemy import create_engine
if __name__ == "__main__":
    conn_str = 'mssql+pyodbc://@MODEL_TESTING'
    engine = create_engine(conn_str)
    with engine.connect() as conn:
        df = pd.read_sql_query("EXEC my_proc_1 33", conn)
        print(df)
I get the following error:
sqlalchemy.exc.ResourceClosedError: This result object does not return
rows. It has been closed automatically.
(Please let me know if you want full stack trace, I will update if so)
When I remove the UPDATE from the stored proc, the code runs and the results are returned. Note also that selecting from a table other than the one being updated does not make a difference, I get the same error. Any help is much appreciated.
The issue is that the UPDATE statement is returning a row count, which is a scalar value, and the rows returned by the SELECT statement are "stuck" behind the row count where pyodbc cannot "see" them (without additional machinations).
It is considered a best practice to ensure that our stored procedures always start with a SET NOCOUNT ON; statement to suppress the returning of row count values from DML statements (UPDATE, DELETE, etc.) and allow the stored procedure to just return the rows from the SELECT statement.
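If the stored procedure cannot be changed, the "additional machinations" can be sketched as skipping past result sets that carry no rows. This assumes a DB-API cursor that implements the optional nextset() method (pyodbc's cursor does). fetch_first_rowset is a hypothetical helper, and FakeCursor is only a stand-in for a real cursor so the sketch runs without a database.

```python
class FakeCursor:
    """Stand-in for a DB-API cursor holding several result sets.

    Each entry is None (a bare row count, no rows) or a
    (description, rows) pair.
    """

    def __init__(self, result_sets):
        self._sets = list(result_sets)
        self._pos = 0

    @property
    def description(self):
        current = self._sets[self._pos]
        return None if current is None else current[0]

    def nextset(self):
        # Advance to the next result set; False when none is left.
        if self._pos + 1 >= len(self._sets):
            return False
        self._pos += 1
        return True

    def fetchall(self):
        return list(self._sets[self._pos][1])


def fetch_first_rowset(cursor):
    """Skip result sets without rows and return the first real one."""
    while cursor.description is None:
        if not cursor.nextset():
            return []  # no result set with rows was produced
    return cursor.fetchall()


# Simulates "UPDATE row count first, SELECT rows second".
cursor = FakeCursor([None, ((("v1",), ("v2",)), [(1, 33), (2, 33)])])
rows = fetch_first_rowset(cursor)
print(rows)
```

With SET NOCOUNT ON in the procedure, the loop would not need to skip anything, which is why that is usually the simpler fix.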
I got the same error for a different reason: I was using SQLAlchemy's newer select() syntax to fetch the rows of a table and forgot to pass the table class I wanted to select from. Adding the table class as an argument fixed the error.
The code that led to the error:
query = select().where(Assessment.created_by == assessment.created_by)
The fix is simply to add the table class name; sometimes the problem is only in the syntax:
query = select(Assessment).where(Assessment.created_by == assessment.created_by)
Below is the last part of my selenium web scraper that loops through the different tabs of this website page, selects the "export data" button, downloads the data, adds a "yearid" column, then loads the data into a MySQL table.
df = pd.read_csv(desired_filepath)
df["yearid"] = datetime.today().year
df[df.columns[df.columns.str.contains('%')]] = \
(df.filter(regex='%')
.apply(lambda x: pd.to_numeric(x.str.replace(r'[\s%]', '', regex=True),
errors='coerce')))
df.to_csv(desired_filepath)
engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
.format(user="walker",
pw="password",
db="data"))
df.to_sql(con=engine, name='fg_test_hitting_{}'.format(button_text), if_exists='replace')
time.sleep(10)
driver.quit()
Everything works great, but I would like to import the data into the MySQL table and replace only if the yearid=2018. Does anyone know if it is possible to load data and replace given a specific condition? Thanks in advance!
I think that rather than deleting from your table, it may be better to let MySQL handle the replacing. You can do this by creating a temporary table with the new data, running REPLACE INTO the permanent table, and then dropping the temporary table. The big caveat is that you will need to set the keys on your table (ideally only once). I don't know what your key fields are, so it's tough to help in this regard.
Replace the commented line with this:
# df.to_sql(con=engine, name='fg_test_hitting_{}'.format(button_text), if_exists='replace')
conn = engine.connect()
# should fail if temporary table already exists (we want it to fail in this case)
df.to_sql('fg_test_hitting_{}_tmp'.format(button_text), conn)
# Will create the permanent table if it does not already exist (will only matter in the first run)
# note that you may have to create keys here so that mysql knows what constitutes a replacement
conn.execute('CREATE TABLE IF NOT EXISTS fg_test_hitting_{} LIKE fg_test_hitting_{}_tmp;'.format(button_text, button_text))
# updating the permanent table and dropping the temporary table
conn.execute('REPLACE INTO fg_test_hitting_{} (SELECT * FROM fg_test_hitting_{}_tmp);'.format(button_text, button_text))
conn.execute('DROP TABLE IF EXISTS fg_test_hitting_{}_tmp;'.format(button_text))
As described by @Leo in the comments, first delete the part of the data (from the MySQL table) that you are going to update, and then append the new data to the table:
conn = engine.connect()
...
conn.execute('delete from fg_test_hitting_{} where yearid=%s'.format(button_text),
             (datetime.today().year,))
df.to_sql(con=engine, name='fg_test_hitting_{}'.format(button_text), if_exists='append')
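The delete-then-insert idea can be sketched end to end with the standard-library sqlite3 module, so it runs without a MySQL server. Everything here is an assumption for illustration: replace_year is a hypothetical helper, the hitting table and its columns are made up, and the table name is interpolated into the SQL, so it must come from trusted code. Doing both steps in one transaction means a failed insert does not leave the year's rows deleted.

```python
import sqlite3


def replace_year(conn, table, year, rows):
    """Replace all rows of `year` in `table` with `rows`, atomically."""
    with conn:  # one transaction: commit on success, rollback on error
        conn.execute(
            "DELETE FROM {} WHERE yearid = ?".format(table), (year,)
        )
        conn.executemany(
            "INSERT INTO {} (player, avg, yearid) VALUES (?, ?, ?)".format(table),
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hitting (player TEXT, avg REAL, yearid INTEGER)")
conn.executemany(
    "INSERT INTO hitting VALUES (?, ?, ?)",
    [("a", 0.250, 2017), ("a", 0.100, 2018)],
)
conn.commit()

# Only the 2018 rows are replaced; the 2017 row is left untouched.
replace_year(conn, "hitting", 2018, [("a", 0.300, 2018), ("b", 0.280, 2018)])
result = conn.execute(
    "SELECT player, avg, yearid FROM hitting ORDER BY yearid, player"
).fetchall()
print(result)
```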
I'm attempting to use Python with SQLAlchemy to download some data, create a temporary staging table on a Teradata server, and then MERGE that table into another table which I've created to permanently store this data. I'm using sql = sqlalchemy.text(merge) and td_engine.execute(sql), where merge is a string similar to the one below:
MERGE INTO perm_table as p
USING temp_table as t
ON p.Id = t.Id
WHEN MATCHED THEN
UPDATE
SET col1 = t.col1,
col2 = t.col2,
...
col50 = t.col50
WHEN NOT MATCHED THEN
INSERT (col1,
col2,
...
col50)
VALUES (t.col1,
t.col2,
...
t.col50)
The script runs all the way to the end without error and the SQL executes properly through Teradata Studio, but for some reason the table won't update when I execute it through SQLAlchemy. However, I've also run different SQL expressions, like the insert that populated perm_table from the same python script and it worked fine. Maybe there's something specific to the MERGE and SQLAlchemy combo?
Since you're using the engine directly, without using a transaction, you're probably (barring unseen configuration on your part) relying on SQLAlchemy's version of autocommit, which works by detecting data changing operations such as INSERTs etc. Possibly MERGE is not one of the detected operations. Try
sql = sqlalchemy.text(merge).execution_options(autocommit=True)
td_engine.execute(sql)
This question already has answers here:
How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?
(7 answers)
Closed 7 years ago.
I have a python script that is using the Psycopg adapter; I am parsing a JSON Array and inserting into my PostgreSQL database.
for item in data["SchoolJSONData"]:
    mId = item.get("Id")
    mNumofRooms = item.get("NumofRooms")
    mFloors = item.get("Floors")
    con = None
    con = psycopg2.connect("dbname='database' user='dbuser'")
    cur = con.cursor()
    cur.execute('INSERT INTO Schools(Id, NumofRooms, Floors)VALUES(%s, %s, %s)',(mId, mNumofRooms, mFloors))
    con.commit()
Every time I run the script again, I get the following:
psycopg2.IntegrityError: duplicate key value violates unique constraint "schools_pkey"
How can I run the insert script so that it will ignore existing entries in the database?
EDIT: Thanks for the replies all... I am trying to NOT overwrite any data, only ADD (if the PK is not already in the table), and ignore any errors. In my case I will only be adding new entries, never updating data.
There is no single way to solve this problem, and it has little to do with Python. It is a valid exception raised by the database (not just PostgreSQL; every database will do the same).
But you can catch this exception with try/except and continue.
OR
you can first run "select count(*) from Schools where Id = mId" to check that the row does not already exist.
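The first option can be sketched as follows. To stay runnable without a PostgreSQL server, this uses the standard-library sqlite3 module as a stand-in, so the placeholder style is ? rather than psycopg2's %s and the exception is sqlite3.IntegrityError rather than psycopg2.IntegrityError. insert_new_only is a hypothetical helper. The rollback after a failure matters with PostgreSQL in particular, because a failed statement aborts the current transaction until it is rolled back.

```python
import sqlite3


def insert_new_only(con, items):
    """Insert each item; skip those whose primary key already exists.

    Returns the number of rows actually inserted.
    """
    inserted = 0
    cur = con.cursor()
    for item in items:
        try:
            cur.execute(
                "INSERT INTO Schools (Id, NumofRooms, Floors) VALUES (?, ?, ?)",
                (item.get("Id"), item.get("NumofRooms"), item.get("Floors")),
            )
            con.commit()
            inserted += 1
        except sqlite3.IntegrityError:
            # Duplicate key: discard the failed statement and carry on.
            con.rollback()
    return inserted


con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Schools (Id INTEGER PRIMARY KEY, NumofRooms INTEGER, Floors INTEGER)"
)

# Made-up sample data in the shape of the question's JSON items.
items = [
    {"Id": 1, "NumofRooms": 10, "Floors": 2},
    {"Id": 2, "NumofRooms": 20, "Floors": 3},
]
first_run = insert_new_only(con, items)
second_run = insert_new_only(con, items)  # same data again: all duplicates
total = con.execute("SELECT COUNT(*) FROM Schools").fetchone()[0]
print(first_run, second_run, total)
```

Opening the connection once, outside the loop, also avoids creating a new database connection for every item.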