Insert pandas dataframe to mysql using sqlalchemy


I am simply trying to write a pandas dataframe to a local MySQL database on Ubuntu.
from sqlalchemy import create_engine
import tushare as ts
df = ts.get_tick_data('600848', date='2014-12-22')
engine = create_engine('mysql://user:passwd@127.0.0.1/db_name?charset=utf8')
df.to_sql('tick_data',engine, flavor = 'mysql', if_exists= 'append')
and it raises the following error:
biggreyhairboy@ubuntu:~/git/python/fjb$ python tushareDB.py
Error on sql SHOW TABLES LIKE 'tick_data'
Traceback (most recent call last):
File "tushareDB.py", line 13, in <module>
df.to_sql('tick_data', con = engine,flavor ='mysql', if_exists= 'append')
File "/usr/lib/python2.7/dist-packages/pandas/core/frame.py", line 1261, in to_sql
self, name, con, flavor=flavor, if_exists=if_exists, **kwargs)
File "/usr/lib/python2.7/dist-packages/pandas/io/sql.py", line 207, in write_frame
exists = table_exists(name, con, flavor)
File "/usr/lib/python2.7/dist-packages/pandas/io/sql.py", line 275, in table_exists
return len(tquery(query, con)) > 0
File "/usr/lib/python2.7/dist-packages/pandas/io/sql.py", line 90, in tquery
cur = execute(sql, con, cur=cur)
File "/usr/lib/python2.7/dist-packages/pandas/io/sql.py", line 53, in execute
con.rollback()
AttributeError: 'Engine' object has no attribute 'rollback'
The dataframe is not empty, and the database exists but has no tables. I have tried creating a table from Python with MySQLdb, and that works fine.
A related question:
Writing to MySQL database with pandas using SQLAlchemy, to_sql
but it does not explain the actual cause.

You appear to be using an older version of pandas. I did a quick git bisect to find the version of pandas where line 53 contains con.rollback(), and found pandas at v0.12, which is before SQLAlchemy support was added to the execute function.
If you're stuck on this version of pandas, you'll need to use a raw DBAPI connection:
df.to_sql('tick_data', engine.raw_connection(), flavor='mysql', if_exists='append')
Otherwise, update pandas and use the engine as you intend to. Note that you don't need to use the flavor parameter when using SQLAlchemy:
df.to_sql('tick_data', engine, if_exists='append')
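As a rough illustration of the second form, here is a minimal sketch that passes an Engine straight to to_sql. It assumes a recent pandas with SQLAlchemy 1.4 or later, and it uses an in-memory SQLite database as a stand-in for the MySQL server so it can run anywhere. The sample rows are made up and are not real tushare output.

```python
import pandas as pd
from sqlalchemy import create_engine, text

# Assumption: in-memory SQLite stands in for the MySQL server in the question.
engine = create_engine("sqlite://")

# Hypothetical tick data; the real frame would come from ts.get_tick_data(...).
df = pd.DataFrame(
    {
        "time": ["09:30:00", "09:30:03"],
        "price": [10.1, 10.2],
        "volume": [100, 250],
    }
)

# The Engine is passed directly and no `flavor` argument is given.
# The first call creates the table, the second appends to it.
df.to_sql("tick_data", engine, if_exists="append", index=False)
df.to_sql("tick_data", engine, if_exists="append", index=False)

# Count the stored rows to confirm that both writes landed.
with engine.connect() as conn:
    row_count = conn.execute(text("SELECT COUNT(*) FROM tick_data")).scalar()

print(row_count)
```

For MySQL only the URL given to create_engine should need to change, assuming a MySQL driver is installed.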

Related

Writing data into Snowflake table using Python

I am trying to read data from Excel into a pandas dataframe and then write the dataframe to a Snowflake table. The code is below.
The connection is established and the Excel read works fine, but the write to the Snowflake table fails with the error below. I would appreciate help resolving it.
snowflake.connector.errors.MissingDependencyError: Missing optional dependency: pandas
Process finished with exit code 1
import pandas as pd
from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL
from snowflake.connector.pandas_tools import pd_writer
url = URL(
    account = '',
    user = '',
    schema = 'TMP',
    database = 'TMP',
    warehouse= 'DATABRICKS',
    role = '',
    authenticator='externalbrowser',
)
engine = create_engine(url)
con = engine.connect()
df = pd.read_excel("C:\\Final.xlsx")
df.columns = df.columns.astype(str)
table_name = 'test_connect'
if_exists = 'replace'
df.to_sql(name=table_name.lower(), con=con,index= False, if_exists=if_exists, method=pd_writer)
Detailed Error info below
Traceback (most recent call last):
File "C:\Users\XYZ\AppData\Roaming\JetBrains\DataSpell2022.2\scratches\scratch.py", line 32, in <module>
df.to_sql(name=table_name.lower(), con=con,index= False, if_exists=if_exists, method=pd_writer)
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\pandas\core\generic.py", line 2963, in to_sql
return sql.to_sql(
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\pandas\io\sql.py", line 697, in to_sql
return pandas_sql.to_sql(
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\pandas\io\sql.py", line 1739, in to_sql
total_inserted = sql_engine.insert_records(
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\pandas\io\sql.py", line 1322, in insert_records
return table.insert(chunksize=chunksize, method=method)
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\pandas\io\sql.py", line 950, in insert
num_inserted = exec_insert(conn, keys, chunk_iter)
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\snowflake\connector\pandas_tools.py", line 320, in pd_writer
df = pandas.DataFrame(data_iter, columns=keys)
File "C:\Users\XYZ\AppData\Roaming\Python\Python310\site-packages\snowflake\connector\options.py", line 36, in __getattr__
raise MissingDependencyError(self._dep_name)
snowflake.connector.errors.MissingDependencyError: Missing optional dependency: pandas
Process finished with exit code 1

I believe the following dependency install step has not been completed: https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#installation
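One quick way to check whether that install step is the problem is sketched below. It assumes the connector's pandas support needs both pandas and pyarrow to be importable, which is what the `snowflake-connector-python[pandas]` extra on the linked page is meant to provide. The helper name missing_modules is hypothetical and is not part of the connector.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Assumption: these are the two modules the pandas extra is expected to provide.
missing = missing_modules(["pandas", "pyarrow"])

if missing:
    print("Not importable:", ", ".join(missing))
    print('Try: pip install "snowflake-connector-python[pandas]"')
else:
    print("pandas and pyarrow are both importable")
```

Run it with the same interpreter that runs the failing script, since the traceback shows packages installed under a per-user site-packages directory.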

SQLAlchemy Reflection Failing in Python 3.6 with PyMySQL

Howdie do,
I'm attempting to use Python 3.6 with SQLAlchemy. I am able to connect to the database, but all reflection attempts are failing:
Traceback (most recent call last):
File "/Users/jw1050/Python/projects/label_automation/generate.py", line 14, in <module>
metadata.reflect(engine, only=['parcel', 'order', 'address', 'document'])
File "/Users/jw1050/.virtualenvs/psd_label_automatiion/lib/python3.6/site-packages/sqlalchemy/sql/schema.py", line 3874, in reflect
bind.engine.table_names(schema, connection=conn))
File "/Users/jw1050/.virtualenvs/psd_label_automatiion/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 2128, in table_names
return self.dialect.get_table_names(conn, schema)
File "<string>", line 2, in get_table_names
File "/Users/jw1050/.virtualenvs/psd_label_automatiion/lib/python3.6/site-packages/sqlalchemy/engine/reflection.py", line 42, in cache
return fn(self, con, *args, **kw)
File "/Users/jw1050/.virtualenvs/psd_label_automatiion/lib/python3.6/site-packages/sqlalchemy/dialects/mysql/base.py", line 1756, in get_table_names
self.identifier_preparer.quote_identifier(current_schema))
File "/Users/jw1050/.virtualenvs/psd_label_automatiion/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py", line 2888, in quote_identifier
self._escape_identifier(value) + \
File "/Users/jw1050/.virtualenvs/psd_label_automatiion/lib/python3.6/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 78, in _escape_identifier
value = value.replace(self.escape_quote, self.escape_to_quote)
TypeError: a bytes-like object is required, not 'str'
My connection information is below:
engine = create_engine('mysql+pymysql://:127.0.0.1:3306/(db_name)?charset=utf8&use_unicode=0')
session = scoped_session(sessionmaker(bind=engine))()
metadata = MetaData()
metadata.reflect(engine, only=['parcel', 'order', 'address', 'document'])
Base = automap_base(metadata=metadata)
Base.prepare()
Everything works fine in Python 2, but I do not want to use Python 2 here. Has anybody else run into this issue and been able to resolve it?

I've figured this out.
The cause was use_unicode=0 in my connection string.
According to the SQLAlchemy docs, this setting should never be used in Python 3. In Python 2 it gives superior performance, but not in Python 3.
Hopefully this helps someone.
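The last frame of the traceback can be reproduced without a database. This is a minimal sketch, assuming that with use_unicode=0 the driver hands the schema name back as bytes instead of str. The value b"db_name" is a placeholder.

```python
# Assumption: this is what the driver returns for the current schema
# when use_unicode=0 is set under Python 3.
schema_name = b"db_name"

# The identifier escaping in the traceback calls value.replace() with str
# arguments, which bytes objects reject.
try:
    schema_name.replace("`", "``")
    message = None
except TypeError as exc:
    message = str(exc)

print(message)

# With the flag removed the value arrives as str and the same call works.
escaped = schema_name.decode("utf-8").replace("`", "``")
print(escaped)
```

Dropping use_unicode=0 from the connection string means the escaping code receives str values, which is what it expects under Python 3.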

can't use pony orm on sqlite3 blob fields

Just trying some basic exercises with pony ORM (and python3.5, sqlite3).
I just want to print a select query of some data I have without further processing to start with. Pony orm does not seem to like that at all....
The sqlite db dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE sums (t text, path BLOB, name BLOB, sum text, primary key (path,name));
INSERT INTO "sums" VALUES('directory','','','');
INSERT INTO "sums" VALUES('file','','sums-backup-f.db','6859b35f9f026317c5df48932f9f2a91');
INSERT INTO "sums" VALUES('file','','md5-tree.py','c7af81d4aad9d00e88db7af950c264c2');
INSERT INTO "sums" VALUES('file','','test.db','a403e9b46e54d6ece851881a895b1953');
INSERT INTO "sums" VALUES('file','','sirius-alexa.db','22a20434cec550a83c675acd849002fa');
INSERT INTO "sums" VALUES('file','','sums-reseau-y.db','1021614f692b5d7bdeef2a45b6b1af5b');
INSERT INTO "sums" VALUES('file','','.md5-tree.py.swp','1c3c195b679e99ef18b3d46044f6e6c5');
INSERT INTO "sums" VALUES('file','','compare-md5.py','cfb4a5b3c7c4e62346aa5e1affef210a');
INSERT INTO "sums" VALUES('file','','charles.local.db','9c50689e8185e5a79fd9077c14636405');
COMMIT;
Here is the code I am trying to run in the Python 3.5 interactive shell:
from pony.orm import *
db = Database()
class File(db.Entity):
    _table_ = 'sums'
    t = Required(str)
    path = Required(bytes)
    name = Required(bytes)
    sum = Required(str)
    PrimaryKey(path, name)
db.bind('sqlite','/some/edited/path/test.db')
db.generate_mapping()
File.select().show()
And it fails like this :
Traceback (most recent call last):
File "/usr/lib/python3.5/site-packages/pony/orm/core.py", line 5149, in _fetch
try: result = cache.query_results[query_key]
KeyError: (('f', 0, ()), (<pony.orm.ormtypes.SetType object at 0x7fd2d2701708>,), False, None, None, None, False, False, False, ())
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 2, in show
File "/usr/lib/python3.5/site-packages/pony/utils/utils.py", line 75, in cut_traceback
raise exc # Set "pony.options.CUT_TRACEBACK = False" to see full traceback
File "/usr/lib/python3.5/site-packages/pony/utils/utils.py", line 60, in cut_traceback
try: return func(*args, **kwargs)
File "/usr/lib/python3.5/site-packages/pony/orm/core.py", line 5256, in show
query._fetch().show(width)
File "/usr/lib/python3.5/site-packages/pony/orm/core.py", line 5155, in _fetch
used_attrs=translator.get_used_attrs())
File "/usr/lib/python3.5/site-packages/pony/orm/core.py", line 3859, in _fetch_objects
real_entity_subclass, pkval, avdict = entity._parse_row_(row, attr_offsets)
File "/usr/lib/python3.5/site-packages/pony/orm/core.py", line 3889, in _parse_row_
avdict[attr] = attr.parse_value(row, offsets)
File "/usr/lib/python3.5/site-packages/pony/orm/core.py", line 1922, in parse_value
val = attr.validate(row[offset], None, attr.entity, from_db=True)
File "/usr/lib/python3.5/site-packages/pony/orm/core.py", line 2218, in validate
val = Attribute.validate(attr, val, obj, entity, from_db)
File "/usr/lib/python3.5/site-packages/pony/orm/core.py", line 1894, in validate
if from_db: return converter.sql2py(val)
File "/usr/lib/python3.5/site-packages/pony/orm/dbapiprovider.py", line 619, in sql2py
if not isinstance(val, buffer): val = buffer(val)
TypeError: string argument without an encoding
Am I using this wrong, or is this a bug? I don't mind filing a bug, but it's the first time I'm using this ORM, so I thought it might be better to check first.

SQLite has a (mis)feature that allows a column to store an arbitrary value regardless of the column type. Instead of a rigid data type, each SQLite column has an affinity, while each value has a storage class, which can differ from row to row within the same column. For example, you can store a text value in an integer column, and vice versa. See Datatypes In SQLite Version 3 for more information.
The reason for the error is that the table contains values of the "wrong" type in its BLOB columns. A correct SQLite binary literal looks like x'abcdef'. The INSERT commands that you use insert UTF-8 strings instead.
This problem was somewhat fixed in the latest version of Pony, which you can get from GitHub. Now, if Pony receives a string value from a BLOB column, it just keeps that value without throwing an exception.
If you populate the table with Pony, it will write BLOB data as correct binary values, so it can read them back later without any problem.
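The storage-class mismatch can be checked with the standard sqlite3 module alone. This sketch recreates the sums table from the question in memory, inserts one row with a plain string literal and one with an x'...' binary literal, and asks SQLite for typeof() of each value. The shortened checksum is a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sums (t text, path BLOB, name BLOB, sum text, "
    "primary key (path, name))"
)

# String literals, as in the dump from the question: stored as TEXT
# even though the columns are declared BLOB.
conn.execute("INSERT INTO sums VALUES ('file', '', 'test.db', 'a403')")

# Binary literals: x'746573742e6462' is the hex encoding of the bytes b'test.db'.
conn.execute("INSERT INTO sums VALUES ('file', x'', x'746573742e6462', 'a403')")

rows = conn.execute(
    "SELECT typeof(name), name FROM sums ORDER BY rowid"
).fetchall()
print(rows)
conn.close()
```

The first row comes back as a Python str and the second as bytes. Both inserts succeed despite the primary key because SQLite never treats a TEXT value and a BLOB value as equal.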

Replacing Bad Data in Pandas Data Frame

In Python 2.7, I'm connecting to an external data source using the following:
import pypyodbc
import pandas as pd
import datetime
import csv
import boto3
import os
# Connect to the DataSource
conn = pypyodbc.connect("DSN = FAKE DATA SOURCE; UID=FAKEID; PWD=FAKEPASSWORD")
# Specify the query we're going to run on it
script = ("SELECT * FROM table")
# Create a dataframe from the above query
df = pd.read_sql_query(script, conn)
I get the following error:
C:\Python27\python.exe "C:/Thing.py"
Traceback (most recent call last):
File "C:/Thing.py", line 30, in <module>
df = pd.read_sql_query(script,conn)
File "C:\Python27\lib\site-packages\pandas-0.18.1-py2.7-win32.egg\pandas\io\sql.py", line 431, in read_sql_query
parse_dates=parse_dates, chunksize=chunksize)
File "C:\Python27\lib\site-packages\pandas-0.18.1-py2.7-win32.egg\pandas\io\sql.py", line 1608, in read_query
data = self._fetchall_as_list(cursor)
File "C:\Python27\lib\site-packages\pandas-0.18.1-py2.7-win32.egg\pandas\io\sql.py", line 1617, in _fetchall_as_list
result = cur.fetchall()
File "build\bdist.win32\egg\pypyodbc.py", line 1819, in fetchall
File "build\bdist.win32\egg\pypyodbc.py", line 1871, in fetchone
ValueError: could not convert string to float: ?
It seems to me that one of the float columns contains a '?' symbol for some reason. I've reached out to the owner of the data source, but they cannot change the underlying table.
Is there a way to replace incorrect data like this using pandas? I've tried using replace after the read_sql_query statement, but I get the same error.

Hard to know for certain without having your data obviously, but you could try setting coerce_float to False, i.e. replace your last line with
df = pd.read_sql_query(script, conn, coerce_float=False)
See the documentation of read_sql_query.
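If the rows do reach pandas, the stray '?' can then be turned into NaN. This is a minimal sketch that uses an in-memory SQLite table as a stand-in for the ODBC source. The table and column names are hypothetical. It only covers the clean-up after loading, so it will not help if the driver itself raises while fetching, as the traceback suggests may be happening.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite table stands in for the external data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, value)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 1.5), (2, "?"), (3, 2.0)],
)

# Load without float coercion so the mixed column arrives as plain objects.
df = pd.read_sql_query("SELECT * FROM readings", conn, coerce_float=False)

# Convert to numbers; anything unparseable, such as '?', becomes NaN.
df["value"] = pd.to_numeric(df["value"], errors="coerce")
print(df)
conn.close()
```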

select a single column from Mysql DB using sqlalchemy

How do I get values from a single column using sqlalchemy?
In MySQL
select id from request r where r.product_id = 1;
In Python
request = meta.tables['request']
request.select(request.c.product_id==1).execute().rowcount
27L
>>> request.select([request.c.id]).where(request.c.product_id==1).execute()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "build/bdist.freebsd-6.3-RELEASE-i386/egg/sqlalchemy/sql/expression.py", line 2616, in select
File "build/bdist.freebsd-6.3-RELEASE-i386/egg/sqlalchemy/sql/expression.py", line 305, in select
File "build/bdist.freebsd-6.3-RELEASE-i386/egg/sqlalchemy/sql/expression.py", line 5196, in __init__
File "build/bdist.freebsd-6.3-RELEASE-i386/egg/sqlalchemy/sql/expression.py", line 1517, in _literal_as_text
sqlalchemy.exc.ArgumentError: SQL expression object or string expected.

I found the answer: I have to use the general select() function rather than the table's select() method.
Leaving this here in case more folks find it useful.
conn = engine.connect()
stmt = select([request.c.id]).where(request.c.product_id==1)
conn.execute(stmt).rowcount
27L
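For newer releases, the same idea can be sketched as follows. This assumes SQLAlchemy 1.4 or later, where select() takes columns as positional arguments instead of a list, and it uses an in-memory SQLite database with made-up rows in place of the MySQL request table.

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, select

# Assumption: in-memory SQLite stands in for the MySQL database.
engine = create_engine("sqlite://")
meta = MetaData()
request = Table(
    "request",
    meta,
    Column("id", Integer, primary_key=True),
    Column("product_id", Integer),
)
meta.create_all(engine)

# Hypothetical sample rows.
with engine.begin() as conn:
    conn.execute(
        request.insert(),
        [
            {"id": 1, "product_id": 1},
            {"id": 2, "product_id": 2},
            {"id": 3, "product_id": 1},
        ],
    )

# General select(): only the id column, filtered by product_id.
stmt = (
    select(request.c.id)
    .where(request.c.product_id == 1)
    .order_by(request.c.id)
)

with engine.connect() as conn:
    ids = [row[0] for row in conn.execute(stmt)]

print(ids)
```

Collecting the rows and counting them in Python avoids relying on rowcount for a SELECT, which not every driver reports.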
