I built a simple class with a couple methods to make my life a little easier when loading data into Postgres with Python. I also attempted to package it so I could pip install it (just to experiment, never done that before).
import io

import pandas as pd
import psycopg2
from sqlalchemy import create_engine


class py_psql:
    def engine(self, username, password, hostname, port, database):
        # Build the SQLAlchemy URL; '@' separates the credentials from the host.
        connection = 'postgresql+psycopg2://{}:{}@{}:{}/{}'.format(
            username.lower(), password, hostname, port, database)
        # After this call the instance attribute shadows the engine() method.
        self.engine = create_engine(connection)

    def query(self, query):
        pg_eng = self.engine
        return pd.read_sql_query(query, pg_eng)

    def write(self, write_name, df, if_exists='replace', index=False):
        mem_size = df.memory_usage().sum() / 1024**2  # DataFrame size in MB
        pg_eng = self.engine

        def write_data():
            # Create the table from the column definitions only (no rows).
            df.head(0).to_sql(write_name, pg_eng, if_exists=if_exists, index=index)
            conn = pg_eng.raw_connection()
            cur = conn.cursor()
            # Stream the rows through an in-memory buffer into COPY.
            output = io.StringIO()
            df.to_csv(output, sep='\t', header=False, index=False)
            output.seek(0)
            cur.copy_from(output, write_name, null="")
            conn.commit()

        if mem_size > 100:
            validate_size = input('DataFrame is {}mb, proceed anyway? (y/n): '.format(mem_size))
            if validate_size == 'y':
                write_data()
            else:
                print("Canceling write to database")
        else:
            write_data()
My package directory looks like this:
py_psql
py_psql.py
__init__.py
setup.py
My __init__.py is empty, since I read elsewhere that this is allowed. I'm not remotely an expert here...
I was able to pip install that package and import it, and if I were to paste this class into a python shell, I would be able to do something like
test = py_psql()
test.engine(ntid, pw, hostname, port, database)
and have it create the sqlalchemy engine. However, when I import it after the pip install I can't even initialize a py_psql object:
>>> test = py_psql()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'module' object is not callable
>>> py_psql.engine(ntid, pw, hostname, port, database)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'py_psql' has no attribute 'engine'
I'm sure I'm messing up something obvious here, but I found the process of packaging fairly confusing while researching this. What am I doing incorrectly?
Are you sure you imported your package correctly after pip install?
For example:
from py_psql.py_psql import py_psql
test = py_psql()
test.engine(ntid, pw, hostname, port, database)
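The "'module' object is not callable" error is what you see when the imported name refers to the package (a module object) rather than to the class defined inside it. Here is a minimal sketch that rebuilds the same shape in a temporary directory. The names demo_psql, tmp_root, and DemoClass are placeholders invented for this demo, not part of the original package.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Build a throwaway package: demo_psql/__init__.py (empty) and
# demo_psql/demo_psql.py containing a class with the same name.
tmp_root = Path(tempfile.mkdtemp())
pkg_dir = tmp_root / "demo_psql"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "demo_psql.py").write_text("class demo_psql:\n    pass\n")

# Make the temporary directory importable for this demo only.
sys.path.insert(0, str(tmp_root))
importlib.invalidate_caches()

import demo_psql as package  # the package itself, a module object
from demo_psql.demo_psql import demo_psql as DemoClass  # the class inside it

print(type(package))  # a module type; calling package() would raise TypeError
obj = DemoClass()     # the class is callable, so this works
```

If you would rather write "from py_psql import py_psql", one common option is to put "from .py_psql import py_psql" in __init__.py so the class is re-exported at package level.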
I have a simple script, shown below. When I try to run it, I get a traceback that I cannot quite figure out. I am running Python 3.4, as required. It reports a syntax error on the import, yet that is the only import in the script. Any help is appreciated.
Traceback (most recent call last):
  File "script.py", line 1, in <module>
    import pymysql
  File "C:\Python34\lib\site-packages\pymysql\__init__.py", line 59, in <module>
    from . import connections # noqa: E402
  File "C:\Python34\lib\site-packages\pymysql\connections.py", line 206
    ):
    ^
SyntaxError: invalid syntax
import pymysql

myConnection = pymysql.connect(host='localhost', user='root', password='******', database='accidents')
cur = myConnection.cursor()
cur.execute('SELECT vtype FROM vehicle_type WHERE vtype LIKE "%otorcycle%";')
cycleList = cur.fetchall()

selectSQL = ('''
    SELECT t.vtype, a.accident_severity
    FROM accidents_2016 AS a
    JOIN vehicles_2016 AS v ON a.accident_index = v.Accident_Index
    JOIN vehicle_type AS t ON v.Vehicle_Type = t.vcode
    WHERE t.vtype LIKE %s
    ORDER BY a.accident_severity;
''')
insertSQL = ('''
    INSERT INTO accident_medians VALUES (%s, %s);
''')

for cycle in cycleList:
    # Pass the parameter as a one-element tuple.
    cur.execute(selectSQL, (cycle[0],))
    accidents = cur.fetchall()
    quotient, remainder = divmod(len(accidents), 2)
    if remainder:
        med_sev = accidents[quotient][1]
    else:
        # Even count: average the two middle rows (indexes quotient-1 and quotient).
        med_sev = (accidents[quotient - 1][1] + accidents[quotient][1]) / 2
    print('Finding median for', cycle[0])
    cur.execute(insertSQL, (cycle[0], med_sev))

myConnection.commit()
myConnection.close()
This is from connections.py, which seems to be the standard installed file.
    def __init__(
        self,
        *,
        user=None,  # The first four arguments is based on DB-API 2.0 recommendation.
        password="",
        host=None,
        database=None,
        unix_socket=None,
        port=0,
        charset="",
        sql_mode=None,
        read_default_file=None,
        conv=None,
        use_unicode=True,
        client_flag=0,
        cursorclass=Cursor,
        init_command=None,
        connect_timeout=10,
        read_default_group=None,
        autocommit=False,
        local_infile=False,
        max_allowed_packet=16 * 1024 * 1024,
        defer_connect=False,
        auth_plugin_map=None,
        read_timeout=None,
        write_timeout=None,
        bind_address=None,
        binary_prefix=False,
        program_name=None,
        server_public_key=None,
        ssl=None,
        ssl_ca=None,
        ssl_cert=None,
        ssl_disabled=None,
        ssl_key=None,
        ssl_verify_cert=None,
        ssl_verify_identity=None,
        compress=None,  # not supported
        named_pipe=None,  # not supported
        passwd=None,  # deprecated
        db=None,  # deprecated
    ):  # line 206
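I had a similar issue when running a script with Python 3.5.2. The likely cause is that the signature quoted in the question ends its keyword-only parameters with a trailing comma, which Python versions before 3.6 do not accept, so older interpreters need an older PyMySQL. I followed the suggestion at https://github.com/miguelgrinberg/microblog/issues/282#issuecomment-776937071 and downgraded pymysql, which solved the problem: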
pip uninstall pymysql
pip install -Iv pymysql==0.9.3
I want to use prepared statements to insert data into a MySQL database (version 5.7) using Python, but I keep getting a NotImplementedError.
I'm following the documentation here: https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursorprepared.html
I'm using Python 2.7 and version 8.0.11 of the mysql-connector-python library:
pip show mysql-connector-python
---
Metadata-Version: 2.1
Name: mysql-connector-python
Version: 8.0.11
Summary: MySQL driver written in Python
Home-page: http://dev.mysql.com/doc/connector-python/en/index.html
This is a cleaned version (no specific hostname, username, password, columns, or tables) of the python script I'm running:
import mysql.connector
from mysql.connector.cursor import MySQLCursorPrepared

connection = mysql.connector.connect(user=username, password=password,
                                     host='sql_server_host',
                                     database='dbname')
print('Connected! getting cursor')
cursor = connection.cursor(cursor_class=MySQLCursorPrepared)
select = "SELECT * FROM table_name WHERE column1 = ?"
param = 'param1'
print('Executing statement')
cursor.execute(select, (param,))
rows = cursor.fetchall()
for row in rows:
    value = row.column1
    print('value: ' + value)
I get this error when I run this:
Traceback (most recent call last):
  File "test.py", line 18, in <module>
    cursor.execute(select, (param,))
  File "/home/user/.local/lib/python2.7/site-packages/mysql/connector/cursor.py", line 1186, in execute
    self._prepared = self._connection.cmd_stmt_prepare(operation)
  File "/home/user/.local/lib/python2.7/site-packages/mysql/connector/abstracts.py", line 969, in cmd_stmt_prepare
    raise NotImplementedError
NotImplementedError
The C extension (CEXT) is enabled by default when it is installed, and at the time of writing it does not support prepared statements.
You can disable CEXT when you connect by adding the keyword argument use_pure=True:
connection = mysql.connector.connect(user=username, password=password,
                                     host='sql_server_host',
                                     database='dbname',
                                     use_pure=True)
Support for prepared statements in CEXT will be included in the upcoming mysql-connector-python 8.0.17 release (according to the MySQL bug report). So once that is available, upgrade to at least 8.0.17 to solve this without needing use_pure=True.
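To make the version rule above concrete, here is a small sketch that decides whether use_pure=True is still needed for a given connector version. The helper names parse_version and needs_use_pure are hypothetical, and the 8.0.17 threshold is taken from the statement above rather than verified independently.

```python
def parse_version(version):
    """Turn '8.0.11' into (8, 0, 11) so versions compare numerically."""
    return tuple(int(part) for part in version.split(".")[:3])


def needs_use_pure(connector_version, fixed_in="8.0.17"):
    """Return True when the connector predates the release that is
    expected (per the answer above) to add prepared statements to CEXT."""
    return parse_version(connector_version) < parse_version(fixed_in)


print(needs_use_pure("8.0.11"))  # True: the version from the question
print(needs_use_pure("8.0.17"))  # False: the release named in the answer
```

The version string could be read from the "pip show mysql-connector-python" output shown in the question, and the result passed as the use_pure argument when connecting.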
I am very new to Python and I can't seem to find an answer to this error. When I run the code below, I get the error
AttributeError: module 'odbc' has no attribute 'connect'
However, the error only appears in Eclipse. There's no problem if I run it from the command line. I am running Python 3.5. What am I doing wrong?
try:
    import pyodbc
except ImportError:
    import odbc as pyodbc

# Specifying the ODBC driver, server name, database, etc. directly
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER=PXLstr,17;DATABASE=Dept_MR;UID=guest;PWD=password')
The suggestion to remove the try...except block did not work for me. Now the actual import is throwing the error as below:
Traceback (most recent call last):
  File "C:\Users\a\workspace\TestPyProject\src\helloworld.py", line 2, in <module>
    import pyodbc
  File "C:\Users\a\AppData\Local\Continuum\Anaconda3\Lib\site-packages\sqlalchemy\dialects\mssql\pyodbc.py", line 105, in <module>
    from .base import MSExecutionContext, MSDialect, VARBINARY
I do have pyodbc installed and the import and connect works fine with the command line on windows.
thank you
The problem here is that the pyodbc module is not importing in your try / except block. I would highly recommend not putting import statements in try blocks. First, you would want to make sure you have pyodbc installed (pip install pyodbc), preferably in a virtualenv, then you can do something like this:
import pyodbc
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER=PXLstr,17;DATABASE=Dept_MR;UID=guest;PWD=password')
cursor = cnxn.cursor()
cursor.execute('SELECT 1')
for row in cursor.fetchall():
    print(row)
If you're running on Windows (it appears so, given the DRIVER= parameter), take a look at virtualenvwrapper-win for managing Windows Python virtual environments: https://pypi.python.org/pypi/virtualenvwrapper-win
Good luck!
Flipper's answer helped establish that the problem was an incorrect library referenced in the External Libraries list in Eclipse. After fixing it, the issue was resolved.
What is the name of your Python file? If you inadvertently named it 'pyodbc.py', you will get that error, because the script tries to import itself instead of the intended pyodbc module.
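Both explanations come down to Python finding a different file than the one you expect for a given module name (in the traceback above, "import pyodbc" resolved to a file inside sqlalchemy). One way to check is to ask the import system where it would load a name from. This is a minimal sketch; module_origin is a hypothetical helper name, and "json" is used only so the demo runs without pyodbc installed.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# Replace "json" with "pyodbc" in the environment that misbehaves
# (for example, inside the Eclipse run configuration).
print(module_origin("json"))
print(module_origin("no_such_module_here"))  # None: nothing by that name is importable
```

Running this in both Eclipse and the command line and comparing the printed paths should show whether the two environments resolve the name to different files.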
Here is a solution: install and use 'pypyodbc' instead of 'pyodbc'.
My tested example is below. Change SERVER_NAME, DATABASE_NAME, and DRIVER to match your setup, and put in your own records. Good luck!
import sys
import pypyodbc as odbc

records = [
    ['x', 'Movie', '2020-01-09', 2020],
    ['y', 'TV Show', None, 2019]
]

DRIVER = 'ODBC Driver 11 for SQL Server'
# Raw strings keep the backslashes in Windows paths literal.
SERVER_NAME = r'(LocalDB)\MSSQLLocalDB'
DATABASE_NAME = r'D:\ASPNET\SHOJA.IR\SHOJA.IR\APP_DATA\DATABASE3.MDF'

# The ODBC keyword for Windows authentication is Trusted_Connection.
conn_string = f"""
    Driver={{{DRIVER}}};
    Server={SERVER_NAME};
    Database={DATABASE_NAME};
    Trusted_Connection=yes;
"""

try:
    conn = odbc.connect(conn_string)
except Exception as e:
    print(e)
    print('task is terminated')
    sys.exit()
else:
    cursor = conn.cursor()

insert_statement = """
    INSERT INTO NetflixMovies
    VALUES (?, ?, ?, ?)
"""

try:
    for record in records:
        print(record)
        cursor.execute(insert_statement, record)
except Exception as e:
    cursor.rollback()
    print(e)  # not every exception has a .value attribute
    print('transaction rolled back')
else:
    print('records inserted successfully')
    cursor.commit()
    cursor.close()
finally:
    if conn.connected == 1:
        print('connection closed')
        conn.close()
I am trying to COPY a CSV file from a folder into a Postgres table using Python and psycopg2, and I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
psycopg2.ProgrammingError: must be superuser to COPY to or from a file
HINT:  Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.
I also tried to run it through the python environment as:
constr = "dbname='db_name' user='user' host='localhost' password='pass'"
conn = psycopg2.connect(constr)
cur = conn.cursor()
sqlstr = "COPY test_2 FROM '/tmp/tmpJopiUG/downloaded_xls.csv' DELIMITER ',' CSV;"
cur.execute(sqlstr)
I still get the above error. I tried the \copy command, but it only works in psql. What is the alternative that would let me run this from my Python script?
EDITED
After having a look at the link provided by @Ilja Everilä, I tried this:
cur.copy_from('/tmp/tmpJopiUG/downloaded_xls.csv', 'test_copy')
I get an error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: argument 1 must have both .read() and .readline() methods
How do I provide an object that has these methods?
Try using cursor.copy_expert():
constr = "dbname='db_name' user='user' host='localhost' password='pass'"
conn = psycopg2.connect(constr)
cur = conn.cursor()
sqlstr = "COPY test_2 FROM STDIN DELIMITER ',' CSV"
with open('/tmp/tmpJopiUG/downloaded_xls.csv') as f:
    cur.copy_expert(sqlstr, f)
conn.commit()
You have to open the file in python and pass it to psycopg, which then forwards it to postgres' stdin. Since you're using the CSV argument to COPY, you have to use the expert version in which you pass the COPY statement yourself.
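You can also use copy_from, as long as the file has no quoted fields or embedded separators, since copy_from uses COPY's text format rather than CSV. Here table_name is the name of the target table:
with open('/tmp/tmpJopiUG/downloaded_xls.csv') as f:
    cur.copy_from(f, table_name, sep=',')
conn.commit()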
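The TypeError in the question says the first argument must have .read() and .readline() methods, which means any file-like object works, not only a file on disk. As a rough sketch, the helper below (rows_to_buffer is a hypothetical name) builds an in-memory CSV buffer with the standard library. The final commented line assumes a live psycopg2 cursor named cur, which is not created here.

```python
import csv
import io


def rows_to_buffer(rows):
    """Write rows as CSV into an in-memory text buffer and rewind it."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    buffer.seek(0)  # COPY reads from the current position, so start at the top
    return buffer


buf = rows_to_buffer([(1, "alpha"), (2, "beta, with comma")])
print(hasattr(buf, "read") and hasattr(buf, "readline"))  # True
print(buf.getvalue())

# With a live cursor, the buffer is passed just like an open file:
# cur.copy_expert("COPY test_2 FROM STDIN DELIMITER ',' CSV", buf)
```

The second row is quoted by the csv module because it contains a comma, which is the kind of field that needs COPY's CSV mode rather than the plain text format.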
Fair warning: I'm a big time noob. Please handle with kid gloves.
Details:
Python 3.2
MySQL 5.5
Tornado webframe installed
pymysql installed
Windows 7
Problem:
I'm following the Tornado documentation on connecting to a mysql database here. I only want to connect to localhost, but I'm getting the following error message:
Traceback (most recent call last):
  File "C:\Python32\DIP3\tornado-test.py", line 5, in <module>
    class Connection(localhost,re_project, user=root, password=mypassword, max_idle_time=25200):
NameError: name 'localhost' is not defined
This is the code I'm trying to run:
import tornado.ioloop
import tornado.web
import pymysql

class Connection(localhost,re_project, user=root, password=mypassword, max_idle_time=25200):
    db = database.Connection("localhost", "re_project")
    for Bogota in db.query("SELECT * FROM cities_copy"):
        print(Bogota.title)
MySQL is currently running when I execute the code, so I don't think that should be a problem. What else could I be doing wrong?
This line:
class Connection(localhost,re_project, user=root, password=mypassword, max_idle_time=25200):
makes no sense at all. You can't define a class like that. Did you mean to use def instead of class?
Okay, I think I understand the problem. In the documentation, the line class tornado.database.Connection(host, database, user=None, password=None, max_idle_time=25200) is part of the documentation and is not meant to be copy/pasted. That describes how to do the db = database.Connection bit.
The green code sample lines should work on their own, as long as 1) the tornado.database module is imported and 2) the db = line is adjusted to pass values appropriate for your database to the Connection method.
So:
from tornado import database  # you can use "import tornado.database" here, but then
                              # you will have to use "tornado.database.Connection()"
                              # instead of "database.Connection()"
db = database.Connection("localhost", "re_project", user="root", password="mypassword")
# I changed "bogota" to lower-case because the convention in Python is for
# only classes, not objects, to have upper-case names.
for bogota in db.query("SELECT * FROM cities_copy"):
    print(bogota.title)
I haven't tested this because I do not have Python 3.2 installed, so let me know if it doesn't work and I'll try to adjust.
You're not actually defining a constructor. Look at this as a template for what you need to do:
from tornado import database


class Connection(object):
    def __init__(self, host, project, user, password, max_idle_time):
        # Wrap the Tornado connection instead of trying to subclass it.
        self.db = database.Connection(
            host, project, user=user, password=password, max_idle_time=max_idle_time)

    def some_other_method(self):
        for bogota in self.db.query("SELECT * FROM cities_copy"):
            print(bogota.title)