Psycopg2 relation db does not exist - python

I recently started using Macbook because my laptop was changed at work and right after that I started having problems with some of my code that I use to upload a dataframe to a postgresql database.
import psycopg2
from io import StringIO
def create_connection(user, password):
    return psycopg2.connect(
        host='HOST',
        database='DBNAME',
        user=user,
        password=password)
conn = create_connection(user,password)
table = "data_analytics.tbl_summary_wingmans_rt"
buffer = StringIO()
df.to_csv(buffer, header=False, index=False)
buffer.seek(0)
cursor = conn.cursor()
cursor.copy_from(buffer, table, sep=",", null="")
conn.commit()
cursor.close()
As you can see, the code is quite simple, and before the change of equipment it ran without any major problems on Windows. But as soon as I run this same code on the Mac it throws the following error:
Error: relation "data_analytics.tbl_summary_wingmans_rt" does not exist
In several posts I saw that it could be the use of double quotes but I have already used the following and I still do not have a positive result.
"data_analytics."tbl_summary_wingmans_rt""
""data_analytics"."tbl_summary_wingmans_rt""
'data_analytics."tbl_summary_wingmans_rt"'

The behaviour of copy_from changed in psycopg2 2.9 to properly quote the table name, which means that you can no longer supply a schema-qualified table name that way; you have to use copy_expert instead.
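For example, a minimal sketch of the copy_expert route, reusing the buffer from the question (the COPY options match the CSV that df.to_csv produced):
buffer.seek(0)
cursor = conn.cursor()
# psycopg2 >= 2.9: put the schema-qualified name inside the COPY statement yourself
copy_sql = "COPY data_analytics.tbl_summary_wingmans_rt FROM STDIN WITH (FORMAT csv, NULL '')"
cursor.copy_expert(copy_sql, buffer)
conn.commit()
cursor.close()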

You now have to separate the schema and the table and quote each one before sending the name to the Postgres parser.
When you send "data_analytics.tbl_summary_wingmans_rt" it is treated as a single identifier and cannot be parsed;
use '"data_analytics"."tbl_summary_wingmans_rt"' so that PostgreSQL parses it as "schema"."table".
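Note that passing the pre-quoted name to copy_from like this only worked with psycopg2 versions before 2.9, where the table name was interpolated into the COPY statement as-is; a minimal sketch:
# psycopg2 < 2.9 only: the quoted identifier is passed through verbatim
cursor.copy_from(buffer, '"data_analytics"."tbl_summary_wingmans_rt"', sep=",", null="")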

Related

How to access old .mdb file using python pyodbc [duplicate]

Can someone point me in the right direction on how to open a .mdb file in python? I normally like including some code to start off a discussion, but I don't know where to start. I work with mysql a fair bit with python. I was wondering if there is a way to work with .mdb files in a similar way?
Below is some code I wrote for another SO question.
It requires the 3rd-party pyodbc module.
This very simple example will connect to a table and export the results to a file.
Feel free to expand upon your question with any more specific needs you might have.
import csv, pyodbc
# set up some constants
MDB = 'c:/path/to/my.mdb'
DRV = '{Microsoft Access Driver (*.mdb)}'
PWD = 'pw'
# connect to db
con = pyodbc.connect('DRIVER={};DBQ={};PWD={}'.format(DRV,MDB,PWD))
cur = con.cursor()
# run a query and get the results
SQL = 'SELECT * FROM mytable;' # your query goes here
rows = cur.execute(SQL).fetchall()
cur.close()
con.close()
# you could change the mode from 'w' to 'a' (append) for any subsequent queries
with open('mytable.csv', 'w') as fou:
    csv_writer = csv.writer(fou)  # default field-delimiter is ","
    csv_writer.writerows(rows)
There's the meza library by Reuben Cummings which can read Microsoft Access databases through mdbtools.
Installation
# The mdbtools package for Python deals with MongoDB, not MS Access.
# So install the package through `apt` if you're on Debian/Ubuntu
$ sudo apt install mdbtools
$ pip install meza
Usage
>>> from meza import io
>>> records = io.read('database.mdb') # only file path, no file objects
>>> print(next(records))
Table1
Table2
…
This looks similar to a previous question:
What do I need to read Microsoft Access databases using Python?
http://code.activestate.com/recipes/528868-extraction-and-manipulation-class-for-microsoft-ac/
Answer there should be useful.
For a solution that works on any platform that can run Java, consider using Jython or JayDeBeApi along with the UCanAccess JDBC driver. For details, see the related question
Read an Access database in Python on non-Windows platform (Linux or Mac)
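A minimal sketch of the JayDeBeApi route, assuming a JVM is available and the UCanAccess JARs (plus the dependencies shipped in its lib/ folder) have been downloaded; all paths and version numbers below are placeholders:
import jaydebeapi

# UCanAccess JDBC driver; JAR locations and the database path are placeholders
ucanaccess_jars = [
    "/opt/ucanaccess/ucanaccess-5.0.1.jar",
    "/opt/ucanaccess/lib/jackcess-3.0.1.jar",
    "/opt/ucanaccess/lib/hsqldb-2.5.0.jar",
    "/opt/ucanaccess/lib/commons-lang3-3.8.1.jar",
    "/opt/ucanaccess/lib/commons-logging-1.2.jar",
]
conn = jaydebeapi.connect(
    "net.ucanaccess.jdbc.UcanaccessDriver",
    "jdbc:ucanaccess:///path/to/my.mdb",
    ["", ""],            # user / password (blank for most .mdb files)
    ucanaccess_jars,
)
cur = conn.cursor()
cur.execute("SELECT * FROM mytable")
print(cur.fetchall())
conn.close()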
In addition to bernie's response, I would add that it is possible to recover the schema of the database. The code below lists the tables (b[2] contains the name of the table).
con = pyodbc.connect('DRIVER={};DBQ={};PWD={}'.format(DRV, MDB, PWD))
cur = con.cursor()
tables = list(cur.tables())
print('tables')
for b in tables:
    print(b)
The code below lists all the columns from all the tables:
colDesc = list(cur.columns())
This code will convert all the tables to CSV (it assumes the pandas_access package, imported below as mdb). Happy coding!
import pandas_access as mdb

for tbl in mdb.list_tables("file_name.MDB"):
    df = mdb.read_table("file_name.MDB", tbl)
    df.to_csv(tbl + '.csv')

Pyodbc Connection to Access, creating table with Pandas to_sql(method='multi') throwing error

I've installed sqlalchemy-access so that I'm able to use pandas and pyodbc to query my Access DBs.
The issue is, it's incredibly slow because it does single-row inserts. Another post suggested I use method='multi', and while it seems to work for whoever asked that question, it throws a CompileError for me.
CompileError: The 'access' dialect with current database version settings does not support in-place multirow inserts.
AttributeError: 'CompileError' object has no attribute 'orig'
import pandas as pd
import pyodbc
import urllib
from sqlalchemy import create_engine
connection_string = (
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    rf"DBQ={accessDB};"
    r"ExtendedAnsiSQL=1;"
)
connection_uri = f"access+pyodbc:///?odbc_connect={urllib.parse.quote_plus(connection_string)}"
engine = create_engine(connection_uri)
conn = engine.connect()
# Read in tableau SuperStore data
dfSS = pd.read_excel(ssData)
dfSS.to_sql('SuperStore', conn, index=False, method='multi')
Access SQL doesn't support multi-row inserts, so to_sql will never be able to support them either. That other post is probably using SQLite.
Instead, you can write the data frame to CSV and insert the CSV with a query.
Or, of course, skip reading the Excel file in Python altogether and just insert the Excel file by query. That will always be much faster, because Access can read the data directly instead of Python reading it and then transmitting it.
E.g.
INSERT INTO SuperStore
SELECT * FROM [Sheet1$] IN "C:\Path\To\File.xlsx"'Excel 12.0 Macro;HDR=Yes'
You should be able to execute this using pyodbc without needing to involve sqlalchemy. Do note the double- and single-quote combination; it can be a bit painful to embed in other programming languages.
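A minimal sketch of running that statement straight through pyodbc, reusing the connection string from the question (accessDB is the path to the .accdb file; the Excel path is illustrative):
import pyodbc

conn = pyodbc.connect(
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    rf"DBQ={accessDB};"
    r"ExtendedAnsiSQL=1;"
)
sql = """INSERT INTO SuperStore
SELECT * FROM [Sheet1$] IN "C:\\Path\\To\\File.xlsx"'Excel 12.0 Macro;HDR=Yes'"""
cur = conn.cursor()
cur.execute(sql)
conn.commit()
conn.close()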

sqlite file is shown empty in python and r

I am trying to open a .sqlite3 file in Python but no information is returned. So I tried R and the tables still come back empty. I would like to know what tables are in this file.
I used the following code for python:
import sqlite3
from sqlite3 import Error
def create_connection(db_file):
    """ create a database connection to the SQLite database
        specified by the db_file
    :param db_file: database file
    :return: Connection object or None
    """
    try:
        conn = sqlite3.connect(db_file)
        return conn
    except Error as e:
        print(e)
    return None
database = "D:\\...\assignee.sqlite3"
conn = create_connection(database)
cur = conn.cursor()
rows = cur.fetchall()
but rows are empty!
This is where I got the assignee.sqlite3 from:
https://github.com/funginstitute/downloads
I also tried RStudio, below is the code and results:
> con <- dbConnect(drv=RSQLite::SQLite(), dbname="D:/.../assignee")
> tables <- dbListTables(con)
But this is what I get
First, make sure you provided the correct path in your connection string to the SQLite db,
e.g. conn = sqlite3.connect(r"C:\users\guest\desktop\example.db")
Also make sure you are using the same SQLite library in the unit tests and the production code.
Check the types of SQLite connection strings and determine which one your db belongs to:
Basic
Data Source=c:\mydb.db;Version=3;
Version 2 is not supported by this class library.
In-memory database
An SQLite database is normally stored on disk, but the database can also be stored in memory. Read more about SQLite in-memory databases.
Data Source=:memory:;Version=3;New=True;
Using UTF16
Data Source=c:\mydb.db;Version=3;UseUTF16Encoding=True;
With password
Data Source=c:\mydb.db;Version=3;Password=myPassword;
So make sure you wrote the proper connection string for your SQLite db.
If you still cannot see the tables, check whether the disk containing /tmp is full; otherwise the database might be encrypted, or locked and in use by another application. You can confirm that with one of the many GUI tools for SQLite: download one (Windows, Mac and Linux versions are available), navigate directly to where your db exists, and it will give you an indication of the problem.
Good luck
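Once the connection itself succeeds, a quick way to confirm whether the file actually contains any tables is to query sqlite_master; a minimal sketch (the path is a placeholder):
import sqlite3

conn = sqlite3.connect(r"D:\path\to\assignee.sqlite3")  # placeholder path
cur = conn.cursor()
# sqlite_master holds one row per table, index, view and trigger
cur.execute("SELECT name FROM sqlite_master WHERE type='table';")
print(cur.fetchall())  # an empty list means the file really has no tables
conn.close()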

Python 3.6 connection to MS SQL Server for large dataframe

I am a new Python coder and also a new data scientist so please forgive any foolish sounding things here. I'll keep the details out unless anyone's curious but basically I need to connect to Microsoft SQL Server and upload a Pandas DF that is relatively large (~500k rows) and I need to do this almost every day as the project currently stands.
It doesn't have to be a Pandas DF - I've read about using odo for csv files but I haven't been able to get anything to work. The issue I'm having is that I can't bulk insert the DF because the file isn't on the same machine as the SQL Server instance. I'm consistently getting errors like the following:
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC SQL
Server Driver][SQL Server]Incorrect syntax near the keyword 'IF'.
(156) (SQLExecDirectW)")
As I've attempted different SQL statements you can replace IF with whatever has been the first COL_NAME in the CREATE statement. I'm using SQLAlchemy to create the engine and connect to the database. This may go without saying but the pd.to_sql() method is just way too slow for how much data I'm moving so that's why I need something faster.
I'm using Python 3.6 by the way. I've put down here most of the things that I've tried that haven't been successful.
import pandas as pd
from sqlalchemy import create_engine
import numpy as np
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 1)), columns=['test_col'])
address = 'mssql+pyodbc://uid:pw@server/path/database?driver=SQL Server'
engine = create_engine(address)
connection = engine.raw_connection()
cursor = connection.cursor()
# Attempt 1 <- This failed to even create a table at the cursor_execute statement so my issues could be way in the beginning here but I know that I have a connection to the SQL Server because I can use pd.to_sql() to create tables successfully (just incredibly slowly for my tables of interest)
create_statement = """
DROP TABLE test_table
CREATE TABLE test_table (test_col)
"""
cursor.execute(create_statement)
test_insert = '''
INSERT INTO test_table
(test_col)
values ('abs');
'''
cursor.execute(test_insert)
# Attempt 2 <- From an iabdb WordPress blog post I came across
def chunker(seq, size):
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))
records = [str(tuple(x)) for x in take_rates.values]
insert_ = """
INSERT INTO test_table
("A")
VALUES
"""
for batch in chunker(records, 2):  # This would be set to 1000 in practice I hope
    print(batch)
    rows = str(batch).strip('[]')
    print(rows)
    insert_rows = insert_ + rows
    print(insert_rows)
    cursor.execute(insert_rows)
    # conn.commit()  # don't know when I would need to commit
conn.close()
# Attempt 3 # From a related Stack Exchange Post
# create the table but first drop if it already exists
command = """DROP TABLE IF EXISTS test_table
CREATE TABLE test_table # these columns are from my real dataset
"Serial Number" serial primary key,
"Dealer Code" text,
"FSHIP_DT" timestamp without time zone,
;"""
cursor.execute(command)
connection.commit()
# stream the data using 'to_csv' and StringIO(); then use sql's 'copy_from' function
output = io.StringIO()
# ignore the index
take_rates.to_csv(output, sep='~', header=False, index=False)
# jump to start of stream
output.seek(0)
contents = output.getvalue()
cur = connection.cursor()
# null values become ''
cur.copy_from(output, 'Config_Take_Rates_TEST', null="")
connection.commit()
cur.close()
It seems to me that MS SQL Server is just not a nice Database to play around with...
I want to apologize for the rough formatting - I've been at this script for weeks now but just finally decided to try to organize something for StackOverflow. Thank you very much for any help anyone can offer!
If you only need to replace the existing table, truncate it and use the bcp utility to upload the table. It's much faster.
from subprocess import call

command = "TRUNCATE TABLE test_table"  # run this against the database before loading (see the sketch below)
take_rates.to_csv('take_rates.csv', sep='\t', index=False)
bcp_cmd = 'bcp {t} in {f} -S {s} -U {u} -P {p} -d {db} -c -t "{sep}" -r "{nl}" -e {e}'.format(
    t='test_table', f='take_rates.csv', s=server, u=user, p=password,
    db=database, sep='\t', nl='\n', e='bcp_errors.log')  # error-log file name is illustrative
call(bcp_cmd, shell=True)
You will need to install bcp utility (yum install mssql-tools on CentOS/RedHat).
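Note that the TRUNCATE statement above still has to be executed against the database (bcp only does the load); a minimal sketch with pyodbc, with placeholder connection details:
import pyodbc

# placeholder driver name and connection details
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    f"SERVER={server};DATABASE={database};UID={user};PWD={password}"
)
cur = conn.cursor()
cur.execute("TRUNCATE TABLE test_table")
conn.commit()
cur.close()
conn.close()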
'DROP TABLE IF EXISTS test_table' is not valid T-SQL on older SQL Server versions (the IF EXISTS clause was only added in SQL Server 2016).
You can do something like this instead:
if (object_id('test_table') is not null)
    DROP TABLE test_table

How to use the DB2 LOAD utility using the python ibm_db driver

LOAD is a DB2 utility that I would like to use to insert data into a table from a CSV file. How can I do this in Python using the ibm_db driver? I don't see anything in the docs here
CMD: LOAD FROM xyz OF del INSERT INTO FOOBAR
Running this as standard SQL fails as expected:
Transaction couldn't be completed: [IBM][CLI Driver][DB2/LINUXX8664] SQL0104N An unexpected token "LOAD FROM xyz OF del" was found following "BEGIN-OF-STATEMENT". Expected tokens may include: "<space>". SQLSTATE=42601 SQLCODE=-104
Using the db2 CLP directly (i.e. os.system('db2 -f /path/to/script.file')) is not an option as DB2 sits on a different machine that I don't have SSH access to.
EDIT:
Using the ADMIN_CMD utility also doesn't work because the file being loaded cannot be put on the database server due to firewall. For now, I've switched to using INSERT
LOAD is an IBM command line processor command, not an SQL command. As such, it isn't available through the ibm_db module.
The most typical way to do this would be to load the CSV data into Python (either all the rows or in batches if it is too large for memory) then use a bulk insert to insert many rows at once into the database.
To perform a bulk insert you can use the execute_many method.
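A minimal sketch of that approach, assuming the module-level execute_many function in the ibm_db driver; the connection string, file, table and column names are placeholders based on the question:
import csv
import ibm_db

# placeholder connection details
conn = ibm_db.connect(
    "DATABASE=MYDB;HOSTNAME=myhost;PORT=50000;PROTOCOL=TCPIP;UID=user;PWD=secret;", "", ""
)

# read the CSV into a tuple of row tuples (batch this for very large files)
with open("xyz.csv", newline="") as f:
    rows = tuple(tuple(r) for r in csv.reader(f))

# prepared INSERT matching the CSV's column count (two columns assumed here)
stmt = ibm_db.prepare(conn, "INSERT INTO FOOBAR (COL1, COL2) VALUES (?, ?)")
ibm_db.execute_many(stmt, rows)

ibm_db.close(conn)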
You could CALL the ADMIN_CMD procedure. ADMIN_CMD has support for both LOAD and IMPORT. Note that both commands require the loaded/imported file to be on the database server.
The example is taken from the DB2 Knowledge Center:
CALL SYSPROC.ADMIN_CMD('load from staff.del of del replace
keepdictionary into SAMPLE.STAFF statistics use profile
data buffer 8')
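The same call can be issued from Python like any other statement; a minimal sketch using ibm_db's exec_immediate (it assumes an existing ibm_db connection handle conn):
import ibm_db

# assumes an existing ibm_db connection handle `conn`
load_cmd = ("CALL SYSPROC.ADMIN_CMD('load from staff.del of del replace "
            "keepdictionary into SAMPLE.STAFF statistics use profile "
            "data buffer 8')")
ibm_db.exec_immediate(conn, load_cmd)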
CSV to DB2 with Python
Briefly: One solution is to use an SQLAlchemy adapter and Db2’s External Tables.
SQLAlchemy:
The Engine is the starting point for any SQLAlchemy application. It’s “home base” for the actual database and its DBAPI, delivered to the SQLAlchemy application through a connection pool and a Dialect, which describes how to talk to a specific kind of database/DBAPI combination.
Where above, an Engine references both a Dialect and a Pool, which together interpret the DBAPI’s module functions as well as the behavior of the database.
Creating an engine is just a matter of issuing a single call, create_engine():
dialect+driver://username:password@host:port/database
Where dialect is a database name such as mysql, oracle, postgresql, etc., and driver the name of a DBAPI, such as psycopg2, pyodbc, cx_oracle, etc.
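For Db2 with the ibm_db driver the URL therefore looks like this (credentials, host and port are placeholders; this typically also needs the ibm_db_sa dialect package installed):
from sqlalchemy import create_engine

# Db2 dialect + ibm_db DBAPI; all values are placeholders
engine = create_engine("db2+ibm_db://user:password@hostname:50000/SAMPLE")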
Load data by using transient external table:
Transient external tables (TETs) provide a way to define an external table that exists only for the duration of a single query.
TETs have the same capabilities and limitations as normal external tables. A special feature of a TET is that you do not need to define the table schema when you use the TET to load data into a table or when you create the TET as the target of a SELECT statement.
Following is the syntax for a TET:
INSERT INTO <table> SELECT <column_list | *>
FROM EXTERNAL 'filename' [(table_schema_definition)]
[USING (external_table_options)];
CREATE EXTERNAL TABLE 'filename' [USING (external_table_options)]
AS select_statement;
SELECT <column_list | *> FROM EXTERNAL 'filename' (table_schema_definition)
[USING (external_table_options)];
For information about the values that you can specify for the external_table_options variable, see External table options.
General example
Insert data from a transient external table into the database table on the Db2 server by issuing the following command:
INSERT INTO EMPLOYEE SELECT * FROM external '/tmp/employee.dat' USING (delimiter ',' MAXERRORS 10 SOCKETBUFSIZE 30000 REMOTESOURCE 'JDBC' LOGDIR '/logs' )
Requirements
pip install ibm-db
pip install SQLAlchemy
Python code
The example below shows how it all works together.
from sqlalchemy import create_engine

usr = "enter_username"
pwd = "enter_password"
hst = "enter_host"
prt = "enter_port"
db = "enter_db_name"

# SQLAlchemy URL
conn_params = "db2+ibm_db://{0}:{1}@{2}:{3}/{4}".format(usr, pwd, hst, prt, db)
schema = "enter_name_restore_schema"
table = "enter_name_restore_table"
destination = "/path/to/csv/file_name.csv"

try:
    print("Connecting to DB...")
    engine = create_engine(conn_params)
    engine.connect()  # optional, output: DB2/linux...
    print("Successfully Connected!")
except Exception as e:
    print("Unable to connect to the server.")
    print(str(e))

external = """INSERT INTO {0}.{1} SELECT * FROM EXTERNAL '{2}' USING (CCSID 1208 DELIMITER ',' REMOTESOURCE LZ4 NOLOG TRUE )""".format(
    schema, table, destination
)

try:
    print("Restoring data to the server...")
    engine.execute(external)
    print("Data restored successfully.")
except Exception as e:
    print("Unable to restore.")
    print(str(e))
Conclusion
A great solution for restoring large files; a 600 MB file, in particular, worked without any problems.
It is also useful for copying data from one table/database to another: the backup is done as a CSV export, and that CSV is then loaded into Db2 with the example given above.
The SQLAlchemy engine can be combined with other databases such as sqlite, mysql, postgresql, oracle, mssql, etc.
