Sqlite3 with Python - Query from external file - python

I would like to create my DB by an external file like:
database = "../data/cm4payroll.db"
query = "../data/emdb.sql"
# Datenbankverbindung herstellen
self.connection = sqlite3.connect(self.database)
self.cursor = self.connection.cursor()
# Datenbank erstellen
self.cursor.execute(self.query)
Traceback:
self.cursor.execute(self.query)
sqlite3.OperationalError: near ".": syntax error

You need to read the file contents, and pass it to cursor.executescript() instead:
self.connection = sqlite3.connect(self.database)
self.cursor = self.connection.cursor()
with open(self.query) as queryfile:
self.cursor.executescript(queryfile.read())
Your error shows you were trying to execute the filename as a SQL statement; cursor.execute() can only handle actual SQL strings, not filenames.

Related

How to pass psycopg2 cursor object to foreachPartition() in pyspark?

I'm getting following error
Traceback (most recent call last):
File "/databricks/spark/python/pyspark/serializers.py", line 473, in dumps
return cloudpickle.dumps(obj, pickle_protocol)
File "/databricks/spark/python/pyspark/cloudpickle/cloudpickle_fast.py", line 73, in dumps
cp.dump(obj)
File "/databricks/spark/python/pyspark/cloudpickle/cloudpickle_fast.py", line 563, in dump
return Pickler.dump(self, obj)
TypeError: cannot pickle 'psycopg2.extensions.cursor' object
PicklingError: Could not serialize object: TypeError: cannot pickle 'psycopg2.extensions.cursor' object
while running the below script
def get_connection():
conn_props = brConnect.value
print(conn_props)
#extract value from broadcast variables
database = conn_props.get("database")
user = conn_props.get("user")
pwd = conn_props.get("password")
host = conn_props.get("host")
db_conn = psycopg2.connect(
host = host,
user = user,
password = pwd,
database = database,
port = 5432
)
return db_conn
def process_partition_up(partition, db_cur):
updated_rows = 0
try:
for row in partition:
process_row(row, myq, db_cur)
except Exception as e:
print("Not connected")
return updated_rows
def update_final(df, db_cur):
df.rdd.coalesce(2).foreachPartition(lambda x: process_partition_up(x, db_cur))
def etl_process():
for id in ['003']:
conn = get_connection()
for t in ['email_table']:
query = f'''(select * from public.{t} where id= '{id}') as tab'''
df_updated = load_data(query)
if df_updated.count() > 0:
q1 = insert_ops(df_updated, t) #assume this function returns a insert query
query_props = q1
sc = spark.sparkContext
brConnectQ = sc.broadcast(query_props)
db_conn = get_connection()
db_cur = db_conn.cursor()
update_final(df_updated, db_cur)
conn.commit()
conn.close()
Explanation:
Here etl_process() internally calling get_connection() which returns a psycopg2 connection object. After that it's calling a update_final() which takes dataframe and psycopg2 cursor object as an arguments.
Now update_final() is calling process_partition_up() on each partition(df.rdd.coalesce(2).foreachPartition) which takes dataframe and psycopg2 cursor object as an arguments.
Here after passing psycopg2 cursor object to the process_partition_up(), I'm not getting cursor object rather I'm getting above error.
Can anyone help me out to resolve this error?
Thank you.
I think that you don't understand what's happening here.
You are creating a database connection in your driver(etl_process), and then trying to ship that live connection from the driver, across your network to executor to do the work.(your lambda in foreachPartitions is executed on the executor.)
That is what spark is telling you "cannot pickle 'psycopg2.extensions.cursor'". (It can't serialize your live connection to the database to ship it to an executor.)
You need to call conn = get_connection() from inside process_partition_up this will initialize the connection to the database from inside the executor.(And any other book keeping you need to do.)
FYI: The worst part that I want to call out is that this code will work on your local machine. This is because it's both the executor and the driver.

where to place the db.close to close the mysql connection that inside of a function in python script

i have this psudo code which i want to close the mysql connection after the for loop
but i am getting the below error:
Traceback (most recent call last):
File "./myscript.py", line 201, in <module>
db.close
NameError: name 'db' is not defined
the code looks like the below:
def get_content(id):
db = mysql.connector.connect(host='localhost',user='user',password='password',database='dcname')
#get cursor
cursor = db.cursor()
cursor.execute("select id,num from table where job_db_inx={0}".format(index))
result = cursor.fetchall()
for job in list
id = get_content(id)
print(id)
db.close()
where should i place the db.close to close all the db connections
Consider using a context manager here:
import contextlib
def get_content(id, cursor):
cursor.execute("select id,num from table where job_db_inx={0}".format(index))
result = cursor.fetchall()
with contextlib.closing(mysql.connector.connect(...)) as conn:
cursor = conn.cursor()
for job in list
id = get_content(id, cursor)
print(id)
I used contextlib.closing here, but there's a very good chance that any given db api is already implemented as its own context manager. Python has a standard dbapi that is worth reading.
The actual code should look like: (make sure conn is closed)
def get_content(id,conn):
cursor = conn.cursor()
cursor.execute("select id,num from table where job_db_inx={0}".format(id))
result = cursor.fetchall()
return result
try:
conn = mysql.connector.connect(host='localhost',user='user',password='password',database='dcname')
for job in list
id = get_content(id,conn)
print(id)
finally:
conn.close()

Error when controlling mysql with python class

I am trying to update a mysql database using class system but cannot get the update part to work. It was all ok the old way but I wanted to use the class system with exception error control. Can someone please let me know what I am doing wrong. At the moment for this script I am just trying to send the variable Boileron to the database column office.
import MySQLdb
class DBSolar:
conn = None
def connect(self):
try:
self.conn = MySQLdb.connect("192.xxx.x.x", "exxxxx", "Oxxxx", "hxxxx")
except (MySQLdb.Error, MySQLdb.Warning) as e:
print (e)
self.conn = None
return self.conn
def query(self, sql):
try:
cursor = self.conn.cursor()
cursor.execute(sql)
except (AttributeError, MySQLdb.OperationalError):
self.connect()
cursor = self.conn.cursor()
cursor.execute(sql)
return cursor
def update(self, task):
boilerState = task
try:
sql = "UPDATE dashboard SET office = ? WHERE id = 1", (boilerState)
cursor = self.conn.cursor()
cursor.execute(sql)
except (AttributeError, MySQLdb.OperationalError):
self.connect()
cursor = self.conn.cursor()
cursor.execute(sql)
return
while 1:
BoilerOn = 1
print BoilerOn
dbSolar = DBSolar()
connSolar = dbSolar.connect()
if connSolar:
dbSolar.update(BoilerOn)
Below is the error report from putty console
Traceback (most recent call last):
File "test2.py", line 47, in <module>
dbSolar.update(BoilerOn)
File "test2.py", line 29, in update
cursor.execute(sql)
File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 223, in execute
self.errorhandler(self, TypeError, m)
File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorvalue
TypeError: query() argument 1 must be string or read-only buffer, not tuple
Got this working changed the update to the following
def update(self, task):
try:
cursor = self.conn.cursor()
cursor.execute("UPDATE dashboard SET office = %s WHERE id = 1", [task])
self.conn.commit()
except (MySQLdb.Error, MySQLdb.Warning) as e:
print (e)
self.connect()
#cursor = self.conn.cursor()
#cursor.execute(sql)
#self.connect.commit()
return
In the MySQL Developer document
Since by default Connector/Python turns autocommit off, and MySQL 5.5 and higher uses transactional InnoDB tables by default, it is necessary to commit your changes using the connection's commit() method. You could also roll back using the rollback() method.
You need to add self.conn.commit() after cursor.execute(sql) to commit all the changes.
Or turn on the autocommit which is described in Python mySQL Update, Working but not updating table
Your sql creata a tuple, but cursor.execute(statement, params) is expecting a string for the statement and either a tuple or a dictionary for params.
For MySQL UPDATE, you also need to commit the execution.
Therefore try:
self.conn.cursor.execute("UPDATE dashboard SET office = %s WHERE id = 1", (str(task), ))
self.conn.commit()
I would suggest you read the 'MySQL Connector/Python Development Guide better understating of using MySQL in python.

how can I connect to a postgresql database from external psycopg2.connect?

I would like to connect to a postgresql database using python from a different server.
I triyed this :
conn_string = "host=192.168.1.1 dbname='"+db7+"' user='user' password='"+pw7+"'"
conn = psycopg2.connect(conn_string)
cursor = conn.cursor()
but I get the error:
conn = psycopg2.connect(conn_string)
File "/usr/lib/python2.7/dist-packages/psycopg2/__init__.py", line 179, in connect
connection_factory=connection_factory, async=async)
psycopg2.OperationalError: FATAL: database "database" does not exist
Remove unnecessary quotes in the syntax.
Follow this structure.
conn = psycopg2.connect(host = "localhost",database="ur_database_name", user="db_user", password="your_password")
Example.
conn = psycopg2.connect(host = "localhost",database="studentesdb", user="postgres", password="admin")

Python: MySQLdb LOAD DATA INFILE silently fails

I am attempting to use a Python script to import a csv file into a MySQL database.
It seems to fail silently.
Here is my code:
#!/usr/bin/python
import MySQLdb
class DB:
host = 'localhost'
user = 'root'
password = '**************'
sqldb = 'agriculture'
conn = None
def connect(self):
self.conn = MySQLdb.connect(self.host,self.user,self.password,self.sqldb )
def query(self, sql, params=None):
try:
cursor = self.conn.cursor()
if params is not None:
cursor.execute(sql, params)
else:
cursor.execute(sql)
except (AttributeError, MySQLdb.OperationalError):
self.connect()
cursor = self.conn.cursor()
if params is not None:
cursor.execute(sql, params)
else:
cursor.execute(sql)
print vars(cursor)
return cursor
def load_data_infile(self, f, table, options=""):
sql="""LOAD DATA LOCAL INFILE '%s' INTO TABLE %s FIELDS TERMINATED BY ',';""" % (f,table)
self.query(sql)
db = DB()
pathToFile = "/home/ariggi/722140-93805-sqltest.csv"
table_name = "agriculture.degreedays"
db.load_data_infile(pathToFile, table_name)
In an attempt to debug this situation I am dumping the cursor object to the screen within the "query()" method. Here is the output:
{'_result': None, 'description': None, 'rownumber': 0, 'messages': [],
'_executed': "LOAD DATA LOCAL INFILE
'/home/ariggi/722140-93805-sqltest.csv' INTO TABLE degreedays FIELDS
TERMINATED BY ',';", 'errorhandler': >, 'rowcount': 500L, 'connection': , 'description_flags': None,
'arraysize': 1, '_info': 'Records: 500 Deleted: 0 Skipped: 0
Warnings: 0', 'lastrowid': 0L, '_last_executed': "LOAD DATA LOCAL
INFILE '/home/ariggi/722140-93805-sqltest.csv' INTO TABLE agriculture.degreedays
FIELDS TERMINATED BY ',';", '_warnings': 0, '_rows': ()}
If I take the "_last_executed" query, which is
LOAD DATA LOCAL INFILE '/home/ariggi/722140-93805-sqltest.csv' INTO TABLE agriculture.degreedays FIELDS TERMINATED BY ',';
and run it through the mysql console it works as expected and fills the table with rows. However when I execute this script my database table remains empty.
I am pretty stumped and could use some help.
Try calling db.conn.commit() at the end of your code to make the changes permanent. Python by default does not use the "autocommit" mode, so until you issue a commit the DB module regards your changes as part of an incomplete transaction.
As #AirThomas points out in a comment it helps to us a "context manager" - though I'd say the correct formulation was
with conn.cursor() as curs:
do_something_with(curs)
because this will automatically commit any changes unless the controlled code raises an exception.

Categories

Resources