I am trying to achieve the same thing as in the earlier question psycopg2: How to execute vacuum postgresql query in python script; however, the recommendation there to open an autocommit connection points to a link that is now broken.
The code below runs without error, BUT the table is not vacuumed.
How does this need to be written to run the VACUUM FULL correctly?
#!/usr/bin/python
import psycopg2
from config import config

def connect():
    """ Connect to the PostgreSQL database server """
    conn = None
    try:
        # read connection parameters
        params = config()
        # connect to the PostgreSQL server
        conn = psycopg2.connect(**params)
        conn.autocommit = 1
        # create a cursor
        cur = conn.cursor()
        # execute Vacuum Full
        cur.execute('Vacuum Full netsuite_display')
        # close the communication with the PostgreSQL
        cur.close()
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:
            conn.close()
            print('Database connection closed.')

if __name__ == '__main__':
    connect()
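For reference, my understanding of what the dead link recommended is the sketch below; both forms come from psycopg2 itself (the connection's autocommit attribute, or the ISOLATION_LEVEL_AUTOCOMMIT constant from psycopg2.extensions), since VACUUM cannot run inside a transaction block:
import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT

conn = psycopg2.connect(**params)  # params as returned by config()

# either of these takes the connection out of transaction mode:
conn.autocommit = True
# ...or, equivalently:
conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)

with conn.cursor() as cur:
    cur.execute('VACUUM FULL netsuite_display')
conn.close()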
Let me start off by saying I am extremely new to Python and PostgreSQL, so I feel like I'm in way over my head. My end goal is to connect to the dvdrental database in PostgreSQL and be able to access/manipulate the data. So far I have:
created a .config folder with a database.ini inside it that holds my login credentials;
in my src I have a config.py file that uses ConfigParser, see below:
from configparser import ConfigParser

def config(filename='.config/database.ini', section='postgresql'):
    # create a parser
    parser = ConfigParser()
    # read config file
    parser.read(filename)
    # get section, default to postgresql
    db = {}
    if parser.has_section(section):
        params = parser.items(section)
        for param in params:
            db[param[0]] = param[1]
    else:
        raise Exception('Section {0} not found in the {1} file'.format(section, filename))
    return db
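For reference, the parser above expects a database.ini along these lines (hypothetical values; the key names become the keyword arguments passed to psycopg.connect, so they must be names it accepts, e.g. host, dbname, user, password, port):
[postgresql]
host=localhost
dbname=dvdrental
user=postgres
password=yourPassword
port=5432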
then also in my src I have a tasks.py file that has a basic connect function, see below:
import pandas as pd
from clients.config import config
import psycopg

def connect():
    """ Connect to the PostgreSQL database server """
    conn = None
    try:
        # read connection parameters
        params = config()
        # connect to the PostgreSQL server
        print('Connecting to the PostgreSQL database...')
        conn = psycopg.connect(**params)
        # create a cursor
        cur = conn.cursor()
        # execute a statement
        print('PostgreSQL database version:')
        cur.execute('SELECT version()')
        # display the PostgreSQL database server version
        db_version = cur.fetchone()
        print(db_version)
        # close the communication with the PostgreSQL
        cur.close()
    except (Exception, psycopg.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:
            conn.close()
            print('Database connection closed.')

if __name__ == '__main__':
    connect()
Now this runs and prints out the PostgreSQL database version, which is all well and great, but I'm struggling to figure out how to change the code so that it's more generalized and maybe just creates a cursor.
I need the connect function to basically just connect to the dvdrental database and create a cursor, so that I can then use my connection to select from the database in other needed "tasks"; for example, I'd like to be able to create another function like the one below:
def select_from_table(cursor, table_name, schema):
    cursor.execute(f"SET search_path TO {schema}, public;")
    results = cursor.execute(f"SELECT * FROM {table_name};").fetchall()
    return results
but I'm struggling with how to just create a connection to the dvdrental database and a cursor so that I can actually fetch data and create pandas tables with it and whatnot.
So it would be like:
task 1 is connecting to the database
task 2 is interacting with the database (selecting tables and whatnot)
task 3 is converting the result from 2 into a pandas df
Thanks so much for any help!! This is for a project in a class I am taking, and I am extremely overwhelmed; I have been googling and researching non-stop and seem to end up nowhere fast.
The fact that you established the connection is honestly the hardest step. I know it can be overwhelming, but you're on the right track.
Just copy these three lines from connect into the select_from_table function:
params = config()
conn = psycopg.connect(**params)
cursor = conn.cursor()
It will look like this (the cursor parameter is gone since the function now creates its own cursor; also added conn.close() at the end):
def select_from_table(table_name, schema):
    params = config()
    conn = psycopg.connect(**params)
    cursor = conn.cursor()
    cursor.execute(f"SET search_path TO {schema}, public;")
    results = cursor.execute(f"SELECT * FROM {table_name};").fetchall()
    conn.close()
    return results
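For task 3, here is a minimal sketch of a variant (assuming the same config/psycopg setup as above; select_into_df is a hypothetical name) that also captures column names so the rows drop straight into a pandas DataFrame:
import pandas as pd
import psycopg
from clients.config import config

def select_into_df(table_name, schema):
    params = config()
    # context managers close the cursor/connection even if a query fails
    with psycopg.connect(**params) as conn:
        with conn.cursor() as cur:
            cur.execute(f"SET search_path TO {schema}, public;")
            cur.execute(f"SELECT * FROM {table_name};")
            columns = [col.name for col in cur.description]
            rows = cur.fetchall()
    return pd.DataFrame(rows, columns=columns)

# e.g. against the dvdrental sample database:
# df = select_into_df("film", "public")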
I am writing a Python script that will read data from a SQL Server database. For this I have used pyodbc to connect to SQL Server on Windows (my driver is ODBC Driver 17 for SQL Server).
My script works fine, but I need to use a connection pool instead of a single connection to manage resources more effectively. However, the documentation for pyodbc only mentions pooling without providing examples of how connection pooling can be implemented. Any ideas how this can be done in Python while connecting to SQL Server? I have only found solutions for PostgreSQL that use psycopg2, which obviously does not work for me.
At the moment my code looks like this:
import sys
import time

import pyodbc

def get_limited_rows(size):
    conn = None  # so the finally block works even if connect() fails
    try:
        server = 'here-is-IP-address-of-server'
        database = 'here-is-my-db-name'
        username = 'here-is-my-username'
        password = 'here-is-my-password'
        conn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+password)
        cursor = conn.cursor()
        print('Connected to database')
        select_query = 'select APPN, APPD from MAIN'
        cursor.execute(select_query)
        while True:
            records = cursor.fetchmany(size)
            if not records:
                cursor.close()
                sys.exit("Completed")
            else:
                for record in records:
                    print(record)
                time.sleep(10)
    except pyodbc.Error as error:
        print('Error reading data from table', error)
    finally:
        if conn:
            conn.close()
            print('Database connection closed')
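To illustrate what the pyodbc documentation means by pooling: ODBC pooling is implemented by the driver manager, and pyodbc only exposes a module-level switch for it. A minimal sketch (connection-string values are placeholders; the flag must be set before the first connection, and it defaults to True):
import pyodbc

# ODBC connection pooling is handled by the driver manager, not by pyodbc;
# this flag must be set before any connection is opened (default: True)
pyodbc.pooling = True

conn_str = ('DRIVER={ODBC Driver 17 for SQL Server};'
            'SERVER=server;DATABASE=db;UID=user;PWD=password')

# with pooling on, close() returns the connection to the driver manager's
# pool, and the next connect() with the same string reuses it
conn = pyodbc.connect(conn_str)
conn.close()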
I have tried a lot, but I am unable to copy data available as a JSON file in an S3 bucket (I have read-only access to the bucket) to a Redshift table using Python. Below is the Python code I am using to copy the data. Using the same code I was able to create the tables into which I am trying to copy.
import configparser
import psycopg2
from sql_queries import create_table_queries, drop_table_queries

def drop_tables(cur, conn):
    for query in drop_table_queries:
        cur.execute(query)
        conn.commit()

def create_tables(cur, conn):
    for query in create_table_queries:
        cur.execute(query)
        conn.commit()

def main():
    try:
        config = configparser.ConfigParser()
        config.read('dwh.cfg')

        # conn = psycopg2.connect("host={} dbname={} user={} password={} port={}".format(*config['CLUSTER'].values()))
        conn = psycopg2.connect(
            host=config.get('CLUSTER', 'HOST'),
            database=config.get('CLUSTER', 'DB_NAME'),
            user=config.get('CLUSTER', 'DB_USER'),
            password=config.get('CLUSTER', 'DB_PASSWORD'),
            port=config.get('CLUSTER', 'DB_PORT')
        )
        cur = conn.cursor()

        # drop_tables(cur, conn)
        # create_tables(cur, conn)

        qry = """copy DWH_STAGE_SONGS_TBL
                 from 's3://udacity-dend/song-data/A/A/A/TRAAACN128F9355673.json'
                 iam_role 'arn:aws:iam::xxxxxxx:role/MyRedShiftRole'
                 format as json 'auto';"""
        print(qry)
        cur.execute(qry)

        # execute a statement
        # print('PostgreSQL database version:')
        # cur.execute('SELECT version()')
        #
        # # display the PostgreSQL database server version
        # db_version = cur.fetchone()
        # print(db_version)

        print("Executed successfully")
        cur.close()
        conn.close()
        # close the communication with the PostgreSQL
    except Exception as error:
        print("Error while processing")
        print(error)

if __name__ == "__main__":
    main()
I don't see any error in the PyCharm console, but I see an Aborted status in the Redshift query console, and I can't see any reason why it was aborted (or I don't know where to look for that).
Another thing I have noticed: when I run the COPY statement in the Redshift query editor, it runs fine and the data gets moved into the table. I tried to delete and recreate the cluster, but no luck. I am not able to figure out what I am doing wrong. Thank you.
Quick read - it looks like you haven't committed the transaction, so the COPY is rolled back when the connection closes. You need to either change the connection configuration to be in "autocommit" mode or add an explicit commit().
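A minimal sketch of both options against the code in the question (qry is the COPY statement built there; connection values are placeholders):
import psycopg2

conn = psycopg2.connect(host='...', database='...', user='...', password='...', port='...')

# Option 1: enable autocommit so the COPY persists as soon as it runs
conn.autocommit = True

cur = conn.cursor()
cur.execute(qry)  # the COPY statement from the question

# Option 2: keep the default transactional mode and commit explicitly instead
conn.commit()

cur.close()
conn.close()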
I prepared two VM instances with Compute Engine on GCP.
ServerA: data processing, which reads/writes SQL (MySQL) on ServerB.
ServerB: SQL server (f1-micro; this is not Cloud SQL, but a normal VM instance).
I am trying to connect over SSH from A to B in order to read/write the DB on ServerB with the code below.
The error:
ERROR | Problem setting SSH Forwarder up: Couldn't open tunnel localhost:3306 <> localhost:3306 might be in use or destination not reachable
sshtunnel.HandlerSSHTunnelForwarderError: An error occurred while opening tunnels.
from sshtunnel import SSHTunnelForwarder
import mysql.connector

# snippet from inside a function; SSH_PKEY_PATH, SSH_USER, MYSQL_* and
# sqlList are defined elsewhere
# SSH connection
with SSHTunnelForwarder(
    ('PublicIP of ServerA', 22),
    ssh_pkey=SSH_PKEY_PATH,
    ssh_username=SSH_USER,
    remote_bind_address=('localhost', 3306),
    local_bind_address=('localhost', 3306)
) as ssh:
    try:
        # DB connection
        connection = mysql.connector.connect(
            host='localhost',
            port=3306,
            user=MYSQL_USER,
            passwd=MYSQL_PASS,
            db=MYSQL_DB,
            charset='utf8'
        )
        # print(connection.is_connected())
        # get cursor
        cur = connection.cursor()
        sql = "use dbname"
        cur.execute(sql)
        for i in range(len(sqlList)):
            print("DB Access:" + str(sqlList[i]))
            sql = str(sqlList[i])
            # sql = 'create table test (id int, content varchar(32))'
            cur.execute(sql)
            sqlOUTPUT = cur.fetchall()
            # rows = cur.fetchall()
            # for row in rows:
            #     print(row)
    except mysql.connector.Error as err:
        print("Something went wrong: {}".format(err))
        connection.rollback()
        raise err
    finally:
        # close cursor
        cur.close()
        # commit
        connection.commit()
        # close DB connection
        connection.close()
    return sqlOUTPUT
But at "local_bind_address=('localhost', MYSQL_PORT)", an error occurs, even though the same code with the same private key goes through from the shell on ServerB and in my local VSCode environment.
I don't understand why it works from the shell and VSCode but not on GCE.
Any help?
You might be able to debug this further and rule out any issues with sshtunnel if you create the tunnel outside of the script from the client VM, with:
$ gcloud compute ssh server-a --zone=your-zone --ssh-flag='-NL 3306:127.0.0.1:3306' &
Then attempt a connection with:
$ mysql -h 127.0.0.1
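If the local port is the culprit (the error says localhost:3306 "might be in use", which happens e.g. when a local MySQL already owns that port), a variant worth trying is binding the tunnel to a free local port. A sketch under that assumption, reusing the question's variables (note the tunnel should target the host that actually runs MySQL, which per the description is ServerB):
from sshtunnel import SSHTunnelForwarder
import mysql.connector

with SSHTunnelForwarder(
    ('PublicIP of ServerB', 22),             # the host running MySQL
    ssh_pkey=SSH_PKEY_PATH,
    ssh_username=SSH_USER,
    remote_bind_address=('localhost', 3306),
    local_bind_address=('localhost', 3307)   # a free local port instead of 3306
) as tunnel:
    connection = mysql.connector.connect(
        host='localhost',
        port=tunnel.local_bind_port,  # 3307 here
        user=MYSQL_USER,
        passwd=MYSQL_PASS,
        db=MYSQL_DB,
    )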
BRIEF DESCRIPTION OF THE PROBLEM: Psycopg2 won't connect to a test DB within docker from unittest, but connects fine from the console.
ERROR MESSAGE:
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally before or while processing the request.
DETAILS:
I'm trying to set up a testing database in docker that will be created and populated before testing and then removed afterwards.
Here's the function that sets up the database:
import docker
from docker.errors import NotFound

def set_up_empty_test_db():
    client = docker.from_env()
    try:
        testdb = client.containers.get("TestDB")
        testdb.stop()
        testdb.remove()
        testdb = client.containers.run(
            "postgres",
            ports={5432: 5433},
            detach=True,
            name="TestDB",
            environment=["POSTGRES_PASSWORD=yourPassword"],
        )
    except NotFound:
        testdb = client.containers.run(
            "postgres",
            ports={5432: 5433},
            detach=True,
            name="TestDB",
            environment=["POSTGRES_PASSWORD=yourPassword"],
        )
    while testdb.status != "running":
        testdb = client.containers.get("TestDB")
    return
If I launch this function from the console, it works without an error and I can see the TestDB container running. I can successfully initiate a connection:
conn = psycopg2.connect("host='localhost' user='postgres' password='yourPassword' port='5433'")
But it doesn't work when unit testing. Here's the testing code:
class TestCreateCity(unittest.TestCase):
    def setUp(self):
        set_up_empty_test_db()
        con = psycopg2.connect("host='localhost' user='postgres' password='yourPassword' port='5433'")
        self.assertIsNone(con.isolation_level)
        cur = con.cursor()
        sql_file = open(os.path.join(ROOT_DIR + "/ddl/creates/schema_y.sql"), "r")
        cur.execute(sql_file.readline())
        con.commit()
        con.close()
        self.session = Session(creator=con)

    def test_create_city(self):
        cs.create_city("Test_CITY", "US")
        q = self.session.query(City).filter(City.city_name == "Test_CITY").one()
        self.assertIs(q)
        self.assertEqual(q.city_name, "Test_CITY")
        self.assertEqual(q.country_code, "US")
It breaks when trying to initiate the connection. Please advise.
I know this is an old question, but I needed to do the same thing today. You try to connect to the postgres server too quickly after starting it; that's why it works from the console (by the time you type the connect command by hand, the server has finished starting).
All you need to do is replace:
set_up_empty_test_db()
con = psycopg2.connect("host='localhost' user='postgres' password='yourPassword' port='5433'")
with:
set_up_empty_test_db()
con = None
while con is None:
    try:
        con = psycopg2.connect("host='localhost' user='postgres' password='yourPassword' port='5433'")
    except psycopg2.OperationalError:
        time.sleep(0.5)  # needs "import time" at the top of the test file
Hope this helps someone else. Cheers!
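One refinement worth considering: cap the wait, so a container that never comes up fails the test instead of hanging forever. A sketch (the 30-second deadline is an arbitrary choice):
import time
import psycopg2

deadline = time.monotonic() + 30  # give the container up to 30s to come up
con = None
while con is None:
    try:
        con = psycopg2.connect("host='localhost' user='postgres' password='yourPassword' port='5433'")
    except psycopg2.OperationalError:
        if time.monotonic() > deadline:
            raise
        time.sleep(0.5)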