Let me start off by saying I am extremely new to Python and Postgresql so I feel like I'm in way over my head. My end goal is to get connected to the dvdrental database in postgresql and be able to access/manipulate the data. So far I have:
created a .config folder and a database.ini is within there with my login credentials.
In my src folder I have a config.py file that uses ConfigParser, see below:
from configparser import ConfigParser

def config(filename='.config/database.ini', section='postgresql'):
    # create a parser
    parser = ConfigParser()
    # read config file
    parser.read(filename)
    # get section, default to postgresql
    db = {}
    if parser.has_section(section):
        params = parser.items(section)
        for param in params:
            db[param[0]] = param[1]
    else:
        raise Exception('Section {0} not found in the {1} file'.format(section, filename))
    return db
then also in my src I have a tasks.py file that has a basic connect function, see below:
import pandas as pd
from clients.config import config
import psycopg

def connect():
    """ Connect to the PostgreSQL database server """
    conn = None
    try:
        # read connection parameters
        params = config()
        # connect to the PostgreSQL server
        print('Connecting to the PostgreSQL database...')
        conn = psycopg.connect(**params)
        # create a cursor
        cur = conn.cursor()
        # execute a statement
        print('PostgreSQL database version:')
        cur.execute('SELECT version()')
        # display the PostgreSQL database server version
        db_version = cur.fetchone()
        print(db_version)
        # close the communication with the PostgreSQL
        cur.close()
    except (Exception, psycopg.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:
            conn.close()
            print('Database connection closed.')

if __name__ == '__main__':
    connect()
Now this runs and prints out the Postgresql database version which is all well & great but I'm struggling to figure out how to change the code so that it's more generalized and maybe just creates a cursor?
I need the connect function to basically just connect to the dvdrental database and create a cursor so that I can then use my connection to select from the database in other needed "tasks" -- for example I'd like to be able to create another function like the below:
def select_from_table(cursor, table_name, schema):
    cursor.execute(f"SET search_path TO {schema}, public;")
    results = cursor.execute(f"SELECT * FROM {table_name};").fetchall()
    return results
but I'm struggling with how to just create a connection to the dvdrental database & a cursor so that I'm able to actually fetch data and create pandas tables with it and whatnot.
so it would be like
task 1 is connecting to the database
task 2 is interacting with the database (selecting tables and whatnot)
task 3 is converting the result from 2 into a pandas df
thanks so much for any help!! This is for a project in a class I am taking and I am extremely overwhelmed and have been googling-researching non-stop and seemingly end up nowhere fast.
The fact that you established the connection is honestly the hardest step. I know it can be overwhelming but you're on the right track.
Just copy these three lines from connect into the select_from_table method
params = config()
conn = psycopg.connect(**params)
cursor = conn.cursor()
It will look like this (also added conn.close() at the end):
def select_from_table(table_name, schema):
    # the cursor is created here, so it is no longer a parameter
    params = config()
    conn = psycopg.connect(**params)
    cursor = conn.cursor()
    cursor.execute(f"SET search_path TO {schema}, public;")
    results = cursor.execute(f"SELECT * FROM {table_name};").fetchall()
    conn.close()
    return results
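An alternative to reconnecting inside every task is to open the connection once and pass it to each task function. The sketch below shows that split under some assumptions: it uses the standard-library sqlite3 driver as a stand-in so it runs without a server, and the name fetch_table and the demo film table are made up for illustration. Against PostgreSQL the connection would come from psycopg.connect(**config()) instead.

```python
import sqlite3  # stand-in DB-API driver; swap for psycopg against PostgreSQL


def fetch_table(conn, table_name):
    """Task 2: return (column_names, rows) for one table."""
    # Table names cannot be bound as query parameters, so only accept
    # plain identifiers before interpolating into the SQL text.
    if not table_name.isidentifier():
        raise ValueError(f"unexpected table name: {table_name!r}")
    cur = conn.cursor()
    try:
        cur.execute(f"SELECT * FROM {table_name}")
        # Each description entry starts with the column name.
        columns = [col[0] for col in cur.description]
        rows = cur.fetchall()
    finally:
        cur.close()
    return columns, rows


# Task 1: connect once (in-memory demo database with hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (film_id INTEGER, title TEXT)")
conn.execute("INSERT INTO film VALUES (1, 'Academy Dinosaur')")

columns, rows = fetch_table(conn, "film")
conn.close()
print(columns, rows)
```

For task 3, the two return values can be handed to pandas as pd.DataFrame(rows, columns=columns).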
I'd like to create a new database and new tables in it using SQLAlchemy, but I have to call create_engine twice. Is there an easier way to do this?
My code is here:
from sqlalchemy import create_engine
import pymysql
user = 'admin'
password = ''
host = 'database-1.czswegfdjhpn.us-east-1.rds.amazonaws.com'
port = 3306
engine = create_engine(url="mysql+pymysql://{0}:{1}@{2}:{3}".format(
    user, password, host, port))
conn = engine.connect()
print('connect successfull')
conn.execute("commit")
conn.execute('create database covid_19')
database = 'covid_19'
engine1 = create_engine(url="mysql+pymysql://{0}:{1}@{2}:{3}/{4}".format(
    user, password, host, port, database))
conn1 = engine1.connect()
print('connect successfull')
df.to_sql(name="covid_19_world_cases_deaths_testing",con=engine1, if_exists='append', index=False, chunksize=200)
I haven't tested this, but could you simply call
conn.execute("USE covid_19")
to change the database of the existing connection?
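If two engines are still needed, the duplication can at least be limited to one small helper that builds both URLs. This is only a sketch using the standard library; the function name mysql_url and the sample credentials are hypothetical. The resulting strings would be passed to create_engine as in the question.

```python
from urllib.parse import quote_plus


def mysql_url(user, password, host, port, database=None):
    """Build a mysql+pymysql URL; the database part is optional."""
    # quote_plus escapes characters such as '@' or '/' in the credentials,
    # which would otherwise be misread as URL separators.
    url = f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}"
    if database:
        url += f"/{database}"
    return url


# Hypothetical credentials, chosen to show the escaping.
server_url = mysql_url("admin", "p@ss/word", "localhost", 3306)
db_url = mysql_url("admin", "p@ss/word", "localhost", 3306, "covid_19")
print(server_url)
print(db_url)
```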
I have a sql file generated during database backup process and I want to load all database content from that sql file to a different MySQL database (secondary database).
I have created a python function to load the whole database in that sql file but when I execute the function, I get an error
'str' object is not callable
Below is python script
def load_database_dump_to_secondary_mysql(file_path='db_backup_file.sql'):
    query = f'source {file_path}'
    try:
        connection = mysql_hook.get_conn()  # connection to secondary db
        cursor = connection.cursor(query)
        print('LOAD TO MYSQL COMPLETE')
    except Exception as xerror:
        print("LOAD ERROR: ", xerror)
NB: mysql_hook is an airflow connector that contains MySQL DB connection info such as Host, user/passwd, Database name. Also, I don't have connection to the primary database, I'm only receiving sql dump file.
What am I missing?
source is a client builtin command: https://dev.mysql.com/doc/refman/8.0/en/mysql-commands.html
It's not an SQL query that MySQL's SQL parser understands.
So you can't execute source using cursor.execute(), because that goes directly to the dynamic SQL interface.
You must run it using the MySQL command-line client as a subprocess:
subprocess.run(['mysql', '-e', f'source {file_path}'])
You might need other options to the mysql client, such as user, password, host, etc.
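A variation on the same idea is to feed the dump file to the client on standard input, which is what mysql dbname < file.sql does in a shell. The sketch below assumes the mysql client is on the PATH and that the password comes from an option file such as ~/.my.cnf; the function names, host and database name are made up for illustration.

```python
import subprocess


def build_mysql_command(database, host="localhost", user=None, port=None):
    """Return the argv list for the mysql command-line client."""
    cmd = ["mysql", f"--host={host}"]
    if user:
        cmd.append(f"--user={user}")
    if port:
        cmd.append(f"--port={port}")
    cmd.append(database)
    return cmd


def load_dump(file_path, database, **options):
    """Feed a dump file to the mysql client on standard input."""
    with open(file_path, "rb") as dump:
        # check=True raises CalledProcessError if the client exits non-zero.
        subprocess.run(build_mysql_command(database, **options),
                       stdin=dump, check=True)


# Only builds the command here; load_dump() would actually run it.
command = build_mysql_command("secondary_db", host="db.example.com", user="loader")
print(command)
```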
Try this:
import mysql.connector as m
# database which you want to backup
db = 'geeksforgeeks'
connection = m.connect(host='localhost', user='root',
                       password='123', database=db)
cursor = connection.cursor()
# Getting all the table names
cursor.execute('SHOW TABLES;')
table_names = []
for record in cursor.fetchall():
    table_names.append(record[0])
backup_dbname = db + '_backup'
try:
    cursor.execute(f'CREATE DATABASE {backup_dbname}')
except:
    pass
cursor.execute(f'USE {backup_dbname}')
for table_name in table_names:
    cursor.execute(
        f'CREATE TABLE {table_name} SELECT * FROM {db}.{table_name}')
I need to create a db in MySQL using SQLAlchemy, I am able to connect to a db if it already exists, but I want to be able to create it if it does not exist. These are my tables:
#def __init__(self):
Base = declarative_base()
class utente(Base):
    __tablename__="utente"
    utente_id=Column(Integer,primary_key=True)
    nome_utente=Column(Unicode(20))
    ruolo=Column(String(10))
    MetaData.create_all()
    def __repr__(self):
        return "utente: {0}, {1}, id: {2}".format(self.ruolo,self.nome_utente,self.utente_id)
class dbmmas(Base):
    __tablename__="dbmmas"
    db_id=Column(Integer,primary_key=True,autoincrement=True)
    nome_db=Column(String(10))
    censimento=Column(Integer)
    versione=Column(Integer)
    ins_data=Column(DateTime)
    mod_data=Column(DateTime)
    ins_utente=Column(Integer)
    mod_utente=Column(Integer)
    MetaData.create_all()
    def __repr__(self):
        return "dbmmas: {0}, censimento {1}, versione {2}".format(self.nome_db,self.censimento,self.versione)
class funzione(Base):
    __tablename__="funzione"
    funzione_id=Column(Integer,primary_key=True,autoincrement=True)
    categoria=Column(String(10))
    nome=Column(String(20))
    def __repr__(self):
        return "funzione:{0},categoria:{1},id:{2} ".format(self.nome,self.categoria,self.funzione_id)
class profilo(Base):
    __tablename__="rel_utente_funzione"
    utente_id=Column(Integer,primary_key=True)
    funzione_id=Column(Integer,primary_key=True)
    amministratore=Column(Integer)
    MetaData.create_all()
    def __repr__(self):
        l=lambda x: "amministratore" if x==1 else "generico"
        return "profilo per utente_id:{0}, tipo: {1}, funzione_id: {2}".format(self.utente_id,l(self.amministratore),self.funzione_id)
class aree(Base):
    __tablename__="rel_utente_zona"
    UTB_id=Column(String(10), primary_key=True) # this is actually the featureSignature of the feature in the shapefile
    utente_id=Column(Integer, primary_key=True)
    amministratore=Column(Integer)
    MetaData.create_all()
    def __repr__(self):
        l=lambda x: "amministratore" if x==1 else "generico"
        return "zona: {0}, pe utente_id:{1}, {2}".format(self.UTB_id,self.utente_id,l(self.amministratore))
class rel_utente_dbmmas(Base):
    __tablename__="rel_utente_dbmmas"
    utente_id=Column(Integer,primary_key=True)
    db_id=Column(Integer,primary_key=True)
    amministratore=Column(Integer)
    MetaData.create_all()
    def __repr__(self):
        l=lambda x: "amministratore" if x==1 else "generico"
        return "dbregistrato: {0} per l'utente{1} {2}".format(self.db_id,self.utente_id,l(self.amministratore))
To create a MySQL database, just connect to the server and create the database:
import sqlalchemy
engine = sqlalchemy.create_engine('mysql://user:password@server') # connect to server
engine.execute("CREATE DATABASE dbname") #create db
engine.execute("USE dbname") # select new db
# use the new db
# continue with your work...
Of course, your user must have permission to create databases.
You can use SQLAlchemy-Utils for that.
pip install sqlalchemy-utils
Then you can do things like
from sqlalchemy_utils import create_database, database_exists
url = 'mysql://{0}:{1}@{2}:{3}'.format(user, password, host, port)
if not database_exists(url):
    create_database(url)
I found the answer here, it helped me a lot.
I don't know what the canonical way is, but here's a way to check to see if a database exists by checking against the list of databases, and to create it if it doesn't exist.
from sqlalchemy import create_engine
# This engine just used to query for list of databases
mysql_engine = create_engine('mysql://{0}:{1}@{2}:{3}'.format(user, password, host, port))
# Query for existing databases
existing_databases = mysql_engine.execute("SHOW DATABASES;")
# Results are a list of single item tuples, so unpack each tuple
existing_databases = [d[0] for d in existing_databases]
# Create database if not exists
if database not in existing_databases:
    mysql_engine.execute("CREATE DATABASE {0}".format(database))
    print("Created database {0}".format(database))
# Go ahead and use this engine
db_engine = create_engine('mysql://{0}:{1}@{2}:{3}/{4}'.format(user, password, host, port, database))
Here's an alternative method if you don't need to know if the database was created or not.
from sqlalchemy import create_engine
# This engine is only used to create the database
mysql_engine = create_engine('mysql://{0}:{1}@{2}:{3}'.format(user, password, host, port))
# Create the database if it does not already exist
mysql_engine.execute("CREATE DATABASE IF NOT EXISTS {0} ".format(database))
# Go ahead and use this engine
db_engine = create_engine('mysql://{0}:{1}@{2}:{3}/{4}'.format(user, password, host, port, database))
CREATE DATABASE IF NOT EXISTS dbName;
Would recommend using with:
from sqlalchemy import create_engine
username = ''
password = ''
host = 'localhost'
port = 3306
DB_NAME = 'db_name'
engine = create_engine(f"mysql://{username}:{password}@{host}:{port}")
with engine.connect() as conn:
    # Do not substitute user-supplied database names here.
    conn.execute(f"CREATE DATABASE IF NOT EXISTS {DB_NAME}")
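If the database name ever does come from outside the program, one cautious option is to validate it against an allow-list before building the statement, since identifiers cannot be passed as bound parameters. The following is a minimal sketch of that idea; the helper name create_database_sql and the exact pattern are assumptions for illustration, not part of SQLAlchemy.

```python
import re

# Conservative allow-list: letters, digits and underscores, at most 64 characters.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]{1,64}")


def create_database_sql(name):
    """Return a CREATE DATABASE statement for a validated name."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid database name: {name!r}")
    # Backticks are MySQL's identifier quoting.
    return f"CREATE DATABASE IF NOT EXISTS `{name}`"


statement = create_database_sql("covid_19")
print(statement)

# A name containing SQL syntax is rejected instead of being executed.
rejected = None
try:
    create_database_sql("x; DROP DATABASE mysql")
except ValueError as error:
    rejected = str(error)
    print(rejected)
```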
The mysqlclient seems to be up to 10 times faster in benchmark tests than PyMySQL, see: What's the difference between MySQLdb, mysqlclient and MySQL connector/Python?.
Still, unless every second of query time matters, why not use a pure-Python package? PyMySQL is suggested by the following links, for example:
Using SQLAlchemy to access MySQL without frustrating library installation issues
How to connect MySQL database using Python+SQLAlchemy remotely?.
Python packages:
Install with pip, ideally by listing them in "requirements.txt":
PyMySQL
SQLAlchemy
Again, if query speed is the priority, use the mysqlclient package. In that case you need to install an additional Linux package with sudo apt-get install libmysqlclient-dev.
import statements
Only one needed:
import sqlalchemy
Connection string (= db_url)
Connection string starting with {dialect/DBAPI}+{driver}:
db_url = mysql+pymysql://
where pymysql stands for the used Python package "PyMySQL" as the driver.
Again, if query speed is the priority, use the mysqlclient package. The prefix is then mysql+mysqldb:// at this point.
For a remote connection, you need to add to the connection string:
host
user
password
database
port (the port only if it is not the standard 3306)
You can build your db_url in several ways. To avoid possible attacks, do not hard-code the user and password, and ideally any other variable value, directly in the string:
sqlalchemy.engine.URL.create(), or with .url.URL, see an example at Connecting from Cloud Functions to Cloud SQL or an example which automatically adds ? suffixes, for example ?driver=SQL+Server, at the end of the string at Building a connection URL for mssql+pyodbc with sqlalchemy.engine.url.URL
f"""...{my_var}..."""
"""...{my_var}...""".format(my_var=xyz_var)
...
Example without the url helper of SQLAlchemy:
db_url = "{dialect}+{driver}://{user}:{password}@{host}:{port}/{database}".format(
or:
db_url = "{dialect}+{driver}://{user}:{password}@{host}/{database}?host={host}&port={port}".format(
    dialect='mysql',
    driver='pymysql',
    user=db_user,
    password=db_pass,
    database=db_name,
    host=db_host,
    port=db_port
)
Other engine configurations
For other connection drivers, dialects and methods, see the SQLAlchemy 1.4 Documentation - Engine Configuration
Create the db if not exists
See How to create a new database using SQLAlchemy?.
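# database_exists and create_database come from the SQLAlchemy-Utils package
from sqlalchemy_utils import create_database, database_exists
engine = sqlalchemy.create_engine(db_url)
if not database_exists(engine.url):
    create_database(engine.url)
with engine.connect() as conn:
    conn.execute("commit")
    conn.execute("create database test")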
I have a .ini configuration file that holds the server name, database name, username and password my app uses to connect to MSSQL.
self.db = pyodbc.connect(
    'driver={SQL Server};server=homeserver;database=testdb;uid=home;pwd=1234')
The values from the connect statement above are now in config.ini:
self.configwrite = ConfigParser.RawConfigParser()
configread = SafeConfigParser()
configread.read('config.ini')
driver = configread.get('DataBase Settings','Driver')
server = str(configread.get('DataBase Settings','Server'))
db = str(configread.get('DataBase Settings','Database'))
user = str(configread.get('DataBase Settings','Username'))
password = str(configread.get('DataBase Settings','Password'))
How can I pass these variables in the pyodbc connect statement?
I tried this:
self.db = pyodbc.connect('driver={Driver};server=server;database=db;uid=user;pwd=password')
But I am getting an error.
Other options for the connect function:
# using keywords for SQL Server authentication
self.db = pyodbc.connect(driver=driver, server=server, database=db,
                         user=user, password=password)
# using keywords for Windows authentication
self.db = pyodbc.connect(driver=driver, server=server, database=db,
                         trusted_connection='yes')
self.db = pyodbc.connect('driver={%s};server=%s;database=%s;uid=%s;pwd=%s' % ( driver, server, db, user, password ) )
%s is a placeholder that inserts a variable into the string.
The variables are substituted in the order in which they appear after the %.
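The attempt in the question fails because names written inside a quoted string are literal text, not variables. As a rough sketch of the same substitution with an f-string, the example below reads the values with configparser and only builds the connection string; the INI contents are inlined and taken from the question's original connect statement, and the result would then be passed to pyodbc.connect.

```python
from configparser import ConfigParser

# Hypothetical config.ini contents, inlined so the sketch is self-contained.
INI_TEXT = """
[DataBase Settings]
Driver = SQL Server
Server = homeserver
Database = testdb
Username = home
Password = 1234
"""

parser = ConfigParser()
parser.read_string(INI_TEXT)
settings = parser["DataBase Settings"]

# Doubled braces {{ }} produce the literal braces around the driver name.
conn_str = (
    f"driver={{{settings['Driver']}}};"
    f"server={settings['Server']};"
    f"database={settings['Database']};"
    f"uid={settings['Username']};"
    f"pwd={settings['Password']}"
)
print(conn_str)
```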
Mixing strings and input variable in sql connection string using pyodbc library - Python
Inspired by this answer:
conn=pyodbc.connect('Driver={SQL Server};'
                    'Server='+servername+';'
                    'Database=master;'
                    'UID=sa;'
                    'PWD='+pasword+';'
                    )