Collect all PostgreSQL schemas dynamically in Python - python

I am trying to dynamically list all the schemas that are in a PostgreSQL database, in the following format:
dbschema = 't1,t2,t3,t4,public'
my code looks like the following:
from sqlalchemy import create_engine

dbschema = 't1,t2,t3,t4,public'
engine = create_engine('postgresql://user:password@127.0.0.1:5432/dbexostock9',
                       connect_args={'options': '-csearch_path={}'.format(dbschema)})
DB = value_classification.to_sql('dashboard_classification', engine, if_exists='replace')
I am using SQLAlchemy to write to the database. As you can see, it would be more convenient to have the contents of dbschema update dynamically while remaining in this format.
I can't find a way to achieve this. Does anyone have a clue on how to do it?
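One possible approach (a minimal sketch I'm adding here, not from the original post; the connection URL and the filtering of system schemas are assumptions) is to ask SQLAlchemy's inspector for the schema names and join them into the comma-separated string before building the search_path engine:
from sqlalchemy import create_engine, inspect

# Hypothetical connection URL - replace with your own credentials and host.
base_engine = create_engine('postgresql://user:password@127.0.0.1:5432/dbexostock9')

# Ask PostgreSQL for every schema it knows about, dropping internal ones.
all_schemas = [
    s for s in inspect(base_engine).get_schema_names()
    if s != 'information_schema' and not s.startswith('pg_')
]
dbschema = ','.join(all_schemas)  # e.g. 't1,t2,t3,t4,public'

# Rebuild the engine with the dynamically discovered search_path.
engine = create_engine(
    'postgresql://user:password@127.0.0.1:5432/dbexostock9',
    connect_args={'options': '-csearch_path={}'.format(dbschema)},
)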

Related

Snowflake DB Transfer to Postgres

I'm trying to make a complete copy of a Snowflake DB into a PostgreSQL DB (every table/view, every row). I don't know the best way to go about accomplishing this. I've tried using a package called pipelinewise, but I could not get the access needed to convert a Snowflake view to a PostgreSQL table (it needs a unique id). Long story short, it just would not work for me.
I've now moved on to using the snowflake-sqlalchemy package. So, I'm wondering what is the best way to just make a complete copy of the entire DB. Is it necessary to make a model for each table, because this is a big DB? I'm new to SQLAlchemy in general, so I don't know exactly where to start. My guess is with reflection, but when I try the example below I'm not getting any results.
from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine, MetaData

engine = create_engine(URL(
    account="xxxx",
    user="xxxx",
    password="xxxx",
    database="xxxxx",
    schema="xxxxx",
    warehouse="xxxx"
))
engine.connect()
metadata = MetaData(bind=engine)
for t in metadata.sorted_tables:
    print(t.name)
I'm pretty sure the issue is not the engine, because I did run the validate.py example and it returns the version as expected. Any advice as to why my code above is not working, or a better way to accomplish my goal of making a complete copy of the DB, would be greatly appreciated.
Try this: I got it working on mine, but I have a few functions that I use for my SQLAlchemy engine, so it might not work as is:
from snowflake.sqlalchemy import URL
import sqlalchemy as sa

engine = sa.create_engine(URL(
    account="xxxx",
    user="xxxx",
    password="xxxx",
    database="xxxxx",
    schema="xxxxx",
    warehouse="xxxx"
))

inspector = sa.inspect(engine)
schema = inspector.default_schema_name
for table_name in inspector.get_table_names(schema):
    print(table_name)
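Once the table names are available, one way to do the actual copy (a sketch I'm adding here, not part of the answer above; the PostgreSQL URL and chunk size are assumptions, and I haven't verified that every Snowflake type maps cleanly through pandas) is to pull each table into pandas and write it out to PostgreSQL:
import pandas as pd

# Hypothetical target database - adjust the URL to your own PostgreSQL instance.
pg_engine = sa.create_engine('postgresql://user:password@localhost:5432/targetdb')

for table_name in inspector.get_table_names(schema):
    # Stream each Snowflake table through pandas and append it to PostgreSQL.
    for chunk in pd.read_sql_table(table_name, engine, schema=schema, chunksize=10000):
        chunk.to_sql(table_name, pg_engine, if_exists='append', index=False)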

CREATE OR REPLACE TABLE using the Google BigQuery Python library

My Python code is like so:
from google.cloud import bigquery

client = bigquery.Client(
    project='my-project',
    credentials=credentials,
)
sql = '''
CREATE OR REPLACE TABLE `my-project.my_dataset.test` AS
WITH some_table AS (
    SELECT * FROM `my-project.my_dataset.table_1`
),
some_other_table AS (
    SELECT id, some_column FROM `my-project.my_dataset.table_2`
)
SELECT * FROM some_table
LEFT JOIN some_other_table ON some_table.unique_id = some_other_table.id
'''
query_job = client.query(sql)
query_job.result()
The query works in the Google BigQuery Console UI, but not when executed as above from Python.
I understand that using CREATE OR REPLACE makes this a "DDL" statement, which I cannot figure out how to execute from the Python library. You can set the destination table in the job config, which lets you CREATE a table, but then you don't get the CREATE OR REPLACE functionality.
Thanks for any assistance.
After carefully reviewing the documentation, I can say that the Python SDK for BigQuery doesn't specify a way to perform DDL statements as a query. You can find the documented code for the query function you are using here. As you can see, the query parameter expects a SQL statement.
Despite that, I tried to reproduce your problem and it worked for me. I could create the table perfectly by using a DDL statement as you're trying to do. Hence we can conclude that the API considers DDL a subset of SQL.
I suggest you post the error you're receiving so I can provide better support.
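If the DDL route still gives trouble, a possible workaround (a sketch under the assumption that truncate-and-replace semantics are acceptable; recent versions of the client library accept a string table ID as the destination, and the table names reuse the ones from the question) is to set a destination table with WRITE_TRUNCATE, which approximates CREATE OR REPLACE:
from google.cloud import bigquery

client = bigquery.Client(project='my-project', credentials=credentials)

job_config = bigquery.QueryJobConfig(
    destination='my-project.my_dataset.test',
    # Overwrite the destination table if it already exists,
    # similar in effect to CREATE OR REPLACE.
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

sql = 'SELECT * FROM `my-project.my_dataset.table_1`'  # plain SELECT, no DDL needed
query_job = client.query(sql, job_config=job_config)
query_job.result()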

SQLAlchemy - query without writing/locking the database

I have a multithreaded data analysis pipeline, which queries a database (via SQLAlchemy). Additionally, the database is synchronized across multiple systems by syncthing - long story short, this means that write permission cannot always be guaranteed.
Even when I am able to guarantee write access, I still occasionally and rather randomly get operational errors:
OperationalError: (sqlite3.OperationalError) database is locked
The code I use to load the session for the query is the following:
from os import path

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

def loadSession(db_path):
    db_path = "sqlite:///" + path.expanduser(db_path)
    engine = create_engine(db_path, echo=False)
    Session = sessionmaker(bind=engine)
    session = Session()
    # Base is the declarative base defined elsewhere in the project.
    Base.metadata.create_all(engine)
    return session, engine
And can be seen in its full context here.
My query (and the way I turn it into a value) look like this:
session, engine = loadSession(db_path)
sql_query = session.query(LaserStimulationProtocol).filter(
    LaserStimulationProtocol.code == stim_protocol_dictionary[scan_type])
mystring = sql_query.statement
mydf = pd.read_sql_query(mystring, engine)
delay = int(mydf["stimulation_onset"][0])
And again, the full context can be found here.
How could I change my code so the database can be queried without having to rely on the file being writeable/unlocked? I have checked the file's checksum, and it does not change upon query, so clearly I'm not writing anything to it. As such, I guess there should be some way to extract the info I am looking for without write access?
I've written a blog post on the subject which provides some more explanation of the issue and some ways to work around it: http://charlesleifer.com/blog/multi-threaded-sqlite-without-the-operationalerrors/
Peewee ORM has an extension that is designed to support multiple threads writing to a SQLite database. http://docs.peewee-orm.com/en/latest/peewee/playhouse.html#sqliteq
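Another option, since the pipeline only reads, is to open the SQLite file in read-only mode so the query never needs a write lock. This is a sketch I'm adding (not from the answers above), assuming SQLAlchemy 1.4+ and a filesystem path to the database; the path and table name are hypothetical and loadSession would need to be adapted accordingly:
from os import path

import pandas as pd
from sqlalchemy import create_engine

def load_readonly_engine(db_path):
    # mode=ro asks SQLite to open the file read-only; uri=true tells the
    # pysqlite driver to treat the path as an SQLite URI (SQLAlchemy 1.4+).
    db_path = path.expanduser(db_path)
    return create_engine("sqlite:///file:{}?mode=ro&uri=true".format(db_path))

engine = load_readonly_engine("~/mydata.db")  # hypothetical path
mydf = pd.read_sql_query("SELECT * FROM laser_stimulation_protocol", engine)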

Creating Pyramid/SQLAlchemy models from MySQL database

I would like to use Pyramid and SQLAlchemy with an already existing MySQL database.
Is it possible to automatically create the models from the MySQL tables? I do not want to write them all by hand.
This could be either by retrieving the tables and structure from the Server or using a MySQL "Create Table..." script, which contains all the tables.
Thanks in advance,
Linus
In SQLAlchemy you can reflect your database like this:
from sqlalchemy import create_engine, MetaData
engine = create_engine(uri)
meta = MetaData(bind=engine)
meta.reflect()
Then, meta.tables are your tables.
By the way, it is described here: http://docs.sqlalchemy.org/en/latest/core/reflection.html
To generate the code based on the database tables there are packages such as https://pypi.python.org/pypi/sqlacodegen and http://turbogears.org/2.0/docs/main/Utilities/sqlautocode.html, but I haven't used them.
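If generating model classes at runtime is acceptable, SQLAlchemy's automap extension can also build mapped classes from the reflected tables. A minimal sketch (the connection URI and table names are assumptions; on SQLAlchemy 2.0 the prepare() call takes autoload_with=engine instead of reflect=True):
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base

# Hypothetical MySQL connection URI.
engine = create_engine("mysql+pymysql://user:password@localhost/mydb")

Base = automap_base()
Base.prepare(engine, reflect=True)

# Mapped classes are now available under Base.classes, keyed by table name.
User = Base.classes.user        # assumes a table called "user" exists
Address = Base.classes.address  # assumes a table called "address" exists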

How do I extract table metadata from a database using python

I want to come up with a minimal set of queries/lines of code that extracts the table metadata within a database, across as many database versions as possible. I'm using PostgreSQL. I'm trying to do this using Python, but I've no clue how, as I'm a Python newbie.
I appreciate your ideas/suggestions on this issue.
You can ask your database driver, in this case psycopg2, to return some metadata about a database connection you've established. You can also ask the database directly about some of its capabilities, or schemas, but this is highly dependent on the version of the database you're connecting to, as well as the type of database.
Here's an example taken from http://bytes.com/topic/python/answers/438133-find-out-schema-psycopg for PostgreSQL:
>>> import psycopg2 as db
>>> conn = db.connect('dbname=billings user=steve password=xxxxx port=5432')
>>> curs = conn.cursor()
>>> curs.execute("""select table_name from information_schema.tables WHERE table_schema='public' AND table_type='BASE TABLE'""")
>>> curs.fetchall()
[('contacts',), ('invoicing',), ('lines',), ('task',), ('products',), ('project',)]
However, you probably would be better served using an ORM like SQLAlchemy. This will create an engine which you can query about the database you're connected to, as well as normalize how you connect to varying database types.
If you need help with SQLAlchemy, post another question here! There's TONS of information already available by searching the site.
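For example, SQLAlchemy's inspector interface exposes table and column metadata in a database-agnostic way. A small sketch (the connection URL is an assumption):
from sqlalchemy import create_engine, inspect

# Hypothetical connection URL - replace with your own.
engine = create_engine('postgresql://user:password@localhost:5432/billings')
inspector = inspect(engine)

for table_name in inspector.get_table_names(schema='public'):
    print(table_name)
    for column in inspector.get_columns(table_name, schema='public'):
        # Each column is a dict with keys like 'name', 'type', 'nullable', 'default'.
        print("  {name}: {type}".format(**column))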
