Snowflake DB Transfer to Postgres - python

I'm trying to make a complete copy of a Snowflake DB into a PostgreSQL DB (every table/view, every row). I don't know the best way to go about this. I tried a package called pipelinewise, but I could not get the access needed to convert a Snowflake view to a PostgreSQL table (it needs a unique id). Long story short, it just would not work for me.
I've now moved on to the snowflake-sqlalchemy package, and I'm wondering what the best way is to make a complete copy of the entire DB. Is it necessary to write a model for each table? This is a big DB. I'm new to SQLAlchemy in general, so I don't know exactly where to start. My guess is with reflection, but when I try the example below I'm not getting any results.
from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine, MetaData
engine = create_engine(URL(
    account="xxxx",
    user="xxxx",
    password="xxxx",
    database="xxxxx",
    schema="xxxxx",
    warehouse="xxxx"
))
engine.connect()
metadata = MetaData(bind=engine)
for t in metadata.sorted_tables:
    print(t.name)
I'm pretty sure the issue is not the engine, because I ran the validate.py example and it returns the version as expected. Any advice on why my code above is not working, or on a better way to make a complete copy of the DB, would be greatly appreciated.

Try this. I got it working on mine, but I use a few helper functions for my SQLAlchemy engine, so it might not work as is:
from snowflake.sqlalchemy import URL
import sqlalchemy as sa
engine = sa.create_engine(URL(
    account="xxxx",
    user="xxxx",
    password="xxxx",
    database="xxxxx",
    schema="xxxxx",
    warehouse="xxxx"
))
# The inspector queries the database catalog directly, so no reflect() call is needed
inspector = sa.inspect(engine)
schema = inspector.default_schema_name
for table_name in inspector.get_table_names(schema):
    print(table_name)
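For the copy itself, one possible approach is to reflect the source tables, create them in the target, and move the rows in chunks. The following is only a minimal sketch, assuming SQLAlchemy 1.4 or newer; `copy_tables` and `chunk_size` are hypothetical names, not part of any library. It has not been tried against Snowflake: Snowflake-specific column types may not compile for PostgreSQL without manual mapping, and views are skipped unless `reflect(views=True)` is used.

```python
from sqlalchemy import MetaData, select


def copy_tables(source_engine, target_engine, chunk_size=1000):
    """Reflect every table in the source and copy its rows to the target.

    Returns a dict mapping table name to the number of rows copied.
    """
    metadata = MetaData()
    metadata.reflect(bind=source_engine)     # load table definitions from the source
    metadata.create_all(bind=target_engine)  # create the same tables in the target
    copied = {}
    with source_engine.connect() as src, target_engine.begin() as dst:
        # sorted_tables orders tables so that foreign-key parents come first
        for table in metadata.sorted_tables:
            result = src.execute(select(table))
            count = 0
            while True:
                rows = result.fetchmany(chunk_size)
                if not rows:
                    break
                # Insert one chunk at a time to keep memory use bounded
                dst.execute(table.insert(), [dict(row._mapping) for row in rows])
                count += len(rows)
            copied[table.name] = count
    return copied
```

It would be called as `copy_tables(snowflake_engine, postgres_engine)`, where both engines come from `create_engine`. For very large tables, a bulk path (exporting to files and loading them with PostgreSQL's COPY) is likely to be faster than row inserts.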

Related

Insert data into DB2 using sqlalchemy with pyodbc and iaccess

I'm trying to insert some data into a DB2 table on an IBM iSeries (AS400) server, using the sqlalchemy, pyodbc and iaccess packages.
The server allows me to run SELECT and CREATE queries, but when I try to insert rows I get the following error:
sqlalchemy.exc.DBAPIError: (pyodbc.Error) ('HY000', "[REDACTED]
SQL7008 - TABLE in DATABASE not valid for operations. (-7008)
(SQLExecDirectW)")
I'm executing the following query
INSERT INTO database.table VALUES ('A', 'B', 'C')
I know the query works because I am able to run it with the same credentials from Aqua Data Studio, a DB management IDE.
I'm using the following python code to connect to the db:
from sqlalchemy import create_engine
import pandas as pd
engine_statement = f"iaccess+pyodbc://{user}:{pwd}@{server}/{schema_name}?DRIVER={driver}"
connection = create_engine(engine_statement)
I tried using ibmi instead of iaccess+pyodbc, but nothing changes.
The closest question I found asks the same thing but using Java.
I tried implementing the answer there in Python by setting the isolation_level option to every possible value, but still nothing changes.
I'm not 100% sure how journaling works or how to use it, so I was not able to implement point 2 of that answer.
In case it helps, I am able to create new tables but not write to them, which seems surprising, but I'm no SQL expert so I guess I'm missing something.

Collect all PostgreSQL schemas in Python

I am trying to dynamically show all the schemas that are in a postgresql database in the following format:
dbschema = 't1,t2,t3,t4,public'
My code looks like the following:
dbschema = 't1,t2,t3,t4,public'
engine = create_engine('postgresql://user:password@127.0.0.1:5432/dbexostock9',
                       connect_args={'options': '-csearch_path={}'.format(dbschema)})
DB = value_classification.to_sql('dashboard_classification', engine, if_exists='replace')
I am using SQLAlchemy to write to the database. As you can see, it would be more convenient for the contents of dbschema to update dynamically while keeping this format.
I can't find a way to achieve this. Does anyone have a clue how to do it?
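One possible direction, offered only as a sketch: SQLAlchemy's inspector exposes `get_schema_names()`, and its result can be joined into the comma-separated form shown above. The helper names `format_search_path` and `current_search_path` are hypothetical, and excluding `information_schema` is an assumption about what should be left out.

```python
from sqlalchemy import inspect


def format_search_path(schema_names, exclude=("information_schema",)):
    """Join schema names into the comma-separated form used for search_path."""
    return ",".join(name for name in schema_names if name not in exclude)


def current_search_path(engine):
    """Ask the database which schemas exist and format them as one string."""
    return format_search_path(inspect(engine).get_schema_names())
```

Since an engine is needed to list the schemas, this would first be called on a plain engine created without `connect_args`, and the returned string would then be passed as `dbschema` when creating the engine used for `to_sql`.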

Does Python's SQLAlchemy support server side cursor (for MSSQL)?

I would like to query an MSSQL database using Python's SQLAlchemy. There could be tens of millions of matched rows. In order to use less memory at the server side, I am considering a server-side cursor (SSCursor) to slice the matched rows. However, I cannot find examples or resources about SSCursor with SQLAlchemy.
Is it possible to use SSCursor with SQLAlchemy? If it is doable, can someone show me examples or point out references? If not, are there any suggested workarounds?
Thanks!
Yes. You just specify the 'cursorclass' option in the connect_args argument. Here is an example with MySQL. You'll need to use an MSSQL connector that implements server-side cursors, as MySQLdb does for MySQL in the example below.
from sqlalchemy import create_engine, MetaData
import MySQLdb.cursors
engine = create_engine('mysql://your:details@go/here', connect_args={'cursorclass': MySQLdb.cursors.SSCursor})
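A driver-independent variant can be sketched with SQLAlchemy's `stream_results` execution option and `fetchmany`. This assumes SQLAlchemy 1.4 or newer; `iter_rows` and `chunk_size` are hypothetical names. Whether a true server-side cursor is used depends on the dialect and driver: where it is not supported the option is ignored, and the chunked loop only limits how many rows are handled at once in Python.

```python
from sqlalchemy import text


def iter_rows(engine, sql, chunk_size=10_000):
    """Yield rows from a query in chunks instead of loading them all at once."""
    with engine.connect() as conn:
        # stream_results asks the dialect for a server-side cursor if it has one
        result = conn.execution_options(stream_results=True).execute(text(sql))
        while True:
            rows = result.fetchmany(chunk_size)
            if not rows:
                break
            yield from rows
```

It would be consumed as `for row in iter_rows(engine, "SELECT ..."):`, with the connection staying open until the generator is exhausted.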

Creating Pyramid/SQLAlchemy models from MySQL database

I would like to use Pyramid and SQLAlchemy with an already existing MySQL database.
Is it possible to automatically create the models from the MySQL tables? I do not want to write them all by hand.
This could be done either by retrieving the tables and their structure from the server or by using a MySQL "Create Table..." script that contains all the tables.
Thanks in advance,
Linus
In SQLAlchemy you can reflect your database like this:
from sqlalchemy import create_engine, MetaData
engine = create_engine(uri)
meta = MetaData()
meta.reflect(bind=engine)
Then meta.tables contains your tables, keyed by name.
This is described in the reflection docs: http://docs.sqlalchemy.org/en/latest/core/reflection.html
To generate the code based on the database tables there are packages such as https://pypi.python.org/pypi/sqlacodegen and http://turbogears.org/2.0/docs/main/Utilities/sqlautocode.html , but I haven't used them.
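If mapped classes are wanted at runtime rather than generated source files, SQLAlchemy's automap extension is another option. A minimal sketch, assuming SQLAlchemy 1.4 or newer; `reflect_models` is a hypothetical helper name. Note that automap only maps tables that have a primary key.

```python
from sqlalchemy.ext.automap import automap_base


def reflect_models(engine):
    """Build ORM classes for the existing tables reachable through the engine."""
    Base = automap_base()
    Base.prepare(autoload_with=engine)  # reflect the tables and generate classes
    return Base.classes                 # classes are exposed as attributes by table name
```

For a table named `users`, the mapped class would then be available as `reflect_models(engine).users` and can be queried through a regular session.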

How do I extract table metadata from a database using python

I want to come up with a minimal set of queries/lines of code that extracts the table metadata within a database, on as many database versions as possible. I'm using PostgreSQL, and I'm trying to do this with Python, but I have no clue how, as I'm a Python newbie.
I appreciate your ideas/suggestions on this issue.
You can ask your database driver, in this case psycopg2, to return some metadata about a database connection you've established. You can also ask the database directly about some of its capabilities or schemas, but this is highly dependent on the version of the database you're connecting to, as well as the type of database.
Here's an example taken from http://bytes.com/topic/python/answers/438133-find-out-schema-psycopg for PostgreSQL:
>>> import psycopg2 as db
>>> conn = db.connect('dbname=billings user=steve password=xxxxx port=5432')
>>> curs = conn.cursor()
>>> curs.execute("""select table_name from information_schema.tables WHERE table_schema='public' AND table_type='BASE TABLE'""")
>>> curs.fetchall()
[('contacts',), ('invoicing',), ('lines',), ('task',), ('products',),('project',)]
However, you probably would be better served using an ORM like SQLAlchemy. This will create an engine which you can query about the database you're connected to, as well as normalize how you connect to varying database types.
If you need help with SQLAlchemy, post another question here! There's TONS of information already available by searching the site.
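As a rough illustration of the SQLAlchemy route, the inspector can list tables and their columns without database-specific SQL. This is a sketch only; `describe_tables` is a hypothetical helper name, and the type strings it returns depend on the dialect.

```python
from sqlalchemy import inspect


def describe_tables(engine, schema=None):
    """Return {table_name: [(column_name, column_type), ...]} for one schema."""
    inspector = inspect(engine)
    return {
        table: [
            (column["name"], str(column["type"]))
            for column in inspector.get_columns(table, schema=schema)
        ]
        for table in inspector.get_table_names(schema=schema)
    }
```

With a PostgreSQL engine, `describe_tables(engine, schema="public")` would cover the same tables as the information_schema query above, and the same function should work unchanged against other databases that SQLAlchemy supports.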
