Specify schema name in table references in ibis - python

The following code works fine if table t is on the connecting user's search_path. I would like to be able to specify, in code, which schema to search. The only way I can make the code work is by issuing an ALTER USER on the backend:
alter user username set search_path=schema_name;
import ibis
con = ibis.postgres.connect(url = "postgresql+psycopg2:///?service=service_name")
t = con.table("t")
y = t.group_by("b").mutate(z=t.a.sum())
How can I specify search_path on the Postgres backend through ibis?
Regards
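One approach worth trying (a hedged sketch, not a confirmed ibis-specific API): libpq/psycopg2 accept a search_path override through the "options" connection parameter, which can be embedded in the SQLAlchemy URL that ibis passes through. Here "%3D" is a URL-encoded "=" and schema_name is a placeholder:
import ibis

# Sketch: set search_path via the libpq "options" parameter in the URL;
# "schema_name" is a placeholder for the schema you want searched first.
con = ibis.postgres.connect(
    url="postgresql+psycopg2:///?service=service_name"
        "&options=-csearch_path%3Dschema_name"
)
t = con.table("t")  # "t" now resolves against schema_name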

Related

how to get the latest database collection from MongoDB with Mongo Engine

I am very new to MongoDB. I create a database within a loop. Every 2 hours, I get data from some sources and create a data collection with MongoEngine, naming each collection based on the creation time (for example 05_01_2021_17_00_30).
Now, in another Python script, I want to get the latest database. How can I access the latest database collection without knowing its name?
I saw some guidelines on Stack Overflow, but the code is old and no longer works. Thanks guys.
I came up with this answer:
In mongo_setup.py: when I create a database, it is named after the creation time, and the name is saved to a text file.
import mongoengine
import datetime

def global_init():
    nownow = datetime.datetime.now()
    Update_file_name = nownow.strftime("%d_%m_%Y_%H_%M_%S")
    # Handshake between Django and the last updated database: export the
    # name of the latest database to a text file; from there, Django will
    # understand which database is the latest.
    Updated_txt = open('.\\Latest database to read for Django.txt', 'w')
    Updated_txt.write(Update_file_name)
    Updated_txt.close()
    mongoengine.register_connection(alias='core', name=Update_file_name)
In Django views.py: we read the latest database's name from the text file:
from pymongo import MongoClient

database_name_text_file = 'directory of the text file...'
db_name_file = open(database_name_text_file, 'r')
db_name = db_name_file.read()
db_name_file.close()

# MongoDB database
myclient = MongoClient(port=27017)
mydatabase = myclient[db_name]
classagg = mydatabase['aggregation__class']
database_text = classagg.find()
for i in database_text:
    ...
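An alternative sketch (my assumption, not part of the answer above): since each database name encodes its creation time, the latest database can also be picked directly by parsing pymongo's list_database_names() output, with no tracking file:
from datetime import datetime
from pymongo import MongoClient

client = MongoClient(port=27017)
dated = []
for name in client.list_database_names():
    try:
        # keep only databases named with the %d_%m_%Y_%H_%M_%S pattern
        dated.append((datetime.strptime(name, "%d_%m_%Y_%H_%M_%S"), name))
    except ValueError:
        pass
latest_db = client[max(dated)[1]]  # assumes at least one timestamped db exists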

Sqlite3 .database command in python

I am trying to view all the databases in sqlite3. It can be done from the command line with the .databases command. I want to do the same thing in Django and render the details in HTML.
The following is the code I wrote in the views file:
import sqlite3
from django.shortcuts import render

def analyzer(request):
    conn = sqlite3.connect("db.sqlite3")
    c = conn.cursor()
    c.execute("SHOW DATABASES")  # MySQL syntax; this fails in SQLite
    l = c.fetchall()
    print(l)
    return render(request, 'analyzer.html')
You could probably use PRAGMA database_list;. That, like the .databases command, will show all the attached databases.
The tables for the main database can be retrieved with
SELECT name
from sqlite_master
where type = 'table';
For attached dbs, prefix sqlite_master with the attached db's name and a dot (e.g. db2.sqlite_master). You probably want to filter out tables whose names begin with sqlite_.
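Put together, a minimal sketch (assuming the same db.sqlite3 file as in the question):
import sqlite3

conn = sqlite3.connect("db.sqlite3")
c = conn.cursor()
c.execute("PRAGMA database_list;")
print(c.fetchall())  # rows of (seq, name, file) for each attached database
c.execute("SELECT name FROM sqlite_master "
          "WHERE type = 'table' AND name NOT LIKE 'sqlite_%';")
print(c.fetchall())  # user tables in the main database
conn.close()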

CX_Oracle - import data from Oracle in a specific schema to Pandas dataframe

When I check this question, I don't understand why the solution is not working for me. I run the following code:
query = """SELECT*
FROM TRANSACTION
"""
df_ora = pd.read_sql(query, con=connection)
and get the error:
DatabaseError: ORA-00942: table or view does not exist
The database is organized in schemas, where Datenbasis is one schema. It looks like the following:
Database -> Schema (Datenbasis) -> Table (TRANSACTION)
What am I missing here; what do I have to specify?
I am connecting as follows:
db_connection_string = 'User/pw@server:port/Name'
con = cx_Oracle.connect(db_connection_string)
The error means one of two things:
1. The user you are connecting with has no privileges on the table.
2. The table does not exist at all.
db_connection_string = 'User/pw@server:port/Name'
You have to ask your DBA to grant SELECT on the table to your user.
Besides, you would have to change:
query = """SELECT*
FROM TRANSACTION
"""
df_ora = pd.read_sql(query, con=connection)
for
query = """SELECT*
FROM SCHEMA_OWNER.TRANSACTION
"""
df_ora = pd.read_sql(query, con=connection)
Where schema_owner is the schema which owns the table. If you don't want to change this last part, you would need to create a synonym.
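For the synonym route, a hypothetical sketch (requires the CREATE SYNONYM privilege; schema and table names as above):
cursor = con.cursor()
# After this, the unqualified name TRANSACTION resolves for this user.
cursor.execute("CREATE SYNONYM TRANSACTION FOR SCHEMA_OWNER.TRANSACTION")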
The answer is to pick the right schema on the connection; the working code looks like:
db_connection_string = 'User/pw@server:port/Name'
con = cx_Oracle.connect(db_connection_string)
con.current_schema = 'DATENBASIS'
query = """SELECT *
FROM TRANSACTION
"""
df_ora = pd.read_sql(query, con=con)
Check if you can select from the table in SQL*Plus / SQL Developer; if not, the user ID you are using to connect to the DB does not have access to that table.
SELECT * FROM Datenbasis.TRANSACTION;
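For completeness, a hedged alternative to setting con.current_schema is issuing the equivalent ALTER SESSION yourself (same connection object as above):
cursor = con.cursor()
cursor.execute("ALTER SESSION SET CURRENT_SCHEMA = DATENBASIS")
df_ora = pd.read_sql("SELECT * FROM TRANSACTION", con=con)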

Redshift create table not working via Python

As per Unload to S3 with Python using IAM Role credentials, the unload statement worked perfectly. So did other commands I tried, like copy and select statements.
However, I also tried to run a query which creates a table. The create table query runs without error, but when it gets to the select statement, it throws an error that relation "public.test" does not exist.
Any idea why the table is not created properly? Query below:
import sqlalchemy as sa
from sqlalchemy.orm import sessionmaker
import config
import pandas as pd
#>>>>>>>> MAKE CHANGES HERE >>>>>>>>
DATABASE = "db"
USER = "user"
PASSWORD = getattr(config, 'password') #see answer by David Bern https://stackoverflow.com/questions/43136925/create-a-config-file-to-hold-values-like-username-password-url-in-python-behave/43137301
HOST = "host"
PORT = "5439"
SCHEMA = "public" #default is "public"
########## connection and session creation ##########
connection_string = "redshift+psycopg2://%s:%s@%s:%s/%s" % (USER, PASSWORD, HOST, str(PORT), DATABASE)
engine = sa.create_engine(connection_string)
session = sessionmaker()
session.configure(bind=engine)
s = session()
SetPath = "SET search_path TO %s" % SCHEMA
s.execute(SetPath)
# create table example
query2 = '''\
create table public.test (
id integer encode lzo,
user_id integer encode lzo,
created_at timestamp encode delta32k,
updated_at timestamp encode delta32k
)
distkey(id)
sortkey(id)
'''
r2 = s.execute(query2)
# select example
query4 = '''\
select * from public.test
'''
r4 = s.execute(query4)
########## create DataFrame from SQL query output ##########
df = pd.read_sql_query(query4, connection_string)
print(df.head(50))
########## close session in the end ##########
s.close()
If I run the same directly in Redshift, it works just fine.
--Edit--
Some of the things tried:
Removing "\" from the query string
Adding ";" at the end of the query string
Changing "public.test" to "test"
Removing SetPath = "SET search_path TO %s" % SCHEMA and s.execute(SetPath)
Breaking the create statement - generates the expected error
Adding a copy from S3 command after the create - runs without error, but again no table is created
Adding a column to the create statement that doesn't exist in the file generated by the copy command - generates the expected error
Adding r4 = s.execute(query4) - runs without error, but again the created table is not in Redshift
Apparently you need to add s.commit() in order to create the table. If you are populating it via a copy command or insert into, add the commit after the copy command (a commit right after the create table is then optional). Basically, SQLAlchemy does not autocommit create/alter commands!
http://docs.sqlalchemy.org/en/latest/orm/session_basics.html#session-faq-whentocreate
http://docs.sqlalchemy.org/en/latest/core/connections.html#understanding-autocommit
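In context, the fix is one line between the two executes (same session s and queries as in the question):
r2 = s.execute(query2)  # create table
s.commit()              # persist the DDL; without this the table never appears
r4 = s.execute(query4)  # select from public.test now succeeds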

How to execute raw SQL in Flask-SQLAlchemy app

How do you execute raw SQL in SQLAlchemy?
I have a python web app that runs on flask and interfaces to the database through SQLAlchemy.
I need a way to run the raw SQL. The query involves multiple table joins along with inline views.
I've tried:
connection = db.session.connection()
connection.execute( <sql here> )
But I keep getting gateway errors.
Have you tried:
result = db.engine.execute("<sql here>")
or:
from sqlalchemy import text
sql = text('select name from penguins')
result = db.engine.execute(sql)
names = [row[0] for row in result]
print(names)
Note that db.engine.execute() is "connectionless", which is deprecated in SQLAlchemy 2.0.
SQLAlchemy session objects have their own execute method:
result = db.session.execute('SELECT * FROM my_table WHERE my_column = :val', {'val': 5})
All your application queries should be going through a session object, whether they're raw SQL or not. This ensures that the queries are properly managed by a transaction, which allows multiple queries in the same request to be committed or rolled back as a single unit. Going outside the transaction using the engine or the connection puts you at much greater risk of subtle, possibly hard to detect bugs that can leave you with corrupted data. Each request should be associated with only one transaction, and using db.session will ensure this is the case for your application.
Also take note that execute is designed for parameterized queries. Use parameters, like :val in the example, for any inputs to the query to protect yourself from SQL injection attacks. You can provide the value for these parameters by passing a dict as the second argument, where each key is the name of the parameter as it appears in the query. The exact syntax of the parameter itself may be different depending on your database, but all of the major relational databases support them in some form.
Assuming it's a SELECT query, this will return an iterable of RowProxy objects.
You can access individual columns with a variety of techniques:
for r in result:
    print(r[0])  # Access by positional index
    print(r['my_column'])  # Access by column name as a string
    r_dict = dict(r.items())  # convert to dict keyed by column names
Personally, I prefer to convert the results into namedtuples:
from collections import namedtuple

Record = namedtuple('Record', result.keys())
records = [Record(*r) for r in result.fetchall()]
for r in records:
    print(r.my_column)
    print(r)
If you're not using the Flask-SQLAlchemy extension, you can still easily use a session:
import sqlalchemy
from sqlalchemy.orm import sessionmaker, scoped_session
engine = sqlalchemy.create_engine('my connection string')
Session = scoped_session(sessionmaker(bind=engine))
s = Session()
result = s.execute('SELECT * FROM my_table WHERE my_column = :val', {'val': 5})
docs: SQL Expression Language Tutorial - Using Text
example:
from sqlalchemy.sql import text

connection = engine.connect()

# recommended
cmd = 'select * from Employees where EmployeeGroup = :group'
employeeGroup = 'Staff'
employees = connection.execute(text(cmd), group=employeeGroup)

# or - a bit more difficult to interpret the command
employeeGroup = 'Staff'
employees = connection.execute(
    text('select * from Employees where EmployeeGroup = :group'),
    group=employeeGroup)

# or - notice the requirement to quote 'Staff'
employees = connection.execute(
    text("select * from Employees where EmployeeGroup = 'Staff'"))

for employee in employees:
    logger.debug(employee)
# output
(0, 'Tim', 'Gurra', 'Staff', '991-509-9284')
(1, 'Jim', 'Carey', 'Staff', '832-252-1910')
(2, 'Lee', 'Asher', 'Staff', '897-747-1564')
(3, 'Ben', 'Hayes', 'Staff', '584-255-2631')
You can get the results of SELECT SQL queries using from_statement() and text() as shown here. You don't have to deal with tuples this way. As an example, for a class User with the table name users you can try:
from sqlalchemy.sql import text

user = session.query(User).from_statement(
    text("""SELECT * FROM users where name=:name""")
).params(name="ed").all()
return user
For SQLAlchemy ≥ 1.4
Starting in SQLAlchemy 1.4, connectionless or implicit execution has been deprecated, i.e.
db.engine.execute(...) # DEPRECATED
as well as bare strings as queries.
The new API requires an explicit connection, e.g.
from sqlalchemy import text
with db.engine.connect() as connection:
    result = connection.execute(text("SELECT * FROM ..."))
    for row in result:
        ...  # process each row
Similarly, it’s encouraged to use an existing Session if one is available:
result = session.execute(sqlalchemy.text("SELECT * FROM ..."))
or using parameters:
session.execute(sqlalchemy.text("SELECT * FROM a_table WHERE a_column = :val"),
                {'val': 5})
See "Connectionless Execution, Implicit Execution" in the documentation for more details.
result = db.engine.execute(text("<sql here>"))
executes the <sql here> but doesn't commit it unless you're on autocommit mode. So, inserts and updates wouldn't reflect in the database.
To commit after the changes, do
result = db.engine.execute(text("<sql here>").execution_options(autocommit=True))
This is a simplified answer for how to run an SQL query from the Flask shell.
First, map your module (if your module/app is manage.py in the principal folder and you are on a UNIX operating system):
export FLASK_APP=manage
Run the Flask shell:
flask shell
Import what we need:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import text

db = SQLAlchemy(app)  # app is your Flask application object
Run your query:
result = db.engine.execute(text("<sql here>").execution_options(autocommit=True))
This uses the database connection that the application currently holds.
Flask-SQLAlchemy v: 3.0.x / SQLAlchemy v: 1.4
users = db.session.execute(db.select(User).order_by(User.title.desc()).limit(150)).scalars()
So basically, for the latest stable version of Flask-SQLAlchemy, the documentation suggests using the session.execute() method in conjunction with db.select(Object).
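For raw SQL under the same versions, a minimal sketch (my_table and my_column are placeholders) pairs session.execute() with text() and bound parameters:
from sqlalchemy import text

result = db.session.execute(
    text("SELECT * FROM my_table WHERE my_column = :val"),
    {"val": 5},
)
for row in result:
    print(row)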
Have you tried using connection.execute(text(<sql here>), <bind params here>) with bind parameters as described in the docs? This can help solve many parameter formatting and performance problems. Maybe the gateway error is a timeout? Bind parameters tend to make complex queries execute substantially faster.
If you want to avoid tuples, another way is by calling the first, one or all methods:
query = db.engine.execute("SELECT * FROM blogs "
                          "WHERE id = 1 ")
assert query.first().name == "Welcome to my blog"
