I've looked over Google Cloud SQL's documentation and various searches, but I can't find out whether it is possible to use SQLAlchemy with Google Cloud SQL, and if so, what the connection URI should be.
I'm looking to use the Flask-SQLAlchemy extension and need the connection string like so:
mysql://username:password@server/db
I saw the Django example, but it appears the configuration uses a different style than the connection string. https://developers.google.com/cloud-sql/docs/django
Google Cloud SQL documentation:
https://developers.google.com/cloud-sql/docs/developers_guide_python
Update
Google Cloud SQL now supports direct access, so the MySQLdb dialect can now be used. The recommended connection via the mysql dialect is using the URL format:
mysql+mysqldb://root@/<dbname>?unix_socket=/cloudsql/<projectid>:<instancename>
mysql+gaerdbms has been deprecated in SQLAlchemy since version 1.0
I'm leaving the original answer below in case others still find it helpful.
For those who visit this question later (and don't want to read through all the comments), SQLAlchemy now supports Google Cloud SQL as of version 0.7.8 using the connection string / dialect (see: docs):
mysql+gaerdbms:///<dbname>
E.g.:
create_engine('mysql+gaerdbms:///mydb', connect_args={"instance":"myinstance"})
I have proposed an update to the mysql+gaerdbms:// dialect to support both Google Cloud SQL APIs (rdbms_apiproxy and rdbms_googleapi) for connecting to Cloud SQL from a non-Google App Engine production instance (e.g. your development workstation). The change will also modify the connection string slightly by including the project and instance as part of the string, so they no longer need to be passed separately via connect_args.
E.g.
mysql+gaerdbms:///<dbname>?instance=<project:instance>
This will also make it easier to use Cloud SQL with Flask-SQLAlchemy or other extension where you don't explicitly make the create_engine() call.
If you are having trouble connecting to Google Cloud SQL from your development workstation, you might want to take a look at my answer here - https://stackoverflow.com/a/14287158/191902.
Yes,
If you find any bugs in SA+Cloud SQL, please let me know. I wrote the dialect code that was integrated into SQLAlchemy. There's a bit of silly business about how Cloud SQL bubbles up exceptions, so there might be some loose ends there.
For those who prefer PyMySQL over MySQLdb (which is suggested in the accepted answer), the SQLAlchemy connection strings are:
For Production
mysql+pymysql://<USER>:<PASSWORD>@/<DATABASE_NAME>?unix_socket=/cloudsql/<PUT-SQL-INSTANCE-CONNECTION-NAME-HERE>
Please make sure to
Add the SQL instance to your app.yaml:
beta_settings:
  cloud_sql_instances: <PUT-SQL-INSTANCE-CONNECTION-NAME-HERE>
Enable the SQL Admin API as it seems to be necessary:
https://console.developers.google.com/apis/api/sqladmin.googleapis.com/overview
For Local Development
mysql+pymysql://<USER>:<PASSWORD>@localhost:3306/<DATABASE_NAME>
given that you started the Cloud SQL Proxy with:
cloud_sql_proxy -instances=<PUT-SQL-INSTANCE-CONNECTION-NAME-HERE>=tcp:3306
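The two connection strings above differ only in how the server is addressed, so they can be produced by one small helper. This is a minimal sketch, not an official API: the function name cloud_sql_mysql_url, its parameters, and the sample credentials are all made up for illustration. It also percent-encodes the user and password, which matters when a password contains characters such as @ or /.

```python
from urllib.parse import quote_plus


def cloud_sql_mysql_url(user, password, database,
                        instance_connection_name=None,
                        host="localhost", port=3306):
    """Build a mysql+pymysql URL (hypothetical helper).

    With an instance connection name, use the /cloudsql/ Unix socket
    (production); otherwise use TCP to a locally running Cloud SQL Proxy.
    """
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    if instance_connection_name:
        return (f"mysql+pymysql://{credentials}@/{database}"
                f"?unix_socket=/cloudsql/{instance_connection_name}")
    return f"mysql+pymysql://{credentials}@{host}:{port}/{database}"


# Placeholder values, not real credentials or instance names.
prod_url = cloud_sql_mysql_url("app", "p@ss/word", "mydb",
                               "my-project:us-central1:my-instance")
local_url = cloud_sql_mysql_url("app", "p@ss/word", "mydb")
```

Either string can then be passed to create_engine() or assigned to Flask-SQLAlchemy's SQLALCHEMY_DATABASE_URI.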
It is doable, though I haven't used Flask at all, so I'm not sure about establishing the connection through it. I got it working through Pyramid and submitted a patch to SQLAlchemy (possibly to the wrong repo) here:
https://bitbucket.org/sqlalchemy/sqlalchemy/pull-request/2/added-a-dialect-for-google-app-engines
That has since been replaced and accepted into SQLAlchemy as
http://www.sqlalchemy.org/trac/ticket/2484
I don't think it has made its way into a release yet, though.
There are some issues with Google SQL throwing different exceptions so we had issues with things like deploying a database automatically. You also need to disable connection pooling using NullPool as mentioned in the second patch.
We've since moved to using the datastore through NDB, so I haven't followed the progress of these fixes for a while.
PostgreSQL, pg8000 and flask_sqlalchemy
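Adding information in case someone is looking for how to use flask_sqlalchemy with PostgreSQL: using pg8000 as the driver, the working connection string is shown below. On SQLAlchemy 1.4 and later the scheme must be spelled postgresql+pg8000, because the postgres alias was removed.
postgres+pg8000://<db_user>:<db_pass>@/<db_name>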
Briefly, I want to know what this "DB-API" mechanism is.
Are there multiple DB-APIs, or just one?
Is it just a 'rules' document?
Does it have source code?
What is it for?
Is psycopg2 an example of a DB-API or is it a library that follows DB-APIs standards?
Is the DB-API specified in SQLAlchemy a SQLAlchemy-specific DB-API (if that is possible)?
I think that's it!
Regarding the dialect, I ask another question later.
The Python DB-API is defined in PEP 249 (https://www.python.org/dev/peps/pep-0249/), and I believe it is just a spec or, as you say, a rules document.
Modules like psycopg2 fulfill those requirements, so they are implementations of that API. SQLAlchemy allows you to swap out which DB-API implementation you use, so you can change your underlying database server, or use features offered by another driver while still talking to the same database server.
As I understand it, SQLAlchemy supports multiple DB-API implementations, which you select through the connection URI, as explained here: https://docs.sqlalchemy.org/en/13/core/engines.html#database-urls.
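To make "a module that implements the spec" concrete, here is a small sketch using sqlite3 from the standard library, which is itself a PEP 249 implementation. The table and values are invented for illustration. The same connect / cursor / execute / fetchall shape is what psycopg2 and other drivers expose, although the placeholder style (paramstyle) varies between modules.

```python
import sqlite3  # a standard-library module that implements PEP 249

# Module-level attributes that PEP 249 asks every DB-API module to define.
api_level = sqlite3.apilevel      # DB-API version the module implements
param_style = sqlite3.paramstyle  # placeholder style used in SQL strings

# The connection -> cursor -> execute -> fetch sequence comes from the spec.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE cars (brand TEXT, model TEXT)")
cur.execute("INSERT INTO cars VALUES (?, ?)", ("Ford", "Mustang"))
conn.commit()

cur.execute("SELECT brand, model FROM cars")
rows = cur.fetchall()
conn.close()
```

SQLAlchemy sits one level above this: its dialects call these same DB-API methods on whichever driver the connection URI names.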
I read all the documentation related to connecting to MySQL hosted in Cloud SQL from GCF and still can't connect. I also tried all the hints in the SQLAlchemy documentation related to this.
I am using the following connection
con = 'mysql+pymysql://USER:PASSWORD@/MY_DB?unix_socket=/cloudsql/Proj_ID:Zone:MySQL_Instance_ID'
mysqlEngine = sqlalchemy.create_engine(con)
The error I got was:
(pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on 'localhost' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
You need to make sure you are using the correct /cloudsql/<INSTANCE_CONNECTION_NAME> (This is in the format <PROJECT_ID>:<REGION>:<INSTANCE_ID>). This should be all that's needed if your Cloud SQL instance is in the same project and region as your Function.
The GCF docs also strongly recommend limiting your pool to a single connection. This means you should set both pool_size=1 and max_overflow=0 in your engine settings.
If you would like to see an example of how to set these settings, check out this sample application on GitHub.
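For readers without the sample application at hand, the two settings mentioned above are ordinary keyword arguments to create_engine(). The sketch below is only an illustration: the names ENGINE_OPTIONS and make_engine are hypothetical, and SQLAlchemy is imported inside the function so the snippet loads even where it is not installed.

```python
# Pool limits from the recommendation above: one pooled connection and
# no extra "overflow" connections beyond it.
ENGINE_OPTIONS = {"pool_size": 1, "max_overflow": 0}


def make_engine(url):
    """Create an engine limited to a single connection (hypothetical helper)."""
    import sqlalchemy  # lazy import; requires SQLAlchemy and a DB driver

    return sqlalchemy.create_engine(url, **ENGINE_OPTIONS)
```

Calling make_engine(con) with the connection string from the question would replace the bare sqlalchemy.create_engine(con) call.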
I believe that your problem is with the Connection_name represented by <PROJECT_ID>:<REGION>:<INSTANCE_ID> at the end of the con string variable.
Which by the way should be quoted:
con = 'mysql+pymysql://USER:PASSWORD@/MY_DB?unix_socket=/cloudsql/<PROJECT_ID>:<REGION>:<INSTANCE_ID>'
Check if you are writing it right with this command:
gcloud sql instances describe <INSTANCE_ID> | grep connectionName
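A quick structural check in code can also catch a malformed connection name before it reaches the driver. This is a rough sketch with a hypothetical function name. It only verifies the <PROJECT_ID>:<REGION>:<INSTANCE_ID> shape, not that the instance exists, and it splits from the right on the assumption that a project ID may itself contain a colon (as legacy domain-scoped project IDs do).

```python
def parse_instance_connection_name(name):
    """Split '<PROJECT_ID>:<REGION>:<INSTANCE_ID>' into its three parts.

    Hypothetical helper: checks only the shape of the string.
    """
    # rsplit keeps a project ID that itself contains ':' in one piece.
    parts = name.rsplit(":", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected <PROJECT_ID>:<REGION>:<INSTANCE_ID>, got {name!r}"
        )
    project_id, region, instance_id = parts
    return project_id, region, instance_id


# Placeholder name for illustration.
parsed = parse_instance_connection_name("my-project:us-central1:my-instance")
unix_socket = "/cloudsql/" + ":".join(parsed)
```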
If this is not the case, keep in mind these considerations present in the Cloud Functions official documentation:
First Generation MySQL instances must be in the same region as your Cloud Function. Second Generation MySQL instances as well as PostgreSQL instances work with Cloud Functions in any region.
Your Cloud Function has access to all Cloud SQL instances in your project. You can access Second Generation MySQL instances as well as PostgreSQL instances in other projects if your Cloud Function's service account (listed on the Cloud Function's General tab in the GCP Console) is added as a member in IAM on the project with the Cloud SQL instance(s) with the Cloud SQL Client role.
After a long thread with Google Support, we found the reason to be: simply we should enable public access to Cloud SQL without any firewall rule. It is undocumented and can drive you crazy, but the silver bullet for the support team is to say: it is in beta!
I was having this issue. Service account was correct, had the correct permissions, same exact connection string as in my App Engine application. Still got this in the logs.
dial unix /cloudsql/project:region:instance connect: no such file or directory
Switching from 2nd generation Cloud Function to 1st generation solved it. Didn't see it documented anywhere that 2nd couldn't connect to Cloud SQL instances.
I have a working Python 3 app on Google App Engine (flexible) connecting to Postgres on Google Cloud SQL. I got it working by following the docs, which at some point have you connecting via psycopg2 to a database specifier like this
postgresql://postgres:password@/dbname?host=/cloudsql/project:us-central1:dbhost
I'm trying to understand how the hostname /cloudsql/project:us-central1:dbhost works. Google calls this an "instance connection string" and it sure looks like it's playing the role of a regular hostname. Only with the / and : it's not a valid name for a DNS resolver.
Is Google's flexible Python environment modified somehow to support special hostnames like this? It looks like stock Python 3 and psycopg2, but maybe it's modified somehow? If so, are those modifications documented anywhere? The docs for the Python runtime don't have anything about this.
It turns out that host=/cloudsql/project:us-central1:dbhost specifies the name of a directory in the filesystem. Inside that directory is a file named .s.PGSQL.5432 which is a Unix domain socket. An instance of Cloud SQL Proxy is listening on that Unix domain socket and forwarding database requests via TCP to the actual database server. Despite the DB connection string being host= it actually names a directory with a Unix socket in it; that's a feature of libpq, the Postgres connection library.
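The directory-plus-socket-file convention described above can be sketched in a few lines. The helper names are made up for illustration. The .s.PGSQL.<port> file name is the one mentioned in the answer, and the second function simply asks the filesystem whether a Unix domain socket exists at that path.

```python
import os
import posixpath
import stat


def postgres_socket_path(host_dir, port=5432):
    """Return the socket file libpq looks for inside a host= directory."""
    return posixpath.join(host_dir, f".s.PGSQL.{port}")


def is_unix_socket(path):
    """True if path exists and is a Unix domain socket, else False."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False


socket_path = postgres_socket_path("/cloudsql/project:us-central1:dbhost")
```

Running is_unix_socket(socket_path) inside the App Engine container is one way to confirm that the proxy is actually listening there.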
Many thanks to Vadim for answering my question quickly in a comment, just writing up a more complete answer here for future readers.
Our infrastructure group has asked us to "add MultiSubnetFailover=True to all application connection strings" so that we can take advantage of a new SQL Server HA setup involving Availability Groups.
I am stuck, though, since we have some Python programs that connect (read+write) to the database via SQLAlchemy. I have been searching and I don't see anything about this MultiSubnetFailover feature being available as an option in SQLAlchemy or any other Python driver. Is it possible to connect to an HA setup using SQLAlchemy, or Python at all, and if so, how?
FYI - The link that my infrastructure guy pointed me to is here (http://msdn.microsoft.com/en-us/library/hh205662%28v=vs.110%29.aspx), and as you can see it is specifically about how .NET applications can utilize the "MultiSubnetFailover=True" setting in the connection string among other things.
http://docs.sqlalchemy.org/en/latest/dialects/mssql.html#dialect-mssql-pyodbc-connect
You could use the example towards the end of the documentation's section like this:
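from urllib.parse import quote_plus  # Python 3; this was urllib.quote_plus in Python 2
from sqlalchemy import create_engine
# The DRIVER value is an assumption; use the name of the ODBC driver installed on your machine.
connection_string = 'DRIVER={ODBC Driver 17 for SQL Server};SERVER=127.0.0.1;DATABASE=MyDatabase;MultiSubnetFailover=True'
# The raw ODBC string must be URL-encoded before it is embedded in the SQLAlchemy URL.
engine_string = 'mssql+pyodbc:///?odbc_connect={}'.format(quote_plus(connection_string))
engine = create_engine(engine_string)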
Update from comments
For newer versions of Microsoft ODBC Driver for SQL Server, you may need to use MultiSubnetFailover=Yes instead of True
To my surprise, I haven't found this question asked elsewhere. Short version, I'm writing an app that I plan to deploy to the cloud (probably using Heroku), which will do various web scraping and data collection. The reason it'll be in the cloud is so that I can have it be set to run on its own every day and pull the data to its database without my computer being on, as well as so the rest of the team can access the data.
I used to use AWS's SimpleDB and DynamoDB, but I found SDB's storage limitations to be too small and DDB's poor querying ability to be a problem, so I'm looking for a database system (SQL or NoSQL) that can store arbitrary-length values (and ideally arbitrary data structures) and that can be queried on any field.
I've found many database solutions for Heroku, such as ClearDB, but all of the information I've seen has shown how to set up Django to access the database. Since this is intended to be a script and not a site, I'd really prefer not to dive into Django if I don't have to.
Is there any kind of database that I can hook up to in Heroku with Python without using Django?
You can get a database provided from Heroku without requiring your app to use Django. To do so:
heroku addons:add heroku-postgresql:dev
If you need a larger more dedicated database, you can examine the plans at Heroku Postgres
Within your requirements.txt you'll want to add:
psycopg2
Then you can connect/interact with it similar to the following:
import os
from urllib.parse import urlparse  # Python 3; this was the urlparse module in Python 2

import psycopg2

# Heroku exposes the connection details as a single URL in DATABASE_URL.
url = urlparse(os.environ['DATABASE_URL'])
conn = psycopg2.connect(
    dbname=url.path[1:],  # drop the leading "/"
    user=url.username,
    password=url.password,
    host=url.hostname,
)
cur = conn.cursor()
query = "SELECT ...."
cur.execute(query)
I'd use MongoDB. Heroku has support for it, so I think it will be really easy to start and scale out: https://addons.heroku.com/mongohq
About Python: MongoDB is a really easy database. The schema is flexible and fits really well with Python dictionaries. That's something really good.
You can use PyMongo
import datetime

from pymongo import MongoClient

# Connection was removed in PyMongo 3; MongoClient replaces it
client = MongoClient()
# Get your DB
db = client.my_database
# Get your collection
cars = db.cars
# Create some objects
car = {"brand": "Ford",
       "model": "Mustang",
       "date": datetime.datetime.now(datetime.timezone.utc)}
# Insert it (insert() was removed in PyMongo 4; use insert_one())
cars.insert_one(car)
Pretty simple, uh?
Hope it helps.
EDIT:
As Endophage mentioned, another good option for interfacing with Mongo is mongoengine. If you have lots of data to store, you should take a look at that.
I did this recently with Flask. (https://github.com/HexIce/flask-heroku-sqlalchemy).
There are a couple of gotchas:
1. If you don't use Django you may have to set up your database yourself by doing:
heroku addons:add shared-database
(Or whichever database you want to use, the others cost money.)
2. The database URL is stored in Heroku in the "DATABASE_URL" environment variable.
In Python you can get it with:
dburl = os.environ['DATABASE_URL']
What you do to connect to the database from there is up to you, one option is SQLAlchemy.
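If you do go the SQLAlchemy route, one detail to watch is the URL scheme: Heroku's DATABASE_URL starts with postgres://, while SQLAlchemy 1.4 and later accept only postgresql://. Below is a minimal sketch of the adjustment. The function name and the fallback URL are placeholders invented for illustration.

```python
import os


def sqlalchemy_database_url(raw_url):
    """Rewrite a postgres:// URL to the postgresql:// scheme (hypothetical helper)."""
    prefix = "postgres://"
    if raw_url.startswith(prefix):
        return "postgresql://" + raw_url[len(prefix):]
    return raw_url


# The fallback value is a placeholder so the snippet also runs off Heroku.
dburl = sqlalchemy_database_url(
    os.environ.get("DATABASE_URL", "postgres://user:secret@localhost:5432/mydb")
)
```

The resulting dburl can be handed to sqlalchemy.create_engine(dburl) or to Flask-SQLAlchemy's SQLALCHEMY_DATABASE_URI setting.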
Create a standalone Heroku Postgres database. http://postgres.heroku.com