Is the Google Cloud filesystem ephemeral? - python

Is the Google App Engine filesystem ephemeral like Heroku's (this link is another Stack Overflow question that explains how the ephemeral filesystem works)?
I would like to deploy a Python/Django project there and want to know whether I could use Django's built-in SQLite database file.

Heroku's filesystem is both ephemeral and dyno-local. For example, if you try to view a saved file via heroku run bash you won't see it (that runs on a one-off dyno, not a running web dyno), and it will be lost within 24 hours due to automatic dyno restarts. You just need a database: Heroku has a PostgreSQL service with a free tier that should do more than you need, or you can pick another data persistence add-on.
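For the Django part of the question specifically, a minimal sketch of pointing settings.py at Heroku Postgres instead of the bundled SQLite file might look like this (it assumes the third-party dj-database-url package and the DATABASE_URL config var that Heroku's Postgres add-on sets; neither appears in the original answer):

    # settings.py -- hypothetical snippet, assuming dj-database-url is installed
    import dj_database_url

    DATABASES = {
        # Heroku's Postgres add-on exposes its credentials via DATABASE_URL;
        # fall back to a local SQLite file when the variable is absent (e.g. on a dev machine).
        'default': dj_database_url.config(
            default='sqlite:///db.sqlite3',
            conn_max_age=600,
        )
    }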
As for App Engine:
App Engine Flexible (formerly Managed VMs) is ephemeral: the disk is initialized on each VM startup. The application scales across many containers, so there is no guarantee that a file you write on one instance will be accessible later. You can get away with writing some files to /tmp, but not much more. You will be much better off writing any data to something like Cloud Datastore, Cloud SQL, Memcache, or Cloud Storage.
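As an illustration of the Cloud Storage route, a rough sketch using the google-cloud-storage client library (the bucket and function names here are placeholders, not something from the original answer):

    # Hypothetical sketch: persist a generated file to Cloud Storage instead of the
    # ephemeral local disk of an App Engine Flexible VM.
    from google.cloud import storage

    def save_report(bucket_name, blob_name, data):
        client = storage.Client()            # uses the App Engine service account credentials
        blob = client.bucket(bucket_name).blob(blob_name)
        blob.upload_from_string(data)        # durable, unlike files written to the VM's disk
        return 'gs://{}/{}'.format(bucket_name, blob_name)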
The App Engine Standard filesystem is not ephemeral, but it is read-only: you cannot write to it. Python 2.7 and PHP 5.5 have no write access to the disk at all, whereas Java 8, Java 11, Node.js, Python 3, PHP 7, Ruby, Go 1.11, and Go 1.12+ have read and write access only to the /tmp directory.
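For the runtimes that do allow /tmp, a minimal sketch of using it as scratch space (just an illustration; the file disappears with the instance that wrote it) could be:

    # Sketch: write a scratch file to /tmp on App Engine Standard.
    # The file only lives as long as the instance that wrote it.
    import os
    import tempfile

    def write_scratch(data):
        # tempfile creates the file under /tmp, the only writable path on these runtimes
        fd, path = tempfile.mkstemp(dir='/tmp', suffix='.json')
        with os.fdopen(fd, 'w') as f:
            f.write(data)
        return path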
You could use the Google App Engine Blobstore or a BlobProperty in Datastore to store blobs/files. For using Blobstore (up to 2 GB) see this; for using Datastore blobs (only up to 1 MB) see this.
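For the Datastore option on the Python 2.7 standard runtime, a rough sketch using the bundled ndb library might look like this (the model and function names are made up for illustration; each entity must stay under the ~1 MB limit mentioned above):

    # Hypothetical sketch: store small files directly in Datastore with ndb.BlobProperty.
    from google.appengine.ext import ndb

    class StoredFile(ndb.Model):
        name = ndb.StringProperty()
        data = ndb.BlobProperty()          # raw bytes, kept unindexed

    def save_file(name, raw_bytes):
        return StoredFile(name=name, data=raw_bytes).put()   # returns the entity key

    def load_file(key):
        entity = key.get()
        return entity.data if entity else None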

Related

How to create permanent files on Heroku?

I have a Telegram bot with a Postgres DB hosted on a Heroku free dyno. In one stage of my code, I want to save pickled files permanently so that I can access them later. Storing them in a table doesn't feel like a good idea, as each is a nested class with a variable number of inputs.
The problem is that Heroku deletes these files frequently, or at least on each restart or push. Is there any way to tackle this problem?
You have to use external services such as AWS S3, GCP Cloud Storage (buckets), Azure Blob Storage, etc. for that. Alternatively, you may consider using an add-on such as Felix Cloud Storage, Cloud Cube, Bucketeer, or HDrive for easy integration.
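As an illustration for the pickled-files case, a sketch of pushing them to S3 with boto3 (the bucket name is a placeholder, and it assumes AWS credentials are provided via Heroku config vars):

    # Hypothetical sketch: persist pickled objects to S3 instead of the dyno's filesystem.
    import pickle
    import boto3

    s3 = boto3.client('s3')                 # reads AWS credentials from the environment
    BUCKET = 'my-bot-bucket'                # placeholder bucket name

    def save_state(key, obj):
        s3.put_object(Bucket=BUCKET, Key=key, Body=pickle.dumps(obj))

    def load_state(key):
        body = s3.get_object(Bucket=BUCKET, Key=key)['Body'].read()
        return pickle.loads(body)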
Here is what the documentation states:
The Heroku filesystem is ephemeral - that means that any changes to the filesystem whilst the dyno is running only last until that dyno is shut down or restarted. Each dyno boots with a clean copy of the filesystem from the most recent deploy. This is similar to how many container based systems, such as Docker, operate.
In addition, under normal operations dynos will restart every day in a process known as "Cycling".
These two facts mean that the filesystem on Heroku is not suitable for persistent storage of data. In cases where you need to store data we recommend using a database addon such as Postgres (for data) or a dedicated file storage service such as AWS S3 (for static files). If you don't want to set up an account with AWS to create an S3 bucket we also have addons here that handle storage and processing of static assets: https://elements.heroku.com/addons

App Engine: set up localhost with Datastore for testing

I have tried to follow Google's documentation on how to set up local development using a database (https://cloud.google.com/appengine/docs/standard/python/tools/using-local-server#Python_Using_the_Datastore). However, I do not have the experience level to follow along, and I am not even sure that was the right guide. The application is a Django project that uses Python 2.7. To start the local server, I usually run dev_appserver.py --host 127.0.0.1 .
My questions are:
How do I download the Datastore database from Google Cloud? I do not want to download the entire database, just enough data to populate localhost so I can run tests.
Once the database is downloaded, what do I need to do to connect it to localhost? Do I have to change a parameter somewhere?
Do I even need to download the Datastore? Can I just make a duplicate in the cloud and then connect to that Datastore?
When I run localhost, should it not already be connected to the Datastore, since the site works when it is running in the cloud? Where can I find the connection URI?
Thanks for the help
The development server is meant to simulate the whole App Engine Environment, if you examine the output of the dev_appserver.py command you'll see something like Starting Cloud Datastore emulator at: http://localhost:PORT. Your code will interact with that bundled Datastore automatically, pushing and retrieving data according to the code you wrote. Your data will be saved on a file in local storage and will persist across different runs of the development server unless it's explicitly deleted.
This option doesn't provide facilities to import data from your existing Cloud Datastore instance, but it is a ready-to-go solution if your testing procedures can afford populating the local database with mock data through a custom script that does so programmatically. If you decide on this approach, just write the data creation script and execute it before running the tests.
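For example, if the tests run through the SDK's testbed rather than against a live dev_appserver, a bare-bones seeding sketch could look like this (the model and its fields are invented for illustration):

    # Hypothetical sketch: seed the local (in-memory) datastore stub with mock entities
    # before running tests, using the SDK's testbed.
    from google.appengine.ext import ndb, testbed

    class Article(ndb.Model):
        title = ndb.StringProperty()

    def seeded_testbed():
        tb = testbed.Testbed()
        tb.activate()
        tb.init_datastore_v3_stub()
        tb.init_memcache_stub()             # ndb uses memcache under the hood
        ndb.put_multi([Article(title='Sample %d' % i) for i in range(10)])
        return tb                           # call tb.deactivate() when the tests finish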
Now, there is another option for simulating Datastore locally using the Cloud SDK, which comes with handy features for your purposes. You can find the available information under the Running the Datastore Emulator documentation page. This emulator supports importing entities downloaded from your production Cloud Datastore as well as exporting them into files.
Back to your questions:
Export the data from the Cloud instance into a GCS bucket following this; then download the data from the bucket to your filesystem following this; finally, import the data into the emulator with the command shown here.
To use the emulator, you first need to run gcloud beta emulators datastore start in a Cloud Shell and then, in a separate tab, run dev_appserver.py --support_datastore_emulator=true --datastore_emulator_port=8081 app.yaml.
The development server uses one of the two aforementioned emulators; in both cases it is not connected to your Cloud Datastore. You might create another project for development purposes with a copy of your database and deploy your application there, so that you don't use the emulator at all.
Requests to Datastore are made through the endpoint https://datastore.googleapis.com/v1/projects/project-id, although this is not related to how the emulators manage connections in your local server.
Hope this helps.

How to serve Files in a Directory from App Engine Standard Python Environment

For a project, our team is using the App Engine Python environment to host several scripts that scrape a website and store data as various JSON files and directories of images. We want to expose these directories at a URL (e.g. /img/01.jpg in the App Engine directory at "sample.appspot.com/img/01.jpg"). The reason is that we want to be able to download these files directly to a React Native mobile app using the fetch API. Is this feasible, efficient, and quick using App Engine, and how? If not, what combination of Google Cloud services could we use to achieve the same functionality, and how?
You could use Google Cloud Storage to store your files:
(for flexible environment) Application code
(for standard environment) Writing to Cloud Storage
Once stored they're pretty much static files, so for serving them you have 2 options:
serve them as static content, directly from GCS; see, for example, Serving from Cloud Storage.
I'd suspect this would be faster, and your app's environment doesn't matter, since it's not even involved.
serve them dynamically, through your app's URLs & handlers, with your app reading them from GCS (a rough handler sketch follows after this list). See, for example:
(for flexible environment) Serving from your application
(for standard environment) Reading from Cloud Storage
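As a sketch of the dynamic option on the Python 2.7 standard runtime, a handler reading from GCS with the GoogleAppEngineCloudStorageClient library could look roughly like this (the bucket name and route are placeholders, not from the original answer):

    # Hypothetical sketch: serve /img/<name> by reading the object from Cloud Storage.
    import cloudstorage
    import webapp2

    BUCKET = '/my-scraper-bucket'           # placeholder; cloudstorage paths start with /bucket

    class ImageHandler(webapp2.RequestHandler):
        def get(self, filename):
            # e.g. a request for /img/01.jpg reads /my-scraper-bucket/img/01.jpg
            with cloudstorage.open(BUCKET + '/img/' + filename) as gcs_file:
                self.response.headers['Content-Type'] = 'image/jpeg'
                self.response.write(gcs_file.read())

    app = webapp2.WSGIApplication([(r'/img/(.+)', ImageHandler)])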

What are the possible ways to automate uploading static files to Google App Engine?

I've been googling for a solution for some time and tried a couple of ways to solve this.
In short:
I used the sample from https://cloud.google.com/appengine/docs/python/googlecloudstorageclient/getstarted to create my own uploader, but it dies on the error, mentioned here:
No api proxy found for service "app_identity_service" when running GAE script
So, from what I understand, the script needs to be uploaded to Google App Engine and run from there using the App Engine console. But even if that's possible, how do I automate it?
Or maybe there are other solutions I'm missing. I also looked through appcfg.py but didn't find such an option there.
You are following a sample that uploads from GAE to Cloud Storage. If your only goal is to upload files to Cloud Storage, then simply use gsutil. You can easily script with gsutil, do streaming copies, copy full directories, and rsync a file system.
Why do you need GAE in your solution?
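If you want to drive that from Python rather than a plain shell script, a minimal automation sketch (the directory and bucket names are placeholders; it assumes gsutil is installed and authenticated) could simply shell out to gsutil rsync:

    # Hypothetical sketch: push a local static directory to Cloud Storage via gsutil.
    import subprocess

    def sync_static(local_dir='static/', bucket='gs://my-static-bucket/static/'):
        # -m parallelises the transfer; rsync -r only copies what has changed
        subprocess.check_call(['gsutil', '-m', 'rsync', '-r', local_dir, bucket])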
Google App Engine allows you to easily upload static files and serve them, but if you simply want a place to store static files then Google Cloud Storage is the way to go. It's much easier to use the gsutil tool to automate uploading your content than to deploy it with the App Engine SDK. The infrastructure serving Cloud Storage files is the same as App Engine's, so there's really no advantage to using App Engine's static files feature.
Also, if you need a way to set up a custom domain, index page, and/or error pages, you may want to check out the guide on Configuring a Bucket as a Website.

Using Sqlite3 on Heroku Cedar stack

Is there a way to use Sqlite3 with Django on Heroku?
The Cedar stack's filesystem is not read-only.
However, you still mustn't store any data on it, because the filesystem is ephemeral.
Any time your application restarts, whatever you had written to your application's filesystem disappears forever.
Any time you add a dyno, each dyno has its own ephemeral filesystem; any data stored by one dyno on its ephemeral filesystem is not available to the other dyno, or to any additional dynos you may add later.
Sqlite3 writes data to the local filesystem. You cannot use Sqlite3 with Heroku.
Heroku provides a default PostgreSQL installation, which Heroku manages. You can use that.
You can also use any third-party-managed cloud database system, such as Amazon RDS or Xeround for MySQL, MongoHQ or MongoLab for MongoDB, or Cloudant for CouchDB - all of which are available as Heroku add-ons.
I'm not sure when this answer became out of date, but as of at least 21 Nov 2013, sqlite3 CAN be used on Heroku: https://devcenter.heroku.com/articles/sqlite3
It will work fine if you're just doing a tiny demo app, e.g. running 1 dyno, and don't care that the database gets wiped at least once every 24 hours. If not, the Heroku help article suggests migrating to Postgres.
Make sure the .db file is in your git directory somewhere and not in /tmp/, though, as it would be if, for instance, you were following the Flask tutorial app, flaskr.
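For a throwaway Django demo along those lines, that amounts to pointing the SQLite NAME at a path inside the project directory, e.g. (a sketch, with the usual caveat that the file is still wiped on every restart or deploy):

    # Sketch for a tiny demo only: keep the SQLite file inside the repo directory
    # (part of the slug) rather than /tmp. It is still reset on restarts and deploys.
    import os

    BASE_DIR = os.path.dirname(os.path.abspath(__file__))

    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.sqlite3',
            'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
        }
    }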
