I have a free dyno instance running a simple worker that creates an RSS file and uploads it to PythonAnywhere (using it just like a web server for this static rss.xml file).
I am trying to move from PythonAnywhere to a web process using heroku-buildpack-static on the same worker dyno, but I cannot make it work. Looks like worker and web run in different folders / environments, and I cannot find where the file is located.
worker: python main.py
web: bin/boot
The main.py script writes the file to the current folder and uploads it successfully to PythonAnywhere, but I cannot see where this file is written on Heroku. I tried creating a folder /app/web and modifying main.py to write to it, but I still cannot see the file being created or updated (I used the Heroku console to check this). I think the worker uses a different home directory or instance, but I am not sure where that structure is located. I also created a .profile with the following command, without success:
chmod -R 777 /app/web
The app also contains a static.json file with the following content, to point to the correct folder and avoid caching:
{
  "root": "/app/web/",
  "headers": {
    "/": {
      "Cache-Control": "no-store, no-cache"
    }
  }
}
Looks like worker and web run in different folders / environments
Yes, that is exactly what is happening.
on the same worker Dyno
In fact, you are not on the same dyno. Your web process and your worker process execute in isolated environments. Consider this section of the documentation under the heading "Process types vs dynos":
A process type is the prototype from which one or more dynos are instantiated. This is similar to the way a class is the prototype from which one or more objects are instantiated in object-oriented programming.
You cannot write files to your web dyno from your worker dyno. They are entirely isolated and do not share a filesystem.
As msche has pointed out, dyno filesystems are ephemeral. Even if you do manage to write this file, e.g. by replacing the static host with a web service that exposes an API endpoint to accept the file, that file will be lost every time the dyno restarts. This happens frequently (at least once per day).
Even if you are writing the file every two minutes, as you say in your comment, your site will be broken for one minute every day on average. I suggest storing this data elsewhere, e.g. as a file on Amazon S3 or in a client-server data store.
Note that you can also host a static site directly from Amazon S3, which might be a good fit here.
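If you go that route, the worker can push the generated feed straight to S3. A minimal sketch with boto3, assuming a hypothetical bucket my-rss-bucket and AWS credentials supplied as Heroku config vars:

import boto3

# Assumes AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY are set as Heroku
# config vars, and that my-rss-bucket is replaced with your own bucket
# (configured to allow public reads, e.g. via static website hosting).
s3 = boto3.client("s3")
s3.upload_file(
    "rss.xml",          # the local file the worker just wrote
    "my-rss-bucket",    # hypothetical bucket name
    "rss.xml",          # key under which the feed is served
    ExtraArgs={
        "ContentType": "application/rss+xml",
        "CacheControl": "no-store, no-cache",  # mirrors the static.json headers
    },
)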
Heroku uses an ephemeral filesystem; see this link. The Heroku documentation suggests using third-party storage such as AWS S3.
Related
I'm using Python/Django on Heroku (Cedar stack) and I need to write a management command that will pull a file out of an S3 bucket and process it. I'm not sure I understand how to use the ephemeral filesystem. Are there only certain directories that are writable? I found another article that implied only certain folders were writable (but it doesn't seem to apply to the Cedar stack). I found this dev article, but it doesn't go into much detail (note: I do understand that it's just temporary; I only need to unzip the file and process it). Can I just create a folder anywhere under the application's root? And how would I get that? It seems like I could probably just use $HOME. I did a bit of testing by connecting via
$ heroku run bash
and running:
$ echo $HOME
returns:
/app
and running:
$ mkdir $HOME/tmp
creates a folder in the app's root with the same user and group as the other files and folders.
So... anything I'm missing here? A better way to do it? Is there an OS environment variable for this? I've run "env" and I don't see a better one.
To really understand the ephemeral filesystem, you need to understand what a dyno is. You can read more about how dynos work. In a nutshell, though, a process runs on Heroku in a virtual machine with its own filesystem. That virtual machine can stop for a number of reasons, taking the filesystem along with it.
The underlying filesystem will be destroyed when an app is restarted, reconfigured (e.g. heroku config ...), scaled, etc. For example, if you have two web dynos, write some files to the ephemeral filesystem, and scale to three dynos, those files will be destroyed because your app is running on new dynos.
In general, the ephemeral filesystem works just like any other filesystem. You can write files to any directory you have permission to write to, such as $HOME and /tmp. Any files that require permanence should be written to S3 or a similar durable store. S3 is preferred because Heroku runs on AWS, so S3 offers some performance advantages. Any files that can be recreated at will can be stored on the dyno's ephemeral store.
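For the unzip-and-process case in the question, that might look roughly like this; process() and data.zip are stand-ins for your own logic and the file you already pulled from S3:

import os
import tempfile
import zipfile

def process(path):
    # Stand-in for whatever per-file work the management command does.
    print("processing", path)

# TemporaryDirectory creates a scratch folder under /tmp on the dyno's
# ephemeral filesystem and removes it when the block exits.
with tempfile.TemporaryDirectory() as scratch:
    with zipfile.ZipFile("data.zip") as archive:  # e.g. downloaded from S3
        archive.extractall(scratch)
        names = archive.namelist()
    for name in names:
        process(os.path.join(scratch, name))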
You can create a file under the /tmp directory; that file will be discarded once the dyno stops or restarts, so treat it as per-request scratch space. I'm doing this on Cedar, and I haven't had any problems.
So I have a bit of an issue. I want to use Heroku to host my Flask web app, and I also want to use a Heroku pipeline linked to the GitHub repository where I am housing this project. The issue is that my website lets users upload files to the server, and I suspect that if I update the GitHub repository, I will lose all the files users uploaded when the server reloads with the new code. I would like to know if this is a real issue and, if so, whether there is some way to fix it.
Storing user-uploaded files on Heroku isn't a good idea, because Heroku provides an ephemeral filesystem.
The Heroku filesystem is ephemeral - that means that any changes to the filesystem whilst the dyno is running only last until that dyno is shut down or restarted. Each dyno boots with a clean copy of the filesystem from the most recent deploy. This is similar to how many container based systems, such as Docker, operate.
So even if you just restart your app, users will lose their files. But Heroku suggests some alternative ways to store them. As you are using Python, this add-on may help you.
Read More - https://help.heroku.com/K1PPS2WM/why-are-my-file-uploads-missing-deleted
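One common workaround is to stream each upload straight to S3 from the request handler, so nothing touches the dyno's disk. A rough Flask sketch, where the bucket name and form field name are placeholders:

import boto3
from flask import Flask, request

app = Flask(__name__)
s3 = boto3.client("s3")  # credentials come from Heroku config vars

@app.route("/upload", methods=["POST"])
def upload():
    f = request.files["file"]  # "file" is the form field name
    # upload_fileobj streams the uploaded file object straight to S3,
    # so nothing is written to the dyno's ephemeral filesystem.
    s3.upload_fileobj(f, "my-uploads-bucket", f.filename)
    return "uploaded", 201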
What is the easiest way to deploy a Python app with only two .py files but keep it on one dyno? My files are friend.py and foe.py, and my Procfile looks like this:
worker: python friend.py
worker: python foe.py
But when deployed to Heroku, the only dyno I have is foe.py. I've read other similar questions, but they seem too complicated, and I don't yet understand the inner workings of a Python web application.
If they are distinct processes working in parallel, the most direct path is two dynos, using different names in the Procfile (friend and foe would work fine as process names; see the example after this list). Right now you are using the name worker twice, so foe.py shows up because it's the last one defined. Two things to keep in mind:
The names in Procfile can be arbitrary; as far as I know, the only "special" name is web, which tells Heroku to expect that process to bind to a port and accept HTTP traffic from the routing mesh. worker isn't special; it's just a convention people tend to use for "something running other than the web dyno"
A dyno is closer to a Docker container than a virtual machine, so the general best practice is one kind of process per container.
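For example, the Procfile could simply read (the names are up to you):

friend: python friend.py
foe: python foe.py

You can then manage and scale each process independently, e.g. with heroku ps:scale friend=1 foe=1.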
If you really need only one dyno (cost?), you could write a third script whose sole job is to spawn friend.py and foe.py as subprocesses. In that case, everything comes up and down as a unit; you can't manage friend and foe independently.
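A rough sketch of such a wrapper, assuming a hypothetical run.py that the Procfile launches with worker: python run.py:

import subprocess
import sys
import time

# Start both scripts as children of this process. If either one exits,
# stop the other so the dyno goes down as a unit and restarts cleanly.
procs = [
    subprocess.Popen([sys.executable, "friend.py"]),
    subprocess.Popen([sys.executable, "foe.py"]),
]
try:
    while all(p.poll() is None for p in procs):
        time.sleep(1)
finally:
    for p in procs:
        if p.poll() is None:
            p.terminate()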
Hope that helps.
I am working on a Django-based application located at home/user/Documents/project/application on my disk. The application takes some values from the user and writes them into a file in a folder under the project directory, i.e. home/user/Documents/project/folder/file. While running the development server with python manage.py runserver, everything worked fine. However, after deployment, the application/views.py code that accesses the file via open('folder/path', 'w') can no longer reach it, because by default it looks in the /var/www folder when deployed via the Apache2 server using mod_wsgi.
Now, I am not putting the folder into /var/www because it is not good practice to put any Python code there, as it might become readable by clients, which is a major security threat. Please let me know how I can point the deployed application to read and write the correct file.
The real solution is to install your data files in /srv/data/myapp or some such, so that you can give the webserver user the correct permissions on only those directories. Whether you choose to put your code in /var/www or not is a separate question, but I would suggest putting at least your wsgi file there (and, of course, specifying your DocumentRoot correctly).
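One way to wire that up in Django is to define the data directory once in settings and build absolute paths from it, rather than relying on relative paths that depend on the server's working directory. A sketch, where DATA_DIR is a name invented here:

# settings.py
# Writable data directory, kept outside /var/www so the webserver
# never serves it directly. DATA_DIR is our own invented name.
DATA_DIR = "/srv/data/myapp"

# views.py
import os
from django.conf import settings
from django.http import HttpResponse

def save_values(request):
    # Build an absolute path instead of a relative one like 'folder/path',
    # so it works the same under runserver and under mod_wsgi.
    path = os.path.join(settings.DATA_DIR, "file")
    with open(path, "w") as f:
        f.write(request.POST.get("value", ""))
    return HttpResponse("saved")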
I am saving some information to a text file stored in my Heroku app. It can be updated by POST requests from a user on an iOS device. It all works, and it stores the information. But as you all know, the Heroku app goes idle after an hour. So after the server goes idle and I make a GET request, the information previously put there is lost?
There is a link in my Heroku apps like afternoon-springs.... /ResetAllInfo, but that link is never accessed; I watched the Heroku logs to check.
Any ideas?
It seems like Heroku does not support persistent writes to the filesystem; see the Heroku documentation, and here:
Each dyno gets its own ephemeral filesystem, with a fresh copy of the most recently deployed code. During the dyno’s lifetime its running processes can use the filesystem as a temporary scratchpad, but no files that are written are visible to processes in any other dyno and any files written will be discarded the moment the dyno is stopped or restarted.
So it's just not possible. Heroku suggests:
Use the provided PostgreSQL database instead of a filesystem database
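A minimal sketch of that suggestion with psycopg2, using the DATABASE_URL config var that the Heroku Postgres add-on sets (the notes table is invented for illustration):

import os
import psycopg2

# DATABASE_URL is set automatically when the Heroku Postgres add-on
# is attached to the app.
conn = psycopg2.connect(os.environ["DATABASE_URL"])
with conn:
    with conn.cursor() as cur:
        cur.execute("CREATE TABLE IF NOT EXISTS notes (body text)")
        cur.execute("INSERT INTO notes (body) VALUES (%s)",
                    ("info from a POST request",))
conn.close()
# Unlike a file on the ephemeral filesystem, this data survives
# dyno restarts and idling.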