Gunicorn throws error 403 when accessing static files - python

python==2.7.5, django==1.11.10, gunicorn==19.7.1, RHEL 7.4
I have a Django project at my job that was written by someone else.
It lived in the eventcat user's home directory, and over time we ran out of available disk space, so I had to move the project to /data/.
After I moved the project directory and set up a new environment, the static files are no longer loaded and return a 403 Forbidden error.
I know that gunicorn is not supposed to serve static files in production, but this is an internal project with low load and I have to deal with it as is.
The server is started with a self-written script (I changed the environment line to the new path):
#!/bin/sh
. ~/.bash_profile
. /data/eventcat/env/bin/activate
exec gunicorn -c gunicorn.conf.py eventcat.wsgi:application
The gunicorn.conf.py consists of:
bind = '127.0.0.1:8000'
backlog = 2048
workers = 1
worker_class = 'sync'
worker_connections = 1000
timeout = 120
keepalive = 2
spew = False
daemon = True
pidfile = 'eventcat.pid'
umask = 0
user = None
group = None
tmp_upload_dir = None
errorlog = 'er.log'
loglevel = 'debug'
accesslog = 'ac.log'
access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
proc_name = None
def post_fork(server, worker):
    server.log.info("Worker spawned (pid: %s)", worker.pid)

def pre_fork(server, worker):
    pass

def pre_exec(server):
    server.log.info("Forked child, re-executing.")

def when_ready(server):
    server.log.info("Server is ready. Spawning workers")

def worker_int(worker):
    worker.log.info("worker received INT or QUIT signal")
    # Dump a traceback for every running thread when a worker is interrupted.
    import threading, sys, traceback
    id2name = dict([(th.ident, th.name) for th in threading.enumerate()])
    code = []
    for threadId, stack in sys._current_frames().items():
        code.append("\n# Thread: %s(%d)" % (id2name.get(threadId, ""), threadId))
        for filename, lineno, name, line in traceback.extract_stack(stack):
            code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
            if line:
                code.append("  %s" % (line.strip()))
    worker.log.debug("\n".join(code))

def worker_abort(worker):
    worker.log.info("worker received SIGABRT signal")
All the files in the static directory are owned by the eventcat user, just like the directory itself.
I couldn't find any useful information in er.log or ac.log.
The server runs over HTTPS and there is an ssl.conf in the project directory. It has aliases for static and media pointing to the previous project location, and I changed all of these entries to the new paths, though I couldn't find where this config file is actually used.
Please advise how I can find out the cause of the issue. Which config files or anything else should I look into?
UPDATE:
Thanks to @ruddra, it turned out gunicorn wasn't serving the static files at all; httpd was. After changing the httpd config, everything is working.

As far as I know, Gunicorn does not serve static content. To serve static content, it's best to use either WhiteNoise or a reverse proxy such as NGINX or Apache. You can check Gunicorn's documentation on deploying behind NGINX.
If you want to use WhiteNoise, install it with:
pip install whitenoise
Then add WhiteNoise to MIDDLEWARE like this (inside settings.py):
MIDDLEWARE = [
    # 'django.middleware.security.SecurityMiddleware',
    'whitenoise.middleware.WhiteNoiseMiddleware',
    # ...
]
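WhiteNoise only serves what collectstatic has gathered, so the static settings need to point somewhere it can find them. A minimal sketch of the related settings, assuming a standard Django layout; the paths and the optional compressed storage backend are illustrative, not taken from the question:
import os

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

STATIC_URL = '/static/'
# WhiteNoise serves whatever collectstatic puts here
STATIC_ROOT = os.path.join(BASE_DIR, 'staticfiles')

# Optional: compressed, cache-busting storage provided by WhiteNoise
STATICFILES_STORAGE = 'whitenoise.storage.CompressedManifestStaticFilesStorage'
After that, run python manage.py collectstatic and restart Gunicorn.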

Related

Celery worker doesn't launch from Python

We have Python 3.6.1 set up with Django, Celery, and RabbitMQ on Ubuntu 14.04. Right now I'm using the Django debug server (for dev, since Apache isn't working). My current problem is that the celery workers get launched from Python and immediately die -- the processes show as defunct. If I use the same command in a terminal window, the worker gets created and picks up the task if there is one waiting in the queue.
Here's the command:
celery worker --app=myapp --loglevel=info --concurrency=1 --maxtasksperchild=20 -n celery_1 -Q celery
The same behavior occurs for whichever queues are being set up.
In the terminal, we see the output myapp.settings - INFO - Loading... followed by output that describes the queue and lists the tasks. When running from Python, the last thing we see is the Loading...
In the code, we do have a check to be sure we are not running the celery command as root.
These are the Celery settings from our settings.py file:
CELERY_ACCEPT_CONTENT = ['json','pickle']
CELERY_TASK_SERIALIZER = 'pickle'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_IMPORTS = ('api.tasks',)
CELERYD_PREFETCH_MULTIPLIER = 1
CELERYD_CONCURRENCY = 1
BROKER_POOL_LIMIT = 120 # Note: I tried this set to None but it didn't seem to make any difference
CELERYD_LOG_COLOR = False
CELERY_LOG_FORMAT = '%(asctime)s - %(processName)s - %(levelname)s - %(message)s'
CELERYD_HIJACK_ROOT_LOGGER = False
STATIC_URL = '/static/'
STATIC_ROOT = os.path.join(psconf.BASE_DIR, 'myapp_static/')
BROKER_URL = psconf.MQ_URI
CELERY_RESULT_BACKEND = 'rpc'
CELERY_RESULT_PERSISTENT = True
CELERY_ROUTES = {}
for entry in os.scandir(psconf.PLUGIN_PATH):
    if not entry.is_dir() or entry.name == '__pycache__':
        continue
    plugin_dir = entry.name
    settings_file = f'{plugin_dir}.settings'
    try:
        plugin_tasks = importlib.import_module(settings_file)
        queue_name = plugin_tasks.QUEUENAME
    except ModuleNotFoundError as e:
        logging.warning(e)
    except AttributeError:
        logging.debug(f'The plugin {plugin_dir} will use the general worker queue.')
    else:
        CELERY_ROUTES[f'{plugin_dir}.tasks.run'] = {'queue': queue_name}
        logging.debug(f'The plugin {plugin_dir} will use the {queue_name} queue.')
Here is the part that kicks off the worker:
class CeleryWorker(BackgroundProcess):
    def __init__(self, n, q):
        self.name = n
        self.worker_queue = q
        cmd = (f'celery worker --app=myapp --loglevel=info --concurrency=1 '
               f'--maxtasksperchild=20 -n {self.name} -Q {self.worker_queue}')
        super().__init__(cmd, cwd=str(psconf.BASE_DIR))

class BackgroundProcess(subprocess.Popen):
    def __init__(self, args, **kwargs):
        super().__init__(args, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                         universal_newlines=True, **kwargs)
Any suggestions as to how to get this working from Python are appreciated. I'm new to RabbitMQ/Celery.
Just in case someone else needs this... It turns out the problem was that the shell script which kicks off this whole app is now being launched with sudo, and even though I thought I was checking so we wouldn't launch the celery worker with sudo, I'd missed something and we were trying to launch it as root. That is a no-no. I'm now explicitly using 'sudo -u ' and the workers are starting properly.
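A minimal sketch of the kind of guard described above, reusing the command from the question; the account name worker_user and the helper build_worker_cmd are hypothetical, not from the original code:
import os

WORKER_USER = 'worker_user'  # hypothetical unprivileged account to run celery under

def build_worker_cmd(name, queue):
    cmd = (f'celery worker --app=myapp --loglevel=info --concurrency=1 '
           f'--maxtasksperchild=20 -n {name} -Q {queue}')
    if os.geteuid() == 0:
        # The parent script was started with sudo; re-launch the worker as a normal user.
        cmd = f'sudo -u {WORKER_USER} {cmd}'
    return cmd
The resulting string can then be passed to the BackgroundProcess wrapper shown above.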

How to set max_proc_per_cpu in Scrapyd

I have two Scrapy projects with the following configurations.
Project1's scrapy.cfg:
[settings]
default = Project1.settings
[deploy]
url = http://localhost:6800/
project = Project1
[scrapyd]
eggs_dir = eggs
logs_dir = logs
logs_to_keep = 500
dbs_dir = dbs
max_proc = 5
max_proc_per_cpu = 10
http_port = 6800
debug = off
runner = scrapyd.runner
application = scrapyd.app.application
and Project2's scrapy.cfg
[settings]
default = Project2.settings
[deploy]
url = http://localhost:6800/
project = Project2
[scrapyd]
eggs_dir = eggs
logs_dir = logs
logs_to_keep = 500
dbs_dir = dbs
max_proc = 5
max_proc_per_cpu = 10
http_port = 6800
debug = off
runner = scrapyd.runner
application = scrapyd.app.application
But when I look at http://localhost:6800/jobs I always see only 8 jobs running, which means the max_proc_per_cpu setting is not being applied. I deleted the projects with the following commands
curl http://localhost:6800/delproject.json -d project=Project1
curl http://localhost:6800/delproject.json -d project=Project2
and deployed them again to make sure the new changes were picked up, but the number of running spiders is still 8.
My VPS CPU has two cores, which I confirmed with
python -c 'import multiprocessing; print(multiprocessing.cpu_count())'.
How can I find out which configuration the deployed Scrapyd instance is actually using?
How can I set the max process per CPU?
According to the documentation, on Unix-like systems the configuration file is first looked for at /etc/scrapyd/scrapyd.conf. I put the configuration there, but it did not work. It finally worked when I kept the scrapyd.conf as a hidden file in the directory from which the Scrapyd server is started, which for me happened to be the home directory.
You can read about the details here: https://scrapyd.readthedocs.io/en/stable/config.html
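For what it's worth, the 8 running jobs are consistent with Scrapyd's default max_proc_per_cpu of 4 on a two-core machine, which suggests the [scrapyd] section in the projects' scrapy.cfg files is simply being ignored by the running daemon. A sketch of what the daemon-side config could look like, reusing the values from the question (the file path and comments are illustrative):
# ~/.scrapyd.conf (or /etc/scrapyd/scrapyd.conf)
[scrapyd]
# with max_proc = 0 the limit becomes cpu_count * max_proc_per_cpu
max_proc = 0
max_proc_per_cpu = 10
http_port = 6800
Restart the Scrapyd daemon after changing the file so the new limits take effect.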

UWSGI + nginx repeated logging in django

I am having some weird issues when I run my application on a dev server using uWSGI + nginx. It works fine when a request completes within 5-6 minutes, but for long deployments and requests taking longer than that, the uWSGI log starts repeating itself after around 5 minutes. It's as if it spawns another process, and I get two sets of logs (one for the current process and one for the repeated one). I am not sure why this is happening and did not find anything online. I am sure this is not related to my code, because the same thing works perfectly fine in the lab environment, where I use the Django runserver. Any insight would be appreciated.
The uwsgi.ini:
# netadc_uwsgi.ini file
#uid = nginx
#gid = nginx
# Django-related settings
env = HTTPS=on
# the base directory (full path)
chdir = /home/netadc/apps/netadc
# Django's wsgi file
module = netadc.wsgi
# the virtualenv (full path)
home = /home/netadc/.venvs/netadc
# process-related settings
# master
master = true
# maximum number of worker processes
processes = 10
buffer-size = 65536
# the socket (use the full path to be safe)
socket = /home/netadc/apps/netadc/netadc/uwsgi/tmp/netadc.sock
# ... with appropriate permissions - may be needed
#chmod-socket = 666
# daemonize
daemonize = true
# logging
logger = file:/home/netadc/apps/netadc/netadc/uwsgi/tmp/netadc_uwsgi.log
# clear environment on exit
vacuum = true

Gunicorn Flask Caching

I have a Flask application that is running using gunicorn and nginx. But if I change the value in the db, the application fails to update in the browser under some conditions.
I have a Flask script with the following commands:
import os

from flask_script import Manager  # Flask-Script provides the Manager/command helpers

from msldata import app, db, models

path = os.path.dirname(os.path.abspath(__file__))
manager = Manager(app)

@manager.command
def run_dev():
    app.debug = True
    if os.environ.get('PROFILE'):
        from werkzeug.contrib.profiler import ProfilerMiddleware
        app.config['PROFILE'] = True
        app.wsgi_app = ProfilerMiddleware(app.wsgi_app, restrictions=[30])
    if 'LISTEN_PORT' in app.config:
        port = app.config['LISTEN_PORT']
    else:
        port = 5000
    print app.config
    app.run('0.0.0.0', port=port)
    print app.config

@manager.command
def run_server():
    from gunicorn.app.base import Application
    from gunicorn.six import iteritems

    # workers = multiprocessing.cpu_count() * 2 + 1
    workers = 1
    options = {
        'bind': '0.0.0.0:5000',
    }

    class GunicornRunner(Application):
        def __init__(self, app, options=None):
            self.options = options or {}
            self.application = app
            super(GunicornRunner, self).__init__()

        def load_config(self):
            config = dict([(key, value) for key, value in iteritems(self.options)
                           if key in self.cfg.settings and value is not None])
            for key, value in iteritems(config):
                self.cfg.set(key.lower(), value)

        def load(self):
            return self.application

    GunicornRunner(app, options).run()
Now, if I run the server with run_dev in debug mode, db modifications are picked up.
If run_server is used, the modifications are not seen unless the app is restarted.
However, if I run it like gunicorn -c a.py app:app, the db updates are visible.
a.py contents
import multiprocessing
bind = "0.0.0.0:5000"
workers = multiprocessing.cpu_count() * 2 + 1
Any suggestions on where I am missing something?
I also ran into this situation: running Flask in Gunicorn with several workers, flask-cache won't work anymore.
Since you are already using
app.config.from_object('default_config') (or a similar filename)
just add this to your config:
CACHE_TYPE = "filesystem"
CACHE_THRESHOLD = 1000000  # some number your hard drive can manage
CACHE_DIR = "/full/path/to/dedicated/cache/directory/"
I bet you used "simplecache" before...
I was/am seeing the same thing, but only when running gunicorn with Flask. One workaround is to set Gunicorn's max-requests to 1. However, that's not a real solution if you have any kind of load, due to the resource overhead of restarting the workers after each request. I got around this by having nginx serve the static content, and by changing my Flask app to render the template, write it to static, and then return a redirect to the static file.
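A rough sketch of that render-then-redirect workaround; the route, template name, and load_data() helper are hypothetical, only the overall pattern is taken from the answer above:
import os
from flask import Flask, redirect, render_template, url_for

app = Flask(__name__)

@app.route('/report')
def report():
    # load_data() is a hypothetical stand-in for whatever produces the page data
    html = render_template('report.html', data=load_data())
    out_path = os.path.join(app.static_folder, 'report.html')
    with open(out_path, 'w') as f:
        f.write(html)
    # nginx serves /static/ directly, so later hits never reach the Flask workers
    return redirect(url_for('static', filename='report.html'))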
Flask-Caching's SimpleCache doesn't work with workers > 1 in Gunicorn.
I had a similar issue using Flask 2.0.2 and Flask-Caching 1.10.1.
Everything works fine in development mode until you put it on Gunicorn with more than one worker. One probable reason is that in development there is only one process/worker, so under those narrow circumstances SimpleCache happens to work.
My code was:
app.config['CACHE_TYPE'] = 'SimpleCache' # a simple Python dictionary
cache = Cache(app)
The solution that works with Flask-Caching is to use FileSystemCache; my code is now:
app.config['CACHE_TYPE'] = 'FileSystemCache'
app.config['CACHE_DIR'] = 'cache' # path to your server cache folder
app.config['CACHE_THRESHOLD'] = 100000 # number of 'files' before start auto-delete
cache = Cache(app)
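As a usage sketch of Flask-Caching with the FileSystemCache backend described above; the route, timeout, and view name are illustrative, not from the answer:
from flask import Flask
from flask_caching import Cache

app = Flask(__name__)
app.config['CACHE_TYPE'] = 'FileSystemCache'
app.config['CACHE_DIR'] = 'cache'          # per-server cache folder on disk
app.config['CACHE_THRESHOLD'] = 100000     # number of cached items before auto-delete
cache = Cache(app)

@app.route('/expensive')
@cache.cached(timeout=60)  # the on-disk cache is shared by all Gunicorn workers
def expensive_view():
    # ... do the slow work here ...
    return 'computed result'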

Why do I get a login prompt when I deploy the django-celery example app on dotcloud?

I've been struggling to get the demo application in django-celery working on dotcloud. I have looked at the tutorial at http://docs.dotcloud.com/0.9/tutorials/python/django-celery/ but it isn't a great deal of help.
The example application is a Django 1.4 app. I'm not sure why, but when I navigate to the deployed application, instead of the index page it presents me with a username/password popup. The message in the popup is:
The server at TheDomain requires a username and password. The server says: RabbitMQ Management.
Does anyone know why this behaviour has been added?
The differences from the django-celery example app are:
# Django settings for project in settings.py
import os
import json
import djcelery
# Load the dotCloud environment
with open('/home/dotcloud/environment.json') as f:
    dotcloud_env = json.load(f)
# Configure Celery using the RabbitMQ credentials found in the dotCloud
# environment.
djcelery.setup_loader()
BROKER_HOST = dotcloud_env['DOTCLOUD_BROKER_AMQP_HOST']
BROKER_PORT = int(dotcloud_env['DOTCLOUD_BROKER_AMQP_PORT'])
BROKER_USER = dotcloud_env['DOTCLOUD_BROKER_AMQP_LOGIN']
BROKER_PASSWORD = dotcloud_env['DOTCLOUD_BROKER_AMQP_PASSWORD']
BROKER_VHOST = '/'
I've replaced the database settings in the app with:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'template1',
        'USER': dotcloud_env['DOTCLOUD_DB_SQL_LOGIN'],
        'PASSWORD': dotcloud_env['DOTCLOUD_DB_SQL_PASSWORD'],
        'HOST': dotcloud_env['DOTCLOUD_DB_SQL_HOST'],
        'PORT': int(dotcloud_env['DOTCLOUD_DB_SQL_PORT']),
    }
}
I've also added a requirements.txt file with
Django==1.4
django-celery
setproctitle
and the dotcloud.yml file
www:
  type: python
broker:
  type: rabbitmq
workers:
  type: python-worker
db:
  type: postgresql
and the supervisor.conf
[program:djcelery]
directory = $HOME/current/
command = python manage.py celeryd -E -l info -c 2
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
[program:celerycam]
directory = $HOME/current/
command = python manage.py celerycam
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
and to the postinstall file I added
dotcloud_get_env() {
    sed -n "/$1/ s/.*: \"\(.*\)\".*/\1/p" < "$HOME/environment.json"
}
setup_django_celery() {
    cat > $HOME/current/supervisord.conf << EOF
[program:djcelery]
directory = $HOME/current/
command = python manage.py celeryd -E -l info -c 2
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
[program:celerycam]
directory = $HOME/current/
command = python manage.py celerycam
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
EOF
}
if [ `dotcloud_get_env SERVICE_NAME` = workers ] ; then
    setup_django_celery
fi
The last fi was added by me; it is not in the dotCloud tutorial.
Edit
I've whipped together a repo with this example, as when it works it should be quite useful for others. It's available at: https://github.com/asunwatcher/django-celery-dotcloud
This looks like an error in our CLI.
Try dotcloud url and you will see that your application has two URLs: one for your www service and one for your RabbitMQ service, which is the RabbitMQ management interface. You can log in there with the RabbitMQ username and password given in the dotCloud environment.
For some reason we're picking the wrong one to show you at the end of the push.
The URL for your www service is the one you want.
