Django cache, missing keys - python

I am trying to build a cache flow in which, on a user request, a big dict of 870 records is cached and stays in the cache for some time. Once the defined time has passed, the dict should be refreshed in the cache on the next request.
So I have created the following function:
import datetime

from django.core.cache import get_cache

def update_values_mapping():
    cache_en = get_cache('en')
    values_dict = get_values_dict()  # this makes a request to obtain the dict with values
    cache_en.set_many(values_dict, 120)  # 120 s for testing
    cache_en.set('expire', datetime.datetime.now() + datetime.timedelta(seconds=120))
Then, in a second function, I try to get values from the cache:
import datetime

from django.core.cache import get_cache

def get_value_details(_id):
    cache = get_cache('en')
    details = cache.get(_id, {})  # values in the cache have an expire date, so they should eventually be gone
    expire = cache.get('expire', None)
    if not details and expire and expire < datetime.datetime.now():
        update_values_mapping()
        details = cache.get(_id, {})
    return details
While rendering a view, get_value_details() is called many times to obtain all the needed values.
The problem is that some of the values are missing, e.g. cache.get('b', {}) returns {} even though the value 'b' was saved to the cache (and its expire date has not passed yet). The missing values change: sometimes it is 'a', sometimes 'b', another time 'c', etc.
I have been testing it on LocMemCache and DummyCache so far.
My example cache settings:
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'LOCATION': 'cache-default'
    },
    'en': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'LOCATION': 'cache-en'
    },
    'pl': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'LOCATION': 'cache-pl'
    }
}
When I was playing with this in a console, some of the values were disappearing from the cache after the next call to update_values_mapping(), but some were missing from the beginning.
Does anyone have any clue what it could be?
Or maybe a suggestion on how to implement the described flow in another way?

LocMemCache is exactly that - a local memory cache. That means it's local to the particular server process, and won't be visible either in other processes or in the console.
If you need something that is shared across all processes, you should use a proper cache backend like memcached or redis.
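For example, a minimal settings sketch pointing the 'en' cache at a shared Memcached instance might look like the following (the Memcached location is an assumption; adjust it for your deployment, or use a Redis backend instead):

# settings.py - sketch of a shared backend for the 'en' cache (assumed Memcached on 127.0.0.1:11211)
CACHES = {
    'en': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    },
}

With a shared backend like this, all worker processes and the console see the same keys.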

Related

django huey is always returning empty queryset while filtering

@db_task()
def test_db_access(tenant_id, batch_obj):
    print('DBAccess')
    print(tenant_id)
    print(batch_obj.id)
    files = File.objects.filter(batch_id=batch_obj.id)
    print(files)
If I run this in Django without django-huey, I get a filtered queryset, but if I start using django-huey, I always get an empty queryset. Only 'DBAccess' gets printed and files is always '[]'.
Do I have to add other settings in settings.py?
These are my current huey settings:
# Huey - Task Queue
HUEY = {
    'name': 'appname',
    'consumer': {
        'workers': 4,
        'worker_type': 'thread'
    },
    'immediate': False,
    'connection': {
        'host': RedisConfig.HOST,
        'port': RedisConfig.PORT,
    },
}
You're trying to pass an object as an argument to that function and it's probably not getting serialized - huey uses pickle to serialize function call data that is passed to the consumer. Instead, change your function to accept a batch_obj identifier like this:
@db_task()
def test_db_access(tenant_id, batch_obj_id):
    print('DBAccess')
    print(tenant_id)
    print(batch_obj_id)
    files = File.objects.filter(batch_id=batch_obj_id)
    print(files)
and pass in batch_obj_id=batch_obj.id when you're calling test_db_access. Alternatively, you can write a custom serializer, but it should be much simpler to just pass the numerical identifier.
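For illustration, the call site would then look something like this (the surrounding variable names are the ones from your snippet):

# enqueue the task with the plain primary key instead of the model instance
test_db_access(tenant_id, batch_obj_id=batch_obj.id)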

Cannot seem able to modify cache value from celery task

Description:
I want to have a cached value (let's call it a flag) to know when a celery task finishes execution.
I have a view for the frontend to poll this flag until it turns to False.
Code:
settings.py:
...
MEMCACHED_URL = os.getenv('MEMCACHED_URL', None)  # Cache of devel or production
if MEMCACHED_URL:
    CACHES = {
        'default': {
            'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
            'LOCATION': MEMCACHED_URL,
        }
    }
else:
    CACHES = {
        'default': {
            'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
            'LOCATION': 'unique-snowflake',
        }
    }
api/views.py:
def a_view(request):
    # Do some stuff
    cache.add(generated_flag_key, True)
    tasks.my_celery_task.apply_async([argument_1, ..., generated_flag_key])
    # Checking here with cache.get(generated_flag_key), the value is True.
    # Do other stuff.
tasks.py:
@shared_task
def my_celery_task(argument_1, ..., flag_cache_key):
    # Do stuff
    cache.set(flag_cache_key, False)  # Checking here with cache.get(flag_cache_key),
                                      # the flag_cache_key value is False
views.py:
def get_cached_value(request, cache_key):
    value = cache.get(cache_key)  # This remains True until the cache key expires.
Problem:
If I run the task synchronously everything works as expected. When I run the task asynchronously, the cache key stays the same (as expected) and it is correctly passed around through those 3 methods, but the cached value doesn't seem to be updated between the task and the view.
If you run your tasks asynchronously, they run in separate processes, which means that with the LocMemCache backend the task and the view will not use the same storage (each process has its own memory).
Since @Linovia's answer and a dive into Django's documentation, I am now using django-redis as a workaround for my case.
The only thing that needs to change is the CACHES settings (and an active Redis server of course!):
settings.py:
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": 'redis://127.0.0.1:6379/1',
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
        }
    }
}
Now the cache storage is a single store shared by all processes.
django-redis is a well-documented library and one can follow the instructions to make it work.
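For completeness, a minimal illustration of the flag round-trip once the shared backend is in place (the key name is hypothetical):

from django.core.cache import cache

cache.add('task-finished-flag', True)    # view: set the flag before dispatching the task
cache.set('task-finished-flag', False)   # worker: clear the flag when the task finishes
cache.get('task-finished-flag')          # polling view: now sees False as well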

getChanges Sharepoint rest API

I am using Sharepoint 2013 REST api to find out the incremental changes that have happened in the root site. My request is like below:
headers = {"Authorization": 'Bearer ' + access_token, "accept": "application/json", "odata": "verbose"}
headers["content-type"] = "application/json;odata=verbose"
body = {
    'query': {
        '__metadata': {'type': 'SP.ChangeQuery'},
        'Web': True, 'Update': True, 'Add': True,
        'ChangeTokenStart': {
            '__metadata': {'type': 'SP.ChangeToken'},
            'StringValue': '1;1;5b9752ee-f410-4cc6-9ab6-eb18c2ad802f;636252579049500000;89866182'
        }
    }
}
In the response I am getting a lot of change objects. One of them is below:
{
    'odata.type': 'SP.ChangeWeb',
    'ChangeToken': {
        'StringValue': '1;1;5b9752ee-f410-4cc6-9ab6-eb18c2ad802f;636252779425600000;89976872'
    },
    'WebId': '6e21eadd-4155-494d-9a8e-1046865bdd4b',
    'ChangeType': 2,
    'odata.id': 'https://<site url>/_api/SP.ChangeWeb87f1a9c6-937b-4507-973d-fc2d1b949aed',
    'SiteId': '5b9752ee-f410-4cc6-9ab6-eb18c2ad802f',
    'odata.editLink': 'SP.ChangeWeb87f1a9c6-937b-4507-973d-fc2d1b949aed',
    'Time': '2017-03-16T16:19:02.56Z'
}
Can somebody help me understand the response? I am having difficulty finding the path where the change happened. Also, would this GetChanges API capture changes that have happened in subsites within the site?
Yes, Lists and Libraries are, at the end of the day, the same thing. You can get the list title from odata.editLink by stripping off the last segment (Items(1) in the above case). If you call that path it'll give you the details of the list rather than the item/file that was modified. If you want the user's details, call /_api/Web/lists/getbytitle('User Information List')/Items(EditorId). If you want the path to the item/file, call odata.editLink; the serverrelativeurl property returned will have the path to it, and title will have the title of the item/file.
Sure. The ChangeType is the main piece of information you need; it is an enumeration, and you can look up the friendly names for the numbers here: ChangeType Enumeration.
So in this case, it looks like an update to the settings of the SPWeb with a GUID of '6e21eadd-4155-494d-9a8e-1046865bdd4b'.
You may also want to look at using the $expand operator in your REST query to get additional fields back.
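As an illustration, a small lookup table for a few of the enumeration values can make the responses easier to read (the numeric values below are my reading of the ChangeType enumeration; verify them against the linked documentation):

# Partial mapping of ChangeType values to friendly names (assumed from the docs - verify there)
CHANGE_TYPES = {1: 'Add', 2: 'Update', 3: 'DeleteObject', 4: 'Rename'}

change = {'odata.type': 'SP.ChangeWeb', 'ChangeType': 2}
print(CHANGE_TYPES.get(change['ChangeType'], 'Unknown'))  # -> 'Update'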

Converting Bottle FORMSDICT to a Python dictionary (in a thread safe way)

I just worked through the Bottle tutorial and found the helpful table below (I hope I get the format right) showing where which types of request attributes can be accessed:
Attribute            GET Form fields   POST Form fields   File Uploads
BaseRequest.query    yes               no                 no
BaseRequest.forms    no                yes                no
BaseRequest.files    no                no                 yes
BaseRequest.params   yes               yes                no
BaseRequest.GET      yes               no                 no
BaseRequest.POST     no                yes                yes
Of course I wanted to try it out myself, but because Bottle's data structures are special thread-safe versions, and I wanted to use json to print them in a sensible format, I wrote the following (working) test program:
from bottle import run, route, request, response, template, Bottle
import uuid
import json
import os

ondersoek = Bottle()

@ondersoek.get('/x')
@ondersoek.post('/x')
def show_everything():
    PythonDict = {}
    PythonDict['forms'] = {}
    for item in request.forms:
        PythonDict['forms'][item] = request.forms.get(item)
    PythonDict['query'] = {}
    for item in request.forms:
        PythonDict['query'][item] = request.query.get(item)
    # The below does not work - multipart/form-data doesn't serialize in json
    #PythonDict['files'] = {}
    #for item in request.files:
    #    PythonDict['files'][item] = request.files.get(item)
    PythonDict['GET'] = {}
    for item in request.GET:
        PythonDict['GET'][item] = request.GET.get(item)
    PythonDict['POST'] = {}
    for item in request.POST:
        PythonDict['POST'][item] = request.POST.get(item)
    PythonDict['params'] = {}
    for item in request.params:
        PythonDict['params'][item] = request.params.get(item)
    return json.dumps(PythonDict, indent=3) + "\n"

ondersoek.run(host='localhost', port=8080, reloader=True)
This works, I get:
tahaan@Komputer:~/Projects$ curl -G -d dd=dddd http://localhost:8080/x?q=qqq
{
   "files": {},
   "GET": {
      "q": "qqq",
      "dd": "dddd"
   },
   "forms": {},
   "params": {
      "q": "qqq",
      "dd": "dddd"
   },
   "query": {},
   "POST": {}
}
And
tahaan@Komputer:~/Projects$ curl -X POST -d dd=dddd http://localhost:8080/x?q=qqq
{
   "files": {},
   "GET": {
      "q": "qqq"
   },
   "forms": {
      "dd": "dddd"
   },
   "params": {
      "q": "qqq",
      "dd": "dddd"
   },
   "query": {
      "dd": null
   },
   "POST": {
      "dd": "dddd"
   }
}
I'm quite sure that this is not thread safe, because I'm copying the data one item at a time from the Bottle data structure into a native Python data structure. Right now I'm still using the default non-threaded server, but for performance reasons I will want to use a threaded server like CherryPy at some point in the future. The question therefore is: how do I get data out of Bottle, or any other similar thread-safe dict, into something that can (easily) be converted to JSON? Does Bottle by any chance expose a FormsDict-to-JSON function somewhere?
Your code is thread safe. I.e., if you ran it in a multithreaded server, it'd work just fine.
This is because a multithreaded server still only assigns one request per thread. You have no global data; all the data in your code is contained within a single request, which means it's within a single thread.
For example, the Bottle docs for the Request object say (emphasis mine):
A thread-local subclass of BaseRequest with a different set of attributes for each thread. There is usually only one global instance of this class (request). If accessed during a request/response cycle, this instance always refers to the current request (even on a multithreaded server).
In other words, every time you access request in your code, Bottle does a bit of "magic" to give you a thread-local Request object. This object is not global; it is distinct from all other Request objects that may exist concurrently, e.g. in other threads. As such, it is thread safe.
Edit in response to your question about PythonDict in particular: This line makes your code thread-safe:
PythonDict={}
It's safe because you're creating a new dict every time a thread hits that line of code; and each dict you're creating is local to the thread that created it. (In somewhat more technical terms: it's on the stack.)
This is in contrast to the case where your threads were sharing a global dict; in that case, your suspicion would be right: it would not be thread-safe. But in your code the dict is local, so no thread-safety issues apply.
Hope that helps!
As far as I can see there's no reason to believe that there's a problem with threads, because your request is being served by Bottle in a single thread. Also there are no asynchronous calls in your own code that could spawn new threads that access shared variables.
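As an aside, if the goal is simply a plain dict that json.dumps can handle, Bottle's FormsDict can be copied in one step rather than item by item; here is a minimal sketch, assuming it runs inside a route handler (dict() keeps a single value per key, while getall() keeps every value):

# Sketch: copy Bottle's request data into plain dicts for JSON serialization
flat = dict(request.params)                                      # one value per key
multi = {k: request.params.getall(k) for k in request.params}    # all values per key
return json.dumps({"flat": flat, "multi": multi}, indent=3)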

How to do a custom insert inside a python-eve app

I have some custom flask methods in an eve app that need to communicate with a telnet device and return a result, but I also want to pre-populate data into some resources after retrieving data from this telnet device, like so:
@app.route("/get_vlan_description", methods=['POST'])
def get_vlan_description():
    switch = prepare_switch(request)
    result = dispatch_switch_command(switch, 'get_vlan_description')
    # TODO: populate vlans resource with result data and return status
My settings.py looks like this:
SERVER_NAME = '127.0.0.1:5000'
DOMAIN = {
    'vlans': {
        'id': {
            'type': 'integer',
            'required': True,
            'unique': True
        },
        'subnet': {
            'type': 'string',
            'required': True
        },
        'description': {
            'type': 'boolean',
            'default': False
        }
    }
}
I'm having trouble finding docs or source code for how to access a mongo resource directly and insert this data.
Have you looked into the on_insert hook? From the documentation:
When documents are about to be stored in the database, both on_insert(resource, documents) and on_insert_<resource>(documents) events are raised. Callback functions could hook into these events to arbitrarily add new fields, or edit existing ones. on_insert is raised on every resource being updated while on_insert_<resource> is raised when the <resource> endpoint has been hit with a POST request. In both circumstances, the event will be raised only if at least one document passed validation and is going to be inserted. documents is a list and only contains documents ready for insertion (payload documents that did not pass validation are not included).
So, if I get what you want to achieve, you could have something like this:
def telnet_service(resource, documents):
    """
    fetch data from telnet device;
    update 'documents' accordingly
    """
    pass

app = Eve()
app.on_insert += telnet_service

if __name__ == "__main__":
    app.run()
Note that this way you don't have to mess with the database directly as Eve will take care of that.
If you don't want to store the telnet data but only send it back along with the fetched documents, you can hook to on_fetch instead.
Lastly, if you really want to use the data layer you can use app.data.driver, as seen in this example snippet.
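A rough sketch of that last option, assuming the default Mongo data layer and using placeholder field values:

# Sketch: write the telnet results straight into the 'vlans' collection via Eve's data layer
with app.app_context():
    app.data.driver.db['vlans'].insert_one(
        {'id': 10, 'subnet': '10.0.10.0/24', 'description': False}
    )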
use post_internal
Usage example:
from run import app
from eve.methods.post import post_internal

payload = {
    "firstname": "Ray",
    "lastname": "LaMontagne",
    "role": ["contributor"]
}

with app.test_request_context():
    x = post_internal('people', payload)
    print(x)
