I'm trying to optimize my code, and I ran into a problem I don't quite understand. Every page of my web app will show a list of notifications, much like Facebook's new ticker. So, at the beginning of every request, I run this code:
# Start the query asynchronously; run() returns an iterable that fetches
# results in the background while the rest of the request proceeds.
notification_query = db.Query(Ticker, keys_only=True)\
    .filter('friends =', self.current_user.key().name())
self._notifications_future = notification_query.run()
Then, when I reach a good spot, I call the middle function, which is:
# Turn the keys-only results into their parent keys and start an
# asynchronous batch get for the actual notification entities.
notification_keys = [future.parent() for future in self._notifications_future]
self._notifications = db.get_async(notification_keys)
Finally, I fetch them all at the end:
context.update({'notifications': self._notifications.get_result()})
Everything works great except for this: if I call the middle function at the end of the request handler, I get this:
[Appstats screenshot: dumb spot]
And if I call it at what I think is an optimized spot, I get this:
[Appstats screenshot: smart spot]
As you can see, the API usage is doubled by this "optimization". What is going on here?
Call number 2 is the first snippet in both cases. Call number 12 in the dumb spot is the second snippet, and call number 12 in the smart spot is the second snippet. This last switch has nothing to do with the problem; I've tested it.
PS: Does Google charge me for the time the query is "idle"?
UPDATE
The problem seems to occur only on the dev_server; when I tried the same example (the smart version) on appspot, I got this:
[Appstats screenshot: queries on appspot]
Here everything works as expected: things called with run() or get_async() don't block other work. As I said, the issue only appears on the dev_server. Still, it would be nice to see this behavior on localhost too, for more effective profiling.
In addition to Peter's note, you should bear in mind that Appstats has no way to know when an asynchronous request actually completes, except when you fetch the result. As a result, even fast calls will look slow if you take a long time to request the result.
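In other words (a contrived sketch of that effect; the intervening helper is made up):

future = notification_query.run()   # the RPC may complete almost immediately
do_lots_of_unrelated_work()         # ...while the request does other things
notifications = list(future)        # Appstats only sees completion here, so
                                    # the RPC appears to span the whole gap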
Ahhhh. When posting about App Engine, it is very helpful to mention whether your results are from the dev server or from production. The dev server does not have the same performance characteristics as the production servers, so it is not the best way to profile your application. I believe indexes are not used at all, and the dev server is single threaded, so it can't handle concurrent requests. In fact, if your app makes calls to itself, it won't work at all!
Related
I'm fairly new to web development, so this might actually be normal behavior, but when I make logic changes in my views, it can take about an hour for those changes to show up on my production site.
The changes are instant if I fire up localhost. The server is Windows IIS 7.5. HTML, CSS, and JS changes show up instantly; it's the code in the views that takes a while to filter through. Any ideas on what is causing this and how to fix it?
Have you tried manually recycling the application pool the site sits in under IIS? The documentation might not exactly match your version, but it should explain well enough what's going on:
https://technet.microsoft.com/en-us/library/cc753179(v=ws.10).aspx
Basically, if the application pool recycles every 3 hours, a change you make could take up to 3 hours to take effect. You don't want it recycling every 5 minutes either, but you can trigger a manual recycle whenever you really want to see your changes.
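If you'd rather script it than click through IIS Manager, a manual recycle can also be triggered from an elevated command prompt with appcmd (the pool name below is a placeholder; substitute your site's pool):

%windir%\system32\inetsrv\appcmd recycle apppool /apppool.name:"DefaultAppPool"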
I use Python's urllib library to check one webpage for updates every 5 seconds.
But after the program has been running for a few hours, urllib.urlopen(url) starts returning outdated data, usually delayed by 5-10 minutes. I need your help.
urlItem = urllib.urlopen("http://ka.game.163.com/")
htmlSource = urlItem.read()
urlItem.close()
This looks like a caching issue. A cache is used to optimize communication, so commonly requested data won't need to be re-requested all the time.
When you call urllib.urlopen, it uses the urllib.urlretrieve machinery under the hood. That function caches data locally, so to avoid this caching you should call urllib.urlcleanup() before each call to urllib.urlopen. This is stated in the documentation.
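A minimal sketch of that fix, applied to the snippet from the question:

import urllib

# Clear urllib's local cache before each poll so stale data is not reused.
urllib.urlcleanup()
urlItem = urllib.urlopen("http://ka.game.163.com/")
htmlSource = urlItem.read()
urlItem.close()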
Also, your question hits the same issue as described in this one; consider looking at it.
I am using Django and am building some long-running processes that I interact with only through my web user interface. They would run all the time, checking a database value every few minutes and stopping only once that value has changed (it would be a boolean true/false). So I want to be able to use Django to interact with these, but I am unsure how to do it. When I used PHP I had a way of doing this, and I figured it would be even easier in Python, but my searches haven't turned anything up.
Basically, all I want is to execute Python code without waiting for it to finish, so it just begins executing while Django goes on to do whatever else it needs, quickly returning a new page to the user.
I know there are ways to call an external program, so I suppose that may be the only way to go? Is there a way to do this by just calling other Python code?
Thanks for any advice.
Can't vouch for it because I haven't used it yet, but Celery does pretty much what you're asking for, and it was originally built specifically for Django.
http://celeryproject.org/
Their example of a simple task that adds two numbers:

from celery.decorators import task

@task
def add(x, y):
    return x + y
You can execute the task in the background, or wait for it to finish:
>>> result = add.delay(8, 8)
>>> result.wait() # wait for and return the result
16
You'll probably also need to install RabbitMQ to get it working, so it may be a more complicated solution than you're looking for, but it will achieve your goals.
You want an asynchronous message manager. I've got a tutorial on integrating Gearman with Django. Any pickleable Python object can be sent to Gearman, which will do all the work and post the results wherever you want; the tutorial includes examples of posting back to the Django database (it also shows how to use the ORM outside of Django).
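For a rough idea of what handing work to Gearman looks like with the python-gearman client (the job server address, task name, and payload here are all made up):

import pickle
import gearman  # python-gearman client library

# Submit a pickled payload as a background job; submit_job returns
# immediately when background=True, so the web request isn't blocked.
client = gearman.GearmanClient(['localhost:4730'])
payload = pickle.dumps({'user_id': 42, 'action': 'recalculate'})
client.submit_job('process_task', payload, background=True)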
I have a website I want to stay updated with, and I scrape some content from it every day. I know the site is updated manually at a certain time, and I've set cron schedules to reflect this, but since it is updated manually it could be 10 or even 20 minutes late.
Right now I have a hack-ish cron job that runs every 5 minutes, but I'd like to use the deferred library to do this more precisely. I'm trying to chain deferred tasks so I can check whether there was an update, defer that same check for a couple of minutes if there was none, and keep deferring until there finally is an update.
I have some code I thought would work, but it only ever defers once, when instead I need it to keep deferring until there is an update:
(I am using Python)
from google.appengine.ext import deferred

class Ripper(object):
    def rip(self):
        if siteHasNotBeenUpdated:
            # Not updated yet: schedule another check in 120 seconds.
            deferred.defer(self.rip, _countdown=120)
        else:
            updateMySite()
This was obviously just a simplified excerpt.
I thought this was simple enough to work, but maybe I've just got it all wrong?
The example you give should work just fine. You should add logging to determine whether deferred.defer is being called when you think it is. More information would help, too: how is siteHasNotBeenUpdated set?
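For instance, a minimal sketch of that logging, keeping the question's placeholder names siteHasNotBeenUpdated and updateMySite:

import logging
from google.appengine.ext import deferred

class Ripper(object):
    def rip(self):
        if siteHasNotBeenUpdated:
            logging.info('No update yet; deferring another check for 120s')
            deferred.defer(self.rip, _countdown=120)
        else:
            logging.info('Update found; updating site')
            updateMySite()

If 'No update yet' never shows up after the first run, that points at how siteHasNotBeenUpdated is being evaluated rather than at the deferred chaining itself.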
I would like to enable students to submit Python code solutions to a few simple Python problems. My application will be running on GAE. How can I limit the risk from malicious code that is submitted? I realize that this is a hard problem, and I have read related Stack Overflow and other posts on the subject. I am curious whether the restrictions already in place in the GAE environment make it simpler to limit the damage that untrusted code could inflict. Is it possible to simply scan the submitted code for a few restricted keywords (exec, import, etc.) and then ensure the code runs for less than a fixed amount of time, or is it still difficult to sandbox untrusted code even in the restricted GAE environment? For example:
# Import and execute untrusted code in GAE
untrustedCode = """#Untrusted code from students."""

class TestSpace(object):
    pass

testspace = TestSpace()

try:
    # Check the untrusted code somehow and raise an exception if it fails
    # (checkCode is a placeholder for whatever that check would be).
    checkCode(untrustedCode)
except Exception:
    print "Code attempted to import or access network"

try:
    # exec code in a new namespace (Thanks Alex Martelli)
    # limit runtime somehow
    exec untrustedCode in vars(testspace)
except Exception:
    print "Code took more than x seconds to run"
@mjv's smiley comment is actually spot-on: make sure the submitter IS identified and associated with the code in question (which presumably is going to be sent to a task queue), and log any diagnostics caused by an individual's submissions.
Beyond that, you can indeed prepare a test space that's more restrictive (thanks for the acknowledgment;-), including a special 'builtins' namespace that contains only what you want the students to be able to use and redefines __import__ &c. That, plus a token pass to forbid exec, eval, import, __subclasses__, __bases__, __mro__, ..., gets you closer. A totally secure sandbox in the GAE environment is a real challenge, however, unless you can whitelist a tiny subset of the language that the students are allowed to use.
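A bare-bones sketch of those two ingredients, the token pass and the restricted builtins namespace (illustrative only, and certainly NOT a complete sandbox):

import tokenize
import cStringIO

FORBIDDEN = set(['exec', 'eval', 'import',
                 '__subclasses__', '__bases__', '__mro__'])

def scan_tokens(source):
    # Reject the source outright if any forbidden token appears in it.
    readline = cStringIO.StringIO(source).readline
    for tok_type, tok_string, _, _, _ in tokenize.generate_tokens(readline):
        if tok_string in FORBIDDEN:
            raise ValueError('forbidden token: %r' % tok_string)

def run_restricted(source, allowed_builtins):
    scan_tokens(source)
    # Supplying our own __builtins__ keeps the real builtins out of reach,
    # at least superficially.
    namespace = {'__builtins__': allowed_builtins}
    exec source in namespace
    return namespace

# Only len and range are visible to the student code here.
result = run_restricted("squares = [x * x for x in range(10)]",
                        {'len': len, 'range': range})
print result['squares']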
So I would suggest a layered approach: the sandbox GAE app in which the students upload and execute their code has essentially no persistent layer to worry about; rather, it "persists" by sending urlfetch requests to ANOTHER app, which never runs any untrusted code and is able to vet each request very critically. Default-denial with whitelisting is still the holy grail, but with such an extra layer for security you may be able to afford a default-acceptance with blacklisting...
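A bare sketch of that handoff from the sandbox app, assuming a hypothetical /submit endpoint on the trusted app (student_id and output are placeholders):

import urllib
from google.appengine.api import urlfetch

# The sandbox app persists nothing itself; it reports results to a second,
# trusted app that vets every request before touching its datastore.
urlfetch.fetch('https://trusted-grader.appspot.com/submit',
               payload=urllib.urlencode({'student': student_id,
                                         'result': output}),
               method=urlfetch.POST)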
You really can't sandbox Python code inside App Engine with any degree of certainty. Alex's idea of logging who's running what is a good one, but if the user manages to break out of the sandbox, they can erase the event logs. The only place this information would be safe is in the per-request logging, since users can't erase that.
For a good example of what a rathole trying to sandbox Python turns into, see this post. For Guido's take on securing Python, see this post.
There are a couple of other options: if you're free to choose the language, you could run Rhino (a JavaScript interpreter) on the Java runtime; Rhino is nicely sandboxed. You may also be able to use Jython; I don't know whether it's practical to sandbox it, but it seems likely.
Alex's suggestion of using a separate app is also a good one. This is pretty much the approach that shell.appspot.com takes: It can't prevent you from doing malicious things, but the app itself stores nothing of value, so there's no harm if you do.
Here's an idea. Instead of running the code server-side, run it client-side with Skulpt:
http://www.skulpt.org/
This is both safer and easier to implement.