Is it possible to 'step into' a method to make changes when needed? - python

I’m working on a Python script that accesses a number of websites. I don’t think writing a whole class for each website is feasible, so I am looking for a way to reduce the code, reuse as much of it as I can, and write it as efficiently as I can. Although the websites share the same or a similar engine (e.g. WordPress), they often differ in small details such as cookie names, URL paths (e.g. example.com/signin.php and example2.com/login.php), field names and values for POST requests, and so on.
In most cases I could use variables, but sometimes I need to extract extra data from a retrieved URL in the middle of a method before continuing, which means adding a few lines of code that must run before the method proceeds. I found that a possible solution could, in theory, be to use superclasses, but I couldn’t find any information on how to actually ‘step into’ a method to make changes. To better illustrate my point, take a look at this example:
class Wordpress(object):
    def login(self):
        # Do some stuff here
        pass

    def newentry(self, title, message, iconid):
        # Code to check if I'm logged in
        # and do a bunch of other stuff
        #
        # Retrieve a page that *** sometimes contains important information,
        # in which case it must be processed to continue! ***
        data = {'name': title,
                'message': message,
                'icon': iconid}
        # Post new entry here
As I already commented in the example above, in some cases I need to make an adjustment inside the method itself: add a snippet of code, change the value of a variable or of an entry in the dictionary, and so on. How can I achieve that, if it's even possible? Maybe my perception of superclasses is way off and they aren't really for what I think they are.
Maybe there's another way I haven't thought of. Feel free to post your own ideas on how you would solve this problem. I am looking forward to your replies. Thank you.

Why can't you just call a process function from the main method? That function can do whatever you want it to do, and you can even override it in derived classes, e.g.:
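class WordPress(object):
    def newentry(self):
        # get_data() stands for whatever retrieves the page data
        data = self.get_data()
        data = self.process_data(data)
        # do more generic things

    def process_data(self, data):
        # default: return the data unchanged; override in derived classes
        return data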
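The override idea above can be sketched a little further. This is only an illustration under stated assumptions: the class names BlogEngine and TokenSite, the fetch_page and process_page hooks, the token format, and the 'subject' field are all hypothetical, and the HTTP requests are replaced by canned strings so the example runs on its own.

```python
class BlogEngine(object):
    """Base class: newentry() is the fixed skeleton, the hooks are the variable steps."""

    def fetch_page(self):
        # Hypothetical stand-in for the HTTP request that retrieves the page.
        return "<html>nothing special here</html>"

    def process_page(self, page, data):
        # Default hook: most sites need no extra processing.
        return data

    def newentry(self, title, message, iconid):
        page = self.fetch_page()
        data = {'name': title, 'message': message, 'icon': iconid}
        # Site-specific adjustments happen only inside the hook.
        data = self.process_page(page, data)
        return data  # a real script would POST this dictionary


class TokenSite(BlogEngine):
    """A site that hides a token in the page and names one field differently."""

    def fetch_page(self):
        return "<html>token=abc123</html>"

    def process_page(self, page, data):
        # Pull the (hypothetical) token out of the retrieved page.
        token = page.split("token=")[1].split("<")[0]
        data = dict(data, token=token)
        # This site expects 'subject' instead of 'name'.
        data['subject'] = data.pop('name')
        return data


print(BlogEngine().newentry('Hello', 'First post', 1))
print(TokenSite().newentry('Hello', 'First post', 1))
```

The base class keeps the shared flow in one place, and a subclass only overrides the step that differs, so nothing has to be edited inside newentry itself.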

Related

SQLAlchemy Class variable as Enum/Dict?

I'm trying to create a model in SQLAlchemy, but I'm having a hard time finding out what the best way is. I currently have a class called returns, to which I want to add a variable: the state the return is in. For example, a return can be expected, received or processed. However, in the Flask application I want to show the user a nice string. For example, processed should become "Waiting for reimbursement".
The problem, however, is that I don't want to send these strings to the database, since I might change them in the future or add statuses. Therefore I want some kind of translation between the value saved in the DB and the 'string' value. I have tried solving this by using Enums, but it is not possible to create the 'string' values. I would like something like this to return either the 'key' or the 'value', where only the key is saved in the database.
return.status.key
return.status.value
I have tried looking for a solution but was not able to find anything that seems to be fit.
What is the best practice for these kinds of requirements?
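One possible direction, sketched with only Python's standard enum module: the member name acts as the key and the member value as the display string. The class name ReturnStatus, the key property, and the status_from_db helper are hypothetical names chosen for illustration. As far as I know, SQLAlchemy's Enum column type stores the member names when it is given a Python Enum class, but that part is not shown here and should be checked against the SQLAlchemy documentation.

```python
from enum import Enum


class ReturnStatus(Enum):
    # Member name = key stored in the database; member value = display string.
    expected = "Expected"
    received = "Received"
    processed = "Waiting for reimbursement"

    @property
    def key(self):
        # Alias for the built-in .name attribute, to match the wording above.
        return self.name


def status_from_db(key):
    # Look a member up by the key that was saved in the database.
    return ReturnStatus[key]


status = status_from_db("processed")
print(status.key)
print(status.value)
```

Because only the names are persisted in this sketch, the display strings can be reworded later without touching stored rows.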

Web2Py Multiple Smartgrids in a view

I am trying to build a view in web2py that has multiple smartgrid objects served from the same controller. It displays them absolutely fine, but whenever I try to create a new record on the second table it doesn't allow entry, it just seems to refresh the page. Also trying to search on the second table actually fills in the search field on the first table too, so there is obviously some confusion as to which smartgrid is which.
In my research I came across the multiple form technique using process to name each form, see below:
form1.process(formname='form1')
However, this methodology doesn't seem to work for smartgrid objects (as far as I can tell). I guess I could try to create my own new SQLFORM.grid, but it seems a shame that I can't make better use of the smartgrids, as they have everything I need already.
Any help appreciated.
As you have noted, you cannot have two grids on the same page in this manner, as the grid makes use of the URL to determine its current state. Instead of the iframe approach, though, you might consider putting each grid in an Ajax component. In the main view file:
{{=LOAD('default', 'grid1.load', ajax=True)}}
{{=LOAD('default', 'grid2.load', ajax=True)}}
Of course, you can also serve both grids from the same action by specifying a URL arg to differentiate them.
To allow the grid machinery to deal with generated URLs like
/app/default/grid1.load/view/record/1?_signature=88ce76119afc68bbb141fce098cbc2eaf39289e3
(here, a view of a single record), you must identify each grid uniquely, so construct your grids with the formname keyword. Example:
def manage_records():
    q_record = (db.record.au==auth.user_id)
    return dict(record = SQLFORM.grid(q_record, formname='records'))

def manage_reports():
    q_report = (db.report.au==auth.user_id)
    return dict(record = SQLFORM.grid(q_report, formname='reports'))
Just as Antony has pointed out, you can use the LOAD() functionality.
You can omit ajax=True if you want the grids to load together with the main page.
In the view with the two grids:
<h2>Reports</h2>
{{=LOAD('default', 'manage_reports.load')}}
<h2>Records</h2>
{{=LOAD('default', 'manage_records.load')}}

How to explicitly define the query used in subqueryload_all?

I'm using subqueryload/subqueryload_all pretty heavily, and I've run into the edge case where I tend to need to very explicitly define the query that is used during the subqueryload. For example I have a situation where I have posts and comments. My query looks something like this:
posts_q = db.query(Post).options(subqueryload(Post.comments))
As you can see, I'm loading each Post's comments. The problem is that I don't want all of the posts' comments: I also need to take a deleted field into account, and the comments need to be ordered by create time, descending. The only way I have seen this done is by adding options to the relationship() declaration between posts and comments. I would prefer not to do this, because it means the relationship can no longer be reused everywhere, and I have other places in the app where those constraints may not apply.
What I would love to do, is explicitly define the query that subqueryload/subqueryload_all uses to load the posts' comments. I read about DisjointedEagerLoading here, and it looks like I could simply define a special function that takes in the base query, and a query to load the specified relationship. Is this a good route to take for this situation? Anyone ever run into this edge case before?
The answer is that you can define multiple relationships between Posts and Comments:
class Post(...):
    active_comments = relationship(Comment,
        primaryjoin=and_(Comment.post_id == Post.post_id, Comment.deleted == False),
        order_by=Comment.created.desc())
Then you should be able to subqueryload by that relationship:
posts_q = db.query(Post).options(subqueryload(Post.active_comments))
You can still use the existing .comments relationship elsewhere.
I also had this problem, and it took me some time to realize that this is an issue by design. When you say Post.comments, you refer to the relationship that says "these are all the comments of that post". However, now you want to filter them. If you specified that condition somewhere on subqueryload, you would essentially be loading only a subset of values into Post.comments, so there would be values missing. Essentially, you would have a faulty representation of your data in the model.
The question is how to approach this, because you obviously need this value somewhere. The way I go is to build the subquery myself and specify the special conditions there. That means you get two objects back: the list of posts and the list of comments. That is not a pretty solution, but at least it does not display data in a wrong way. If you were to access Post.comments for some reason, you can safely assume it contains all comments.
But there is room for improvement: You might want to have this attached to your class so you don't carry around two variables. The easy way might be to define a second relationship, e.g. published_comments which specifies extra parameters. You could then also control that no-one writes to it, e.g. with attribute events. In these events you could, instead of forbidding manipulation, handle how manipulation is allowed. The only problem might be when updates happen, e.g. when you add a comment to Post.comments then published_comments won't be updated automatically because they are not aware of each other. Again, I'd take events for this if this is a required feature (but with the above ugly solution you would not have that either).
As a last, hybrid, solution you could take the first approach and then just assign those values to your object, e.g. Post.deleted_comments = deleted_comments.
The thing to keep in mind here is that it is generally not a clever idea to manipulate the query the ORM makes as this could lead to problems later on. I have taken this approach and manipulated the queries (with contains_eager this is easily possible) but it has created problems on some points (while generally being functional) so I dropped that approach.

"Too much contention" when creating new entity in Datastore

This morning my GAE application generated several error logs: "too much contention on these datastore entities. please try again.". As I understand it, this type of error only happens when multiple requests try to modify the same entity, or entities in the same entity group.
When I got this error, my code was inserting new entities. I'm confused. Does this mean there is a limit on how fast we can create new entities?
My model definition and calling sequence are shown below:
# model definition
class ExternalAPIStats(ndb.Model):
    uid = ndb.StringProperty()
    api = ndb.StringProperty()
    start_at = ndb.DateTimeProperty(auto_now_add=True)
    end_at = ndb.DateTimeProperty()

# calling sequence (the keyword must match the property name, uid)
stats = ExternalAPIStats(uid=current_uid, api="eapi:hr:get_by_id", start_at=start_at, end_at=end_at)
stats.put() # **too much contention** happens here
That's pretty mysterious to me. I was wondering how I should deal with this problem. Please let me know if you have any suggestions.
I can't tell how the calls are made (you show the calling code, but not how often it is called: in a loop, or from many pages calling the same put at the same time), but I believe the issue is better explained here. In particular:
You will also see this problem if you create new entities at a high rate with a monotonically increasing indexed property like a timestamp, because these properties are the keys for rows in the index tables in Bigtable.
with 'start_at' being the culprit. This article explains it in more detail.
Possibly (though untested), try doing your puts in batches. Do you run queries on the 'start_at' field? If not, removing its index should also fix the issue.
How are the puts called (i.e. what I was asking above: in a loop, or from multiple pages)? With that information it might be easier to narrow down the issue.
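The batching suggestion can be sketched as follows. This is an untested idea in terms of whether it actually reduces contention, as the answer itself says. The helper names chunked and put_in_batches are hypothetical, the batch size of 100 is an arbitrary assumption, and the writer is passed in as a plain callable so the example runs without App Engine; in an ndb application, ndb.put_multi would be the natural candidate for that callable.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def put_in_batches(entities, put_multi, batch_size=100):
    """Write entities one batch at a time and return the number of batches."""
    batches = 0
    for batch in chunked(entities, batch_size):
        put_multi(batch)  # one call per batch instead of one call per entity
        batches += 1
    return batches


# Usage with a stand-in writer that just records what it was given.
written = []
count = put_in_batches(list(range(250)), written.append, batch_size=100)
print(count, [len(batch) for batch in written])
```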
Here is everything you need to know about Datastore Contention and how to avoid it:
https://developers.google.com/appengine/articles/scaling/contention?hl=en
(Deleted)
UPDATE:
You are reaching the writes-per-second limit on the same entity group. By default, it is 1 write per second.
https://cloud.google.com/datastore/docs/concepts/limits
Source: https://stackoverflow.com/a/47800087/1034622

Multiple URL segments in Flask and other Python frameworks

I'm building an application in both Bottle and Flask to see which I am more comfortable with as Django is too much 'batteries included'.
I have read through the routing documentation of both, which is very clear and understandable but I am struggling to find a way of dealing with an unknown, possibly unlimited number of URL segments. ie:
http://www.example.com/seg1/seg2/seg3/seg4/seg5.....
I was looking at using something like @app.route('/<path:fullurl>'), using a regex to remove unwanted characters and splitting the fullurl string into a list the same length as the number of segments, but this seems incredibly inefficient.
Most PHP frameworks seem to have a way of building an array of the segment variable names regardless of their number, but none of Flask, Bottle or Django seems to have a similar option; I seem to need to specify an exact number of segments to capture variables. A couple of PHP CMSs seem to collect the first 9 segments immediately as variables, and anything longer gets passed as a full path, which is then broken down in the way I mentioned above.
Am I not understanding the way things work in URL routing? Is the string splitting method really inefficient, or is it the best way to do it? Or is there a way of collecting an unknown number of segments straight into variables in Flask?
I'm pretty new to Python frameworks, so a five-year-old's explanation would help. Many thanks.
I'm fairly new to Flask myself, but from what I've worked out so far, I'm pretty sure that the idea is that you have lots of small route/view methods, rather than one massive great switching beast.
For example, if you have urls like this:
http://example.com/unit/57/
http://example.com/unit/57/page/23/
http://example.com/unit/57/page/23/edit
You would route it like this:
@app.route('/unit/<int:unit_number>/')
def display_unit(unit_number):
    ...

@app.route('/unit/<int:unit_number>/page/<int:page_number>/')
def display_page(unit_number, page_number):
    ...

@app.route('/unit/<int:unit_number>/page/<int:page_number>/edit')
def page_editor(unit_number, page_number):
    ...
Doing it this way helps to keep some kind of structure in your application and relies on the framework to route stuff, rather than grabbing the URL and doing all the routing yourself. You could then also make use of blueprints to deal with the different functions.
I'll admit though, I'm struggling to think of a situation where you would need a possibly unlimited number of sections in the URL?
Splitting the string doesn't introduce any inefficiency to your program. Performance-wise, it's a negligible addition to the URL processing done by the framework. It also fits in a single line of code.
@app.route('/<path:fullurl>')
def my_view(fullurl):
    params = fullurl.split('/')
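The splitting step can be tried on its own, outside Flask. A minimal sketch, assuming the string handed to the view is the part of the path captured by the path converter; the function name split_segments is hypothetical, and dropping empty segments (from a trailing or doubled slash) is a design choice rather than something the answer above requires.

```python
def split_segments(fullurl):
    """Split a captured path into its non-empty segments."""
    return [segment for segment in fullurl.split('/') if segment]


print(split_segments('seg1/seg2/seg3/seg4/seg5'))
print(split_segments('seg1//seg2/'))
```

The list can have any length, so the view does not need to know the number of segments in advance.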
It works:
@app.route("/login/<user>/<password>")
def login(user, password):
    app.logger.error('An error occurred')
    app.logger.error(password)
    return "user : %s password : %s" % (user, password)
then:
http://example.com:5000/login/jack/hi
output:
user : jack password : hi
