Closed. This question needs to be more focused. It is not currently accepting answers. (Closed 8 years ago.)
I'm creating a Python application and I want to implement it with MVC in mind. I was going to use pubsub to accomplish this, but then I came across PureMVC.
Could anyone explain these two to me: the differences between them, and the implications of using one over the other?
I presume you are referring to pypubsub, which I know a lot about (I am the author ;). However, I don't know much about PureMVC for Python.
The two are very different. Here are some differences that I think would matter in choosing, based on browsing the PureMVC docs and listening to the presentation:
Learning curve:
Incorporating pypubsub in your app is easy: decide on "message topics", subscribe methods and functions to them, and send messages for those topics. Transport of the messages to their destination is automatic. The "cruising speed" API is small: you have pub.subscribe and pub.sendMessage to learn, and that's it.
With PureMVC you have to learn about mediators, commands, proxies, etc. These are all powerful concepts with significant functionality that you will have to learn upfront. You may even have to write a couple of apps before you go from "knowledge" of their purpose to "understanding" of when/how to use them. For a one-off app, the overhead will sometimes be worth it. It is most likely worth it if you create many applications that use the framework.
Impact on application design:
PyPubsub: anonymous observer design pattern.
PureMVC: MVC architectural pattern.
There are no classes to use with pypubsub's "standard use". Mostly you have to classify your messages into topics and decide what to include as data. This can evolve fairly organically: you need a new dialog, and need to make some of its state available so that when a field changes, a label changes somewhere else; all you need to do is include a publish in the dialog, and a subscribe in the code that updates the label. If anything, pypubsub allows you to not worry about design; or rather, it allows you to focus your design on functionality rather than on how to get data from one place to another.
With PureMVC there are many classes to use; they require that you design your components to derive from them, register them, and implement base-class functionality. It is not obvious that you can easily publish data from one place in your application and capture it in another without creating several new classes and implementing them such that they will do the right thing when called by the framework. Of course, the overhead (time to design) will in some cases be worth it.
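The dialog/label scenario above can be illustrated with a minimal, hand-rolled topic bus. This is a sketch of the pattern, not pypubsub itself; the topic name and handlers are made up for illustration (with real pypubsub you would use pub.subscribe and pub.sendMessage from the pubsub module instead of the stand-ins below):

```python
from collections import defaultdict

# Minimal stand-in for a topic-based publish-subscribe bus.
_listeners = defaultdict(list)

def subscribe(listener, topic):
    """Register a callable for a message topic."""
    _listeners[topic].append(listener)

def send_message(topic, **data):
    """Deliver data to every listener of the topic."""
    for listener in _listeners[topic]:
        listener(**data)

# The label updater knows only the topic, not the dialog.
label_text = []

def update_label(value):
    label_text.append("Field is now: %s" % value)

subscribe(update_label, "dialog.field_changed")

# The dialog publishes its state change; it does not know who listens.
send_message("dialog.field_changed", value="hello")
```

The dialog and the label code never import each other; the only thing they share is the topic name.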
Re-usability:
As long as a component documents what message topics it publishes, and what it listens to, it can be incorporated in another application, unit tested for behavior, etc. If the other application does not use pypubsub, it is easy to add, there is no impact on the architecture. Not all of an application needs to use pubsub, it can be used only where needed.
OTOH a PureMVC component could only be incorporated into an application that is already based on PureMVC.
Testability:
PureMVC facilitates testing by separating concerns across layers: visuals, logic, data.
Whereas publish-subscribe (pypubsub) facilitates it by separating publishers from consumers, regardless of layer. Hence testing with pypubsub consists of having the test publish data used by your component, and subscribe to data published by your component. With PureMVC, the test would have to pretend to be the visual and data layers. I don't know how easy that is in PureMVC.
Every publish-subscribe system can become difficult to debug without the right tools, once the application reaches a certain size: it can be difficult to trace the path of messages. Pypubsub provides classes that help with this (to be used during development), and functionality that verifies whether publishers and listeners are compatible.
It seems to me based on PureMVC diagrams that similar issues would arise: you would have to trace your way across proxies, commands, and mediators, via facades, to figure out why something went wrong. I don't know what tools PureMVC provides to deal with this.
Purpose:
The observer pattern is about how to get data from one place to the next via a sort of "data bus"; as long as components can link to the bus, state can be exchanged without knowledge of source or sink.
PureMVC is an architectural pattern: its job is to make it easy to describe your application in terms of the concerns of view, control, and data. The model does not care about how the control interacts with it; the control does not care how it is displayed. But the view needs the control to provide specific services to handle user actions and to get the desired subset of data to show (since typically not all available data is shown); the control needs the model to provide specific services (to get data, change it, validate it, save it, etc.); and the control needs to instantiate view components at the right time.
Mutual exclusion: there is no reason that I can think of, based on the docs, that would prevent the two libraries from being used in the same application. They work at different levels and have different purposes; they can coexist.
All decoupling strategies have pros and cons, and you have to weigh each. Learning curve, return on investment, re-usability, performance, testability, etc.
Related
How does one properly structure a larger django website such as to retain testability and maintainability?
In the best Django spirit (I hope) we started out by not caring too much about decoupling between the different parts of our website. We did separate it into different apps, but those depend rather directly upon each other, through common use of model classes and direct method calls.
This is getting quite entangled. For example, one of our actions/services looks like this:
def do_apply_for_flat(user, flat, bid_amount):
    assert can_apply(user, flat)
    application = Application.objects.create(
        user=user, flat=flat, amount=bid_amount,
        status=Application.STATUS_ACTIVE)
    events.logger.application_added(application)
    mails.send_applicant_application_added(application)
    mails.send_lessor_application_received(application)
    return application
The function does not only perform the actual business process; it also handles event logging and sends mails to the involved users. I don't think there's anything inherently wrong with this approach. Yet, it's getting more and more difficult to properly reason about the code, and even to test the application, as it's getting harder to separate the parts intellectually and programmatically.
So, my question is, how do the big boys structure their applications such that:
Different parts of the application can be tested in isolation
Testing stays fast by only enabling parts that you really need for a specific test
Code coupling is reduced
My take on the problem would be to introduce a centralized signal hub (just a bunch of Django signals in a single Python file) which the individual Django apps may publish to or subscribe to. The above example function would publish an application_added event, which the mails and events apps would listen to. Then, for efficient testing, I would disconnect the parts I don't need. This also increases decoupling considerably, as services don't need to know about sending mails at all.
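The hub idea could be sketched roughly as follows. To keep the sketch self-contained, a tiny Signal class stands in for django.dispatch.Signal; the function and receiver names mirror the question's example, and the receiver bodies are made up:

```python
class Signal:
    """Tiny stand-in for django.dispatch.Signal."""
    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def disconnect(self, receiver):
        self._receivers.remove(receiver)

    def send(self, sender, **kwargs):
        for receiver in self._receivers:
            receiver(sender, **kwargs)

# hub.py: all signals live in one file
application_added = Signal()

# services.py: publishes the event; knows nothing about mails or logging
def do_apply_for_flat(user, flat, bid_amount):
    application = {"user": user, "flat": flat, "amount": bid_amount}
    application_added.send(sender=None, application=application)
    return application

# mails.py: subscribes independently
mails_sent = []

def on_application_added(sender, application):
    mails_sent.append("mail to %s" % application["user"])

application_added.connect(on_application_added)
```

A test that doesn't care about mail would simply call application_added.disconnect(on_application_added) before exercising the service.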
But I'm unsure, and thus very interested in what the accepted practice is for this kind of problem.
For testing, you should mock your dependencies. The logging and mailing components, for example, should be mocked during unit testing of the views. I would usually use python-mock; this allows your views to be tested independently of the logging and mailing components, and vice versa. Just assert that your views make the right service calls, and mock the return value/side effect of each service call.
You should also avoid touching the database when doing tests. Instead, try to use in-memory objects as much as possible: instead of Application.objects.create(), defer the save() to the caller, so that you can test the services without actually having the Application in the database. Alternatively, patch out the save() method so it won't actually save, but that's much more tedious.
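As a sketch of the mocking approach, here is the question's do_apply_for_flat reworked (hypothetically) to take its collaborators as parameters, tested with the standard library's unittest.mock; the refactor and the names are illustrative, not this answer's exact code:

```python
from unittest import mock

# Hypothetical refactor: the collaborators are passed in, so a test
# can substitute mocks without touching real mail or logging code.
def do_apply_for_flat(user, flat, bid_amount, events, mails):
    application = {"user": user, "flat": flat, "amount": bid_amount}
    events.application_added(application)
    mails.send_applicant_application_added(application)
    mails.send_lessor_application_received(application)
    return application

# In the test: mocks in place of the real components.
events, mails = mock.Mock(), mock.Mock()
application = do_apply_for_flat("alice", "flat-1", 100, events, mails)

# Assert the service made the right calls, without sending any mail.
events.application_added.assert_called_once_with(application)
mails.send_applicant_application_added.assert_called_once_with(application)
mails.send_lessor_application_received.assert_called_once_with(application)
```

The assertions verify the interactions; no mail server, log file, or database is touched.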
Transfer some parts of your app to different microservices. This will make some parts of your app focused on doing one or two things right (e.g. event logging, emails). Code coupling is also reduced and different parts of the site can be tested in isolation as well.
The microservice architecture style involves developing a single application as a collection of smaller services that communicate, usually via an API.
You might need to use a smaller framework like Flask.
Resources:
For more information on microservices, see:
http://martinfowler.com/articles/microservices.html
http://aurelavramescu.blogspot.com/2014/06/user-microservice-python-way.html
First, try to break down your big task into smaller classes. Connect them with the usual method calls or with Django signals.
If you feel that the sub-tasks are independent enough, you can implement them as several Django applications in the same project. See the Django tutorial, which describes relation between applications and projects.
I maintain several web applications and I'd like to add some "nice" reporting/analytics pages. Building that once is simple enough (e.g. using flot or similar plotting libraries) but somehow it seems like there should be a report generation library out there which "just" generates the necessary graphs without much coding + offer some filtering ability.
There are some tools out there but for some reason there was never a good fit:
must work on Linux
open source preferred, though closed source works as well, as long as the pricing model is also suitable for small installs
Python API required (or external services using standard web protocols)
I realize that this is not exactly a unique question, but I couldn't find other Stack Overflow questions with the same scope. Any pointers appreciated.
Update (2012-08-09, 15:10 UTC): I realized I did not state some more requirements/wishes:
web interface to access reports
access control: Each user can only get reports on his own data (simple to do with a library, might be hard with an external server)
filtering: I need interactive filtering of values based on some parameters (e.g. "only events in this time frame", "only in place X").
Windward* is one software company that offers a solution that seems to meet most of your needs. They offer a Python API through either Jython or a RESTful API (their Java Engine and Javelin, respectively), and their main strength is that template design is done in Microsoft Office, so reports can be very flexible and are easy to put together (most people already know how to use Word, so there's also much less learning curve than other solutions out there). You can add dynamic filters that take parameters at runtime or change on-the-fly, you can output to a variety of formats including HTML and PDF, and it works with pretty much every major datasource. For a web interface, you can either build your own and easily integrate reporting into it (Engine) or buy one pre-built and modify it to your specifications (Javelin).
On the downside, they are closed-source and without knowing more about your setup, it would be difficult for me to say whether their pricing would work. Might be worth a look, though--the links above and their documentation wiki are probably good places to start looking to see if you're a fit.
*Disclaimer: I work for Windward. I do believe they are one of the better reporting packages out there, but there are others that may fit your needs too.
Closed. This question needs to be more focused. It is not currently accepting answers. (Closed 4 years ago.)
We need to write simple scripts to manipulate the configuration of our load balancers (ie, drain nodes from pools, enabled or disable traffic rules). The load balancers have a SOAP API (defined through a bunch of WSDL files) which is very comprehensive but using it is quite low-level with a lot of manual error checking and list manipulation. It doesn't tend to produce reusable, robust code.
I'd like to write a Python library to handle the nitty-gritty of interacting with the SOAP interface, but I don't really know where to start; all of my coding experience is with writing one-off monolithic programs for specific jobs. This is fine for small jobs, but it's not helping me or my coworkers -- we're reinventing the wheel with a different number of spokes each time :~)
The API already provides methods like getPoolNames() and getDrainingNodes() but they're a bit awkward to use. Most take a list of nodes and return another list, so (say) working out which virtual servers are enabled involves this sort of thing:
names = conn.getVirtualServerNames()
enabled = conn.getEnabled(names)
for i in range(0, len(names)):
    if enabled[i]:
        print(names[i])
conn.setEnabled(['www.example.com'], [0])
Whereas something like this:
lb = LoadBalancer('hostname')
for name in [vs.name for vs in lb.virtualServers() if vs.isEnabled()]:
    print(name)
www = lb.virtualServer('www.example.com').disable()
is more Pythonic and (IMHO) easier.
There are a lot of things I'm not sure about: how to handle errors, how to deal with 20-odd WSDL files (a SOAPpy/suds instance for each?) and how much boilerplate translation from the API methods to my methods I'll need to do.
This is more an example of a wider problem (how to learn to write libraries instead of one-off scripts) so I don't want answers to these specific questions -- they're there to demonstrate my thinking and illustrate my problem. I recognise a code smell in the way I do things at the moment (one-off, non-reusable code) but I don't know how to fix it. How does one get into the mindset for tackling problems at a more abstract level? How do you 'learn' software design?
"I don't really know where to start"
Clearly false. You provided an excellent example. Just do more of that. It's that simple.
"There are a lot of things I'm not sure about: how to handle errors, how to deal with 20-odd WSDL files (a SOAPpy/suds instance for each?) and how much boilerplate translation from the API methods to my methods I'll need to do."
Handle errors by raising an exception. That's enough. Remember, you're still going to have high-level scripts using your API library.
20-odd WSDL files? Just pick something for now. Don't overengineer this. Design the API -- as you did with your example -- for the things you want to do. The WSDLs and the number of instances will become clear as you go. One, ten, or twenty doesn't really matter to the users of your API library. It only matters to you, the maintainer. Focus on the users.
Boilerplate translation? As little as possible. Focus on what parts of these interfaces you use with your actual scripts. Translate just what you need and nothing more.
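A first cut of the wrapper might look like the sketch below. The conn methods are the ones the question names; FakeConn fakes them here so the adapter's shape is visible without a real load balancer, and everything else (class layout, KeyError choice) is illustrative:

```python
class FakeConn:
    """Fake of the SOAP connection, exposing the methods the question names."""
    def getVirtualServerNames(self):
        return ["www.example.com", "staging.example.com"]

    def getEnabled(self, names):
        return [1, 0]  # parallel list of enabled flags

    def setEnabled(self, names, flags):
        pass  # the real SOAP call would go here

class VirtualServer:
    """Object-per-server view over the parallel-list SOAP API."""
    def __init__(self, conn, name, enabled):
        self._conn, self.name, self._enabled = conn, name, enabled

    def isEnabled(self):
        return bool(self._enabled)

    def disable(self):
        self._conn.setEnabled([self.name], [0])
        self._enabled = 0
        return self

class LoadBalancer:
    def __init__(self, conn):
        self._conn = conn

    def virtualServers(self):
        names = self._conn.getVirtualServerNames()
        enabled = self._conn.getEnabled(names)
        return [VirtualServer(self._conn, n, e) for n, e in zip(names, enabled)]

    def virtualServer(self, name):
        for vs in self.virtualServers():
            if vs.name == name:
                return vs
        raise KeyError(name)  # handle errors by raising, as suggested above

lb = LoadBalancer(FakeConn())
enabled_names = [vs.name for vs in lb.virtualServers() if vs.isEnabled()]
```

The adapter translates only the handful of calls actually needed; everything else in the 20-odd WSDLs stays untranslated until a script needs it.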
An API is not fixed, cast in concrete, a thing of beauty and a joy forever. It's just a module (in your case a package might be better) that does some useful stuff.
It will undergo constant change and evolution.
Don't overengineer the first release. Build something useful that works for one use case. Then add use cases to it.
"But what if I realize I did something wrong?" That's inevitable, you'll always reach this point. Don't worry about it now.
The most important thing about writing an API library is writing the unit tests that (a) demonstrate how it works and (b) prove that it actually works.
There's an excellent presentation by Joshua Bloch on API design (and thus leading to library design). It's well worth watching. IIRC it's Java-focused, but the principles will apply to any language.
If you are not afraid of C++, there is an excellent book on the subject called "Large-scale C++ Software Design".
This book will guide you through the steps of designing a library by introducing "physical" and "logical" design.
For instance, you'll learn to flatten your components' hierarchy, to restrict dependency between components, to create levels of abstraction.
This is really "the" book on software design, IMHO.
On a Java portal you can have portlets that include data provided by other applications. We want to replace our existing Java portal with a Django application, which means duplicating the Java portal's ability to display portlets. The two Sun specifications in question that we want to duplicate are JSR168 and JSR286.
I need a cPython solution. Not Jython or Java. Nothing against those tools, we just don't use them. For the record, the Jython based Portletpy does the opposite of what we are aiming to do.
Also, I suspect this question has been caused by a misunderstanding on our part of how the JSR168/JSR286 specifications work. I thought that JSR168/JSR286 was an arcane protocol for communicating some sort of content between separate applications, but in the Java world that tends to be done by other methods, such as SOAP. Instead, the issue might be that these specifications are simply definitions of how to display content objects in views. If all we have to do is handle SOAP calls and display data, then this whole question is moot.
Simple architecture image below of what we think we want to do:
I'm not sure you can do this. From JSR 168:
If I understand correctly, you want the Django application to take the place of the existing "Java Portal/Portlet Container" in the diagram. Unfortunately, the interface between the portlet container and the individual portlets is using in-memory API calls, not as a Web service. There's no easy URL-like interface where you can call into the Java piece to get a chunk of HTML which you then incorporate into a Django-served page.
JSR 286 is an update and while it refines the mechanisms for communicating between portlets, as well as serving resources from portlets, it doesn't really change the above model radically.
I'm not saying it couldn't be done - just that there's no easy, standard way to do it.
One way to get around this could be to use a WSRP (Web Services for Remote Portlets; see Wikipedia) producer, which converts a JSR 168/286 portlet into web services, and consume those from Django. But it seems that WSRP has not been very popular, and I couldn't find any Python implementations (although partial work may exist). Besides this, I'm also interested in this topic.
Closed. This question is opinion-based. It is not currently accepting answers. (Closed 8 years ago.)
From PyPubSub:
Pypubsub provides a simple way for your Python application to decouple its components: parts of your application can publish messages (with or without data) and other parts can subscribe/receive them. This allows message "senders" and message "listeners" to be unaware of each other:
one doesn't need to import the other;
a sender doesn't need to know "who" gets the messages, what the listeners will do with the data, or even if any listener will get the message data;
similarly, listeners don't need to worry about where messages come from.
This is a great tool for implementing a Model-View-Controller architecture or any similar architecture that promotes decoupling of its components.
There seem to be quite a few Python modules for publishing/subscribing floating around the web, from PyPubSub to PyDispatcher to simple "home-cooked" classes.
Are there specific advantages and disadvantages when comparing the different modules? Which sets of modules have been benchmarked and compared?
Thanks in advance
PyDispatcher is used heavily in Django and it's working perfectly for me (and for whole Django community, I guess).
As I remember, there are some performance issues:
Arguments checking made by PyDispatcher is slow.
Unused connections have unnecessary overhead.
AFAIK it's very unlikely you will run into these issues in a small-to-medium-sized application, so they may not concern you. If you think you need every ounce of performance (premature optimization is the root of all evil!), you can look at the modifications made to PyDispatcher in Django.
Hope this helps.
The best dispatch package for python seems to be the dispatch module inside django (called signals in the documentation). It is independent of the rest of django, and is short, documented, tested and very well written.
Edit: I forked this project into an independent signal project for Python.
Here is a newer one: https://github.com/shaunduncan/smokesignal. "smokesignal is a simple python library for sending and receiving signals. It draws some inspiration from the django signal framework but is meant as a general purpose variant." Example:
from time import sleep

import smokesignal

@smokesignal.on('debug')
def verbose(val):
    print("#", val)

def main():
    for i in range(100):
        if i and i % 10 == 0:
            smokesignal.emit('debug', i)
        sleep(.1)

main()
I recently looked carefully at py-amqplib to act as an AMQP client to a RabbitMQ broker. The latter tool is written in Erlang.
If you're looking to decouple your app, then why couple it to the language itself? Consider using message queues, which are language-neutral; then you've really got room to grow!
That being said, AMQP takes effort to understand, and may be more than you are willing to take on if your app is working just fine as is. YMMV.
Some libraries I have found that haven't yet been mentioned:
Circuits: a lightweight, event-driven framework with a strong component architecture.
C# Event Recipe
There are also the libraries by PJ Eby: RuleDispatch and the PEAK project, especially Trellis. I don't know what their actual status is, but the mailing list is quite active.
Last version of Trellis on PyPi
Trellis doc
I have also used the components from the Kamaelia project of the BBC. Axon is an interesting approach, but it is more component-inspired than publisher-consumer-inspired. Its website is somewhat out of date, though... There was a project or two in the Google SoC 2008, and work is being done.
Don't know if it helps :)
Edit: I just found Py-notify, which is an "unorthodox" implementation of the Observer pattern. It has most of the functionality that I need for my own tools.
The fact alone that PyPubSub seems to be a somewhat chaotically managed project (the Wiki on SF is dead, the website (another Wiki) which is linked on SF is currently broken) would be enough reason for me not to use it.
PyDispatcher has an intact website, but the only documentation they seem to provide is the one for the API generated from the docstrings. No traffic on the mailing list either... a bad sign!
As Mike also mentioned, it's perfectly possible to choose a solution that is independent of Python. Now don't get me wrong, I love Python, but still, in this field it can make sense to use a framework that is decoupled from the programming language.
I'm not experienced with messaging, but I'm planning to have a look into a few solutions. So far these two (free, open source) projects seem to be the most promising for me (coincidentally, both are Apache projects):
ActiveMQ
Qpid
Both seem to be reasonably mature projects, at least as far as documentation and community go. I can't comment on the software's quality, though; as I said, I haven't used either of them.
Qpid ships with client libraries for Python, but you could also use py-amqplib. For ActiveMQ there's pyactivemq, which you can use to connect either via STOMP (Streaming Text Oriented Messaging Protocol) or via OpenWire.