Recommended ways to split some functionality into functions, modules and packages? - python

There comes a point where, in a relatively large sized project, one need to think about splitting the functionality into various functions, and then various modules, and then various packages. Sometimes across different source distributions (eg: extracting a common utility, such as optparser, into a separate project).
The question - how does one decide the parts to put in the same module, and the parts to put in a separate module? Same question for packages.

There's a classic paper by David Parnas called "On the criteria to be used in decomposing systems into modules". It's a classic (and has a certain age, so can be a little outdated).
Maybe you can start from there, a PDF is available here
http://www.cs.umd.edu/class/spring2003/cmsc838p/Design/criteria.pdf

Take out a pen and piece of paper. Try to draw how your software interacts on a high level. Draw the different layers of the software etc. Group items by functionality and purpose, maybe even by what sort of technology they use. If your software has multiple abstraction layers, I would say to group them by that. On a high level, the elements of a specific layer all share the same general purpose. Now that you have your software in layers, you can divide these layers into different projects based on specific functionality or specialization.
As for a certain stage that you reach in which you should do this? I'd say when you have multiple people working on the code base or if you want to keep your project as modular as possible. Hopefully your code is modular enough to do this with. If you are unable to break apart your software on a high level, then your software is probably spaghetti code and you should look at refactoring it.
Hopefully that will give you something to work with.

See How many Python classes should I put in one file?
Sketch your overall set of class definitions.
Partition these class definitions into "modules".
Implement and test the modules separately from each other.
Knit the modules together to create your final application.
Note. It's almost impossible to decompose a working application that evolved organically. So don't do that.
Decompose your design early and often. Build separate modules. Integrate to build an application.

IMHO this should probably one of the things you do earlier in the development process. I have never worked on a large-scale project, but it would make sense that you make a roadmap of what's going to be done and where. (Not trying to rib you for asking about it like you made a mistake :D )
Modules are generally grouped somehow, by purpose or functionality. You could try each implementation of an interface, or other connections.

I sympathize with you. You are suffering from self-doubt. Don't worry. If you can speak any language, including your mother tongue, you are qualified to do modularization on your own. For evidence, you may read "The Language Instinct," or "The Math Instinct."
Look around, but not too much. You can learn a lot from them, but you can learn many bad things from them too.
Some projects/framework get a lot fo hype. Yet, some of their groupings of functionality, even names given to modules are misleading. They don't "reveal intention" of the programmers. They fail the "high cohesiveness" test.
Books are no better. Please apply 80/20 rule in your book selection. Even a good, very complete, well-researched book like Capers Jones' 2010 "Software Engineering Best Practices" is clueless. It says 10-man Agile/XP team would take 12 years to do Windows Vista or 25 years to do an ERP package! It says there is no method till 2009 for segmentation, its term for modularization. I don't think it will help you.
My point is: You must pick your model/reference/source of examples very carefully. Don't over-estimate famous names and under-estimate yourself.
Here is my help, proven in my experience.
It is a lot like deciding what attributes go to which DB table, what properties/methods go to which class/object etc? On a deeper level, it is a lot like arranging furniture at home, or books in a shelf. You have done such things already. Software is the same, no big deal!
Worry about "cohesion" first. e.g. Books (Leo Tolstoy, James Joyce, DE Lawrence) is choesive .(HTML, CSS, John Keats. jQuery, tinymce) is not. And there are many ways to arrange things. Even taxonomists are still in serious feuds over this.
Then worry about "coupling." Be "shy". "Don't talk to strangers." Don't be over-friendly. Try to make your package/DB table/class/object/module/bookshelf as self-contained, as independent as possible. Joel has talked about his admiration for the Excel team that abhor all external dependencies and that even built their own compiler.

Actually it varies for each project you create but here is an example:
core package contains modules that are your project cant live without. this may contain the main functionality of your application.
ui package contains modules that deals with the user interface. that is if you split the UI from your console.
This is just an example. and it would really you that would be deciding which and what to go where.

Related

Put keyword in a library in Robot Framework

I have a lot of automated test cases with Robot Framework and, consequently, I have more and more keywords. It's a bit difficult for me to bring order.
My question is if I can include my keywords in a library. If this is possible, how can I do it?
Thank you.
Marta
This is how you create libraries - Creating test libraries.
However, moving keywords into a library will not bring order to your system. You will only move the disorder to another place.
Keeping your test scripts maintainable deals for the most part in having a certain structure to your work. This applies to Robot Framework as much as it does to any other language.
In Robot Framework we use Resource Files to store keywords that we want to reuse across multiple Test Case files. Thought these links you should be able to learn more on how to do this. You can import Resource Files in a Resource file, so chaining them.
As for what to put in these files that is often a personal preference. However, typically adhering to development principles like DRY, Separation of Concerns and most importantly Common Sense works best.
I'd advise adhering to principles instead of to fixed structure. Separate Data from Process logic, abstract the UI from your Process logic and model your process logic as close to Business Processes as possible.
As for converting keywords to Python Code. If the logic in your resource files means you use a lot of keywords for a specific feature automation, then perhaps this makes sense. But keep in mind that for maintainability you will then be more heavily dependent on the Python skills in your organisation.

Would python be an appropriate choice for a video library for home use software

I am thinking of creating a video library software which keep track of all my videos and keep track of videos that I already haven't watched and stats like this. The stats will be specific to each user using the software.
My question is, is python appropriate to create this software or do I need something like c++.
Python is perfectly appropriate for such tasks - indeed the most popular video site, YouTube, is essentially programmed in Python (using, of course, lower-level components called from Python for such tasks as web serving, relational db, video transcoding -- there are plenty of such reusable opensource components for all these kinds of tasks, but your application's logic flow and all application-level logic can perfectly well be in Python).
Just yesterday evening, at the local Python interest group meeting in Mountain View, we had new members who just moved to Silicon Valley exactly to take Python-based jobs in the video industry, and they were saying that professional level video handing in the industry is also veering more and more towards Python -- stalwarts like Pixar and ILM had been using Python forever, but in the last year or two it's been a flood of Python adoption in the industry.
If you want your code to be REAL FAST, use C++ (or parallel fortran).
However in your application, 99% of the runtime isn't going to be in YOUR code, it's going to be in GUI libraries, OS calls, waiting for user interaction, calling libraries (written in C) to open video files and make thumbnails, that kind of stuff.
So using C++ will make your code 100 times faster, and your application will, as a result, be 1% faster, which is utterly useless. And if you write it in C++ you'll need months, whereas using Python you'll be finished much faster and have lots more fun.
Using C++ could even make it a lot slower, because in Python you can very easily build more scalable algorithms by using super powerful primitives like hashes, sets, generators, etc, try several algorithms in 5 minutes to see which one is the best, import a library which already does 90% of the work, etc.
Write it in Python.
Yes. Python is much easier to use than c++ for something like this. You may want to use it as a front-end to a DB such as sqlite3
Maybe you should take a look at this project:
Moovida
It's a complete media center, open source, written in python that is easy to extend. I don't know if it will do exactly what you want out of the box but you can probably easily add the features you want.
Of course you can use almost any programming language for almost any task. But after noting that, it's also obvious that different languages are also differently well adapted for different tasks.
C/C++ are languages that are very "hardware friendly". Basically, the languages are just one abstraction level above assembler, with C's use of pointers etc. C++ is almost like a (semi-)portable object oriented assembler, if one wants to be funny. :) This makes C/C++ fast and good at talking to hardware.
But those same features become mis-features in other cases. The pointers make it possible to walk all over the memory and unless you are careful you will leak memory all over the place. So I would say (and now C people will get angry) that C/C++ in fact are directly inappropriate for what you want to do.
You want a language that are higher, does more things like memory management automatically and invisibly. There are many to choose from there, but without a doubt Python is eminently suited for this. Python has the last couple of years emerged as The New Cool Language to write these kind of softwares in, and much multimedia software such as Freevo and the already mentioned Moovida are written in Python.

What could justify the complexity of Plone?

Plone is very complex. Zope2, Zope3, Five, ZCML, ZODB, ZEO, a whole bunch of acronyms and abbreviations.
It's hard to begin and the current state seems to be undecided. It is mainly based on Zope2, but incorporates Zope3 via Five. And there are XML config files everywhere.
Does the steep learning curve pay of? Is this complexity still justified today?
Background: I need a platform. Customers often need a CMS. I'm currently reading "Professional Plone Development", without prior knowledge of Plone.
The problem: Customers don't always want the same and you can't know beforehand. One thing is sure: They don't want the default theme of Plone. But any additional feature is a risk. You can't just start and say "If you want to see the complexity of Plone, you have to ask for it." when you don't know the system good enough to plan.
It's hard to answer your question without any background information. Is the complexity justified if you just want a blog? No. Is the complexity justified if you're building a company intranet for 400+ people? Yes. Is it a good investment if you're looking to be a consultant? Absolutely! There's a lot of Plone work out there, and it pays much better than the average PHP job.
I'd encourage you to clarify what you're trying to build, and ask the Plone forums for advice. Plone has a very mature and friendly community — and will absolutely let you know if what you're trying to do is a poor fit for Plone. You can of course do whatever you want with Plone, but there are some areas where it's the best solution available, other areas where it'll be a lot of work to change it to do something else.
Some background:
The reason for the complexity of Plone at this point in time is that it's moving to a more modern architecture. It's bridging both the old and the new approach right now, which adds some complexity until the transition is mostly complete.
Plone is doing this to avoid leaving their customers behind by breaking backwards compatibility, which they take very seriously — unlike other systems I could mention (but won't ;).
You care about your data, the Plone community cares about their data — and we'd like you to be able to upgrade to the new and better versions even when we're transitioning to a new architecture. This is one of the Plone community's strengths, but there is of course a penalty to pay for modifying the plane while it's flying, and that's a bit of temporary, extra complexity.
Furthermore, Plone as a community has a strong focus on security (compare it to any other system on the vulnerabilities reported), and a very professional culture that values good architecture, testing and reusability.
As an example, consider the current version of Plone being developed (what will become 4.0):
It starts up 3-4 times faster than the current version.
It uses about 20% less memory than the current version.
There's a much, much easier types system in the works (Dexterity), which will reduce the complexity and speed up the system a lot, while keeping the same level of functionality
The code base is already 20% smaller than the current shipping version, and getting even smaller.
Early benchmarks of the new types system show a 5× speedup for content editing, and we haven't really started optimizing this part yet.
— Alexander Limi, Plone co-founder (and slightly biased ;)
If you want to see the complexity of Plone, you have to ask for it. For most people, it's just not there. It installs in a couple of minutes through a one-click installer. Then it's one click to log in, one click to create a page, use a WYSYWIG editor, and one click to save. Everything is through an intuitive web GUI. Plone is a product.
If you want to use it as a "platform," then the platform is a stack of over one million lines of code which implements a complete content management suite. No one knows it all. However, all those "acronyms" and "files" are evidence of a software which is factored in components so that no one need know it all. You can get as deep or shallow in it as you need. If there's something you need for some aspect of content management, it's already there, you don't have to create it from scratch, and you can do it in a way that's consistent with a wide practice and review.
I found an anonymous comment here which is much better than that post itself, so I'm reposting it here in full, with a couple of typos corrected.
This summer my chess club asked me to make a new website, where the members of the board should be able to add news flashes, articles, ... Sounded like a CMS. Being a Python developer, I looked at Plone and bought the Aspeli book Professional Plone development (excellent written btw).
I took 3 weeks of my holiday the study the book and to setup a first mock up of the site.
After 3 weeks I realized that Plone has some very nice things but also some very frustrating things
On the positivie side
if you don't need to customize Plone, Plone is great in features and layout
Plone has a good security model
Plone has good off the shelf workflows
Plone is multi language (what I needed)
On the downside
Plone is terrible slow. On my development platform (a 3 years old Pc with 512 MB RAM) it takes 30 seconds to launch Plone and it takes 10 to 15 seconds to reload a page
you need a lot of different technologies to customize or develop even the simplest things
TAL and Metal are not state of the art and not adapted to the OO design of Plone.
Acquisition by default is wrong. Acquisation can be very useful (for e.g. security) but it should be explicitly defined where needed. This is a design flaw
Plone does not distinguish between content and layout. This a serious design flaw. There is no reason to apply security settings and roles on e.g. cascading style sheet or the html that creates a 3 column layout and there is no reason why these elements should be in the ZODB and not on the filesystem
Plone does not distinguish between the web designer and the content editor/publisher, again a serious flaw. The content editor/publisher add/reviews content running on the live site. The web designer adds/modifies content types, forms and layout on the test server and ports it to the live server when ready. The security restrictions Plone put in place for the content editor should not be applied for the web designer, who has access to the filesystem on the server.
Plone does not distinguish between the graphical aspects and the programming aspects of a web designer. Graphical artists uses tools that only speak html, css and a little bit of javasccript, but no Python, adapters and other advanced programming concepts. As a consequence the complete skinning system in Plone is a nightmare
I assume that Plone is so slow because of points 4, 5, 6 and 7.
Points 6 and 7 made me dropping Plone. I looked around for other options and eventually decided to develop my own CMS on Pylons, which is blazingly fast compared to Plone. On the same development server I have a startup time of 1 second, and a reload page time is not measurable.
The site www.kosk.be is running (it is in Dutch). The CMS behind it, named Red Devil, will be launched as a separate open source project beginning next year
I see four things that can justify an investment of time in using Plone:
Plone has a large and helpful community. Most of the things you need, somebody else
already did at some time in the past. He probably asked some questions and got helpful
answers, or he wrote a tutorial. Usually that leaves traces easy to find.
about how he did it.
You won't need to understand the whole complexity for many of your customizing needs.
Plone developers are aware of their complex stack, and are discussing how this can be
reduced. Plone has proven in the past that it is able to renew itself and drop old
infrastructure in a clean way with defined deprecation phases.
There are many local user groups with helpful people.
Oh wait, I was told the plone developer meetings are one of the best!
Like that one
From a system administrator standpoint, Plone is just shy of being the absolute devil. Upgrading, maintaining, and installing where you want to install things is all more painful than necessary on the Linux platform. That's just my two cents though, and why I typically prefer to avoid the Zope/Plone stack.
Note: it's better with newer releases, but older releases.... ugh
Accretion.
About the comment here I think Plone doesn't work like that (at least not anymore).
1 - Plone is somehow slower than other CMS solutions indeed, but from the out-of-the-box setup to a Apache-Varnish-Zope-Relstorage solution, there is a lot of optimization space.
2 - That is true. The answer here kind of explains it, but indeed Plone is a complex animal.
3 - Not sure what you mean. TAL Path expressions are based on the concept of object attribute traversal. Seems OO to me.
4 - True. Although after you understand how Acquisition works, it stays out of your way. And in Plone not many things depend on Acquisition, I guess.
5 - Not true. Zope Page Templates are all about separating content from presentation. The fact that content and presentation can be viewed from the ZODB (and actually most of the templates stay in the filesystem, you just see a "view" of them in the ZODB) is more related to the fact that the ZODB is a big object database - which in turn does not mean that they are all content. Everything in "pure" OO system is an object, it is just the kind of object (presentation objects, content object, etc) that matters.
6 - Plone does distinguish between webdesigners and content creators. The designers do all customizations (templates, CSSs, JSs,etc) and then content creators create the content using Plone UI. The point here is that Plone is mainly a CMS, which means that content creators are supposed to be laymen in terms of design.
7 - Partially true. Considering that the UI structure won't change, all presentation specification is contained in CSS files. If the UI structure needs to change, the designer might work with a plogrammer :-) to adequate the templates.
I guess in no system that outputs dynamic pages the designer is completely free to speak only HTML, CSS and JS, and leave out some other technology, be it PHP, Python, ASP or Java. If he does, there will be definitely a programmer that will get the HTML,CSS and JS from the designer and "dynamize it". This model definitely exists in Plone.
Don't use it, if you don't have to. The whole ZOPE universe is a dinosaur. Grown for ages, has collected lots of cruft and rust. Many things would be done completely different nowadays. Overly complex for most stuff, hard to handle for complex stuff. It's the opposite of slim and scalable design. And for seriously fixing this, I don't see the necessary manpower involved in the project.
Sorry for the harsh words, I also wish it would be any better.

I need a really good reason to use Python

I have been trying to make a case for using Python at my work. We use C# and ASP.NET for basically all of our development. 80% or more of our projects are web applications. It seems natural that we would look at some of the nice dynamic web languages (Ruby, Python, etc), and with things like IronRuby and IronPython, I started seriously investigating.
I love Python. It's a beautiful, expressive language. It's a joy to code in, for sure. The multitude of python modules and frameworks make it very appealing. Problem is, I cannot think of any specific problems, any specific hurdles that would require a language like Python. ASP.NET gives us RAD, it gives us a full-featured framework and all that good stuff. Also, we all already know C# and have lots of projects in C#, learning a new language just because doesn't quite work.
Can you guys help me think of something to finally convince my boss to really learn Python and start using it on projects?
Edit: I know that no problem requires only one language, I just meant, are there any specific problems in which dynamic languages excel over static languages.
Edit again: Let me also mention that my boss prompted ME to investigate this. He has put aside hours to research these languages, find a good one, learn it, and then figure out how we can use it. I'm at the last step here, I do not need a lecture on why I should consider my motivation for changing something my company does because they do it for a reason.
"Can you guys help me think of something to finally convince my boss to really learn Python and start using it on projects?"
Nope.
Nothing succeeds like success. Use Python. Be successful. Make people jealous.
When asked why you're successful, you can talk about Python. Not before.
Choose projects wisely: things where a dynamic language has significant advantages. Things where the requirements are not nailed down in detail. Things like data transformations, log-file scraping, and super-sophisticated replacements for BAT files.
Use Python to get started doing something useful while everyone else is standing around trying to get enough business and domain information to launch a project to develop a complicated MVC design.
Edit: Some Python to the Rescue stories.
Exploratory Programming
Tooling to build test cases
What's Central Here?
Control-Break Reporting
One More Cool Thing About Python Is...
In Praise of Serialization
And that's just me.
Edit: "boss prompted ME to investigate", "figure out how we can use it" changes everything.
The "finally convince my boss to really learn Python" is misleading. You aren't swimming upstream. See How Do I Make the Business Case for Python for the "convince my boss" problem. The edit says you're past this phase.
Dynamic languages offer flexibility. Exploit that. My two sets of examples above are two areas where flexibility matters.
Requirements aren't totally nailed down. With a dynamic language, you can get started. Rework won't be a deal-breaker. With Java (and C++ and C#) you are reluctant to tackle devastating design changes because it's hard to break everything and get it to compile and work again. In Python, devastating changes aren't as expensive.
Design is in flux because you can't pick components. You can write Wrappers and Facades very easily in Python. It's a scripting language. And, Python modules compose into larger aggregates very simply.
Coding is in flux because requirements and design keep changing. It's scripted -- not compiled. You just make a change to the code and you're off and running. Testing is easier because the work cycle is shorter. It isn't code-compile-build-test it's code-test.
Testing is in flux because the requirements keep changing. Same as above. The work cycle is shorter and faster.
Almost no problem requires a specific programming language, that's just not how things work.
The easiest way to get a new language into an environment like yours is to start a new work project in your own time in the new language. Make it do something you need doing, and write it on your own time. Use it yourself, and other people will probably notice it. They then say "Can you send me that program?" and boom, they're using your new language.
If you really want to something, I would probably write a site in Django, simply because its admin interface blows everyone away.
The main point to remember is that if you start using python, that's one more thing everyone else has to learn, and it's another bullet point that will need to be on every prospective employee's resume. That can get expensive, and management won't like it.
Sneaking a language in is often done by automating tedious manual tasks (especially dynamic/scripting languages like Python/Ruby etc). Set it up so something like deploying builds, or shuffling backups, or whatever is done with Python.
Then casually slip in how easy it was to do, and try to spread some of the enthusiasm around.
Acceptance and awareness should slowly grow from that, and before you know it, management is seriously considering Python for a new project.
Can you guys help me think of
something to finally convince my boss
to really learn Python and start using
it on projects?
Unfortunately, the answer is no. As Harley said, no problem is going to require a specific language. The approach Harley suggested is also a good one. Learn on your time, build an useful app on your time, and maybe, just maybe, someone at your work will want to use it, like it, love it, then want to learn more about it.
Another approach you could take is to get a better understanding of why your company uses .Net (therefore, Windows Server, and probably SQL server) for nearly all development. Once you have a good understanding of why they aren't open to other languages, then you have some fire power to build a business case for the "why not?".
Why pay licensing costs when you have tools that can accomplish the same things? There are free alternatives out there, like Python, that run on free servers.
Why not give your team the chance to grow their professional tool-belt? This is my opinion, but a good developer is a developer that isn't afraid to learn new ways of doing the same thing they've done before.
But, in the end, I wouldn't get your hopes up. Bottom line, it costs money to introduce a new language/environment into an IT shop. Whether it's software, training, or employee rollover, there is a business reason behind utilizing a single language for a company, and sticking to it.
I'm in the exact scenario you're in. I work in a .Net shop, and that's not going to change any time soon. I get by, by working on my own projects in my "free" time. I enjoy it, and it makes for a good balance.
Hope this helps.
Take a step back, and look at your approach. "I know what I want the answer to be, but I can't find any evidence to support it."
Despite the fact that Python is my current first choice language, I am afraid I find myself on the side of your boss! Sorry.
I think you should open your mind and consider all the options from the stance of your organisation's best interest, and see if you don't come to a different conclusion about the best language.
There are many factors in the choice of language, and how pretty it is is a fairly minor one. The availability of staff is a key one. The cost of retraining, availability of the libraries and meta-tools, performance, etc. etc.
Once you have taken into consideration all the different factors (and not just "Oooh! It'd be fun!") and made a balanced assessment (rather than a predetermined answer), you may find that your boss is more willing to listen.
p.s. The suggestion to secretly use Python for work code, and leaving the company with a terrible code debt in a language they are not prepared for seems very unprofessional to me.
The best leverage you're likely going to have is tools and libraries; as others have pointed out, no language is required to solve any particular program. So let's look at Things You Can Leverage Using Python:
Google App Engine
SciPy
pywinauto
django
Those are off the top of my head; finding what's applicable to your team and your company is left as an exercise for the questioner :)
Well, here's a view of why Python programmers make better Java programmers; the concepts are much the same as for your situation.
Essentially, people who learn a language because they want to show that they enjoy programming, like to learn new things, and are more likely to think outside the box.
...if a company chooses to write
its software in a comparatively
esoteric language, they'll be able to
hire better programmers, because
they'll attract only those who cared
enough to learn it. And for
programmers the paradox is even more
pronounced: the language to learn, if
you want to get a good job, is a
language that people don't learn
merely to get a job.
Not only that, but Python enforces "good looking" code and you don't have to do the whole code/compile routine. With IronPython, you can simply code in Python and use it as is; just another .NET tool.
Python got a good start in the Java world as Jython for unit testing. In fact many Java people started using it first that way. Its dynamic scripting nature makes it a great fit for unit tests. Just yesterday I was wishing I could use it or something like it for the unit tests I was writing for a VB.Net project. I'd have to say that it isn't so much about the individual language IronRuby or IronPython as it is about the style of development that they enable. You can write static language like code in either but you don't fully reap the benefits until you can start to think dynamically. Once you grasp those concepts you'll start to slowly change the way you code and your projects will require less classes and less code to implement. Testing, particularly unit tests will become a must since you give up the warm blanket known as a compiler with type safety checks for other efficiencies.
The language is almost never the key to success. Good programmers can be successful in a variety of languages, and you'll find successful projects in almost any language. You won't find the failures that much because those projects just go away never to be heard of again. If you're looking for a new language because you don't have good programmers, even the best language in the world isn't going to help.
And, you haven't said anything about the sort of work you're doing. Python might be a good choice because it has well-supported and widely-used libraries that are critical for you. On the other hand, C# might have better support for the stuff that you want to do. A tool outside of context has no intrinsic merit. You might love screwdrivers, but that doesn't help you row a boat.
If you want to use Python, or any other language, just because you like it, be honest with yourself and those around you. It looks like you've made a decision to switch, don't know why you are switching, and now need to rationalize it with reasons that had nothing to do with your desire to switch. If you had a good reason, you wouldn't be asking here :)
That's not entirely a bad thing, though. Programming is a human enterprise. If the programmers (at whatever level) insanely love a particular language, no matter how stupid the reason, they are probably going to produce more. It's just not a technological solution though.
Good luck, :)
I am pretty sure (100%) that you don't need to use Python for MS Windows at least.
In cases of other platforms you can use any language you like.

If it is decided that our system needs an overhaul, what is the best way to go about it?

We are mainting a web application that is built on Classic ASP using VBScript as the primary language. We are in agreement that our backend (framework if you will) is out dated and doesn't provide us with the proper tools to move forward in a quick manner. We have pretty much embraced the current webMVC pattern that is all over the place, and cannot do it, in a reasonable manner, with the current technology. The big missing features are proper dispatching and templating with inheritance, amongst others.
Currently there are two paths being discussed:
Port the existing application to Classic ASP using JScript, which will allow us to hopefully go from there to .NET MSJscript without too much trouble, and eventually end up on the .NET platform (preferably the MVC stuff will be done by then, ASP.NET isn't much better than were we are on now, in our opinions). This has been argued as the safer path with less risk than the next option, albeit it might take slightly longer.
Completely rewrite the application using some other technology, right now the leader of the pack is Python WSGI with a custom framework, ORM, and a good templating solution. There is wiggle room here for even django and other pre-built solutions. This method would hopefully be the quickest solution, as we would probably run a beta beside the actual product, but it does have the potential for a big waste of time if we can't/don't get it right.
This does not mean that our logic is gone, as what we have built over the years is fairly stable, as noted just difficult to deal with. It is built on SQL Server 2005 with heavy use of stored procedures and published on IIS 6, just for a little more background.
Now, the question. Has anyone taken either of the two paths above? If so, was it successful, how could it have been better, etc. We aren't looking to deviate much from doing one of those two things, but some suggestions or other solutions would potentially be helpful.
Don't throw away your code!
It's the single worst mistake you can make (on a large codebase). See Things You Should Never Do, Part 1.
You've invested a lot of effort into that old code and worked out many bugs. Throwing it away is a classic developer mistake (and one I've done many times). It makes you feel "better", like a spring cleaning. But you don't need to buy a new apartment and all new furniture to outfit your house. You can work on one room at a time... and maybe some things just need a new paintjob. Hence, this is where refactoring comes in.
For new functionality in your app, write it in C# and call it from your classic ASP. You'll be forced to be modular when you rewrite this new code. When you have time, refactor parts of your old code into C# as well, and work out the bugs as you go. Eventually, you'll have replaced your app with all new code.
You could also write your own compiler. We wrote one for our classic ASP app a long time ago to allow us to output PHP. It's called Wasabi and I think it's the reason Jeff Atwood thought Joel Spolsky went off his rocker. Actually, maybe we should just ship it, and then you could use that.
It allowed us to switch our entire codebase to .NET for the next release while only rewriting a very small portion of our source. It also caused a bunch of people to call us crazy, but writing a compiler is not that complicated, and it gave us a lot of flexibility.
Also, if this is an internal only app, just leave it. Don't rewrite it - you are the only customer and if the requirement is you need to run it as classic asp, you can meet that requirement.
Use this as an opportunity to remove unused features! Definitely go with the new language. Call it 2.0. It will be a lot less work to rebuild the 80% of it that you really need.
Start by wiping your brain clean of the whole application. Sit down with a list of its overall goals, then decide which features are needed based on which ones are used. Then redesign it with those features in mind, and build.
(I love to delete code.)
It works out better than you'd believe.
Recently I did a large reverse-engineering job on a hideous old collection of C code. Function by function I reallocated the features that were still relevant into classes, wrote unit tests for the classes, and built up what looked like a replacement application. It had some of the original "logic flow" through the classes, and some classes were poorly designed [Mostly this was because of a subset of the global variables that was too hard to tease apart.]
It passed unit tests at the class level and at the overall application level. The legacy source was mostly used as a kind of "specification in C" to ferret out the really obscure business rules.
Last year, I wrote a project plan for replacing 30-year old COBOL. The customer was leaning toward Java. I prototyped the revised data model in Python using Django as part of the planning effort. I could demo the core transactions before I was done planning.
Note: It was quicker to build a the model and admin interface in Django than to plan the project as a whole.
Because of the "we need to use Java" mentality, the resulting project will be larger and more expensive than finishing the Django demo. With no real value to balance that cost.
Also, I did the same basic "prototype in Django" for a VB desktop application that needed to become a web application. I built the model in Django, loaded legacy data, and was up and running in a few weeks. I used that working prototype to specify the rest of the conversion effort.
Note: I had a working Django implementation (model and admin pages only) that I used to plan the rest of the effort.
The best part about doing this kind of prototyping in Django is that you can mess around with the model, unit tests and admin pages until you get it right. Once the model's right, you can spend the rest of your time fiddling around with the user interface until everyone's happy.
Whatever you do, see if you can manage to follow a plan where you do not have to port the application all in one big bang. It is tempting to throw it all away and start from scratch, but if you can manage to do it gradually the mistakes you do will not cost so much and cause so much panic.
Half a year ago I took over a large web application (fortunately already in Python) which had some major architectural deficiencies (templates and code mixed, code duplication, you name it...).
My plan is to eventually have the system respond to WSGI, but I am not there yet. I found the best way to do it, is in small steps. Over the last 6 month, code reuse has gone up and progress has accelerated.
General principles which have worked for me:
Throw away code which is not used or commented out
Throw away all comments which are not useful
Define a layer hierarchy (models, business logic, view/controller logic, display logic, etc.) of your application. This has not to be very clear cut architecture but rather should help you think about the various parts of your application and help you better categorize your code.
If something grossly violates this hierarchy, change the offending code. Move the code around, recode it at another place, etc. At the same time adjust the rest of your application to use this code instead of the old one. Throw the old one away if not used anymore.
Keep you APIs simple!
Progress can be painstakingly slow, but should be worth it.
I would not recommend JScript as that is definitely the road less traveled.
ASP.NET MVC is rapidly maturing, and I think that you could begin a migration to it, simultaneously ramping up on the ASP.NET MVC framework as its finalization comes through.
Another option would be to use something like ASP.NET w/Subsonic or NHibernate.
Don't try and go 2.0 ( more features then currently exists or scheduled) instead build your new platform with the intent of resolving the current issues with the code base (maintainability/speed/wtf) and go from there.
A good place to begin if you're considering the move to Python is to rewrite your administrator interface in Django. This will help you get some of the kinks worked out in terms of getting Python up and running with IIS (or to migrate it to Apache). Speaking of which, I recommend isapi-wsgi. It's by far the easiest way to get up and running with IIS.
I agree with Michael Pryor and Joel that it's almost always a better idea to continue evolving your existing code base rather than re-writing from scratch. There are typically opportunities to just re-write or re-factor certain components for performance or flexibility.

Categories

Resources