Python: one single module (file .py) for each class? [closed]

Python: one single module (file .py) for each class? [closed] - python

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
I've started programming in python 2 weeks ago. I'm making a separate file (module) for each class as I've done before in languages like Java or C#.
But now, seeing tutorials and code from other people, I've realized that many people use the same files to define more than 1 class and the main function but I don't know if they do it like that because are just examples or because it's a python convention or something like that (to define and group many classes in the same files).
So, in Python, one file for each class or many classes in the same files if they can be grouped by any particular feature? (like motor vehicles by one side and just vehicles by the other side).
It's obvious that each one has his own style, but when I ask, I hope general answers or just the conventions, anyway, if someone wants to tell me his opinion about his own style and why, feel free to do it! ;)

one file for each class
Do not do this. In Java, you usually will not have more than one class in a file (you can, of course nest).
In Python, if you group related classes in a single file, you are on the safe side. Take a look at the Python standard library: many modules contain multiple classes in a single file.
As for the why? In short: Readability. I, personally, enjoy not having to switch between files to read related or similar code. It also makes imports more concise.
Imagine socketserver.py would spread UDPServer, TCPServer, ForkingUDPServer, ForkingTCPServer, ThreadingUDPServer, ThreadingTCPServer, BaseRequestHandler, StreamRequestHandler, DatagramRequestHandler into nine files. How would you import these? Like this?
from socketserver.tcp.server import TCPServer
from socketserver.tcp.server.forking import ForkingTCPServer
...
That's plain overhead. It's overhead, when you write it. It's overhead, when you read it. Isn't this easier?
from socketserver import TCPServer, ForkingTCPServer
That said, no one will stop you, if you put each class into a single file. It just might not be pythonic.

Python has the concept of packages, modules and classes. If you put one class per module, the advantage of having modules is gone. If you have a huge class, it might be ok to put this class in a separate file, but then again, is it good to have big classes? NO, it's hard to test and maintain. Better have more small classes with specific tasks and put them logically grouped in as few files as possible.

It's not wrong to have one class per file at all. Python isn't directly aimed at object oriented design so that's why you can get away with multiple classes per file.
I recommend having a read over some style guides if you're confused about what the 'proper' way to do it is.
I suggest either Google's style guide or the official style guide by the Python Foundation
You can also find more material relating to Python's idioms and meta analysis in the PEP index

Related

Keep files (modules) as "big" as possible [duplicate]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Improve this question
I'm working on a Python web application in which I have some small modules that serve very specific functions: session.py, logger.py, database.py, etc. And by "small" I really do mean small; each of these files currently includes around 3-5 lines of code, or maybe up to 10 at most. I might have a few imports and a class definition or two in each. I'm wondering, is there any reason I should or shouldn't merge these into one module, something like misc.py?
My thoughts are that having separate modules helps with code clarity, and later on, if by some chance these modules grow to more than 10 lines, I won't feel so bad about having them separated. But on the other hand, it just seems like such a waste to have a bunch of files with only a few lines in each! And is there any significant difference in resource usage between the multi-file vs. single-file approach? (Of course I'm nowhere near the point where I should be worrying about resource usage, but I couldn't resist asking...)
I checked around to see whether this had been asked before and didn't see anything specific to Python, but if it's in fact a duplicate, I'd appreciate being pointed in the right direction.

My thoughts are that having separate
modules helps with code clarity, and
later on, if by some chance these
modules grow to more than 10 lines, I
won't feel so bad about having them
separated.
This. Keep it the way you have it.

As a user of modules, I greatly prefer when I can include the entire module via a single import. Don't make a user of your package do multiple imports unless there's some reason to allow for importing different alternates.
BTW, there's no reason a single modules can't consist of multiple source files. The simplest case is to use an __init__.py file to simply load all the other code into the module's namespace.

Personally I find it easier to keep things like this in a single file, just for the practicality of editing a smaller number of files in my editor.
The important thing to do is treat the different pieces of code as though they were in separate files, so you ensure that you can trivially separate them later, for the reasons you cite. So for instance, don't introduce dependencies between the different pieces that will make it hard to disentangle them later.

For command line scripts there most likely will not be much difference unless each invocation invokes all files in the module, in which case there will be a slight performance cost as n files need to be opened vs one.
For mod_python there most likely will be no difference as byte-compiled modules stay alive for the duration of the apache process.
For google app engine though there will be a performance hit unless the service is constantly used and is "hot" as each cold start would require opening all files.

Off course you can have as many modules as you like.
But now let as think a little, what happens when we put every small code snippet into one single file.
We will end up in hundreds of import statements in any less trivial module. And off course you could also save a little by having all explicit in seperated files. But guess what: Nobody can remember so many module names and you might end up in searching for the right file anyway ...
I try to put things that belong together in one single file (unless it becomes to big!). But when I have small functions or classes that do not belong to other components in my system, I have "util" modules or the like. I also try to group these for example according to my application layering or seperate them by other means. One seperation criteria could be: Utilities that are used for UI and those that are not.

Small.

Is it a good (or bad) habit to make only one class in each python file? [duplicate]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
The community reviewed whether to reopen this question 3 months ago and left it closed:
Original close reason(s) were not resolved
Improve this question
I'm used to the Java model where you can have one public class per file. Python doesn't have this restriction, and I'm wondering what's the best practice for organizing classes.

A Python file is called a "module" and it's one way to organize your software so that it makes "sense". Another is a directory, called a "package".
A module is a distinct thing that may have one or two dozen closely-related classes. The trick is that a module is something you'll import, and you need that import to be perfectly sensible to people who will read, maintain and extend your software.
The rule is this: a module is the unit of reuse.
You can't easily reuse a single class. You should be able to reuse a module without any difficulties. Everything in your library (and everything you download and add) is either a module or a package of modules.
For example, you're working on something that reads spreadsheets, does some calculations and loads the results into a database. What do you want your main program to look like?
from ssReader import Reader
from theCalcs import ACalc, AnotherCalc
from theDB import Loader
def main( sourceFileName ):
rdr= Reader( sourceFileName )
c1= ACalc( options )
c2= AnotherCalc( options )
ldr= Loader( parameters )
for myObj in rdr.readAll():
c1.thisOp( myObj )
c2.thatOp( myObj )
ldr.laod( myObj )
Think of the import as the way to organize your code in concepts or chunks. Exactly how many classes are in each import doesn't matter. What matters is the overall organization that you're portraying with your import statements.

Since there is no artificial limit, it really depends on what's comprehensible. If you have a bunch of fairly short, simple classes that are logically grouped together, toss in a bunch of 'em. If you have big, complex classes or classes that don't make sense as a group, go one file per class. Or pick something in between. Refactor as things change.

I happen to like the Java model for the following reason. Placing each class in an individual file promotes reuse by making classes easier to see when browsing the source code. If you have a bunch of classes grouped into a single file, it may not be obvious to other developers that there are classes there that can be reused simply by browsing the project's directory structure. Thus, if you think that your class can possibly be reused, I would put it in its own file.

It entirely depends on how big the project is, how long the classes are, if they will be used from other files and so on.
For example I quite often use a series of classes for data-abstraction - so I may have 4 or 5 classes that may only be 1 line long (class SomeData: pass).
It would be stupid to split each of these into separate files - but since they may be used from different files, putting all these in a separate data_model.py file would make sense, so I can do from mypackage.data_model import SomeData, SomeSubData
If you have a class with lots of code in it, maybe with some functions only it uses, it would be a good idea to split this class and the helper-functions into a separate file.
You should structure them so you do from mypackage.database.schema import MyModel, not from mypackage.email.errors import MyDatabaseModel - if where you are importing things from make sense, and the files aren't tens of thousands of lines long, you have organised it correctly.
The Python Modules documentation has some useful information on organising packages.

I find myself splitting things up when I get annoyed with the bigness of files and when the desirable structure of relatedness starts to emerge naturally. Often these two stages seem to coincide.
It can be very annoying if you split things up too early, because you start to realise that a totally different ordering of structure is required.
On the other hand, when any .java or .py file is getting to more than about 700 lines I start to get annoyed constantly trying to remember where "that particular bit" is.
With Python/Jython circular dependency of import statements also seems to play a role: if you try to split too many cooperating basic building blocks into separate files this "restriction"/"imperfection" of the language seems to force you to group things, perhaps in rather a sensible way.
As to splitting into packages, I don't really know, but I'd say probably the same rule of annoyance and emergence of happy structure works at all levels of modularity.

I would say to put as many classes as can be logically grouped in that file without making it too big and complex.

Factory pattern in Python [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
I'm currently implementing the Factory design pattern in Python and I have a few questions.
Is there any way to prevent the direct instantiation of the actual concrete classes? For example, if I have a VehicleFactory that spawns Vehicles, I want users to just use that factory, and prevent anyone from accidentally instantiating Car() or Truck() directly. I can throw an exception in init() perhaps, but that would also mean that the factory can't create an instance of it...
It seems to me now that factories are getting addictive. Seems like everything should become a factory so that when I change internal implementation, the client codes will not change. I'm interested to know when is there an actual need to use factories, and when is it not appropriate to use. For example, I might have a Window class and there's only one of this type now (no PlasticWindow, ReinforcedWindow or anything like that). In that case, should I use a factory for the client to generate the Window, just in case I might add more types of Windows in the future?
I'm just wondering if there is a usual way of calling the factories. For example, now I'm calling my Vehicle factory as Vehicles, so the codes will go something like Vehicles.create(...). I see a lot of tutorials doing it like VehicleFactory, but I find it too long and it sort of exposes the implementation as well.
EDIT: What I meant by "exposes the implementation" is that it lets people know that it's a factory. What I felt was that the client need not know that it's a factory, but rather as some class that can return objects for you (which is a factory of course but maybe there's no need to explicitly tell clients that?). I know that the soure codes are easily exposed, so I didn't mean "exposing the way the functionalities are implemented in the source codes".
Thanks!

Be Pythonic. Don't overcomplicate your code with "enterprise" language (like Java) solutions that add unnecessary levels of abstraction.
Your code should be simple, and intuitive. You shouldn't need to delegate to another class to instantiate another.

Don't expose the class (for example make it private __MyClass, or obvious that you don't want it used directly _MyClass). This way it can only be instantiated via the factory function.
Perhaps you should review the use of keyword arguments, and inheritance. It sounds like you may be overlooking these, which will generally reduce your dependence on complex factories (To be honest, I've rarely needed factories).
In Python you cannot easily protect against exposing implementation, it goes against the Zen of Python. (It's the same in any language, a determined individual can get what they want eventually). At most you should try to ensure that a user of your code does not accidentally do the wrong thing, but never presume to know what the end-user may decide to achieve with your code. Don't make it obfuscated and difficult to work with.

Is there any way to prevent the direct instantiation of the actual concrete classes?
Why? Are your programmers evil sociopaths who refuse to follow the rules? If you provide a factory -- and the factory does what people need -- then they'll use the factory.
You can't "prevent" anything. Remember. This is Python -- they have the source.
should I use a factory for the client to generate the Window, just in case I might add more types of Windows in the future?
Meh. Neither good nor bad. It can get cumbersome to manage all the class-hierarchy-and-factory details.
Adding a factory isn't hard. This is Python -- you have all the source at all times -- you can use grep to find a class constructor and replace it with a factory when you need to.
Since you can use grep to find and fix your mistakes, you don't need to pre-plan this kind of thing as much as you might in Java or C++.
I see a lot of tutorials doing it like VehicleFactory, but I find it too long and it sort of exposes the implementation as well.
"Too Long"? It's used so rarely that it barely matters. Use long names -- it helps other folks understand what you're doing. This is not Code Golf where fewest keystrokes wins.
"exposes the implementation"? First, It exposes nothing. Second, this is Python -- you have all the source at all times -- everything is already exposed.
Stop thinking so much about prevention and privacy. It isn't helpful.

Learning to write reusable libraries [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
Improve this question
We need to write simple scripts to manipulate the configuration of our load balancers (ie, drain nodes from pools, enabled or disable traffic rules). The load balancers have a SOAP API (defined through a bunch of WSDL files) which is very comprehensive but using it is quite low-level with a lot of manual error checking and list manipulation. It doesn't tend to produce reusable, robust code.
I'd like to write a Python library to handle the nitty-gritty of interacting with the SOAP interface but I don't really know where to start; all of my coding experience is with writing one-off monolithic programs for specific jobs. This is fine for small jobs but it's not helping me or my coworkers -- we're reinventing the wheel with a different number of spokes each time :~)
The API already provides methods like getPoolNames() and getDrainingNodes() but they're a bit awkward to use. Most take a list of nodes and return another list, so (say) working out which virtual servers are enabled involves this sort of thing:
names = conn.getVirtualServerNames()
enabled = conn.getEnabled(names)
for i in range(0, len(names)):
if (enabled[i]):
print names[i]
conn.setEnabled(['www.example.com'], [0])
Whereas something like this:
lb = LoadBalancer('hostname')
for name in [vs.name for vs in lb.virtualServers() if vs.isEnabled()]:
print name
www = lb.virtualServer('www.example.com').disable()
is more Pythonic and (IMHO) easier.
There are a lot of things I'm not sure about: how to handle errors, how to deal with 20-odd WSDL files (a SOAPpy/suds instance for each?) and how much boilerplate translation from the API methods to my methods I'll need to do.
This is more an example of a wider problem (how to learn to write libraries instead of one-off scripts) so I don't want answers to these specific questions -- they're there to demonstrate my thinking and illustrate my problem. I recognise a code smell in the way I do things at the moment (one-off, non-reusable code) but I don't know how to fix it. How does one get into the mindset for tackling problems at a more abstract level? How do you 'learn' software design?

"I don't really know where to start"
Clearly false. You provided an excellent example. Just do more of that. It's that simple.
"There are a lot of things I'm not sure about: how to handle errors, how to deal with 20-odd WSDL files (a SOAPpy/suds instance for each?) and how much boilerplate translation from the API methods to my methods I'll need to do."
Handle errors by raising an exception. That's enough. Remember, you're still going to have high-level scripts using your API library.
20-odd WSDL files? Just pick something for now. Don't overengineer this. Design the API -- as you did with your example -- for the things you want to do. The WSDL's and the number of instances will become clear as you go. One, Ten, Twenty doesn't really matter to users of your API library. It only matters to you, the maintainer. Focus on the users.
Boilerplate translation? As little as possible. Focus on what parts of these interfaces you use with your actual scripts. Translate just what you need and nothing more.
An API is not fixed, cast in concrete, a thing of beauty and a joy forever. It's just a module (in your case a package might be better) that does some useful stuff.
It will undergo constant change and evolution.
Don't overengineer the first release. Build something useful that works for one use case. Then add use cases to it.
"But what if I realize I did something wrong?" That's inevitable, you'll always reach this point. Don't worry about it now.
The most important thing about writing an API library is writing the unit tests that (a) demonstrate how it works and (b) prove that it actually works.

There's an excellent presentation by Joshua Bloch on API design (and thus leading to library design). It's well worth watching. IIRC it's Java-focused, but the principles will apply to any language.

If you are not afraid of C++, there is an excellent book on the subject called "Large-scale C++ Software Design".
This book will guide you through the steps of designing a library by introducing "physical" and "logical" design.
For instance, you'll learn to flatten your components' hierarchy, to restrict dependency between components, to create levels of abstraction.
The is really "the" book on software design IMHO.

All code in one file [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
After asking organising my Python project and then calling from a parent file in Python it's occurring to me that it'll be so much easier to put all my code in one file (data will be read in externally).
I've always thought that this was bad project organisation but it seems to be the easiest way to deal with the problems I'm thinking I will face. Have I simply gotten the wrong end of the stick with file count or have I not seen some great guide on large (for me) projects?

If you are planning to use any kind of SCM then you are going to be screwed. Having one file is a guaranteed way to have lots of collisions and merges that will be painstaking to deal with over time.
Stick to conventions and break apart your files. If nothing more than to save the guy who will one day have to maintain your code...

If your code is going to work together all the time anyway, and isn't useful separately, there's nothing wrong with keeping everything in one file. I can think of at least popular package (BeautifulSoup) that does this. Sure makes installation easier.
Of course, if it seems, down the road, that you could use part of your code with another project, or if maintainance starts to be an issue, then worry about organizing your project differently.
It seems to me from the questions you've been asking lately that you're worrying about all of this a bit prematurely. Often, for me, these sorts of issues are better tackled a little later on in the solution. Especially for smaller projects, my goal is to get a solution that is correct, and then optimal.

It's always a now verses then argument. If you're under the gun to get it done, do it. Source control will be a problem later, as with many things there's no black and white answer. You need to be responsible to both your deadline and the long term maintenance of the code.

If that's the best way to organise it, you're probably doing something wrong.
If it's more than just a toy program or a simple script, then you should break it up into separate files, etc. It's the only sane way of doing it. When your project gets big enough that you need someone else helping on it, then it will make the SCM a whole bunch easier.
Additionally, sooner or later you are going to need to add a separate utility to your project, that is going to need some common code/structures. It's far easier to do this if you have separate source files than if you have just one big one.

Looking at your earlier questions I would say all code in one file would be a good intermediate state on the way to a complete refactoring of your project. To do this you'll need a regression test suite to make sure you don't break the project while refactoring it.
Once all your code is in one file, I suggest iterating on the following:
Identify a small group of interdependent classes.
Pull those classes into a separate file.
Add unit tests for the new separate file.
Retest the entire project.
Depending on the size of your project, it shouldn't take too many iterations for you to reach something reasonable.

Since Calling from a parent file in Python indicates serious design problems, I'd say that you have two choices.
Don't have a library module try to call back to main. You'll have to rewrite things to fix this.
[An imported component calling the main program is an improper dependency. And Python doesn't support it because it's a poor design.]
Put it all in one file until you figure out a better design with proper one-way dependencies. Then you'll have to rewrite it to fix the dependency problems.
A module (a single file) should be a logical piece of related code. Not everything. Not a single class definition. There's a middle ground of modularity.
Additionally, there should be a proper one-way dependency graph from main program to components (which do NOT depend on the main program) to utility libraries and what-not (that do not know about the components OR the main program.
Circular (or mutual) dependencies often indicate a design problem. Callbacks are one way out of the problem. Another way is to decompose the circular elements to get a proper one-way graph.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.