Keep files (modules) as "big" as possible [duplicate]


Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I'm working on a Python web application in which I have some small modules that serve very specific functions: session.py, logger.py, database.py, etc. And by "small" I really do mean small; each of these files currently includes around 3-5 lines of code, or maybe up to 10 at most. I might have a few imports and a class definition or two in each. I'm wondering, is there any reason I should or shouldn't merge these into one module, something like misc.py?
My thoughts are that having separate modules helps with code clarity, and later on, if by some chance these modules grow to more than 10 lines, I won't feel so bad about having them separated. But on the other hand, it just seems like such a waste to have a bunch of files with only a few lines in each! And is there any significant difference in resource usage between the multi-file vs. single-file approach? (Of course I'm nowhere near the point where I should be worrying about resource usage, but I couldn't resist asking...)
I checked around to see whether this had been asked before and didn't see anything specific to Python, but if it's in fact a duplicate, I'd appreciate being pointed in the right direction.

"My thoughts are that having separate modules helps with code clarity, and later on, if by some chance these modules grow to more than 10 lines, I won't feel so bad about having them separated."
This. Keep it the way you have it.

As a user of modules, I greatly prefer when I can include the entire module via a single import. Don't make a user of your package do multiple imports unless there's some reason to allow for importing different alternatives.
BTW, there's no reason a single module can't consist of multiple source files. The simplest case is a package whose __init__.py file simply loads all the other code into the package's namespace.
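A minimal sketch of that __init__.py approach, assuming a hypothetical package called webutils with two tiny submodules (the package name, the Session class and the log function are made up for illustration). The files are written to a temporary directory only so the example runs on its own:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk: webutils/{__init__,session,logger}.py
# (all names here are hypothetical, chosen only for this sketch).
root = Path(tempfile.mkdtemp())
pkg = root / "webutils"
pkg.mkdir()
(pkg / "session.py").write_text("class Session:\n    pass\n")
(pkg / "logger.py").write_text("def log(msg):\n    return 'LOG: ' + msg\n")

# __init__.py pulls the public names into the package namespace.
(pkg / "__init__.py").write_text(
    "from .session import Session\n"
    "from .logger import log\n"
)

# Make the temporary directory importable and load the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
webutils = importlib.import_module("webutils")

# A single import now exposes everything, even though the code is split up.
session = webutils.Session()
message = webutils.log("started")
```

The small files stay separate on disk, but callers only ever need to import the package itself.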

Personally I find it easier to keep things like this in a single file, just for the practicality of editing a smaller number of files in my editor.
The important thing to do is treat the different pieces of code as though they were in separate files, so you ensure that you can trivially separate them later, for the reasons you cite. So for instance, don't introduce dependencies between the different pieces that will make it hard to disentangle them later.

For command-line scripts there most likely will not be much difference unless each invocation imports every file, in which case there will be a slight performance cost, as n files need to be opened instead of one.
For mod_python there most likely will be no difference, as byte-compiled modules stay alive for the duration of the Apache process.
For Google App Engine, though, there will be a performance hit unless the service is constantly used and is "hot", as each cold start would require opening all files.
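To illustrate why the per-file cost is only paid once per process: Python keeps every imported module in the sys.modules dictionary, so a repeated import is a lookup rather than another file read. A small sketch using the standard json module (any module would do):

```python
import importlib
import sys

import json  # first import in this process: the module is located and executed

first = sys.modules["json"]

# Importing again returns the cached module object from sys.modules.
second = importlib.import_module("json")

same_object = first is second
```

This says nothing about how large the one-time cost is on a given platform; it only shows that a long-running process does not pay it again.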

Of course you can have as many modules as you like.
But now let us think a little about what happens when we put every small code snippet into its own file.
We will end up with hundreds of import statements in any non-trivial module. And of course you could also gain a little by having everything explicit in separate files. But guess what: nobody can remember so many module names, and you might end up searching for the right file anyway ...
I try to put things that belong together in one single file (unless it becomes too big!). But when I have small functions or classes that do not belong to other components in my system, I have "util" modules or the like. I also try to group these, for example according to my application layering, or separate them by other means. One separation criterion could be: utilities that are used for the UI and those that are not.

Small.

Related

Importing code from one script into another script

I'm new to Python so I searched for beginner projects in order to practice my skills. I came across a project on Edureka where you have to program a simple word game called Hangman (https://www.edureka.co/blog/python-projects/#hangman). The whole code consists of different scripts, and a part of one script is then imported into another, like in this case (Words.py)
import random
WORDLIST = 'wordlist.txt'
def get_random_word(min_word_length):
...
and then (Hangman.py)
from string import ascii_lowercase
from words import get_random_word
So they first created a function in a script Words.py and then imported it in another script Hangman.py. My question is: why is code sometimes separated into several scripts, with parts of one then imported into another? Can't one script just contain everything?
Thank you

Using multiple files to create sub-modules helps keep the code organised and makes reusing code between projects/functions much easier.
Functions and variables defined within a module are importable into other modules, which allows you to scope your function and variable names without worrying about conflicts.
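A quick illustration of that scoping, using two standard-library modules that both define a function called sqrt. Because each name lives in its own module, they don't clash:

```python
import cmath
import math

# Same function name, two different modules, no conflict.
real_root = math.sqrt(4)       # real-valued square root
complex_root = cmath.sqrt(-4)  # complex-valued square root
```

Had both functions been pasted into one script, one of them would have needed a different name.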

It is basic organisation. Imagine if a library glued every new book to a stack of the old ones. Lord of the Rings would turn from a door stopper into a door. After defeating Sauron, the reader would smoothly transition to The Little Mermaid, before plunging into 50 Shades of Grey.
Small files are easier to stomach. If every file serves only one topic, you know quickly where to look. You also immediately know what does not belong to the topic, without having to read through commentary.
Multiple files allow for non-linear organisation. The building blocks of a program rarely follow a single, linear chain of interactions. Loosely coupled components are easily represented by individual files, and folders let you add external structure.
Distinct files are easier to reorganise. As complexity grows, components move to sub packages, and sometimes you just need to clean up. A file can simply be moved as a whole. Copy/Pasting to migrate code is more work, especially if you have tacked on all the structuring manually.
Basically, a single file is easy to write. Multiple files are much easier to read, maintain and manage. For software development, the latter is much more important. Even if you are working alone, future-you does not know everything that past-you has done.

Is it a good (or bad) habit to make only one class in each python file? [duplicate]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
The community reviewed whether to reopen this question 3 months ago and left it closed:
Original close reason(s) were not resolved
I'm used to the Java model where you can have one public class per file. Python doesn't have this restriction, and I'm wondering what's the best practice for organizing classes.

A Python file is called a "module" and it's one way to organize your software so that it makes "sense". Another is a directory, called a "package".
A module is a distinct thing that may have one or two dozen closely-related classes. The trick is that a module is something you'll import, and you need that import to be perfectly sensible to people who will read, maintain and extend your software.
The rule is this: a module is the unit of reuse.
You can't easily reuse a single class. You should be able to reuse a module without any difficulties. Everything in your library (and everything you download and add) is either a module or a package of modules.
For example, you're working on something that reads spreadsheets, does some calculations and loads the results into a database. What do you want your main program to look like?
from ssReader import Reader
from theCalcs import ACalc, AnotherCalc
from theDB import Loader
def main(sourceFileName):
    # options and parameters are placeholders, assumed to be defined elsewhere
    rdr = Reader(sourceFileName)
    c1 = ACalc(options)
    c2 = AnotherCalc(options)
    ldr = Loader(parameters)
    for myObj in rdr.readAll():
        c1.thisOp(myObj)
        c2.thatOp(myObj)
        ldr.load(myObj)
Think of the import as the way to organize your code in concepts or chunks. Exactly how many classes are in each import doesn't matter. What matters is the overall organization that you're portraying with your import statements.

Since there is no artificial limit, it really depends on what's comprehensible. If you have a bunch of fairly short, simple classes that are logically grouped together, toss in a bunch of 'em. If you have big, complex classes or classes that don't make sense as a group, go one file per class. Or pick something in between. Refactor as things change.

I happen to like the Java model for the following reason. Placing each class in an individual file promotes reuse by making classes easier to see when browsing the source code. If you have a bunch of classes grouped into a single file, it may not be obvious to other developers that there are classes there that can be reused simply by browsing the project's directory structure. Thus, if you think that your class can possibly be reused, I would put it in its own file.

It entirely depends on how big the project is, how long the classes are, if they will be used from other files and so on.
For example I quite often use a series of classes for data-abstraction - so I may have 4 or 5 classes that may only be 1 line long (class SomeData: pass).
It would be stupid to split each of these into separate files - but since they may be used from different files, putting all these in a separate data_model.py file would make sense, so I can do from mypackage.data_model import SomeData, SomeSubData
If you have a class with lots of code in it, maybe with some functions only it uses, it would be a good idea to split this class and the helper-functions into a separate file.
You should structure them so you do from mypackage.database.schema import MyModel, not from mypackage.email.errors import MyDatabaseModel. If where you are importing things from makes sense, and the files aren't tens of thousands of lines long, you have organised it correctly.
The Python Modules documentation has some useful information on organising packages.

I find myself splitting things up when I get annoyed with the bigness of files and when the desirable structure of relatedness starts to emerge naturally. Often these two stages seem to coincide.
It can be very annoying if you split things up too early, because you start to realise that a totally different ordering of structure is required.
On the other hand, when any .java or .py file is getting to more than about 700 lines I start to get annoyed constantly trying to remember where "that particular bit" is.
With Python/Jython circular dependency of import statements also seems to play a role: if you try to split too many cooperating basic building blocks into separate files this "restriction"/"imperfection" of the language seems to force you to group things, perhaps in rather a sensible way.
As to splitting into packages, I don't really know, but I'd say probably the same rule of annoyance and emergence of happy structure works at all levels of modularity.

I would say to put as many classes as can be logically grouped in that file without making it too big and complex.

Python: one single module (file .py) for each class? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I started programming in Python 2 weeks ago. I'm making a separate file (module) for each class, as I've done before in languages like Java or C#.
But now, seeing tutorials and code from other people, I've realized that many people use the same file to define more than one class and the main function, but I don't know whether they do it because these are just examples or because it's a Python convention (to define and group many classes in the same file).
So, in Python, should it be one file for each class, or many classes in the same file if they can be grouped by some particular feature (like motor vehicles on one side and plain vehicles on the other)?
Obviously everyone has their own style, but I'm hoping for general answers or just the conventions. Anyway, if someone wants to tell me about their own style and why, feel free to do it! ;)

one file for each class
Do not do this. In Java, you usually will not have more than one class in a file (you can, of course, nest them).
In Python, if you group related classes in a single file, you are on the safe side. Take a look at the Python standard library: many modules contain multiple classes in a single file.
As for the why? In short: Readability. I, personally, enjoy not having to switch between files to read related or similar code. It also makes imports more concise.
Imagine if socketserver.py spread UDPServer, TCPServer, ForkingUDPServer, ForkingTCPServer, ThreadingUDPServer, ThreadingTCPServer, BaseRequestHandler, StreamRequestHandler, DatagramRequestHandler across nine files. How would you import these? Like this?
from socketserver.tcp.server import TCPServer
from socketserver.tcp.server.forking import ForkingTCPServer
...
That's plain overhead. It's overhead when you write it. It's overhead when you read it. Isn't this easier?
from socketserver import TCPServer, ForkingTCPServer
That said, no one will stop you if you put each class into its own file. It just might not be Pythonic.
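To see the grouping for yourself, here is a small sketch that lists the classes socketserver.py defines in its one file. The exact list may vary by Python version and platform (the Forking* classes are only present where os.fork is available), so no particular output is claimed:

```python
import inspect
import socketserver

# One import line is enough for several related classes.
from socketserver import BaseRequestHandler, TCPServer, UDPServer

# Collect the names of the classes that socketserver.py itself defines.
defined_here = sorted(
    name
    for name, obj in inspect.getmembers(socketserver, inspect.isclass)
    if obj.__module__ == "socketserver"
)

print(defined_here)
```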

Python has the concept of packages, modules and classes. If you put one class per module, the advantage of having modules is gone. If you have a huge class, it might be ok to put this class in a separate file, but then again, is it good to have big classes? NO, it's hard to test and maintain. Better have more small classes with specific tasks and put them logically grouped in as few files as possible.

It's not wrong to have one class per file at all. Python isn't directly aimed at object oriented design so that's why you can get away with multiple classes per file.
I recommend having a read over some style guides if you're confused about what the 'proper' way to do it is.
I suggest either Google's style guide or the official style guide by the Python Foundation
You can also find more material relating to Python's idioms and meta analysis in the PEP index

Abbreviating several functions - guidelines? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
This is probably a really dumb question but I will ask anyway.
There are two ways to present this code:
file = "picture.jpg"
pic = makePicture(file)
show(pic)
or
show(makePicture("picture.jpg"))
This is just an example of how it can be abbreviated and I have seen it with other functions. But it always confuses me when I need to do it. Is there any rule of thumb when combining functions like this? It seems to me you work backwards, picking out the functions as you go and ending with either the file or the function that chooses the file (i.e. pickAFile()). Does this sound right?
Please keep explanations simple enough that a smart dog could understand.

Chiming in, because I think that style does matter. I would definitely pick show(makePicture("picture.jpg")) if you don't ever reuse "picture.jpg" and makePicture(…). The reasons are:
This is perfectly legible.
This makes the code faster to read (no need to spend more time than needed on it).
If you use variables, you are sending a signal to people reading the code (including you, after some time) that the variables are reused somewhere in the code and that they had better be kept in working (short-term) memory. Our short-term memory is limited (experiments in the 1960s showed that one remembers about 7 pieces of information at a time, and some modern experiments came up with lower numbers). So, if the variables are not reused anywhere, they often should be removed so as not to pollute the reader's short-term memory.
I think that your question is very valid and that you should definitely not use intermediate variables here unless they are necessary (because they are reused, or because they help break a complex expression into directly intelligible parts). This practice will make your code more legible and will give you good habits.
PS: As noted by Blender, having many nested function calls can make the code hard to read. If this is the case, I do recommend considering using intermediate variables to hold pieces of information that make sense, so that the function calls do not contain too many levels of nesting.
PPS: As noted by pcurry, nested function calls can also be easily broken down into many lines, if they become too long, which can make the code about as legible as if using intermediate variables, with the benefit of not using any:
print_summary(
    energy=solar_panel.energy_produced(time_of_the_day),
    losses=solar_panel.loss_ratio(),
    output_path="/tmp/out.txt"
)

When you write:
pic = makePicture(file)
You call makePicture with file as its argument and put the output of that function into the variable pic. If all you do with pic is use it as an argument to show, you don't really need to use pic at all. It's just a temporary variable. Your second example does just that and passes the output of makePicture(file) directly as the first argument to show, without using a temporary variable like pic.
Now, if you're using pic somewhere else, there's really no way to get around using it. If you don't reuse the temporary variables, pick whatever way you like. Just make sure it's readable.
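A runnable sketch of the two forms side by side. The make_picture and show functions below are hypothetical stand-ins for makePicture and show (the real ones come from the asker's environment), written only so the equivalence can be checked:

```python
def make_picture(path):
    """Hypothetical stand-in for makePicture: wraps the path in a dict."""
    return {"source": path}


def show(picture):
    """Hypothetical stand-in for show: returns a description instead of drawing."""
    return "showing " + picture["source"]


# Step by step, with temporary variables.
file = "picture.jpg"
pic = make_picture(file)
long_form = show(pic)

# Nested: the innermost call runs first and its result feeds the outer call.
short_form = show(make_picture("picture.jpg"))
```

Both forms do the same work in the same order; the nested one just skips naming the intermediate results.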

It's all at the discretion of the programmer. If you're planning on making a larger program, you might want to keep the statements separate so you can refer back to the file.
Readability is always important if you're working with a team of programmers but if this is just something you're doing by yourself, do whatever's most comfortable.

show(makePicture("picture.jpg")) is more readable than the longer version for reasons discussed in other answers. I have also found that trying to eliminate intermediate variables very often results in a better solution. However, there are cases where a descriptive naming of a complex intermediate result will make code more readable.

All code in one file [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
After asking "Organising my Python project" and then "Calling from a parent file in Python", it's occurring to me that it'll be so much easier to put all my code in one file (data will be read in externally).
I've always thought that this was bad project organisation but it seems to be the easiest way to deal with the problems I'm thinking I will face. Have I simply gotten the wrong end of the stick with file count or have I not seen some great guide on large (for me) projects?

If you are planning to use any kind of SCM then you are going to be screwed. Having one file is a guaranteed way to have lots of collisions and merges that will be painstaking to deal with over time.
Stick to conventions and break apart your files. If nothing more than to save the guy who will one day have to maintain your code...

If your code is going to work together all the time anyway, and isn't useful separately, there's nothing wrong with keeping everything in one file. I can think of at least one popular package (BeautifulSoup) that does this. Sure makes installation easier.
Of course, if it seems, down the road, that you could use part of your code with another project, or if maintenance starts to be an issue, then worry about organizing your project differently.
It seems to me from the questions you've been asking lately that you're worrying about all of this a bit prematurely. Often, for me, these sorts of issues are better tackled a little later on in the solution. Especially for smaller projects, my goal is to get a solution that is correct, and then optimal.

It's always a now versus then argument. If you're under the gun to get it done, do it. Source control will be a problem later; as with many things, there's no black and white answer. You need to be responsible to both your deadline and the long-term maintenance of the code.

If that's the best way to organise it, you're probably doing something wrong.
If it's more than just a toy program or a simple script, then you should break it up into separate files, etc. It's the only sane way of doing it. When your project gets big enough that you need someone else helping on it, then it will make the SCM a whole bunch easier.
Additionally, sooner or later you are going to need to add a separate utility to your project, that is going to need some common code/structures. It's far easier to do this if you have separate source files than if you have just one big one.

Looking at your earlier questions I would say all code in one file would be a good intermediate state on the way to a complete refactoring of your project. To do this you'll need a regression test suite to make sure you don't break the project while refactoring it.
Once all your code is in one file, I suggest iterating on the following:
Identify a small group of interdependent classes.
Pull those classes into a separate file.
Add unit tests for the new separate file.
Retest the entire project.
Depending on the size of your project, it shouldn't take too many iterations for you to reach something reasonable.

Since "Calling from a parent file in Python" indicates serious design problems, I'd say that you have two choices.
Don't have a library module try to call back to main. You'll have to rewrite things to fix this.
[An imported component calling the main program is an improper dependency. And Python doesn't support it because it's a poor design.]
Put it all in one file until you figure out a better design with proper one-way dependencies. Then you'll have to rewrite it to fix the dependency problems.
A module (a single file) should be a logical piece of related code. Not everything. Not a single class definition. There's a middle ground of modularity.
Additionally, there should be a proper one-way dependency graph from main program to components (which do NOT depend on the main program) to utility libraries and what-not (that do not know about the components OR the main program).
Circular (or mutual) dependencies often indicate a design problem. Callbacks are one way out of the problem. Another way is to decompose the circular elements to get a proper one-way graph.
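A minimal sketch of the callback approach, kept in one file for brevity. The function names (process_items, record_count) are made up for illustration; the point is that the component receives a function to call instead of importing the main program:

```python
from typing import Callable, List


# --- component: would live in its own module and knows nothing about main ---
def process_items(items: List[str], on_done: Callable[[int], None]) -> None:
    """Process the items, then report the count through the supplied callback."""
    processed = [item.upper() for item in items]
    on_done(len(processed))


# --- main program: depends on the component, never the other way round ---
results: List[int] = []


def record_count(count: int) -> None:
    """Callback owned by the main program."""
    results.append(count)


process_items(["a", "b", "c"], on_done=record_count)
```

The dependency stays one-way: main knows about process_items, while process_items only knows it was handed something callable.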
