Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
This is probably a really dumb question but I will ask anyway.
There are two ways to present this code:
file = "picture.jpg"
pic = makePicture(file)
show(pic)
or
show(makePicture("picture.jpg"))
This is just an example of how it can be abbreviated and I have seen it with other functions. But it always confuses me when I need to do it. Is there any rule of thumb when combining functions like this? It seems to me you work backwards picking out the functions as you go and ending with either the file or the function that chooses the file (i.e pickAFile()). Does this sound right?
Please keep explanations simple enough that a smart dog could understand.
Chiming in, because I think that style does matter. I would definitely pick show(makePicture("picture.jpg")) if you don't ever reuse "picture.jpg" and makePicture(…). The reasons are that:
This is perfectly legible.
This makes the code faster to read (no need to spend more time than needed on it).
If you use variables, you are sending a signal to people reading the code (including you, after some time) that the variables are reused somewhere in the code and that they should be kept in working (short-term) memory. Our short-term memory is limited (experiments in the 1960s showed that one remembers about 7 pieces of information at a time, and some modern experiments came up with lower numbers). So, if the variables are not reused anywhere, they should often be removed so as not to pollute the reader's short-term memory.
I think that your question is very valid and that you should definitely not use intermediate variables here unless they are necessary (because they are reused, or because they help break a complex expression into directly intelligible parts). This practice will make your code more legible and will give you good habits.
PS: As noted by Blender, having many nested function calls can make the code hard to read. If this is the case, I do recommend considering using intermediate variables to hold pieces of information that make sense, so that the function calls do not contain too many levels of nesting.
PPS: As noted by pcurry, nested function calls can also be easily broken down into many lines, if they become too long, which can make the code about as legible as if using intermediate variables, with the benefit of not using any:
print_summary(
energy=solar_panel.energy_produced(time_of_the_day),
losses=solar_panel.loss_ratio(),
output_path="/tmp/out.txt"
)
When you write:
pic = makePicture(file)
You call makePicture with file as its argument and put the output of that function into the variable pic. If all you do with pic is use it as an argument to show, you don't really need to use pic at all. It's just a temporary variable. Your second example does just that and passes the output of makePicture(file) directly as the first argument to show, without using a temporary variable like pic.
Now, if you're using pic somewhere else, there's really no way to get around using it. If you don't reuse the temporary variables, pick whatever way you like. Just make sure it's readable.
It's all at the discretion of the programmer. If you're planning on making a larger program, you might want to keep the statements separate so you can refer back to the file.
Readability is always important if you're working with a team of programmers but if this is just something you're doing by yourself, do whatever's most comfortable.
show(makePicture("picture.jpg")) is more readable than the longer version for reasons discussed in other answers. I have also found that trying to eliminate intermediate variables very often results in a better solution. However, there are cases where a descriptive naming of a complex intermediate result will make code more readable.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
The first part is what I want to do, along with the questions. Before discussing why I want to do that and proposing counterarguments, please read the motivation in the second part.
In short: I am not a developer. The main use of Python is fast prototyping of mathematical methods; additional motivation is learning how Python is implemented. This topic is not of crucial importance for me. If it seems lame and off-topic, feel free to remove, and I apologize for the inconvenience.
This feature does not introduce new functionality but serves as a shortcut for lambda.
The idea is borrowed from wolfram-language.
If the symbol of closing parenthesis is preceded with &, then the code inside the parentheses is interpreted as the definition of a function, where `1, `2, ... play the role of its arguments.
Example: (`1 + `2 &)(a, b) means (lambda x, y: x + y)(a, b)
Provided that I learn everything needed about Python, how hard / time-consuming would it be to implement that extension? At the moment, I see two options:
1.a. Preprocessing text of the script before compiling (I use iPython in Anaconda).
Immediate problem: assigning unique names to arguments. Possible workaround: reserve names such as __my_lambda_123.
1.b. Modifying CPython similarly as described in https://hackernoon.com/modifying-the-python-language-in-7-minutes-b94b0a99ce14
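For what it's worth, option 1.a can be sketched in a few lines. This is a deliberately naive proof of concept, not a working extension: it assumes no nested parentheses, no & inside string literals, and invents the reserved name pattern __my_lambda_arg_N along the lines of the workaround above.

```python
import re

# Naive preprocessor sketch for the proposed (`1 + `2 &) shorthand.
# It rewrites the parenthesised body into a Python lambda before the
# code reaches the real compiler. It ignores strings and nesting --
# exactly the hygiene/parsing problems discussed in the answers -- so
# it is a proof of concept only.
SLOT = re.compile(r"`(\d+)")

def expand(source):
    def rewrite(match):
        body = match.group(1)
        slots = sorted({int(n) for n in SLOT.findall(body)})
        args = ", ".join(f"__my_lambda_arg_{n}" for n in slots)
        new_body = SLOT.sub(lambda m: f"__my_lambda_arg_{m.group(1)}", body)
        return f"(lambda {args}: {new_body})"
    # match "( ... &)" with no nested parentheses inside
    return re.sub(r"\(([^()]*?)\s*&\)", rewrite, source)

code = expand("result = (`1 + `2 &)(3, 4)")
namespace = {}
exec(code, namespace)
print(namespace["result"])  # 7
```

Even this tiny sketch already has to decide what (`2 + `3 &) means, which is the kind of design question raised below.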
Imagine that I implemented that feature correctly. Do you immediately see that it breaks something essential in Python, or iPython, or Anaconda? Assume that I do not use any developers' packages such as unittest, but a lot of "scientific" packages including numpy, as well as "interface" packages such as sqlalchemy.
Motivation. I gradually study Python as a programming language and appreciate its depth, consistency and unique philosophy. I understand that my idea is not in line with the latter. However, I use Python for implementing mathematical methods which are barely reusable. A typical life cycle is the following: (1) implement some mathematical method and experiments for a research project; (1.a) maybe save some function/class in my package if it feels reusable; (2) conduct computational experiments; (3) publish a paper; (4) never use the code again. It is much easier to implement an algorithm from scratch than to structure and reuse the code, since large parts of different methods very rarely coincide.
My typical project is one large Python script with long code fragments. Even structuring the code into functions is not time-efficient, since the life cycle of my program does not include "deploy", "maintain", "modify". I keep the amount of structure to a minimum needed for fast implementing and debugging.
I would use wolfram-mathematica, but in my recent projects it became useless due to the limitations of its standard libraries, the poor performance of code in the Wolfram language, and the overall closed nature of the platform. I switched to Python for its rich selection of libraries, and also with the intent of acquiring some software developer's skills. However, at the moment, programming in the Wolfram language style is much more efficient for me. The code of algorithms feels much more readable when it is more compact (you do not need to scroll) and includes fewer language-specific words such as lambda.
Just as a heads up, the issue you raise in 1a is called macro hygiene.
It's also a bit sketchy to be doing the "lambda replacing" in text, before it's converted to an abstract syntax tree (AST). This is certainly going to be prone to errors since now you have to explicitly deal with various parsing issues and the actual replacement in one go.
If you do go this route (I don't recommend it), I recommend you also look at Racket's macro system, which can do what you want.
There are also other potential problems you might run into - you need to think about how you want expressions such as ("`1" + `1)(a) to parse, or, for example, expressions such as (`2 + `3)(a, b) - is this an error or is it OK (and if so, which argument goes where)? These are the kinds of test cases you need to think about if you're sure you want to design an addition to Python's syntax.
There's also a practical consideration - you'll essentially need to support your own fork of Python, so you won't be able to get updates to the language without redeveloping this feature for each release (kind of? I think).
TLDR: I highly recommend you don't do this, just use lambdas.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I'm working on a Python web application in which I have some small modules that serve very specific functions: session.py, logger.py, database.py, etc. And by "small" I really do mean small; each of these files currently includes around 3-5 lines of code, or maybe up to 10 at most. I might have a few imports and a class definition or two in each. I'm wondering, is there any reason I should or shouldn't merge these into one module, something like misc.py?
My thoughts are that having separate modules helps with code clarity, and later on, if by some chance these modules grow to more than 10 lines, I won't feel so bad about having them separated. But on the other hand, it just seems like such a waste to have a bunch of files with only a few lines in each! And is there any significant difference in resource usage between the multi-file vs. single-file approach? (Of course I'm nowhere near the point where I should be worrying about resource usage, but I couldn't resist asking...)
I checked around to see whether this had been asked before and didn't see anything specific to Python, but if it's in fact a duplicate, I'd appreciate being pointed in the right direction.
My thoughts are that having separate modules helps with code clarity, and later on, if by some chance these modules grow to more than 10 lines, I won't feel so bad about having them separated.
This. Keep it the way you have it.
As a user of modules, I greatly prefer when I can include the entire module via a single import. Don't make a user of your package do multiple imports unless there's some reason to allow for importing different alternates.
BTW, there's no reason a single module can't consist of multiple source files. The simplest case is to make it a package whose __init__.py file simply loads all the other code into the package's namespace.
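As a sketch of that layout (the package and module names here are hypothetical, and the package is built in a temporary directory only so the example is self-contained):

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package "webapp" whose __init__.py pulls the small
# submodules into one namespace, so users need a single import.
FILES = {
    "webapp/__init__.py": (
        "from .session import Session\n"
        "from .logger import log\n"
    ),
    "webapp/session.py": "class Session:\n    pass\n",
    "webapp/logger.py": "def log(msg):\n    return 'LOG: ' + msg\n",
}

root = Path(tempfile.mkdtemp())
for name, body in FILES.items():
    path = root / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body)

sys.path.insert(0, str(root))
webapp = importlib.import_module("webapp")

# One import gives access to everything the submodules define.
print(webapp.log("hello"))  # LOG: hello
```

In a real project you would of course just create the files on disk; the point is only that `import webapp` is all a user ever types.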
Personally I find it easier to keep things like this in a single file, just for the practicality of editing a smaller number of files in my editor.
The important thing to do is treat the different pieces of code as though they were in separate files, so you ensure that you can trivially separate them later, for the reasons you cite. So for instance, don't introduce dependencies between the different pieces that will make it hard to disentangle them later.
For command line scripts there most likely will not be much difference unless each invocation invokes all files in the module, in which case there will be a slight performance cost as n files need to be opened vs one.
For mod_python there most likely will be no difference as byte-compiled modules stay alive for the duration of the apache process.
For google app engine though there will be a performance hit unless the service is constantly used and is "hot" as each cold start would require opening all files.
Of course you can have as many modules as you like.
But now let us think a little about what happens when we put every small code snippet into its own file.
We would end up with hundreds of import statements in any non-trivial module. And of course you could gain a little by having everything explicit in separate files. But guess what: nobody can remember that many module names, and you might end up searching for the right file anyway...
I try to put things that belong together in one single file (unless it becomes too big!). But when I have small functions or classes that do not belong to other components in my system, I put them in "util" modules or the like. I also try to group these, for example according to my application layering, or separate them by other means. One separation criterion could be: utilities that are used for the UI and those that are not.
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
The community reviewed whether to reopen this question 3 months ago and left it closed:
Original close reason(s) were not resolved
I'm used to the Java model where you can have one public class per file. Python doesn't have this restriction, and I'm wondering what's the best practice for organizing classes.
A Python file is called a "module" and it's one way to organize your software so that it makes "sense". Another is a directory, called a "package".
A module is a distinct thing that may have one or two dozen closely-related classes. The trick is that a module is something you'll import, and you need that import to be perfectly sensible to people who will read, maintain and extend your software.
The rule is this: a module is the unit of reuse.
You can't easily reuse a single class. You should be able to reuse a module without any difficulties. Everything in your library (and everything you download and add) is either a module or a package of modules.
For example, you're working on something that reads spreadsheets, does some calculations and loads the results into a database. What do you want your main program to look like?
from ssReader import Reader
from theCalcs import ACalc, AnotherCalc
from theDB import Loader

def main( sourceFileName ):
    rdr= Reader( sourceFileName )
    c1= ACalc( options )
    c2= AnotherCalc( options )
    ldr= Loader( parameters )
    for myObj in rdr.readAll():
        c1.thisOp( myObj )
        c2.thatOp( myObj )
        ldr.load( myObj )
Think of the import as the way to organize your code in concepts or chunks. Exactly how many classes are in each import doesn't matter. What matters is the overall organization that you're portraying with your import statements.
Since there is no artificial limit, it really depends on what's comprehensible. If you have a bunch of fairly short, simple classes that are logically grouped together, toss in a bunch of 'em. If you have big, complex classes or classes that don't make sense as a group, go one file per class. Or pick something in between. Refactor as things change.
I happen to like the Java model for the following reason. Placing each class in an individual file promotes reuse by making classes easier to see when browsing the source code. If you have a bunch of classes grouped into a single file, it may not be obvious to other developers that there are classes there that can be reused simply by browsing the project's directory structure. Thus, if you think that your class can possibly be reused, I would put it in its own file.
It entirely depends on how big the project is, how long the classes are, if they will be used from other files and so on.
For example I quite often use a series of classes for data-abstraction - so I may have 4 or 5 classes that may only be 1 line long (class SomeData: pass).
It would be stupid to split each of these into separate files - but since they may be used from different files, putting all these in a separate data_model.py file would make sense, so I can do from mypackage.data_model import SomeData, SomeSubData
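A sketch of such a data_model.py, with hypothetical class names:

```python
# data_model.py -- several tiny, related data-abstraction classes
# grouped in one module rather than one file each (names are
# illustrative, matching the SomeData example above).
class SomeData:
    pass

class SomeSubData(SomeData):
    pass

class OtherData:
    pass
```

Callers then get everything from one place: from mypackage.data_model import SomeData, SomeSubData.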
If you have a class with lots of code in it, maybe with some functions only it uses, it would be a good idea to split this class and the helper-functions into a separate file.
You should structure them so you do from mypackage.database.schema import MyModel, not from mypackage.email.errors import MyDatabaseModel - if where you are importing things from makes sense, and the files aren't tens of thousands of lines long, you have organised it correctly.
The Python Modules documentation has some useful information on organising packages.
I find myself splitting things up when I get annoyed with the bigness of files and when the desirable structure of relatedness starts to emerge naturally. Often these two stages seem to coincide.
It can be very annoying if you split things up too early, because you start to realise that a totally different ordering of structure is required.
On the other hand, when any .java or .py file is getting to more than about 700 lines I start to get annoyed constantly trying to remember where "that particular bit" is.
With Python/Jython circular dependency of import statements also seems to play a role: if you try to split too many cooperating basic building blocks into separate files this "restriction"/"imperfection" of the language seems to force you to group things, perhaps in rather a sensible way.
As to splitting into packages, I don't really know, but I'd say probably the same rule of annoyance and emergence of happy structure works at all levels of modularity.
I would say to put as many classes as can be logically grouped in that file without making it too big and complex.
I don't know if this will be useful to the community or not, as it might be unique to my situation. I'm working with a senior programmer who, in his code, has this peculiar habit of turning all strings into constants before using them. And I just don't get why. It doesn't make any sense to me. 99% of the time, we are gaining no abstraction or expressive power from the conversion, as it's done like this:
URL_CONVERTER = "url_converter"
URL_TYPE_LONG = "url_type_long"
URL_TYPE_SHORT = "url_type_short"
URL_TYPE_ARRAY = [URL_TYPE_LONG, URL_TYPE_SHORT]
for urltype in URL_TYPE_ARRAY:
outside_class.validate(urltype)
Just like that. As a rule, the constant name is almost always simply the string's content, capitalized, and these constants are seldom referenced more than once anyway. Perhaps less than 5% of the constants thus created are referenced twice or more during runtime.
Is this some programming technique that I just don't understand? Or is it just a bad habit? The other programmers are beginning to mimic this (possibly bad) form, and I want to know if there is a reason for me to as well before blindly following.
Thanks!
Edit: Updated the example. In addition, I understand everyone's points, and would add that this is a fairly small shop, at most two other people will ever see anyone's code, never mind work on it, and these are pretty simple one-offs we're building, not complicated workers or anything. I understand why this would be good practice in a large project, but in the context of our work, it comes across as too much overhead for a very simple task.
It helps to keep your code consistent. E.g. if you use URL_TYPE_LONG in both your client and your server and for some reason you need to change its string value, you just change one line. And you don't run the risk of forgetting to change one instance in the code, or of accidentally changing a string that just happens to have the same value.
Even if those constants are only referenced once now, who are we to foresee the future...
I think this also arises from a time (when dinosaurs roamed the earth) when you tried to (A) keep data and code separated and (B) were concerned about how many strings you had allocated.
Below are a few scenarios where this would be a good practice:
You have a long string that will be used in a lot of places. Thus, you put it in a (presumably shorter) variable name so that you can use it easily. This keeps the lines of your code from becoming overly long/repetitive.
(somewhat similar to #1) You have a long string that can't fit on a certain line without sending it way off the screen. So, you put the string in a variable to keep the lines concise.
You want to save a string and alter (add/remove characters from) it later on. Only a variable will give you this functionality.
You want to have the ability to change multiple lines that use the same string by just altering one variable's value. This is a lot easier than having to go through numerous lines looking for occurrences of the string. It also keeps you from possibly missing some and thereby introducing bugs to your code.
Basically, the answer is to be smart about it. Ask yourself: will making the string into a variable save typing? Will it improve efficiency? Will it make your code easier to work with and/or maintain? If you answered "yes" to any of these, then you should do it.
Doing this is a really good idea. I work on a fairly large Python codebase with 100+ other engineers, and can vouch for the fact that this makes collaboration much easier.
If you directly used the underlying strings everywhere, it would be easier to make a typo when referencing one in a particular module, which could lead to hard-to-catch bugs.
It's also easier for modern IDEs to provide autocomplete and refactoring support when you are using a variable like this. You can easily change the underlying identifier to a different string or even a number later; this makes it easier for you to track down all modules referencing a particular identifier.
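The typo argument can be made concrete with a small sketch (names are hypothetical): a misspelled constant name fails loudly with a NameError, while a misspelled string literal fails silently.

```python
# Constants as in the question's example; names are illustrative.
URL_TYPE_LONG = "url_type_long"
URL_TYPE_SHORT = "url_type_short"

def lookup(url_type):
    # Hypothetical registry keyed by the string values.
    registry = {"url_type_long": "long handler",
                "url_type_short": "short handler"}
    return registry.get(url_type)

print(lookup(URL_TYPE_LONG))    # long handler
print(lookup("url_type_lnog"))  # None -- silent bug, no error raised
try:
    lookup(URL_TYPE_LNOG)       # typo in a constant name...
except NameError as exc:
    print("caught:", exc)       # ...fails immediately and loudly
```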
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
After asking about organising my Python project and then about calling from a parent file in Python, it's occurring to me that it would be so much easier to put all my code in one file (data will be read in externally).
I've always thought that this was bad project organisation but it seems to be the easiest way to deal with the problems I'm thinking I will face. Have I simply gotten the wrong end of the stick with file count or have I not seen some great guide on large (for me) projects?
If you are planning to use any kind of SCM then you are going to be screwed. Having one file is a guaranteed way to have lots of collisions and merges that will be painstaking to deal with over time.
Stick to conventions and break apart your files. If nothing more than to save the guy who will one day have to maintain your code...
If your code is going to work together all the time anyway, and isn't useful separately, there's nothing wrong with keeping everything in one file. I can think of at least one popular package (BeautifulSoup) that does this. It sure makes installation easier.
Of course, if it seems, down the road, that you could use part of your code with another project, or if maintainance starts to be an issue, then worry about organizing your project differently.
It seems to me from the questions you've been asking lately that you're worrying about all of this a bit prematurely. Often, for me, these sorts of issues are better tackled a little later on in the solution. Especially for smaller projects, my goal is to get a solution that is correct, and then optimal.
It's always a now-versus-then argument. If you're under the gun to get it done, do it. Source control will be a problem later; as with many things, there's no black and white answer. You need to be responsible to both your deadline and the long-term maintenance of the code.
If that's the best way to organise it, you're probably doing something wrong.
If it's more than just a toy program or a simple script, then you should break it up into separate files, etc. It's the only sane way of doing it. When your project gets big enough that you need someone else helping on it, then it will make the SCM a whole bunch easier.
Additionally, sooner or later you are going to need to add a separate utility to your project, that is going to need some common code/structures. It's far easier to do this if you have separate source files than if you have just one big one.
Looking at your earlier questions I would say all code in one file would be a good intermediate state on the way to a complete refactoring of your project. To do this you'll need a regression test suite to make sure you don't break the project while refactoring it.
Once all your code is in one file, I suggest iterating on the following:
Identify a small group of interdependent classes.
Pull those classes into a separate file.
Add unit tests for the new separate file.
Retest the entire project.
Depending on the size of your project, it shouldn't take too many iterations for you to reach something reasonable.
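Step 3 of that loop might look like the following minimal unittest sketch, assuming a hypothetical Rectangle class was just pulled out of the big file (inlined here so the example is self-contained):

```python
import unittest

# Hypothetical class just extracted from the one big file into its own
# module; in a real project it would be imported from that module.
class Rectangle:
    def __init__(self, w, h):
        self.w, self.h = w, h

    def area(self):
        return self.w * self.h

class TestRectangle(unittest.TestCase):
    # Pin down current behaviour so the next refactoring iteration
    # cannot silently break the extracted code.
    def test_area(self):
        self.assertEqual(Rectangle(3, 4).area(), 12)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRectangle)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Rerunning the whole suite after each extraction is what makes step 4 ("retest the entire project") cheap.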
Since Calling from a parent file in Python indicates serious design problems, I'd say that you have two choices.
Don't have a library module try to call back to main. You'll have to rewrite things to fix this.
[An imported component calling the main program is an improper dependency. And Python doesn't support it because it's a poor design.]
Put it all in one file until you figure out a better design with proper one-way dependencies. Then you'll have to rewrite it to fix the dependency problems.
A module (a single file) should be a logical piece of related code. Not everything. Not a single class definition. There's a middle ground of modularity.
Additionally, there should be a proper one-way dependency graph from the main program to components (which do NOT depend on the main program) to utility libraries and whatnot (which do not know about the components OR the main program).
Circular (or mutual) dependencies often indicate a design problem. Callbacks are one way out of the problem. Another way is to decompose the circular elements to get a proper one-way graph.
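The callback way out can be sketched like this (names are illustrative): the component accepts a callable instead of importing the main program, so the dependency stays one-way.

```python
# component.py -- would live in its own module; inlined for brevity.
# The component reports its result through a callback parameter rather
# than importing the main program, keeping the dependency one-way.
def process(items, on_done=None):
    total = sum(items)
    if on_done is not None:   # the component knows nothing about main
        on_done(total)
    return total

# main.py -- main wires itself in by passing a callable down.
results = []
process([1, 2, 3], on_done=results.append)
print(results)  # [6]
```

The component can now be reused (and unit-tested) without the main program existing at all.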