Proper way of setting classes and constants in a Python package

I'm writing a small package for internal use and have come to a design problem. I define a few classes and constants (e.g., a server IP address) in some file, let's call it mathfunc.py. Now, some of these classes and constants will be used in other files in the same package. My current setup is like this:
/mypackage
__init__.py
mathfunc.py
datefunc.py
So, at the moment I think I have to import mathfunc.py in datefunc.py to use the classes defined there (or alternatively import both of them all the time). This sounds wrong to me because then I'll be in a lot of pain importing lots of files everywhere. Is this a proper design at all, or is there some other way? Maybe I can put all definitions in some file which will not be a subpackage on its own, but will be used by all other files?

Nope, that's pretty much how Python works. If you want to use objects declared in another file, you have to import from it.
Tips:
You can keep your namespace clean by only importing the things you need, rather than using from foo import *.
If you really need to do a "circular import" (where A needs things in B, and B needs things in A) you can solve that by only importing inside the functions where you need the object, not at the top of a file.
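For example, a minimal sketch of both tips, using the file names from the question (the constant and class names here are hypothetical):

# datefunc.py
# Tip 1: import only the names you need, instead of 'from mathfunc import *'.
from mypackage.mathfunc import SERVER_IP, MathThing   # hypothetical names

def describe():
    return "server at %s" % SERVER_IP

def make_thing():
    # Tip 2: if mathfunc ever needed something from datefunc in return,
    # moving the import inside the function defers it until call time
    # and breaks the circular import.
    import mypackage.mathfunc as mathfunc
    return mathfunc.MathThing()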

Related

Splitting python code into different files

I am a beginner in Python, and I am trying to learn by making a simple game. I started by having everything in one big file (let's call it main.py), but it is getting to the point where it has so many classes and functions that I would like to split this code into more manageable components.
I have some experience with LaTeX (although I am certainly no expert either) and, in LaTeX, there is a command called \input which allows one to write part of the code in a different file. For example, if I have files main.tex and sub.tex which look like:
main.tex:
Some code here.
\input{sub}
Lastly, some other stuff.
and
sub.tex:
Some more code here
then, when I execute main.tex, it will execute:
Some code here.
Some more code here
Lastly, some other stuff.
I wonder, is there a similar thing in Python?
Note 1: From what I have seen, the most commonly suggested way to go about splitting your code is to use modules. I have found this a bit uncomfortable for a few reasons, which I will list below (of course, I understand that I find them uncomfortable because I am inexperienced, and not because this is the wrong way to do things).
Reasons why I find modules uncomfortable:
My main.py file imports some other modules, like Pygame, which need to be imported into all the new modules I create. If for some reason I wanted to import a new module into main.py later in the process, I would then need to import it in every other module I create.
My main.py file has some global variables that are used in the different classes; for example, I have a global variable CITY_SIZE that controls the size of all City instances on the screen. Naturally, CITY_SIZE is used in the definition of the class City. If I were to move the class City to a module classes.py, then I would need to define CITY_SIZE in classes.py as well, and if I ever wanted to change the value of CITY_SIZE I would need to change it in classes.py as well.
Again, suppose that I add a classes.py module where I store all my classes, like City. Then in main.py I need to write classes.City in my code instead of City. I understand this can be overcome by using from classes import City but then I need to add a line of code every time I add a new class to classes.py.
Note 2: I would very much appreciate any comments about how to use modules comfortably in Python, but please note that because this is not my question I would not be able to accept those as valid answers (but, again, they would be appreciated!).
If you have all of your modules in the same directory, you can simply use:
import <name of submodule without .py>
For example, if a submodule file was named sub.py, you would import it like this:
import sub
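Regarding the CITY_SIZE concern in Note 1, one common pattern (a sketch with hypothetical file names, not something the answer above prescribes) is to keep shared constants in one module and import that module everywhere, so there is a single source of truth:

# settings.py
CITY_SIZE = 32

# classes.py
import settings

class City:
    def __init__(self):
        self.size = settings.CITY_SIZE   # always reads the shared value

# main.py
import settings
import classes

settings.CITY_SIZE = 48     # changed in exactly one place
city = classes.City()       # picks up the new value

Accessing the value as settings.CITY_SIZE (rather than from settings import CITY_SIZE) matters: attribute access sees the current value, while a from-import copies the value once at import time.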

Proper way to use nested modules

I've split a program into three scripts. One of them, 'classes.py', is a module defining all the classes I need. Another one is a sort of setup module, call it 'setup.py', which instantiates a lot of objects from 'classes.py' (it's just a bunch of variable assignments with a few for loops, no functions or classes). It has a lot of strings and stuff I don't want to see when I'm working on the third script which is the program itself, i.e. the script that actually does something with all of the above.
The only way I got this to work was to add, in the 'setup.py' script:
from classes import *
This allows me to write quickly in the setup file without having to prefix every name with the module namespace. And, in the main script:
import setup
This has the advantage of PyCharm giving me full code completion for classes and methods, which is nice.
What I'd like to achieve is having the main script import the classes, and then run the setup script to create the objects I need, with two simple commands. But I can't import the classes script into the main script because then the setup script can't do anything, having no class definitions. Should I import the classes into both scripts, or do something else entirely?
Import in each file. Consider this SO post. From the answer by Mr Fooz there,
Each module has its own namespace. So for boo.py to see something from an external module, boo.py must import it itself.
It is possible to write a language where namespaces are stacked the way you expect them to: this is called dynamic scoping. Some languages like the original lisp, early versions of perl, postscript, etc. do use (or support) dynamic scoping.
Most languages use lexical scoping instead. It turns out this is a much nicer way for languages to work: this way a module can reason about how it will work based on its own code without having to worry about how it was called.
See this article for additional details: http://en.wikipedia.org/wiki/Scope_%28programming%29
Intuitively this feels nicer too, as you can immediately (in the file itself) see which dependencies the code has - this will allow you to understand your code much better, a month, or even a year from now.
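A minimal sketch of "import in each file", using the three scripts from the question (Widget is a hypothetical class name):

# classes.py
class Widget:
    pass

# setup.py (the question's file name; unrelated to packaging's setup.py)
from classes import *              # quick to write while prototyping

widgets = [Widget() for _ in range(3)]

# main script
import classes     # class definitions, with full code completion
import setup       # importing this executes the assignments above

print(setup.widgets, classes.Widget)

Importing classes in both files costs nothing extra: Python caches modules in sys.modules, so classes.py is executed only once no matter how many files import it.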

How to set up collection of Python Classes with inter-dependency?

So I have a set of .py documents as follows:
/Spider
Script.py
/Classes
__init__.py
ParseXML.py
CrawlWeb.py
TextAnalytics.py
Each .py document in the /Classes subfolder contains a class for a specific purpose, the script schedules the different components. There are a couple of questions I had:
1) A lot of the classes share frameworks such as urllib2, threading etc. What is considered the 'best' form for setting up the import statements? I.e. is there a way for me to use something like the __init__.py file to pass the shared dependencies to all of the classes, then use the specific .py files to import the singular dependencies?
2) Some of the classes call on the other classes, (e.g. the CrawlWeb.py document uses the ParseXML class to update the XML files after crawling). I separated out the classes like this because they were each quite large and so were easier to update like this... Would it be considered best form to combine classes in this case or are there other ways to get round this?
The classes will only ever be used as part of the script. So far the only real solution I've been able to come up with is perhaps using the Script.py file for all of the import statements, but it seems a little bit messy. Any advice would be very appreciated.
The best way to handle the common imports is to import them in each module they're used. While this probably feels annoying to you because you have to type more, it makes it dramatically clearer to the reader of the code what modules are in scope. You're not missing something by doing common imports; you're doing it right.
While you certainly can put your classes all into separate files, it's more common in Python to group related classes together in a single module. Given how short it sounds like your script is, that may mean it makes sense for you to pull everything into a single file. This is a judgment call, and I cannot offer a hard-and-fast rule.
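A sketch of what this looks like with the modules from the question (the method names are invented):

# Classes/ParseXML.py
import urllib2          # each module imports exactly what it uses

class ParseXML(object):
    def parse(self, path):
        pass            # hypothetical method

# Classes/CrawlWeb.py
import urllib2          # repeating an import is cheap: Python caches
import threading        # loaded modules in sys.modules
from Classes.ParseXML import ParseXML

class CrawlWeb(object):
    def update(self, path):
        return ParseXML().parse(path)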

How to properly handle a circular module dependency in Python?

Trying to find a good and proper pattern to handle a circular module dependency in Python. Usually, the solution is to remove it (through refactoring); however, in this particular case we would really like to have the functionality that requires the circular import.
EDIT: According to answers below, the usual angle of attack for this kind of issue would be a refactor. However, for the sake of this question, assume that is not an option (for whatever reason).
The problem:
The logging module requires the configuration module for some of its configuration data. However, for some of the configuration functions I would really like to use the custom logging functions that are defined in the logging module. Obviously, importing the logging module in configuration raises an error.
The possible solutions we can think of:
Don't do it. As I said before, this is not a good option, unless all other possibilities are ugly and bad.
Monkey-patch the module. This doesn't sound too bad: load the logging module dynamically into configuration after the initial import, and before any of its functions are actually used. This implies defining global, per-module variables, though.
Dependency injection. I've read about and run into dependency injection alternatives (particularly in the Java Enterprise space) and they remove some of this headache; however, they may be too complicated to use and manage, which is something we'd like to avoid. I'm not sure what the landscape for this looks like in Python, though.
What is a good way to enable this functionality?
Thanks very much!
As already said, there's probably some refactoring needed. Judging by the names, it might be fine for a logging module to use configuration; but when you think about what belongs in a configuration module, you think of configuration parameters, which raises the question: why is the configuration module doing any logging at all?
Chances are that the parts of the code under configuration that use logging do not belong in the configuration module: it seems like they are doing some kind of processing and logging either results or errors.
Without inner knowledge, and using only common sense, a "configuration" module should be something simple without much processing, and it should be a leaf in the import tree.
Hope it helps!
Will this work for you?
# MODULE a (file a.py)
import b
HELLO = "Hello"

# MODULE b (file b.py)
try:
    import a
    # All the code for b goes here, for example:
    print("b done", a.HELLO)
except:
    if hasattr(a, 'HELLO'):
        raise
    else:
        pass
Now I can do an import b. When the circular import (caused by the import b statement in a) throws an exception, it gets caught and discarded. Of course your entire module b will have to be indented one extra block level, and you have to have inside knowledge of where the variable HELLO is declared in a.
If you don't want to modify b.py by inserting the try:except: logic, you can move the whole b source to a new file, call it c.py, and make a simple file b.py like this:
# new Module b.py
try:
    from c import *
    print("b done", a.HELLO)
except:
    if hasattr(a, "HELLO"):
        raise
    else:
        pass

# The c.py file is now a copy of the original b.py:
import a
# All the code from the original b, for example:
print("b done", a.HELLO)
This will import the entire namespace from c to b, and paper over the circular import as well.
I realize this is gross, so don't tell anyone about it.
A cyclic module dependency is usually a code smell.
It indicates that part of the code should be re-factored so that it is external to both modules.
So if I'm reading your use case right, logging accesses configuration to get configuration data. However, configuration has some functions that, when called, require that stuff from logging be imported in configuration.
If that is the case (that is, configuration doesn't really need logging until you start calling functions), the answer is simple: in configuration, place all the imports from logging at the bottom of the file, after all the class, function and constant definitions.
Python reads things from top to bottom: when it comes across an import statement in configuration, it runs it, but at this point, configuration already exists as a module that can be imported, even if it's not fully initialized yet: it only has the attributes that were declared before the import statement was run.
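A minimal sketch of this bottom-of-file pattern (the names here are invented, and the question's logging module is called applog to avoid clashing with the standard library's logging):

# configuration.py
SERVER_IP = "10.0.0.1"      # configuration data that applog needs

def reload_config():
    # Only called at runtime, after both modules are fully imported.
    applog.debug("reloading configuration")

# Bottom of the file: by the time Python runs this, 'configuration' is
# already in sys.modules with SERVER_IP defined, so the import is safe.
import applog

# applog.py
import configuration

def debug(msg):
    print("[%s] %s" % (configuration.SERVER_IP, msg))

This works whichever module is imported first, because configuration only needs applog at call time, and applog only needs names that configuration defines before its bottom import.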
I do agree with the others though, that circular imports are usually a code smell.

Python module getting too big

My module is all in one big file that is getting hard to maintain. What is the standard way of breaking things up?
I have one module in a file my_module.py, which I import like this:
import my_module
"my_module" will soon be a thousand lines, which is pushing the limits of my ability to keep everything straight. I was thinking of adding files my_module_base.py, my_module_blah.py, etc. And then, replacing my_module.py with
from my_module_base import *
from my_module_blah import *
# etc.
Then, the user code does not need to change:
import my_module # still works...
Is this the standard pattern?
It depends on what your module is actually doing. It is usually a good idea to make your module a directory with an '__init__.py' file inside, so you would first transform your your_module.py into your_module/__init__.py.
After that, you continue according to your business logic. Here are some examples:
Do you have utility functions which are not directly used by the module's API? Put them in a file called utils.py.
Do you have classes dealing with the database or representing your database models? Put them in models.py.
Do you have some internal configuration? It might make sense to put it into an extra file called settings.py or config.py.
These are just examples (a little bit stolen from the Django approach of reusable apps ^^). As said, it depends a lot on what your module does. If it is still too big afterwards, it also makes sense to create submodules (as subdirectories with their own __init__.py).
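For example, the result (with hypothetical names) could look like:

/your_module
__init__.py
utils.py
models.py
settings.py

# your_module/__init__.py
# Re-export the public API so 'import your_module' keeps working unchanged:
from your_module.models import Model       # hypothetical names
from your_module.utils import helper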
I'm sure there are lots of opinions on this, but I'd say you should break it into more well-defined functional units (modules), contained in a package. Then you use:
from mypackage import modulex
Then use the module name to reference the object:
modulex.MyClass()
etc.
You should (almost) never use
from mypackage import *
since that can introduce bugs (duplicate names from different modules will end up clobbering one another).
No, that is not the standard pattern. from something import * is usually not good practice, as it will import a lot of things you did not intend to. Instead, follow the same split you have, but import names explicitly from one module into another, e.g.:
If base.py has def myfunc, then in main.py use from base import myfunc, so that main.myfunc works for your users too. Of course, you need to take care that you don't end up with a circular import.
Also, if you find that from something import * really is required, then control the exported names using the __all__ construct.
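A small sketch of __all__ (the names are hypothetical):

# base.py
__all__ = ['myfunc']    # 'from base import *' now exports only this name

def myfunc():
    return 42

def _helper():          # a leading underscore also hides a name from star-imports
    pass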
