So I have a set of .py documents as follows:
/Spider
    Script.py
    /Classes
        __init__.py
        ParseXML.py
        CrawlWeb.py
        TextAnalytics.py
Each .py file in the /Classes subfolder contains a class for a specific purpose, and the script schedules the different components. I have a couple of questions:
1) A lot of the classes share frameworks such as urllib2, threading etc. What is considered the 'best' form for setting up the import statements? I.e. is there a way for me to use something like the __init__.py file to pass the shared dependencies to all of the classes, then use the specific .py files to import the singular dependencies?
2) Some of the classes call on the other classes (e.g. the CrawlWeb.py file uses the ParseXML class to update the XML files after crawling). I separated the classes like this because each was quite large and was easier to update on its own. Would it be considered best form to combine the classes in this case, or are there other ways to get round this?
The classes will only ever be used as part of the script. So far the only real solution I've been able to come up with is perhaps using the Script.py file for all of the import statements, but it seems a little bit messy. Any advice would be very appreciated.
The best way to handle the common imports is to import them in each module where they're used. While this probably feels annoying because you have to type more, it makes it dramatically clearer to the reader of the code which modules are in scope. You're not missing anything by repeating the common imports in each module; you're doing it right.
While you certainly can put your classes all into separate files, it's more common in Python to group related classes together in a single module. Given how short it sounds like your script is, that may mean it makes sense for you to pull everything into a single file. This is a judgment call, and I cannot offer a hard-and-fast rule.
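One reason repeating an import in several modules costs so little is that Python caches every loaded module in sys.modules, so a later import of the same name is only a lookup. A minimal sketch of that behaviour follows; json stands in here for the shared dependencies mentioned in the question (urllib2, threading), which is an assumption made only to keep the example runnable on Python 3.

```python
import sys

import json                      # loaded on first import, then cached in sys.modules
first = sys.modules["json"]

import json as json_again        # a repeated import is just a cache lookup
same_object = json_again is first
print(same_object)               # True: both names refer to one module object
```

So importing urllib2 in both ParseXML.py and CrawlWeb.py does not load it twice; each file simply gets its own name bound to the same module object.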
Related
Below is a screenshot of part of an article explaining how to access the example Python module dataset.py, for which they provide the following line:
import my_model.training.dataset
I'd like to know if the following methods below are equivalent and accomplish the same thing:
from my_model.training import dataset
from my_model import training.dataset
I have a library where I've been accumulating all of my .py files over time. I'm trying to organize it into something neater, but I'm having trouble deciding how to do that.
The library (or rather, the folder I'm dumping everything in) is meant to be just a collection of independent modules, but some of the modules have cross-dependencies. It would be easier if I had a systematic way to group functions and classes into certain files, i.e. modules. Should they be grouped by purpose?
Keep in mind these aren't even packages for projects; they are the building blocks for other packages. It's just my own personal collection of classes and functions, but it's starting to get hard to manage, so I could use some advice.
Thanks
I am a beginner in Python, and I am trying to learn by making a simple game. I started by having everything in one big file (let's call it main.py), but it is getting to the point where it has so many classes and functions that I would like to split this code into more manageable components.
I have some experience with LaTeX (although certainly not an expert either) and, in LaTeX there is a function called \input which allows one to write part of the code in a different file. For example, if I have files main.tex and sub.tex which look like:
main.tex:
Some code here.
\input{sub}
Lastly, some other stuff.
and
sub.tex:
Some more code here
then, when I execute main.tex, it will execute:
Some code here.
Some more code here
Lastly, some other stuff.
I wonder, is there a similar thing in Python?
Note 1: From what I have seen, the most commonly suggested way to go about splitting your code is to use modules. I have found this a bit uncomfortable for a few reasons, which I will list below (of course, I understand that I find them uncomfortable because I am inexperienced, and not because this is the wrong way to do things).
Reasons why I find modules uncomfortable:
My main.py file imports some other modules, like Pygame, which need to be imported into all the new modules I create. If for some reason I wanted to import a new module into main.py later in the process I would then need to import it on every other module I create.
My main.py file has some global variables that are used in the different classes; for example, I have a global variable CITY_SIZE that controls the size of all City instances on the screen. Naturally, CITY_SIZE is used in the definition of the class City. If I were to move the class City to a module classes.py, then I need to define CITY_SIZE on classes.py as well, and if I ever wanted to change the value of CITY_SIZE I would need to change its value on classes.py as well.
Again, suppose that I add a classes.py module where I store all my classes, like City. Then in main.py I need to write classes.City in my code instead of City. I understand this can be overcome by using from classes import City but then I need to add a line of code every time I add a new class to classes.py.
Note 2: I would very much appreciate any comments about how to use modules comfortably in Python, but please note that because this is not my question I would not be able to accept those as valid answers (but, again, they would be appreciated!).
If you have all of your modules in the same directory, you can simply use:
import <name of submodule without .py>
For example, if a submodule file was named sub.py, you would import it like this:
import sub
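Unlike \input, import does not paste the other file's text into the current one; the imported names stay in the module's own namespace and are reached through it. The sketch below illustrates this using the CITY_SIZE concern from the question. The file names game_settings.py and game_classes.py and their contents are hypothetical, and they are written to a temporary directory only so that the example runs as a single script.

```python
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical file names and contents, created in a temporary directory
# so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "game_settings.py").write_text("CITY_SIZE = 20\n")
    Path(tmp, "game_classes.py").write_text(textwrap.dedent("""\
        import game_settings

        class City:
            def __init__(self):
                # Looked up in game_settings each time a City is created.
                self.size = game_settings.CITY_SIZE
    """))
    sys.path.insert(0, tmp)
    try:
        import game_settings
        from game_classes import City

        default_size = City().size
        game_settings.CITY_SIZE = 30      # change the value in one place
        updated_size = City().size
    finally:
        sys.path.remove(tmp)

print(default_size, updated_size)         # 20 30
```

With this layout the constant is defined once, in the settings module, and every module that needs it imports that module rather than redefining the value.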
Regarding code structure and formatting, I can't find any clear information about this small nitpick relating to importing many modules at once.
Say I have two files, crudely named solver.py and data.py. Like most people, I have a set of standard modules that I import for each solver. Is it advisable to create a third module or text file, such as importList or importList.py, which contains all the statements such as import xpackagex as xpx? Or should I just suck it up and copy all of the imports into each file I write? Of course I am concerned about compatibility, since in the main function, where one could type from importList import *, it would overwrite any other names already chosen, but it might make for some tidier-looking code, particularly when many libraries are imported. Is there a standard approach for this?
Best wishes and thanks in advance.
I've split a program into three scripts. One of them, 'classes.py', is a module defining all the classes I need. Another one is a sort of setup module, call it 'setup.py', which instantiates a lot of objects from 'classes.py' (it's just a bunch of variable assignments with a few for loops, no functions or classes). It has a lot of strings and stuff I don't want to see when I'm working on the third script which is the program itself, i.e. the script that actually does something with all of the above.
The only way I got this to work was to add, in the 'setup.py' script:
from classes import *
This allows me to write quickly in the setup file without having the namespace added everywhere. And, in the main script:
import setup
This has the advantages of PyCharm giving me full code completion for classes and methods, which is nice.
What I'd like to achieve is having the main script import the classes, and then run the setup script to create the objects I need, with two simple commands. But I can't import the classes script into the main script because then the setup script can't do anything, having no class definitions. Should I import the classes into both scripts, or do something else entirely?
Import in each file. Consider this SO post. From the answer by Mr Fooz there,
Each module has its own namespace. So for boo.py to see something from an external module, boo.py must import it itself.
It is possible to write a language where namespaces are stacked the way you expect them to: this is called dynamic scoping. Some languages like the original lisp, early versions of perl, postscript, etc. do use (or support) dynamic scoping.
Most languages use lexical scoping instead. It turns out this is a much nicer way for languages to work: this way a module can reason about how it will work based on its own code without having to worry about how it was called.
See this article for additional details: http://en.wikipedia.org/wiki/Scope_%28programming%29
Intuitively this feels nicer too, as you can immediately (in the file itself) see which dependencies the code has - this will allow you to understand your code much better, a month, or even a year from now.
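The point about each module having its own namespace can be checked directly. In the sketch below, two hypothetical helper modules (helper_no_import and helper_with_import, names invented for this illustration) are written to a temporary directory; the importing script imports math itself, yet only the helper that does its own import can use it.

```python
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module names, created on the fly so the sketch is one script.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "helper_no_import.py").write_text(textwrap.dedent("""\
        def area(r):
            return math.pi * r * r   # 'math' was never imported in this module
    """))
    Path(tmp, "helper_with_import.py").write_text(textwrap.dedent("""\
        import math

        def area(r):
            return math.pi * r * r
    """))
    sys.path.insert(0, tmp)
    try:
        import math                 # visible in this script only, not in the helpers
        import helper_no_import
        import helper_with_import

        try:
            helper_no_import.area(1.0)
            raised_name_error = False
        except NameError:
            raised_name_error = True

        good_area = helper_with_import.area(1.0)
    finally:
        sys.path.remove(tmp)

print(raised_name_error)            # True
```

The same reasoning applies to the classes.py and setup.py case above: each file that uses the classes needs its own import of them.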
I'm writing a small package for internal use and have come to a design problem. I define a few classes and constants (e.g., a server IP address) in some file, let's call it mathfunc.py. Now, some of these classes and constants will be used in other files in the same package. My current setup is like this:
/mypackage
    __init__.py
    mathfunc.py
    datefunc.py
So, at the moment I think I have to import mathfunc.py in datefunc.py to use the classes defined there (or alternatively import both of them all the time). This sounds wrong to me, because then I'll be in a lot of pain importing lots of files everywhere. Is this a proper design at all, or is there some other way? Maybe I can put all the definitions in some file which will not be a subpackage on its own, but will be used by all the other files?
Nope, that's pretty much how Python works. If you want to use objects declared in another file, you have to import from it.
Tips:
You can keep your namespace clean by only importing the things you need, rather than using from foo import *.
If you really need to do a "circular import" (where A needs things in B, and B needs things in A) you can solve that by only importing inside the functions where you need the object, not at the top of a file.
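The circular-import tip can be sketched as follows. The module names demo_a and demo_b and their functions are hypothetical and are written to a temporary directory so the example runs as one script. demo_a imports from demo_b at the top of the file, while demo_b defers its import of demo_a until the function that needs it is called; if demo_b instead had from demo_a import a_value at the top, importing demo_a first would fail with an ImportError, because demo_a would still be only partially initialised at that point.

```python
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical modules that depend on each other.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_a.py").write_text(textwrap.dedent("""\
        from demo_b import b_value      # top-level import of demo_b

        def a_value():
            return 1

        def a_plus_b():
            return a_value() + b_value()
    """))
    Path(tmp, "demo_b.py").write_text(textwrap.dedent("""\
        def b_value():
            return 10

        def b_plus_a():
            # Deferred import: runs when the function is called,
            # by which time demo_a has finished loading.
            from demo_a import a_value
            return b_value() + a_value()
    """))
    sys.path.insert(0, tmp)
    try:
        import demo_a
        import demo_b

        result_a = demo_a.a_plus_b()
        result_b = demo_b.b_plus_a()
    finally:
        sys.path.remove(tmp)

print(result_a, result_b)               # 11 11
```

Deferring the import works, but when two modules need each other this much it can also be a hint that the shared pieces belong in a third module that both import.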