Below is a screenshot of part of an article explaining how to access the example Python module dataset.py; it provides the following line:
import my_model.training.dataset
I'd like to know whether the two methods below are equivalent and accomplish the same thing:
from my_model.training import dataset
from my_model import training.dataset
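For reference, here is what each form binds, in a minimal sketch (my_model stands in for the article's package; the behavior in the comments is standard Python import semantics):

import my_model.training.dataset        # binds the name 'my_model'; you must
                                        # use the full dotted path afterwards

from my_model.training import dataset   # binds the name 'dataset' directly

# from my_model import training.dataset # SyntaxError: the name after
#                                       # 'import' cannot contain a dot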
I have a library where I've been accumulating all of my .py files over time. I'm trying to organize it into something neater, but I'm having trouble deciding how to do that.
The library (or rather, the folder I'm dumping everything in) is meant to be just a collection of independent modules, but some of the modules have cross-dependencies. It would be easier if I had a systematic way to group functions/classes within certain files, i.e. modules. Should they be grouped by purpose?
Keep in mind these aren't even packages for projects; they are the building blocks for other packages, just my own personal collection of classes and functions. But it's starting to get hard to manage, so I could use some advice.
Thanks
I'm doing a research project on detecting breaking changes from Python library upgrades. One of the steps is to extract the difference between two major versions of the same Python library using static analysis (which could be AST-based or not), in order to triage the patterns of change. The detection should find differences not only in .py files but also in other project files, including config files, resources, etc. Ideally, a scenario like a .py file moving to another module should also be covered. So I have two questions here:
Is there a tool that can do a similar job and also support flexible configuration for analysis?
If not, what would be the best strategy to search for that kind of difference and identify its category (e.g. variable, function, etc.)?
Sorry, this might be a silly question; I'm not coming from a Python background and am really running out of ideas here. Any thoughts, ideas, and inputs are welcome. Thanks in advance.
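As a starting point for question 2, here is a minimal sketch of an AST-based first pass (file paths and the mylib name are illustrative; this only catches added/removed top-level definitions, not moves or config-file changes):

import ast

def top_level_defs(path):
    # Collect (kind, name) pairs for top-level classes and functions.
    with open(path) as f:
        tree = ast.parse(f.read())
    return {(type(node).__name__, node.name)
            for node in tree.body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))}

old = top_level_defs('v1/mylib/core.py')   # version A of the same file
new = top_level_defs('v2/mylib/core.py')   # version B
print('removed:', old - new)               # candidates for breaking changes
print('added:  ', new - old)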
Just spitballing some ideas here:
I don't think I'd be so concerned about detecting changes in the source files up front. There are a lot of ways to move code around among files without changing the interface to the module. For example, you can put all of the code in __init__.py, or you can split it up into any number of files and subdirectories. Either way, the programmatic interface stays the same.
Instead, you could use the dir() built-in to detect changes in the public classes and methods of the module. This works well for libraries that use named arguments, but not for functions that just use def func(*args, **kwargs) (which is why that should be avoided, all you former Perl programmers!).
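A minimal sketch of the dir() idea (mylib is a placeholder; each snapshot has to be taken in an environment with the corresponding version installed):

import importlib

def public_api(module_name):
    # Snapshot a module's public top-level names via dir().
    mod = importlib.import_module(module_name)
    return sorted(n for n in dir(mod) if not n.startswith('_'))

# Take one snapshot per installed version (in separate environments or
# virtualenvs), save each to a file, and diff the files to spot
# removed or renamed names.
print('\n'.join(public_api('mylib')))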
If the module uses the new type hinting, you can really get some mileage out of detecting changes in types. A tool that actually parses the Python and infers types would work as well; I would guess VS Code contains such a library that it uses to give context-sensitive help.
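A sketch of that idea using the stdlib inspect module, which surfaces type hints in signatures when the library provides them (mylib is again a placeholder):

import inspect
import importlib

def api_signatures(module_name):
    # Map each public callable to its signature string; annotations from
    # type hints appear in the output, e.g. '(path: str, retries: int = 3)'.
    mod = importlib.import_module(module_name)
    sigs = {}
    for name, obj in vars(mod).items():
        if callable(obj) and not name.startswith('_'):
            try:
                sigs[name] = str(inspect.signature(obj))
            except (TypeError, ValueError):  # some builtins expose no signature
                sigs[name] = '<unknown>'
    return sigs

print(api_signatures('mylib'))   # diff this dict across the two versions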
Regarding code structure and formatting, I can't find any clear information about this small nitpick relating to importing many modules at once.
Say I have two files, crudely named solver.py and data.py. Like most people, I have a set of standard modules that I import for each solver. Is it advisable to create a third module such as importList.py which contains all of those imports, e.g. import xpackagex as xpx? Or should I just suck it up and copy over all of the imports for each file I write? Of course I am concerned about compatibility, since in the main function a from importList import * would overwrite any other choices, but it might make for some tidier-looking code, particularly when many libraries are imported. Is there a standard approach for this?
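For concreteness, the pattern being asked about would look something like this sketch (xpackagex is the question's own placeholder):

# importList.py -- the proposed shared-import module
import xpackagex as xpx

# solver.py
from importList import *   # pulls xpx (and anything else) into this namespace
# Caveat: star-imports hide where names come from and can silently shadow
# earlier bindings, which is exactly the compatibility worry raised above.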
Best wishes and thanks in advance.
I have created a package in python that I want to use in many other modules. The package contains a lot of classes; some large and some small. I have decided to keep each class in its own module as this makes sense in the context of the package and is, I think, what users would want.
I would like to know the most pythonic way to organise the package.
At present it is structured as shown here, in a top-level directory called 'org':
(remember I have many more modules than the three shown here, and the list of modules is very long).
I can import any of the classes into different packages using:
import sys
sys.path.append('..')   # the directory that contains the 'org' package
from org.a import A
A()
I would like to organise it like this and still use the same import statements (if possible):
Unfortunately, if I do this, I cannot import any of the classes using the code shown above.
Can someone please show me how they would do it?
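One common way to keep the old import path working after moving a module is to leave a thin shim at the old location. A sketch, where 'subpkg' is a made-up name for wherever a.py ends up:

# org/subpkg/a.py -- the class now lives here
class A(object):
    pass

# org/a.py -- thin shim left at the old location
from org.subpkg.a import A   # so 'from org.a import A' keeps working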
So I have a set of .py documents as follows:
/Spider
Script.py
/Classes
__init__.py
ParseXML.py
CrawlWeb.py
TextAnalytics.py
Each .py document in the /Classes subfolder contains a class for a specific purpose, and the script schedules the different components. There are a couple of questions I had:
1) A lot of the classes share frameworks such as urllib2, threading etc. What is considered the 'best' form for setting up the import statements? I.e. is there a way for me to use something like the __init__.py file to pass the shared dependencies to all of the classes, then use the specific .py files to import the singular dependencies?
2) Some of the classes call on the other classes (e.g. the CrawlWeb.py document uses the ParseXML class to update the XML files after crawling). I separated out the classes like this because they were each quite large and so were easier to update individually... Would it be considered best form to combine the classes in this case, or are there other ways to get around this?
The classes will only ever be used as part of the script. So far the only real solution I've been able to come up with is perhaps using the Script.py file for all of the import statements, but it seems a little bit messy. Any advice would be very appreciated.
The best way to handle the common imports is to import them in each module where they're used. While this probably feels annoying because you have to type more, it makes it dramatically clearer to the reader of the code which modules are in scope. You're not missing some trick by repeating the common imports; you're doing it right.
While you certainly can put your classes all into separate files, it's more common in Python to group related classes together in a single module. Given how short it sounds like your script is, that may mean it makes sense for you to pull everything into a single file. This is a judgment call, and I cannot offer a hard-and-fast rule.
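Concretely, each file just states its own dependencies, and cross-class use is an ordinary import. A sketch based on the question's layout (the method and filename are illustrative):

# Classes/ParseXML.py -- imports exactly what it uses
import urllib2      # Python 2, as in the question
import threading

class ParseXML(object):
    def update(self, path):
        pass   # parse and rewrite the XML file here

# Classes/CrawlWeb.py -- using another class is just another import
from Classes.ParseXML import ParseXML

class CrawlWeb(object):
    def crawl(self):
        ParseXML().update('feed.xml')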
I have a bunch of Python modules I want to clean up, reorganize and refactor (there's some duplicate code, some unused code ...), and I'm wondering if there's a tool to make a map of which module uses which other module.
Ideally, I'd like a map like this:
main.py
-> task_runner.py
-> task_utils.py
-> deserialization.py
-> file_utils.py
-> server.py
-> (deserialization.py)
-> db_access.py
checkup_script.py
re_test.py
main_bkp0.py
unit_tests.py
... so that I could tell which files I can start moving around first (file_utils.py, db_access.py), which files are not used by my main.py and so could be deleted, etc. (I'm actually working with around 60 modules)
Writing a script that does this probably wouldn't be very complicated (though there are different syntaxes for import to handle), but I'd also expect that I'm not the first one to want to do this (and if someone made a tool for this, it might include other neat features such as telling me which classes and functions are probably not used).
Do you know of any tools (even simple scripts) that assist code reorganization?
Do you know of a more exact term for what I'm trying to do? Code reorganization?
Python's modulefinder does this. It is quite easy to write a script that will turn this information into an import graph (which you can render with e.g. graphviz): here's a clear explanation. There's also snakefood, which does all the work for you (using ASTs, too!).
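A minimal sketch of driving modulefinder directly (the entry-point path is illustrative):

from modulefinder import ModuleFinder

finder = ModuleFinder()
finder.run_script('main.py')            # analyze the entry point

for name, mod in sorted(finder.modules.items()):
    print(name, '->', mod.__file__)     # __file__ is None for built-ins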
You might want to look into pylint or pychecker for more general maintenance tasks.
Writing a script that does this probably wouldn't be very complicated (though there are different syntaxes for import to handle),
It's trivial. There's import and from module import. Two syntaxes to handle.
Do you know of a more exact term for what I'm trying to do? Code reorganization?
Design. It's called design. Yes, you're refactoring an existing design, but...
Rule One
Don't start a design effort with what you have. If you do, you'll only "nibble around the edges" making small and sometimes inconsequential changes.
Rule Two
Start a design effort with what you should have had if you'd only been smarter. Think broadly and clearly about what you're really supposed to be doing. Ignore what you did.
Rule Three
Design from the ground up (or de novo as some folks say) with the correct package and module architecture.
Create a separate project for this.
Rule Four
Test First. Write unit tests for your new architecture. If you have existing unit tests, copy them into the new project. Modify the imports to reflect the new architecture and rewrite the tests to express your glorious new simplification.
All the tests fail, because you haven't moved any code. That's a good thing.
Rule Five
Move code into the new structure last. Stop moving code when the tests pass.
You don't need to analyze imports to do this, BTW. You're just using grep to find modules and classes. The old imports and the tangled relationships among them don't matter, and don't need to be analyzed. You're throwing them away. You don't need tools smarter than grep.
If you feel an urge to move code, you must be very disciplined: (1) you must have test(s) which fail, and then (2) you can move some code to pass the failing test(s).
chuckmove is a tool that lets you recursively rewrite imports in your entire source tree to refer to a new location of a module.
chuckmove --old sound.utils --new media.sound.utils src
...this descends into src and rewrites statements that import sound.utils to import media.sound.utils instead. It supports the whole range of Python import formats, i.e. from x import y, import x.y.z as w, etc.
Modulefinder may not work with Python 3.5*, but pydeps worked very well:
Installation:
sudo apt install python-pygraphviz
pip install pydeps
Then, in the directory where you want to map from,
pydeps --max-bacon=0 .
...to create a map of maximum depth (--max-bacon=0 removes the default depth limit).
*An issue in Python 3.5 (but not 3.6) caused the problems with modulefinder, similar to this