In my CI/CD environment I have multiple projects that use mostly the same tests, with a bit of variation. Since they are all mostly the same, just used slightly differently by different projects/builds, I am looking for a way (if there is one) to package the tests themselves so they can be passed around between projects. EDIT: Packaging the tested code is not possible.
The ultimate usage will be something like this:
pip install <test-package>
pytest -m <some-mark-depending-on-build/project> --<additional-variables>
Is there a way to do this?
EDIT: If there is, please point me toward a solution.
Thanks in advance.
Keeping it here for reference.
The way to do this is to create a test package that can be run as a Python module, from a __main__.py.
After researching and some testing, I've concluded that in my case this would create more code to maintain than I would actually get to reuse.
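For reference, a minimal sketch of what such a package could look like (the package name common_tests and the file names are hypothetical, not a prescription):

common_tests/
    __init__.py
    __main__.py
    conftest.py
    test_something.py

# common_tests/__main__.py
import sys

import pytest

if __name__ == "__main__":
    # forward marks and any extra options straight to pytest, and point it
    # at the tests shipped inside this installed package via --pyargs
    sys.exit(pytest.main(["--pyargs", "common_tests", *sys.argv[1:]]))

Once that is published to an internal index, each project can pip install it and run something like python -m common_tests -m <some-mark-depending-on-build/project> --<additional-variables>.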
I have 2 Python projects:

Proj1 (/var/www/proj1)
    venv
    requirements.txt
    app
        fun.py
        fun2.py
    app2
        pdf.py
        somefun2.py

Proj2 (/var/www/proj2)
    venv
    requirements.txt
    another
        anotherfun.py
        anotherfun2.py
    someanother
        someanotherfun.py
        pdfproj2.py
Both work individually, and each has a different set of requirements.
Let's say pdf.py in proj1 has a function generate which generates some PDFs. It uses other modules in the same project (app/fun2, etc.) to do so.
Now I want to call this functionality (pdf.py -> generate) from pdfproj2.py in proj2.
How is this possible?
NB: I am not using any frameworks like Flask/Django, etc.
There are at least three approaches.
1. external call
Change nothing. Pretty much.
Command line callers are already able to take advantage
of $ python proj1/app2/pdf.py arg..., invoking generate().
Arrange for pdfproj2.py to fork off a subprocess
and do exactly that.
Nothing changes in project1,
since its public API already supports this use case.
Notice that you might need to carefully finesse the PATH & PYTHONPATH
env vars, as part of correctly invoking that pdf.py command.
That's the sort of setup that conda and venv are good at.
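A rough sketch of what that could look like from pdfproj2.py (the venv path, the PYTHONPATH value, and the argument are assumptions based on the layout above, and pdf.py must already expose the command-line entry point described):

# pdfproj2.py (proj2) -- external-call sketch
import os
import subprocess

env = dict(os.environ)
env["PYTHONPATH"] = "/var/www/proj1"   # so pdf.py can import app, app2, ...

subprocess.run(
    ["/var/www/proj1/venv/bin/python", "/var/www/proj1/app2/pdf.py", "some-arg"],
    env=env,
    check=True,   # fail loudly if pdf.py exits non-zero
)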
2. merge projects
This is the quick-n-dirty approach. I do not recommend it.
Create project3, and incorporate source code from both existing projects.
Take the union of all library dependencies.
Now you can call generate() in the same address space,
the same process, as the calling python code.
Downside is: ugliness.
The bigger project's codebase is not as easily maintainable.
3. packaging
The "right" way to make generate() available to project2,
or to any project, is to package it up.
Pretend you're going to publish on pypi.
Doesn't matter if you actually do,
let's just prepare for such a possibility.
Create setup.py or similar, maybe use setuptools,
and create a wheel (or at least a tar) of project1.
There are many ways to do this, and best practices
continue to evolve, so I won't delve into details here.
Now you can list project1 as a dependency in project2's requirements.txt,
and import it just like any other dep. Problem solved!
This is the best approach.
It does involve a bit of work, and a gentle learning curve.
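For illustration only, a minimal setup.py for project1 might look roughly like this (name, version, and metadata are placeholders; a pyproject.toml-based build works just as well):

# /var/www/proj1/setup.py -- minimal sketch, not a complete packaging setup
from setuptools import setup, find_packages

setup(
    name="proj1",
    version="0.1.0",
    packages=find_packages(),   # picks up app, app2, ...
    install_requires=[
        # project1's own runtime dependencies go here
    ],
)

After building a wheel (for example with the build package: python -m build), project2 lists proj1 in its requirements.txt, and pdfproj2.py can simply do from app2.pdf import generate.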
There probably isn't one "right answer" to this question. I'm interested in thoughts and opinions.
We have a couple hundred RHEL7/Centos7/Rocky8 nodes. Many of them have python modules installed via pip/pip3.
I've been searching for best practices on routine/monthly patching of these modules... so far I haven't found any. Obviously things installed with rpm/yum/dnf are pretty easy to deal with.
From the pip man page:
pip install --upgrade SomePackage
Great!
But how do you update all of them?
Sure, it is possible to run "pip list" or "pip freeze" and pipe that to awk, etc.
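For concreteness, this is roughly what I mean (a throwaway sketch using pip's --outdated JSON output; the exact handling is just an illustration, not something I want to maintain on hundreds of nodes):

import json
import subprocess
import sys

out = subprocess.run(
    [sys.executable, "-m", "pip", "list", "--outdated", "--format=json"],
    capture_output=True, text=True, check=True,
).stdout

for pkg in json.loads(out):
    # log the version change, e.g. "boto3 1.2 -> 1.3", then upgrade it
    print(f"{pkg['name']} {pkg['version']} -> {pkg['latest_version']}")
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "--upgrade", pkg["name"]],
        check=True,
    )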
Surely, there's a better way. Ideally, one that captures things like "boto3 v1.2 replaced with boto3 v1.3".
Right now it feels like I'm the only one thinking about this. Maybe I am and it is stupid. I'm ok with that response as well (but please tell me why).
A common solution is to deploy the application code inside a Docker container - the container image contains its own version of Python and all the dependency modules, so you don't have to update each module on all the host machines individually. It also means that the combination of OS, Python and modules that you deploy can be tested and then "frozen" into an immutable image which is then deployed the same everywhere.
Right now it feels like I'm the only one thinking about this.
I realise the above answer is probably not helpful in your situation as you already have a fairly large system deployed... but it might help to explain why not many people are developing solutions to your problem!
It is very much what the title says: if I have code in my GitHub repository that uses a non-built-in library and someone copies it, that person will have to have that library installed, right?
Short answer: yes.
Long answer: yes, but you can do the following to make the script executable on other systems.
Add a requirements.txt file, which specifies the libraries that are used and need to be installed. Usually this is done inside a virtual environment, which makes sure that the packages/libraries used do not get mixed up with the main Python installation.
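For example, a typical round trip might look like this (paths and names are illustrative):

# on your machine: record the libraries your code uses
pip freeze > requirements.txt

# on the machine that copied the code: recreate them in an isolated environment
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt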
This is a rough solution, and I would use it only in very extreme scenarios. (I used it when I had to run Python code on AWS Lambda and the library I used had been compiled in C beforehand.) You can directly copy the library folder into your code and use it. Mind you, this will increase the code size and is absolutely not recommended.
I'm cleaning up packaging for a python project I didn't create. Currently, it does some explicitly unsupported magic to get its dependencies from a requirements.txt file. The file looks like it may have been generated by pip freeze; there are fixed versions for everything, and many apparently-extraneous packages listed. I am pretty sure some of these aren't real dependencies, but I don't know which ones.
Given just the source tree, how would I figure out, from scratch, what dependencies ought to be included in install_requires?
As a first stab, I'm grepping for non-stdlib import statements. I hope there's a better way.
There's no way to do this perfectly, because Python is too flexible.
But it's usually possible to do it well enough.
You can start with the stdlib's modulefinder.
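For example, something along these lines (a rough sketch; the report includes stdlib modules, so you still have to filter it down to real third-party dependencies, and the entry-point path is hypothetical):

from modulefinder import ModuleFinder

finder = ModuleFinder()
finder.run_script("path/to/some_entry_point.py")

for name in sorted(finder.modules):
    print(name)                                   # every module the script would import
print("not found:", sorted(finder.badmodules))    # imports it could not resolve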
Beyond that, a number of projects—mostly projects designed for building binary executables, installers, etc. for Python apps—have come up with heuristics that go even farther.
These usually work. And, when they fail, you usually immediately spot it on your first test. Even if they aren't sufficient, they're at the very least good sample code. Here are a few off the top of my head:
cx_Freeze
py2exe
py2app
pyInstaller
In case you're wondering why it's impossible:
Even forgetting about the problem of dependencies in C extension modules, Python is just too flexible to catch all the ways you could import a module via static analysis.
Sure, you'd have to be dealing with code written by someone crazy enough to use explicitly unsupported magic for no good reason… but if you were, there's nothing to stop someone from writing this instead of import lxml:1
import codecs, sys  # 'vzcbeg_zbqhyr' is 'import_module' in rot13

with open('picture.jpg', encoding='cp500') as f:
    getattr(list(sys.modules.values())[11], codecs.encode('vzcbeg_zbqhyr', 'rot13'))(f.read().strip())
In reality, things aren't going to be that bad. But they could easily be too bad for rg import to be sufficient.
You could try to detect all the imports dynamically with a simple import hook, but that's only guaranteed to work if you can exercise 100% of the code paths.
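For what it's worth, such a hook can be as small as wrapping builtins.__import__ (a sketch; it only records imports that actually execute while you exercise the code):

import builtins

seen = set()
_real_import = builtins.__import__

def _spy_import(name, *args, **kwargs):
    seen.add(name.split('.')[0])            # record the top-level package name
    return _real_import(name, *args, **kwargs)

builtins.__import__ = _spy_import

# ...run the code under test here, then inspect `seen`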
1. Of course this only works if importlib was the 12th module loaded, and if picture.jpg is not a JPEG image but a text file whose contents are, in EBCDIC, lxml\n
I've had great results with pipreqs, which will automatically generate a requirements.txt file from your source code.
pipreqs /home/project/location
Successfully saved requirements file in /home/project/location/requirements.txt
I wrote a tool, realreq, specifically for this issue.
You can install it using pip: python3 -m pip install realreq. Using it is as easy as:
realreq -s /path/to/your/source
It will then gather the dependencies actually used in your source code.
I mean, the most effective way would honestly be to go through the code line by line and determine which packages may not be needed, which packages need updates, etc. I know Python 2 and 3 both have ModuleFinder, which finds all the modules a script needs to successfully compile and run, but I've never used it before, so I'm not sure how effective it is, especially for what you're doing. However, if you're interested, I'll attach the link below.
https://docs.python.org/3/library/modulefinder.html
Recently I started working on a personal project on my notebook that, all going OK, will be placed on a server elsewhere. The problem is that I make use of modules. Some were installed from apt-get, others from easy_install, and one or two were placed directly under a subdirectory since I changed them a bit. My question is: is there a way to move all those things together? Moreover, I don't want any of those modules being updated, since that may break something. How do I handle that?
Finally, I'm pretty sure that I've done things the wrong way since the beginning. How do you guys work to avoid those problems?
Have a look at virtualenv. Virtualenv is a tool to create isolated Python environments.
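A rough outline of how that can look in practice (paths and the package pin are illustrative):

virtualenv /path/to/project-env
/path/to/project-env/bin/pip install somepackage==1.2.3   # pin exact versions so nothing updates behind your back
/path/to/project-env/bin/pip freeze > requirements.txt    # record everything the project needs

# later, on the server, recreate the same environment:
virtualenv /path/to/project-env
/path/to/project-env/bin/pip install -r requirements.txt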