Python - replace specific file in module

Python - replace specific file in module - python

When someone installs my github repo. I'd like it to install specific files that override files from another module. There's a public library, but there are a couple bugs we want to fix, and the creator doesn't want to merge in a pull request.
I could copy the ENTIRE library into our repo, and that does work. But that seems cumbersome for 5 lines of changes in 2 files.
Is there any easy way to replace specific module files with a different version when my github repo gets installed? If I delete anything but the changed files, I get import errors.
Ideally, the user would install the public library, install our project, and the bugfixes would be implemented (at least when running scripts from our project..)

Related

Jenkins shared library: How to run code from a python module contained in a package?

I want to run a python script as part of a jenkins pipline triggered from a github repo. If I store the script directly in the repo itself, I can just do sh 'python path/to/my_package/script.py' which work perfectly. However since I want to use this from multiple pipelines from multiple repos, I want to put this in a jenkins shared library.
I found this question which suggested storing the python file in the resources directory and copying it to a temp file before use. That only works if the script is one standalone file. Unfortunately, mine is a package with multiple python files and imports between them, so thats a no go. I also tried to copy the entire folder containing the python package from the answer to this question, which suggests getting the location of the library with
import groovy.transform.SourceURI
import java.nio.file.Path
import java.nio.file.Paths
class ScriptSourceUri {
#SourceURI
static URI uri
}
but its gives me the following error:
Scripts not permitted to use staticMethod java.net.URI create java.lang.String. Administrators can decide whether to approve or reject this signature.
It seems that some additional permissions are required, which I don't think I'll be able to acquire (its a shared machine).
So? Does anyone know how I can run a python package from jenkins shared library? Right now the only solution I can think of is to manually recreate the directory structure of the python package, which is obviously very messy and non-generic.
PS: There is no particular reason for using the python script over writing the same script in groovy. Its just that the python script is well tested, well understood and well supported. Rewriting the whole thing in groovy just isn't feasible right now.

You can go to http://host:8080/jenkins/scriptApproval/ page of your Jenkins installation and approve the request for your scripts, please see below:-
And follow the link for more information.

How to use a utils across projects and ensure it is updated

I have made a few Python projects for work that all revolve around extracting data, performing Pandas manipulations, and exporting to Excel. Obviously, there are common functions I've been reusing. I've saved these into utils.py, and I copy paste utils.py into each new project.
Whenever I change utils.py, I need to ensure that I change it in my other project, which is an error-prone process
What would you suggest?
Currently, I create a new directory for each project, so
/PyCharm Projects
--/CollegeBoard
----/venv
----/CollegeBoard.py
----/Utils.py
----/Paths.py
--/BoxTracking
----/venv
----/BoxTracking.py
----/Utils.py
----/Paths.py
I'm wondering if this is the most effective way to structure/version control my work. Since I have many imports in common, too, would a directory like this be better?
/Projects
--/Reporting
----/venv
----/Collegeboard
------/Collegeboard.py
------/paths.py
----/BoxTracking
------/BoxTracking.py
------/paths.py
----/Utils.py
I would appreciate any related resources.

Instead of putting a copy of utils.py into each of your projects, make utils.py into a package with it's own dedicated repository/folder somewhere. I'd recommend renaming it to something less generic, such as "zhous_utils".
In that dedicated repository for zhous_utils, you can create a setup.py file and you can use that setup.py file to install the current version of the zhous_utils into your python install. That way you can import zhous_utils into any other python script on your PC, just like you would import pandas or any other package you've installed to your computer.
Check out this stackoverflow thread: What is setup.py?
When you understand setup.py, then you will understand how to make and install your own packages so that you can import those installed packages to any python script on your PC. That way all source code for zhous_utils is centralized to just one folder on your PC, which you can update whenever you want and re-install the package.
Now, of course, there are some potential challenges/downsides to this. If you install zhous_utils to your computer and then import and use zhous_utils in one of your other projects, then you've just made zhous_utils into a dependency of that project. That means that if you want to share that project with other people and let them work on it as well or use it in some way, then they will need to install zhous_utils. Just be aware of that. This won't be an issue if you're the only one interacting/developing the source code of the projects you intend to import zhous_utils into.

Locally modify package from pip

I've locally installed via pip Python package in virtualenv. I'd like to modify it (not monkey patch or subclass, but deeply modify) and keep it in my source control system referencing without installing. Maybe later I'd like to package it again so I'd like to keep all files for creating package, not only python sources.
Should I just copy it to my project folder and deinstall from virtualenv?

Two points. One, are the changes you're planning to make useful for anyone else? If the first, you might consider cloning the source repo, making your changes and submitting a PR. Even if it's not immediately merged, you can make use of setup.py to create a local package and install that in your virtualenv.
And two, are you planning to use these changes for just one project, or on many projects? If it's just for one project, throwing it in your repo and deeply modifying it is probably an ok thing, (although you need to confirm you're allowed to do so by the license). If you can foresee using this in multiple projects, you're probably better off creating a repo for it, and packaging it via setup.py.

GAE - Including external python modules without adding them to the repository?

I'm current working on a python based Google App Engine project. Specifically, I'm using Flask for the application. I'm wondering what the accepted method of including external python modules is, specifically when it comes to the repository. From what I can tell, including other people's code in my repository is bad form for several reasons. However, other people will be working on the same repository, so we should be using the same external modules to insure the same results.
Specifically, I need to include Flask (and its dependencies) to my application. The easiest way to do this with Google App Engine is just to throw them into the root level:
MyProject
app.yaml
main.py
MyApp
Flask
...
What is the proper way to bring in these external modules in such a project? Both a generalized answer and one specific to my case would be useful. Also, any other related recommendations would be appreciated. Thank you much.

While it is indeed possible to include third party libraries as submodules or symlinks from external repositories, in practice it's not a good idea. Here are two scenarios on what could go wrong:
If the third party library releases a new version that breaks the functionality, you will have to either make all the necessary changes to meet the new requirements or simply find the previous version to keep working and break the external connection. Usually this happens when you are very close to deadlines.
If the third party library releases a new version and one of your colleagues is upgraded and made all the necessary changes to support the new version, on your side the code will be broken until you will upgrade as well.
The above examples are much more visible in big projects with lots of dependencies and as more people joining the project in the long run it becomes a huge problem! I could come up with more examples, but I think you can see the point.
Your best option is to include the external libraries into your repository, which also has the advantage that you are able to have the whole project up and running on a new machine without many dependencies. There are many ways on how to organize your third party libraries and all of them needs to be included on the same or deeper level with your app.yaml file. Just as #dragonx mentioned include only the core library code.
Also do not afraid putting stuff into your repository cause space is not an issue today and these libraries usually not updating that often so your repository size is not getting too much bigger over time.
Since you mentioned Flask on Google App Engine, you can check out my gae-init project, where you can see in practice how the external libraries are organised.

You're actually asking two questions here.
How do I include the external library in my GAE project?
You've got the right idea. Whatever way you go about it, you must somehow include Flask and its dependencies in the root of your GAE project. One way is to put a copy directly in there.
The second way is to use a symbolic link to the folder that contains the external library. I'm not sure about Flask, but often times external repos contain the actual library code in a subdirectory - so often you don't want the root of the repo in your GAE app, just the root of the actual source. In this case, it's easier to put a symlink that links to the source folder.
How do I manage external libraries in my source repo?
This is a harder question to answer since it depends what source control tool you're using. Yes, you do want to have everyone use the same versions of external libraries, and they should be included in your source control somehow.
If you're using git, git submodule is the way to go. It's a bit confusing to start with but it'll get the job done.
I'd recommend a repo structure that looks something like this
repo/
thirdparty/
flask/
other_dependency/
another_dependency/
README.TXT
setup.py
src/
app/
app.yaml
your_source.py
softlink_to_flask
softlink_to_other_dependency
softlink_to_another_dependency_src
In this example you keep the source to your external libraries in the thirdparty folder. These may be git submodules. In the app folder you have your source, and softlinks to the appropriate files that are actually needed for your app to run. In this case, the actual code for another_dependency may be in the another_dependency/src folder rather than the actual root of another dependency. This way you don't need to include the unnecessary files in your deployment folder, but you can still keep the entire library in your repo.

You can't just create requirements.txt and put it to GAE. Your code must include all pure python libraries that used your project and doesn't supported by GAE (https://developers.google.com/appengine/docs/python/tools/libraries27).
If you look at flask deploy example for GAE (http://flask.pocoo.org/docs/quickstart/#deploying-to-a-web-server and https://github.com/kamalgill/flask-appengine-template) you can find some dependencies like flask, werkzeug and etc. and all this dependencies you must push to GAE server.
So I see three solutions:
Use local requirements for local development and make custom build function that will download all dependencies, put with your application and upload to GAE server.
Add tools for local deployment when you just start project that put required libraries with your application (don't forget about .gitignore).
Use something like git submodules to requirements repositories.

There is two case for using python third party packages in google app engine project:
If your library is one of the supported runtime-provided third-party libraries of GAE section
just add it to your app.yml file under libraries
libraries:
- name: package_name
version: latest
Add your code
import pack_name
Sometimes you need to install the package with
pip install package_name
Make sure you're using the right interpreter, by using
pip freeze
you can make sure the package is installed successfully to the right path.
Otherwise, if GAE does not support you library, you need to download it manually and save it locally under root/Lib directory:
or through GIT
or through pip (pip install package_name -t path/to/your/Lib/dir)
After that, we should declare Lib directory as source dir in pycharm
pycharm->preferences->Project Structure
Choose Lib directory and mark it as source.
Then, import it.
import pack_name
Pay attention that when you're doing the import, you choosing the local path and not your python path.
In general, that's recommended to have requirements.txt file, that includes all the used packages names, and then the pycharm will recognize the uninstalled packages and suggest you to install them.
Good Luck

Change dependencies code on dotcloud. Django

I'm deploying my Django app with Dotcloud. While developing locally, I had to make changes inside the code of some dependencies (that are in my virtualenv).
So my question is: is there a way to make the same changes on the dependencies (for example django-registration or django_socketio) while deploying on dotcloud?
Thank you for your help.

There are many ways, but not all of them are clean/easy/possible.
If those dependencies are on github, bitbucket, or a similar code repository, you can:
fork the dependency,
edit your fork,
point to the fork in your requirements.txt file.
This will allow you to track further changes to those dependencies, and easily merge your own modifications with future versions.
Otherwise, you can include the (modified) dependencies with your code. It's not very clean and increases the size of your app, but that's fine too.
Last but not least, you can write a very hackish postinstall script, to locate the .py file to be modified (e.g. import foo ; foopath = foo.__file__), then apply a patch on that file. This would probably cause most sysadmins to cringe in terror, but it's worth mentioning :-)

If you are using a requirements.txt, no, there is not a way to do that from pypi, since Dotcloud is simply downloading the packages you've specified from pypi, and obviously your changes within your virtualenv are not going to be reflected by the canonical versions of the packages.
In order to use the edited versions of your dependencies, you'll have to bundle them into your code like any other module you've written, and import them from there.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.