Suppose I have a private git repo A (a library repo) which I want to use from my project, repo B.
I clone A to my ~/workspace/A
and I work on my project at ~/workspace/B
B's virtualenv resides in ~/virtualenvs/B
In order to modify A and test the modified version from B, I currently have to:
modify A
commit and push to A's origin
pip install git+http://A-repository
which is very time consuming. Can I reduce the above steps to just:
modify A
by placing A somewhere inside project B's virtualenv?
and commit & push only after I test the modified A from B?
** Edit
I could think of two ways and wonder if there's a better one.
add A as a git submodule of B, somewhere under ~/workspace/B that is on the Python module path
: I just never liked submodules whenever I used them; they are hard to grasp and manage.
add ~/workspace/parent-of-A/ to the Python path, ahead of the virtualenv's Python path
So when I edit ~/workspace/parent-of-A/A, it is readily seen by B (see the sketch below).
And the production server, and other people who don't modify A, could keep using the pip-installed version in the virtualenv.
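One way to implement the second option, sketched as shell commands. This assumes, as in the layout above, that ~/workspace/A is itself the importable package (it contains __init__.py); entries on PYTHONPATH are searched before the virtualenv's site-packages, so the working copy shadows any pip-installed A:

    source ~/virtualenvs/B/bin/activate
    export PYTHONPATH=~/workspace             # the parent directory that contains the A package
    python -c "import A; print(A.__file__)"   # should now point at ~/workspace/A/...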
Related
I have a Python package that is released in pypi and can be installed via pip install.
I want to do some minor changes in that package that are only of use for me in my Pycharm project A. I expect that these changes will be quite frequent, so I want to be able to do these changes on the fly.
I know that I can work with a local copy of that project / package by doing the following steps:
perform a git clone
use that code as a separate Pycharm project B
in my own project A, I write:
import sys
sys.path.insert(0, '/path/to/second_pycharm_project')
import project_name
Now I can make code changes in PyCharm project B, and executing project A reflects them correctly.
Nevertheless, I have some constraints:
Variable / code lookup within Pycharm is not possible this way.
Setting a breakpoint in project B's code has to be done from within project A, and it seems to work only when stepping into B's code during debugging.
My question is:
Is there any other (better) way to use another project within Pycharm?
(I thought of changing the code that gets copied by pip install in my virtual environment directly, but this seems very unclean and dangerous to me, in case my changes get accidentally overwritten by pip install)
Clone B, then run pip install -e . in A's virtualenv (do it in the directory containing B's setup.py). That is a local editable install and puts B on A's sys.path (see the sketch below).
Create a git branch of B so you can make your local edits to B without impacting its git origin (but you can still merge them later if you wish).
Use Settings | (current) Project | Project Structure | Add Content Root in PyCharm to add the other project B to your main project A.
(Make sure you keep track of local B changes, because nothing here does that for you if you were to duplicate your work on a different machine and git clone B again.)
Remark: it must be a lowercase -e, not an uppercase -E.
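Put together, the steps above might look roughly like this (a sketch; the virtualenv path is hypothetical, project_name is the name from the question, and the clone needs a setup.py for the editable install to work):

    source /path/to/A-venv/bin/activate      # hypothetical path to project A's virtualenv
    cd /path/to/second_pycharm_project       # the clone that is PyCharm project B (contains setup.py)
    git checkout -b local-tweaks             # keep your local edits on their own branch
    pip install -e .                         # editable install: the package is now on A's sys.path
    pip show project_name                    # distribution name assumed; verify it is the local editable install

After this, the sys.path.insert lines in project A are no longer needed; edits under /path/to/second_pycharm_project are picked up without reinstalling.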
Using GitHub's .gitignore, I was able to filter out some files and directories. However, there are a few things that left me a little bit confused:
GitHub's .gitignore did not include /bin and /share created by venv. However, I assumed these should be ignored by git, as the user is meant to build the virtual environment themselves.
Pip generated a pip-selfcheck.json file, which seemed mostly like clutter. I assume it usually does this, and I just haven't seen the file before because it's been placed with my global pip.
pyvenv.cfg is what I really can't make any sense of, though. On one hand, it specifies the Python version, which ought to be needed by others who want to use the project. On the other hand, it also specifies home = /usr/bin, which, while probably correct on a lot of Linux distributions, won't necessarily apply to all systems.
Are there any other files/directories I missed? Are there any stricter guidelines for how to structure a project and what to include?
Although venv is a very useful tool, you should not assume (unless you have good reason to do so) that everyone who looks at your repository uses it. Avoid committing any files used only by venv; these are not strictly necessary to be able to run your code and they are confusing to people who don't use venv.
The only configuration file you need to include in your repository is the requirements.txt file generated by pip freeze > requirements.txt which lists package dependencies. You can then add a note in your readme instructing users to install these dependencies with the command pip install -r requirements.txt. It would also be a good idea to specify the required version of Python in your readme.
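As a concrete illustration of that workflow (the environment directory name venv/ and the commands are just one common way to do it):

    # .gitignore: keep the whole virtual environment out of the repository, e.g. a line like
    #   venv/
    pip freeze > requirements.txt        # record pinned dependencies
    git add requirements.txt
    # ...anyone else then builds their own environment:
    python -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt      # install the recorded dependencies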
I have a repo that I would like to branch, say into b1 and b2. It contains a file main.py that imports imported.py. Therefore, when I change imported.py, I would like the most up-to-date version to be available to both b1 and b2. I thought it might be best to create a separate git repository, repo A, that contains main.py and then another repo B for imported.py... but what is the standard (and preferably simple) protocol for such a situation?
If you choose to create another repository, you could make it a submodule of your other repositories. But this wouldn't solve your problem of having to update it in each repository. You might be able to add a git hook to the submodule, so that each time you update the submodule, you do a submodule update in each of your other repositories.
I don't think it's worth the effort of going through all this though. It would be easier to just manually merge your changes, or write a simple bash script if you're confident.
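For reference, the manual update step that such a hook would be automating looks something like this in each consuming repository (the submodule path is a placeholder):

    git submodule update --remote path/to/shared   # pull the submodule's latest commit
    git add path/to/shared
    git commit -m "Bump shared submodule"
    # ...repeated in every other repository that uses the submodule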
From my perspective you have two options:
distribute the dependency as a python package (either through pypi, or your own custom pypi, or via wheels)
just leave the file in the same repository and do
git checkout <other branch> -- dependency.py
Anything else will just be painful and error-prone; the second option is sketched below.
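In terms of the b1/b2 example from the question, that second option is just (branch and file names taken from the question):

    git checkout b2                     # the branch that needs the newer file
    git checkout b1 -- imported.py      # copy imported.py as committed on b1 into the working tree and index
    git commit -m "Sync imported.py from b1"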
I've locally installed a Python package via pip in a virtualenv. I'd like to modify it (not monkey-patch or subclass, but deeply modify) and keep it in my source control system, referencing it from there without installing. Maybe later I'd like to package it again, so I'd like to keep all the files needed to create the package, not only the Python sources.
Should I just copy it into my project folder and uninstall it from the virtualenv?
Two points. One, are the changes you're planning to make useful for anyone else? If so, you might consider cloning the source repo, making your changes and submitting a PR. Even if it's not immediately merged, you can make use of setup.py to create a local package and install that in your virtualenv.
And two, are you planning to use these changes for just one project, or in many projects? If it's just for one project, throwing it into your repo and deeply modifying it is probably fine (although you need to confirm the license allows you to do so). If you can foresee using this in multiple projects, you're probably better off creating a repo for it, and packaging it via setup.py.
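A rough sketch of that second route (directory layout and names are hypothetical; the point is that the package keeps its own setup.py so it can be rebuilt and reinstalled later):

    # Hypothetical layout, kept in its own repository:
    #   mylib/
    #     setup.py            # packaging metadata, kept so the package can be rebuilt
    #     mylib/__init__.py   # the (deeply) modified sources
    cd ~/src/mylib
    pip install -e .          # editable install into the active virtualenv while you work on it
    python setup.py sdist     # later: build a source distribution you can install elsewhere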
Just curious how people are deploying their Django projects in combination with virtualenv
More specifically, how do you keep your production virtualenvs correctly synced with your development machine?
I use git for SCM, but I don't keep my virtualenv inside the git repo - should I, or is it best to use pip freeze and then re-create the environment on the server from the freeze output? (If you do this, could you please describe the steps - I am finding very little good documentation on the unfreezing process - is something like pip install -r freeze_output.txt possible?)
I just set something like this up at work using pip, Fabric and git. The flow is basically like this, and borrows heavily from this script:
In our source tree, we maintain a requirements.txt file. We'll maintain this manually.
When we do a new release, the Fabric script creates an archive based on whatever treeish we pass it.
Fabric will find the SHA for what we're deploying with git log -1 --format=format:%h TREEISH. That gives us SHA_OF_THE_RELEASE
Fabric will get the last SHA for our requirements file with git log -1 --format=format:%h SHA_OF_THE_RELEASE requirements.txt. This spits out the short version of the hash, like 1d02afc which is the SHA for that file for this particular release.
The Fabric script will then look into a directory where our virtualenvs are stored on the remote host(s).
If there is not a directory named 1d02afc, a new virtualenv is created and set up with pip install -E /path/to/venv/1d02afc -r /path/to/requirements.txt
If there is an existing path/to/venv/1d02afc, nothing is done
The little magic part of this is passing whatever tree-ish you want to git, and having it do the packaging (from Fabric). By using git archive my-branch, git archive 1d02afc or whatever else, I'm guaranteed to get the right packages installed on my remote machines.
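The core of that per-requirements-SHA logic can be sketched in a few lines of shell (a simplification of the flow above, shown with virtualenv + pip rather than pip's old -E flag; paths and the TREEISH variable are placeholders):

    RELEASE_SHA=$(git log -1 --format=format:%h "$TREEISH")
    REQS_SHA=$(git log -1 --format=format:%h "$RELEASE_SHA" -- requirements.txt)
    VENV=/path/to/venv/$REQS_SHA
    if [ ! -d "$VENV" ]; then
        virtualenv "$VENV"                                    # only build a new env when requirements changed
        "$VENV/bin/pip" install -r /path/to/requirements.txt
    fi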
I went this route since I really didn't want to have extra virtualenvs floating around if the packages hadn't changed between releases. I also don't like the idea of having the actual packages I depend on in my own source tree.
I use this bootstrap.py: http://github.com/ccnmtl/ccnmtldjango/blob/master/ccnmtldjango/template/bootstrap.py
which expects a directory called 'requirements' that looks something like this: http://github.com/ccnmtl/ccnmtldjango/tree/master/ccnmtldjango/template/requirements/
There's an apps.txt, a libs.txt (which apps.txt includes; I just like to keep Django apps separate from other Python modules) and a src directory which contains the actual tarballs.
When ./bootstrap.py is run, it creates the virtualenv (wiping the previous one if it exists) and installs everything from requirements/apps.txt into it. I never install anything into the virtualenv any other way. If I want to include a new library, I put the tarball into requirements/src/, add a line to one of the text files and re-run ./bootstrap.py.
bootstrap.py and requirements get checked into version control (also a copy of pip.py so I don't even have to have that installed system-wide anywhere). The virtualenv itself isn't. The scripts that I have that push out to production run ./bootstrap.py on the production server each time I push. (bootstrap.py also goes to some lengths to ensure that it's sticking to Python 2.5 since that's what we have on the production servers (Ubuntu Hardy) and my dev machine (Ubuntu Karmic) defaults to Python 2.6 if you're not careful)
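Stripped down, that bootstrap flow amounts to something like the following (a sketch only; the real bootstrap.py in the linked repo does more, such as pinning the Python version, and the environment name and flags here are assumptions):

    rm -rf ve                          # wipe any previous virtualenv
    virtualenv ve
    ve/bin/pip install --no-index \
        --find-links=requirements/src \
        -r requirements/apps.txt       # apps.txt pulls in libs.txt with a -r line; packages come from the local tarballs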