Deploying with git - python

Currently I have a bunch of git repos for a Django site I'm looking to deploy. The repos take the form:
sn-static
sn-django
sn-templates
[etc]
I then have a super repo that stores each of these as a submodule. In terms of deployment, I want to keep things fairly simple. Would it be a valid method to:
Clone a stable tag from the super repo, and therefore have stable clones of each repo in one place.
As the names are sn-*, I would then look to symlink to a friendlier structure, e.g. ln -s /path/to/super-repos/sn-static /home/site/media/
Then my nginx webserver (in the case of static content at least) could simply refer to /home/site/media
Without a great deal of technical knowledge, I'm unsure whether symlinking would have any implications in terms of speed or stability. I'm also wondering if I can get away with this as a method of deployment rather than, say, using something like Capistrano (which I have no experience with as yet).

An option you should consider is using pip in conjunction with virtualenv to install your packages, especially as pip can install a specific branch or tag directly from a git repository.
That way you can use one requirements file to handle all your dependencies, your own packages and apps by other people. (See this post for the big picture.)
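As a rough sketch of the single requirements file idea, the snippet below builds pip requirement lines that pin repositories to a tag. The host https://example.com/git and the tag v1.0 are placeholders, not values from the question, and it assumes each repo is an installable package (i.e. has a setup.py). The git+<url>@<ref>#egg=<name> form is pip's VCS requirement syntax.

```python
def git_requirement(repo_url, ref, name):
    """Return a pip requirement line that installs `name` from a git ref."""
    return "git+{url}@{ref}#egg={name}".format(url=repo_url, ref=ref, name=name)


# Hypothetical host and tag; the repo names are the ones from the question.
BASE_URL = "https://example.com/git"
STABLE_TAG = "v1.0"

requirements = [
    git_requirement("{0}/{1}.git".format(BASE_URL, repo), STABLE_TAG, repo)
    for repo in ("sn-django", "sn-templates")
]
requirements.append("Django")  # third-party packages live in the same file

# Lines to write into requirements.txt, installed with `pip install -r`.
print("\n".join(requirements))
```

Bumping STABLE_TAG and reinstalling inside the virtualenv would then be the whole release step for the Python code.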
And to handle your static media, I'd prefer to use Django's built-in staticfiles app instead of symlinking several dirs, as it seems cleaner and easier to manage.
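A minimal settings sketch of the staticfiles approach might look like the following. The /home/site paths are assumptions borrowed from the question's example, not a required layout; the setting names are Django's own.

```python
import os

# Hypothetical base path taken from the question's example.
BASE_DIR = "/home/site"

INSTALLED_APPS = [
    "django.contrib.staticfiles",
]

# URL prefix under which static files are served.
STATIC_URL = "/static/"

# Extra source directories searched by `manage.py collectstatic`,
# in addition to each app's own static/ directory.
STATICFILES_DIRS = [
    os.path.join(BASE_DIR, "sn-static"),
]

# Single output directory that collectstatic copies everything into;
# nginx can then be pointed at this one directory.
STATIC_ROOT = os.path.join(BASE_DIR, "media")
```

Running python manage.py collectstatic on the server gathers the files into STATIC_ROOT, so no symlinks are needed for static content.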

When you reach a release point in your code, tag it (Git Tag). On your server, clone the master branch once, then simply pull the release tag you want each time you do a release.
git pull origin [tag]
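If you would rather script the release step, here is a small sketch. Note that it is a variation on the answer, not the same method: it fetches tags and checks the tag out (leaving the clone on a detached HEAD) instead of pulling. The remote name origin and the function names are assumptions.

```python
import subprocess


def release_commands(tag):
    """Git commands that move an existing clone to a release tag."""
    return [
        ["git", "fetch", "--tags", "origin"],
        ["git", "checkout", tag],
    ]


def deploy(tag, repo_dir):
    """Run the release commands inside the server-side clone."""
    for cmd in release_commands(tag):
        subprocess.check_call(cmd, cwd=repo_dir)
```

For example, deploy("v1.2", "/path/to/clone") would switch the clone to the v1.2 tag, and rolling back is the same call with the previous tag.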

Related

Open source python project dependency manager lock file without hard-coded package URL

Several countries implement national firewalls that make accessing certain package repos extremely difficult. There are also companies that only allow using internal repos. Many people insist lock files should be kept in source control. It seems many dep managers (at least poetry, pdm and pipenv) directly reference the repo URL in the lock file. That means that one can't simply use the default (international) repo url (typically pypi).
I can't understand why the repo URL has any use in the lock file, given that we have strong hashes. None of the package managers seem to make it easy to simply have an override config (like PIP_INDEX_URL) and simply have the package name, version and hash in the config and lock files. That's fine for private company repos but is a pain for open source projects.
Is there some common easy way to do this for projects so it is easy for open source devs to refer to their best repo URL and also have fully specified packages? Or do I simply have to, say, force using exact versions (e.g., in my pyproject.toml) and not keep a lock file in source control at all?
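To make the point about hashes concrete, here is a small sketch of why the download location does not matter once a strong hash is pinned. It uses only hashlib and is not how any particular dependency manager implements its check; the archive bytes are a stand-in for a downloaded package file.

```python
import hashlib


def matches_sha256(data, expected_hex):
    """Return True if the bytes hash to the expected SHA-256 hex digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex


# Stand-in for a downloaded archive; a real check would read the file.
archive = b"example package contents"

# The value a lock file would store alongside the name and version.
pinned = hashlib.sha256(archive).hexdigest()

print(matches_sha256(archive, pinned))                # True: same bytes from any mirror
print(matches_sha256(archive + b"tampered", pinned))  # False: content differs
```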

Python: incorporate git repo in project

What is the best practice for incorporating git repositories that do not contain a setup.py into your own project? There are several ways that I could imagine, but it isn't clear which is best.
Copy the relevant code and include it in your own project
  pro:
    only the relevant code is used
  con:
    the git repo might be updated
    need to do this again for every project
    feels like stealing
Clone the repository, write a setup.py and install it with pip
  pro:
    easy
    package can be updated
    can use the package like any normal pip package
  con:
    feels weird
Clone the repository and add the path to the project's search path
  pro:
    easy
    package can be updated
  con:
    needing to adjust the search path also feels strange
In my opinion, you forgot the best option: Ask the original project maintainer to make the package available via pip. Since pip can install directly from git repositories this doesn't take more than a setup.py -- in particular, you don't need a PyPI account, you don't need to tag releases, etc.
If that's not possible then I would opt for your second option, i.e. provide my own setup.py file in a fork of the project. This makes incorporating upstream changes pretty easy (basically you simply git pull them from the upstream repo) and gives you all the benefits of package management (automatic installation, dependency management, etc.).

How can I have Enterprise and Public version of Django application sharing some code?

I'm building a webapp using Django which needs to have two different versions: an Enterprise version and a standard public version. Up until now, I've been only developing the Enterprise version and am now looking for the best way to separate the two versions in the simplest way while avoiding duplication of code as much as possible. The main difference between the two versions will be that they need different URLs and different Views. I intend to differentiate based on subdomain using a multi-tenant architecture, where the www.example.com is the public version, and company1.example.com hits the enterprise version.
I've come up with a couple potential solutions, but I'm not happy with any of them.
Separate Git repositories and entirely separate projects, with all common code duplicated. This much duplication is bound to be error-prone: things will get out of sync, and copy-paste mistakes are to be expected. This is a last-resort solution.
Separate Git repositories, with common code shared via Git Submodules (a single common 'base' repository containing base models and shared views). I've read horror stories about git submodules, though, so I'm wary of this solution.
Single Git repository containing multiple 'project' folders (public/enterprise), each with their own base urls.py, settings.py, wsgi.py, etc., and multiple manage.py files to choose which "Project" to run. I'm afraid that this solution would become an utter mess because it wouldn't be possible to have the public and enterprise versions use different versions of the common library if one needs an update before the other.
Separate Git repositories, with all shared code developed as 'Re-usable apps' and installed into the python path. This would be a somewhat clean solution, but would be difficult to work with any time changes needed to be made to the common modules.
Single project where all features are managed via conditional logic in the views. This would be most prone to bugs and confusion of all, and I'd prefer to avoid this solution.
Does anyone have any experience with this type of solution or could anyone help me find the best solution to this problem?
What about "a single Git repository, with all shared code developed as 'Re-usable apps'"? That is, configure which options are enabled with the INSTALLED_APPS setting.
First you need to decide on your release process. If you intend on releasing both versions simultaneously, using the one git repository makes sense.
An overriding concern might be if you have different distribution requirements for the code, e.g. if you want the code in the public version to be publicly available and the enterprise version to be private. Then you might have to use two git repositories.
Have you looked into using git subtree? It's an alternative to submodules, and it makes the process a little less complicated. I think Atlassian does a great job of explaining how it's used and the pros and cons. A few examples are:
"Contents of the module can be modified without having a separate repository copy of the dependency somewhere else."
"The sub-project’s code is available right after the clone of the super project is done."
"Management of a simple workflow is easy."
The Atlassian link is here.
Here's also a link to git-subtree's description file.
Probably the best solution is to identify exactly which code is shared between the two projects and make that a reusable app.
Then each installation can install that django app, and then has their own site specific code as well.
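The reusable-app suggestions above could be sketched as a shared app list plus a per-edition list. The app names common, public_site and enterprise_site are hypothetical; only the split between shared and edition-specific apps matters.

```python
# Hypothetical app names, for illustration only.
SHARED_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "common",  # the reusable app holding base models and shared views
]

EDITION_APPS = {
    "public": ["public_site"],
    "enterprise": ["enterprise_site"],
}


def installed_apps(edition):
    """Return the INSTALLED_APPS list for one edition of the site."""
    return SHARED_APPS + EDITION_APPS[edition]


# e.g. in the enterprise settings module:
INSTALLED_APPS = installed_apps("enterprise")
```

Each edition's settings module can then also point ROOT_URLCONF at its own urls.py, which covers the "different URLs and different Views" requirement without conditional logic inside the views.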

Contributing to a repository on GitHub on a new branch

Say someone owns a repository with only one master hosting code that is compatible with Python 2.7.X. I would like to contribute to that repository with my own changes to a new branch new_branch to offer a variant of the repository that is compatible with Python 3.
I followed the steps here:
I forked the repository on GitHub on my account
I cloned my fork on my local machine
I created a new branch new_branch locally
I made the relevant changes
I committed and pushed the changes to my own fork on GitHub
I went on the browser to the GitHub page of the official repository, and asked for a pull request
The above worked, but it did a pull request from "my_account:new_branch" to "official_account:master". This is not what I want, since Python 2.7.x and Python 3 are incompatible with each other. What I would like to do is create a PR to a new branch on the official repository (e.g. with the same name "new_branch"). How can I do that? Is this possible at all?
You really don't want to do things this way. But first I'll explain how to do it, then I'll come back to explain why not to.
Using Pull Requests at GitHub has a pretty good overview, in particular the section "Changing the branch range and destination repository." It's easiest if you use a topic branch, and have the upstream owner create a topic branch of the same name; then you just pull down the menu where it says "base: master" and the choice will be right there, and he can just click the "merge" button and have no surprises.
So, why don't you want to do things this way?
First, it doesn't fit the GitHub model. Topic branches that live forever in parallel with the master branch and have multiple forks make things harder to maintain and visualize.
Second, you need both a git URL and an https URL for your code. People need to be able to share links, pip install from top of tree, just clone the repo instead of cloning and then checking out a different branch, etc. This all means your code has to be on the master branch.
Third, if you want people to be able to install your 3.x version off PyPI, find docs at readthedocs, etc., you need a single project with a single source tree. Most such sites have a single latest version, not a latest version for each Python version, and definitely not multiple variations of the same version. (You could instead completely fork the project and create a separate foo3 project. But it's much easier for people to be able to pip install foo than to have them try that, fail, come to SO and ask why it doesn't work, and get told they probably have Python 3 and need to pip install foo3 instead.)
How do you merge two versions into a single package? The porting docs should have the most up-to-date advice, but briefly: If it's at all possible to create a single codebase that runs on both versions, that's ideal; if not, and if you can't make things work by running 2to3 or 3to2 at install time, create a parallel directory for the 3.x code (e.g., a foo3 alongside foo) and pick the appropriate directory at install time. (You can always start with that and gradually work toward a unified codebase.)
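The "pick the appropriate directory at install time" idea could look roughly like this in a setup.py. The foo and foo3 directory names come from the answer; the helper name source_dir is made up for this sketch.

```python
import sys


def source_dir(version_info=sys.version_info):
    """Pick the source directory matching the running interpreter."""
    return "foo3" if version_info[0] >= 3 else "foo"


# setup.py could pass this mapping to setup(package_dir=...), so the
# installed package is importable as `foo` on either Python version.
package_dir = {"foo": source_dir()}
```

With this, pip install foo stays the same command for everyone, and the choice between the two trees happens inside the package's own setup script.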

GAE - Including external python modules without adding them to the repository?

I'm currently working on a Python-based Google App Engine project. Specifically, I'm using Flask for the application. I'm wondering what the accepted method of including external Python modules is, specifically when it comes to the repository. From what I can tell, including other people's code in my repository is bad form for several reasons. However, other people will be working on the same repository, so we should be using the same external modules to ensure the same results.
Specifically, I need to include Flask (and its dependencies) in my application. The easiest way to do this with Google App Engine is just to throw them into the root level:
MyProject
    app.yaml
    main.py
    MyApp
    Flask
    ...
What is the proper way to bring in these external modules in such a project? Both a generalized answer and one specific to my case would be useful. Also, any other related recommendations would be appreciated. Thank you much.
While it is indeed possible to include third-party libraries as submodules or symlinks from external repositories, in practice it's not a good idea. Here are two scenarios of what could go wrong:
If the third-party library releases a new version that breaks functionality, you will have to either make all the necessary changes to meet the new requirements, or find the previous version to keep working and break the external connection. Usually this happens when you are very close to a deadline.
If the third-party library releases a new version and one of your colleagues has upgraded and made all the necessary changes to support it, the code will be broken on your side until you upgrade as well.
These problems are much more visible in big projects with lots of dependencies, and as more people join the project they become a huge problem in the long run. I could come up with more examples, but I think you can see the point.
Your best option is to include the external libraries in your repository, which also has the advantage that you can get the whole project up and running on a new machine without many dependencies. There are many ways to organize your third-party libraries, and all of them need to be included at the same level as your app.yaml file or deeper. As @dragonx mentioned, include only the core library code.
Also, don't be afraid of putting things into your repository: space is not an issue today, and these libraries are usually not updated that often, so your repository will not grow much over time.
Since you mentioned Flask on Google App Engine, you can check out my gae-init project, where you can see in practice how the external libraries are organised.
You're actually asking two questions here.
How do I include the external library in my GAE project?
You've got the right idea. Whatever way you go about it, you must somehow include Flask and its dependencies in the root of your GAE project. One way is to put a copy directly in there.
The second way is to use a symbolic link to the folder that contains the external library. I'm not sure about Flask, but often times external repos contain the actual library code in a subdirectory - so often you don't want the root of the repo in your GAE app, just the root of the actual source. In this case, it's easier to put a symlink that links to the source folder.
How do I manage external libraries in my source repo?
This is a harder question to answer since it depends what source control tool you're using. Yes, you do want to have everyone use the same versions of external libraries, and they should be included in your source control somehow.
If you're using git, git submodule is the way to go. It's a bit confusing to start with but it'll get the job done.
I'd recommend a repo structure that looks something like this
repo/
    thirdparty/
        flask/
        other_dependency/
        another_dependency/
            README.TXT
            setup.py
            src/
    app/
        app.yaml
        your_source.py
        softlink_to_flask
        softlink_to_other_dependency
        softlink_to_another_dependency_src
In this example you keep the source to your external libraries in the thirdparty folder. These may be git submodules. In the app folder you have your source, and softlinks to the appropriate files that are actually needed for your app to run. In this case, the actual code for another_dependency may be in the another_dependency/src folder rather than the actual root of another dependency. This way you don't need to include the unnecessary files in your deployment folder, but you can still keep the entire library in your repo.
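As an illustration of the softlink step, the sketch below creates such a link with os.symlink inside a throwaway directory that mimics the layout. The helper name link_dependency and the link name flask are assumptions made for this example.

```python
import os
import tempfile


def link_dependency(source_dir, app_dir, link_name):
    """Create app_dir/link_name as a symlink to source_dir, if missing."""
    link_path = os.path.join(app_dir, link_name)
    if not os.path.islink(link_path):
        os.symlink(source_dir, link_path)
    return link_path


# Throwaway directory standing in for the repo layout described above.
repo = tempfile.mkdtemp()
flask_src = os.path.join(repo, "thirdparty", "flask")
app_dir = os.path.join(repo, "app")
os.makedirs(flask_src)
os.makedirs(app_dir)

link = link_dependency(flask_src, app_dir, "flask")
print(os.path.realpath(link) == os.path.realpath(flask_src))
```

The demo uses absolute paths for simplicity. For links that are committed to the repo, relative targets (e.g. ../thirdparty/flask) are probably the better choice, since they keep working wherever the repo is cloned.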
You can't just create a requirements.txt and upload it to GAE. Your code must include all the pure-Python libraries that your project uses and that GAE does not support (https://developers.google.com/appengine/docs/python/tools/libraries27).
If you look at the Flask deployment example for GAE (http://flask.pocoo.org/docs/quickstart/#deploying-to-a-web-server and https://github.com/kamalgill/flask-appengine-template), you can find dependencies such as flask, werkzeug, etc., and all of these dependencies must be pushed to the GAE server.
So I see three solutions:
Use local requirements for local development, and write a custom build function that downloads all dependencies, puts them alongside your application, and uploads everything to the GAE server.
Add tooling for local deployment when you start the project that puts the required libraries alongside your application (don't forget about .gitignore).
Use something like git submodules pointing to the repositories of your requirements.
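The first option, a custom build step, might be sketched like this. It relies on pip's -r and -t (target directory) options; the lib directory name and the function names are assumptions, and the upload step is left out.

```python
import subprocess
import sys


def vendor_command(requirements="requirements.txt", target="lib"):
    """Build the pip command that installs requirements into `target`."""
    return [
        sys.executable, "-m", "pip", "install",
        "-r", requirements,
        "-t", target,
    ]


def vendor_dependencies(requirements="requirements.txt", target="lib"):
    """Download the dependencies next to the application before uploading."""
    subprocess.check_call(vendor_command(requirements, target))
```

With the target directory listed in .gitignore, the repository only tracks requirements.txt, and the libraries are re-created before each deployment.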
There are two cases for using third-party Python packages in a Google App Engine project:
If your library is one of the runtime-provided third-party libraries that GAE supports, just add it to your app.yaml file under libraries:
libraries:
- name: package_name
  version: latest
Then import it in your code:
import package_name
Sometimes you also need to install the package locally with
pip install package_name
Make sure you're using the right interpreter; by running
pip freeze
you can check that the package was installed successfully to the right path.
Otherwise, if GAE does not support your library, you need to download it manually and save it locally under the root/Lib directory,
either through git
or through pip (pip install package_name -t path/to/your/Lib/dir)
After that, declare the Lib directory as a source directory in PyCharm:
PyCharm -> Preferences -> Project Structure
Choose the Lib directory and mark it as a source.
Then import it:
import package_name
Note that when you do the import, you are choosing the local path and not your Python path.
In general, it is recommended to have a requirements.txt file that lists all the packages used; PyCharm will then recognize any uninstalled packages and suggest installing them.
Good luck
