How to integrate DVCS in a python application

Hi, I have a simple PyQt text editor and I want to add Mercurial support to it.
I have seen that various other editors support a number of DVCSs (Mercurial, Git, Bazaar, etc.) and let the user perform operations like commit, update, etc.
I would like to know how I can integrate Mercurial into my PyQt text editor so that it behaves more or less like those other editors.
Are there any good tutorials or guides on how to get this done?

There are no tutorials on this; in general, however, there are three approaches:
Command line interface
Mercurial's command line interface is considered stable. That means you can expect the output of a command not to change, as long as Mercurial runs without extensions. Passing "-T json" to most commands also yields easily parsable JSON output. This approach is robust and fairly easy to implement, as you only have to call out to Mercurial and parse the JSON that comes back. Most standard commands such as commit, log, etc. should be implementable this way.
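A minimal sketch of the command line approach, assuming hg is on the PATH. The helper names hg_json and recent_commits are made up for illustration, and setting HGPLAIN (which asks Mercurial for script-friendly behaviour) is my addition rather than part of the answer above. The run parameter exists only so the subprocess call can be swapped out in tests.

```python
import json
import os
import subprocess


def hg_json(args, cwd=".", run=subprocess.run):
    """Run `hg <args> -T json` in cwd and return the decoded JSON output."""
    cmd = ["hg", *args, "-T", "json"]
    # HGPLAIN keeps user configuration (aliases, colors) from altering output.
    env = dict(os.environ, HGPLAIN="1")
    result = run(cmd, cwd=cwd, env=env, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def recent_commits(cwd=".", limit=5, run=subprocess.run):
    """Return (short node, first line of description) for the newest changesets."""
    entries = hg_json(["log", "-l", str(limit)], cwd=cwd, run=run)
    return [(e["node"][:12], e["desc"].split("\n", 1)[0]) for e in entries]
```

In an editor you would probably call this from a worker thread (or via QProcess) so that a slow repository does not block the UI.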
hglib
Mercurial offers hglib, a library available in C and Python that lets you interface with Mercurial via a local protocol. Mercurial is started in server mode (the command server) and you use the library to interact with it. This approach is also very stable and offers a better abstraction, but it relies on the command server being available and carries the risk of API changes in the library. Note that you also have to take the library's license into account, as you are linking against it.
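A rough sketch of the hglib route, assuming the python-hglib package is installed. It relies on hglib.open(), client.status() returning (code, path) pairs, and client.close(); the helper names working_copy_summary and open_and_summarize are hypothetical. Under Python 3 hglib hands back bytes, not str, so decode before showing anything in the editor.

```python
def working_copy_summary(client):
    """Group the paths reported by client.status() by their status code."""
    summary = {}
    for code, path in client.status():
        summary.setdefault(code, []).append(path)
    return summary


def open_and_summarize(repo_path):
    """Start a command server for repo_path and summarize its working copy."""
    import hglib  # third-party package (python-hglib), imported lazily

    client = hglib.open(repo_path)
    try:
        return working_copy_summary(client)
    finally:
        client.close()  # shuts the command server down again
```

Keeping the client open for the lifetime of the editor window, instead of reopening it per call, is what makes this faster than spawning hg for every command.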
Embedding Mercurial
A Python process can embed Mercurial directly by importing the right modules. However, Mercurial's internal API is not stable and is subject to continuous change. This option offers the most flexibility, as you have access to everything, including low-level parsing of data structures and hidden functionality such as obsolescence markers. The drawbacks are: 1. you have to know what you are doing, otherwise you might corrupt the repository; 2. the API changes all the time; 3. you are subject to the GPL license.

Related

Best practice for managing third-party libraries (for C++) in a code base

Context: the company I work at has checked in a variety of third-party libraries for C++ and Python. Recently I upgraded the curl library because of a curl issue that is fixed in a later version.
After the upgrade, a problem popped up: several Python programs now fail with "invalid argument", an error I had never seen in the year before the upgrade. I suspect that we have multiple copies of curl in our codebase, only one of which is explicitly checked in by us, while all the others are brought in by other third-party libraries.
bazel cquery supports this suspicion: I do see a bunch of "curl" entries. I also know that which curl function is used (with dynamic linking) depends on the import order in Python.
For this kind of problem, is there a "best practice" or "recommended way" to manage third-party libraries, especially commonly used ones like curl?

How can I simulate a Python shell most effectively and securely?

To offer interactive examples about data analysis, I'd like to embed an interactive Python shell. It does not necessarily have to be a real Python shell. Users will be given tasks that they can execute in the shell. This is similar to existing tutorials, as seen on, e.g., http://www.codecademy.org, but I'd like to work with libraries that those solutions do not offer, as far as I understand.
To get a real shell on the website, I can think of two approaches:
I found projects like http://www.repl.it, but it seems rather difficult to include the necessary libraries like SciPy, NumPy, and Pandas. In addition, user input has to be validated, and I'm not sure whether that works with the shells I found.
I could pipe the commands through a web application to a Python installation on my server, but I'm scared of using eval() on foreign, arbitrary code. Is there a safe mode for Python? I found http://www.pypy.org. Although they offer a Python sandbox, unfortunately it does not support the libraries I need.
Alternatively, I thought of just embedding a "fake shell", which I build to copy the behaviour of the functions that I want to explain. Of course, this would result in more work, as I would have to write a fake interface, but for now it seems to be the only possibility.
I hope that this question is not too generic; I'm looking for either a good HTML/JS library that helps me put a fake shell on my website or a library/service/software that can embed a real Python shell with the required modules installed.

There is no way to run untrusted Python safely; Python's dynamic nature allows for too many ways to break through any protective layers you could care to think of.
Instead, run each session on a new virtual machine, properly locked down (firewalled, unprivileged user), which you shut down after a hard time limit. New sessions get a new, clean virtual machine.
This isolates you from any malicious code that might run and try to break out of a sandbox; a good virtual machine is hardware-isolated by the processor from the host OS, something a Python-only layer could never achieve.
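The "hard time limit" part can be sketched with the standard library alone. This is only the time-limit piece, meant to run inside the locked-down virtual machine described above; on its own it isolates nothing. The function name run_with_time_limit is made up for illustration.

```python
import subprocess
import sys


def run_with_time_limit(code, seconds=5.0):
    """Run code in a separate interpreter and kill it after `seconds`.

    Returns (finished, stdout). This enforces a time limit only; it does
    NOT protect the machine the code runs on.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site/env
            capture_output=True,
            text=True,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return False, ""
    return True, result.stdout
```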

This process is sometimes called sandboxing.
You can find some good information on the Python wiki.
There are basically three options available:
machine-level mechanisms (such as a VM, as Martijn Pieters suggested)
OS-level mechanisms (such as a chroot or SELinux)
custom interpreters, such as PyPy (which has sandboxing capabilities, as you mentioned), or Jython, where you may be able to use the Java security manager or applet mechanisms.
You may also want to check Restricted Python, which is especially useful for very restricted environments, but security will depend on its configuration.
Ultimately, your choice of solution will depend on what you want to restrict:
Filesystem access? Block everything, or allow certain directories?
Network access, such as sockets?
Arbitrary system calls?

Differences between subprocess module, envoy, sarge and pexpect?

I am thinking about making a program that will need to send input to, and read output from, the various aircrack-ng suite tools. I know of a couple of Python modules, such as subprocess, envoy, sarge and pexpect, that would provide the necessary functionality. Can anyone advise on what I should or should not be using, especially as I'm new to Python?
Thanks

As the maintainer of sarge, I can tell you that its goals are broadly similar to envoy (in terms of ease of use over subprocess) and there is (IMO) more functionality in sarge with respect to:
Cross-platform support for bash-like syntax (e.g. use of &&, ||, & in command lines)
Better support for capturing subprocess output streams and working with them asynchronously
More documentation, especially about the internals and peripheral issues like threading+forking in the context of using subprocess
Support for prevention of shell injection attacks
Of course YMMV, but you can check out the docs, they're reasonably comprehensive.

pexpect
In 2015, pexpect does not work on Windows. It is rumored that "experimental" support will be added in the next version, but this has been a rumor for a long time (I'm not holding my breath).
Having written many applications using pexpect (and loving it), I am now sorry, because one of the things I love about Python (that it is cross-platform) is not true for my applications.
If you plan to ever add Windows support, avoid pexpect for the moment.
envoy
Not much activity in the last year, and few commits (12 total) since 2012. Not very promising for its future.
Internally it uses shlex in a way that is not compatible with Windows paths (the commands must use '/' not '\' for directory separators). A workaround (when using pathlib) is to call as_posix() on path objects before passing them as commands. See this answer.
Getting access to the internal streams (I want to parse the output to drive some updating scrollbars) seems possible but is not documented.
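The as_posix() workaround mentioned for envoy can be sketched with the standard library, assuming the command string ends up in shlex.split() with its default POSIX rules. The path C:\tools\scan.exe and the helper posix_command are hypothetical.

```python
import shlex
from pathlib import PureWindowsPath


def posix_command(executable, *args):
    """Build a command string whose executable path uses '/' separators."""
    return " ".join([PureWindowsPath(executable).as_posix(), *args])


# Backslashes are treated as escape characters and silently dropped.
broken = shlex.split(r"C:\tools\scan.exe --help")

# With forward slashes the path survives the split intact.
fixed = shlex.split(posix_command(r"C:\tools\scan.exe", "--help"))
```

Here broken comes out as ['C:toolsscan.exe', '--help'], while fixed is ['C:/tools/scan.exe', '--help']. Paths containing spaces would still need quoting.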
sarge
Works on Windows out of the box and has an expect() method that should provide functionality similar to pexpect (allowing me to update a scrollbar). There is recent activity, but it is hosted on both GitLab and Bitbucket (very confusing).
Personal Conclusion
I'm moving from pexpect to sarge for future development. It seems to provide a similar feature set to pexpect and supports Windows.

subprocess is a standard library module, so it is available with any Python installation. But it has a reputation for being hard to use, since its API is non-intuitive.
envoy is a third-party module that wraps subprocess. It was written to be an easy-to-use alternative to subprocess. Its author, Kenneth Reitz, is famous for his "Python for Humans" philosophy.
I'm not familiar with the other two.
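For comparison, here is a small sketch of what "send input and take output" looks like with plain subprocess. The child process is a stand-in Python one-liner rather than an aircrack-ng tool, and run_tool is a made-up helper name.

```python
import subprocess
import sys


def run_tool(cmd, input_text=""):
    """Run cmd, feed input_text to its stdin, return (exit code, stdout, stderr)."""
    result = subprocess.run(cmd, input=input_text, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# Stand-in child process that upper-cases whatever it reads from stdin.
child = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper(), end='')"]
code, out, err = run_tool(child, "hello\n")
```

This covers one-shot tools that read all input and then exit. A tool that prompts interactively or streams output for a long time needs subprocess.Popen (or one of the wrappers discussed above) instead.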

Are there any patch tools that work well with Mercurial when patch-based workflows fail, leaving .rej hunks in your repo?

I am looking for a better patch tool than the ones built into Mercurial, or a visual tool to help me edit patches so that they will be accepted by Mercurial or GNU patch.
The Mercurial wiki has a topic, HandlingRejects, which shows how easily patching can fail. I want to implement a Mercurial-based workflow built on feature branches that relies on exporting and reviewing patches before they are integrated. In my initial testing of this idea, the weakest link in my "patch out" and "review, accept and modify" steps is the way that patch rejection shuts me down.
Here are some common cases where mercurial patch imports fail on me:
Trivial changes to both upstream repoA and feature branch repoB, where a single line is added somewhere on both branches. Since both branches have a history, it should be possible for a merge tool to see that "someone added one line in repoA, and someone else added one line in repoB". In the case of patch imports, though, this results in a patch import reject and a .rej file turd in your repository, which you have to repair manually (by editing the .rej file until it can be applied).
The wiki page above mentions the mpatch tool that is found here. I am looking for other Better Merge Tools that (a) work with mercurial, and (b) can handle the trivial case noted in the Handling Rejects wiki page above. Note that mpatch does not work for my purposes, it seems I need something that is more of a rebase tool than a patch tool, and in my case, I may have to make the patch tool be syntax-aware (and thus specific to a single programming language).
I am looking for tools available for working on Windows, either natively, or even via something like cygwin. I am not using a Unix/Linux environment, although I am comfortable with Linux/Unix style tools.
I am not currently using the mq extension; I just export ranges of changes with hg export and import them with hg import, and the rest of the workflow is my own invention. However, I have tagged this mq, as mq users will be familiar with this .rej handling problem.
Related question here shows ways of resolving such problems while using TortoiseHg.

Emacs is quite capable of handling .rej files.
However, whenever practical, I try to use hg pull --rebase. Often I find myself wanting to rebase some lineage of patches onto another changeset that I've already pulled. In these cases I just strip the changeset and pull it in again from .hg/strip-backup, allowing me to use --rebase.

Python wrapper to access Hg, Git and possibly Bazaar repositories?

I'm looking for a Python library that can do basic manipulation of repositories, but is independent of the backend version control system.
By basic manipulation, I'm referring to: initialize a repo, add files, commit, pull, push, get current revision number.
Users of the library could do something like this:
import dvcs_wrapper as dvcs
dvcs.set_backend('hg') # could choose 'git', 'bzr'
repo = dvcs.init('/home/me/my_repo')
repo.add('/home/me/my_repo/*.py')
repo.commit('Initial commit')
repo.push('http://bitbucket.org/....')
print('At revision %d' % repo.revision_num)
Any pointers to something like the above? My Google searches turn up nothing...
Update: for what it's worth, I've started working on something like this: code is here, with unit tests for Hg repositories. I might get around to Git and Bazaar; contributions welcome.

There's also the VCS module, which advertises:
vcs is abstraction layer over various version control systems. It is
designed as feature-rich Python library with clean API.

I think you are out of luck.
There are Python wrappers for git but according to this the quality is still less than optimal. Hg and bzr are Python projects but their infrastructure is quite different, so API level integration is not easy. Also different SCMs have different design philosophies, which makes a unified wrapper less plausible.
That being said, if you do need a simple wrapper, you can use the subprocess module and wrap the command lines to get the result you want.
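A minimal sketch of such a subprocess wrapper, covering only init, add and commit for hg and git. The Repo class and the COMMANDS table are invented for illustration and are not an existing library; the run parameter is there so the calls can be faked in tests.

```python
import subprocess

# Command-line spelling of each operation per backend (deliberately incomplete).
COMMANDS = {
    "hg": {"init": ["hg", "init"], "add": ["hg", "add"], "commit": ["hg", "commit", "-m"]},
    "git": {"init": ["git", "init"], "add": ["git", "add"], "commit": ["git", "commit", "-m"]},
}


class Repo:
    """Thin backend-independent wrapper that shells out to hg or git."""

    def __init__(self, path, backend="hg", run=subprocess.run):
        if backend not in COMMANDS:
            raise ValueError("unsupported backend: %s" % backend)
        self.path = path
        self.backend = backend
        self._run = run

    def _call(self, operation, *args):
        cmd = COMMANDS[self.backend][operation] + list(args)
        # check=True raises CalledProcessError if the VCS command fails.
        result = self._run(cmd, cwd=self.path, capture_output=True, text=True, check=True)
        return result.stdout

    def init(self):
        return self._call("init")

    def add(self, *files):
        return self._call("add", *files)

    def commit(self, message):
        return self._call("commit", message)
```

Operations whose semantics differ between systems (push targets, revision numbers) are where a thin wrapper like this stops being simple, which matches the design differences mentioned above.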
