How can I get Pants to store the output of git describe somewhere in my .pex file so that I can access it from the Python code I'm writing?
Basically I want to be able to clone my project and do this:
./pants binary px
Distribute the resulting dist/px.pex to somebody
That somebody should be able to do px.pex --version and get a printout of whatever git describe said when I built the .pex in step one.
Help!
Turns out pex already runs git describe at build time. It stores the result in a PEX-INFO file at the root of the .pex file. So to read it, I did this:
import json
import os
import zipfile

def get_version():
    """Extract the version string from the PEX-INFO file at the zip root."""
    # __file__ lives inside the .pex zip, so its dirname is the .pex itself
    my_pex_name = os.path.dirname(__file__)
    with zipfile.ZipFile(my_pex_name) as pex:
        with pex.open("PEX-INFO") as pex_info:
            return json.load(pex_info)['build_properties']['tag']
This is good enough IMO, but it has drawbacks. If somebody has an improved answer I'm prepared to switch to that one as the accepted one.
Drawbacks of this approach:
Relies on relative paths to locate PEX-INFO; it would be better if there were some kind of API call for this.
No way to customize how the version number is computed; I'd like to do git describe --dirty for example.
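For completeness, wiring get_version() into the --version flag from the question could look something like this; the argparse setup and prog name are my own choices, not anything px actually ships:

import argparse

# Hypothetical CLI wiring: argparse's built-in "version" action prints
# the given string and exits, matching the px.pex --version behavior above.
parser = argparse.ArgumentParser(prog="px")
parser.add_argument("--version", action="version", version=get_version())
parser.parse_args()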
I'd like to download all freely available books from https://hebrewbooks.org/ using a simple script.
Every book (there are 52,000 of them) is assigned a unique numeric ID. For example:
https://hebrewbooks.org/1
https://hebrewbooks.org/3
https://hebrewbooks.org/52000
But many numbers have been skipped or removed.
Usually a visitor clicks the download button, which requests (for book number 52000):
https://download.hebrewbooks.org/downloadhandler.ashx?req=52000
Or (for book number 1)
https://download.hebrewbooks.org/downloadhandler.ashx?req=1
I would like to download all files to a local disk without having to request each file individually in a browser etc.
I know this can be achieved with a simple script (even a bash script).
Could anyone advise me where to look, or point me to a similar problem that's been solved?
Edit: I forgot an important question. How do I get the script to rename each downloaded file from the ID number (such as 42000) to the metadata included in each file?
As mentioned, wget would be a good tool to use. Maybe try using it in a loop?
#!/bin/bash
# Iterate over all 52,000 possible book IDs
for i in {1..52000}; do
    sleep 1s  # be polite to the server between requests
    # -P saves into the given directory; $i is the current book ID
    wget -P [local path] "https://download.hebrewbooks.org/downloadhandler.ashx?req=${i}"
done
Edit: Just realized someone commented this on the main question, but I'll leave this here for anyone who doesn't see the comments, like me.
You can use wget for this task:
wget -O /path/to/save/downloaded/file "https://download.hebrewbooks.org/downloadhandler.ashx?req=book_number"
More help: https://askubuntu.com/questions/207265/how-to-download-a-file-from-a-website-via-terminal
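As for the edit about renaming: assuming the downloads are PDFs whose metadata carries a title (an assumption; I haven't checked the files), a rough Python sketch using the pypdf library could look like this:

import os
from pypdf import PdfReader  # assumption: the downloaded books are PDFs

def rename_by_title(path):
    """Rename a downloaded file after the Title field in its PDF metadata."""
    meta = PdfReader(path).metadata
    if meta and meta.title:
        # Keep only filesystem-safe characters from the title
        safe = "".join(c for c in meta.title if c.isalnum() or c in " -_")
        os.rename(path, os.path.join(os.path.dirname(path), safe + ".pdf"))

rename_by_title("downloads/42000.pdf")  # hypothetical path from the wget loop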
I have been studying some of the OpenAI gym envs and came across this line:
self.model = mujoco_py.MjModel(fullpath)
(https://github.com/openai/gym/blob/master/gym/envs/mujoco/mujoco_env.py#L28)
Can anyone tell me where mujoco_py.MjModel() is defined? I assume this is somehow pulled from native MuJoCo / Cython...
EDIT
Also, when I search the install folder of mujoco_py (<Python-installation-directory>/Lib/site-packages/mujoco_py/), a Sublime full-text search finds no MjModel at all (though the search might exclude some files). What I do find a lot of are 'mjModel' and 'PyMjModel'.
I am confused because instantiation through mujoco_py.MjModel() also seems to create a different kind of model than functions like mujoco_py.load_model_from_path() do. The former has a .data attribute while the latter apparently doesn't.
If you have installed mujoco-py, you can probably find MjModel in something like the following file:
<Python-installation-directory>/Lib/site-packages/mujoco_py/mjcore.py
You won't find that python file in the mujoco-py repository though. It probably gets generated from C++ code during the installation process (when running setup.py). It looks like MjModel is defined in the mjmodel.pxd file (for more info on .pxd files, see this).
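If you just want to find out at runtime where a name lives, generic Python introspection can help; this sketch is not mujoco-specific, and inspect.getfile typically fails for compiled extension classes:

import inspect
import mujoco_py

# Which module provides the class?
print(mujoco_py.MjModel.__module__)

# For pure-Python definitions this prints the source file; for
# Cython/C-generated classes it raises TypeError instead.
try:
    print(inspect.getfile(mujoco_py.MjModel))
except TypeError:
    print("defined in compiled (C/Cython) code")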
So basically I have 2 versions of a project, and for some users I want to use the latest version while for others I want to use the older one. Both have the same file names, and multiple users will use them simultaneously. To accomplish this, I want to call a function from a different git branch without actually switching branches.
Is there a way to do so?
For example, when my current branch is v1 and the other branch is v2, depending on the value of the variable flag, call the function:
if flag == 1:
    # import function f1() from branch v2
    return f1()
else:
    # use current branch v1
Without commenting on why you need to do that: you can simply check out your repo twice, once for branch1 and once for branch2 (without cloning twice).
See "git working on two branches simultaneously".
You can then make your script aware of its current path (/path/to/branch1) and of the relative path to the other branch (../branch2/...), as in the sketch below.
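A minimal sketch of that idea, with made-up module and directory names (the two checkouts are assumed to sit side by side):

import os
import sys

flag = 1  # whatever decides which version a given user should get

if flag == 1:
    # Put the sibling branch2 checkout first on the import path,
    # so its copy of the module shadows the local one.
    here = os.path.dirname(os.path.abspath(__file__))
    sys.path.insert(0, os.path.join(here, "..", "branch2"))

import mymodule  # resolves to branch2's copy when flag == 1
mymodule.f1()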
You must have both versions of the code present / accessible in order to invoke both versions of the code dynamically.
The by-far-simplest way to accomplish this is to have both versions of the code present in different locations, as in VonC's answer.
Since Python is what it is, though, you could dynamically extract specific versions of specific source files, compile them on the fly (using dynamic imports and temporary files, or exec and internal strings), and hence run code that does not show up in casual perusal of the program source. I do not encourage this approach: it is difficult (though not very difficult) and error-prone, tends towards security holes, and is overall a terrible way to work unless you're writing something like a Python debugger or IDE. But if this is what you want to do, you simply decompose the problem into:
examine and/or extract specific files from specific commits (git show, git cat-file -p, etc.), and
dynamically load or execute code from file in file system or from string in memory.
The first is a Git programming exercise (and is pretty trivial, git show 1234567:foo.py or git show branch:foo.py: you can redirect the output to a file using either shell redirection or Python's subprocess module), and when done with files, the second is a Python programming exercise of moderate difficulty: see the documentation, paying particularly close attention to importlib.
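A minimal sketch of that two-step recipe; the branch, file, and module names here are placeholders:

import importlib.util
import subprocess
import tempfile

def load_module_from_branch(branch, path, name):
    """Extract one file from a branch with `git show` and import it,
    without touching the working tree."""
    source = subprocess.check_output(["git", "show", "%s:%s" % (branch, path)])
    with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as tmp:
        tmp.write(source)
        tmp_path = tmp.name
    spec = importlib.util.spec_from_file_location(name, tmp_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# e.g.: v2 = load_module_from_branch("v2", "foo.py", "foo_v2"); v2.f1()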
I have some Python modules; some of them require more than 20 others.
My question is whether there is a tool that can bundle several Python modules into one big file.
Here's a simple example:
HelloWorld.py:
import MyPrinter
MyPrinter.displayMessage("hello")
MyPrinter.py:
def displayMessage(msg):
    print msg
should be converted to one file, which contains:
def displayMessage(msg):
    print msg

displayMessage("hello")
OK, I know this example is a bit contrived, but I hope someone understands what I mean and can help me. And one note: I'm talking about huge scripts with a great many imports; if they were smaller I could do it myself.
Thanks.
Assuming you are using Python 2.6 or later, you could package the scripts into a zip file, add a __main__.py, and run the zip file directly.
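A minimal sketch of that approach using the question's example files (the zip name is arbitrary):

import zipfile

zf = zipfile.ZipFile("hello.zip", "w")
zf.write("MyPrinter.py")  # assumes MyPrinter.py is in the current directory
# __main__.py is the entry point Python runs when executing the zip itself
zf.writestr("__main__.py", "import MyPrinter\nMyPrinter.displayMessage('hello')\n")
zf.close()

# Then run it directly:
#   python hello.zip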
If you really want to collapse everything down to a single file, I expect you're going to have to write it yourself. The source code transformation engine in lib2to3 may help with the task.
You cannot and should not 'convert them into one file'.
If your application consists of several modules, you should just organize it into a package.
There is a pretty good tutorial on packages here: http://diveintopython3.org/packaging.html
And you should read docs on it here: http://docs.python.org/library/distutils.html
Pip supports bundling. This is an installation format, and will unpack into multiple files. Anything else would be a bad idea, as it would break imports and per-module metadata.
The website of the statistics generator in question is:
http://gitstats.sourceforge.net/
Its git repository can be cloned from:
git clone git://repo.or.cz/gitstats.git
What I want to do is something like:
./gitstats --ext=".py" /input/foo /output/bar
If I can't easily pass an option like that without heavy modification, I'd just hard-code the file extension I want included.
However, I'm unsure of the relevant section of code to modify and even if I did know, I'm unsure of how to start such modifications.
It seems like it'd be rather simple, but alas...
I found this question today while looking for the same thing. After reading sinelaw's answer I looked into the code and ended up forking the project.
https://github.com/ShawnMilo/GitStats
I added an "exclude_extensions" config option. It doesn't affect all parts of the output, but it's getting there.
I may end up doing a pretty extensive rewrite once I fully understand everything it's doing with the git output. The original project was started almost exactly four years ago today and there's a lot of clean-up that can be done due to many updates to the standard library and the Python language.
EDIT: apparently even the previous solution below only affects the "Files" stats page, which is not that interesting. I'm trying to find something better. The line we need to fix is line 254, this one:
lines = getpipeoutput(['git rev-list --pretty=format:"%%at %%ai %%aN <%%aE>" %s' % getcommitrange('HEAD'), 'grep -v ^commit']).split('\n')
Previous attempt was:
Unfortunately, it seems like git does not provide options for easily filtering by the files in a commit (in git log and git rev-list). This solution doesn't filter all the statistics down to certain file types (for example, the statistics on tags), but it does filter the part that calculates activity by number of lines changed.
So the best I could come up with is at line 499 of gitstats (the main script):
res = int(getpipeoutput(['git ls-tree -r --name-only "%s"' % rev, 'wc -l']).split('\n')[0])
You can change that by either adding a pipe into grep in the command, like this:
res = int(getpipeoutput(['git ls-tree -r --name-only "%s"' % rev, 'grep \\.py$', 'wc -l']).split('\n')[0])
OR, you could split out the 'wc -l' part, get the output of git ls-tree into a list of strings, and filter the resulting file names using the fnmatch module (then count the lines in each file, possibly with 'wc -l'), but that sounds like overkill for the specific problem you're trying to solve (a sketch of that variant is below).
This still doesn't solve the whole problem (the rest of the stats will ignore the filter), but hopefully it's helpful.
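For reference, a standalone sketch of that fnmatch variant (inside gitstats you would use its getpipeoutput() helper and the current rev instead of subprocess and "HEAD"):

import fnmatch
import subprocess

# List every file in the revision, then filter by pattern in Python
# instead of piping through grep.
names = subprocess.check_output(
    ["git", "ls-tree", "-r", "--name-only", "HEAD"]).decode().splitlines()
res = len([n for n in names if fnmatch.fnmatch(n, "*.py")])
print(res)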