deploy Python pip package on Hadoop? - python

Write a Python UDF for Hadoop/Pig, and need to use some Python libraries like "request" which I installed locally by pip when doing local box UDF testing. Wondering how to deploy the pip package on Hadoop cluster so that no matter my Python UDF runs on which node, it automatically consumes?

Information about zip file format can be found at Zip (file format). Practically speaking it is a compressed archive format sort of like tar (an archive format) plus gzip (a file compression format). Java jar (Java ARchive) format is compatible with zip.
On Linux and Unix platforms a directory dir can be zipped with 'zip -r dir dir' to create a dir.zip file. On Windows 7-Zip is most useful for creating and unbundling zip files plus it can be used to unbundle and browse files having other compression and archive formats including tar and gzip.
Given a file dir.tar.gz it can be unbundled and zipped interactively using the 7-Zip GUI on Windows while on Linux and Unix systems the following commands can do the same thing:
tar zxf dir.tar.gz # creates directory dir by extraction and decompression
zip -r dir dir # creates dir.zip by bundling without removing dir

Related

Python compile all files to another structure

I have a Python project with files in a directories structure and I would like to get all .pyc files to the same directories to deliver without sources.
I am trying to do this with python -m compileall -d /tmp/new -b . but all pyc files are created in their respective sources directories instead of /tmp/new/somedir/
Any ideas? Will i have to create a script to recreate this structure?
Take a look about distributing *.pyc files:
What are the limitations of distributing .pyc files?
I suggest you use Py2Exe or cx_Freeze:
http://www.py2exe.org/
https://anthony-tuininga.github.io/cx_Freeze/

Folders installed by Python Markdown

When I installed Python Markdown, I noticed that it added files to a build/docs/extensions folder. Where can I find this folder? I've searched through my machine, but came up empty.
As part of the build process, Python-Markdown also builds the docs (from Markdown text files into HTML files). However, as the generated files are written to the build directory, they are deleted in the cleanup step after install (the build dir is deleted). The docs are hosted here, but if you would like a local copy, you can build them yourself.
First you need a copy if the source files either from PyPI or GitHub. then from within the top directory, run the following command:
python setup.py build_docs
The docs will be written to build/docs. As we didn't complete the install process, the files haven't been deleted yet.
The complete steps to download and build might look this (on Linux using the current release of Python-Markdown):
wget http://pypi.python.org/packages/source/M/Markdown/Markdown-2.5.2.tar.gz
tar xvzf Markdown-2.5.2.tar.gz
cd markdown-2.5.2/
python setup.py build_docs
cd build/docs
From that point you can move/copy the files to wherever you would like them, or open them in your browser of choice to view them.

using PyInstaller as Installer

Is it possible to do a two steps decompression with PyInstaller?
e.g. it can decompress archived files from itself as needed, like InnoSetup, Nullsoft Installer (NSIS).
For --onefile exe generated with PyInstaller, everything is decompressed at invocation runtime, and it takes a lot of time, if there's a lot of bundled datafile.
What I trying to do is to replicate InnoSetup with PyInstaller+PyQt. Any ideas?
Yes you can. You can bundle installer files by appending to a.data in your spec file. Then at runtime your data files will be in the MEIPASS folder and you can copy them wherever you want. https://stackoverflow.com/a/20088482/259538

extract pyc from Application bundle

Is there anyway the pyc file can be extracted from an Application bundle? I'm referring to an application bundle that was created by using py2app on python code.
The pyc files are stored in a zipfile in the Resources folder. The name of the zip file depends on the Python version, for 2.7 it is .app/Contents/Resources/python2.7/site-packages.zip.
That zip file is a regular zipfile that can be manipulated with the usual tools.

How do I install a DMG file from the command line?

I am looking for small bash or python script that will install a .dmg file.
We'll assume that the dmg contains one or more .app directories that have to be copied to /Applications, overriding any already existing directories.
Files or directories not matching the *.app pattern are to be ignored.
You can mount the disk image using
hdiutil attach -mountpoint <path-to-desired-mountpoint> <filename.dmg>
The disk image will be mounted at the selected path (the argument following -mountpoint). Then, search for an .app file and copy the file to /Applications.
Once you have finished installation unmount the disk image:
hdiutil detach <path-to-mountpoint>

Categories

Resources