How to access files inside a Python egg file?


This might be a weird requirement, but it's what I've run into. I Googled but found nothing.
I'm coding an application that uses a lot of constant attributes / values recorded in an XML file (they will not change, so it is a static file). Things worked fine until I generated an egg file for it.
When the logic reaches the part that accesses the XML, I get a complaint like this:
/home/Workspace/my_proj/dist/mps-1.2.0_M2-py2.6.egg/mps/par/client/syntax/syntax.xml
I did bundle the XML file at the path above, but it seems Python doesn't know how to access it.
The code that accesses the XML is:
file_handler = open(path_to_the_file)
lines = file_handler.read().splitlines()
Any ideas?

Egg files are zip files, so you must access "stuff" inside them with the zipfile module of the Python standard library, not with the built-in open function!
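To make that concrete, here is a minimal sketch of reading a text file out of a zipped egg with zipfile. The helper names (split_egg_path, read_lines_from_egg) and the demo egg are made up for illustration, and it assumes the egg really is a zip archive rather than an unpacked directory.

```python
import os
import tempfile
import zipfile


def split_egg_path(full_path):
    """Split '<dir>/name.egg/pkg/file.xml' into ('<dir>/name.egg', 'pkg/file.xml')."""
    head, sep, tail = full_path.partition(".egg" + os.sep)
    if not sep:
        raise ValueError("no .egg component in %r" % full_path)
    # Member names inside a zip archive always use forward slashes.
    return head + ".egg", tail.replace(os.sep, "/")


def read_lines_from_egg(egg_path, member):
    """Return the lines of a UTF-8 text file stored inside a zipped egg."""
    with zipfile.ZipFile(egg_path) as egg:
        return egg.read(member).decode("utf-8").splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for the real egg: a zip holding one XML file.
        egg_path = os.path.join(tmp, "demo-0.1-py3.egg")
        with zipfile.ZipFile(egg_path, "w") as egg:
            egg.writestr("demo/syntax/syntax.xml", "<syntax>\n  <rule/>\n</syntax>\n")

        # The kind of path open() chokes on: it runs *through* the egg file.
        full_path = os.path.join(egg_path, "demo", "syntax", "syntax.xml")
        archive, member = split_egg_path(full_path)
        print(read_lines_from_egg(archive, member))
```

The split step is only needed when all you have is a path that passes through the egg, as in the question; if you already know the egg location and the member name, read_lines_from_egg is enough.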

If you want to access the contents of the .egg file, you can simply rename it, changing the extension from .egg to .zip, and then unzip it.
This creates a folder whose contents are the same as they were inside the .egg file.
For example: brewer2mpl-1.4.1-py3.6.egg
After renaming: brewer2mpl-1.4.1-py3.6.zip
Opening it now unzips it, and the contents are put in a folder with the same name in the same directory.
(tested on macOS Sierra)

On many *nix systems the less command can peek inside zip files (this relies on a lesspipe input preprocessor being set up). On such systems, less some.egg will list the contents of a .egg file too.

Just run unzip file.egg
You can install unzip on Debian/Ubuntu with
sudo apt install unzip
or on macOS by installing Homebrew then
brew install unzip

Access file from inside egg file
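Yes, it is possible to read files from inside an egg file.
Egg file: mps-1.2.0_M2-py2.6.egg structure for module level example:
In driverfile.py:
import os
import xml.etree.ElementTree
import mps.par.client as syntaxpath
# Directory of the mps.par.client package
path = os.path.dirname(syntaxpath.__file__)
# os.path.join keeps the path portable instead of hard-coding Windows backslashes.
# Note: parse() opens the path as an ordinary file, so this needs the egg to be an unpacked directory.
element = xml.etree.ElementTree.parse(os.path.join(path, 'syntax', 'syntax.xml')).getroot()
print(element)
Read the XML file from inside an egg file:
PYTHONPATH=mps-1.2.0_M2-py2.6.egg python driverfile.py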

I think that by default, building an egg with Python does not include your XML file in the package.

Related

Using PYTHONPATH in Spyder with no access to command line

I've just started using Python with Spyder at work, which means I'm far more restricted than normal as I have no access to the command line.
I'm trying to access the PyPDF2 library, which I have downloaded as a ZIP file, and then pointed to this file with the PYTHONPATH manager. I still can't access it:
from PyPDF2 import PdfFileMerger, PdfFileReader
gets: "ImportError: No module named 'PyPDF2'"
All the walk throughs I've seen of using PYTHONPATH involve using the command line. Can anyone help with how to do this without this access? Sorry am relatively new to this and really stuck!
Thanks

I don't know anything about Spyder, but in Anaconda there is a way to install packages from the Anaconda Navigator. If Spyder doesn't have this feature, you can do the following:
Create a folder somewhere called PyPDFPath
Unzip PyPDF2 into this directory, making sure your directory structure looks like this, with all of the PyPDF2 code inside the PyPDF2 directory
At the top of your script, before any other imports, add the following code, where PYPDFPATH is the location of the PyPDFPath folder
import sys
sys.path.append('PYPDFPATH')
In your script, try importing PyPDF2 as you did in your question. If you've done everything right, you should have no problems.
The sys.path variable is a list of all the folders in which Python looks for modules. If you add a folder containing the modules you'd like to import to this list before you import them, Python will look for those modules in that folder in addition to the default folders.
Note that if you downloaded the PyPDF2 zip from GitHub, your PyPDF2 directory needs to contain the PyPDF2 directory from inside the zip instead of the entire repository.
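A small sketch of the sys.path approach, in case it helps to see it run end to end. The folder and the module name demo_pdf_tools are placeholders standing in for the unzipped library; nothing here is specific to PyPDF2 or Spyder.

```python
import importlib
import os
import sys
import tempfile


def import_from_folder(folder, module_name):
    """Make `folder` searchable, then import `module_name` from it."""
    if folder not in sys.path:
        sys.path.append(folder)
    # Let the import system notice files created after Python started.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


if __name__ == "__main__":
    # Stand-in for the unzipped library: a folder holding one tiny module.
    folder = tempfile.mkdtemp()
    with open(os.path.join(folder, "demo_pdf_tools.py"), "w") as handle:
        handle.write("def merge():\n    return 'merged'\n")

    module = import_from_folder(folder, "demo_pdf_tools")
    print(module.merge())
```

In a real script you would skip the temporary folder, pass the path of your PyPDFPath folder, and then use the normal import statement.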
I hope this helps!

ImportError: cannot import name

I got a library called google-translate-python. https://github.com/terryyin/google-translate-python
Basically, I copied/pasted the translate.py file to my python27/lib directory. I imported it like so:
from translate import Translator
And I put in something like this:
theTranslate = Translator(to_lang="sp")
translation = theTranslate.translate("hello")
And I'm using pycharm btw so I haven't gotten any errors, it is saying the methods are there and everything.
However, I get the error: ImportError: cannot import name Translator
Did I import the library wrong? that's all I can think of. Because the methods are there and running.

I figured it out... the library I was trying to import had the same name as my actual python file. So my python file was called translate.py and my library I was trying to import was called translate. I don't know how to differentiate it.. but changing the name of my python file fixed it. wow.. that took about 3 hours to realize.
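For anyone hitting the same thing: when you run a script, its own directory comes first on sys.path, so a file there with the same name as a library wins. A quick way to check which file a name resolves to is sketched below; where_is is a made-up helper name, and the demo module is a stand-in for a clashing script.

```python
import importlib.util
import os
import sys
import tempfile


def where_is(module_name):
    """Return the file Python would load for `module_name`, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Simulate a script directory containing a file that shadows a library name.
    folder = tempfile.mkdtemp()
    open(os.path.join(folder, "shadow_demo.py"), "w").close()
    sys.path.insert(0, folder)
    importlib.invalidate_caches()

    # If this prints a path inside your own project, your file is shadowing the library.
    print(where_is("shadow_demo"))
```

Calling where_is("translate") from the failing project would have pointed at the project's own translate.py instead of the library.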

Does it show in the list of packages installed under Pycharm interpreter? You need to add the package to this list and then it becomes available to you for import. It is available as one of the packages there.

Based on the github page the package can be installed from the source using:
python setup.py install
Another option is to save the translate.py to the local directory or another directory.
If translate.py is not in the local directory, you can add the directory that contains it to the module search path:
import sys
sys.path.append('PATH_TO_DIRECTORY_CONTAINING_TRANSLATE_PY')

If you can't use pip, the simplest way to get this installed would be to download the source code (.zip file) and unzip it.
Open a terminal (where you have access to python) and change to the folder (cd <the path to the folder>) you have unzipped, and then run:
python setup.py install
This will make sure the files end up in the right location (which on Windows is actually in C:\Python27\Lib\site-packages).

How to build a single python file from multiple scripts?

I have a simple python script, which imports various other modules I've written (and so on). Due to my environment, my PYTHONPATH is quite long. I'm also using Python 2.4.
What I need to do is somehow package up my script and all the dependencies that aren't part of the standard python, so that I can email a single file to another system where I want to execute it. I know the target version of python is the same, but it's on linux where I'm on Windows. Otherwise I'd just use py2exe.
Ideally I'd like to send a .py file that somehow embeds all the required modules, but I'd settle for automatically building a zip I can just unzip, with the required modules all in a single directory.
I've had a look at various packaging solutions, but I can't seem to find a suitable way of doing this. Have I missed something?
[edit] I appear to be quite unclear in what I'm after. I'm basically looking for something like py2exe that will produce a single file (or 2 files) from a given python script, automatically including all the imported modules.
For example, if I have the following two files:
[\foo\module.py]
def example():
    print "Hello"
[\bar\program.py]
import module
module.example()
And I run:
cd \bar
set PYTHONPATH=\foo
program.py
Then it will work. What I want is to be able to say:
magic program.py
and end up with a single file, or possibly a file and a zip, that I can then copy to linux and run. I don't want to be installing my modules on the target linux system.

I found this useful:
http://blog.ablepear.com/2012/10/bundling-python-files-into-stand-alone.html
In short, you can .zip your modules and include a __main__.py file inside, which will enable you to run it like so:
python3 app.zip
Since my app is small I made a link from my main script to __main__.py.
Addendum:
You can also make the zip self-executable on UNIX-like systems by adding a single line at the top of the file. This may be important for scripts using Python3.
echo '#!/usr/bin/env python3' | cat - app.zip > app
chmod a+x app
Which can now be executed without specifying python
./app
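On Python 3.5 and later, the standard library's zipapp module does the same packaging in one call. The sketch below is only an illustration: build_app is a made-up helper, and the module.py / __main__.py pair mirrors the example from the question rather than any real project.

```python
import os
import subprocess
import sys
import tempfile
import zipapp


def build_app(source_dir, target, interpreter="/usr/bin/env python3"):
    """Pack source_dir (which must contain __main__.py) into one runnable archive."""
    # The interpreter argument makes zipapp write the '#!' line for us.
    zipapp.create_archive(source_dir, target=target, interpreter=interpreter)
    return target


if __name__ == "__main__":
    work = tempfile.mkdtemp()
    src = os.path.join(work, "src")
    os.mkdir(src)
    with open(os.path.join(src, "module.py"), "w") as handle:
        handle.write("def example():\n    print('Hello')\n")
    with open(os.path.join(src, "__main__.py"), "w") as handle:
        handle.write("import module\nmodule.example()\n")

    # Keep the target outside the source folder so it is not packed into itself.
    app = build_app(src, os.path.join(work, "app.pyz"))
    subprocess.run([sys.executable, app], check=True)
```

The resulting file can be copied to the other machine and run with python app.pyz, or directly on UNIX-like systems thanks to the shebang line.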

Use the stickytape module:
stickytape scripts/blah --add-python-path . > /tmp/blah-standalone
This will result in a functioning script, but not necessarily a human-readable one.

You can try converting the script into an executable file.
First, use:
pip install pyinstaller
After installation, run the following (make sure you are in the directory that contains your script):
pyinstaller --onefile --windowed filename.py
This will create an executable version of your script containing all the necessary modules. You can then transfer (copy and paste) this executable to the machine on which you want to run your script. Note that PyInstaller is not a cross-compiler: the executable has to be built on the same operating system as the target machine.
I hope this helps.

You should create an egg file. This is an archive of python files.
See this question for guidance: How to create Python egg file
Update: Consider wheels in 2019

The only way to send a single .py is to move the code from all of the various modules into the single script, and then you'd have to redo everything to reference the new locations.
A better way of doing it would be to move the modules in question into subdirectories under the same directory as your command. You can then make sure that the subdirectory containing the module has a __init__.py that imports the primary module file. At that point you can then reference things through it.
For example:
App Directory: /test
Module Directory: /test/hello
/test/hello/__init__.py contents:
import sayhello
/test/hello/sayhello.py contents:
def print_hello():
    print 'hello!'
/test/test.py contents:
#!/usr/bin/python2.7
import hello
hello.sayhello.print_hello()
If you run /test/test.py you will see that it runs the print_hello function from the module directory under the existing directory, no changes to your PYTHONPATH required.

If you want to package your script with all its dependencies into a single file (it won't be a .py file) you should look into virtualenv. This is a tool that lets you build a sandbox environment to install Python packages into, and manages all the PATH, PYTHONPATH, and LD_LIBRARY_PATH issues to make sure that the sandbox is completely self-contained.
If you start with a virgin Python with no additional libraries installed, then easy_install your dependencies into the virtual environment, you will end up with a built project in the virtualenv that requires only Python to run.
The sandbox is a directory tree, not a single file, but for distribution you can tar/zip it. I have never tried distributing the env so there may be path dependencies, I'm not sure.
You may need to, instead, distribute a build script that builds out a virtual environment on the target machine. zc.buildout is a tool that helps automate that process, sort of like a "make install" that is tightly integrated with the Python package system and PyPI.

I've come up with a solution involving modulefinder, the compiler, and the zip function that works well. Unfortunately I can't paste a working program here as it's intermingled with other irrelevant code, but here are some snippets:
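import os
import subprocess
import sys
from modulefinder import ModuleFinder
from zipfile import ZipFile, ZIP_DEFLATED
# dest_dir, zip_name, source_name, python_exe and dest_path are set elsewhere in the full program
zipfile = ZipFile(os.path.join(dest_dir, zip_name), 'w', ZIP_DEFLATED)
sys.path.insert(0, '.')
finder = ModuleFinder()
finder.run_script(source_name)
for name, mod in finder.modules.iteritems():
    filename = mod.__file__
    if filename is None:
        continue
    # Skip modules that live inside the Python installation itself
    if "python" in filename.lower():
        continue
    subprocess.call('"%s" -OO -m py_compile "%s"' % (python_exe, filename))
    zipfile.write(filename, dest_path)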
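For readers on Python 3, the same idea can be sketched roughly as follows. This is not the original program: bundle_local_modules is a made-up name, it skips the compile step, and instead of the "python" substring check it keeps only modules found under one source folder that you pass in. The script itself is stored as __main__.py so the zip can be run directly.

```python
import os
import sys
import tempfile
import zipfile
from modulefinder import ModuleFinder


def bundle_local_modules(script_path, source_dir, zip_path):
    """Zip the script (as __main__.py) plus every imported module found under source_dir."""
    source_dir = os.path.abspath(source_dir)
    finder = ModuleFinder(path=[source_dir] + sys.path)
    finder.run_script(script_path)

    bundled = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.write(script_path, "__main__.py")
        for name, module in sorted(finder.modules.items()):
            filename = module.__file__
            if name == "__main__" or filename is None:
                continue
            filename = os.path.abspath(filename)
            # Keep only our own modules; standard-library files are left out.
            if not filename.startswith(source_dir + os.sep):
                continue
            archive.write(filename, os.path.relpath(filename, source_dir))
            bundled.append(name)
    return bundled


if __name__ == "__main__":
    # Recreate the \foo\module.py and \bar\program.py layout from the question.
    work = tempfile.mkdtemp()
    foo = os.path.join(work, "foo")
    bar = os.path.join(work, "bar")
    os.mkdir(foo)
    os.mkdir(bar)
    with open(os.path.join(foo, "greeting.py"), "w") as handle:
        handle.write("def example():\n    print('Hello')\n")
    with open(os.path.join(bar, "program.py"), "w") as handle:
        handle.write("import greeting\ngreeting.example()\n")

    zip_path = os.path.join(work, "program.zip")
    print(bundle_local_modules(os.path.join(bar, "program.py"), foo, zip_path))
```

Note that modulefinder only sees imports it can detect statically, so modules loaded by name at run time would have to be added by hand.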

Have you considered the automatic script creation of distribute, the official packaging solution?
What you do is create a setup.py for your program and provide entry points that will be turned into executables that you will be able to run. This way you don't have to change your source layout while still having the possibility to easily distribute and run your program.
You will find an example of this system in a real app in gunicorn's setup.py.

Why doesn't my Python 2.6 auto-unzip egg files on import?

I'm under the impression that Python import is supposed to automatically unzip egg files in site-packages.
My installation doesn't seem to want to auto-unzip the egg. What I tried:
(1) I used easy_install to install the suds module, which copied the egg file into site-packages. Python couldn't import it. (import suds)
(2) Then I used the --always-unzip option to easy_install. This time it gave me a directory instead of a zip file. Python still couldn't import the suds module.
(3) I renamed the directory suds. Still couldn't find it.
(4) Finally I copied the suds directory out of the unzipped egg directory into site-packages and Python found it (no surprise there).
For me, easy_install wasn't. What's missing here?
Rufus

By default (if you haven't specified multi-version mode), easy_installing an egg will add an entry to the easy-install.pth file in site-packages. Check there to see if there's a reference to the suds egg. You can also check the Python import path (which is the list of places Python will search for modules) like this:
import sys
print sys.path
Did you try import suds in a Python shell that was started before you easy_installed suds?
That would explain the behaviour you saw. The .pth files are only read at Python startup, so the egg directory or zip file wouldn't have appeared in sys.path. Copying the suds dir from inside the egg directory worked because site-packages itself was already in sys.path. So make sure you restart Python after installing an egg.
Python will import from zip archives, but it won't unzip the archive into site-packages. That is, it won't leave the unzipped directory there after you import. (I think it reads from the zip file in-place without extracting it anywhere in the file system.) I've seen problems where some packages didn't work as zipped eggs (they tried to read data from their location in the file-system), so I'd recommend always using the --always-unzip flag as you do in (2).
You haven't given the command lines you used. Did you specify the -m option to easy_install? That will cause the egg to be installed in multi-version mode. It won't be in sys.path by default, and you'd need to use the pkg_resources.require function before trying to import it.
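As a side note on packages that break when zipped because they read data files by file-system path: one way a package can avoid that is to go through its loader, for example with pkgutil.get_data from the standard library. The sketch below is an assumption-laden illustration, not something suds does: the package name demo_egg_pkg, the helper names, and the egg built on the fly are all invented for the demo.

```python
import os
import pkgutil
import sys
import tempfile
import zipfile


def make_demo_egg(directory):
    """Build a tiny zipped 'egg' holding a package with one data file."""
    egg_path = os.path.join(directory, "demo_egg_pkg-0.1-py3.egg")
    with zipfile.ZipFile(egg_path, "w") as egg:
        egg.writestr("demo_egg_pkg/__init__.py", "")
        egg.writestr("demo_egg_pkg/syntax/syntax.xml", "<syntax/>\n")
    return egg_path


def read_package_data(package, resource):
    """Read a data file shipped inside `package`, whether it is zipped or unpacked."""
    data = pkgutil.get_data(package, resource)
    if data is None:
        raise IOError("cannot load %r from package %r" % (resource, package))
    return data.decode("utf-8")


if __name__ == "__main__":
    egg = make_demo_egg(tempfile.mkdtemp())
    # Python imports straight from the zip; nothing is extracted to disk.
    sys.path.insert(0, egg)
    print(read_package_data("demo_egg_pkg", "syntax/syntax.xml"))
```

The resource name is always given relative to the package and with forward slashes, regardless of the operating system.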

using a "temporary files" folder in python

I recently wrote a script which queries PyPI and downloads a package; however, the package gets downloaded to a user defined folder.
I'd like to modify the script in such a way that my downloaded files go into a temporary folder if no folder is specified.
The temporary-files folder in *nix machines is "/tmp" ; would there be any Python method I could use to find out the temporary-files folder in a particular machine?
If not, could someone suggest an alternative to this problem?

Python has a built-in module for using temporary files and folders. You probably want tempfile.mkdtemp().
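A minimal sketch of how that could look in a download script. The function name resolve_download_dir and the "pypi-download-" prefix are made up for the example; tempfile.gettempdir() is the call that answers the "where is /tmp on this machine" part of the question.

```python
import os
import shutil
import tempfile


def resolve_download_dir(user_dir=None):
    """Return user_dir if one was given, otherwise a fresh private temporary folder."""
    if user_dir:
        os.makedirs(user_dir, exist_ok=True)
        return user_dir
    # mkdtemp creates a uniquely named folder inside the system temp directory.
    return tempfile.mkdtemp(prefix="pypi-download-")


if __name__ == "__main__":
    # The platform's temporary-files folder (for example /tmp on many *nix systems).
    print(tempfile.gettempdir())

    folder = resolve_download_dir()
    print(folder)
    # mkdtemp does not clean up after itself, so remove the folder when done.
    shutil.rmtree(folder)
```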

Perhaps the tempfile module?
