How can I appropriately include external resources with cxfreeze? - python

I'm trying to use cxfreeze to build my Python scripts into an .exe file. However, my scripts use some external data files which aren't being packaged into the library.zip file that is created.
For example, my scripts are located in src/, and the external data is located in src/data/. I've specified the include_files property in the build_exe_options, but this only copies the directory and files into the built directory; it doesn't add them to library.zip, which is where the scripts end up looking for the files.
Even if I go in to the created library.zip and manually add the data directory, I receive the same error. Any idea how to get cxfreeze to package these external resources appropriately?
setup.py
from cx_Freeze import setup, Executable

build_exe_options = {"includes" : ["re"], "include_files" : ["data/table_1.txt", "data/table_2.txt"]}

setup(name = "My Script",
      version = "0.8",
      description = "My Script",
      options = { "build_exe" : build_exe_options },
      executables = [Executable("my_script.py")])
fileutil.py (where it tries to read the resource files)
import os

def read_file(filename):
    # Resolve the name relative to the directory containing this module
    path, fl = os.path.split(os.path.realpath(__file__))
    filename = os.path.join(path, filename)
    with open(filename, "r") as file:
        lines = [line.strip() for line in file]
    # Keep blank lines, drop lines starting with "#"
    return [line for line in lines if len(line) == 0 or line[0] != "#"]
... called with ...
read_file("data/table_1.txt")
Error Traceback
Traceback (most recent call last):
File "C:\Python33\lib\site-packages\cx_Freeze\initscripts\Console3.py", line 27, in <module>
exec(code, m.__dict__)
File "my_script.py", line 94, in <module>
File "my_script.py", line 68, in run
File "C:\workspaces\py\test_script\src\tables.py", line 12, in load_data
raw_gems = read_file("data/table_1.txt")
File "C:\workspaces\py\test_script\src\fileutil.py", line 8, in read_file
with open(filename, "r") as file:
FileNotFoundError: [Errno 2] No such file or directory:
'C:\\workspaces\\py\\test_script\\src\\build\\exe.win32-3.3\\library.zip\\data/table_1.txt'

The following structure worked for me:
|-main.py
|-src
  |-utils.py (containing get_base_dir())
|-data
Then always refer to your data relative to the location of main.py, which you obtain through the following function placed in the src directory:
import os, sys, inspect

def get_base_dir():
    if getattr(sys, "frozen", False):
        # If this is running in the context of a frozen (executable) file,
        # we return the path of the main application executable
        return os.path.dirname(os.path.abspath(sys.executable))
    else:
        # If we are running in script or debug mode, we need
        # to inspect the currently executing frame. This enables us to always
        # derive the directory of main.py no matter from where this function
        # is being called
        thisdir = os.path.dirname(inspect.getfile(inspect.currentframe()))
        return os.path.abspath(os.path.join(thisdir, os.pardir))
If you include the data according to the cx_Freeze documentation, it will be in the same directory as the .exe file (i.e. not in the zip file), which will work with this solution.
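As a rough sketch of how the question's read_file could follow this approach: the version below looks for the data next to the executable when frozen. The simplified get_base_dir assumes a flat layout (script and data directory side by side), and the base_dir parameter is a hypothetical addition so the function can be tried without freezing anything.

```python
import os
import sys


def get_base_dir():
    """Directory of the executable when frozen, else of this file.

    Assumption: flat layout, i.e. the data directory sits next to this file.
    """
    if getattr(sys, "frozen", False):
        return os.path.dirname(os.path.abspath(sys.executable))
    return os.path.dirname(os.path.abspath(__file__))


def read_file(filename, base_dir=None):
    # base_dir is a hypothetical parameter added for illustration and testing.
    if base_dir is None:
        base_dir = get_base_dir()
    path = os.path.join(base_dir, filename)
    with open(path, "r") as handle:
        lines = [line.strip() for line in handle]
    # Same filter as in the question: keep blank lines, drop "#" comment lines.
    return [line for line in lines if len(line) == 0 or line[0] != "#"]
```

With this, read_file("data/table_1.txt") no longer builds a path that points inside library.zip.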

Related

Can't open file in the same directory

I was following a python tutorial about files and I couldn't open a text file while in the same directory as the python script. Any reason to this?
f = open("test.txt", "r")
print(f.name)
f.close()
Error message:
Traceback (most recent call last):
File "c:\Users\07gas\OneDrive\Documents\pyFileTest\ManipulatingFiles.py", line 1, in <module>
f = open("test.txt", "r")
FileNotFoundError: [Errno 2] No such file or directory: 'test.txt'
Here's a screenshot of proof of it being in the same directory:
The problem is that "test.txt" is a relative file path, so it is interpreted relative to whatever the current working directory (CWD) happens to be when the script is run. One simple solution is to use the predefined __file__ module attribute, which is the pathname of the currently running script, to obtain the directory the script file is in (its "parent") and use that to build an absolute path to the data file in the same folder.
You should also use the with statement to ensure the file gets closed automatically.
The code below shows how to do both of these things:
from pathlib import Path
filepath = Path(__file__).parent / "test.txt"
with open(filepath, "r") as f:
    print(f.name)

How to get content of file inside package

I am writing something in Python where I want to use predefined texts from files within the package. Somehow I can't manage to get it to work in Eclipse PyDev Console.
This is my path structure. From "story.py" I want to use the content of "starttext".
I tried open() with multiple variations of os.getcwd() and os.path.dirname(sys.argv[0]) which resulted in
FileNotFoundError: [Errno 2] No such file or directory: '..\starttext'
My last attempt was trying something like
import pkg_resources
resource_package = __name__
resource_path = '/'.join(('.', 'starttext'))
template = pkg_resources.resource_stream(resource_package, resource_path)
resulting in:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "C:\Program Files\Python\Python36-64\lib\site-packages\pkg_resources\__init__.py", line 1232, in resource_stream
self, resource_name
File "C:\Program Files\Python\Python36-64\lib\site-packages\pkg_resources\__init__.py", line 1479, in get_resource_stream
return io.BytesIO(self.get_resource_string(manager, resource_name))
File "C:\Program Files\Python\Python36-64\lib\site-packages\pkg_resources\__init__.py", line 1482, in get_resource_string
return self._get(self._fn(self.module_path, resource_name))
File "C:\Program Files\Python\Python36-64\lib\site-packages\pkg_resources\__init__.py", line 1560, in _get
"Can't perform this operation for loaders without 'get_data()'"
NotImplementedError: Can't perform this operation for loaders without 'get_data()'
which appears to have something to do with python 3.x?
This seems to be such an easy task and I don't understand what's wrong.
Any help is appreciated.
Thank you.
update
Thanks to ShmulikA I changed it to:
from os.path import dirname, join, abspath
filename = join(dirname(abspath(communitybot.anthology.teststory.story.__file__)), 'starttext')
file = open(filename, 'r')
content = file.read()
This works although I think it is a little bit long, but I'm certain I am still doing something wrong there.
It seems like you are missing a \ in the path. Use os.path.join instead:
from os.path import dirname, join, abspath
filename = join(dirname(abspath(__file__)), 'starttext')
file = open(filename, 'r')
__file__ - the path to the module's source file (you can also try import requests; requests.__file__)
os.path.abspath - returns the absolute path (e.g. abspath('..') returns /home when the current directory is /home/user)
os.path.dirname - returns the directory part of a path
os.path.join - joins path components in a way that works on both Linux and Windows
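Put together, the four pieces above could be wrapped as follows. This is only a sketch: sibling_path and read_sibling are hypothetical helper names, and the with statement is added so the file is closed automatically.

```python
from os.path import abspath, dirname, join


def sibling_path(module_file, name):
    """Build the path of `name` in the same directory as module_file."""
    return join(dirname(abspath(module_file)), name)


def read_sibling(module_file, name):
    # The with statement closes the file even if reading fails.
    with open(sibling_path(module_file, name), "r") as handle:
        return handle.read()
```

Inside story.py this would be called as content = read_sibling(__file__, 'starttext').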

cx_freeze and docx - problems when freezing

I have a simple program that takes input from the user and then does scraping with selenium. Since the user doesn't have Python environment installed I would like to convert it to *.exe. I usually use cx_freeze for that and I have successfully converted .py programs to .exe. At first it was missing some modules (like lxml) but I was able to solve it. Now I think I only have problem with docx package.
This is how I initiate the new document in my program (I guess this is what causes me problems):
doc = Document()
#then I do some stuff to it and add paragraph and in the end...
doc.save('results.docx')
When I run it from python everything works fine but when I convert to exe I get this error:
Traceback (most recent call last):
File "C:\Users\tyszkap\AppData\Local\Continuum\Anaconda3\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
exec(code, m.__dict__)
File "tribunalRio.py", line 30, in <module>
File "C:\Users\tyszkap\AppData\Local\Continuum\Anaconda3\lib\site-packages\docx\api.py", line 25, in Document
document_part = Package.open(docx).main_document_part
File "C:\Users\tyszkap\AppData\Local\Continuum\Anaconda3\lib\site-packages\docx\opc\package.py", line 116, in open
pkg_reader = PackageReader.from_file(pkg_file)
File "C:\Users\tyszkap\AppData\Local\Continuum\Anaconda3\lib\site-packages\docx\opc\pkgreader.py", line 32, in from_file
phys_reader = PhysPkgReader(pkg_file)
File "C:\Users\tyszkap\AppData\Local\Continuum\Anaconda3\lib\site-packages\docx\opc\phys_pkg.py", line 31, in __new__
"Package not found at '%s'" % pkg_file
docx.opc.exceptions.PackageNotFoundError: Package not found at 'C:\Users\tyszkap\Dropbox (Dow Jones)\Python Projects\build\exe.win-amd64-3.4\library.zip\docx\templates\default.docx'
This is my setup.py program:
from cx_Freeze import setup, Executable

executable = Executable( script = "tribunalRio.py" )

# Add certificate to the build
options = {
    "build_exe": {'include_files' : ['default.docx'],
                  'packages' : ["lxml._elementpath", "inspect", "docx", "selenium"]
                  }
}

setup(
    version = "0",
    requires = [],
    options = options,
    executables = [executable])
I thought that explicitly adding default.docx to the package would solve the problem (I have even tried adding it to the library.zip but it gives me even more errors) but it didn't. I have seen this post but I don't know what they mean by:
copying the docx document.py module inside my function (instead of
using Document()
Any ideas? I know that freezing is not the best solution but I really don't want to build a web interface for such a simple program...
EDIT:
I have just tried this solution :
import os
import sys

def find_data_file(filename):
    if getattr(sys, 'frozen', False):
        # The application is frozen
        datadir = os.path.dirname(sys.executable)
    else:
        # The application is not frozen
        # Change this bit to match where you store your data files:
        datadir = os.path.dirname(__file__)
    return os.path.join(datadir, filename)

doc = Document(find_data_file('default.docx'))
but I again receive a traceback (even though the file is in this location):
Traceback (most recent call last):
File "C:\Users\tyszkap\AppData\Local\Continuum\Anaconda3\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
exec(code, m.__dict__)
File "tribunalRio.py", line 43, in <module>
File "C:\Users\tyszkap\AppData\Local\Continuum\Anaconda3\lib\site-packages\docx\api.py", line 25, in Document
document_part = Package.open(docx).main_document_part
File "C:\Users\tyszkap\AppData\Local\Continuum\Anaconda3\lib\site-packages\docx\opc\package.py", line 116, in open
pkg_reader = PackageReader.from_file(pkg_file)
File "C:\Users\tyszkap\AppData\Local\Continuum\Anaconda3\lib\site-packages\docx\opc\pkgreader.py", line 32, in from_file
phys_reader = PhysPkgReader(pkg_file)
File "C:\Users\tyszkap\AppData\Local\Continuum\Anaconda3\lib\site-packages\docx\opc\phys_pkg.py", line 31, in __new__
"Package not found at '%s'" % pkg_file
docx.opc.exceptions.PackageNotFoundError: Package not found at 'C:\Users\tyszkap\Dropbox (Dow Jones)\Python Projects\build\exe.win-amd64-3.4\default.docx'
What am I doing wrong?
I expect you'll find the problem has to do with your freezing operation not placing the default Document() template in the expected location. It's stored as package data in the python-docx package as docx/templates/default.docx (see setup.py here: https://github.com/python-openxml/python-docx/blob/master/setup.py#L37)
I don't know how to fix that in your case, but that looks like where the problem is.
I had the same problem and worked around it as follows. First, I located the default.docx file in site-packages and copied it into the same directory as my .py file. Then I created the document by passing that path through the docx argument of Document(), i.e. doc = Document(docx=os.path.join(os.getcwd(), 'default.docx')). The final step was to include the file in the freezing process. Et voilà! So far I have had no problems.
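The manual "locate default.docx in site-packages" step could also be done in code. The sketch below is only an illustration: package_file_include is a hypothetical helper, the destination name is an assumption, and it relies on cx_Freeze accepting (source, destination) pairs in include_files.

```python
import importlib.util
import os


def package_file_include(package_name, relative_path, destination):
    """Return a (source, destination) pair for a file shipped inside a package."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ImportError("package not found: " + package_name)
    package_dir = list(spec.submodule_search_locations)[0]
    source = os.path.join(package_dir, *relative_path.split("/"))
    if not os.path.isfile(source):
        raise FileNotFoundError(source)
    return (source, destination)


# Hypothetical use in setup.py (requires python-docx to be installed):
# include_files = [package_file_include("docx", "templates/default.docx", "default.docx")]
```

The program would still have to pass the copied file's path to Document() explicitly, as described above.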

Unable to import .xlsx into Python: No such file or directory

I'm trying to import data from HW3_Yld_Data.xlsx into Python. I made sure that the Excel file is in the same directory as the Python file. Here's what I wrote:
import pandas as pd
Z = pd.read_excel('HW3_Yld_Data.xlsx')
Here's the error I got:
In [2]: import pandas as pd
...:
...: Z = pd.read_excel('HW3_Yld_Data.xlsx')
Traceback (most recent call last):
File "<ipython-input-2-7237c05c79ba>", line 3, in <module>
Z = pd.read_excel('HW3_Yld_Data.xlsx')
File "/Users/Zhengnan/anaconda/lib/python2.7/site-packages/pandas/io/excel.py", line 151, in read_excel
return ExcelFile(io, engine=engine).parse(sheetname=sheetname, **kwds)
File "/Users/Zhengnan/anaconda/lib/python2.7/site-packages/pandas/io/excel.py", line 188, in __init__
self.book = xlrd.open_workbook(io)
File "/Users/Zhengnan/anaconda/lib/python2.7/site-packages/xlrd/__init__.py", line 394, in open_workbook
f = open(filename, "rb")
IOError: [Errno 2] No such file or directory: 'HW3_Yld_Data.xlsx'
What's mind-boggling is that it used to work fine. It appeared to stop working after I did a "conda update --all" yesterday.
BTW I'm using Spyder as IDE. Please help. Thank you.
Each process in the operating system has a current working directory. Any relative path is relative to the current working directory.
The current working directory is set to the directory from which you launched the process. This is very natural when using the command line, but can be confusing for people who only use GUIs.
You can retrieve it using os.getcwd(), and you can change it using os.chdir(). Of course, you can also change it before launching your script.
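A small sketch of that behaviour, using a temporary directory so nothing on disk is touched (open_relative_demo is just an illustrative name): the same relative name "test.txt" is looked up before and after os.chdir().

```python
import os
import tempfile


def open_relative_demo():
    """Show that a relative path resolves against the current working directory."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, "test.txt"), "w") as handle:
            handle.write("hello")
        # Usually False, unless the starting directory happens to contain test.txt.
        found_before = os.path.exists("test.txt")
        os.chdir(workdir)
        try:
            # Now the relative name refers to the file created above.
            found_after = os.path.exists("test.txt")
        finally:
            # Restore the CWD before the temporary directory is removed.
            os.chdir(original_cwd)
    return found_before, found_after
```

The file itself never moved; only the directory that relative paths are resolved against changed.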
Instead of using the relative path, use the full path of your xlsx for a test. Your conda update may have changed your environment.
You can try something like this in order to test it:
import os
import pandas as pd

# Build the path relative to the directory containing this script
pre = os.path.dirname(os.path.realpath(__file__))
fname = 'HW3_Yld_Data.xlsx'
path = os.path.join(pre, fname)
Z = pd.read_excel(path)

Renaming files recursively with Python

I have a large directory structure, each directory containing multiple sub-directories, multiple .mbox files, or both. I need to rename all the .mbox files to the respective file name without the extension e.g.
bar.mbox -> bar
foo.mbox -> foo
Here is the script I've written:
#!/usr/bin/python

import os, sys

def walktree(top, callback):
    for path, dirs, files in os.walk(top):
        for filename in files:
            fullPath = os.path.join(path, filename)
            callback(fullPath)

def renameFile(file):
    if file.endswith('.mbox'):
        fileName, fileExt = os.path.splitext(file)
        print(file, "->", fileName)
        os.rename(file, fileName)

if __name__ == '__main__':
    walktree(sys.argv[1], renameFile)
When I run this using:
python walkthrough.py "directory"
I get the error:
Traceback (most recent call last):
File "./walkthrough.py", line 18, in <module>
walktree(sys.argv[1], renameFile)
File "./walkthrough.py", line 9, in walktree
callback(fullPath)
File "./walkthrough.py", line 15, in renameFile
os.rename(file,fileName)
OSError: [Errno 21] Is a directory
This was solved by adding an extra conditional statement that tests whether the name the file is about to be renamed to already exists as a directory.
If it does, an underscore is appended to the new filename.
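A minimal sketch of that fix, not the poster's actual code: safe_target and rename_mbox_files are hypothetical names, and the check is slightly generalised to any existing path (file or directory), appending underscores until the name is free.

```python
import os


def safe_target(path):
    """Return the name to rename `path` to after dropping its extension."""
    target, _ext = os.path.splitext(path)
    # Assumption: keep appending underscores until the name no longer collides.
    while os.path.exists(target):
        target += "_"
    return target


def rename_mbox_files(top):
    """Rename every *.mbox file under `top`; return (source, target) pairs."""
    renamed = []
    for folder, _dirs, files in os.walk(top):
        for filename in files:
            if filename.endswith(".mbox"):
                source = os.path.join(folder, filename)
                target = safe_target(source)
                os.rename(source, target)
                renamed.append((source, target))
    return renamed
```

For example, if a folder holds both foo.mbox and a sub-directory named foo, the file becomes foo_ instead of triggering the "Is a directory" error.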
Thanks to WKPlus for the hint on this.
BCvery1
