python yaml path after deployment - python

This is a question about how to handle settings files and relative paths in Python (and probably also about best practice).
I have written a small project that I want to deploy to a Docker image. Everything is set up now, except that when I try to run the Python task (through cron) I get the error: settings/settings.yml not found.
tree .
├───settings
│   └───settings.yml
└───main.py
I am referencing the YAML file as
with open('settings/settings.yml', 'r') as f:
    config = yaml.load(f, Loader=yaml.FullLoader)
I can see this is what is causing the problem, but I am unsure how to fix it. In the future I want to invoke the main file through setuptools entry_points, so my quick fix of cd'ing into the project directory before running python main.py will not be a lasting solution.

Instead of hardcoding a path as a string, you can find the directories and build the file path with os.path. For example:
import os
import yaml
current_dir = os.path.dirname(os.path.abspath(__file__))
settings_dir = os.path.join(current_dir, "settings")
filename = "settings.yml"
settings_path = os.path.join(settings_dir, filename)
with open(settings_path, "r") as infile:
    settings_data = yaml.safe_load(infile)  # yaml.load() without a Loader argument fails on PyYAML 6+
This way it can be run on any file system, and the Python file can be called from any directory.
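The same idea can also be written with pathlib. The following is only a minimal sketch: the helper names path_next_to and default_settings_path are made up for illustration, and the YAML parsing step is left out so that the snippet needs nothing beyond the standard library.

```python
from pathlib import Path


def path_next_to(anchor_file, *parts):
    """Build an absolute path relative to the directory containing anchor_file.

    path_next_to is a hypothetical helper name, not part of any library.
    """
    # resolve() makes the anchor absolute (and follows symlinks), so the
    # result no longer depends on the current working directory.
    return Path(anchor_file).resolve().parent.joinpath(*parts)


def default_settings_path():
    """Location of settings/settings.yml next to this source file."""
    return path_next_to(__file__, "settings", "settings.yml")
```

Inside main.py, open(default_settings_path()) should then find the file regardless of the directory cron starts the process in.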

Related

Can't read csv file in same directory [duplicate]

Say I have a Python project that is structured as follows:
project
    /data
        test.csv
    /package
        __init__.py
        module.py
    main.py
__init__.py:
from .module import test
module.py:
import csv
with open("../data/test.csv") as f:
    test = [line for line in csv.reader(f)]
main.py:
import package
print(package.test)
When I run main.py I get the following error:
C:\Users\Patrick\Desktop\project>python main.py
Traceback (most recent call last):
  File "main.py", line 1, in <module>
    import package
  File "C:\Users\Patrick\Desktop\project\package\__init__.py", line 1, in <module>
    from .module import test
  File "C:\Users\Patrick\Desktop\project\package\module.py", line 3, in <module>
    with open("../data/test.csv") as f:
FileNotFoundError: [Errno 2] No such file or directory: '../data/test.csv'
However, if I run module.py from the package directory, I don’t get any errors. So it seems that the relative path used in open(...) is only relative to where the originating file is being run from (i.e __name__ == "__main__")? I don't want to use absolute paths. What are some ways to deal with this?
Relative paths are relative to the current working directory.
If you do not want your path to be relative, it must be absolute.
But there is an often-used trick to build an absolute path from the current script: use its __file__ special attribute:
import csv
from pathlib import Path
path = Path(__file__).parent / "../data/test.csv"
with path.open() as f:
    test = list(csv.reader(f))
This requires python 3.4+ (for the pathlib module).
If you still need to support older versions, you can get the same result with:
import csv
import os.path
my_path = os.path.abspath(os.path.dirname(__file__))
path = os.path.join(my_path, "../data/test.csv")
with open(path) as f:
    test = list(csv.reader(f))
[2020 edit: python3.4+ should now be the norm, so I moved the pathlib version inspired by jpyams' comment first]
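To make the first point of this answer concrete, here is a small sketch showing that one and the same relative path string names different files depending on the current working directory. The helper name resolve_from and the two temporary directories are purely illustrative.

```python
import os
import tempfile


def resolve_from(cwd, relative):
    """Return what `relative` resolves to when the process runs in `cwd`."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        # os.path.abspath() joins a relative path onto os.getcwd(),
        # which is the same base directory that open() uses.
        return os.path.abspath(relative)
    finally:
        os.chdir(previous)  # always restore the original working directory


with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    a = resolve_from(first, "data/test.csv")
    b = resolve_from(second, "data/test.csv")
    print(a)
    print(b)
    print(a != b)  # True: the same string points at two different locations
```

This is why a path anchored on __file__ behaves predictably while a bare relative path does not.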
For Python 3.4+:
import csv
from pathlib import Path
base_path = Path(__file__).parent
file_path = (base_path / "../data/test.csv").resolve()
with open(file_path) as f:
    test = [line for line in csv.reader(f)]
This worked for me.
with open('data/test.csv') as f:
My Python version is 3.5.2, and the solution proposed in the accepted answer didn't work for me. I still got the error
FileNotFoundError: [Errno 2] No such file or directory
when I ran my_script.py from the terminal, although it worked fine when I ran it through Run/Debug Configurations in the PyCharm IDE (PyCharm 2018.3.2 (Community Edition)).
Solution:
instead of using:
my_path = os.path.abspath(os.path.dirname(__file__)) + some_rel_dir_path
as suggested in the accepted answer, I used:
my_path = os.path.abspath(os.path.dirname(os.path.abspath(__file__))) + some_rel_dir_path
Explanation:
Changing os.path.dirname(__file__) to os.path.dirname(os.path.abspath(__file__))
solves the following problem:
When we run our script like this: python3 my_script.py
the __file__ variable holds just the string "my_script.py", without the path leading to that script. That is why dirname(__file__) returns an empty string "". That is also the reason why my_path = os.path.abspath(os.path.dirname(__file__)) + some_rel_dir_path is actually the same thing as my_path = some_rel_dir_path. Consequently, FileNotFoundError: [Errno 2] No such file or directory is raised when calling open, because there is no directory like "some_rel_dir_path".
Running the script from the PyCharm Run/Debug Configurations worked because PyCharm runs the command python3 /full/path/to/my_script.py (where "/full/path/to" is what we specified in the "Working directory" setting of Run/Debug Configurations) instead of just python3 my_script.py, as happens when we run it from the terminal.
Try
with open(f"{os.path.dirname(sys.argv[0])}/data/test.csv", newline='') as f:
I was surprised when the following code worked.
import os
for file in os.listdir("../FutureBookList"):
    if file.endswith(".adoc"):
        filename, file_extension = os.path.splitext(file)
        print(filename)
        print(file_extension)
        continue
    else:
        continue
So, I checked the documentation and it says:
Changed in version 3.6: Accepts a path-like object.
path-like object:
An object representing a file system path. A path-like object is
either a str or...
I did a little more digging and the following also works:
with open("../FutureBookList/file.txt") as file:
    data = file.read()

python; reading file path error

I have the following directory structure:
DIR1:
----outerPyFile.py
----DIR2:
--------innerPyFile.py
--------DIR3:
------------fileToRead.csv
I'm reading fileToRead.csv in innerPyFile.py with pd.read_csv('DIR3/fileToRead.csv'), which works fine if I run innerPyFile.py on its own.
Now, when I import the innerPyFile module inside outerPyFile.py as
import innerPyFile
I get: FileNotFoundError: DIR3\\fileToRead.csv. does not exist
I tried replacing the path with an absolute path in innerPyFile.py: pd.read_csv(os.path.abspath('DIR3/fileToRead.csv'))
Still, when I run outerPyFile.py I get
FileNotFoundError C:\\\DIR1\\\DIR3\\\fileToRead.csv does not exist
Here the path omits DIR2, so I changed the code to pd.read_csv(os.path.abspath('DIR2/DIR3/fileToRead.csv'))
Now the code works fine when I run outerPyFile.py, which is acceptable.
But the problem then arises when I run innerPyFile.py on its own, because it looks for DIR2, which does not exist in the CWD of innerPyFile.py.
Can anyone explain this behavior? What is going on?
FYI, I've also tried the pathlib module, which didn't solve the issue.
Try this:
innerPyFile.py
import os
import pandas as pd
script_path = os.path.abspath(__file__)  # i.e. /path/to/dir/foobar.py
script_dir = os.path.split(script_path)[0]  # i.e. /path/to/dir/
rel_path = "DIR3/fileToRead.csv"
abs_file_path = os.path.join(script_dir, rel_path)
pd.read_csv(abs_file_path)
outerPyFile.py
import DIR2.innerPyFile
#......do something.....
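A quick way to convince yourself that this fixes both ways of running the code is the following self-contained sketch. It writes a stand-in for innerPyFile.py into a temporary DIR1/DIR2 layout and imports it from two different working directories. The helper names load_inner and data_path_seen_from are invented for this demonstration, and pandas is left out so that only the standard library is needed.

```python
import importlib.util
import os
import tempfile

# Source of a stand-in for innerPyFile.py: it records where its data file
# lives, anchored on its own location rather than on the working directory.
INNER_SOURCE = '''
import os
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
DATA_PATH = os.path.join(SCRIPT_DIR, "DIR3", "fileToRead.csv")
'''


def load_inner(module_file):
    """Import the module stored at module_file and return it."""
    spec = importlib.util.spec_from_file_location("innerPyFile", module_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def data_path_seen_from(cwd, module_file):
    """Import the module while the process sits in cwd; return its DATA_PATH."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        return load_inner(module_file).DATA_PATH
    finally:
        os.chdir(previous)


with tempfile.TemporaryDirectory() as dir1:
    dir2 = os.path.join(dir1, "DIR2")
    os.makedirs(os.path.join(dir2, "DIR3"))
    module_file = os.path.join(dir2, "innerPyFile.py")
    with open(module_file, "w") as handle:
        handle.write(INNER_SOURCE)
    from_outer = data_path_seen_from(dir1, module_file)  # like running outerPyFile.py
    from_inner = data_path_seen_from(dir2, module_file)  # like running innerPyFile.py
    print(from_outer == from_inner)  # True: the path no longer depends on the CWD
```

Because the path is derived from the module's own __file__, it is the same whichever script starts the program.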

Python: Check if data file exists relative to source code file

I have a small text (XML) file that I want a Python function to load. The location of the text file is always in a fixed relative position to the Python function code.
For example, on my local computer, the files text.xml and mycode.py could reside in:
/a/b/text.xml
/a/c/mycode.py
Later at run time, the files could reside in:
/mnt/x/b/text.xml
/mnt/x/c/mycode.py
How do I ensure I can load in the file? Do I need the absolute path? I see that I can use os.path.isfile, but that presumes I have a path.
You can do the following:
import os
BASE_DIR = os.path.dirname(os.path.realpath(__file__))
This gives you the directory that contains mycode.py.
Accessing the XML file (which lives in the sibling directory b in your example) is then as simple as:
xml_file = "{}/../b/text.xml".format(BASE_DIR)
fin = open(xml_file, 'r')
If the parent directory of the two directories is always the same, this should work:
import os
path_to_script = os.path.realpath(__file__)
# Go up two levels: first to the script's directory, then to the parent shared by both directories.
parent_directory = os.path.dirname(os.path.dirname(path_to_script))
for root, dirs, files in os.walk(parent_directory):
    for file in files:
        if file == 'text.xml':
            path_to_xml = os.path.join(root, file)
You can use the special variable __file__, which gives you the path of the current file (see http://docs.python.org/2/reference/datamodel.html).
So in your first example, you can reference text.xml this way in mycode.py:
xml_path = os.path.join(os.path.dirname(__file__), '..', 'b', 'text.xml')
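Since the question also asks how to check that the file is really there, the path construction and the existence check can be combined. This is a minimal sketch with a hypothetical helper name (locate_data_file); it assumes Python 3.6+ and the /a/b/text.xml, /a/c/mycode.py layout from the question.

```python
from pathlib import Path


def locate_data_file(anchor_file, *relative_parts):
    """Return the absolute path of a file located relative to anchor_file.

    Raises FileNotFoundError if nothing exists at the computed location.
    """
    # Anchor on the directory that contains the source file.
    base_dir = Path(anchor_file).resolve().parent
    candidate = base_dir.joinpath(*relative_parts).resolve()
    if not candidate.is_file():
        raise FileNotFoundError(f"expected data file at {candidate}")
    return candidate
```

In mycode.py the call would be locate_data_file(__file__, "..", "b", "text.xml"), which gives the same result under /a and under /mnt/x because only the relative layout matters.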

Include .pyd module files in py2exe compilation

I'm trying to compile a Python script with py2exe. On executing the exe I got:
C:\Python27\dist>visualn.exe
Traceback (most recent call last):
  File "visualn.py", line 19, in <module>
  File "MMTK\__init__.pyc", line 39, in <module>
  File "Scientific\Geometry\__init__.pyc", line 30, in <module>
  File "Scientific\Geometry\VectorModule.pyc", line 9, in <module>
  File "Scientific\N.pyc", line 1, in <module>
ImportError: No module named Scientific_numerics_package_id
I can see the file Scientific_numerics_package_id.pyd at the location "C:\Python27\Lib\site-packages\Scientific\win32". I want to include this module file in the compilation. I tried copying the file into the "dist" folder, but that did not help. Any ideas?
Update:
Here is the script:
from MMTK import *
from MMTK.Proteins import Protein
from Scientific.Visualization import VRML2; visualization_module = VRML2
protein = Protein('3CLN.pdb')
center, inertia = protein.centerAndMomentOfInertia()
distance_away = 8.0
front_cam = visualization_module.Camera(position= [center[0],center[1],center[2]+distance_away],description="Front")
right_cam = visualization_module.Camera(position=[center[0]+distance_away,center[1],center[2]],orientation=(Vector(0, 1, 0),3.14159*0.5),description="Right")
back_cam = visualization_module.Camera(position=[center[0],center[1],center[2]-distance_away],orientation=(Vector(0, 1, 0),3.14159),description="Back")
left_cam = visualization_module.Camera(position=[center[0]-distance_away,center[1],center[2]],orientation=(Vector(0, 1, 0),3.14159*1.5),description="Left")
model_name = 'vdw'
graphics = protein.graphicsObjects(graphics_module = visualization_module,model=model_name)
visualization_module.Scene(graphics, cameras=[front_cam,right_cam,back_cam,left_cam]).view()
Py2exe lets you specify additional Python modules (both .py and .pyd) via the includes option:
setup(
    ...
    options={"py2exe": {"includes": ["Scientific.win32.Scientific_numerics_package_id"]}}
)
EDIT. The above should work if Python is able to
import Scientific.win32.Scientific_numerics_package_id
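One way to check that precondition before building is to attempt the import programmatically. The sketch below uses a made-up helper name (can_import) and relies only on importlib.import_module, so it should behave the same on Python 2.7 and Python 3.

```python
import importlib


def can_import(dotted_name):
    """Return True if importing dotted_name succeeds in this interpreter."""
    try:
        importlib.import_module(dotted_name)
    except ImportError:
        # Also covers a missing parent package such as Scientific.win32.
        return False
    return True


if not can_import("Scientific.win32.Scientific_numerics_package_id"):
    print("module is not importable; fix the package layout before running py2exe")
```

If the check fails, the includes option is unlikely to help until the import itself works.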
There is a way to work around this type of issue that I have used a number of times. To add extra files to the py2exe result, you can extend the media collector with a custom version of it. The following code is an example:
import glob
import os
from distutils import log  # assumption: the `log` used below is distutils.log
from py2exe.build_exe import py2exe as build_exe
def get_py2exe_extension():
    """Return an extension class of py2exe."""
    class MediaCollector(build_exe):
        """Extension that copies Scientific_numerics_package_id missing data."""
        def _add_module_data(self, module_name):
            """Add the data from a given path."""
            # Create the media subdir where the
            # Python files are collected.
            media = module_name.replace('.', os.path.sep)
            full = os.path.join(self.collect_dir, media)
            if not os.path.exists(full):
                self.mkpath(full)
            # Copy the media files to the collection dir.
            # Also add the copied file to the list of compiled
            # files so it will be included in zipfile.
            module = __import__(module_name, None, None, [''])
            for path in module.__path__:
                for f in glob.glob(path + '/*'):  # does not like os.path.sep
                    log.info('Copying file %s', f)
                    name = os.path.basename(f)
                    if not os.path.isdir(f):
                        self.copy_file(f, os.path.join(full, name))
                        self.compiled_files.append(os.path.join(media, name))
                    else:
                        self.copy_tree(f, os.path.join(full, name))
        def copy_extensions(self, extensions):
            """Copy the missing extensions."""
            build_exe.copy_extensions(self, extensions)
            for module in ['Scientific_numerics_package_id',]:
                self._add_module_data(module)
    return MediaCollector
I'm not sure which module Scientific_numerics_package_id is, so I've assumed that you can import it like that. The copy_extensions method goes over the module names that you are having problems with and copies all their data into the dir folder for you. Once you have that, to use the new media collector you just have to do something like the following:
cmdclass['py2exe'] = get_py2exe_extension()
That way the correct extension is used. You might need to tweak the code a little, but this should be a good starting point for what you need.
I encountered a similar problem with py2exe, and the only solution I could find was to use another tool to convert Python to an exe: PyInstaller.
It is a very easy tool to use and, more importantly, it works!
UPDATE
As I understand from your comments below, running your script from the command line does not work either, due to an import error. (My recommendation is to first check your code from the command line, and then try to convert it to an EXE.)
It looks like a PYTHONPATH problem.
PYTHONPATH is a list of paths (similar to the Windows PATH) that Python programs use to find imported modules.
If your script runs from your IDE, that means PYTHONPATH is set correctly in the IDE, so all imported modules are found.
To extend the module search path at run time (sys.path, which is initialised from PYTHONPATH) you can use:
import sys
sys.path.append(pathname)
or use the following code, which adds all folders under the path parameter to the search path:
import os
import sys
def add_tree_to_pythonpath(path):
    """
    Function: add_tree_to_pythonpath
    Description: Go over each directory in path and add it to PYTHONPATH
    Parameters: path - Parent path to start from
    Return: None
    """
    # Go over each directory and file in path
    for f in os.listdir(path):
        if f == ".bzr" or f.lower() == "dll":
            # Ignore bzr and dll directories (optional to NOT include specific folders)
            continue
        pathname = os.path.join(path, f)
        if os.path.isdir(pathname):
            # Add path to PYTHONPATH
            sys.path.append(pathname)
            # It is a directory, recurse into it
            add_tree_to_pythonpath(pathname)
        else:
            continue
def startup():
    """
    Function: startup
    Description: Startup actions needed before call to main function
    Parameters: None
    Return: None
    """
    parent_path = os.path.normpath(os.path.join(os.getcwd(), ".."))
    parent_path = os.path.normpath(os.path.join(parent_path, ".."))
    # Go over each directory in parent_path and add it to PYTHONPATH
    add_tree_to_pythonpath(parent_path)
    # Start the program
    main()
startup()
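For comparison, the same traversal can be written with os.walk instead of manual recursion. This is only a sketch of an alternative, not the code from the answer above: the function name add_tree_to_sys_path and the skip_names parameter are invented here, and unlike the original it matches all skipped names case-insensitively and avoids adding a directory twice.

```python
import os
import sys


def add_tree_to_sys_path(root, skip_names=(".bzr", "dll")):
    """Append every directory below root to sys.path; return the ones added."""
    added = []
    for current, dirnames, _filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d.lower() not in skip_names]
        for name in dirnames:
            full = os.path.join(current, name)
            if full not in sys.path:  # avoid duplicate entries on repeated calls
                sys.path.append(full)
                added.append(full)
    return added
```

Returning the list of added entries makes it easy to log or undo the change, which helps when an unexpected module shadows another one.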
The ImportError was fixed by following the suggestions from "Gil.I" and "Janne Karila": setting the Python path and using the includes option. But before that I had to create an __init__.py file in the win32 folder of both modules.
BTW I still got another error for the above script - link
