How do I call data I included in a python package?


I have a python package with this file structure:
package
- bin
clean_spam_ratings.py
- spam_module
- data
spam_ratings.csv
__init__.py
spam_ratings_functions.py
Contents of clean_spam_ratings.py:
import spam_module
with open(path_to_spam_ratings_csv, 'r') as fin:
spam_module.spam_ratings_functions(fin)
What should I set path_to_spam_ratings_csv to?

If you are in a module, then you can get the absolute path for the directory that contains that module via:
os.path.dirname(__file__)
You can then use that to construct the path to your csv file. For example, if you are in spam_ratings_functions.py, use:
path_to_spam_ratings_csv = os.path.join(os.path.dirname(__file__), "..", "data", "spam_ratings.csv")
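If the CSV is shipped inside the installed package, the standard library's importlib.resources (files() needs Python 3.9+) is another way to reach it without touching __file__. The following is a minimal sketch, assuming data/ sits inside spam_module and is included as package data; read_package_text is a hypothetical helper name, not part of any library:

```python
from importlib.resources import files


def read_package_text(package, *parts, encoding="utf-8"):
    """Read a text resource that lives inside an importable package.

    `package` is the import name (assumed here: "spam_module") and
    `parts` are the path components below the package directory.
    """
    resource = files(package)
    for part in parts:
        # The object returned by files() supports "/" like pathlib.Path
        resource = resource / part
    return resource.read_text(encoding=encoding)


# Hypothetical usage from bin/clean_spam_ratings.py:
# csv_text = read_package_text("spam_module", "data", "spam_ratings.csv")
```

This looks the file up through the import system, so it does not depend on the current working directory or on where the calling script lives.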

Related

calling a python module that reads a file

My program imports a utils module that reads a file located in the same directory as utils. However, the utils function can be called from files in different directories.
Project
|
|-module_1:
|   __init__.py
|   file.py <--- calls utils.load_file()
|-module_2:
|   __init__.py
|   utils.py <---- load_file() path used 'file.txt'
|   file.txt
What is this called? I couldn't even search for it. I tried "package management", "expanding path", etc.

__file__ contains the path to the current file. Check it with print(__file__).
pathlib from Python's standard library can be used to construct an absolute path to the data file.
import pathlib
print(pathlib.Path(__file__))
print(pathlib.Path(__file__).parent)
print(pathlib.Path(__file__).parent / 'file.txt')
You can now open your file like this:
filepath = pathlib.Path(__file__).parent / 'file.txt'
with open(filepath) as f:
    for line in f:
        print(line)

python yaml path after deployment

So this is a question about how to handle settings files and relative paths in python (probably also something about best practice).
So I have coded a small project that I want to deploy to a Docker image, and everything is set up, except that when I try to run the Python task (through cron) I get the error: settings/settings.yml not found.
tree .
├───settings
│ └───settings/settings.yml
└───main.py
I am referencing the yml file like this:
with open('settings/settings.yml', 'r') as f:
    config = yaml.load(f, Loader=yaml.FullLoader)
I can see this is what is causing the problem but am unsure about how to fix it. I wish to reference the main file basically by using the entry_points from setuptools in the future so my quick fix with cd'ing before python main.py will not be a lasting solution.

Instead of hardcoding a path as a string, you can find the directories and build the file path with os.path. For example:
import os
import yaml
current_dir = os.path.dirname(os.path.abspath(__file__))
settings_dir = os.path.join(current_dir, "settings")
filename = "settings.yml"
settings_path = os.path.join(settings_dir, filename)
with open(settings_path, "r") as infile:
    # safe_load avoids the Loader argument that recent PyYAML requires for yaml.load
    settings_data = yaml.safe_load(infile)
This way it can be run in any file system and the python file can be called from any directory.
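To see why anchoring on the script location helps with cron, here is a small self-contained sketch. It builds a throwaway layout like the one in the question (a main.py next to settings/settings.yml; both are created in a temporary directory purely for illustration) and checks the two lookup styles from two different working directories. The names settings_path and demo are hypothetical:

```python
import os
import tempfile
from pathlib import Path


def settings_path(anchor_file):
    """Build <directory of anchor_file>/settings/settings.yml as an absolute path."""
    return Path(anchor_file).resolve().parent / "settings" / "settings.yml"


def demo():
    """Compare a cwd-relative lookup with an anchored lookup from two directories."""
    original_cwd = os.getcwd()
    results = []
    with tempfile.TemporaryDirectory() as project, tempfile.TemporaryDirectory() as elsewhere:
        # Hypothetical layout mirroring the question
        main_py = Path(project) / "main.py"
        main_py.write_text("")
        (Path(project) / "settings").mkdir()
        (Path(project) / "settings" / "settings.yml").write_text("debug: true\n")

        for workdir in (project, elsewhere):
            os.chdir(workdir)
            try:
                naive = Path("settings/settings.yml").exists()   # depends on cwd
                anchored = settings_path(main_py).exists()       # depends on main.py only
                results.append((naive, anchored))
            finally:
                os.chdir(original_cwd)
    return results


print(demo())
```

In a real main.py you would pass __file__ as the anchor, i.e. settings_path(__file__). The cwd-relative lookup only succeeds when the process happens to start in the project directory, which is typically not the case under cron.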

Opening a file with python in the same directory from different locations

I am currently accessing a script that opens a file in the directory it's located in. I am accessing this file from both the main.py file located in the same directory, as well as a testfile which is located in a "Test" subdirectory. Trying to use a file from the Test subdirectory to call the function that opens the file causes the script to try and open it from the Test directory instead of the super directory, since I am opening the file simply by calling it as following:
with open(filename,"w") as f:
Is there a way to define the location of the file in a way that makes it possible for the script opening it to be called from anywhere?

Use __file__ to get the path to the current script file, then find the file relative to that:
# In main.py: find the file in the same directory as this script
import os.path
open(os.path.join(os.path.dirname(__file__), 'file.txt'))
# In Test/test.py: find the file one directory up from this script
import os.path
open(os.path.join(os.path.dirname(__file__), '..', 'file.txt'))

Just give an absolute file path instead of a relative one, for example:
abs_path = '/home/user/project/file'
with open(abs_path, 'w') as f:
    f.write(data)

Try specifying the path:
import os
path = 'Your path'
path = os.path.abspath(path)
with open(path, 'w') as f:
    f.write(data)

From what I understood, your file is at parent_directory/file_name.txt
and is also accessed from another folder, parent_directory/sub_directory. All you have to do is paste the code below into the scripts in both the parent and the sub directory.
import os
file_name = 'your_file_name'
# If the file is in the current directory, use it directly
if file_name in os.listdir(os.getcwd()):
    path = file_name
# Otherwise, look for it in the parent of the current directory
else:
    parent_dir = os.path.abspath(os.path.join(os.getcwd(), os.pardir))
    path = os.path.join(parent_dir, file_name)
print('from', os.getcwd())
with open(path, 'r') as f:
    print(f.read())

Can't read csv file in same directory [duplicate]

Say I have a Python project that is structured as follows:
project
/data
test.csv
/package
__init__.py
module.py
main.py
__init__.py:
from .module import test
module.py:
import csv
with open("../data/test.csv") as f:
    test = [line for line in csv.reader(f)]
main.py:
import package
print(package.test)
When I run main.py I get the following error:
C:\Users\Patrick\Desktop\project>python main.py
Traceback (most recent call last):
File "main.py", line 1, in <module>
import package
File "C:\Users\Patrick\Desktop\project\package\__init__.py", line 1, in <module>
from .module import test
File "C:\Users\Patrick\Desktop\project\package\module.py", line 3, in <module>
with open("../data/test.csv") as f:
FileNotFoundError: [Errno 2] No such file or directory: '../data/test.csv'
However, if I run module.py from the package directory, I don’t get any errors. So it seems that the relative path used in open(...) is only relative to where the originating file is being run from (i.e __name__ == "__main__")? I don't want to use absolute paths. What are some ways to deal with this?

Relative paths are relative to the current working directory.
If you do not want your path to be relative, it must be absolute.
But there is an often-used trick to build an absolute path from the current script: use its __file__ special attribute:
import csv
from pathlib import Path
path = Path(__file__).parent / "../data/test.csv"
with path.open() as f:
    test = list(csv.reader(f))
This requires python 3.4+ (for the pathlib module).
If you still need to support older versions, you can get the same result with:
import csv
import os.path
my_path = os.path.abspath(os.path.dirname(__file__))
path = os.path.join(my_path, "../data/test.csv")
with open(path) as f:
    test = list(csv.reader(f))
[2020 edit: python3.4+ should now be the norm, so I moved the pathlib version inspired by jpyams' comment first]

For Python 3.4+:
import csv
from pathlib import Path
base_path = Path(__file__).parent
file_path = (base_path / "../data/test.csv").resolve()
with open(file_path) as f:
    test = [line for line in csv.reader(f)]

This worked for me.
with open('data/test.csv') as f:

My Python version is 3.5.2, and the solution proposed in the accepted answer didn't work for me. I still got the error
FileNotFoundError: [Errno 2] No such file or directory
when I ran my_script.py from the terminal, although it worked fine when I ran it through Run/Debug Configurations in the PyCharm IDE (PyCharm 2018.3.2 (Community Edition)).
Solution:
instead of using:
my_path = os.path.abspath(os.path.dirname(__file__)) + some_rel_dir_path
as suggested in the accepted answer, I used:
my_path = os.path.abspath(os.path.dirname(os.path.abspath(__file__))) + some_rel_dir_path
Explanation:
Changing os.path.dirname(__file__) to os.path.dirname(os.path.abspath(__file__))
solves the following problem:
When we run our script like this: python3 my_script.py
the __file__ variable holds just the string "my_script.py", without the path leading to that script. That is why dirname(__file__) returns an empty string "". That is also the reason why my_path = os.path.abspath(os.path.dirname(__file__)) + some_rel_dir_path is actually the same thing as my_path = some_rel_dir_path. Consequently, FileNotFoundError: [Errno 2] No such file or directory is raised when calling open, because there is no directory like "some_rel_dir_path".
Running the script from the PyCharm IDE Run/Debug Configurations worked because it runs the command python3 /full/path/to/my_script.py (where "/full/path/to" is specified by us in the "Working directory" variable in Run/Debug Configurations) instead of just python3 my_script.py, as is done when we run it from the terminal.

Try
with open(f"{os.path.dirname(sys.argv[0])}/data/test.csv", newline='') as f:

I was surprised when the following code worked.
import os
for file in os.listdir("../FutureBookList"):
    if file.endswith(".adoc"):
        filename, file_extension = os.path.splitext(file)
        print(filename)
        print(file_extension)
        continue
    else:
        continue
So, I checked the documentation and it says:
Changed in version 3.6: Accepts a path-like object.
path-like object:
An object representing a file system path. A path-like object is
either a str or...
I did a little more digging and the following also works:
with open("../FutureBookList/file.txt") as file:
    data = file.read()

Python: Check if data file exists relative to source code file

I have a small text (XML) file that I want a Python function to load. The location of the text file is always in a fixed relative position to the Python function code.
For example, on my local computer, the files text.xml and mycode.py could reside in:
/a/b/text.xml
/a/c/mycode.py
Later at run time, the files could reside in:
/mnt/x/b/text.xml
/mnt/x/c/mycode.py
How do I ensure I can load in the file? Do I need the absolute path? I see that I can use os.path.isfile, but that presumes I have a path.

You can do the following:
import os
BASE_DIR = os.path.dirname(os.path.realpath(__file__))
This gives you the directory that contains mycode.py.
Accessing the XML file, which lives in the sibling directory b, is then as simple as:
xml_file = os.path.join(BASE_DIR, "..", "b", "text.xml")
fin = open(xml_file, 'r+')
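Since the question also asks how to check that the file exists, the path construction and the os.path.isfile-style check can be combined. A minimal sketch using pathlib, assuming the /a/b/text.xml and /a/c/mycode.py layout from the question; locate_data_file is a hypothetical helper name:

```python
from pathlib import Path


def locate_data_file(anchor_file, *relative_parts):
    """Return the absolute path of a data file located relative to anchor_file.

    Raises FileNotFoundError if no regular file exists at the computed location.
    """
    anchor_dir = Path(anchor_file).resolve().parent
    candidate = anchor_dir.joinpath(*relative_parts).resolve()
    if not candidate.is_file():
        raise FileNotFoundError(f"expected data file at {candidate}")
    return candidate


# Hypothetical usage inside mycode.py:
# xml_path = locate_data_file(__file__, "..", "b", "text.xml")
```

Because only the relative position is encoded, the same call works whether the tree sits under /a or under /mnt/x.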

If the two directories always share the same parent directory, this should work:
import os
path_to_script = os.path.realpath(__file__)
# Go up two levels: from mycode.py to its directory, then to the shared parent
parent_directory = os.path.dirname(os.path.dirname(path_to_script))
for root, dirs, files in os.walk(parent_directory):
    for file in files:
        if file == 'text.xml':
            path_to_xml = os.path.join(root, file)

You can use the special variable __file__, which gives you the path of the current file (see http://docs.python.org/2/reference/datamodel.html).
So in your first example, you can reference text.xml this way in mycode.py:
xml_path = os.path.join(os.path.dirname(__file__), '..', 'b', 'text.xml')
