How do you automatically find the file path in python - python

I am looking to parse an excel data file and would like to make it so that my program automatically fills out the file path based on the file location of the current python file I am using.
For example, in the code
categorization_file = r'C:\Users\Name\Desktop\ExcelFile.xlsx'
inputVariables = categorization_file.parse(sheet_name='Control')
I would like the "r'C:\Users\Name\Desktop\" part to be automatically generated if possible. This path will be common with the file I am running my program from.
Thanks

import os
# to get the location of the current python file
basedir = os.path.dirname(os.path.abspath(__file__))
# to join it with the filename
categorization_file = os.path.join(basedir,'ExcelFile.xlsx')

The os module is what you're looking for.
import os
os.getcwd()

Use os.path.dirname like this:
import os
base_dir = os.path.dirname('C:\Users\Name\Desktop\ExcelFile.xlsx')
or even better:
import os
filepath = 'C:\Users\Name\Desktop\ExcelFile.xlsx'
base_dir = os.path.dirname(filepath)
In both cases, base_dir will now evaluate to 'C:\Users\Name\Desktop\'
Hope this helps!

This will give you the full path where the script is
import os
path = os.path.dirname(os.path.realpath(__file__))

Related

Finding a file in the same directory as the python script (txt file)

file = open(r"C:\Users\MyUsername\Desktop\PythonCode\configure.txt")
Right now this is what im using. However if people download the program on their computer the file wont link because its a specific path. How would i be able to link the file if its in the same folder as the script.
You can use __file__. Technically not every module has this attribute, but if you're not loading your module from a file, loading a text file from the same folder becomes a moot point anyway.
from os.path import abspath, dirname, join
with open(join(dirname(abspath(__file__)), 'configure.txt')):
...
While this will do what you're asking for, it's not necessarily the best way to store configuration.
Use os module to get your current filepath.
import os
this_dir, this_filename = os.path.split(__file__)
myfile = os.path.join(this_dir, 'configure.txt')
file = open(myfile)
# import the OS library
import os
# create the absolute location with correct OS path separator to file
config_file = os.getcwd() + os.path.sep + "configure.txt"
# Open file
file = open(config_file)
This method will make sure the correct path separators are used.

Reading all files in a folder with relative urls in both windows and linux

I can read a csv with relative path using below.
import pandas as pd
file_path = './Data Set/part-0000.csv'
df = pd.read_csv(file_path )
but when there are multiple files, I am using glob, File paths are mixed with forward and backward slash. thus unable to read file due to wrong path.
allPaths = glob.glob(path)
file path looks like below for path = "./Data Set/UserIdToUrl/*"
"./Data Set/UserIdToUrl\\part-0000.csv"
file path looks like below for path = ".\\Data Set\\UserIdToUrl\\*"
".\\Data Set\\UserIdToUrl\\part-0000.csv"
If i am using
normalPath = os.path.normpath(path)
normalPath is missing the relative ./ or .\\ like below.
'Data Set\UserIdToUrl\part-00000.csv'
Below could work, what is the best way to do it so that it work in both windows and linux?
".\\Data Set\\UserIdToUrl\\part-0000.csv"
or
"./Data Set/UserIdToUrl/part-0000.csv"
Please ask clarification question, if any. Thanks in advance for comments and answers.
More Info:
I guess the problem is only in windows but not in linux.
Below is shortest program to show issue. consider there are files in path './Data Set/UserIdToUrl/*' and it is correct as i can read file when providing path to file directly to pd.read_csv('./Data Set/UserIdToUrl/filename.csv').
import os
import glob
import pandas as pd
path = "./Data Set/UserIdToUrl/*"
allFiles = glob.glob(path)
np_array_list = []
for file_ in allFiles:
normalPath = os.path.normpath(file_)
print(file_)
print(normalPath)
df = pd.read_csv(file_,index_col=None, header=0)
np_array_list.append(df.as_matrix())
Update2
I just googled glob library. Its definition says 'glob — Unix style pathname pattern expansion'. I guess, I need some utility function that could work in both unix and windows.
you can use abspath
for file in os.listdir(os.path.abspath('./Data Set/')):
...: if file.endswith('.csv'):
...: df = pandas.read_csv(os.path.abspath(file))
Try this:
import pandas as pd
from pathlib import Path
dir_path = 'Data Set'
datas = []
for p in Path(dir_path).rglob('*.csv'):
df = pd.read_csv(p)
datas.append(df)

Creating a submodule that relies on a properties file

I have the following structure
main.py
module/
properties.yaml
file.py
file.py relevant code:
def read_properties():
with open('properties.yaml') as file:
properties = yaml.load(file)
main.py relevant code:
from module import file
file.read_properties()
When read_properties() is called within main.py, I get the following error: FileNotFoundError: [Errno 2] No such file or directory: 'properties.yaml'
What is the recommended way of allowing my module to access the properties file even when imported?
Provide the absolute path to properties.yaml:
with open('/Users/You/Some/Path/properties.yaml') as file:
As JacobIRR said in his answer, it is best to use the absolute path to the file. I use the os module to construct the absolute path based on the current working directory. So for your code it would be something like:
import os
working_directory = os.path.dirname(__file__)
properties_file = os.path.join(working_directory, 'module', 'properties.yaml')
Based on answers from #JacobIRR and #BigGerman
I ended up using pathlib instead of os, but the logic is the same.
Here is the syntax with pathlib for those interested:
in file.py:
from pathlib import Path
properties_file = Path(__file__).resolve().parent/"properties.yaml"
with open(properties_file) as file:
properties = yaml.load(file)

How to move down to a parent directory in Python?

The following code in Python gives me the current path.
import os
DIR = os.path.dirname(os.path.dirname(__file__))
How can I now use the variable DIR to go down one more directory? I don't want to change the value of DIR as it is used elsewhere.
I have tried this:
DIR + "../path/"
But it does not seems to work.
Call one more dirname:
os.path.join(os.path.dirname(DIR), 'path')
Try:
import os.path
print(os.path.abspath(os.path.join(DIR, os.pardir)))
When you join a path via '+' you have to add a 'r':
path = r'C:/home/' + r'user/dekstop'
or write double backslashes:
path = 'C://home//' + 'user//dekstop'
Anyway you should never use that!
That's the best way:
import os
path = os.path.join('C:/home/', 'user/dekstop')

How do I get the full path of the current file's directory?

How do I get the current file's directory path?
I tried:
>>> os.path.abspath(__file__)
'C:\\python27\\test.py'
But I want:
'C:\\python27\\'
The special variable __file__ contains the path to the current file. From that we can get the directory using either pathlib or the os.path module.
Python 3
For the directory of the script being run:
import pathlib
pathlib.Path(__file__).parent.resolve()
For the current working directory:
import pathlib
pathlib.Path().resolve()
Python 2 and 3
For the directory of the script being run:
import os
os.path.dirname(os.path.abspath(__file__))
If you mean the current working directory:
import os
os.path.abspath(os.getcwd())
Note that before and after file is two underscores, not just one.
Also note that if you are running interactively or have loaded code from something other than a file (eg: a database or online resource), __file__ may not be set since there is no notion of "current file". The above answer assumes the most common scenario of running a python script that is in a file.
References
pathlib in the python documentation.
os.path - Python 2.7, os.path - Python 3
os.getcwd - Python 2.7, os.getcwd - Python 3
what does the __file__ variable mean/do?
Using Path from pathlib is the recommended way since Python 3:
from pathlib import Path
print("File Path:", Path(__file__).absolute())
print("Directory Path:", Path().absolute()) # Directory of current working directory, not __file__
Note: If using Jupyter Notebook, __file__ doesn't return expected value, so Path().absolute() has to be used.
In Python 3.x I do:
from pathlib import Path
path = Path(__file__).parent.absolute()
Explanation:
Path(__file__) is the path to the current file.
.parent gives you the directory the file is in.
.absolute() gives you the full absolute path to it.
Using pathlib is the modern way to work with paths. If you need it as a string later for some reason, just do str(path).
Try this:
import os
dir_path = os.path.dirname(os.path.realpath(__file__))
import os
print(os.path.dirname(__file__))
I found the following commands return the full path of the parent directory of a Python 3 script.
Python 3 Script:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from pathlib import Path
#Get the absolute path of a Python3.6 and above script.
dir1 = Path().resolve() #Make the path absolute, resolving any symlinks.
dir2 = Path().absolute() #See #RonKalian answer
dir3 = Path(__file__).parent.absolute() #See #Arminius answer
dir4 = Path(__file__).parent
print(f'dir1={dir1}\ndir2={dir2}\ndir3={dir3}\ndir4={dir4}')
REMARKS !!!!
dir1 and dir2 works only when running a script located in the current working directory, but will break in any other case.
Given that Path(__file__).is_absolute() is True, the use of the .absolute() method in dir3 appears redundant.
The shortest command that works is dir4.
Explanation links: .resolve(), .absolute(), Path(file).parent().absolute()
USEFUL PATH PROPERTIES IN PYTHON:
from pathlib import Path
#Returns the path of the current directory
mypath = Path().absolute()
print('Absolute path : {}'.format(mypath))
#if you want to go to any other file inside the subdirectories of the directory path got from above method
filePath = mypath/'data'/'fuel_econ.csv'
print('File path : {}'.format(filePath))
#To check if file present in that directory or Not
isfileExist = filePath.exists()
print('isfileExist : {}'.format(isfileExist))
#To check if the path is a directory or a File
isadirectory = filePath.is_dir()
print('isadirectory : {}'.format(isadirectory))
#To get the extension of the file
fileExtension = mypath/'data'/'fuel_econ.csv'
print('File extension : {}'.format(filePath.suffix))
OUTPUT:
ABSOLUTE PATH IS THE PATH WHERE YOUR PYTHON FILE IS PLACED
Absolute path : D:\Study\Machine Learning\Jupitor Notebook\JupytorNotebookTest2\Udacity_Scripts\Matplotlib and seaborn Part2
File path : D:\Study\Machine Learning\Jupitor Notebook\JupytorNotebookTest2\Udacity_Scripts\Matplotlib and seaborn Part2\data\fuel_econ.csv
isfileExist : True
isadirectory : False
File extension : .csv
works also if __file__ is not available (jupyter notebooks)
import sys
from pathlib import Path
path_file = Path(sys.path[0])
print(path_file)
Also uses pathlib, which is the object oriented way of handling paths in python 3.
IPython has a magic command %pwd to get the present working directory. It can be used in following way:
from IPython.terminal.embed import InteractiveShellEmbed
ip_shell = InteractiveShellEmbed()
present_working_directory = ip_shell.magic("%pwd")
On IPython Jupyter Notebook %pwd can be used directly as following:
present_working_directory = %pwd
I have made a function to use when running python under IIS in CGI in order to get the current folder:
import os
def getLocalFolder():
path=str(os.path.dirname(os.path.abspath(__file__))).split(os.sep)
return path[len(path)-1]
Python 2 and 3
You can simply also do:
from os import sep
print(__file__.rsplit(sep, 1)[0] + sep)
Which outputs something like:
C:\my_folder\sub_folder\
This can be done without a module.
def get_path():
return (__file__.replace(f"<your script name>.py", ""))
print(get_path())

Categories

Resources