How to format the beginning of a loop correctly? - python

I have a program that I designed for both myself and my colleague to use, with all the data stored in a directory. However, I want to set up the loop so that it works for both of us. I tried both of these:
file_location = glob.glob('/../*.nc')
file_location = glob('/../*.nc')
But neither of them picks up any files. How can I fix this?

You can get a directory relative to a user's home (called ~ in the function call) using os.path.expanduser(). In your case, the line would be
file_location = glob.glob(os.path.expanduser('~/Dropbox/Argo/Data/*.nc'))

It is usually good practice not to hardcode paths. If you're going to use your paths for other tasks that need well-formed paths (e.g. subprocess, writing paths to shell scripts), I'd recommend managing paths with the os.path module instead, for example:
import os, glob
home_path = os.path.expanduser("~")
dropbox_path = os.path.join(home_path, "Dropbox")
good_paths = glob.glob(os.path.join(dropbox_path,"Argo","Data","*.nc"))
bad_paths = glob.glob(dropbox_path+"/Argo\\Data/*.nc")
print(len(good_paths) == len(bad_paths))
print(all([os.path.exists(p) for p in good_paths]))
print(all([os.path.exists(p) for p in bad_paths]))
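The example compares badly formed and well-formed paths. On Windows both will work, because either separator is accepted there, but good_paths is more flexible and portable in the long term. On POSIX systems the backslash in bad_paths is treated as an ordinary filename character, so that pattern would not match.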
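The two answers above can be combined into one small helper. This is only a sketch: the function name find_nc_files and its parameters are made up for illustration, and the Dropbox/Argo/Data layout is taken from the answer above.

```python
import glob
import os


def find_nc_files(base="~", *subdirs):
    """Return the sorted .nc files under base/subdirs, expanding a leading ~."""
    # expanduser turns "~" into the current user's home directory,
    # so the same call works for every user of the script.
    root = os.path.expanduser(base)
    pattern = os.path.join(root, *subdirs, "*.nc")
    return sorted(glob.glob(pattern))


# Layout assumed from the answer above; returns [] if the folder is missing.
file_location = find_nc_files("~", "Dropbox", "Argo", "Data")
```

Because the home directory is resolved at run time, neither user has to edit the path before running the loop.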

Related

python: transforming os.Path code into Pathlib code

I have the following function in Python that adds a dict as a row to a pandas DataFrame and also takes care of creating an initial empty DataFrame if one does not exist yet.
I use the os library, but I would like to switch to pathlib: a software developer at my company told me I should use pathlib rather than os.path for this. (As an aside, I don't have a CS background.)
def myfunc(dictt, filename, folder='', extension='csv'):
    if folder == '':
        folder = os.getcwd()  #---> folder = Path.cwd()
    filename = filename + '.' + 'csv'
    total_file = os.path.join(folder, filename)  #<--- this is what I don't get translated
    # check if file exists, otherwise create it
    if not os.path.isfile(total_file):  #<----- if total file is a Path object: totalfile.exists()
        df_empty = pd.DataFrame()
        if extension == 'csv':
            df_empty.to_csv(total_file)
        elif extension == 'pal':
            df_empty.to_pkl(total_file)
        else:
            # raise error
            pass
    # code to append the dict as row
    # ...
First I don't understand why pathlib is supposed to be better, and secondly I don't understand how to translate the line mentioned above, i.e. how to really do os.path.join(folder_path, filename) with pathlib notation.
In pathlib there seem to be different approaches for Windows and other machines, and also I don't see an explanation as to what is a posix path.
Can anyone help me with those two lines?
Insights as to why to use pathlib instead of os.path are welcome.
Thanks
First I don't understand why pathlib is supposed to be better.
pathlib provides an object-oriented interface to the same functionality os.path gives. There is nothing inherently wrong about using os.path. We (the python community) had been using os.path happily before pathlib came on the scene.
However, pathlib does make life simpler. Firstly, as mentioned in the comment by Henry Ecker, you're dealing with path objects, not strings, so you have less error checking to do after a path has been constructed, and secondly, the path objects' instance methods are right there to be used.
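To give a feel for what "instance methods right there" means, here is a small illustration. The path itself is hypothetical, and PurePosixPath is used only so the results look the same on every operating system; in real code you would normally use pathlib.Path.

```python
from pathlib import PurePosixPath

# Hypothetical path, used only to show the attributes.
p = PurePosixPath("/home/user/data/results.csv")

name = p.name                    # 'results.csv'
stem = p.stem                    # 'results'
suffix = p.suffix                # '.csv'
parent = p.parent                # PurePosixPath('/home/user/data')
renamed = p.with_suffix(".pkl")  # PurePosixPath('/home/user/data/results.pkl')
```

With os.path, each of these would be a separate function call on a string (os.path.basename, os.path.splitext, os.path.dirname and so on).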
Can anyone help me with those two lines?
Using your example:
import pathlib as pl
import pandas as pd
def mypathlibfunc(dictt, filename, folder='', extension='csv'):
    if folder == '':
        folder = pl.Path.cwd()
    else:
        folder = pl.Path(folder)
    total_file = folder / f'{filename}.{extension}'
    if not total_file.exists():
        # do your thing
        df_empty = pd.DataFrame()
        if extension == 'csv':
            df_empty.to_csv(total_file)
        elif extension == 'pal':
            df_empty.to_pickle(total_file)
Notes:
If your function is called with folder != '', a Path object is built from it. This ensures that folder has a consistent type in the rest of the function.
Child Path objects can be constructed using the division operator /, which is what I did for total_file. I didn't actually need to wrap f'{filename}.{extension}' in a Path object. Pretty neat!
The pandas.DataFrame.to_[filetype] methods all accept a Path object in addition to a path string, so you don't have to worry about modifying that part of your code.
In pathlib there seem to be different approaches for Windows and other machines, and also I don't see an explanation as to what is a posix path
If you use the Path object, your code will be cross-platform, and you needn't worry about Windows & POSIX paths.
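For the curious, the difference between the two flavours can be seen with the "pure" path classes, which only manipulate path strings and never touch the file system. The directory and file names below are made up for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same join, rendered in each flavour.
posix = PurePosixPath("data") / "run1" / "out.csv"
windows = PureWindowsPath("data") / "run1" / "out.csv"

print(posix)    # data/run1/out.csv
print(windows)  # data\run1\out.csv
```

pathlib.Path simply picks the flavour that matches the machine the code runs on, which is why you rarely need to name either class yourself.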

Make glob directory variable

I'm trying to write a Python script that searches a folder for all files with the .txt extension. In the manuals, I have only seen it hardcoded into glob.glob("hardcoded path").
How do I make the directory that glob searches for patterns a variable? Specifically: A user input.
This is what I tried:
import glob
input_directory = input("Please specify input folder: ")
txt_files = glob.glob(input_directory+"*.txt")
print(txt_files)
Despite giving the right directory with the .txt files, the script prints an empty list [ ].
If you are not sure whether a path contains a separator symbol at the end (usually '/' or '\'), you can concatenate using os.path.join. This is a much more portable method than appending your local OS's path separator manually, and much shorter than writing a conditional to determine if you need to every time:
import glob
import os
input_directory = input('Please specify input folder: ')
txt_files = glob.glob(os.path.join(input_directory, '*.txt'))
print(txt_files)
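A quick way to see why the concatenated pattern in the question finds nothing is to look at the strings themselves. The folder name below is a hypothetical user input without a trailing separator.

```python
import os

folder = "notes"  # hypothetical input, no trailing separator

# Plain concatenation glues the wildcard onto the folder name.
concatenated = folder + "*.txt"

# os.path.join inserts exactly one separator, whether or not
# the input already ends with one.
joined = os.path.join(folder, "*.txt")
joined_with_sep = os.path.join(folder + os.sep, "*.txt")

print(concatenated)  # notes*.txt
print(joined)
print(joined_with_sep)
```

The concatenated pattern looks for names starting with "notes" in the current directory, which is why glob returned an empty list.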
For Python 3.4+, you can use pathlib.Path.glob() for this:
import pathlib
input_directory = pathlib.Path(input('Please specify input folder: '))
if not input_directory.is_dir():
    # Input is invalid. Bail or ask for a new input.
    raise SystemExit('Not a directory: ' + str(input_directory))
for file in input_directory.glob('*.txt'):
    # Do something with file.
    print(file)
There is a time of check to time of use race between the is_dir() and the glob, which unfortunately cannot be easily avoided because glob() just returns an empty iterator in that case. On Windows, it may not even be possible to avoid because you cannot open directories to get a file descriptor. This is probably fine in most cases, but could be a problem if your application has a different set of privileges from the end user or from other applications with write access to the parent directory. This problem also applies to any solution using glob.glob(), which has the same behavior.
Finally, Path.glob() returns an iterator, and not a list. So you need to loop over it as shown, or pass it to list() to materialize it.

How to access file in parent directory using python?

I am trying to access a text file in the parent directory,
e.g. the Python script is in codeSrc and the text file is in mainFolder.
script path:
G:\mainFolder\codeSrc\fun.py
desired file path:
G:\mainFolder\foo.txt
I am currently using this syntax with python 2.7x,
import os
filename = os.path.dirname(os.getcwd())+"\\foo.txt"
Although this works fine, is there a better (prettier :P) way to do this?
While your example works, it is maybe not the nicest, and neither is mine, probably. Anyhow, os.path.dirname() is probably meant for strings where the final part is already a filename. It uses os.path.split(), which returns an empty tail if the path ends with a slash, so dirname() then only strips the slash instead of going up one level. So this can potentially go wrong. Moreover, as you are already using os.path, I'd also use it to join paths, which makes the code platform independent as well. I'd write
os.path.join( os.getcwd(), '..', 'foo.txt' )
...and concerning the readability of the code, here (as in the post using the environ module) it becomes evident immediately that you go one level up.
To get a path to a file in the parent directory of the current script you can do:
import os
file_path = os.path.join(os.path.dirname(os.path.dirname(__file__)), 'foo.txt')
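For Python 3.4+, the same thing can be sketched with pathlib. The helper name file_in_parent_dir is made up for illustration, and the example path mirrors the folder names from the question.

```python
from pathlib import Path


def file_in_parent_dir(script_path, filename):
    """Return the path of `filename` one directory above the script's folder."""
    script_dir = Path(script_path).parent  # e.g. .../mainFolder/codeSrc
    return script_dir.parent / filename    # e.g. .../mainFolder/foo.txt


# In a real script you would pass __file__ (ideally Path(__file__).resolve()).
example = file_in_parent_dir("/mainFolder/codeSrc/fun.py", "foo.txt")
```

Unlike the os.getcwd() variants, this depends on where the script lives and not on the directory it was started from.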
You can try this
import environ
environ.Path() - 1 + 'foo.txt'
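To get the parent dir, the code below will help you (note that os.getcwd() has to come first, because os.path.join discards everything before an absolute component):
import os
os.path.abspath(os.path.join(os.getcwd(), '..'))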

How to get the path of a program in python?

I'm writing a program in which Chimera needs to be opened. I'm opening it with:
def generate_files_bat(filename):
    f = open(filename, 'w')
    text = """echo off SET PATH=%PATH%;"C:\\Program Files (x86)\\Chimera 1.6.1\\bin" chimera colpeps.cmd"""
    print >>f, text
    f.close()
But I need to find Chimera regardless of which computer the Python program is running on. Is there any way the program can search for the path on any computer?
Generally speaking, I don't think it is such a good idea to search the path for a program. Imagine, for example, that two different versions were installed on the machine. Are you sure you would find the right one? Maybe a configuration file parsed with the standard module ConfigParser (configparser in Python 3) would be a better option?
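A minimal sketch of the configuration-file idea, using the Python 3 module name configparser. The section name chimera and the key bin_dir are made up for illustration; the directory is the one from the question. In practice the text would live in a file and be loaded with config.read().

```python
import configparser

# Hypothetical config contents; normally stored in a file such as an .ini
# that each user edits once for their own machine.
SAMPLE_INI = r"""
[chimera]
bin_dir = C:\Program Files (x86)\Chimera 1.6.1\bin
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_INI)

# The installation directory is now data, not code.
chimera_bin = config["chimera"]["bin_dir"]
print(chimera_bin)
```

This way the user decides which installation is used, which avoids the two-versions problem mentioned above.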
Anyway, to go back to your question, in order to find a file or directory, you could try os.walk, which recursively walks through a directory tree.
Here is an example invoking os.walk from a generator, allowing you to collect either the first or all matching file names. Please note that the generator result is only based on file name. If you require more advanced filtering (say, to only keep executable files), you will probably use something like os.stat() to extend the test.
import os
def fileInPath(name, root):
    for base, dirs, files in os.walk(root):
        if name in files:
            yield os.path.join(base, name)
print("Search for only one result:")
print(next(fileInPath("python", "/home/sylvain")))
print("Display all matching files:")
print([i for i in fileInPath("python", "/home/sylvain")])
There is which for Linux and where for Windows. They both give you the path to the executable, provided it lies in a directory that is 'searched' by the console (so it has to be in %PATH% in case of Windows)
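From Python 3.3 on, the standard library offers the same lookup as shutil.which, so there is no need to shell out. A small sketch follows; the wrapper name locate_program is made up, and "chimera" is assumed to be the executable name from the question.

```python
import shutil


def locate_program(name):
    """Return the full path of `name` if it is found on PATH, else None."""
    return shutil.which(name)


chimera_path = locate_program("chimera")
if chimera_path is None:
    print("chimera was not found on PATH")
else:
    print("chimera found at", chimera_path)
```

As with which and where, this only finds programs whose directory is already listed in PATH.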
There is a package called Unipath that does elegant, clean path calculations.
Have a look at the AbstractPath constructor.
Example:
from unipath import Path
prom_dir = Path(__file__)

How can I find path to given file?

I have a file, for example "something.exe", and I want to find the path to this file.
How can I do this in python?
Perhaps os.path.abspath() would do it:
import os
print(os.path.abspath("something.exe"))
If your something.exe is not in the current directory, you can pass any relative path and abspath() will resolve it.
Use os.path.abspath to get a normalized, absolute version of the pathname.
Use os.walk to find its location.
import os
exe = 'something.exe'
# if the exe is in the current directory
print(os.path.abspath(exe))
# output
# D:\python\note\something.exe
# if we need to find it first
for root, dirs, files in os.walk(r'D:\python'):
    for name in files:
        if name == exe:
            print(os.path.abspath(os.path.join(root, name)))
# output
# D:\python\note\something.exe
If you absolutely do not know where it is, the only way is to search for it starting from the root, c:\
import os
for r, d, f in os.walk("c:\\"):
    for files in f:
        if files == "something.exe":
            print(os.path.join(r, files))
Otherwise, if you know that there are only a few places where you store your exe, like your system32, then start searching from there. You can also make use of os.environ["PATH"] if you always put your .exe in one of the directories in your PATH variable.
for p in os.environ["PATH"].split(os.pathsep):
    for r, d, f in os.walk(p):
        for files in f:
            if files == "something.exe":
                print(os.path.join(r, files))
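If the file is expected to sit directly in one of the PATH directories, walking each of them recursively is not needed. A lighter sketch is shown here; the function name find_on_path and its path_value parameter are made up for illustration.

```python
import os


def find_on_path(filename, path_value=None):
    """Return every PATH entry that directly contains `filename`."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    matches = []
    # os.pathsep is ';' on Windows and ':' on POSIX systems.
    for directory in path_value.split(os.pathsep):
        candidate = os.path.join(directory, filename)
        # Skip empty PATH entries and anything that is not a regular file.
        if directory and os.path.isfile(candidate):
            matches.append(candidate)
    return matches


print(find_on_path("something.exe"))
```

This checks one candidate per directory instead of listing whole directory trees, so it is usually much faster than the os.walk version.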
Just to mention, another option to achieve this task could be the subprocess module, to help us execute a command in terminal, like this:
import subprocess
command = "find"
directory = "/Possible/path/"
flag = "-iname"
file = "something.foo"
args = [command, directory, flag, file]
process = subprocess.run(args, stdout=subprocess.PIPE)
path = process.stdout.decode().strip("\n")
print(path)
With this we emulate passing the following command to the terminal:
find /Possible/path/ -iname "something.foo"
After that, given that the stdout attribute is a bytes object, we need to decode it and remove the trailing "\n" character.
I tested it with the %timeit magic in spyder, and the performance is 0.3 seconds slower than the os.walk() option.
I noticed that you are on Windows, so you may want to look for a command that behaves similarly to find on Unix.
Finally, if you have several files with the same name in different directories, the resulting string will contain all of them. Consequently, you need to deal with that appropriately, maybe using regular expressions.
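Since find prints one match per line, splitting on line breaks is usually enough to handle several matches, without regular expressions. A sketch follows; the helper name parse_find_output and the sample bytes are made up for illustration.

```python
def parse_find_output(raw):
    """Split the bytes written by `find` into a list of paths."""
    # One path per line; empty lines are dropped.
    return [line for line in raw.decode().splitlines() if line]


# Hypothetical output for two files with the same name.
sample = b"/Possible/path/a/something.foo\n/Possible/path/b/something.foo\n"
paths = parse_find_output(sample)
print(paths)
```

In the code above you would pass process.stdout to the helper instead of the sample bytes.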
This is a really old thread, but this might be useful to someone who stumbles across it. In Python 3, there is a module called glob that takes Unix shell-style wildcard patterns (not egrep-style regular expressions) and returns paths in the form appropriate for the system (i.e. Unix/Linux and Windows).
https://docs.python.org/3/library/glob.html
Example usage would be:
import glob
results = glob.glob('./**/FILE_NAME', recursive=True)
Then you get a list of matches in the results variable. Note that ** only matches across nested directories when recursive=True is passed (Python 3.5+).
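As a sketch of how this could be wrapped for reuse: in Python 3.5+, "**" only descends into subdirectories when recursive=True is passed to glob.glob. The function name find_by_name is made up for illustration.

```python
import glob
import os


def find_by_name(root, filename):
    """Return all paths under `root`, at any depth, whose name is `filename`."""
    pattern = os.path.join(root, "**", filename)
    # Without recursive=True, "**" behaves like a plain "*".
    return sorted(glob.glob(pattern, recursive=True))


print(find_by_name(".", "something.exe"))
```

The pattern also matches a file sitting directly in root, because "**" can stand for zero directories.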
Uh... This question is a bit unclear.
What do you mean "have"? Do you have the name of the file? Have you opened it? Is it a file object? Is it a file descriptor? What???
If it's a name, what do you mean with "find"? Do you want to search for the file in a bunch of directories? Or do you know which directory it's in?
If it is a file object, then you must have opened it, reasonably, and then you know the path already, although you can get the filename from fileob.name too.
