What is the most pythonic way to join or construct paths? - python

Is it using the os.path.join() method, or concatenating strings? Examples:
fullpath1 = os.path.join(dir, subdir)
fullpath2 = os.path.join(dir, "subdir")
fullpath3 = os.path.join("dir", subdir)
fullpath4 = os.path.join(os.path.join(dir, subdir1), subdir2)
etc
or
fullpath1 = dir + "\\" + subdir
fullpath2 = dir + "\\" + "subdir"
fullpath3 = "dir" + "\\" + subdir
fullpath4 = dir + "\\" + subdir1 + "\\" + subdir2
etc
Edit with some more info.
This is a disagreement between a colleague and me. He insists the second method is "purer", while I insist that using the built-in functions is actually "purer" because it is more pythonic and, of course, makes the path handling OS-independent.
We tried searching to see if this question had been answered before, either here on SO or elsewhere, but found nothing.

In my opinion (I know, no one asked), the most pythonic way is to use Path from pathlib:
import pathlib
folder = pathlib.Path('path/to/folder')
subfolder = folder / 'subfolder'
file = subfolder / 'file1.txt'
Please read up on pathlib for more useful functions. Ones I often use are folder.resolve() to get an absolute path, folder.exists() to check whether a folder exists, and subfolder.mkdir(parents=True, exist_ok=True) to create a new folder including its parents. Those are just a few examples; the module can do a lot more.
See https://docs.python.org/3/library/pathlib.html
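As a minimal sketch of the three calls mentioned above, the following runs them inside a temporary directory so that nothing is left behind. The folder names "a" and "b" are placeholders chosen for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # resolve() returns an absolute path with symlinks and ".." removed.
    folder = Path(tmp).resolve()
    subfolder = folder / "a" / "b"

    existed_before = subfolder.exists()  # nothing has been created yet

    # parents=True also creates the intermediate folder "a";
    # exist_ok=True makes a second call harmless instead of an error.
    subfolder.mkdir(parents=True, exist_ok=True)
    subfolder.mkdir(parents=True, exist_ok=True)

    existed_after = subfolder.exists()
    is_absolute = folder.is_absolute()

print(existed_before, existed_after, is_absolute)
```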

You can use the first method, os.path.join().
A second option is to use the pathlib module, as @DeepSpace suggested.
String concatenation is much worse and harder to read, so you shouldn't use it.
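To make the comparison concrete, here is a small sketch. It uses ntpath and posixpath (the Windows and POSIX implementations behind os.path) and the pure pathlib classes, so the output is the same whichever system runs it. The component names are placeholders.

```python
import ntpath      # the Windows flavour of os.path
import posixpath   # the POSIX flavour of os.path
from pathlib import PurePosixPath, PureWindowsPath

# Placeholder path components, for illustration only.
base, subdir1, subdir2 = "dir", "subdir1", "subdir2"

# join() accepts any number of components, so nested calls are unnecessary.
windows_style = ntpath.join(base, subdir1, subdir2)
posix_style = posixpath.join(base, subdir1, subdir2)

# The pathlib equivalent uses the / operator.
pure_windows = PureWindowsPath(base) / subdir1 / subdir2
pure_posix = PurePosixPath(base) / subdir1 / subdir2

print(windows_style)  # dir\subdir1\subdir2
print(posix_style)    # dir/subdir1/subdir2
```

In real code you would call os.path.join or pathlib.Path and let Python pick the right separator. Hard-coding "\\" fixes the result to the Windows form.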

Related

Add character in a link

I want to add characters to a file path.
The path is C:\Users\user\Documents\test.csv and I want to get C:\Users\user\Documents\test_new.csv.
As you can see, I added _new to the filename.
Should I extract the name with Path(path).name and then use a regex? What is the best way to do that?
Since you said you want to "add" _new rather than rename the file, here is a solution. It is tiny: just two lines of code apart from the variable and the call. It may look complex because I have compressed it into a single expression. You can also change the keyword and the extension through the function's arguments.
PATH = "C:\\User\\Folder\\file.csv"
def new_name(path, ext="csv", keyword="_new"):
    # Everything before the last backslash, then the file name without its extension, the keyword, and the extension.
    print('\\'.join(path.split("\\")[:-1]) + "\\" + path.split("\\")[-1].split(".")[0] + keyword + "." + ext)
new_name(PATH)
Here's a solution using the os module:
import os
path = r"C:\User\Folder\file.csv"
root, ext = os.path.splitext(path)
new_path = f'{root}_new{ext}'
And here's one using pathlib (with_stem requires Python 3.9 or later):
import pathlib
path = pathlib.Path(r"C:\User\Folder\file.csv")
new_path = str(path.with_stem(path.stem + '_new'))
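For interpreters older than Python 3.9, the same result can be sketched with with_name, which is available in earlier versions. The helper name add_to_stem is made up for this example, and PureWindowsPath is used so the Windows-style path parses the same way on any system.

```python
from pathlib import PureWindowsPath


def add_to_stem(path, keyword="_new"):
    """Return the path with keyword inserted before the last suffix.

    Hypothetical helper for illustration; not part of pathlib.
    """
    p = PureWindowsPath(path)
    # stem is the file name without its last suffix, suffix includes the dot.
    return p.with_name(p.stem + keyword + p.suffix)


new_path = add_to_stem(r"C:\Users\user\Documents\test.csv")
print(new_path)  # C:\Users\user\Documents\test_new.csv
```

Note that only the last suffix is split off, so a name such as archive.tar.gz would become archive.tar_new.gz.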

Using variable and strings to create name with posixpath

I want to create a path with python pathlib using a variable.
This is of course incorrect because it mixes a string and a PosixPath:
from pathlib import Path
stringvariable='aname'
Path(Path.cwd() / 'firstpartofname_' +stringvariable+ '.csv')
I know I could do it with os, or in two lines like this:
filename='firstpartofname_' + stringvariable + '.csv'
Path(Path.cwd() / filename)
but I want to learn how to use it directly with Path.
Thanks
You just need to add parentheses to force the + to happen before the /.
new = Path.cwd() / ('firstpartofname_' + stringvariable + '.csv')
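The reason is operator precedence: / binds more tightly than +, so without parentheses Python first builds a Path and then tries to add a string to it. A short sketch of both the failure and two working spellings follows; the f-string variant is an alternative suggested here, not part of the answer above.

```python
from pathlib import Path

stringvariable = "aname"

# Without parentheses this is (Path.cwd() / 'firstpartofname_') + ...,
# and a Path cannot be added to a str.
try:
    Path.cwd() / "firstpartofname_" + stringvariable + ".csv"
    failed = False
except TypeError:
    failed = True

# Parentheses force the string concatenation to happen first.
with_parens = Path.cwd() / ("firstpartofname_" + stringvariable + ".csv")

# An f-string builds the file name as a single operand.
with_fstring = Path.cwd() / f"firstpartofname_{stringvariable}.csv"

print(failed, with_parens == with_fstring)
```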

How to move images to different directory?

Very new to Python, so please bear with me. I would like to move only the contents of a directory if it already exists in the destination. Otherwise, I would like to move the entire directory. Cleaning up the input directory would be ideal too. Here is what I have so far; for some reason it isn't working:
#!/usr/bin/python
import sys, os, glob, shutil
in_dir = '/images_in/'
out_dir = '/images_out/'
new_dirs = os.listdir(in_dir)
old_dirs = os.listdir(out_dir)
# See if the directory already exists. If it doesn't, move the entire
# directory. If it does, move only the new images.
for dir in new_dirs:
    if dir not in old_dirs:
        shutil.move(dir, out_dir)
    else:
        new_images = glob.glob(in_dir + dir + '*.jpg')
        for i in new_images:
            shutil.move(i, out_dir + dir + i)
The problem is that when you do:
for i in new_images:
    shutil.move(i, out_dir + dir + i)
the target path is incorrect. i is the result of glob.glob on an absolute path, so prepending another absolute path to it is wrong. You have to use the base name of i instead.
I would do:
for i in new_images:
    shutil.move(i, os.path.join(out_dir, dir, os.path.basename(i)))
Aside:
Put old_dirs in a set so that lookup with in is faster: old_dirs = set(os.listdir(out_dir))
Use os.path.join instead of string concatenation when handling path parts (as I did in my solution). Ex: new_images = glob.glob(os.path.join(in_dir, dir, '*.jpg'))
dir is a built-in function that lists a module's contents, and you're shadowing it. Not a big concern, but better to avoid it.
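Putting those points together, a possible version of the whole loop is sketched below. The function name merge_image_dirs and its pattern parameter are made up for this example. It also passes the full source path to shutil.move, because os.listdir returns bare names relative to in_dir. It does not handle name clashes between files, and it does not clean up emptied input directories.

```python
import glob
import os
import shutil


def merge_image_dirs(in_dir, out_dir, pattern="*.jpg"):
    """Move new directories wholesale; for existing ones, move only images.

    Hypothetical helper for illustration, not the original poster's code.
    """
    existing = set(os.listdir(out_dir))  # set gives fast membership tests
    for name in os.listdir(in_dir):
        source = os.path.join(in_dir, name)
        if not os.path.isdir(source):
            continue  # only subdirectories are considered here
        if name not in existing:
            # Directory is new: move it as a whole.
            shutil.move(source, os.path.join(out_dir, name))
        else:
            # Directory exists: move matching files by their base name.
            for image in glob.glob(os.path.join(source, pattern)):
                target = os.path.join(out_dir, name, os.path.basename(image))
                shutil.move(image, target)
```

It would be called as merge_image_dirs('/images_in/', '/images_out/') with the paths from the question.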

How can I get the pathname of the folder two directories upstream of a file?

Using glob2 and os I would like the directory '/a/b/' given the file path '/a/b/c/xyz.txt'
I have been able to (recursively) move forward through directories using /* and /** in glob2, but not backwards through parent directories. I don't want to use regular expressions or split. Is there a simple way to do this using glob and/or os?
Why glob?
dir_path = file_path.split('/')
what_i_want = '/' + dir_path[1] + '/' + dir_path[2] + '/'
You can also do this by finding the index of the 3rd slash, using the return of each call as the "start" argument to the next.
third_slash = file_path.index('/', file_path.index('/', file_path.index('/')+1) +1)
what_i_want = file_path[:third_slash+1]
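If avoiding split and manual index arithmetic is the goal, the standard library can walk up the tree directly. The sketch below uses posixpath and PurePosixPath so the POSIX-style path from the question behaves the same on any system; with real files you would normally use os.path and pathlib.Path instead.

```python
import posixpath
from pathlib import PurePosixPath

file_path = "/a/b/c/xyz.txt"

# dirname() strips the last component, so two calls go up two levels.
with_os = posixpath.dirname(posixpath.dirname(file_path))

# pathlib exposes the same idea as .parent (or .parents[1]).
with_pathlib = PurePosixPath(file_path).parent.parent

# Neither result keeps a trailing slash; joining with "" appends one.
with_trailing_slash = posixpath.join(with_os, "")

print(with_os)              # /a/b
print(with_pathlib)         # /a/b
print(with_trailing_slash)  # /a/b/
```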

Working with relative paths

When I run the following script:
c:\Program Files\foo\bar\scripy.py
How can I refer to directory 'foo'?
Is there a convenient way of using relative paths?
I've done it before with the string module, but there must be a better way (I couldn't find it in os.path).
The os.path module includes various functions for working with paths like this. The convention in most operating systems is to use .. to go "up one level", so to get the parent directory you could do this:
import os
# The comments assume the script is run from its own directory.
current_dir = os.getcwd()  # find the current directory
print(current_dir)  # c:\Program Files\foo\bar
parent = os.path.join(current_dir, "..")  # construct a path to its parent
print(parent)  # c:\Program Files\foo\bar\..
normal_parent = os.path.normpath(parent)  # "normalize" the path
print(normal_parent)  # c:\Program Files\foo
# or on one line:
print(os.path.normpath(os.path.join(os.getcwd(), "..")))
os.path.dirname(path)
Returns the first element of the pair produced by os.path.split(path) (head, the directory, and tail, the file name). Put simply, it returns the directory the path is in. You'll need to call it twice, but this is probably the best way.
Python Docs on path functions:
http://docs.python.org/library/os.path#os.path.expanduser
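A brief sketch of calling dirname twice on the path from the question follows. It uses ntpath, the Windows implementation of os.path, so the backslash-separated path is parsed the same way on any system. In a real script you would more likely write os.path.dirname(os.path.dirname(os.path.abspath(__file__))).

```python
import ntpath  # the Windows implementation behind os.path

script = r"c:\Program Files\foo\bar\scripy.py"

# split() returns (head, tail); dirname() is just the head.
head, tail = ntpath.split(script)

bar_dir = ntpath.dirname(script)   # directory containing the script
foo_dir = ntpath.dirname(bar_dir)  # one level further up

print(bar_dir)  # c:\Program Files\foo\bar
print(foo_dir)  # c:\Program Files\foo
```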
I have recently started using the unipath library instead of os.path. Its object-oriented representations of paths are much simpler:
from unipath import Path
original = Path(__file__) # .absolute() # r'c:\Program Files\foo\bar\scripy.py'
target = original.parent.parent
print(target)  # Path(u'c:\\Program Files\\foo')
Path is a subclass of str so you can use it with standard filesystem functions, but it also provides alternatives for many of them:
print(target.isdir())  # True
numbers_dir = target.child('numbers')
print(numbers_dir.exists())  # False
numbers_dir.mkdir()
print(numbers_dir.exists())  # True
for n in range(10):
    file_path = numbers_dir.child('%s.txt' % (n,))
    file_path.write_file("Hello world %s!\n" % (n,), 'wt')
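For readers who prefer to stay within the standard library, roughly the same steps can be sketched with pathlib. This is an alternative suggested here, not part of the unipath answer, and it runs in a temporary directory rather than under Program Files. Unlike unipath's Path, pathlib.Path is not a str subclass, but since Python 3.6 most standard filesystem functions accept it directly.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp)                 # stands in for original.parent.parent
    numbers_dir = target / "numbers"   # pathlib's counterpart of child()

    existed_before = numbers_dir.exists()
    numbers_dir.mkdir()
    existed_after = numbers_dir.exists()

    for n in range(10):
        # write_text() opens, writes, and closes the file in one call.
        (numbers_dir / f"{n}.txt").write_text(f"Hello world {n}!\n")

    created = sorted(p.name for p in numbers_dir.iterdir())
    first_content = (numbers_dir / "0.txt").read_text()

print(existed_before, existed_after, len(created))
```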
This is a bit tricky. For instance, the following code:
import sys
import os
z = sys.argv[0]
p = os.path.dirname(z)
f = os.path.abspath(p)
print("argv[0]={0} , dirname={1} , abspath={2}\n".format(z, p, f))
gives this output on Windows
argv[0]=../zzz.py , dirname=.. , abspath=C:\Users\michael\Downloads
First of all, notice that argv[0] keeps the forward slash that I typed in the command python ../zzz.py, while the absolute path has the normal Windows backslashes. If you need to be cross-platform you should probably refrain from putting regular slashes on Python command lines, and use os.sep to refer to the character that separates pathname components.
So far I have only partly answered your question. There are a couple of ways to use the value of f to get what you want. Brute force is to use something like:
targetpath = f + os.sep + ".." + os.sep + ".."
which would result in something like C:\Users\michael\Downloads\..\.. on Windows and /home/michael/../.. on Unix. Each .. goes back one step and is the equivalent of removing the pathname component.
But you could do better by breaking up the path:
target = f.split(os.sep)
targetpath = os.sep.join(target[:-2])
and rejoining all but the last two bits to get C:\Users on Windows. On Unix, /home/michael leaves only the empty component before the leading slash, so the join yields an empty string rather than /. If you do this, it is a good idea to check that there are enough pathname components to remove.
Note that I ran the program above by typing python ../zzz.py. In other words, I was not in the same working directory as the script, so getcwd() would not be useful.
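One way to build in that check is sketched below using pathlib's parents sequence, which also keeps the root intact. The helper name ancestor is made up for this example, and the pure path classes are used so both the Windows and Unix cases can be run on any system. For a real script you might start from Path(sys.argv[0]).resolve().parent.

```python
from pathlib import PurePosixPath, PureWindowsPath


def ancestor(path, levels):
    """Return the directory `levels` steps above `path`.

    Hypothetical helper for illustration. Raises ValueError when the
    path does not have that many components to remove.
    """
    parents = path.parents
    if levels < 1 or levels > len(parents):
        raise ValueError(f"cannot go up {levels} level(s) from {path}")
    return parents[levels - 1]


print(ancestor(PureWindowsPath(r"C:\Users\michael\Downloads"), 2))  # C:\Users
print(ancestor(PurePosixPath("/home/michael"), 2))                  # /
```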
