str = "a\b\c\dsdf\matchthis\erwe.txt"
The last folder name.
Match "matchthis"
Without using regex, just do:
>>> import os
>>> my_str = "a/b/c/dsdf/matchthis/erwe.txt"
>>> my_dir_path = os.path.dirname(my_str)
>>> my_dir_path
'a/b/c/dsdf/matchthis'
>>> my_dir_name = os.path.basename(my_dir_path)
>>> my_dir_name
'matchthis'
Better to use os.path.split(path), which splits on the separator of the platform the code runs on (so a backslash path like this one is only split on Windows). You'll have to call it twice to get the final directory:
import os
path_file = r"a\b\c\dsdf\matchthis\erwe.txt"
path, filename = os.path.split(path_file)
path, dirname = os.path.split(path)
>>> str = "a\\b\\c\\dsdf\\matchthis\\erwe.txt"
>>> str.split("\\")[-2]
'matchthis'
x = r"a\b\c\d\match\something.txt"
match = x.split('\\')[-2]
>>> import re
>>> print(re.match(r".*\\(.*)\\[^\\]*", r"a\b\c\dsdf\matchthis\erwe.txt").groups())
('matchthis',)
As #chrisaycock and #rafe-kettler pointed out, use x.split('\\') if you can. It is much faster, more readable, and more Pythonic. Use a regex only if you really need one.
EDIT:
Actually, os.path is best. Platform independent. unix/windows etc.
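One caveat on platform independence: os.path follows the separator rules of the machine the code runs on, so a backslash-separated string is not split on Linux or macOS. A minimal sketch, assuming the input is always a Windows-style path, uses pathlib.PureWindowsPath, which parses backslashes on any platform (the variable names are illustrative):

```python
from pathlib import PureWindowsPath

# Raw string so that "\b" is not read as a backspace escape.
path_file = r"a\b\c\dsdf\matchthis\erwe.txt"

# PureWindowsPath applies Windows separator rules on any operating system.
last_folder = PureWindowsPath(path_file).parent.name
print(last_folder)  # matchthis
```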
I would like to get the basename of a tar.gz file in python.
So from "foo/bar/alice.tar.gz" I want alice.
What I have so far is:
from pathlib import Path
url = "foo/bar/alice.tar.gz"
name = Path(Path(url).stem).stem
print(name)
# alice
Is there a smoother way to do this? What if my url is something like "foo/bar/alice.tar.gz.tar.gz.tar.gz"?
Thanks in advance.
I think this will do the job:
>>> import os
>>> base=os.path.basename("foo/bar/alice.tar.gz")
>>> base
'alice.tar.gz'
>>> base.split('.')
['alice', 'tar', 'gz']
>>> name = base.split('.')[0]
>>> name
'alice'
You can use str.partition():
result = Path(url).stem.partition('.')[0]
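Both approaches above cut at the first dot, so a name such as my.data.tar.gz would come back as my. A hedged alternative, assuming you only want to drop a known set of archive extensions, is to peel suffixes off one at a time. The helper name strip_known_suffixes and the KNOWN_SUFFIXES set are illustrative, not part of any library:

```python
from pathlib import PurePosixPath

# Assumption: these are the only extensions that should be stripped.
KNOWN_SUFFIXES = {".tar", ".gz", ".bz2", ".xz", ".zip"}

def strip_known_suffixes(url):
    """Return the file name with all trailing known archive suffixes removed."""
    name = PurePosixPath(url).name
    while True:
        candidate = PurePosixPath(name)
        if candidate.suffix in KNOWN_SUFFIXES:
            # Drop the last suffix and look again.
            name = candidate.stem
        else:
            return name

print(strip_known_suffixes("foo/bar/alice.tar.gz"))                # alice
print(strip_known_suffixes("foo/bar/alice.tar.gz.tar.gz.tar.gz"))  # alice
print(strip_known_suffixes("foo/bar/my.data.tar.gz"))              # my.data
```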
In python, suppose I have a path like this:
/folderA/folderB/folderC/folderD/
How can I get just the folderD part?
Use os.path.normpath, then os.path.basename:
>>> os.path.basename(os.path.normpath('/folderA/folderB/folderC/folderD/'))
'folderD'
The first strips off any trailing slashes, the second gives you the last part of the path. Using only basename gives everything after the last slash, which in this case is ''.
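To see the difference side by side, here is a small sketch. It uses posixpath (the POSIX flavour of os.path) so the result is the same on every platform; the variable names are illustrative:

```python
import posixpath

path = '/folderA/folderB/folderC/folderD/'

# basename alone returns whatever follows the last slash: an empty string here.
without_norm = posixpath.basename(path)

# normpath removes the trailing slash first, so basename sees 'folderD'.
with_norm = posixpath.basename(posixpath.normpath(path))

print(repr(without_norm))  # ''
print(repr(with_norm))     # 'folderD'
```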
With python 3 you can use the pathlib module (pathlib.PurePath for example):
>>> import pathlib
>>> path = pathlib.PurePath('/folderA/folderB/folderC/folderD/')
>>> path.name
'folderD'
If you want the last folder name where a file is located:
>>> path = pathlib.PurePath('/folderA/folderB/folderC/folderD/file.py')
>>> path.parent.name
'folderD'
You could do
>>> import os
>>> os.path.basename('/folderA/folderB/folderC/folderD')
'folderD'
UPDATE1: This approach breaks if you give it /folderA/folderB/folderC/folderD/xx.py, because it returns xx.py as the basename, which is probably not what you want. So you could check that the path is a directory first:
>>> import os
>>> path = "/folderA/folderB/folderC/folderD"
>>> if os.path.isdir(path):
...     dirname = os.path.basename(path)
...
UPDATE2: As lars pointed out, here is a change to accommodate a trailing '/'.
>>> from os.path import normpath, basename
>>> basename(normpath('/folderA/folderB/folderC/folderD/'))
'folderD'
Here is my approach:
>>> import os
>>> print(os.path.basename(
...     os.path.dirname('/folderA/folderB/folderC/folderD/test.py')))
folderD
>>> print(os.path.basename(
...     os.path.dirname('/folderA/folderB/folderC/folderD/')))
folderD
>>> print(os.path.basename(
...     os.path.dirname('/folderA/folderB/folderC/folderD')))
folderC
I was searching for a way to get the name of the last folder that contains a file, and I just used split twice to get the right part. It's not exactly the question, but Google sent me here.
import os
pathname = "/folderA/folderB/folderC/folderD/filename.py"
head, tail = os.path.split(os.path.split(pathname)[0])
print(head + " " + tail)
I like the parts method of Path for this:
grandparent_directory, parent_directory, filename = Path(export_filename).parts[-3:]
log.info(f'{t: <30}: {num_rows: >7} Rows exported to {grandparent_directory}/{parent_directory}/{filename}')
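If you use the standard-library pathlib module, it's really simple.
>>> from pathlib import Path
>>> your_path = Path("/folderA/folderB/folderC/folderD/")
>>> your_path.name
'folderD'
(.stem also returns 'folderD' here, but it would drop everything after the last dot in a directory name such as folder.v2.)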
Suppose you have the path to a file in folderD.
>>> from pathlib import Path
>>> your_path = Path("/folderA/folderB/folderC/folderD/file.txt")
>>> your_path.name
'file.txt'
>>> your_path.parent.name
'folderD'
In my current projects, I often pass the trailing parts of a path to a function, so I use pathlib.Path. To get the n-th part counting from the end, I use:
from typing import Union
from pathlib import Path
def get_single_subpath_part(base_dir: Union[Path, str], n: int) -> str:
    if n == 0:
        return Path(base_dir).name
    for _ in range(n):
        base_dir = Path(base_dir).parent
    return base_dir.name
path = "/folderA/folderB/folderC/folderD/"
# for getting the last part:
print(get_single_subpath_part(path, 0))
# yields "folderD"
# for the second last
print(get_single_subpath_part(path, 1))
# yields "folderC"
Furthermore, to get the n-th part from the end together with everything after it as a single path, I use:
from typing import Union
from pathlib import Path
def get_n_last_subparts_path(base_dir: Union[Path, str], n: int) -> Path:
    return Path(*Path(base_dir).parts[-n-1:])
path = "/folderA/folderB/folderC/folderD/"
# for getting the last part:
print(get_n_last_subparts_path(path, 0))
# yields a `Path` object of "folderD"
# for second last and last part together
print(get_n_last_subparts_path(path, 1))
# yields a `Path` object of "folderC/folderD"
Note that this function returns a Path object, which can easily be converted to a string (e.g. str(path)).
path = "/folderA/folderB/folderC/folderD/"
last = path.rstrip('/').split('/').pop()
path = "/folderA/folderB/folderC/folderD/"
print(path.split("/")[-2])
I want to insert a directory name in the middle of a given file path, like this:
directory_name = 'new_dir'
file_path0 = 'dir1/dir2/dir3/dir4/file.txt'
file_path1 = some_func(file_path0, directory_name, position=2)
print(file_path1)
>>> 'dir1/dir2/new_dir/dir3/dir4/file.txt'
I looked through the os.path and pathlib packages, but it looks like they don't have a function that allows for inserting in the middle of a file path. I tried:
import sys,os
from os.path import join
path_ = file_path0.split(os.sep)
path_.insert(2, 'new_dir')
print(join(path_))
but this results in the error
"expected str, bytes or os.PathLike object, not list"
Does anyone know standard python functions that allow such inserting in the middle of a file path? Alternatively - how can I turn path_ to something that can be processed by os.path. I am new to pathlib, so maybe I missed something out there
Edit: Following the answers to the question I can suggest the following solutions:
1.) As Zach Favakeh suggests, and as written in this answer, just correct my code above to join(*path_), using the 'splat' operator *, and everything is solved.
2.) As suggested by buran, you can use the pathlib package. In short:
from pathlib import PurePath
path_list = list(PurePath(file_path0).parts)
path_list.insert(2, 'new_dir')
file_path1 = PurePath('').joinpath(*path_list)
print(file_path1)
>>> 'dir1/dir2/new_dir/dir3/dir4/file.txt'
Take a look at pathlib.PurePath.parts. It will return separate components of the path and you can insert at desired position and construct the new path
>>> from pathlib import PurePath
>>> file_path0 = 'dir1/dir2/dir3/dir4/file.txt'
>>> p = PurePath(file_path0)
>>> p.parts
('dir1', 'dir2', 'dir3', 'dir4', 'file.txt')
>>> spam = list(p.parts)
>>> spam.insert(2, 'new_dir')
>>> new_path = PurePath('').joinpath(*spam)
>>> new_path
PurePosixPath('dir1/dir2/new_dir/dir3/dir4/file.txt')
This will work with path as a str as well as with pathlib.Path objects
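The steps above can be wrapped into a function shaped like the some_func from the question. This is only a sketch: the name insert_directory is made up for illustration, and PurePosixPath is used so that the forward-slash example behaves the same on every platform.

```python
from pathlib import PurePosixPath

def insert_directory(file_path, directory_name, position):
    """Insert directory_name before the component at index `position`."""
    parts = list(PurePosixPath(file_path).parts)
    # Note: for an absolute path the root '/' counts as parts[0].
    parts.insert(position, directory_name)
    return PurePosixPath(*parts)

file_path1 = insert_directory('dir1/dir2/dir3/dir4/file.txt', 'new_dir', position=2)
print(file_path1)  # dir1/dir2/new_dir/dir3/dir4/file.txt
```

Because parts is an ordinary list, a position larger than the number of components simply appends the new directory at the end, which may or may not be what you want.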
Since you want to use join on a list to produce the pathname, you should do the following using the "splat" operator: Python os.path.join() on a list
Edit: You could also take your np array and concatenate its elements into a string using np.array2string, using '/' as your separator parameter: https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.array2string.html
Hope this helps.
Solution using regex. The regex will create groups of the following
[^\/]+ - non-'/' characters(i.e. directory names)
\w+\.\w+ - word characters then '.' then word characters (i.e. file name)
import re
directory_name = 'new_dir'
file_path0 = 'dir1/dir2/dir3/dir4/file.txt'
position = 2
regex = re.compile(r'([^\/]+|\w+\.\w+)')
tokens = re.findall(regex, file_path0)
tokens.insert(position, directory_name)
file_path1 = '/'.join(tokens)
Result:
'dir1/dir2/new_dir/dir3/dir4/file.txt'
Your solution has only one flaw. After inserting the new directory into the path list with path_.insert(2, 'new_dir'), you need to call os.path.join(*path_) to get the modified path. You get the error because you are passing a list as a single parameter to the join function; you have to unpack it.
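A short sketch of the failing call next to the unpacked one. It uses posixpath.join and a literal '/' instead of os.path.join and os.sep, purely so that the result is the same on every platform; that substitution is an assumption of this example, not part of the original code:

```python
import posixpath

file_path0 = 'dir1/dir2/dir3/dir4/file.txt'
path_ = file_path0.split('/')
path_.insert(2, 'new_dir')

error = None
try:
    # Passing the list itself: join expects separate string arguments.
    posixpath.join(path_)
except TypeError as exc:
    error = exc

# Unpacking the list passes each component as its own argument.
file_path1 = posixpath.join(*path_)
print(file_path1)  # dir1/dir2/new_dir/dir3/dir4/file.txt
```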
In my case, I knew the portion of path that would precede the insertion point (i.e., "root"). However, the position of the insertion point was not constant due to the possibility of having varying number of path components in the root path. I used Path.relative_to() to break the full path to yield an insertion point for the new_dir.
from pathlib import Path
directory_name = Path('new_dir')
root = Path('dir1/dir2/')
file_path0 = Path('dir1/dir2/dir3/dir4/file.txt')
# non-root component of path
chld = file_path0.relative_to(root)
file_path1 = root / directory_name / chld
print(file_path1)
Result:
'dir1/dir2/new_dir/dir3/dir4/file.txt'
Here is my attempt at what you need:
directory_name = '/new_dir'
file_path0 = 'dir1/dir2/dir3/dir4/file.txt'
before_the_newpath = 'dir1/dir2'
position = file_path0.split(before_the_newpath)
new_directory = before_the_newpath + directory_name + position[1]
Hope it helps.
What is the most concise way to split a path so that it includes the filename and two directories up in Python?
>>> path = r'/absolute/path/to/file.txt'
>>> os.path.dirname(path)
Gives:
/absolute/path/to
While:
>>> from pathlib import Path
>>> path = r'/absolute/path/to/file.txt'
>>> Path(path).parents[1]
gives:
/absolute/path
What would be the most concise strategy to give me:
to/file.txt
?
>>> import os, pathlib
>>> os.path.join(*pathlib.Path(path).parts[-2:])
'to/file.txt'
This is one way.
path = r'/absolute/path/to/file.txt'
res = '/'.join(path.split('/')[-2:])
print(res)
# to/file.txt
A less concise but more robust alternative, which normalizes the path first and uses the platform's separator:
import os
res = os.path.join(*os.path.normpath(path).split(os.sep)[-2:])
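If you need this more than once, the same idea generalizes to the last n components. A minimal sketch with pathlib, where the helper name last_n_parts is hypothetical and PurePosixPath is chosen so the forward-slash example gives the same result on any platform:

```python
from pathlib import PurePosixPath

def last_n_parts(path, n):
    """Return the last n components of path joined as a relative path."""
    parts = PurePosixPath(path).parts
    return PurePosixPath(*parts[-n:])

res = last_n_parts('/absolute/path/to/file.txt', 2)
print(res)  # to/file.txt
```

The function assumes n is at least 1; with n equal to 0 the slice parts[-0:] would return every component instead of none.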
My question is a bit basic, but as I'm a newbie in python (crossed over from GIS), please bear with me.
I have a python list which is based on the files the user inserts -
for example: inputlist =["c:\\files\\foobar.shp","c:\\files\\snafu.shp"]
How do I get the file names only (without the path or extensions) into a new list?
(desired output: ["foobar","snafu"] )
Thanks.
You can use Python's list comprehensions for that:
from os.path import basename, splitext
new_list = [splitext(basename(i))[0] for i in inputlist]
[os.path.basename(p).rsplit(".", 1)[0] for p in inputlist]
import os.path
extLessBasename = lambda fn: os.path.splitext(os.path.basename(fn))[0]
fileNames = list(map(extLessBasename, inputlist))
This solution may also help you.
import os
inputlist = ["/home/anupam/PycharmProjects/DataStructures/LogicalProgram/classvsstatic.py",
             "/home/anupam/PycharmProjects/DataStructures/LogicalProgram/decorators.py"]
filename_list = []
for i in inputlist:
    path_list = i.split('/')
    filename_with_extension = path_list[-1]
    filename_without_extension = os.path.splitext(filename_with_extension)[0]
    filename_list.append(filename_without_extension)
print(filename_list)
For Windows file paths, split on '\\' instead of '/' (a small change to the code above).
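For the Windows-style paths in the question, a hedged alternative that avoids manual splitting is pathlib.PureWindowsPath, which understands backslash separators and drive letters regardless of the operating system the script runs on. The variable name names is illustrative:

```python
from pathlib import PureWindowsPath

inputlist = ["c:\\files\\foobar.shp", "c:\\files\\snafu.shp"]

# .stem is the final path component without its last extension.
names = [PureWindowsPath(p).stem for p in inputlist]
print(names)  # ['foobar', 'snafu']
```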