I'm on Windows and have an API response that includes a key-value pair like
object = {'path':'/my_directory/my_subdirectory/file.txt'}
I'm trying to use pathlib to open and create that structure relative to the current working directory as well as a user supplied directory name in the current working directory like this:
output = "output_location"
path = pathlib.Path.cwd().joinpath(output,object['path'])
print(path)
What this gives me is this
c:\directory\my_subdirectory\file.txt
Whereas I'm looking for it to output something like:
'c:\current_working_directory\output_location\directory\my_subdirectory\file.txt'
The issue is that object['path'] is a variable, so I'm not sure how to escape it as a raw string, and I think the escapes are breaking it. I can't guarantee there will always be a leading slash in the object['path'] value, so I don't want to simply trim the first character.
I was hoping there was an elegant way to do this using pathlib that didn't involve ugly string manipulation.
Try lstrip('/')
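You want to remove the leading slash whenever it’s there, because pathlib treats a segment that starts with a slash as rooted and discards the path components that come before it (on Windows only the drive letter is kept).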
import pathlib
object = {'path': '/my_directory/my_subdirectory/file.txt'}
output = "output_location"
# lstrip('/') removes the leading '/' only when it is present
path = pathlib.PureWindowsPath(pathlib.Path.cwd()).joinpath(output, object['path'].lstrip('/'))
# path = pathlib.Path.cwd().joinpath(output,object['path'])
print(path)
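The effect of the leading slash can be checked without touching the file system. This is a minimal sketch, assuming a hypothetical working directory c:\current_working_directory in place of pathlib.Path.cwd(), and using PureWindowsPath so that it behaves the same on any operating system:

```python
from pathlib import PureWindowsPath

# Hypothetical stand-in for pathlib.Path.cwd(); PureWindowsPath never touches the disk
cwd = PureWindowsPath(r"c:\current_working_directory")
output = "output_location"
api_path = "/my_directory/my_subdirectory/file.txt"

# A segment that starts with a slash is rooted, so everything before it
# except the drive letter is dropped
rooted = cwd.joinpath(output, api_path)

# Stripping leading slashes (if there are any) makes the segment relative,
# so it is appended to the existing path instead
relative = cwd.joinpath(output, api_path.lstrip("/\\"))

print(rooted)    # c:\my_directory\my_subdirectory\file.txt
print(relative)  # c:\current_working_directory\output_location\my_directory\my_subdirectory\file.txt
```

Because lstrip does nothing when there is no leading slash, the same line works whether or not the API value starts with one.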
Say I have the path folder1/folder2/folder3, and I don't know in advance the names of the folders.
How can I remove the first part of this path to get only folder2/folder3?
You can use pathlib.Path for that:
from pathlib import Path
p = Path("folder1/folder2/folder3")
And either concatenate all parts except the first:
new_path = Path(*p.parts[1:])
Or create a path relative_to the first part:
new_path = p.relative_to(p.parts[0])
This code doesn't require specifying the path delimiter, and works on all platforms supported by pathlib (Python >= 3.4).
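One case worth checking is an absolute path, where parts[0] is the root rather than a folder name. The sketch below wraps the parts approach in a hypothetical helper, drop_first_folder (not part of pathlib), and uses PurePosixPath so the result is the same on every platform:

```python
from pathlib import PurePosixPath

def drop_first_folder(path):
    """Return path without its first folder (hypothetical helper)."""
    p = PurePosixPath(path)
    # For an absolute path, parts[0] is the root '/', so skip it as well
    skip = 2 if p.is_absolute() else 1
    return PurePosixPath(*p.parts[skip:])

print(drop_first_folder("folder1/folder2/folder3"))   # folder2/folder3
print(drop_first_folder("/folder1/folder2/folder3"))  # folder2/folder3
print(drop_first_folder("folder1"))                   # .
```

When nothing is left after removing the first folder, pathlib returns the current-directory path '.', which may or may not be what you want.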
Use str.split with 1 as maxsplit argument:
path = "folder1/folder2/folder3"
path.split("/", 1)[1]
# 'folder2/folder3'
If there is no / in there, you might be safer with:
path.split("/", 1)[-1] # pick the last of one or two tokens
but that depends on your desired logic in that case.
For better portability across systems, you could replace the slash "/" with os.path.sep:
import os
path.split(os.path.sep, 1)[1]
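The difference between indexing with [1] and [-1] only shows up when the string contains no separator at all. A small sketch of both cases, using made-up example strings:

```python
with_sep = "folder1/folder2/folder3"
without_sep = "folder1"

# [-1] picks the last of one or two tokens, so it never fails
first = with_sep.split("/", 1)[-1]
second = without_sep.split("/", 1)[-1]
print(first)   # folder2/folder3
print(second)  # folder1

# [1] assumes there are two tokens and raises when there is only one
try:
    without_sep.split("/", 1)[1]
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```

Whether returning the whole string or raising an error is preferable depends on how a path without any folder prefix should be treated.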
I created a folder called dataset, and inside it I created two subfolders called subfold1 and subfold2.
import os

names = []
for users in os.listdir("dataset"):
    names.append(users)
print(names)
Output:
['subfold1','subfold2']
subfold1 and subfold2 each contain 5 images.
Now I want to list the paths of the images inside subfold1 and subfold2:
path = []
for name in names:
    for image in os.listdir("dataset/{}".format(name)):
        path_string = os.path.join("dataset/{}".format(name), image)
        path.append(path_string)
print(path)
My output is
['dataset/subfold1\\1_1.jpg', 'dataset/subfold1\\1_2.jpg', 'dataset/subfold1\\1_3.jpg', 'dataset/subfold1\\1_4.jpg', 'dataset/subfold1\\1_5.jpg', 'dataset/subfold2\\2_1.jpg', 'dataset/subfold3\\2_2.jpg', 'dataset/subfold2\\2_3.jpg', 'dataset/subfold2\\2_4.jpg', 'dataset/subfold2\\2_5.jpg']
I want the correct paths
Your code works correctly on Linux.
However, you may want to simplify it by using os.walk. Please see below:
new_names = []
for dirpath, dirnames, filenames in os.walk('dataset'):
    for filename in filenames:
        new_names.append(os.path.join(dirpath, filename))
print(new_names)
which gives me the following output:
['dataset/subfold2/93.jpg', 'dataset/subfold2/99.jpg', 'dataset/subfold2/97.jpg', 'dataset/subfold1/3.jpg', 'dataset/subfold1/2.jpg', 'dataset/subfold1/1.jpg']
I think you're on Windows, where \ is the path separator.
Because \ is also the escape character in Python string literals (it is followed by another character to denote a special character; for example, \t means tab), the \\ in your output stands for a single \. Your paths are therefore correct, and if you want them to follow the Windows convention consistently you can change the / to \\. By the way, I strongly suggest using the pathlib library. It is more convenient and powerful.
from pathlib import Path
p = Path('MyPictures')
for image in p.iterdir():
    print(image)
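Applied to the dataset layout from the question, pathlib can collect the images of every subfolder in one pass and print them with forward slashes on any operating system. This is only a sketch: list_images is a hypothetical helper, and the folder tree is created in a temporary directory so the example can run anywhere.

```python
import tempfile
from pathlib import Path

def list_images(root):
    """Return sorted forward-slash paths of .jpg files one folder below root (hypothetical helper)."""
    root = Path(root)
    # "*/*.jpg" matches .jpg files exactly one folder below root;
    # as_posix() renders the path with forward slashes, even on Windows
    return sorted(p.relative_to(root.parent).as_posix() for p in root.glob("*/*.jpg"))

# Build a small throwaway copy of the layout described in the question
with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp) / "dataset"
    for folder, image in [("subfold1", "1_1.jpg"), ("subfold1", "1_2.jpg"), ("subfold2", "2_1.jpg")]:
        (dataset / folder).mkdir(parents=True, exist_ok=True)
        (dataset / folder / image).touch()
    images = list_images(dataset)

print(images)
# ['dataset/subfold1/1_1.jpg', 'dataset/subfold1/1_2.jpg', 'dataset/subfold2/2_1.jpg']
```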
Quick solution:
Use os.path.normpath to normalize a path (on Windows it converts every separator to os.path.sep and adapts the path to the operating system).
paths = [
'dataset/subfold1\\1_1.jpg',
'dataset/subfold1\\1_2.jpg',
'dataset/subfold1\\1_3.jpg',
'dataset/subfold1\\1_4.jpg',
'dataset/subfold1\\1_5.jpg',
'dataset/subfold2\\2_1.jpg',
'dataset/subfold3\\2_2.jpg',
'dataset/subfold2\\2_3.jpg',
'dataset/subfold2\\2_4.jpg',
'dataset/subfold2\\2_5.jpg'
]
import os
paths = list(map(os.path.normpath, paths))
>>> paths
out
['dataset\\subfold1\\1_1.jpg',
'dataset\\subfold1\\1_2.jpg',
'dataset\\subfold1\\1_3.jpg',
'dataset\\subfold1\\1_4.jpg',
'dataset\\subfold1\\1_5.jpg',
'dataset\\subfold2\\2_1.jpg',
'dataset\\subfold3\\2_2.jpg',
'dataset\\subfold2\\2_3.jpg',
'dataset\\subfold2\\2_4.jpg',
'dataset\\subfold2\\2_5.jpg']
Extra info:
You can't get rid of the \\ in 'dataset/subfold1\\1_1.jpg' because that is the string's __repr__, which shows each backslash escaped as a double backslash. If you actually print the value, you will see just one \.
Quick demo:
print('dataset\\subfold1\\1_1.jpg')
out
dataset\subfold1\1_1.jpg
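To confirm that the doubled backslash exists only in the repr and not in the string itself, you can count the characters. A short sketch using one of the paths above:

```python
path = 'dataset\\subfold1\\1_1.jpg'

print(repr(path))  # 'dataset\\subfold1\\1_1.jpg'  (what you see inside a printed list)
print(path)        # dataset\subfold1\1_1.jpg      (the actual characters)

# The string itself holds two backslashes; its repr shows four
actual_backslashes = path.count('\\')
shown_backslashes = repr(path).count('\\')
print(actual_backslashes, shown_backslashes)  # 2 4
```

Printing a list calls repr on each element, which is why the list output shows the escaped form.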
If you really want to join paths with /, you could write your own join function for paths (I don't recommend this; I did it in the past and realised it was pointless, because os.path handles everything for you).
But if you are on Windows and really want the Linux path separator, you can try this:
paths = [
'dataset/subfold1\\1_1.jpg',
'dataset/subfold1\\1_2.jpg',
'dataset/subfold1\\1_3.jpg',
'dataset/subfold1\\1_4.jpg',
'dataset/subfold1\\1_5.jpg',
'dataset/subfold2\\2_1.jpg',
'dataset/subfold3\\2_2.jpg',
'dataset/subfold2\\2_3.jpg',
'dataset/subfold2\\2_4.jpg',
'dataset/subfold2\\2_5.jpg'
]
paths = [path.replace("\\", "/") for path in paths]
>>> paths
out
['dataset/subfold1/1_1.jpg',
'dataset/subfold1/1_2.jpg',
'dataset/subfold1/1_3.jpg',
'dataset/subfold1/1_4.jpg',
'dataset/subfold1/1_5.jpg',
'dataset/subfold2/2_1.jpg',
'dataset/subfold3/2_2.jpg',
'dataset/subfold2/2_3.jpg',
'dataset/subfold2/2_4.jpg',
'dataset/subfold2/2_5.jpg']
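An alternative to replacing characters by hand is to let pathlib parse the string as a Windows path and render it with forward slashes. A minimal sketch, using two of the paths above; PureWindowsPath accepts both / and \ as separators and works on any operating system:

```python
from pathlib import PureWindowsPath

paths = ['dataset/subfold1\\1_1.jpg', 'dataset/subfold2\\2_1.jpg']

# as_posix() returns the path as a string with forward slashes
posix_paths = [PureWindowsPath(p).as_posix() for p in paths]
print(posix_paths)
# ['dataset/subfold1/1_1.jpg', 'dataset/subfold2/2_1.jpg']
```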
I have this path
c:\JAVA\eclipse\java-neon\eclipse\configuration\
and I want to get the last folder, "configuration",
or for
c:\JAVA\eclipse\java-neon\eclipse\configuration\S\D\CV\S\D\D\AAAAA
get "AAAAA".
I couldn't find a function for this in os.path.
Thanks
Suppose you know you have a separator character sep, this should accomplish what you ask:
path.split(sep)[-1]
Where path is the str containing your path.
If you don't know what the separator is, you can look it up with
os.path.sep
You can use os.path.split to split according to path separator:
os.path.split(path)[-1]
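Please check the code:
import os

def getFolderName(path):
    # Drop one trailing backslash so os.path.split does not return an empty tail
    if path.endswith("\\"):
        path = path[:-1]
    # The last component is the folder name (backslash is a separator only on Windows)
    return os.path.split(path)[-1]

print(getFolderName(r'c:\JAVA\eclipse\java-neon\eclipse\configuration\S\D\CV\S\D\D\AAAAA'))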
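The trailing separator can also be handled without manual slicing. The sketch below shows two options on the paths from the question: PureWindowsPath ignores trailing separators, and ntpath (the Windows flavour of os.path, importable on any platform) can normalize the path first. last_folder is a hypothetical helper name.

```python
import ntpath
from pathlib import PureWindowsPath

def last_folder(path):
    # PureWindowsPath ignores trailing separators, so .name is the final component
    return PureWindowsPath(path).name

with_trailing = 'c:\\JAVA\\eclipse\\java-neon\\eclipse\\configuration\\'
deep = r'c:\JAVA\eclipse\java-neon\eclipse\configuration\S\D\CV\S\D\D\AAAAA'

print(last_folder(with_trailing))  # configuration
print(last_folder(deep))           # AAAAA

# Same idea with the os.path family: normpath drops the trailing separator first
print(ntpath.basename(ntpath.normpath(with_trailing)))  # configuration
```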
If you want to explore your paths, try something like this:
import os

def explore(path):
    finalpaths = []
    for paths in os.listdir(path):
        nextpath = path + '/' + paths
        if os.path.isdir(nextpath):
            # Recurse into subdirectories
            finalpaths.extend(explore(nextpath))
        else:
            # Record the folder that holds this file
            finalpaths.append(path)
    return finalpaths
Then if you run
set(explore(path))
you'll get a set of all the folders under that directory that directly contain files (typically the lowest folders you can reach).
This works on Unix; on Windows you might need to use \ rather than /, or build nextpath with os.path.join so the separator is chosen for you.
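The same result can be had without writing the recursion by hand, since os.walk already visits every directory. This is a sketch under that substitution, not the code above: folders_with_files is a hypothetical helper, and the directory tree is created in a temporary folder purely for the demo.

```python
import os
import tempfile

def folders_with_files(root):
    """Return sorted directories under root that directly contain at least one file."""
    # os.walk yields (dirpath, dirnames, filenames) for every directory below root
    return sorted(dirpath for dirpath, dirnames, filenames in os.walk(root) if filenames)

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    os.makedirs(os.path.join(tmp, "a", "c"))  # left empty on purpose
    open(os.path.join(tmp, "a", "b", "file.txt"), "w").close()
    # Show the results relative to the temporary root
    found = [os.path.relpath(d, tmp) for d in folders_with_files(tmp)]

print(found)  # one entry: the path a/b, joined with the local separator
```

Because os.walk builds each dirpath with os.path.join, no separator needs to be hard-coded.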
path
'h:\OmWPDump_Tue_Oct_07_21_08_13_2014\windows\SystemsManagementx64\SysMgmtx64.msi'
os.path.dirname(path)
'h:\OmWPDump_Tue_Oct_07_21_08_13_2014\windows\SystemsManagementx64'
I need code that outputs the topmost parent directory:
'h:\OmWPDump_Tue_Oct_07_21_08_13_2014'
Basically, I need this location so that I can remove the complete directory.
The easiest method without requiring additional modules is to split() the path:
>>> path = r'h:\OmWPDump_Tue_Oct_07_21_08_13_2014\windows\SystemsManagementx64\SysMgmtx64.msi'
>>> topdir = path.split('\\')[1]
>>> topdir
'OmWPDump_Tue_Oct_07_21_08_13_2014'
If you're potentially dealing with UNC paths, then you may need to check first and determine which element to use (split() on a UNC path will return a couple of empty elements, then the hostname, then your top-level folder).
Edit:
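Add the drive from the path to that (splitdrive returns a (drive, rest) tuple, and the drive needs a trailing backslash for the join to produce a rooted path):
>>> import os
>>> deldir = os.path.join(os.path.splitdrive(path)[0] + '\\', topdir)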
>>> deldir
'h:\\OmWPDump_Tue_Oct_07_21_08_13_2014'
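If pathlib is an option, its parts attribute separates the anchor (drive plus root, or host plus share for a UNC path) from the folders, which sidesteps the UNC special case mentioned above. A sketch under that assumption: top_level_dir is a hypothetical helper, and the UNC host and share names are made up for illustration.

```python
from pathlib import PureWindowsPath

def top_level_dir(path):
    """Return the anchor plus the first folder of a Windows path (hypothetical helper)."""
    parts = PureWindowsPath(path).parts
    # parts[0] is the anchor ('h:\\' or '\\\\host\\share\\'); parts[1] is the first folder
    return str(PureWindowsPath(*parts[:2]))

local = top_level_dir(r'h:\OmWPDump_Tue_Oct_07_21_08_13_2014\windows\SystemsManagementx64\SysMgmtx64.msi')
unc = top_level_dir(r'\\host\share\top\sub\file.txt')

print(local)  # h:\OmWPDump_Tue_Oct_07_21_08_13_2014
print(unc)    # \\host\share\top
```

Before deleting anything, it is worth printing the result and checking that it is not just the drive root, which is what you would get for a path with no folders.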
You can use regular expressions:
import re
path = r'h:\OmWPDump_Tue_Oct_07_21_08_13_2014\windows\SystemsManagementx64\SysMgmtx64.msi'
# Match the drive letter plus the first folder only; a greedy .+ would run on to the last backslash
match = re.findall(r'.:\\[^\\]+', path)
answer = match[0]