How to remove an unknown string part in Python - python

I have a file dialog in Tkinter that gets the location of a file on the user's computer.
The output is not always the same, because the directory may vary.
For example, the filename could be C:/MusicDirectory/Music.mp3, or it could be something else, like D:/Some/Directory/IDontKnowWhatToTypeAnyMore/Music.mp3
My main objective is to remove "C:/MusicDirectory/" (the unwanted directory part) from the string, but that part does not stay the same. It can be some other folder too.
Can someone help me in this situation?

What you want is os.path.basename:
os.path.basename('C:/MusicDirectory/Music.mp3') # 'Music.mp3'
os.path.basename('D:/Some/Directory/IDontKnowWhatToTypeAnyMore/Music.mp3') # 'Music.mp3'

First method, using os.path.basename (recommended):
import os
os.path.basename('C:/MusicDirectory/Music.mp3')  # 'Music.mp3'
Second method, splitting the string on the separator and taking the last part:
path = 'C:/MusicDirectory/Music.mp3'
partsOfPath = path.split("/")
nameOfFile = partsOfPath[-1]  # 'Music.mp3'
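If pathlib is preferred, the same result can be sketched as follows. PureWindowsPath is used here only so the example behaves the same on any operating system; for real files on the user's machine, pathlib.Path would be the usual choice.

```python
from pathlib import PureWindowsPath

# Paths as a file dialog might return them (forward slashes are accepted on Windows).
paths = [
    "C:/MusicDirectory/Music.mp3",
    "D:/Some/Directory/IDontKnowWhatToTypeAnyMore/Music.mp3",
]

# .name is the final path component; .stem is the same without the extension.
names = [PureWindowsPath(p).name for p in paths]
stems = [PureWindowsPath(p).stem for p in paths]

print(names)
print(stems)
```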

Related

Python Pathlib escaping string stored in variable

I'm on windows and have an api response that includes a key value pair like
object = {'path':'/my_directory/my_subdirectory/file.txt'}
I'm trying to use pathlib to open and create that structure relative to the current working directory as well as a user supplied directory name in the current working directory like this:
output = "output_location"
path = pathlib.Path.cwd().joinpath(output,object['path'])
print(path)
What this gives me is this
c:\directory\my_subdirectory\file.txt
Whereas I'm looking for it to output something like:
'c:\current_working_directory\output_location\directory\my_subdirectory\file.txt'
The issue is because the object['path'] is a variable I'm not sure how to escape it as a raw string. And so I think the escapes are breaking it. I can't guarantee there will always be a leading slash in the object['path'] value so I don't want to simply trim the first character.
I was hoping there was an elegant way to do this using pathlib that didn't involve ugly string manipulation.
Try lstrip('/').
You want to remove the leading slash whenever it is there, because when pathlib joins an absolute path it ignores whatever comes before it.
import pathlib
object = {'path': '/my_directory/my_subdirectory/file.txt'}
output = "output_location"
# lstrip('/') removes the leading '/' if there is one and does nothing otherwise
path = pathlib.PureWindowsPath(pathlib.Path.cwd()).joinpath(output, object['path'].lstrip('/'))
# path = pathlib.Path.cwd().joinpath(output,object['path'])
print(path)
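The effect can be reproduced on any operating system with PureWindowsPath. This is a minimal sketch: the working directory below is a hypothetical stand-in for Path.cwd(), and the second call shows that lstrip('/') is harmless when the value has no leading slash.

```python
from pathlib import PureWindowsPath

# Hypothetical stand-in for pathlib.Path.cwd() on Windows.
cwd = PureWindowsPath("c:/current_working_directory")
output = "output_location"
api_path = "/my_directory/my_subdirectory/file.txt"

# A rooted component makes joinpath drop everything before it except the drive.
broken = cwd.joinpath(output, api_path)

# lstrip('/') removes leading slashes when present and leaves other strings untouched.
fixed = cwd.joinpath(output, api_path.lstrip("/"))
same = cwd.joinpath(output, "my_directory/my_subdirectory/file.txt".lstrip("/"))

print(broken)
print(fixed)
```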

How do I put a variable in a path

I'm trying to store a path in a variable. see below
target = r"C:\Users\User\CodeProjects\WebSafer"
However, I need it to be dynamic. Not hardcoded to my username, so I get the login username by doing:
val = os.getlogin()
So I need to put the variable val in the path. But every time I tried doing it I always get a truncating/syntax error. Please help me! Below is the code snippet:
print("No copy found...making a copy\n")
val = os.getlogin()
original = r"C:\*******\********\*******\***\****"
target = r"C:\Users\User\CodeProjects\WebSafer"
shutil.copy(original, target)
The "*" are just for privacy reasons; they're actually replaced with the right path to what I'm copying.
What I have tried so far:
target = r"C:\Users\{val}\CodeProjects\WebSafer".format(val = os.getlogin)
target = r"C:\Users\{}\CodeProjects\WebSafer".format(val)
target = rf"C:\Users\{val}\CodeProjects\WebSafer".format(val = os.getlogin)
target = rf"C:\Users\{}\CodeProjects\WebSafer".format(val)
Don't mix f with .format, this is working for me:
import os
val = os.getlogin()
print(rf"C:\Users\{val}\CodeProjects\WebSafer")
And I think a better way is:
import os.path
from pathlib import Path
home = str(Path.home())
print(os.path.join(home, "CodeProjects", "WebSafer"))
Then, if you encounter an error when copying, you need to clarify what you want to copy: a file or a folder? If it is a folder, should it go inside the destination folder, or overwrite the destination folder?
You may want to try different functions, such as
shutil.copy and shutil.copytree, with different parameters.
The "r" prefix means the string is treated as a raw string. As an alternative you can drop it and escape the backslashes; note that os.getlogin has to be called, with parentheses, to get the username: target = "C:\\Users\\{val}\\CodeProjects\\WebSafer".format(val = os.getlogin())
You can use f-string (to directly enter your variable) and r-string (to enter the path without escape characters like \) together.
val = os.getlogin() # Returns username
target = fr"C:\Users\{val}\CodeProjects\WebSafer"
If you're getting a "No such file or directory" error, it means that the actual file or folder does not exist. Check the actual path to ensure every part (your username, CodeProjects, WebSafer) exists on your computer.
In case you don't know if your user will have that folder on their system, you can use a try-except block to alert the user or to revert to some default folder instead.
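The try-except idea could look roughly like this. It is only a sketch: copy_to_user_folder and its parameters are hypothetical names, and falling back to a second directory is one assumed policy among several.

```python
import shutil
from pathlib import Path


def copy_to_user_folder(original, target_dir, fallback_dir):
    """Copy `original` into `target_dir`, or into `fallback_dir` if that fails.

    Hypothetical helper: the names and the fallback policy are assumptions.
    """
    try:
        # Fail early if the preferred folder is missing, so shutil.copy
        # does not silently create a file with the folder's name instead.
        if not Path(target_dir).is_dir():
            raise FileNotFoundError(target_dir)
        return shutil.copy(original, target_dir)
    except FileNotFoundError:
        print(f"{target_dir} not found, copying to {fallback_dir} instead")
        Path(fallback_dir).mkdir(parents=True, exist_ok=True)
        return shutil.copy(original, fallback_dir)
```

A call might pass Path.home() / "CodeProjects" / "WebSafer" as target_dir and some default folder as fallback_dir. If the source file itself is missing, the second shutil.copy raises FileNotFoundError again, which is probably what you want to see.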

How to correctly apply a RE for obtaining the last name (of a file or folder) from a given path and print it on Python?

I have written code that creates a dictionary which stores the absolute paths of all folders in the current path as keys, and the filenames in each folder as values. This code would only be applied to paths whose folders contain only image files. Here:
import os
import re
# Main method
the_dictionary_list = {}
for name in os.listdir("."):
    if os.path.isdir(name):
        path = os.path.abspath(name)
        print(f'\u001b[45m{path}\033[0m')
        match = re.match(r'/(?:[^\\])[^\\]*$', path)
        print(match)
        list_of_file_contents = os.listdir(path)
        print(f'\033[46m{list_of_file_contents}')
        the_dictionary_list[path] = list_of_file_contents
        print('\n')
print('\u001b[43mthe_dictionary_list:\033[0m')
print(the_dictionary_list)
The thing is, that I want this dictionary to store only the last folder names as keys instead of its absolute paths, so I was planning to use this re /(?:[^\\])[^\\]*$, which would be responsible for obtaining the last name (of a file or folder from a given path), and then add those last names as keys in the dictionary in the for loop.
I wanted to test the code above first to see if it was doing what I wanted, but it didn't seem so, the value of the match variable became None in each iteration, which didn't make sense to me, everything else works fine.
So I would like to know what I'm doing wrong here.
I would highly recommend using the built-in pathlib library. It would appear you are interested in the name part of each path (f.name for a Path object f).
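One likely reason match is always None, assuming the absolute paths look like Windows paths: re.match only matches at the start of the string, and the pattern starts with "/", while the path starts with a drive letter. The sketch below uses a hypothetical path to show this, together with a re.search variant and the pathlib equivalent.

```python
import re
from pathlib import PureWindowsPath

path = r"C:\Users\me\Pictures\holiday"  # hypothetical absolute path

# re.match only matches at the start of the string; this path starts with "C", not "/".
no_match = re.match(r'/(?:[^\\])[^\\]*$', path)
print(no_match)

# re.search scans the whole string; this pattern takes everything after the last separator.
found = re.search(r'[^\\/]+$', path)
print(found.group(0))

# pathlib returns the same last component without a regular expression.
last_name = PureWindowsPath(path).name
print(last_name)
```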
I decided to rewrite the code above, in case of wanting to apply it only in the current directory (where this program would be found).
import os
# Main method
the_dictionary_list = {}
for subdir in os.listdir("."):
    if os.path.isdir(subdir):
        path = os.path.abspath(subdir)
        print(f'\u001b[45m{path}\033[0m')
        list_of_file_contents = os.listdir(path)
        print(f'\033[46m{list_of_file_contents}')
        the_dictionary_list[subdir] = list_of_file_contents
        print('\n')
print('\033[1;37;40mThe dictionary list:\033[0m')
for subdir in the_dictionary_list:
    print('\u001b[43m'+subdir+'\033[0m')
    for archivo in the_dictionary_list[subdir]:
        print(" ", archivo)
    print('\n')
print(the_dictionary_list)
This would be useful in case the user wants to run the program with a double click on a specific location (my personal case)

How to manipulate directory paths to work across multiple OS?

I'm writing a python script which has to internally create output path from the input path. However, I am facing issues to create the path which I can use irrespective of OS.
I have tried to use os.path.join and it has its own limitations.
Apart from that, I think simple string concatenation is not the way to go.
Pathlib can be an option but I am not allowed to use it.
import os
left = "C:\projects\django\hereisinput"  # the input path
lastSlash = left.rfind("\\")
# This won't work as os path join stops at a slash
outputDir = os.path.join(left[:lastSlash], "\internal\morelevel\outputpath")
OR
OutDir = left[:lastSlash] + "\internal\morelevel\outputpath"
Expected output path :
C:\projects\django\internal\morelevel\outputpath
Also, the above code is not OS-independent: on Linux, the separator will be different.
Is os.sep an option?
From the documentation os.path.join can join "one or more path components...". So you could split "\internal\morelevel\outputpath" up into each of its components and pass all of them to your os.path.join function instead. That way you don't need to "hard-code" the separator between the path components. For example:
paths = ("internal", "morelevel", "outputpath")
outputDir = os.path.join(left[:lastSlash], *paths)
Remember that the backslash (\) is a special character in Python so your strings containing singular backslashes wouldn't work as you expect them to! You need to escape them with another \ in front.
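This part of your code, lastSlash = left.rfind("\\"), might also not work on every operating system, because the separator differs. You could rather use something like os.path.split to get the part of the path that you need. For example, head, _ = os.path.split(left) puts everything before the last component in head, which you can then pass to os.path.join in place of left[:lastSlash].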
Assuming your original path is "C:\projects\django\hereisinput" and the other part of the path is "internal\morelevel\outputpath" (notice this is a relative path, not an absolute one), you could always move your primary path up one folder (or more) and then append the second part. Do note that your first path needs to contain only folders and can be absolute or relative, while your second path must always be relative for this hack to work.
import os
path_1 = r"C:\projects\django\hereisinput"
path_2 = r"internal\morelevel\outputpath"
# os.path.pardir is "..", which refers to the parent folder
path_1_one_folder_up = os.path.join(path_1, os.path.pardir)
final_path = os.path.join(path_1_one_folder_up, path_2)
'C:\\projects\\django\\hereisinput\\..\\internal\\morelevel\\outputpath'
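To get rid of the ".." in the result, os.path.normpath can collapse it. The sketch below uses ntpath, the Windows flavour of os.path, only so that the output is the same on any operating system; in real code os.path would pick the right flavour itself.

```python
import ntpath  # the Windows implementation behind os.path, importable on any OS

path_1 = r"C:\projects\django\hereisinput"
path_2 = r"internal\morelevel\outputpath"

# Join through the parent directory, then collapse "hereisinput\..".
joined = ntpath.join(path_1, ntpath.pardir, path_2)
final_path = ntpath.normpath(joined)
print(final_path)

# Equivalent without "..": take the parent directory first.
alt_path = ntpath.join(ntpath.dirname(path_1), path_2)
print(alt_path)
```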

Why doesn't os.path.join() work in this case?

The below code will not join, when debugged the command does not store the whole path but just the last entry.
os.path.join('/home/build/test/sandboxes/', todaystr, '/new_sandbox/')
When I test this it only stores the /new_sandbox/ part of the code.
The latter strings shouldn't start with a slash. If they start with a slash, then they're considered an "absolute path" and everything before them is discarded.
Quoting the Python docs for os.path.join:
If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.
Note on Windows, the behaviour in relation to drive letters, which seems to have changed compared to earlier Python versions:
On Windows, the drive letter is not reset when an absolute path component (e.g., r'\foo') is encountered. If a component contains a drive letter, all previous components are thrown away and the drive letter is reset. Note that since there is a current directory for each drive, os.path.join("c:", "foo") represents a path relative to the current directory on drive C: (c:foo), not c:\foo.
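The quoted rules can be checked on any machine with posixpath and ntpath, the two implementations behind os.path. A minimal sketch, with a hypothetical value for todaystr:

```python
import ntpath      # Windows rules
import posixpath   # Linux/macOS rules

todaystr = "042118"  # hypothetical date string

# An absolute component throws away everything before it.
discarded = posixpath.join("/home/build/test/sandboxes/", todaystr, "/new_sandbox/")
# Without the leading slash the components are joined as expected.
joined = posixpath.join("/home/build/test/sandboxes/", todaystr, "new_sandbox")

# On Windows a rooted component keeps the drive letter...
kept_drive = ntpath.join(r"C:\build", r"\foo")
# ...while a component with its own drive letter resets everything.
new_drive = ntpath.join(r"C:\build", r"D:\foo")
# A bare drive letter gives a path relative to the current directory on that drive.
drive_relative = ntpath.join("c:", "foo")

for result in (discarded, joined, kept_drive, new_drive, drive_relative):
    print(result)
```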
The idea of os.path.join() is to make your program cross-platform (linux/windows/etc).
Even one slash ruins it.
So it only makes sense when being used with some kind of a reference point like
os.environ['HOME'] or os.path.dirname(__file__).
os.path.join() can be used in conjunction with os.path.sep to create an absolute rather than relative path.
os.path.join(os.path.sep, 'home','build','test','sandboxes',todaystr,'new_sandbox')
Do not use forward slashes at the beginning of path components, except when referring to the root directory:
os.path.join('/home/build/test/sandboxes', todaystr, 'new_sandbox')
see also: http://docs.python.org/library/os.path.html#os.path.join
To help understand why this surprising behavior isn't entirely terrible, consider an application which accepts a config file name as an argument:
config_root = "/etc/myapp.conf/"
file_name = os.path.join(config_root, sys.argv[1])
If the application is executed with:
$ myapp foo.conf
The config file /etc/myapp.conf/foo.conf will be used.
But consider what happens if the application is called with:
$ myapp /some/path/bar.conf
Then myapp should use the config file at /some/path/bar.conf (and not /etc/myapp.conf/some/path/bar.conf or similar).
It may not be great, but I believe this is the motivation for the absolute path behaviour.
It's because your '/new_sandbox/' begins with a / and thus is assumed to be relative to the root directory. Remove the leading /.
Try a combination of split("/") and * for strings that already contain separators.
import os
home = '/home/build/test/sandboxes/'
todaystr = '042118'
new = '/new_sandbox/'
# os.sep comes first because the empty strings produced by split() do not restore the leading slash
os.path.join(os.sep, *home.split("/"), todaystr, *new.split("/"))
How it works...
split("/") turns an existing path into a list: ['', 'home', 'build', 'test', 'sandboxes', '']
* in front of the list unpacks each item of the list as its own argument
To make your function more portable, use it as such:
os.path.join(os.sep, 'home', 'build', 'test', 'sandboxes', todaystr, 'new_sandbox')
or
os.path.join(os.environ.get("HOME"), 'test', 'sandboxes', todaystr, 'new_sandbox')
Do it like this, without the extra slashes:
root="/home"
os.path.join(root,"build","test","sandboxes",todaystr,"new_sandbox")
Try with new_sandbox only
os.path.join('/home/build/test/sandboxes/', todaystr, 'new_sandbox')
os.path.join("a", *"/b".split(os.sep))
'a/b'
a fuller version:
import os
def join(p, f, sep=os.sep):
    f = os.path.normpath(f)
    if p == "":
        return f
    else:
        p = os.path.normpath(p)
        return os.path.join(p, *f.split(os.sep))
def test(p, f, sep=os.sep):
    print("os.path.join({}, {}) => {}".format(p, f, os.path.join(p, f)))
    print(" join({}, {}) => {}".format(p, f, join(p, f, sep)))
if __name__ == "__main__":
    # /a/b/c for all
    test("\\a\\b", "\\c", "\\") # optionally pass in the sep you are using locally
    test("/a/b", "/c", "/")
    test("/a/b", "c")
    test("/a/b/", "c")
    test("", "/c")
    test("", "c")
Note that a similar issue can bite you if you use os.path.join() to include an extension that already includes a dot, which is what happens automatically when you use os.path.splitext(). In this example:
components = os.path.splitext(filename)
prefix = components[0]
extension = components[1]
return os.path.join("avatars", instance.username, prefix, extension)
Even though extension might be .jpg you end up with a folder named "foobar" rather than a file called "foobar.jpg". To prevent this you need to append the extension separately:
return os.path.join("avatars", instance.username, prefix) + extension
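The difference is easy to see in isolation. In this sketch the filename and username are hypothetical stand-ins for the values in the snippet above, and posixpath is used so the separators in the output are the same on every platform.

```python
import posixpath

filename = "foobar.jpg"  # hypothetical upload name
username = "alice"       # hypothetical stand-in for instance.username

# splitext keeps the dot as part of the extension.
prefix, extension = posixpath.splitext(filename)

# Passing the extension as a component makes "foobar" a folder.
wrong = posixpath.join("avatars", username, prefix, extension)
# Appending it after the join keeps "foobar.jpg" as the file name.
right = posixpath.join("avatars", username, prefix) + extension

print(wrong)
print(right)
```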
you can strip the '/':
>>> os.path.join('/home/build/test/sandboxes/', todaystr, '/new_sandbox/'.strip('/'))
'/home/build/test/sandboxes/04122019/new_sandbox'
I'd recommend stripping os.path.sep from the second and all following strings, which prevents them from being interpreted as absolute paths:
first_path_str = '/home/build/test/sandboxes/'
original_other_path_to_append_ls = [todaystr, '/new_sandbox/']
other_path_to_append_ls = [
i_path.strip(os.path.sep) for i_path in original_other_path_to_append_ls
]
output_path = os.path.join(first_path_str, *other_path_to_append_ls)
The problem may be that your machine is running Windows, which uses the backslash instead of the forward slash '/' as its path separator. The point of os.path.join is to make your program cross-platform (Linux, Windows, etc.).
You shouldn't provide any slashes (forward or backward) in your path components if you want os.path.join to handle them properly. You should use:
os.path.join(os.environ.get("HOME"), 'test', 'sandboxes', todaystr, 'new_sandbox')
Or start from something like Path(__file__).resolve().parent (the path to the parent directory of the current file), so that you don't use any slashes inside os.path.join.
Please refer to the following code snippet to understand os.path.join(a, b):
a = '/home/user.name/foo/'
b = '/bar/file_name.extension'
print(os.path.join(a, b))
>>> /bar/file_name.extension
OR
a = '/home/user.name/foo'
b = '/bar/file_name.extension'
print(os.path.join(a, b))
>>> /bar/file_name.extension
But, when
a = '/home/user.name/foo/'
b = 'bar/file_name.extension'
print(os.path.join(a, b))
>>> /home/user.name/foo/bar/file_name.extension
OR
a = '/home/user.name/foo'
b = 'bar/file_name.extension'
print(os.path.join(a, b))
>>> /home/user.name/foo/bar/file_name.extension
