How to correctly decode a Windows path in Python

I have a question about correctly decoding a Windows path in Python. I tried several methods I found online but didn't find a solution. I assigned the path (a folder directory) to a variable and would like to read it as a raw string. However, it contains '\' followed by a number, and Python doesn't read it correctly. Any suggestions? Thanks
fld_dic = 'D:TestData\20190917_DT19_HigherFlowRate_StdCooler\DM19_Data'
I would like to have:
r'D:TestData\20190917_DT19_HigherFlowRate_StdCooler\DM19_Data'
And I tried:
fr'{fld_dic}', which gives: 'D:TestData\x8190917_DT19_HigherFlowRate_StdCooler\\DM19_Data'
which is not what I want. Any idea how to turn an already assigned variable into a raw string when it contains '\' followed by a number?
Thanks

The root cause is how the string is assigned. When you write path='c:\202\data', Python processes backslash escape sequences in the literal, so \202 is read as an octal escape for a single character rather than as a backslash followed by digits. Once that has happened, the original text cannot be reliably recovered from the variable, so you need to change the assignment itself and write the literal as a raw string. Hard-coding paths as plain strings like this is also not best practice and will keep causing problems.
You should not handle paths as plain strings, because that throws away Python's cross-platform advantage.
Use pathlib or os.path instead. I recommend pathlib: it has pure Windows and POSIX path classes. Use it when you obtain paths as well: if the path comes from user input, read it as text (input is not subject to escape processing) and convert it to a pathlib object.
See the documentation:
https://docs.python.org/3/library/pathlib.html
The following works, but it is not best practice. It simply writes the path as a raw string:
import os
def fcn(path=r'C:\202\data'):
    print(path)
    os.chdir(path)  # raises an error if the folder does not exist
fcn()
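To see what goes wrong and why the raw string fixes it, here is a minimal sketch. The folder names are made up for illustration and are not taken from the question; PureWindowsPath is used so the example behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

# In a normal literal, \201 is an octal escape: one character, code point 0x81.
plain = 'data\20190917'
# In a raw literal, the backslash and the digits are kept exactly as typed.
raw = r'data\20190917'

print(len(plain), len(raw))       # 10 13
print(plain == 'data\x8190917')   # True

# Hypothetical path, parsed with Windows rules regardless of the host OS.
folder = PureWindowsPath(r'D:\data\20190917_run')
print(folder.parts)       # ('D:\\', 'data', '20190917_run')
print(folder.as_posix())  # D:/data/20190917_run
```

The escape is resolved when the literal is parsed, which is why applying fr'{fld_dic}' afterwards only sees the already converted character.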

Related

Opening files in python3 for windows10

Hi, I cannot open files in Python 3; I have a problem with the path and don't know how to write it. For example, I have a file (bazi.py) in a folder (w8) on drive F. How should I write its path? Please help, I'm a beginner.
In Windows, there are a couple of additional ways of referencing a file. That is because natively, a Windows file path employs the backslash "\" instead of the slash. Python allows using both on a Windows system, but there are a couple of pitfalls to watch out for. To sum them up:
Python lets you use OS-X/Linux style slashes "/" even in Windows. Therefore, you can refer to the file as 'C:/Users/narae/Desktop/alice.txt'. RECOMMENDED.
If using backslash, because it is a special character in Python, you must remember to escape every instance: 'C:\\Users\\narae\\Desktop\\alice.txt'
Alternatively, you can prefix the entire file name string with the rawstring marker "r": r'C:\Users\narae\Desktop\alice.txt'. That way, everything in the string is interpreted as a literal character, and you don't have to escape every backslash.
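As a quick check, the three spellings above can be compared directly. This is only a sketch; PureWindowsPath is used here (an addition, not part of the answer) so the comparison follows Windows rules on any system.

```python
from pathlib import PureWindowsPath

forward = 'C:/Users/narae/Desktop/alice.txt'
escaped = 'C:\\Users\\narae\\Desktop\\alice.txt'
raw = r'C:\Users\narae\Desktop\alice.txt'

# The escaped and raw literals produce the very same string.
print(escaped == raw)  # True

# The forward-slash form is a different string but names the same Windows path.
print(forward == raw)  # False
print(PureWindowsPath(forward) == PureWindowsPath(raw))  # True
```

One caveat with raw strings: a raw literal cannot end in a single backslash, so a trailing separator has to be added another way.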
File Name Shortcuts and CWD (Current Working Directory)
So, using the full directory path and file name always works; you should be using this method. However, you might have seen files called by their name only, e.g., 'alice.txt' in Python. How is it done?
The concept of Current Working Directory (CWD) is crucial here. You can think of it as the folder your Python is operating inside at the moment. So far we have been using the absolute path, which begins from the topmost directory. But if your file reference does not start from the top (e.g., 'alice.txt', 'ling1330/alice.txt'), Python assumes that it starts in the CWD (a "relative path").
Using the os.path.abspath function turns a path into a normalized absolute path in the form appropriate for the operating system:
os.path.abspath(r'F:\w8\bazi.py')
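The relationship between a relative path and the CWD can be sketched like this. The file name 'alice.txt' is reused from the explanation above; the file does not need to exist for these calls.

```python
import os

cwd = os.getcwd()            # the folder Python is currently operating in
relative = 'alice.txt'       # no leading drive or root, so it is relative
absolute = os.path.abspath(relative)

print(os.path.isabs(relative))   # False
print(os.path.isabs(absolute))   # True
# abspath resolves the relative name against the CWD.
print(absolute == os.path.join(cwd, relative))  # True
```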

Can't replace character "\"

I've been working on a program that reads a specific PDF and converts the data to an Excel file. The program itself already works, but while trying to refine some aspects I ran into a problem. The modules I'm working with read directories with forward slashes dividing each folder, such as:
"C:/Users/UserX"
While windows directories are divided by backslashes, such as:
"C:\Users\UserX"
I thought using a simple replace would work just fine:
directory.replace("\" ,"/")
But whenever I try to run the program, the \ isn't recognized as a string. Instead it shows up in orange in the IDE I'm working with (PyCharm). Is there any way to fix this? Or maybe another useful solution?
In general you should work with the os.path module here.
os.getcwd() gives you the current directory; with os.path.join you can append a subfolder of it as further arguments and put the file name last.
import os
path_to_file = os.path.join(os.getcwd(), "childFolder", filename)
In Python, the '\' character is represented by '\\':
directory.replace("\\" ,"/")
Just try adding another backslash.
First of all you need to pass "C:\Users\UserX" as a raw string. Use
directory=r"C:\Users\UserX"
Secondly, escape the backslash with a second backslash.
directory.replace("\\" ,"/")
All of this is required as in python the backslash (\) is a special character known as an escape character.
Try this:
import os
path = "C:\\temp\\myFolder\\example\\"
newPath = path.replace(os.sep, '/')  # os.sep is '\\' on Windows only
print(newPath)
Output: C:/temp/myFolder/example/
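Putting the answers together, a minimal sketch of two ways to get forward slashes. The pathlib variant is an addition not mentioned in the answers; it gives the same result without any manual escaping.

```python
from pathlib import PureWindowsPath

directory = r"C:\Users\UserX"

# str.replace returns a new string, so the result has to be assigned.
with_replace = directory.replace("\\", "/")

# PureWindowsPath parses the string with Windows rules on any OS.
with_pathlib = PureWindowsPath(directory).as_posix()

print(with_replace)  # C:/Users/UserX
print(with_pathlib)  # C:/Users/UserX
```

Unlike the os.sep version, neither line depends on the operating system the script runs on.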

Why is glob ignoring some directories?

I'm trying to find all *.txt files in a directory with glob(). In some cases, glob.glob('some\path\*.txt') gives an empty list, despite files existing in the given directories. This is especially true if the path is all lower-case or numeric.
As a minimal example I have two folders a and A on my C: drive both holding one Test.txt file.
import glob
files1 = glob.glob('C:\a\*.txt')
files2 = glob.glob('C:\A\*.txt')
yields
files1 = []
files2 = ['C:\\A\\Test.txt']
If this is by design, is there any other directory name, that leads to such unexpected behaviour?
(I'm working on win 7, with Python 2.7.10 (32bit))
EDIT: (2019) Added an answer for Python 3 using pathlib.
The problem is that \a has a special meaning in string literals (bell char).
Just double backslashes when inserting paths in string literals (i.e. use "C:\\a\\*.txt").
Python is different from C because when you use backslash with a character that doesn't have a special meaning (e.g. "\s") Python keeps both the backslash and the letter (in C instead you would get just the "s").
This sometimes hides the issue because things just work anyway even with a single backslash (depending on what is the first letter of the directory name) ...
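A short sketch makes the difference visible by comparing lengths; the literals are shortened versions of the paths in the question.

```python
# '\a' is the bell character (code point 7), so this literal has only 3 characters.
bell = 'C:\a'
print(len(bell))          # 3
print(bell == 'C:\x07')   # True

# Doubling the backslash or using a raw string keeps all 4 characters.
print(len('C:\\a'))         # 4
print(r'C:\a' == 'C:\\a')   # True
```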
I personally avoid using double-backslashes in Windows and just use Python's handy raw-string format. Just change your code to the following and you won't have to escape the backslashes:
import glob
files1 = glob.glob(r'C:\a\*.txt')
files2 = glob.glob(r'C:\A\*.txt')
Notice the r at the beginning of the string.
As already mentioned, \a is a special escape sequence in Python. Here's a link to the list of escape sequences in Python string literals:
https://docs.python.org/2/reference/lexical_analysis.html#string-literals
As my original answer attracted more views than expected and some time has passed, I wanted to add an answer that reliably solves this kind of problem and is also cross-platform compatible. It's written in Python 3 on Windows 10, but should also work on *nix systems.
from pathlib import Path
filepath = Path(r'C:\a')
filelist = list(filepath.glob('*.txt'))
--> [WindowsPath('C:/a/Test.txt')]
I like this solution better, as I can copy and paste paths directly from windows explorer, without the need to add or double backslashes etc.
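For anyone who wants to try the pathlib approach without a C:\a folder, here is a self-contained sketch. It builds a temporary folder named 'a' (an assumption made only so the example runs anywhere) and globs it the same way.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / 'a'          # lower-case name, as in the question
    folder.mkdir()
    (folder / 'Test.txt').write_text('hello')
    (folder / 'notes.md').write_text('ignored')

    # Path.glob matches inside the folder; no backslashes appear in any literal.
    found = sorted(p.name for p in folder.glob('*.txt'))

print(found)  # ['Test.txt']
```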

Long paths in Python on Windows

I have a problem when programming in Python running under Windows. I need to work with file paths, that are longer than 256 or whatsathelimit characters.
Now, I've read basically about two solutions:
Use GetShortPathName from kernel32.dll and access the file in this way.
That is nice, but I cannot use it, since I need to use the paths in a way
shutil.rmtree(short_path)
where the short_path is a really short path (something like D:\tools\Eclipse) and the long paths appear in the directory itself (damn Eclipse plugins).
Prepend "\\\\?\\" to the path
I haven't managed to make this work in any way. Any attempt to do anything this way always results in the error WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: <path here>
So my question is: How do I make the 2nd option work? I stress that I need to use it the same way as in the example in option #1.
OR
Is there any other way?
EDIT: I need the solution to work in Python 2.7
EDIT2: The question Python long filename support broken in Windows does give the answer with the 'magic prefix' and I stated that I know it in this question. The thing I do not know is HOW do I use it. I've tried to prepend that to the path but it just failed, as I've written above.
Well it seems that, as always, I've found the answer to what's been bugging me for a week twenty minutes after I seriously ask somebody about it.
So I've found that I need to make sure two things are done correctly:
The path can contain only backslashes, no forward slashes.
If I want to do something like list a directory, I need to end the path with a backslash, otherwise Python will append /*.* to it, which is a forward slash, which is bad.
Hope at least someone will find this useful.
Let me just simplify this for anyone looking for a straight answer:
For python < 3: Path needs to be unicode, prepend string with u like u'C:\\path\\to\\file'
Path needs to start with \\\\?\\ (which is escaped into \\?\) like u'\\\\?\\C:\\path\\to\\file'
No forward slashes only backslashes: / --> \\
It has to be an absolute path; it does not work for relative paths
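The rules above can be collected into a small helper. This is only a sketch: add_long_path_prefix is a hypothetical name, it is written for Python 3 (where strings are already unicode), and it only builds the string; it does not touch the file system. It uses ntpath so it behaves the same on any OS.

```python
import ntpath

def add_long_path_prefix(path):
    """Return `path` with the \\\\?\\ long-path prefix (hypothetical helper)."""
    prefix = '\\\\?\\'                 # the four characters \\?\
    if path.startswith(prefix):
        return path                    # already prefixed
    path = ntpath.normpath(path)       # turns forward slashes into backslashes
    if not ntpath.isabs(path):
        raise ValueError('the prefix only works with absolute paths')
    if path.startswith('\\\\'):
        # Network (UNC) paths take the form \\?\UNC\server\share\...
        return prefix + 'UNC\\' + path[2:]
    return prefix + path

print(add_long_path_prefix('C:/path/to/file'))  # \\?\C:\path\to\file
```

Raising an error for relative paths is a design choice here; converting them with an absolute-path function first would be an alternative.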
py 3.8.2
# Fix long path access:
import ntpath
ntpath.realpath = ntpath.abspath
# Fix long path access.
In my case, this solved the problem of running a script from a long path.
(https://developers.google.com/drive/api/v3/quickstart/python)
But this is not a universal fix.
It looks like the ntpath.realpath implementation has problems. This code replaced it with a dummy.
This works for me:
import os
str1=r"C:\Users\manual\demodfadsfljdskfjslkdsjfklaj\inner-2djfklsdfjsdklfj\inner3fadsfksdfjdklsfjksdgjl\inner4dfhasdjfhsdjfskfklsjdkjfleioreirueewdsfksdmv\anotherInnerfolder4aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\5qbbbbbbbbbbbccccccccccccccccccccccccsssssssssssssssss\tmp.txt"
print(len(str1)) #346
path = os.path.abspath(str1)
if path.startswith(u"\\\\"):
    # network (UNC) path: \\server\share\... becomes \\?\UNC\server\share\...
    path=u"\\\\?\\UNC\\"+path[2:]
else:
    path=u"\\\\?\\"+path
with open(path,"r+") as f:
    print(f.readline())
If you get a long-path issue on Windows (more than 258 characters), then try this.

Routine in python to test for a string if has a *nix valid absolute path?

I need a routine in Python to test whether a string contains an absolute path in Unix-style format.
So that /home/eduard/tmp/chrome-data-dir/file.ext would be a valid path.
But C:\Users\user\AppData\Local\Google\Chrome\Application\chrome.exe would not be a valid path.
I also need the tested path not to contain characters that might be considered special, such as * and ?.
import posixpath
posixpath.isabs('/home/eduard/tmp/chrome-data-dir/file.ext')
posixpath is the implementation of os.path used on Unix-like systems. See also isabs documentation.
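A minimal sketch that combines posixpath.isabs with the wildcard requirement from the question. The function name is_plain_unix_abs_path is hypothetical, and the set of "special" characters is assumed to be just * and ? as listed by the asker.

```python
import posixpath

def is_plain_unix_abs_path(path):
    """True if `path` is an absolute POSIX path without * or ? (hypothetical helper)."""
    has_wildcard = any(ch in path for ch in '*?')
    return posixpath.isabs(path) and not has_wildcard

print(is_plain_unix_abs_path('/home/eduard/tmp/chrome-data-dir/file.ext'))  # True
print(is_plain_unix_abs_path(r'C:\Users\user\AppData\Local\Google\Chrome\Application\chrome.exe'))  # False
print(is_plain_unix_abs_path('/home/eduard/*.ext'))  # False
```

This checks the syntax of the string only; it does not test whether the path exists.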
Your first example is not a relative path, it's absolute because it begins with /. The second is also absolute, since the first character after the drive name is a \.
A relative path in Unix would be something like chrome-data-dir/file.ext or ../../include/.
Your question is kind of unclear.
Perhaps you should look for a colon?
If I understand you, your first example IS an absolute path. All absolute paths will start with a "/" as they must start at the root directory, and all relative paths will not. So, just check if your string starts with a "/" using str.startswith('/'). Then, if you want to check if the path is valid, then use os.path.exists().
Your second example is not a *nix path.
