Dealing with BACKSLASH character in non-string literals in Python

I have the following string read from an XML element, and it is assigned to a variable called filename. I don't know how to make this any clearer than saying filename = the following string, without leading someone to think that I have a string literal.
\\server\data\uploads\0224.1307.Varallo.mov
When I try to pass this to
os.path.basename(filename)
I get the following
\\server\\data\\uploads\x124.1307.Varallo.mov
I tried filename.replace('\\','\\\\') but that doesn't work either. os.path.basename(filename) then returns the following.
\\\\server\\data\\uploads\\0224.1307.Varallo.mov
Notice that the \0 is now not being converted to \x but now it doesn't process the string at all.
What can I do to my filename variable to get this string into a proper state, so that os.path.basename() will actually give me back the basename? I am on OS X, so the UNC path handling is not available.
All attempts to replace the \ with \\ manually fail because of the \0 getting converted to \x in the beginning of the basename.
NOTE: this is NOT a string literal so r'' doesn't work.

We need more information. What exactly is in the variable filename? To answer, run print(repr(filename)) and add the result to your question above.
Wild guess
DISCLAIMER: This is a guess - try:
import ntpath
print(ntpath.basename(filename))

All the downvoting in the world won't change the fact that you're doing it wrong. os.path is for native paths. \\foo\bar\baz is not an OS X path; it's a Windows UNC path. posixpath is not equipped to handle UNC paths; ntpath is.
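A minimal sketch of the difference, assuming filename holds exactly the characters shown in the question. The raw literal below is only a stand-in for the value read from XML. Both modules can be imported on any platform:

```python
import ntpath
import posixpath

# Stand-in for the value read from XML; the raw literal is only for the demo.
filename = r'\\server\data\uploads\0224.1307.Varallo.mov'

# posixpath only splits on '/', so the whole UNC string counts as one name.
posix_name = posixpath.basename(filename)

# ntpath understands backslash separators and UNC prefixes.
nt_name = ntpath.basename(filename)

print(posix_name)
print(nt_name)
```

On OS X, os.path is posixpath, which is why os.path.basename() hands the string back unchanged.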

Related

Why do some functions in Python change \ to \\

When I pass a file to shutil.copy as
shutil.copy(r'i:\myfile.txt', r'UNC to where I want it to go')
I get an error
No such file or directory 'i:\\myfile.txt'
I've experienced this problem before with the os module when I have a UNC path. Usually I just get frustrated enough that I forget using the os module and just put the file path into with open() or whatever I'm using it for.
It is my understanding that placing an r before '' is supposed to cause Python to ignore escape characters and treat them as literal characters, but the behavior I'm seeing leads me to believe that this is not the case. For some reason it takes the \ and changes it to \\.
I've seen this when using os.path.join, where the \\ at the beginning of the UNC path gets turned into \\\\.
What is the best way to pass a string literal to ensure that all escape characters are ignored and the string is preserved?
Your string is not being modified by Python. It's the representation of your string that's coming out differently.
When the error is printed, Python calls repr() to print the value. This function will
Return a string containing a printable representation of an object. For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval(), otherwise the representation is a string enclosed in angle brackets that contains the name of the type of the object together with additional information often including the name and address of the object. A class can control what this function returns for its instances by defining a __repr__() method.
This can be very nice when debugging: if I paste that string (quotes, escapes, and all) into the REPL I'll get the string in memory that you were working with. I can use this to interactively try your copy command, maybe tweaking the string a bit.
If you want to see your string in a printed form, you could do
source_path = r'i:\myfile.txt'
target_path = r'UNC to where I want it to go'
print(f'Copying {source_path} to {target_path}...')
shutil.copy(source_path, target_path)
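To make the distinction concrete, here is a small sketch comparing the string itself with its repr(). The names as_printed and as_repr are only illustrative:

```python
source_path = r'i:\myfile.txt'

as_printed = str(source_path)   # the text that print() shows
as_repr = repr(source_path)     # the text that error messages show

print(as_printed)
print(as_repr)

# The string in memory holds a single backslash; only the repr doubles it.
print(source_path.count('\\'))
print(as_repr.count('\\'))
```

The doubled backslash exists only in the displayed representation, so the "No such file or directory" error is about the file itself, not about the backslashes.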

re.escape returns unusable directory

using re.escape() on this directory:
C:\Users\admin\code
Should theoretically return this, right?
C:\\Users\\admin\\code
However, what I actually get is this:
C\:\\Users\\admin\\code
Notice the backslash immediately after C. This makes the string unusable, and trying to use directory.replace('\', '') just bugs out Python because it can't deal with a single backslash string, and treats everything after it as string.
Any ideas?
Update
This was a dumb question :p
No, it should not. Its help says "Escape all the characters in pattern except ASCII letters, numbers and '_'"
What you are reporting you are getting is after calling the print function on the resulting string. In console, if you type directory and press enter, it would give something like: C\\:\\\\Users\\\\admin\\\\code. When using directory.replace('\\','') it would replace all backslashes. For example: directory.replace('\\','x') gives Cx:xxUsersxxadminxxcode. What might work in this case is replacing both the backslash and colon with ':' i.e. directory.replace('\\:',':'). This will work.
However, I will suggest doing something else. A neat way to work with Windows directories in Python is to use forward slash. Python and the OS will work out a way to understand your paths with forward slashes. Further, if you aren't using absolute paths, as far as the paths are concerned, your code will be portable to Unix-style OSes.
It also seems to me that you are calling re.escape unnecessarily. If printing the directory gives you C:\Users\admin\code, then it's already a perfectly fine directory to use, and you don't need to escape it. If it weren't already escaped, print('C:\Users\admin\code') would give something like C:\Usersdmin\code, since \a has a special meaning (beep).
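As a rough illustration of what re.escape is actually for, the sketch below uses it to build a regular expression that matches the path literally. The escaped text is meant to be used as a pattern, not as a path. The exact set of characters that re.escape escapes differs between Python versions, so the example does not rely on its precise output:

```python
import re

directory = r'C:\Users\admin\code'

# re.escape() prepares text for use inside a regex, not for use as a path.
pattern = re.compile(re.escape(directory))
matches_itself = pattern.fullmatch(directory) is not None

# Used unescaped, the path is read as regex syntax; in Python 3 this
# particular path is rejected because "\U" is not a complete escape.
try:
    re.compile(directory)
    unescaped_ok = True
except re.error:
    unescaped_ok = False

print(matches_itself, unescaped_ok)
```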

Issues handling strings with .encode('string-escape') method

I am working with variables containing directory paths in Python on a Windows machine, and as such need to convert string literals to raw strings (removing escape sequences). All is fine when I use the os.getcwd() function and convert using the method .encode('string-escape'), but as soon as I try doing the same with a hard-coded string it won't work. This is especially confusing as both objects are of the same type (string), and as such should behave in exactly the same way.
My code is:
import os
dir1 = os.getcwd()
type1 = type(dir1)
print type1
print dir1.encode('string-escape')
print "\n\n"
dir2 = "C:\Users\StaM\Desktop\brba\test1"
type2 = type(dir2)
print type2
print dir2.encode('string-escape')
And my output is:
<type 'str'>
C:\\Users\\StaM\\Desktop\\brba\\test1
<type 'str'>
C:\\Users\\StaM\\Desktop\x08rba\test1
As you can see both objects are the same type yet the behaviour is different in handling escape sequences. Any ideas on why this is happening and how to get this to work properly? All explanations / suggestions / solutions would be highly appreciated, I really want to understand what is going on here. Thnx
Please note: This question is about the .encode() method and not the 'r' flag... Using the 'r' flag for raw strings is not an option here, as I am passing the variables containing directory paths into my program to construct a larger string to represent DOS commands.
The reason for this behavior is that os.getcwd() returns a string containing real backslash characters, which repr() displays doubled. In the hard-coded literal, Python has already interpreted "\b" and "\t" as escape sequences (backspace and tab) by the time the string exists, so the .encode() method can only show those characters as \x08 and \t; it adds a second "\" only where an actual backslash survived.
>>> import os
>>> dir = os.getcwd()
>>> print "%r" %dir
'C:\\Users\\StaM\\Desktop\\brba\\test1'
The solution here is to use a dictionary to define all possible escape characters, then use a loop to locate these characters in the string in question and insert a second "\" directly before each of them. This should be done prior to using the .encode() method.
BOOM!
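A Python 3 sketch of the dictionary idea, for illustration only. The 'string-escape' codec exists only in Python 2, so it is not used here, and the names ESCAPES and restore_backslashes are hypothetical. The mapping covers only the single-letter escapes, so it cannot undo octal or \x escapes, and it cannot tell an intended tab from a mangled "\t":

```python
# Map each control character back to the two-character text that produced it.
ESCAPES = {
    '\a': r'\a', '\b': r'\b', '\f': r'\f', '\n': r'\n',
    '\r': r'\r', '\t': r'\t', '\v': r'\v',
}

def restore_backslashes(text):
    """Replace known control characters with backslash + letter."""
    return ''.join(ESCAPES.get(ch, ch) for ch in text)

plain = "Desktop\brba\test1"   # "\b" and "\t" become backspace and tab
raw = r"Desktop\brba\test1"    # raw literal keeps both backslashes

print(repr(plain))
print(repr(raw))
print(restore_backslashes(plain) == raw)
```

Where the path can be written as a raw literal, or arrives at runtime from os.getcwd(), a file or user input, no repair step is needed because the backslashes are already real characters.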

ValueError: No escaped character for python shlex.split

I have a string as follows
mystring1=xcopy /Q /Y d:\\Program Files\\TestData\\*.* c:\\Program Files\\TestData\\Company name\\
mystring2=xcopy '/Q' '/Y' 'd:\tj\tjData\\' "c:\Program Files\TestData\\Company name\\"
I used shlex module as follows
mylist1=shlex.split(mystring1)
mylist2=shlex.split(mystring2)
but I am getting an error:
ValueError: No escaped character
mylist1 value should be [xcopy,/Q,/Y,d:\Program Files\TestData\,c:\Program Files\TestData\Company name\]
and
mylist2 value should be [xcopy,/Q,/Y,d:\tj\tjData\,c:\Program Files\TestData\Company name\]
Well, I'm not sure I understand what you want to do but, on the one hand, I see a Windows user and, on the other hand, I see a POSIX option in the manual.
So I thought: "posix=False" is for him.
And here is what it gives:
>>> from shlex import split
>>> mystring1
'xcopy /Q /Y d:\\Program Files\\TestData\\*.* c:\\Program Files\\TestData\\Company name\\'
>>> split(mystring1, posix=False)
['xcopy', '/Q', '/Y', 'd:\\Program', 'Files\\TestData\\*.*', 'c:\\Program', 'Files\\TestData\\Company', 'name\\']
>>> mystring2
'xcopy \'/Q\' \'/Y\' \'d:\tj\tjData\\\' "c:\\Program Files\\TestData\\Company name"'
>>> split(mystring2, posix=False)
['xcopy', "'/Q'", "'/Y'", "'d:\tj\tjData\\'", '"c:\\Program Files\\TestData\\Company name"']
Character escaping is maybe not exactly what you need but, as I do not frequent Windows, I would venture no further on this point.
Edit: as I know it is not always easy to navigate the documentation when you start on a subject, here are some links:
shlex <= you should always RTFM. At least twice.
Python Lexical Analysis <= Might not be obvious, but will change your mind.
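A short sketch of why the error appears and what posix=False changes, using a made-up command string rather than the asker's. In the default POSIX mode a backslash escapes the next character, so a trailing backslash has nothing left to escape:

```python
import shlex

# Hypothetical command ending in a backslash, as Windows directory paths often do.
command = r'xcopy /Q /Y d:\data\*.* c:\backup' + '\\'

# POSIX mode (the default): the final backslash escapes nothing, so split() raises.
try:
    shlex.split(command)
    posix_error = None
except ValueError as exc:
    posix_error = str(exc)

# Non-POSIX mode: backslashes are ordinary characters and are kept as-is.
tokens = shlex.split(command, posix=False)

print(posix_error)
print(tokens)
```

Note that non-POSIX mode also keeps quote characters inside the tokens, as the mystring2 output above shows, so quoted paths need the quotes stripped afterwards.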
The formatting of the input values is really bad.
Consider reading the formatting help.
Which string causes an error?
A first look at your input: The backslash character has a special meaning in Python strings.
So when the path is:
s = 'C:\MSDOS'
you have to write:
s = 'C:\\MSDOS'
The first backslash says: "Attention! The next character is not meant to have a special function", the second backslash is the character itself.
Have a look at http://docs.python.org/release/2.5.2/ref/strings.html

Convert backward slash to forward slash in python

Hi
I have read articles related to converting backward slashes to forward slashes.
But the solution was to use a raw string.
But the problem in my case is:
I will get the file path dynamically in a variable
var='C:\dummy_folder\a.txt'
In this case I need to convert it to forward slashes.
But due to '\a', I am not able to convert it to forward slashes.
How do I convert it? Or how should I change this string to a raw string so that I can change it to forward slashes?
Don't do this. Just use os.path and let it handle everything. You should not explicitly set the forward or backward slashes.
>>> var=r'C:\dummy_folder\a.txt'
>>> var.replace('\\', '/')
'C:/dummy_folder/a.txt'
But again, don't. Just use os.path and be happy!
There is also os.path.normpath(), which converts backslashes and slashes depending on the local OS. See the os.path documentation for detailed usage info. You would use it this way:
>>> string = r'C:/dummy_folder/a.txt'
>>> os.path.normpath(string)
'C:\\dummy_folder\\a.txt'
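Because os.path.normpath() follows the local OS, the result above is what you would see on Windows. A small sketch of the platform dependence, calling the two underlying modules directly so that it behaves the same on any machine:

```python
import ntpath
import posixpath

path = 'C:/dummy_folder/a.txt'

# What os.path.normpath() does on Windows: slashes become backslashes.
windows_style = ntpath.normpath(path)

# What os.path.normpath() does on Linux or OS X: slashes are left alone.
posix_style = posixpath.normpath(path)

print(windows_style)
print(posix_style)
```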
Handling paths as mere strings can get you into trouble, even more so if the path you are handling is user input or may vary in unpredictable ways.
Different OSes have different ways to express the path of a given file, and every modern programming language has its own methods to handle paths and file system references. Python and Ruby certainly have them:
Python: os.path
Ruby: File and FileUtils
If you really need to handle strings:
Python: string.replace
Ruby : string.gsub
Raw strings are for string literals (written directly in the source file), which doesn't seem to be the case here. In any case, forward slashes are not special characters -- they can be embedded in a regular string without problems. It's backslashes that normally have other meaning in a string, and need to be "escaped" so that they get interpreted as literal backslashes.
To replace backslashes with forward slashes:
# Python:
string = r'C:\dummy_folder\a.txt'
string = string.replace('\\', '/')
# Ruby:
string = 'C:\\dummy_folder\\a.txt'
string = string.gsub('\\', '/')
>>> 'C:\\dummy_folder\\a.txt'.replace('\\', '/')
'C:/dummy_folder/a.txt'
In a string literal, you need to escape the \ character.
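As an alternative to str.replace, a sketch using pathlib, assuming the value really is a Windows-style path. The raw literal only stands in for a value that arrives at runtime. A string read from a file or from user input already contains real backslashes, so no raw-string conversion is needed:

```python
from pathlib import PureWindowsPath

# Stand-in for a path obtained at runtime (file, user input, another program).
var = r'C:\dummy_folder\a.txt'

# PureWindowsPath parses Windows separators on any OS; as_posix() emits '/'.
forward = PureWindowsPath(var).as_posix()

print(forward)
```

PureWindowsPath does no file system access, so this works even when the path does not exist on the current machine.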
