Is os.path.basename meant for file system files? - python

In Windows, os.path.basename('D:\\abc\def.txt') returns abc\def.txt, whereas os.path.basename('/abc/def.txt') returns def.txt.
Shouldn't the first also return def.txt?

You have an escape sequence in your filename, not a \ directory separator. You must have simplified your problem by using def for the filename; had you actually tested with that simplified filename, you'd have noticed that the backslash is doubled in the result:
>>> 'D:\\abc\def.txt'
'D:\\abc\\def.txt'
Note that the \d in the string literal became a \\ escaped backslash in the Python representation of the value. That's because there is no valid \d escape sequence. On a Windows system the os.path.basename() call works as expected for that path:
>>> import os.path
>>> os.path.basename('D:\\abc\\def.txt')
'def.txt'
In your case, however, you created an escape sequence, either \n, \r or \t, because you either forgot to double the backslash or you forgot to use a raw string. You do not have a \ character in that part of the filename, so there is nothing to split on at that location.
Use an r'...' raw string to prevent single backslashes from forming escape sequences, double your backslashes in all locations, or use forward slashes (Windows accepts either).
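To make the difference concrete, here is a minimal sketch. It assumes a hypothetical filename tdef.txt (so that the stray \t really is an escape sequence) and uses the standard ntpath module, which applies Windows path rules on any platform, in place of os.path.

```python
import ntpath  # Windows flavour of os.path, importable on any platform

# "\t" is a tab escape, so this literal has no backslash before "def.txt".
with_escape = "D:\\abc\tdef.txt"
# Raw string: every backslash stays a literal backslash.
raw = r"D:\abc\tdef.txt"
# Forward slashes are also treated as separators under Windows rules.
forward = "D:/abc/tdef.txt"

print(repr(ntpath.basename(with_escape)))  # 'abc\tdef.txt' (contains a tab)
print(repr(ntpath.basename(raw)))          # 'tdef.txt'
print(repr(ntpath.basename(forward)))      # 'tdef.txt'
```

Only the first literal fails to split, because the character after abc is a tab rather than a separator.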

Related

Python assign "\" to a variable [duplicate]

When I write print('\') or print("\") or print("'\'"), Python doesn't print the backslash \ symbol. Instead it errors for the first two and prints '' for the third. What should I do to print a backslash?
This question is about producing a string that has a single backslash in it. This is particularly tricky because it cannot be done with raw strings. For the related question about why such a string is represented with two backslashes, see Why do backslashes appear twice?. For including literal backslashes in other strings, see using backslash in python (not to escape).
You need to escape your backslash by preceding it with, yes, another backslash:
print("\\")
And for versions prior to Python 3:
print "\\"
The \ character is called an escape character, which interprets the character following it differently. For example, n by itself is simply a letter, but when you precede it with a backslash, it becomes \n, which is the newline character.
As you can probably guess, \ also needs to be escaped so it doesn't function like an escape character. You have to... escape the escape, essentially.
See the Python 3 documentation for string literals.
A hacky way of printing a backslash that doesn't involve escaping is to pass its character code to chr:
>>> print(chr(92))
\
print(fr"\{''}")
or how about this
print(r"\ "[0])
For completeness: A backslash can also be escaped as a hex sequence: "\x5c"; or a short Unicode sequence: "\u005c"; or a long Unicode sequence: "\U0000005c". All of these will produce a string with a single backslash, which Python will happily report back to you in its canonical representation - '\\'.
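As a quick check of the spellings mentioned in these answers, the sketch below collects them in a dictionary (the key names are just labels chosen for this example) and confirms that each one is the same one-character string.

```python
# Different ways of writing a string that contains exactly one backslash.
spellings = {
    "doubled": "\\",
    "chr": chr(92),
    "hex": "\x5c",
    "short unicode": "\u005c",
    "long unicode": "\U0000005c",
    "raw slice": r"\ "[0],
}

for name, value in spellings.items():
    # repr() shows the canonical form '\\'; len() shows it is one character.
    print(f"{name:14} len={len(value)} repr={value!r}")

all_same = len(set(spellings.values())) == 1
print(all_same)  # True
```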

Using os.chdir to access a file in which a folder starts with '\f'

I know that \f is a form feed. I want to access my folder the following way:
os.chdir("C:\Python27\BGT_Python\skills\fuzzymatching")
The folder 'fuzzymatching' starts with the \f symbol which breaks the string.
What's the easiest way to get around these types of symbols?
Add an r character in front of the string:
os.chdir(r"C:\Python27\BGT_Python\skills\fuzzymatching")
See the Python docs.
In triple-quoted strings, unescaped newlines and quotes are allowed (and are retained), except that three unescaped quotes in a row terminate the string. (A "quote" is the character used to open the string, i.e. either ' or ".)
and
Unless an 'r' or 'R' prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C.
For completeness, I'll add:
os.chdir("C:/Python27/BGT_Python/skills/fuzzymatching")
About the only part of Windows that actually requires backslashes is the command line.
This should work:
os.chdir("C:\Python27\BGT_Python\skills\\fuzzymatching")
I just added a \ to escape the \f.
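The three fixes suggested in these answers can be compared side by side. This is only a sketch: it looks at the last part of the path from the question and uses ntpath (Windows path rules, available on every platform) to normalise the forward-slash version.

```python
import ntpath

broken = "skills\fuzzymatching"     # "\f" collapses into one form-feed character
raw = r"skills\fuzzymatching"       # raw string keeps the backslash
doubled = "skills\\fuzzymatching"   # escaped backslash
forward = "skills/fuzzymatching"    # forward slash instead

print("\x0c" in broken, "\\" in broken)   # True False
print(raw == doubled)                     # True
print(ntpath.normpath(forward) == raw)    # True
```

Note that doubling only the backslash before f, as in the last answer, works here because \P, \B and \s are not escape sequences; doubling every backslash or using a raw string does not depend on that.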

Understanding file locations in python - unexpected errors

I am learning Python 3.3 on Windows 7. I have two text files, lines.txt and raven.txt, in a folder. Both contain the same text for the first example.
When I try to open raven.txt using the code below, I get the error -
OSError: [Errno 22] Invalid argument: 'C:\\Python\raven.txt'
I know that the above error can be fixed by using an escape character like this -
C:\\Python\\raven.txt
C:\Python\\raven.txt
Why do both methods work? Strangely, when I access lines.txt in the same folder, I get no error. Why?
import re
def main():
    print('')
    fh = open('C:\Python\lines.txt')
    for line in fh:
        if re.search('(Len|Neverm)ore', line):
            print(line, end='')
if __name__ == '__main__':
    main()
Also, when I use the line below, I get a completely different error: TypeError: embedded NUL character. Why?
fh = open('C:\Python\Exercise Files\09 Regexes\raven.txt')
I can rectify this by putting a \ before every \ in the file path.
\r is an escape sequence (a carriage return), but \l is not. So '\lines' keeps its backslash and is read as \lines, while '\raven' is read as a carriage return followed by aven, because \r is replaced by a single character.
In [1]: len('\l')
Out[1]: 2
In [2]: len('\r')
Out[2]: 1
You should always escape backslashes as \\. If your string doesn't contain quote characters, you can also use a raw string:
In [9]: len(r'\r')
Out[9]: 2
In [10]: r'\r'
Out[10]: '\\r'
See: https://docs.python.org/3/reference/lexical_analysis.html
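On recent Python versions (3.6 or newer, as far as I know) an unrecognised escape such as \l still keeps its backslash but also triggers a warning at compile time, which is a handy way to spot these literals. The helper below is a hypothetical name made up for this sketch; the exact warning category depends on the Python version, so none is shown as expected output.

```python
import warnings

def escape_warnings(source):
    """Compile a snippet and return the names of the warnings it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<demo>", "eval")
    return [w.category.__name__ for w in caught]

# The snippets are passed as raw strings so that the backslash reaches the compiler.
print(escape_warnings(r"'\l'"))   # unrecognised escape: a warning is reported
print(escape_warnings(r"'\r'"))   # valid escape: no warning
print(escape_warnings(r"r'\l'"))  # raw string: no warning
```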
You could also use a raw string, like this:
open(r'C:\Python\Exercise Files\09 Regexes\raven.txt')
When an 'r' or 'R' prefix is present, backslashes are still used to
quote the following character, but all backslashes are left in the
string. For example, the string literal r"\n" consists of two
characters: a backslash and a lowercase 'n'. String quotes can be
escaped with a backslash, but the backslash remains in the string; for
example, r"\"" is a valid string literal consisting of two characters:
a backslash and a double quote; r"\" is not a valid string literal
(even a raw string cannot end in an odd number of backslashes).
Specifically, a raw string cannot end in a single backslash (since the
backslash would escape the following quote character). Note also that
a single backslash followed by a newline is interpreted as those two
characters as part of the string, not as a line continuation.
You can actually use forward slashes instead of backslashes. That way you don't have to escape them at all, which would save you a lot of headaches, like this: 'C:/Python/raven.txt'. I can confirm that it works on Windows.
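The "embedded NUL character" error in the question comes from the \09 in the folder name: \0 is an octal escape for the NUL character, and the 9 that follows is an ordinary digit. A small sketch using just the affected parts of the path:

```python
segment = "\09 Regexes"   # "\0" becomes NUL, "9" stays a separate character
print(repr(segment))      # '\x009 Regexes'
print(len("\09"))         # 2

tail = "\raven.txt"       # "\r" becomes a carriage return
print(len(tail))          # 9, one fewer than the ten characters typed

# A raw string keeps both backslashes and contains no NUL.
fixed = r"\09 Regexes\raven.txt"
print("\x00" in fixed)    # False
```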

String literals for file names

I am new to Python - but not to programming, and on a bit of a steep learning curve.
I have a programme that reads several input files - the first input file contains (amongst other things) the path and name of the other files.
I can open the file and read the name OK. If I print the string it looks like this
'Z:\\python\\rb_data.dat\n'
All my "\" become "\\". I think I can fix this by using the "r" prefix to convert it to a literal.
My question is: how do I attach the prefix to a string variable?
This is what I want to do :
modat = open('z:\\python\mot1 input.txt') # first input file containing names of other file
rbfile = modat.readline() # get new file name
rbdat = open(rbfile) # open new file
The \\ is an escape sequence for the backslash character \. When you specify a string literal, it is enclosed in either ' or ". Because there are some characters you might need to specify to be part of the string which you cannot enter like this—for example the quotation marks themselves—escape sequences allow you to do it. They usually are \x where x is something you want to enter. Now because all escape sequences start with a backslash, the backslash itself also turns into a special character which you cannot specify directly within a string literal. So you need to escape it too.
That means that the string literal '\\' actually represents a string with a single character: the backslash. Raw strings, which are string literals with an r in front of the opening quotation character, ignore (most) escape sequences. So r'\\x' is actually the string where two backslashes are followed by an x. So it’s identical to the string described by the non-raw string literal '\\\\x'.
All this only applies to string literals though. The string itself holds no information about whether it was created with a raw string literal or not, or whether some escape sequence was needed or not. It just contains all the characters that make up the string.
That also means that as soon as you get a string from somewhere, for example by reading it from a file, then you don’t need to worry about escaping something in there to make sure that it’s a correct string. It just is.
So in your code, when you open the file at z:\python\mot1 input.txt, you need to specify that filename as a string first. So you have to use a string literal, either with escaping the backslashes, or by using a raw string.
Then, when you read the new filename from that file, you already have a real string, and don’t need to bother with anything more. Assuming that it was correctly written to the file, you can just use it like that.
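A small sketch of that point, using io.StringIO as a stand-in for the first input file (the contents are assumed from the question). The doubled backslashes only appear in the repr; the string itself holds single backslashes, and the one thing that does need removing before open() is the trailing newline.

```python
import io

# Stand-in for the first input file; on disk it would hold
# Z:\python\rb_data.dat followed by a newline.
modat = io.StringIO("Z:\\python\\rb_data.dat\n")

line = modat.readline()
rbfile = line.strip()      # drop the trailing newline before using it as a path

print(repr(line))          # 'Z:\\python\\rb_data.dat\n'
print(rbfile)              # Z:\python\rb_data.dat
print(rbfile.count("\\"))  # 2
```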
The backslash \ in Python strings (and in code blocks on StackOverflow!) means, effectively, "treat the next character differently". As it is reserved for this purpose, when you actually have a backslash in your strings, it must be "escaped" by a preceding backslash:
>>> myString = "\\" # the first one "escapes" the second
>>> myString = "\" # no escape, so...
SyntaxError: EOL while scanning string literal
>>> print("\\") # when we actually print out the string
\
The short story is, you can basically ignore this in your strings. If you pass rbfile to open, Python will interpret it correctly.
Why not use os.path.normcase, like this:
import os
with open(r'z:\python\mot1 input.txt') as f:
    for line in f:
        if line.strip():
            if os.path.isfile(os.path.normcase(line.strip())):
                with open(line.strip()) as f2:
                    # do something with
                    # f2
                    pass
From the documentation of os.path.normcase:
Normalize the case of a pathname. On Unix and Mac OS X, this returns
the path unchanged; on case-insensitive filesystems, it converts the
path to lowercase. On Windows, it also converts forward slashes to
backward slashes.
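To see what normcase does without needing a Windows machine, this sketch calls the Windows and POSIX flavours of os.path directly (ntpath and posixpath). The path is the one from the question, written with forward slashes and mixed case purely for illustration.

```python
import ntpath      # what os.path is on Windows
import posixpath   # what os.path is on Unix-like systems

name = "Z:/Python/RB_Data.dat"

print(ntpath.normcase(name))     # z:\python\rb_data.dat
print(posixpath.normcase(name))  # Z:/Python/RB_Data.dat
```

normcase is mainly useful for comparing paths; it is not required just to open a file whose name was read from another file.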

How can I put an actual backslash in a string literal (not use it for an escape sequence)?

I have this code:
import os
path = os.getcwd()
final = path +'\xulrunner.exe ' + path + '\application.ini'
print(final)
I want output like:
C:\Users\me\xulrunner.exe C:\Users\me\application.ini
But instead I get an error that looks like:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \xXX escape
I don't want the backslashes to be interpreted as escape sequences, but as literal backslashes. How can I do it?
Note that if the string should only contain a backslash - more generally, should have an odd number of backslashes at the end - then raw strings cannot be used. Please use How can I get a string with a single backslash in it? to close questions that are asking for a string with just a backslash in it. Use How to write string literals in python without having to escape them? when the question is specifically about wanting to avoid the need for escape sequences.
To answer your question directly, put r in front of the string.
final= path + r'\xulrunner.exe ' + path + r'\application.ini'
But a better solution would be os.path.join:
final = os.path.join(path, 'xulrunner.exe') + ' ' + \
        os.path.join(path, 'application.ini')
(the backslash there is escaping a newline, but you could put the whole thing on one line if you want)
I will mention that you can use forward slashes in file paths: Windows itself accepts them as separators, so Python can pass the path along unchanged (nothing is converted for you unless you call something like os.path.normpath). So
final = path + '/xulrunner.exe ' + path + '/application.ini'
should work. But it's still preferable to use os.path.join because that makes it clear what you're trying to do.
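Here is a sketch of the join approach that gives the same result on any platform. It assumes the working directory from the question's desired output (C:\Users\me) instead of calling os.getcwd(), and uses ntpath and pathlib.PureWindowsPath so that Windows rules apply even when run elsewhere.

```python
import ntpath
from pathlib import PureWindowsPath

path = r"C:\Users\me"   # assumed working directory, taken from the desired output

# ntpath.join inserts the separator, so no backslash literal is needed.
final = ntpath.join(path, "xulrunner.exe") + " " + ntpath.join(path, "application.ini")
print(final)  # C:\Users\me\xulrunner.exe C:\Users\me\application.ini

# The same thing with pathlib's pure (no file system access) Windows paths.
base = PureWindowsPath(path)
same = f"{base / 'xulrunner.exe'} {base / 'application.ini'}"
print(same == final)  # True
```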
You can escape the backslash. Use \\ and you get just one backslash.
You can escape the backslash with another backslash (\\), but it won’t look nicer. To solve that, put an r in front of the string to signal a raw string. A raw string will ignore all escape sequences, treating backslashes as literal text. It cannot contain the closing quote unless it is preceded by a backslash (which will be included in the string), and it cannot end with a single backslash (or odd number of backslashes).
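These rules can be checked without crashing your own script by handing small snippets to compile(). The helper name below is made up for this sketch, and each snippet is itself written as a raw string so the backslashes reach the compiler untouched.

```python
def compiles(source):
    """Return True if the snippet is a valid Python expression."""
    try:
        compile(source, "<demo>", "eval")
        return True
    except SyntaxError:
        return False

print(compiles(r"'\xulrunner.exe'"))       # False: truncated \xXX escape
print(compiles(r"r'\xulrunner.exe'"))      # True: raw string, no escapes
print(compiles(r"'\\xulrunner.exe'"))      # True: escaped backslash
print(compiles(r"r'\'"))                   # False: raw string ending in one backslash
print(compiles(r"r'C:\Users\me' '\\'"))    # True: raw literal plus a normal '\\' literal
```

The last line shows one way around the trailing-backslash limit: adjacent string literals are concatenated, so the final backslash can come from an ordinary escaped literal.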
Another simple (and arguably more readable) approach is to use a raw string together with str.format placeholders, like so:
import os
path = os.getcwd()
final = r"{0}\xulrunner.exe {0}\application.ini".format(path)
print(final)
or using the os path method (and a microfunction for readability):
import os
def add_cwd(path):
    return os.path.join(os.getcwd(), path)
xulrunner = add_cwd("xulrunner.exe")
inifile = add_cwd("application.ini")
# in production you would use xulrunner+" "+inifile
# but the purpose of this example is to show a version where you could use any character
# including backslash
final = r"{} {}".format( xulrunner, inifile )
print(final)
