Understanding file locations in python - unexpected errors - python

I am learning python 3.3 in windows 7. I have a two text files - lines.txt and raven.txt in a folder. Both contain the same text for the first example.
When I try to access ravens, using the code below, I get the error -
OSError: [Errno 22] Invalid argument: 'C:\\Python\raven.txt'
I know that the above error can be fixed by using an escape character like this -
C:\\Python\\raven.txt
C:\Python\\raven.txt
Why do both methods work ? Strangely, when I access lines.txt in the same folder, I get no error ! Why ?
import re
def main():
print('')
fh = open('C:\Python\lines.txt')
for line in fh:
if re.search('(Len|Neverm)ore', line):
print(line, end = '')
if __name__ == '__main__':main()
Also, when I use the line below, I get a completely different error - TypeError: embedded NUL character. Why ?
fh = open('C:\Python\Exercise Files\09 Regexes\raven.txt')
I can rectify this by using \ before every \ in the file path.

\r is an escape character, but \l is not. So, lines is interpreted as lines while raven is interpreted as aven, since \r is escaped.
In [1]: len('\l')
Out[1]: 2
In [2]: len('\r')
Out[2]: 1
You should always escape backslashes with \\. In cases your string doesn't have quotes, you can also use raw strings:
In [9]: len(r'\r')
Out[9]: 2
In [10]: r'\r'
Out[10]: '\\r'
See: https://docs.python.org/3/reference/lexical_analysis.html

maybe you can use raw string.
just like this open(r'C:\Python\Exercise Files\09 Regexes\raven.txt').
When an r' orR' prefix is present, backslashes are still used to
quote the following character, but all backslashes are left in the
string. For example, the string literal r"\n" consists of two
characters: a backslash and a lowercase `n'. String quotes can be
escaped with a backslash, but the backslash remains in the string; for
example, r"\"" is a valid string literal consisting of two characters:
a backslash and a double quote; r"\" is not a value string literal
(even a raw string cannot end in an odd number of backslashes).
Specifically, a raw string cannot end in a single backslash (since the
backslash would escape the following quote character). Note also that
a single backslash followed by a newline is interpreted as those two
characters as part of the string, not as a line continuation.

You can actually use forward slashes instead of backward ones, that way you don't have to escape them at all, which would save you a lot of headaches. Like this: 'C:/Python/raven.txt', I can guarantee that it works on Windows.

Related

Trouble loading csv file into Jupyter Notebooks [duplicate]

This question already has answers here:
How should I write a Windows path in a Python string literal?
(5 answers)
Closed 3 years ago.
I'm trying to read a CSV file into Python (Spyder), but I keep getting an error. My code:
import csv
data = open("C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
data = csv.reader(data)
print(data)
I get the following error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes
in position 2-3: truncated \UXXXXXXXX escape
I have tried to replace the \ with \\ or with / and I've tried to put an r before "C.., but all these things didn't work.
This error occurs, because you are using a normal string as a path. You can use one of the three following solutions to fix your problem:
1: Just put r before your normal string. It converts a normal string to a raw string:
pandas.read_csv(r"C:\Users\DeePak\Desktop\myac.csv")
2:
pandas.read_csv("C:/Users/DeePak/Desktop/myac.csv")
3:
pandas.read_csv("C:\\Users\\DeePak\\Desktop\\myac.csv")
The first backslash in your string is being interpreted as a special character. In fact, because it's followed by a "U", it's being interpreted as the start of a Unicode code point.
To fix this, you need to escape the backslashes in the string. The direct way to do this is by doubling the backslashes:
data = open("C:\\Users\\miche\\Documents\\school\\jaar2\\MIK\\2.6\\vektis_agb_zorgverlener")
If you don't want to escape backslashes in a string, and you don't have any need for escape codes or quotation marks in the string, you can instead use a "raw" string, using "r" just before it, like so:
data = open(r"C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
You can just put r in front of the string with your actual path, which denotes a raw string. For example:
data = open(r"C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
Consider it as a raw string. Just as a simple answer, add r before your Windows path.
import csv
data = open(r"C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
data = csv.reader(data)
print(data)
Try writing the file path as "C:\\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener" i.e with double backslash after the drive as opposed to "C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener"
Add r before your string. It converts a normal string to a raw string.
As per String literals:
String literals can be enclosed within single quotes (i.e. '...') or double quotes (i.e. "..."). They can also be enclosed in matching groups of three single or double quotes (these are generally referred to as triple-quoted strings).
The backslash character (i.e. \) is used to escape characters which otherwise will have a special meaning, such as newline, backslash itself, or the quote character. String literals may optionally be prefixed with a letter r or R. Such strings are called raw strings and use different rules for backslash escape sequences.
In triple-quoted strings, unescaped newlines and quotes are allowed, except that the three unescaped quotes in a row terminate the string.
Unless an r or R prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C.
So ideally you need to replace the line:
data = open("C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
To any one of the following characters:
Using raw prefix and single quotes (i.e. '...'):
data = open(r'C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener')
Using double quotes (i.e. "...") and escaping backslash character (i.e. \):
data = open("C:\\Users\\miche\\Documents\\school\\jaar2\\MIK\\2.6\\vektis_agb_zorgverlener")
Using double quotes (i.e. "...") and forwardslash character (i.e. /):
data = open("C:/Users/miche/Documents/school/jaar2/MIK/2.6/vektis_agb_zorgverlener")
Just putting an r in front works well.
eg:
white = pd.read_csv(r"C:\Users\hydro\a.csv")
It worked for me by neutralizing the '' by f = open('F:\\file.csv')
The double \ should work for Windows, but you still need to take care of the folders you mention in your path. All of them (except the filename) must exist. Otherwise you will get an error.

How to get rid of trailing \ while reading a file in python3

I am reading a file in python and getting the lines from it.
However, after printing out the values I get, I realize that after each line there is a trailing \ at the end.
I have looked at Python strip with \n and tried everything in it but nothing has removed the trailing .
For example
0048\
0051\
0052\
0054\
0056\
0057\
0058\
0059\
How can I get rid of these slashes?
Here is the code I have so far
for line in f:
line = line.replace('\\n', "")
line = line.replace('\\n', "")
print(line)
I've even tried using regex
strings = re.findall(r"\S+", f.read())
But nothing has worked so far.
You're probably confused about what is in the lines, and as a result you're confusing me too. '\n' is a single newline character, as shown using repr() (which is your friend when you want to know what a value is exactly). A line typically ends with that (the exception being the end of file which might not). That does not contain a backslash; that backslash is part of a string literal escape sequence. Your replace argument of '\\n' contains two characters, a backslash followed by the letter n. This wouldn't match a '\n'; the easiest way to remove the newline specifically is to use str.rstrip('\n'). The line reading itself will guarantee that there's only up to one newline, and it is at the end of the string. Frequently we use strip() with no argument instead as we don't want whitespace either.
If your string really does contain backslash, you can process that as well, whether using replace, strip, re or some other string processing. Just keep in mind that it might be used for escape sequences not only at string literal level but at regular expression level too. For instance, re.sub(r'\\$', '', str) will remove a backslash from the end of a string; the backslash itself is doubled to not mean a special sequence in the regular expression, and the string literal is raw to not need another doubling of the backslashes.

python replace single backslash with double backslash [duplicate]

This question already has answers here:
How can I put an actual backslash in a string literal (not use it for an escape sequence)?
(4 answers)
Closed 7 months ago.
In python, I am trying to replace a single backslash ("\") with a double backslash("\"). I have the following code:
directory = string.replace("C:\Users\Josh\Desktop\20130216", "\", "\\")
However, this gives an error message saying it doesn't like the double backslash. Can anyone help?
No need to use str.replace or string.replace here, just convert that string to a raw string:
>>> strs = r"C:\Users\Josh\Desktop\20130216"
^
|
notice the 'r'
Below is the repr version of the above string, that's why you're seeing \\ here.
But, in fact the actual string contains just '\' not \\.
>>> strs
'C:\\Users\\Josh\\Desktop\\20130216'
>>> s = r"f\o"
>>> s #repr representation
'f\\o'
>>> len(s) #length is 3, as there's only one `'\'`
3
But when you're going to print this string you'll not get '\\' in the output.
>>> print strs
C:\Users\Josh\Desktop\20130216
If you want the string to show '\\' during print then use str.replace:
>>> new_strs = strs.replace('\\','\\\\')
>>> print new_strs
C:\\Users\\Josh\\Desktop\\20130216
repr version will now show \\\\:
>>> new_strs
'C:\\\\Users\\\\Josh\\\\Desktop\\\\20130216'
Let me make it simple and clear. Lets use the re module in python to escape the special characters.
Python script :
import re
s = "C:\Users\Josh\Desktop"
print s
print re.escape(s)
Output :
C:\Users\Josh\Desktop
C:\\Users\\Josh\\Desktop
Explanation :
Now observe that re.escape function on escaping the special chars in the given string we able to add an other backslash before each backslash, and finally the output results in a double backslash, the desired output.
Hope this helps you.
Use escape characters: "full\\path\\here", "\\" and "\\\\"
In python \ (backslash) is used as an escape character. What this means that in places where you wish to insert a special character (such as newline), you would use the backslash and another character (\n for newline)
With your example string you would notice that when you put "C:\Users\Josh\Desktop\20130216" in the repl you will get "C:\\Users\\Josh\\Desktop\x8130216". This is because \2 has a special meaning in a python string. If you wish to specify \ then you need to put two \\ in your string.
"C:\\Users\\Josh\\Desktop\\28130216"
The other option is to notify python that your entire string must NOT use \ as an escape character by pre-pending the string with r
r"C:\Users\Josh\Desktop\20130216"
This is a "raw" string, and very useful in situations where you need to use lots of backslashes such as with regular expression strings.
In case you still wish to replace that single \ with \\ you would then use:
directory = string.replace(r"C:\Users\Josh\Desktop\20130216", "\\", "\\\\")
Notice that I am not using r' in the last two strings above. This is because, when you use the r' form of strings you cannot end that string with a single \
Why can't Python's raw string literals end with a single backslash?
https://pythonconquerstheuniverse.wordpress.com/2008/06/04/gotcha-%E2%80%94-backslashes-are-escape-characters/
Maybe a syntax error in your case,
you may change the line to:
directory = str(r"C:\Users\Josh\Desktop\20130216").replace('\\','\\\\')
which give you the right following output:
C:\\Users\\Josh\\Desktop\\20130216
The backslash indicates a special escape character. Therefore, directory = path_to_directory.replace("\", "\\") would cause Python to think that the first argument to replace didn't end until the starting quotation of the second argument since it understood the ending quotation as an escape character.
directory=path_to_directory.replace("\\","\\\\")
Given the source string, manipulation with os.path might make more sense, but here's a string solution;
>>> s=r"C:\Users\Josh\Desktop\\20130216"
>>> '\\\\'.join(filter(bool, s.split('\\')))
'C:\\\\Users\\\\Josh\\\\Desktop\\\\20130216'
Note that split treats the \\ in the source string as a delimited empty string. Using filter gets rid of those empty strings so join won't double the already doubled backslashes. Unfortunately, if you have 3 or more, they get reduced to doubled backslashes, but I don't think that hurts you in a windows path expression.
You could use
os.path.abspath(path_with_backlash)
it returns the path with \
Use:
string.replace(r"C:\Users\Josh\Desktop\20130216", "\\", "\\")
Escape the \ character.

Replacing strings in a text file using Python adds weird characters

I want to replace a text with a path in each line of a text file using Python, but I am getting weird characters (squares) in the path in output file.
Current code:
#!/usr/bin/env python
f1 = open('input.txt', 'r')
f2 = open('output.txt', 'w')
for line in f1:
f2.write(line.replace('test/software', 'C:\Software\api\render\3bit\sim>'))
f1.close()
f2.close()
In the output text the following in the path is replaced with a square (weird character):
\a = changed to a square
\r = changed to a square
\3 = changed to a square
Is there something wrong with my code or are the above letters reserved for the system?
Python strings support escape codes; a backslash with certain characters is replaced by the code they represent. \r is interpreted as the ASCII line-feed character, for example, \a is an ASCII BELL, and \3 is interpreted as the ascii codepoint 3 (in octal numbering). See the Python string literal documentation.
To disable escape codes being interpreted, use a raw python string by prefixing the string definition with a r:
r'C:\Software\api\render\3bit\sim>'
so your line reads:
f2.write(line.replace('test/software', r'C:\Software\api\render\3bit\sim>'))
Alternatively, double the backslashes to have them interpreted as a literal backslashes instead:
'C:\\Software\\api\\render\\3bit\\sim>'
Your string has backslashes which are read in Python as escape codes. These are when a character preceded by a backslash is changed into a special character. For example, \n is a newline. You will need to either escape them (with another backslash) or just use a raw string.
r'C:\Software\api\render\3bit\sim>'
Before each of your path files, add an "r" character to create a raw string, this might fix the issue. Example:
f2.write(line.replace('test/software', r'C:\Software\api\render\3bit\sim>'))
Or alternatively, escape your backslashes:
f2.write(line.replace('test/software', 'C:\\Software\\api\\render\\3bit\\sim>'))

how to replace back slash character with empty string in python [duplicate]

This question already has answers here:
How can I print a single backslash?
(4 answers)
Closed 7 months ago.
I am trying replace a backslash '\' in a string with the following code
string = "<P style='TEXT-INDENT'>\B7 </P>"
result = string.replace("\",'')
result:
------------------------------------------------------------
File "<ipython console>", line 1
result = string.replace("\",'')
^
SyntaxError: EOL while scanning string literal
Here i don't need the back slashes because actually i am parsing an xml file which has a tag in the above format, so if backslashes are there it is displaying invalid token during parsing
Can i know how to replace the backslashes with empty string in python
We need to specify that we want to replace a string that contains a single backslash. We cannot write that as "\", because the backslash is escaping the intended closing double-quote. We also cannot use a raw string literal for this: r"\" does not work.
Instead, we simply escape the backslash using another backslash:
result = string.replace("\\","")
The error is because you did not add a escape character to your '\', you should give \\ for backslash (\)
In [147]: foo = "a\c\d" # example string with backslashes
In [148]: foo
Out[148]: 'a\\c\\d'
In [149]: foo.replace('\\', " ")
Out[149]: 'a c d'
In [150]: foo.replace('\\', "")
Out[150]: 'acd'
In Python, as explained in the documentation:
The backslash () character is used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character.
So, in order to replace \ in a string, you need to escape the backslash itself with another backslash, thus:
>>> "this is a \ I want to replace".replace("\\", "?")
'this is a ? I want to replace'
Using regular expressions:
import re
new_string = re.sub("\\\\", "", old_string)
The trick here is that "\\\\" is a string literal describing a string containing two backslashes (each one is escaped), then the regex engine compiles that into a pattern that will match one backslash (doing a separate layer of unescaping).
Adding a solution if string='abcd\nop.png'
result = string.replace("\\","")
This above won't work as it'll give result='abcd\nop.png'.
Here if you see \n is a newline character. So we have to replace backslah char in raw string(as there '\n' won't be detected)
string.encode('unicode_escape')
result = string.replace("\\", "")
#result=abcdnop.png
You need to escape '\' with one extra backslash to compare actually with \.. So you should use '\'..
See Python Documentation - section 2.4 for all the escape sequences in Python.. And how you should handle them..
It's August 2020.
Python 3.8.1
Pandas 1.1.0
At this point in time I used both the double \ backslash AND the r.
df.replace([r'\\'], [''], regex=True, inplace=True)
Cheers.

Categories

Resources