Python doesn't create file if not existing - python

I cannot figure out how to create a file that does not exist. I tried the following, yet I get an error saying the file does not exist.
Please guide.
f=open('c:\Lets_Create_Malware\output.txt', 'r+')
f=open('c:\Lets_Create_Malware\output.txt', 'w+')
f=open('c:\Lets_Create_Malware\output.txt', 'a+')
f=open('c:\Lets_Create_Malware\output.txt', 'r')
f=open('c:\Lets_Create_Malware\output.txt', 'w')
f=open('c:\Lets_Create_Malware\output.txt', 'a')

Use a double backslash:
f=open('c:\\Lets_Create_Malware\\output.txt', 'w+')
From the docs:
The backslash (\) character is used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character.

Given the exact paths you've specified, at least some of your examples ought to have worked (unless the c:\Lets_Create_Malware path doesn't exist, which would add to the confusion by causing all of your test cases to fail).
Backslashes aren't a problem here given your examples because the characters being modified aren't special:
f=open('c:\Lets_Create_Malware\output.txt', 'w')
works because \L and \o don't have special meanings and so are used literally (and the 'w' and 'a' flags will create the file if it's not already present).
However, another path:
f=open('c:\Lets_Create_Malware\badname.txt', 'w')
will fail:
IOError: [Errno 22] invalid mode ('w') or filename: 'c:\\Lets_Create_Malware\x08adname.txt'
because the \b part of that filename gets translated as the backspace character (Ctrl-H or \x08).
There are two ways to avoid this problem: either precede the string with the r raw string modifier (e.g., r'foo\bar') or ensure each backslash is escaped (\\). It's preferable to use os.path.join() from the os.path module for this purpose.
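As a rough sketch of the points above (the relative path `out\badname.txt` and the temporary folder are made up for illustration, not taken from the question), the following shows how `\b` changes a plain string literal, how a raw string or a doubled backslash avoids that, and which modes create a missing file:

```python
import os
import tempfile

# '\b' is a recognised escape sequence (backspace, \x08), so the plain
# literal silently loses the backslash and the letter 'b'.
broken = 'out\badname.txt'
escaped = 'out\\badname.txt'   # each backslash doubled
raw = r'out\badname.txt'       # raw string literal

has_backspace = '\x08' in broken
same_text = (escaped == raw)

with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, 'output.txt')

    # 'r+' opens an existing file for reading and writing; it never creates one.
    try:
        open(target, 'r+').close()
        r_plus_created = True
    except FileNotFoundError:
        r_plus_created = False

    # 'w' (like 'a') creates the file if it is missing...
    with open(target, 'w') as f:
        f.write('hello')
    w_created = os.path.exists(target)

    # ...but no mode creates a missing parent directory.
    orphan = os.path.join(folder, 'no_such_dir', 'output.txt')
    try:
        open(orphan, 'w').close()
        missing_dir_failed = False
    except FileNotFoundError:
        missing_dir_failed = True
```

If the folder itself may be absent, calling `os.makedirs(folder, exist_ok=True)` before `open()` is one way to handle it.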

Related

How do I read a filepath from a text file using Python?

I am currently trying to write a simple python script that opens a folder or a list of folders using filepaths that I have written down on my text file.
import os
with open('filepaths.txt') as f:
    [os.startfile(line) for line in f.readlines()]
My issue is that whenever I run this code, python reads the lines as its non-raw form. The backslashes are doubled, and there is a new line "\n" in every string.
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'D:\\Nitro\\Downloads\n'
I have attempted to solve this problem using repr() on the variable. Instead of removing the backslash, it was doubled even further.
import os
with open('filepaths.txt') as f:
    [os.startfile(repr(line)) for line in f.readlines()]
FileNotFoundError: [WinError 2] The system cannot find the file specified: "'D:\\\\Nitro\\\\Downloads\\n'"
I have also attempted to use the string replace function to replace "\" with "". It did not work.
The readlines method reads the file correctly, with the trailing newline character in each line preserved. You just need to strip the newline character from the string before using it as a path name, which is typically done with the str.rstrip method:
for line in f:  # no need to use the readlines method for iteration
    os.startfile(line.rstrip())
The path name in the error message you included in the question contains double backslashes because it is displayed with the repr function, with backslashes escaped already, not because the path names are read incorrectly.
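A small sketch of both points, assuming a throwaway `filepaths.txt` written to a temporary folder (the second path, `D:\Nitro\Music`, is invented for the example): the string read from the file contains single backslashes, `repr()` only displays them doubled, and `rstrip()` removes the trailing newline.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    listing = os.path.join(folder, 'filepaths.txt')
    with open(listing, 'w') as f:
        f.write('D:\\Nitro\\Downloads\n')
        f.write('D:\\Nitro\\Music\n')   # hypothetical second entry

    # readlines() keeps the trailing newline on every line.
    with open(listing) as f:
        raw_lines = f.readlines()

    # Iterating over the file and stripping gives usable path names.
    with open(listing) as f:
        paths = [line.rstrip() for line in f]

first = raw_lines[0]
backslash_count = first.count('\\')   # real backslashes in the string
shown = repr(first)                   # what an error message displays
```

Passing each entry of `paths` to `os.startfile()` (Windows only) would then correspond to the loop shown in the answer.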

disable the automatic change from \r\n to \n in python

I am working under ubuntu on a python3.4 script where I take in parameter a file (encoded to UTF-8), generated under Windows. I have to go through the file line by line (separated by \r\n) knowing that the "lines" contain some '\n' that I want to keep.
My problem is that Python transforms the file's "\r\n" to "\n" when opening. I've tried to open with different modes ("r", "rt", "rU").
The only solution I found is to work in binary mode and not text mode, opening with the "rb" mode.
Is there a way to do it without working in binary mode or a proper way to do it?
Set the newline keyword argument to open() to '\r\n', or perhaps to the empty string:
with open(filename, 'r', encoding='utf-8', newline='\r\n') as f:
This tells Python to only split lines on the \r\n line terminator; \n is left untouched in the output. If you set it to '' instead, \n is also seen as a line terminator but \r\n is not translated to \n.
From the open() function documentation:
newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. [...] If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
Bold emphasis mine.
Following Martijn Pieters' answer, the solution is:
with open(filename, "r", newline='\r\n') as f:
This answer was posted as an edit to the question disable the automatic change from \r\n to \n in python by the OP lu1her under CC BY-SA 3.0.
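To see the difference between the newline settings discussed in this thread, here is a minimal sketch. The sample bytes and the file name `windows.txt` are invented; the file is written in binary mode so the `\r\n` terminators land on disk unchanged.

```python
import os
import tempfile

# Two CRLF-terminated records; the first one contains an embedded '\n'.
data = b'first\nstill first\r\nsecond\r\n'

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'windows.txt')
    with open(path, 'wb') as f:
        f.write(data)

    # Default (newline=None): universal newlines, '\r\n' translated to '\n'.
    with open(path, 'r', encoding='utf-8') as f:
        default_lines = f.readlines()

    # newline='\r\n': only '\r\n' ends a line, nothing is translated.
    with open(path, 'r', encoding='utf-8', newline='\r\n') as f:
        crlf_lines = f.readlines()

    # newline='': every newline style ends a line, nothing is translated.
    with open(path, 'r', encoding='utf-8', newline='') as f:
        untranslated_lines = f.readlines()
```

Only the `newline='\r\n'` variant keeps the embedded `\n` inside its record, which is what the question asked for.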

How to deal with invalid utf8 in fileinput?

I have basically the following code:
def main():
    for filename in fileinput.input():
        filename = filename.strip()
        process_file(filename)
The script takes a newline-separated list of file names as its input. However, some of the file names contain invalid utf8, which causes fileinput.input() to implode. I've read about the surrogateescape error handler, which I think is what I want, but I don't know how to set the error handler for fileinput.
In short: how do I get fileinput to deal with invalid Unicode?
Filenames on POSIX may be arbitrary sequences of bytes (except b'\0' and b'/'), i.e., no character encoding can decode them in the general case. That is why os.fsdecode() exists; it uses the surrogateescape error handler.
You could use a binary mode to read the filenames then either skip undecodable filenames if the input shouldn't contain them or pass them as is (or os.fsdecode()) to functions that expect filenames:
for filename in fileinput.input(mode='rb'):
    process_file(os.fsdecode(filename).strip())
Beware, there were several known Python bugs related to using a binary mode and fileinput e.g.:
fileinput should use stdin.buffer for "rb" mode
fileinput.FileInput.readline() always returns str object at the end even if in 'rb' mode
Following the documentation, use an opening hook:
def main():
    for filename in fileinput.input(openhook=fileinput.hook_encoded("utf-8")):
        filename = filename.strip()
        process_file(filename)
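A plain `hook_encoded("utf-8")` still decodes strictly. On Python 3.6 and later, `fileinput.hook_encoded()` also accepts an `errors` argument, so one possible sketch is to pass `surrogateescape` there. The listing file `names.txt` and the name `report-\xff.txt` below are made up for the demonstration, and the list is read from a file rather than standard input because, as far as I can tell, the hook is only applied to files that fileinput opens itself.

```python
import fileinput
import os
import tempfile

bad_name = b'report-\xff.txt'   # \xff can never appear in valid UTF-8

with tempfile.TemporaryDirectory() as folder:
    listing = os.path.join(folder, 'names.txt')
    with open(listing, 'wb') as f:
        f.write(b'plain.txt\n' + bad_name + b'\n')

    # Undecodable bytes become lone surrogates instead of raising an error.
    hook = fileinput.hook_encoded('utf-8', errors='surrogateescape')
    with fileinput.input(files=[listing], openhook=hook) as lines:
        names = [line.strip() for line in lines]

# Encoding with the same error handler restores the original bytes.
round_tripped = [name.encode('utf-8', 'surrogateescape') for name in names]
```

Strings decoded this way can be handed to `open()` or `os.stat()` on POSIX when the filesystem encoding is UTF-8, since those functions encode file names with the same error handler.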

Python finding file path

if os.path.exists('D:\Python\New folder\'+f):
    open(f+c, 'w')
The f is a character that changes in a loop. How do I add it to the rest of the 'D:\Python\New folder\'? What I've done above makes the whole line highlighted as a comment.
You cannot use a \ backslash as the last character, as \' means use an actual quote character rather than the end of the string.
You should really use os.path.join() here and have Python join the path and the filename together, and use a raw string literal for the path so that the other \ characters don't form escape sequences (\n would be a newline, for example):
path = os.path.join(r'D:\Python\New folder', f)
if os.path.exists(path):
    open(os.path.join(path, c), 'w')
os.path.join() will add the required \ path separators for you.
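To illustrate the joining step on any platform, this sketch uses `ntpath`, the module that `os.path` refers to on Windows. The loop values `'a'` and `'b'` are placeholders for whatever `f` holds in the question.

```python
import ntpath  # Windows flavour of os.path, importable on every platform

folder = r'D:\Python\New folder'   # raw string, no trailing backslash needed
names = ['a', 'b']                 # placeholder values for f

# join() inserts the backslash separator between the two parts.
paths = [ntpath.join(folder, name) for name in names]

# The same result by hand requires escaping the trailing backslash.
by_hand = 'D:\\Python\\New folder\\' + names[0]
```

On Windows itself, `os.path.join()` behaves the same way, so the `ntpath` import is only there to keep the example portable.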
Use os.path.join from Python's os.path module.
Try:
if os.path.exists('D:\Python\New folder\\'+f):
    open(f+c, 'w')

python filename on windows

Disclaimer: I have a similar thread started, but I think it got too big and convoluted.
In short, this is the problem:
import imghdr
import os.path
....
image_type = imghdr.what(os.path.normpath(filename))
fails with
IOError: [Errno 22] invalid mode ('rb') or filename: 'D:\\mysvn\\trunk\\Assets\\models\\character\\char1.jpg\r'
Where the aforementioned file does exist
Help? :D
There is a carriage return character \r at the end of the filename. That is not a valid character for a Windows filename, so I doubt the filename will work.
Use .rstrip('\r') to remove it:
image_type = imghdr.what(os.path.normpath(filename.rstrip('\r')))
.rstrip() removes characters from the end of a string, and only those in the set that you name.
Since this is a filename, any whitespace around the filename is probably incorrect, so a straight-up .strip() would work too:
image_type = imghdr.what(os.path.normpath(filename.strip()))
This would remove tabs, newlines, carriage returns and spaces from both the start and end of the string.
invalid mode ('rb') or filename: 'D:\\...\\char1.jpg\r'
                                                    ^^
You have a trailing carriage return in the file path. Strip it first:
filename = filename.strip()
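A quick sketch of the two stripping variants mentioned in this thread, using the path from the error message. The extra leading spaces and trailing newline in `padded` are added only for illustration.

```python
filename = 'D:\\mysvn\\trunk\\Assets\\models\\character\\char1.jpg\r'

# rstrip('\r') removes only carriage returns, and only from the end.
only_cr = filename.rstrip('\r')

# strip() removes any whitespace (spaces, tabs, '\r', '\n') from both ends.
padded = '  ' + filename + '\n'
both_ends = padded.strip()

# rstrip('\r') leaves other whitespace in place.
still_padded = padded.rstrip('\r')
```

Note that `padded.rstrip('\r')` changes nothing here, because the last character is `\n`, not `\r`.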
