Python write to a file r - python

I am trying to write some basic lines to a text file using Python 3.3.2 (complete beginner here). I'm not sure why a number is returned after the write command. The number seems to be the length of the string. The string does get stored in the new text file, and everything else seems to be okay.
For example:
>>> f=open('testfile.txt','w')
>>> f.write('this is line 1\n')
15
I'm not sure what the number 15 means. Every line I write returns an integer.

From docs:
f.write(string) writes the contents of string to the file, returning
the number of characters written.

A better way to write this would be
print('this is line 1', file=f)
This is more flexible (print() accepts any object, not just str) and it adds the newline character automatically. As an added bonus, it won't echo anything in the Python shell, because print() returns None.
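The difference between the two calls can be checked with a short sketch. The scratch file name testfile_demo.txt and the temporary directory are assumptions made for illustration, not part of the original question.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical scratch file, removed again when the block exits.
    path = os.path.join(tmp_dir, 'testfile_demo.txt')

    with open(path, 'w') as f:
        # write() returns the number of characters written (14 + newline).
        count = f.write('this is line 1\n')
        # print() converts each argument with str() and appends '\n' itself.
        print('this is line', 2, file=f)

    with open(path) as f:
        contents = f.read()

print(count)           # 15
print(repr(contents))  # 'this is line 1\nthis is line 2\n'
```

In the interactive shell the 15 is echoed only because the shell displays the value of any expression that is not None; in a script nothing is shown unless you print it.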

Related

Issue reading text file python

I am new to Python and stuck on an issue that is probably easy for a Python expert. I am trying to read a text file in Python, but I'm not getting the desired output using an f-string.
print(f'{lines[0]} {lines[2]}')
I am getting the output on two lines, although I didn't use \n:
Hello
I am testing!
Expected output:
Hello I am testing!
It's because when you read a file, every line ends with a newline character. So even if you print only one line, you'll get a blank line after the text. You can solve it by using the strip method:
print(f'{lines[0].strip()} {lines[2].strip()}')
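As you can see in your text file, the text is on separate lines, so each line Python reads ends with a newline character \n. You just have to strip that character. (Put the calls outside the f-string: before Python 3.12, an f-string expression cannot contain a backslash or reuse the enclosing quote character.)
print(lines[0].strip('\n'), lines[2].strip('\n'))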
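As a minimal sketch of where the extra newline comes from, the snippet below uses io.StringIO as a stand-in for the opened text file. The three lines of sample text are made up for illustration.

```python
import io

# Stand-in for an open text file; the contents are illustrative only.
fake_file = io.StringIO('Hello\nsecond line\nI am testing!\n')
lines = fake_file.readlines()

# Each element keeps its trailing newline.
print(repr(lines[0]))  # 'Hello\n'

# rstrip() with no argument removes trailing whitespace, including '\n'.
joined = f'{lines[0].rstrip()} {lines[2].rstrip()}'
print(joined)  # Hello I am testing!
```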

How does input() function actually work? [duplicate]

I get how to open files, and then use Python's pre built in functions with them. But how does sys.stdin work?
for something in sys.stdin:
    some stuff here
lines = sys.stdin.readlines()
What's the difference between the above two different uses on sys.stdin? Where is it reading the information from? Is it via keyboard, or do we still have to provide a file?
So you have used Python's "pre built in functions", presumably like this:
file_object = open('filename')
for something in file_object:
    some stuff here
This reads the file by invoking an iterator on the file object which happens to return the next line from the file.
You could instead use:
file_object = open('filename')
lines = file_object.readlines()
which reads the lines from the current file position into a list.
Now, sys.stdin is just another file object, which happens to be opened by Python before your program starts. What you do with that file object is up to you, but it is not really any different from any other file object; it's just that you don't need an open.
for something in sys.stdin:
    some stuff here
will iterate through standard input until end-of-file is reached. And so will this:
lines = sys.stdin.readlines()
Your first question is really about different ways of using a file object.
Second, where is it reading from? It is reading from file descriptor 0 (zero). On Windows it is file handle 0 (zero). File descriptor/handle 0 is connected to the console or tty by default, so in effect it is reading from the keyboard. However it can be redirected, often by a shell (like bash or cmd.exe) using syntax like this:
myprog.py < input_file.txt
That alters file descriptor zero to read a file instead of the keyboard. On UNIX or Linux this uses the underlying call dup2(). Read your shell documentation for more information about redirection (or maybe man dup2 if you are brave).
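The two styles can be compared side by side in a small sketch. The helper names count_by_iteration and count_by_readlines are hypothetical, and io.StringIO stands in for sys.stdin so the example runs without a keyboard or redirection.

```python
import io

def count_by_iteration(stream):
    """Consume the stream one line at a time."""
    total = 0
    for _line in stream:
        total += 1
    return total

def count_by_readlines(stream):
    """Read every remaining line into a list first, then count them."""
    return len(stream.readlines())

# Any file object works here, including sys.stdin.
n_iter = count_by_iteration(io.StringIO('a\nb\nc\n'))
n_list = count_by_readlines(io.StringIO('a\nb\nc\n'))
print(n_iter, n_list)  # 3 3
```

Passing sys.stdin (after import sys) instead of the StringIO objects gives the same result for redirected input; the practical difference is that readlines() holds all lines in memory at once, while iteration handles them one by one.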
It reads from standard input, which by default is a stream of data typed at the keyboard.
It is not required to provide a file; however, you can use redirection to use a file as standard input.
In Python, the readlines() method reads the entire stream, and then splits it up at the newline character and creates a list of each line.
lines = sys.stdin.readlines()
The above creates a list called lines, where each element will be a line (as determined by the end of line character).
You can read more about this at the input and output section of the Python tutorial.
If you want to prompt the user for input, use the input() function (in Python 2, use raw_input()):
user_input = input('Please enter something: ')
print('You entered: {}'.format(user_input))
To get a grasp of how sys.stdin works, do the following.
Create a simple Python script; let's name it "readStdin.py":
import sys
lines = sys.stdin.readlines()
print(lines)
Now open a console and type in:
echo "line1 line2 line3" | python readStdin.py
The script outputs (in the Windows console, where echo keeps the quotes and the space before the pipe):
['"line1 line2 line3" \n']
So the script has read the input into a list (named 'lines'), including the newline character produced by 'echo'. That's all.
sys.stdin.read() reads everything from standard input until end-of-file and returns it as a single string. In an interactive terminal, you signal end-of-file by pressing Enter followed by Ctrl + D (on Windows, Ctrl + Z followed by Enter).
Ctrl + D works as the stop signal.
Example:
import sys
data = sys.stdin.read()  # avoid the name "input", which would shadow the built-in
print(data)
tokens = data.split()
a = int(tokens[0])
b = int(tokens[1])
print(a + b)
After running the program, enter two numbers separated by a space, then press Ctrl + D once or twice, and you will be presented with the sum of the two inputs.
for something in sys.stdin:
    some stuff here
In an interactive session, the code above may not behave as you expect, because sys.stdin is a file object connected to standard input. The loop blocks until input arrives, runs some stuff here once per line, and only ends at end-of-file.
lines = sys.stdin.readlines()
When the line above is run in an interactive shell, it blocks execution until the user presses Ctrl-D, which indicates the end of the input.
Reading from sys.stdin line by line is widely used in online judge systems.
For example, suppose the input contains only one line, the number 2.
import sys
if __name__ == "__main__":
    n = int(sys.stdin.readline().strip())
readline() reads a single line, here the number 2 (the only line in this case). strip() removes surrounding whitespace (or other specified characters), such as the trailing newline. This results in n = 2 (an integer).
If we have a file with two lines like:
1
2
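Then sys.stdin.readline().strip() still returns only the first line, '1'. To get both numbers, call readline() a second time, or read everything with sys.stdin.read().split(), which gives a list (say n) with the two elements '1' and '2'. You cannot apply int to the list directly, but you can use int(n[0]) and int(n[1]).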
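Here is a sketch of both ways to handle the two-line input. The function names read_two_numbers and read_all_numbers are hypothetical, and io.StringIO again stands in for sys.stdin so that the example is self-contained.

```python
import io

def read_two_numbers(stream):
    """Call readline() once per line; each call returns one line."""
    first = int(stream.readline().strip())
    second = int(stream.readline().strip())
    return first, second

def read_all_numbers(stream):
    """Read to end-of-file and split on any whitespace, newlines included."""
    return [int(token) for token in stream.read().split()]

pair = read_two_numbers(io.StringIO('1\n2\n'))
numbers = read_all_numbers(io.StringIO('1\n2\n'))
print(pair)     # (1, 2)
print(numbers)  # [1, 2]
```

In an online judge setting you would pass sys.stdin instead of the StringIO objects.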

Python Saving long string in text file

I have a long string that I want to save in a text file with the code:
textfile.write(a)
but because the string is so long, the saved file appears as:
"something something ..... something something"
How do I make sure it saves the entire string without truncating it?
It should work regardless of the string length.
This is the code I wrote to demonstrate it:
import random
a = ''
number_of_characters = 1000000
for i in range(number_of_characters):
    a += chr(random.randint(97, 122))
print(len(a)) # a is now 1000000 characters long string
textfile = open('textfile.txt', 'w')
textfile.write(a)
textfile.close()
You can set number_of_characters to whatever number you like, but then you must wait for the string to be randomized.
And this is a screenshot of textfile.txt: http://prntscr.com/bkyvs9
Your problem is probably in the string a.
I think this is just a representation in your IDE or terminal environment. Try something like the following, then open the file and see for yourself whether it is written in its entirety:
x = 'abcd'*10000
with open('test.txt', 'w+') as fh:
    fh.write(x)
Note that the above will write a file to whatever your current working directory is. You may first want to navigate to your ~/Desktop before calling Python.
Also, how are you building the string a? How is textfile being written? If the call to textfile.write(a) is occurring within a loop, there may be a bug in the loop. (Showing more of your code would help)
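One way to rule out truncation without relying on an editor or viewer is to read the file back and compare lengths. This is a minimal sketch; the repeated test string and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

a = 'abcdefghij' * 100000  # 1,000,000 characters

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'textfile.txt')

    with open(path, 'w') as textfile:
        written = textfile.write(a)   # number of characters written

    with open(path) as textfile:
        read_back = textfile.read()

print(written, len(read_back), read_back == a)  # 1000000 1000000 True
```

If the lengths match, the full string is on disk and any "..." you see comes from the program used to display the file.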

How to make the file.write() method in Python explicitly write the newline characters?

I am trying to write text to an output file that explicitly shows all of the newline characters (\n, \r, \r\n). I am using Python 3 and Windows 7. My thought was to do this by converting the strings that I am writing into bytes.
My code looks like this:
file_object = open(r'C:\Users\me\output.txt', 'wb')
for line in lines:
    line = bytes(line, 'UTF-8')
    print('Line: ', line)  # for debugging
    file_object.write(line)
file_object.close()
The print() statement to standard output (my Windows terminal) is as I want it to be. For example, one line looks like so, with the \n character visible.
Line: b'<p class="byline">Foo C. Bar</p>\n'
However, the write() method does not explicitly print any of the newline characters in my output.txt file. Why does write() not explicitly show the newline characters in my output text file, even though I'm writing in bytes mode, but print does explicitly show the newline characters in the windows terminal?
What Python does when writing strings or bytes to text or binary files:
Strings to a text file. Directly written.
Bytes to a text file. print() writes the repr; calling write() directly raises a TypeError.
Strings to a binary file. Throws an exception.
Bytes to a binary file. Directly written.
You say that you get what you’re looking for when you write a bytes to standard out (a text file). That, with the pseudo-table above, suggests you might look into using repr. Specifically, if you’re looking for the output b'<p class="byline">Foo C. Bar</p>\n', you’re looking for the repr of a bytes object. If line was a str to start with and you don’t actually need that b at the beginning, you might instead be looking for the repr of the string, '<p class="byline">Foo C. Bar</p>\n'. If so, you could write it like this:
with open(r'C:\Users\me\output.txt', 'w') as file_object:
    for line in lines:
        file_object.write(repr(line) + '\n')
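The effect of repr can be verified by writing a couple of lines and reading the file back. In this sketch the sample lines and the temporary output path are assumptions; the original C:\Users\me\output.txt path is not used.

```python
import os
import tempfile

# Sample input lines with different line endings (illustrative only).
lines = ['<p class="byline">Foo C. Bar</p>\n', 'second line\r\n']

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'output.txt')

    with open(path, 'w') as file_object:
        for line in lines:
            # repr() turns the newline into the two visible characters \ and n.
            file_object.write(repr(line) + '\n')

    with open(path) as file_object:
        shown = file_object.read().splitlines()

for entry in shown:
    print(entry)
```

Each line of the output file then contains the quoted string with its escape sequences spelled out, followed by one real newline that separates the entries.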

readline() Produces Unexpected String

Getting some practice playing with dictionaries and file i/o today when a file gave me an unexpected output that I'm curious about. I wrote the following simple function that just takes the first line of a text file, breaks it into individual words, and puts each word into a dictionary:
def create_dict(file):
    dict = {}
    for i, item in enumerate(file.readline().split(' ')):
        dict[i] = item
    file.seek(0)
    return dict
print "Enter a file name:"
f = open(raw_input('-> '))
dict1 = create_dict(f)
print dict1
Simple enough, in every case it produces exactly the expected output. Every case except for one. I have one text file that was created by piping the output of another python script to a text file via the following shell command:
C:\> python script.py > textFile.txt
When I use textFile.txt with my dictionary script, I get an output that looks like:
{0: '\xff\xfeN\x00Y\x00', 1: '\x00S\x00t\x00a\x00t\x00e\x00', 2: '\x00h\x00a\x00s\x00:\x00', 3: '\x00', 4: '\x00N\x00e\x00w\x00', 5: '\x00Y\x00o\x00r\x00k\x00\r\x00\n'}
What is this output called? Why does piping the output of the script to a text file via the command line produce a different type of string than any other text file? Why are there no visible differences when I open this file in my text editor? I searched and searched but I don't even know what that would be called as I'm still pretty new.
Your file is UTF-16 encoded. The first 2 bytes, \xff\xfe, are a byte order mark (BOM). You will also notice that each character appears to take 2 bytes, one of which is \x00.
You can use the codecs module to decode for you:
import codecs
f = codecs.open(raw_input('-> '), 'r', encoding='utf-16')
Or, if you are using Python 3, you can supply the encoding argument to open().
I guess the problem you ran into is a character encoding problem.
In Python 2, open() does not decode anything: reading the file gives you its raw bytes, which is why the UTF-16 data shows up as \x00 escapes.
To see the text normally, you need to decode those bytes using the encoding the file was actually written in.
Here that is UTF-16 rather than UTF-8, so decode the file's contents with 'utf-16' before splitting them into words.
You can also search for more information about character encodings (ASCII, UTF-8, Unicode) and how to convert between them.
>>> import codecs
>>> codecs.BOM_UTF16_LE
'\xff\xfe'
To read a UTF-16 encoded file, you could use the io module:
import io
with io.open(filename, encoding='utf-16') as file:
    words = [word for line in file for word in line.split()]
The advantage compared to codecs.open() is that it supports the universal newline mode like the builtin open(), and io.open() is the builtin open() in Python 3.
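The whole round trip can be sketched in Python 3, where the built-in open() accepts encoding directly. The sample text and the temporary file are assumptions; the sketch writes its own UTF-16 file rather than relying on shell redirection.

```python
import codecs
import os
import tempfile

text = 'NY State has: New York\n'  # illustrative sample line

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'textFile.txt')

    # The 'utf-16' codec writes a byte order mark at the start of the file.
    with open(path, 'w', encoding='utf-16') as f:
        f.write(text)

    # Binary mode shows the undecoded bytes, BOM included.
    with open(path, 'rb') as f:
        raw = f.read()

    # Text mode with the matching encoding consumes the BOM and decodes.
    with open(path, encoding='utf-16') as f:
        words = [word for line in f for word in line.split()]

print(raw[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))  # True
print(words)  # ['NY', 'State', 'has:', 'New', 'York']
```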
