I need \r\n instead of \n in text - python

When I open a file in Python and read it, I get text in this format: Data\r\n\r\n\t
When I paste the same text, I get it without the formatting characters, and I encode it like this:
encoded = clipboard.encode('utf8')
The result looks like this: Data\n\n\t
I need the first result, with \r\n instead of just \n.
Is there a different .encode to use?
Or is there another simple way to end up with the \r\n characters?
Thanks

The Linux command unix2dos can probably help you out here.
The reverse command is dos2unix.
Both commands convert files between LF and CRLF line endings.

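encoded = clipboard.replace("\n", "\r\n").encode('utf8')
I suspect the clipboard is removing the "\r\n" and changing it to the more standard "\n". This simply replaces every "\n" with "\r\n" before encoding; doing the replace on the text first avoids mixing str and bytes arguments in Python 3.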
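A plain replace has one caveat: text that already contains "\r\n" would end up with "\r\r\n". Below is a minimal sketch that normalizes every line ending to CRLF in one pass, assuming the pasted text is a str. The helper name to_crlf and the sample value are made up for illustration.

```python
import re

def to_crlf(text):
    """Convert LF, CR, and CRLF line endings to CRLF (hypothetical helper)."""
    # CRLF is listed first so an existing "\r\n" is matched as one unit
    # and is not converted twice.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)

clipboard = "Data\n\n\t"   # stand-in for the pasted text
encoded = to_crlf(clipboard).encode("utf8")
print(repr(encoded))
```

Running the helper on text that is already CRLF-terminated leaves it unchanged, so it is safe to apply more than once.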

Related

How to get raw representation of existing string or escaping backslash

Problem
I'm running a Dataflow job with these steps: reading a txt file from Cloud Storage using Dataflow/Beam (apache_beam.io.textio.ReadFromText(), which uses StrUtf8Coder (UTF-8) by default) and then loading it into Postgres using StringIteratorIO with copy_from.
The data comes from the PCollection element by element, and some elements look like this:
line = "some information|more information S\\\\H, F\226|DIST|local|app\\\\\lock\|"
After that, I need to load it into Postgres (the delimiter here is "|"), but elements like this are a problem because Postgres tries to encode them and I get 'invalid byte sequence for encoding "UTF8"':
From F\226 we get F\x96.
This backslash is not visible in the string, so I cannot just replace it like this:
line.replace("\\", "\\\\")
I'm using Python 3.8.
I have tried repr() and encode("unicode_escape").decode().
Also, every line has different elements, so the next one might contain r\456, for example.
I'm able to catch and change it with a regex only if I use a raw string, but I'm not sure how to represent a regular string as raw when we already have it in a variable.
import re
line = r"some information|more information S\\\\H, F\226|DIST|local|app\\\\\lock\|"
updated = re.sub("([a-zA-Z])\\\\(\\d*)", "\\1\\\\\\\\\\2", line)
print(updated)
$ some information|more information S\\\\\H, F\\226|DIST|local|app\\\\\\lock\\|
Goal
Add an extra backslash wherever a backslash is followed by some element, so the line needs to look like this:
line = "some information|more information S\\\\\H, F\\226|DIST|local|app\\\\\\lock\\|"
Thanks for any help!
If you're able to read the file in binary or select the encoding, you could get a better starting point. This is how to do it in binary:
>>> line = b"some information|more information S\\\\H, F\226|DIST|local|app\\\\\lock\|"
>>> line.decode('cp1252')
'some information|more information S\\\\H, F–|DIST|local|app\\\\\\lock\\|'
This is how to decode the whole file:
with open('file.txt', encoding='cp1252') as f:
    text = f.read()
CP-1252 is the legacy Microsoft Windows encoding for Western European text, a close relative of Latin-1. In it, the byte 0x96 (octal \226) is an en dash.
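To make the difference concrete, here is a small sketch showing why the same bytes fail as UTF-8 but decode as CP-1252, followed by doubling the backslashes as the question asks. The sample bytes are made up, and the doubling assumes the text is going into COPY's default text format, where the backslash is the escape character.

```python
# Bytes as they might appear in the file: 0x96 on its own is not valid UTF-8.
raw = b"more information S\\\\H, F\x96|DIST|local"

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False   # mirrors the "invalid byte sequence" error

# Decoding as CP-1252 succeeds and maps 0x96 to an en dash (U+2013).
text = raw.decode("cp1252")

# Double every backslash so it is read back as a literal backslash
# (assumption: COPY text format with the default escape rules).
escaped = text.replace("\\", "\\\\")

print(utf8_ok)
print(text)
print(escaped)
```

Once the text is a properly decoded str, the plain replace works because the backslashes are real characters rather than part of an escape sequence in a source literal.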

Delete a complete line in python 3 output

How can I delete a complete line in the output screen of python?
Can I use the escape sequence '\b' for this?
What you're asking is somewhat terminal-specific. However, the following solution should work on both Linux and Windows:
Write \r to return to the beginning of the current line.
Write as many spaces as needed to "cover" any previous content on the line.
Write \r to return to the beginning of the current line again.
Write the new text for this line.
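The four steps above can be sketched as follows. The helper name overwrite_line is hypothetical, and the sketch assumes the previous content fits on a single terminal line.

```python
import sys

def overwrite_line(new_text, previous_len, stream=sys.stdout):
    """Replace the current line with new_text (hypothetical helper)."""
    stream.write("\r")                 # return to the start of the line
    stream.write(" " * previous_len)   # cover the old content with spaces
    stream.write("\r")                 # return to the start again
    stream.write(new_text)             # write the new text
    stream.flush()
    return len(new_text)               # length to blank out next time

shown = 0
for message in ("Downloading...", "Done"):
    shown = overwrite_line(message, shown)
print()   # finish the line
```

Tracking the length of what was last written means only as many spaces as needed are printed, which avoids wrapping onto the next line in a narrow terminal.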

How to save a dataframe as a csv file with '/' in the file name

I want to save a dataframe to a .csv file with the name '123/123', but the name gets split into two parts if I just type df.to_csv('123/123.csv').
Does anyone know how to keep the slash in the name of the file?
You can't use a slash in the name of the file.
But you can use a Unicode character that looks like a slash, if your file system supports it: http://www.fileformat.info/info/unicode/char/2215/index.htm
"/" indicates a path to a child folder, so your file name says that the file 123.csv is in a folder named "123".
However, that does not make it entirely impossible, just very difficult. See this question:
https://superuser.com/questions/187469/how-would-i-go-about-creating-a-filename-with-invalid-characters-such-as
Alternatively, a character map can help you find a legal character that looks like a slash; in this case, the division slash.
You cannot use any of these characters in a file name:
/:*?\"|
You can use a similar-looking Unicode character instead, such as the fraction slash (U+2044).
Example:
df.to_csv(u"123\u2044123.csv")
It gives you the file name that you asked for.
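As a rough sketch of the look-alike approach, the snippet below swaps "/" for the division slash (U+2215) and creates a file with the resulting name. It writes a plain text file into a temporary directory rather than calling pandas; the helper name slash_safe_name is hypothetical, and the sketch assumes the file system accepts Unicode names.

```python
import os
import tempfile

DIVISION_SLASH = "\u2215"   # looks like "/", but is not a path separator

def slash_safe_name(name):
    """Swap "/" for a look-alike character (hypothetical helper)."""
    return name.replace("/", DIVISION_SLASH)

filename = slash_safe_name("123/123") + ".csv"

# df.to_csv(path) would receive the same path; a plain file is used here
# so the example does not depend on pandas.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, filename)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("a,b\n1,2\n")
    created = os.listdir(folder)

print(created)
```

Keep in mind that the name only looks like '123/123'. Anything that later searches for the file by typing an ordinary slash will not find it.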

^M characters in exported file, converting to newlines

I exported a CSV from Excel to parse using Python. When I opened the CSV in vim, I noticed that it was all one line, with ^M characters where newlines should be.
Name, Value, Value2, OtherStuff ^M Name, Value, Value2, OtherStuff ^M
I have the file parsed such that I modify the values and put them into a string (using 'rU' mode in csvreader). However, the string has no newlines. So I am wondering: is there a way to split the string on this ^M character, or a way to replace it with a \n?
^M is how vim displays Windows end-of-lines.
The dos2unix command should fix those up for you:
dos2unix my_file.csv
It's due to the different EOL formats on Windows and Unix.
On Windows, it's \r\n.
On Unix/Linux/Mac, it's just \n.
The ^M is actually vim showing you the Windows CR (carriage return), or \r.
The Python open() documentation has more information on handling universal newlines: http://docs.python.org/2/library/functions.html#open
If you are on a Unix system, there is a program called dos2unix (and its counterpart unix2dos) that will do exactly that conversion.
But unix2dos is pretty much the same as something like this (for the dos2unix direction, use 's/\r$//' instead):
sed -i -e 's/$/\r/' file
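If the text is already in a Python string, the same cleanup can be done without external tools. This is a minimal sketch; the helper name normalize_newlines and the sample data are made up for illustration.

```python
def normalize_newlines(text):
    """Turn CRLF and lone CR line endings into LF (hypothetical helper)."""
    # Replace CRLF first so it does not turn into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")

exported = "Name, Value\rName2, Value2\r\nName3, Value3\r"
cleaned = normalize_newlines(exported)
rows = cleaned.splitlines()
print(rows)
```

Note that str.splitlines() also splits on a bare \r by itself, so when only the list of rows is needed the replace step can be skipped.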

How to write \t to file using Python

I built a website where the user can enter a directory path in a form.
My program extracts the path and writes it to a file.
My question is: when the path includes special sequences such as \t or \n, the program can't write the file correctly.
For example:
C:\abc\test will become C:\abc[TAB]est
How can I convert the string to something like a raw string so that the file is written correctly?
Thank you for your reply.
Use r'C:\abc\test' or 'C:\\abc\\test' when you enter your strings
... user can enter directory path in form.
...
My question is when the path include some special word such as \t \n, the program can't write file correctly.
Uh, no. Python is perfectly capable of distinguishing between '\t' and '\\t' unless you are doing something to confuse it.
>>> raw_input()
<Ctrl-V><Tab>
'\t'
>>> raw_input()
\t
'\\t'
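The same point in Python 3, as a small sketch: a path that arrives as data already contains a real backslash followed by "t", so writing it to a file and reading it back preserves it. The raw-string literal below is only needed because the sample value is typed into source code, and the file name paths.txt is made up.

```python
import os
import tempfile

# Stand-in for the form input: a backslash followed by "t", not a tab.
user_path = r"C:\abc\test"

with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "paths.txt")
    with open(target, "w", encoding="utf-8") as handle:
        handle.write(user_path + "\n")
    with open(target, encoding="utf-8") as handle:
        saved = handle.read().rstrip("\n")

print(saved)
print("\t" in saved)
```

If a tab does show up in the written file, the usual cause is a non-raw literal in the source code or a step that decodes escape sequences, not the file write itself.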
