Is there a way to print all characters in python, even ones which usually aren't printed?
For example
>>>print_all("skip
line")
skip\nline
Looks like you want repr()
>>> """skip
... line"""
'skip\nline'
>>>
>>> print(repr("""skip
... line"""))
'skip\nline'
>>> print(repr("skip line"))
'skip\tline
So, your function could be
print_all = lambda s: print(repr(s))
And for Python 2, you need from __future__ import print_function
Even easier, cast it to a raw string by using "%r", raw strings treat backslashes as literal characters:
print("%r" % """skip
line""")
skip\nline
Additionally, use !r in a format call:
print("{0!r}".format("""skip
line"""))
for similar results.
Related
I have a function in python which return a huge text table in a neat format.
My output has multiple \n an \t.
I can print out the output and it would have preserved the table format. However, in a python interactive window,I want to call the function and not store it the output but display it on console screen.
What I see is \\n instead of \n.
I understand that \ is an escape character.But what do I do make my python interactively handle the formatting.
eg. for descriptive purpose only
def print_table():
return table;
>>> print_table() #is there anything I can do here to have neat display
>>> r0c0 r0c1 \nr1c0 r1c1
>>> print (print_table())
>>> r0c0 r0c1
r1c0 r1c1
I am using Python3.6
Have you tried pretty printer?
from pprint import pprint
pprint(print_table)
You could store the result in a variable
>>> def print_table():
>>> return table
>>> my_table = print_table()
>>> print(my_table)
>>> r0c0 r0c1
r1c0 r1c1
I have a function like this:
persian_numbers = '۱۲۳۴۵۶۷۸۹۰'
english_numbers = '1234567890'
arabic_numbers = '١٢٣٤٥٦٧٨٩٠'
english_trans = string.maketrans(english_numbers, persian_numbers)
arabic_trans = string.maketrans(arabic_numbers, persian_numbers)
text.translate(english_trans)
text.translate(arabic_trans)
I want it to translate all Arabic and English numbers to Persian. But Python says:
english_translate = string.maketrans(english_numbers, persian_numbers)
ValueError: maketrans arguments must have same length
I tried to encode strings with Unicode utf-8 but I always got some errors! Sometimes the problem is Arabic string instead! Do you know a better solution for this job?
EDIT:
It seems the problem is Unicode characters length in ASCII. An Arabic number like '۱' is two character -- that I find out with ord(). And the length problem starts from here :-(
See unidecode library which converts all strings into UTF8. It is very useful in case of number input in different languages.
In Python 2:
>>> from unidecode import unidecode
>>> a = unidecode(u"۰۱۲۳۴۵۶۷۸۹")
>>> a
'0123456789'
>>> unidecode(a)
'0123456789'
In Python 3:
>>> from unidecode import unidecode
>>> a = unidecode("۰۱۲۳۴۵۶۷۸۹")
>>> a
'0123456789'
>>> unidecode(a)
'0123456789'
Unicode objects can interpret these digits (arabic and persian) as actual digits -
no need to translate them by using character substitution.
EDIT -
I came out with a way to make your replacement using Python2 regular expressions:
# coding: utf-8
import re
# Attention: while the characters for the strings bellow are
# dislplayed indentically, inside they are represented
# by distinct unicode codepoints
persian_numbers = u'۱۲۳۴۵۶۷۸۹۰'
arabic_numbers = u'١٢٣٤٥٦٧٨٩٠'
english_numbers = u'1234567890'
persian_regexp = u"(%s)" % u"|".join(persian_numbers)
arabic_regexp = u"(%s)" % u"|".join(arabic_numbers)
def _sub(match_object, digits):
return english_numbers[digits.find(match_object.group(0))]
def _sub_arabic(match_object):
return _sub(match_object, arabic_numbers)
def _sub_persian(match_object):
return _sub(match_object, persian_numbers)
def replace_arabic(text):
return re.sub(arabic_regexp, _sub_arabic, text)
def replace_persian(text):
return re.sub(arabic_regexp, _sub_persian, text)
Attempt that the "text" parameter must be unicode itself.
(also this code could be shortened
by using lambdas and combining some expressions in a single line, but there is no point in doing so, but for loosing readability)
It should work to you up to here, but please read on the original answer I had posted
-- original answer
So, if you instantiate your variables as unicode (prepending an u to the quote char), they are correctly understood in Python:
>>> persian_numbers = u'۱۲۳۴۵۶۷۸۹۰'
>>> english_numbers = u'1234567890'
>>> arabic_numbers = u'١٢٣٤٥٦٧٨٩٠'
>>>
>>> print int(persian_numbers)
1234567890
>>> print int(english_numbers)
1234567890
>>> print int(arabic_numbers)
1234567890
>>> persian_numbers.isdigit()
True
>>>
By the way, the "maketrans" method does not exist for unicode objects (in Python2 - see the comments).
It is very important to understand the basics about unicode - for everyone, even people writing English only programs who think they will never deal with any char out of the 26 latin letters. When writing code that will deal with different chars it is vital - the program can't possibly work without you knowing what you are doing except by chance.
A very good article to read is http://www.joelonsoftware.com/articles/Unicode.html - please read it now.
You can keep in mind, while reading it, that Python allows one to translate unicode characters to a string in any "physical" encoding by using the "encode" method of unicode objects.
>>> arabic_numbers = u'١٢٣٤٥٦٧٨٩٠'
>>> len(arabic_numbers)
10
>>> enc_arabic = arabic_numbers.encode("utf-8")
>>> print enc_arabic
١٢٣٤٥٦٧٨٩٠
>>> len(enc_arabic)
20
>>> int(enc_arabic)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '\xd9\xa1\xd9\xa2\xd9\xa3\xd9\xa4\xd9\xa5\xd9\xa6\xd9\xa7\xd9\xa8\xd9\xa9\xd9\xa0'
Thus, the characters loose their sense as "single entities" and as digits when encoding - the encoded object (str type in Python 2.x) is justa strrng of bytes - which nonetheless is needed when sending these characters to any output from the program - be it console, GUI Window, database, html code, etc...
You can use persiantools package:
Examples:
>>> from persiantools import digits
>>> digits.en_to_fa("0987654321")
'۰۹۸۷۶۵۴۳۲۱'
>>> digits.ar_to_fa("٠٩٨٧٦٥٤٣٢١") # or digits.ar_to_fa(u"٠٩٨٧٦٥٤٣٢١")
'۰۹۸۷۶۵۴۳۲۱'
unidecode converts all characters from Persian to English, If you want to change only numbers follow bellow:
In python3 you can use this code to convert any Persian|Arabic number to English number while keeping other characters unchanged:
intab='۱۲۳۴۵۶۷۸۹۰١٢٣٤٥٦٧٨٩٠'
outtab='12345678901234567890'
translation_table = str.maketrans(intab, outtab)
output_text = input_text.translate(translation_table)
Use Unicode Strings:
persian_numbers = u'۱۲۳۴۵۶۷۸۹۰'
english_numbers = u'1234567890'
arabic_numbers = u'١٢٣٤٥٦٧٨٩٠'
And make sure the encoding of your Python file is correct.
With this you can easily do that:
def p2e(persiannumber):
number={
'0':'۰',
'1':'۱',
'2':'۲',
'3':'۳',
'4':'۴',
'5':'۵',
'6':'۶',
'7':'۷',
'8':'۸',
'9':'۹',
}
for i,j in number.items():
persiannumber=persiannumber.replace(j,i)
return persiannumber
here is usage:
print(p2e('۳۱۹۶'))
#returns 3196
In Python 3 easiest way is:
str(int('۱۲۳'))
#123
but if number starts with 0 it have an issue.
so we can use zip() function:
for i, j in zip('1234567890', '۱۲۳۴۵۶۷۸۹۰'):
number.replace(i, j)
def persian_number(persiannumber):
number={
'0':'۰',
'1':'۱',
'2':'۲',
'3':'۳',
'4':'۴',
'5':'۵',
'6':'۶',
'7':'۷',
'8':'۸',
'9':'۹',
}
for i,j in number.items():
persiannumber=time2str.replace(i,j)
return time2str
persiannumber must be a string
The following code
# -*- coding: utf-8 -*-
x = (u'abc/αβγ',)
print x
print x[0]
print unicode(x).encode('utf-8')
print x[0].encode('utf-8')
...produces:
(u'abc/\u03b1\u03b2\u03b3',)
abc/αβγ
(u'abc/\u03b1\u03b2\u03b3',)
abc/αβγ
Is there any way to get Python to print
('abc/αβγ',)
that does not require me to build the string representation of the tuple myself? (By this I mean stringing together the "(", "'", encoded value, "'", ",", and ")"?
BTW, I'm using Python 2.7.1.
Thanks!
You could decode the str representation of your tuple with 'raw_unicode_escape'.
In [25]: print str(x).decode('raw_unicode_escape')
(u'abc/αβγ',)
I don't think so - the tuple's __repr__() is built-in, and AFAIK will just call the __repr__ for each tuple item. In the case of unicode chars, you'll get the escape sequences.
(Unless Gandaro's solution works for you - I couldn't get it to work in a plain python shell, but that could be either my locale settings, or that it's something special in ipython.)
The following should be a good start:
>>> x = (u'abc/αβγ',)
>>> S = type('S', (unicode,), {'__repr__': lambda s: s.encode('utf-8')})
>>> tuple(map(S, x))
(abc/αβγ,)
The idea is to make a subclass of unicode which has a __repr__() more to your liking.
Still trying to figure out how best to surround the result in quotes, this works for your example:
>>> S = type('S', (unicode,), {'__repr__': lambda s: "'%s'" % s.encode('utf-8')})
>>> tuple(map(S, x))
('abc/αβγ',)
... but it will look odd if there is a single quote in the string:
>>> S("test'data")
'test'data'
I have a string that after print is like this: \x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71
But I want to change this string to "\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71" which is not printable (it is necessary to write to serial port). I know that it ist problem with '\'. how can I replace this printable backslashes to unprintable?
If you want to decode your string, use decode() with 'string_escape' as parameter which will interpret the literals in your variable as python literal string (as if it were typed as constant string in your code).
mystr.decode('string_escape')
Use decode():
>>> st = r'\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71'
>>> print st
\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71
>>> print st.decode('string-escape')
MÿýfHq
That last garbage is what my Python prints when trying to print your unprintable string.
You are confusing the printable representation of a string literal with the string itself:
>>> c = '\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71'
>>> c
'M\xff\xfd\x00\x02\x8f\x0e\x80fHq'
>>> len(c)
11
>>> len('\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71')
11
>>> len(r'\x4d\xff\xfd\x00\x02\x8f\x0e\x80\x66\x48\x71')
44
your_string.decode('string_escape')
I have a base64 encoded string
When I decode the string this way:
>>> import base64
>>> base64.b64decode("XH13fXM=")
'\\}w}s'
The output is fine.
But when i use it like this:
>>> d = base64.b64decode("XH13fXM=")
>>> print d
\}w}s
some characters are missing
Can anyone advise ?
Thank you in advanced.
It is just a matter of presentation:
>>> '\\}w}s'
'\\}w}s'
>>> print(_, len(_))
\}w}s 5
This string has 5 characters. When you use it in code you need to escape backslash, or use raw string literals:
>>> r'\}w}s'
'\\}w}s'
>>> r'\}w}s' == '\\}w}s'
True
When you print a string, the characters in the string are output. When the interactive shell shows you the value of your last statement, it prints the __repr__ of the string, not the string itself. That's why there are single-quotes around it, and your backslash has been escaped.
No characters are missing from your second example, those are the 5 characters in your string. The first example has had characters add to make the output a legal Python string literal.
If you want to use the print statement and have the output look like the first example, then use:
print repr(d)