I have the following function to create cipher text and then save it:
def create_credential(self):
    des = DES.new(CIPHER_N, DES.MODE_ECB)
    text = str(uuid.uuid4()).replace('-', '')[:16]
    cipher_text = des.encrypt(text)
    return cipher_text

def decrypt_credential(self, text):
    des = DES.new(CIPHER_N, DES.MODE_ECB)
    return des.decrypt(text)

def update_access_credentials(self):
    self.access_key = self.create_credential()
    print repr(self.access_key)  # "\xf9\xad\xfbO\xc1lJ'\xb3\xda\x7f\x84\x10\xbbv&"
    self.access_password = self.create_credential()
    self.save()
And I will call:
>>> from main.models import *
>>> u=User.objects.all()[0]
>>> u.update_access_credentials()
And this is the stacktrace I get:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf5 in position 738: invalid start byte
Why is this occurring and how would I get around it?
You are storing a bytestring in a Unicode database field, so the database layer will try to decode it to Unicode.
Either use a database field that can store opaque binary data, decode explicitly to Unicode (latin-1 maps bytes one-to-one to Unicode codepoints), or wrap your data in a representation that can be stored as text.
For Django 1.6 and up, use a BinaryField, for example. For earlier versions, using a binary-to-text conversion (such as Base64) would be preferable over decoding to Latin-1; the result of the latter would not give you meaningful textual data but Django may try to display it as such (in the admin interface for example).
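Both of those round trips can be sketched in isolation (a minimal sketch; the byte string is the example value printed by update_access_credentials above):

```python
import base64

# Opaque cipher bytes, like those produced by des.encrypt() above
cipher_bytes = b"\xf9\xad\xfbO\xc1lJ'\xb3\xda\x7f\x84\x10\xbbv&"

# Option 1: latin-1 maps each byte to the codepoint with the same number,
# so the decode/encode round trip is lossless (though not readable text).
as_latin1 = cipher_bytes.decode('latin-1')
assert as_latin1.encode('latin-1') == cipher_bytes

# Option 2: Base64 turns the bytes into plain ASCII text, safe for any text field.
as_base64 = base64.b64encode(cipher_bytes).decode('ascii')
assert base64.b64decode(as_base64) == cipher_bytes
```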
It's occurring because you're attempting to save non-text data in a text field. Either use a non-text field instead, or encode the data as text via e.g. Base-64 encoding.
Using base64 encoding and decoding here fixed this:
import base64

def create_credential(self):
    des = DES.new(CIPHER_N, DES.MODE_ECB)
    text = str(uuid.uuid4()).replace('-', '')[:16]
    cipher_text = des.encrypt(text)
    base64_encrypted_message = base64.b64encode(cipher_text)
    return base64_encrypted_message

def decrypt_credential(self, text):
    text = base64.b64decode(text)
    des = DES.new(CIPHER_N, DES.MODE_ECB)
    message = des.decrypt(text)
    return message
Related
I am developing a new payment_acquirer module for Odoo, and since last week I have been getting an error whenever I try to decrypt data that I received through the server.
When I copy the data into another Python file to test, it works perfectly with the same data, but when I do it in my controller, it raises an error.
This is the code inside my controller:
@http.route('/payment/ariarynet/result', type='http', auth="none", methods=['POST', 'GET'], csrf=False)
def ariarynet_result(self, **post):
    """ Handle Ariary.net response and redirect to form_validate """
    _logger.info('Beginning Ariary.net form_feedback with post data %s', pprint.pformat(post))  # debug
    key = bytes("477c3551da64136491eff1cb6ab27be35093b2512eb78f2c8d"[:24])
    params = dict(post)
    raw = b"%s" % post.get('idpanier')
    decode = raw.encode('utf8')
    idpanier = main.Utils().decrypt(key, decode)  # this raises the error
When executed, I have the following error:
raise ValueError("Invalid data length, data must be a multiple of " + str(self.block_size) + " bytes\n.")
ValueError: Invalid data length, data must be a multiple of 8 bytes
I am using pyDes module to crypt and decrypt data.
This is the test that works:
def test_bytes(self):
    key = bytes("477c3551da64136491eff1cb6ab27be35093b2512eb78f2c8d"[:24])
    expect = "12177"
    raw = "%8E%16%B8n%A6%1F%2Fj"  # this is the data that I copied from the URL
    text = urllib.unquote(raw)
    byteArray = bytes(text)
    print Utils().decrypt(key, text)
    self.assertEqual(expect, Utils().decrypt(key, text), "%s is different from %s" % (expect, Utils().decrypt(key, text)))
I really need your help to figure out what I am doing wrong.
Update:
I think the problem has to do with character encoding, because when I try to compare the data I get with the expected data, I don't get the same thing:
param = post.get('idpanier')
text = (param.encode('utf8'))
print "utf8 encode %s, hex encoded text %s" % (text, text.encode('hex'))
print "utf8 encode %s, hex encoded text %s" % ("b4227475d651420b".decode('hex'), "b4227475d651420b")  # expected behavior
Here is the output:
utf8 encode �"tu�QB
, hex encoded text efbfbd227475efbfbd51420b
utf8 encode �"tu�QB
, hex encoded text b4227475d651420b
The solution I found: instead of retrieving parameters with post.get(), I managed to get the raw parameter data through the incoming URL directly, where the parameter encoding has not yet been changed.
query = parse_qs("http://url?%s" % request.httprequest.query_string)  # append the query string to a dummy URL to get a well-formed URL
param = query.get('idpanier')
After that, everything worked fine.
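The mangled hex output in the update above is exactly what happens when raw bytes are forced through a UTF-8 decode: every invalid byte becomes U+FFFD, which re-encodes as the three bytes ef bf bd. A quick sketch of the effect (Python 3 used for illustration):

```python
raw = bytes.fromhex('b4227475d651420b')  # the expected ciphertext bytes

# Forcing raw bytes through a UTF-8 decode replaces every invalid sequence
# (0xb4, and 0xd6 followed by a non-continuation byte) with U+FFFD...
mangled = raw.decode('utf-8', errors='replace').encode('utf-8')

# ...which re-encodes as ef bf bd, reproducing the bad hex output above.
assert mangled.hex() == 'efbfbd227475efbfbd51420b'
```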
I'm reading a config file in Python, getting its sections, and creating a new config file for each section.
However, I'm getting a decode error because one of the strings contains Español=spain:
self.output_file.write( what.replace( " = ", "=", 1 ) )
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
How would I adjust my code to allow for encoded characters such as these? I'm very new to this, so please excuse me if this is something simple.
class EqualsSpaceRemover:
    output_file = None

    def __init__(self, new_output_file):
        self.output_file = new_output_file

    def write(self, what):
        self.output_file.write(what.replace(" = ", "=", 1))

def get_sections():
    configFilePath = 'C:\\test.ini'
    config = ConfigParser.ConfigParser()
    config.optionxform = str
    config.read(configFilePath)
    for section in config.sections():
        configdata = {k: v for k, v in config.items(section)}
        confignew = ConfigParser.ConfigParser()
        cfgfile = open("C:\\" + section + ".ini", 'w')
        confignew.add_section(section)
        for x in configdata.items():
            confignew.set(section, x[0], x[1])
        confignew.write(EqualsSpaceRemover(cfgfile))
        cfgfile.close()
If you use Python 2 with from __future__ import unicode_literals, then every string literal you write is a Unicode literal, as if you had prefixed every literal with u"...", unless you explicitly write b"...".
This explains why you get a UnicodeDecodeError on this line:
what.replace(" = ", "=", 1)
because what you actually do is
what.replace(u" = ",u"=",1 )
ConfigParser uses plain old str for its items when it reads a file using the parser.read() method, which means what will be a str. If you use unicode as arguments to str.replace(), then the string is converted (decoded) to unicode, the replacement applied and the result returned as unicode. But if what contains characters that can't be decoded to unicode using the default encoding, then you get an UnicodeDecodeError where you wouldn't expect one.
So to make this work you can
use explicit prefixes for byte strings: what.replace(b" = ", b"=", 1)
or remove the unicode_literals future import.
Generally you shouldn't mix unicode and str (Python 3 fixes this by making it an error in almost every case). You should be aware that from __future__ import unicode_literals changes every unprefixed literal to Unicode and doesn't automatically make your code work with Unicode in all cases; quite the opposite, in many cases.
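Python 3 makes the same mistake impossible to hit silently: mixing str and bytes in replace() raises a TypeError outright instead of implicitly decoding with a default codec. A sketch of the equivalent situation there:

```python
# Bytes as read from a config file, containing a non-ASCII character
data = "Espa\u00f1ol = spain".encode("utf-8")

try:
    data.replace(" = ", "=", 1)  # str pattern against bytes
except TypeError:
    pass  # Python 3 refuses to mix the types instead of guessing an encoding

fixed = data.replace(b" = ", b"=", 1)  # a matching bytes pattern works
assert fixed.decode("utf-8") == "Espa\u00f1ol=spain"
```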
I wish to use the cp437 character map from the utf-8 encoding.
I have all the code points for each of the cp437 characters.
The following code correctly displays a single cp437 character:
import locale
locale.setlocale(locale.LC_ALL, '')
icon = u'\u263A'.encode('utf-8')
print icon
Whereas the following code shows most of the cp437 characters, but not all:
for i in range(0x00, 0x100):
    print chr(i).decode('cp437')
My guess is that the 2nd approach is not referencing the utf-8 encoding, but a separate incomplete cp437 character set.
I would like a way to summon a cp437 character from the utf-8 without having to specify each of the 256 individual code points. I have resorted to manually typing the unicode code point strings in a massive 16x16 table. Is there a better way?
The following code demonstrates this:
from curses import *
import locale

locale.setlocale(locale.LC_ALL, '')

def main(stdscr):
    maxyx = stdscr.getmaxyx()
    text = str(maxyx)
    y_mid = maxyx[0] // 2
    x_mid = maxyx[1] // 2
    next_y, next_x = y_mid, x_mid
    curs_set(1)
    noecho()
    event = 1
    y = 0; x = 0
    icon1 = u'\u2302'.encode('utf-8')
    icon2 = chr(0x7F).decode('cp437')
    while event != ord('q'):
        stdscr.addstr(y_mid, x_mid - 10, icon1)
        stdscr.addstr(y_mid, x_mid + 10, icon2)
        event = stdscr.getch()

wrapper(main)
The icon on the left is from UTF-8 and does print to the screen.
The icon on the right is from decode('cp437') and does not print to the screen correctly [it appears as ^?].
As mentioned by @Martijn in the comments, the stock cp437 decoder converts characters 0-127 straight into their ASCII equivalents. For some applications this would be the right thing, as you wouldn't for example want '\n' to translate to u'\u25d9'. But for full fidelity to the code page, you need a custom decoder and encoder.
The codec module makes it easy to add your own codecs, but examples are hard to find. I used the sample at http://pymotw.com/2/codecs/ along with the Wikipedia table for Code page 437 to generate this module - it automatically registers a codec with the name 'cp437ex' when you import it.
import codecs

codec_name = 'cp437ex'

_table = u'\0\u263a\u263b\u2665\u2666\u2663\u2660\u2022\u25d8\u25cb\u25d9\u2642\u2640\u266a\u266b\u263c\u25ba\u25c4\u2195\u203c\xb6\xa7\u25ac\u21a8\u2191\u2193\u2192\u2190\u221f\u2194\u25b2\u25bc !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\u2302\xc7\xfc\xe9\xe2\xe4\xe0\xe5\xe7\xea\xeb\xe8\xef\xee\xec\xc4\xc5\xc9\xe6\xc6\xf4\xf6\xf2\xfb\xf9\xff\xd6\xdc\xa2\xa3\xa5\u20a7\u0192\xe1\xed\xf3\xfa\xf1\xd1\xaa\xba\xbf\u2310\xac\xbd\xbc\xa1\xab\xbb\u2591\u2592\u2593\u2502\u2524\u2561\u2562\u2556\u2555\u2563\u2551\u2557\u255d\u255c\u255b\u2510\u2514\u2534\u252c\u251c\u2500\u253c\u255e\u255f\u255a\u2554\u2569\u2566\u2560\u2550\u256c\u2567\u2568\u2564\u2565\u2559\u2558\u2552\u2553\u256b\u256a\u2518\u250c\u2588\u2584\u258c\u2590\u2580\u03b1\xdf\u0393\u03c0\u03a3\u03c3\xb5\u03c4\u03a6\u0398\u03a9\u03b4\u221e\u03c6\u03b5\u2229\u2261\xb1\u2265\u2264\u2320\u2321\xf7\u2248\xb0\u2219\xb7\u221a\u207f\xb2\u25a0\xa0'
decoding_map = {i: ord(ch) for i, ch in enumerate(_table)}
encoding_map = codecs.make_encoding_map(decoding_map)

class Codec(codecs.Codec):
    def encode(self, input, errors='strict'):
        return codecs.charmap_encode(input, errors, encoding_map)

    def decode(self, input, errors='strict'):
        return codecs.charmap_decode(input, errors, decoding_map)

class IncrementalEncoder(codecs.IncrementalEncoder):
    def encode(self, input, final=False):
        return codecs.charmap_encode(input, self.errors, encoding_map)[0]

class IncrementalDecoder(codecs.IncrementalDecoder):
    def decode(self, input, final=False):
        return codecs.charmap_decode(input, self.errors, decoding_map)[0]

class StreamReader(Codec, codecs.StreamReader):
    pass

class StreamWriter(Codec, codecs.StreamWriter):
    pass

def _register(encoding):
    if encoding == codec_name:
        return codecs.CodecInfo(
            name=codec_name,
            encode=Codec().encode,
            decode=Codec().decode,
            incrementalencoder=IncrementalEncoder,
            incrementaldecoder=IncrementalDecoder,
            streamreader=StreamReader,
            streamwriter=StreamWriter)

codecs.register(_register)
Also note that decode produces Unicode strings, while encode produces byte strings. Printing a Unicode string should always work, but your question indicates you may have an incorrect default encoding. One of these should work:
icon2='\x7f'.decode('cp437ex')
icon2='\x7f'.decode('cp437ex').encode('utf-8')
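If registering a full codec feels like overkill, a lighter-weight alternative is to keep only the control-range overrides in a dict and fall back to the stock cp437 codec for everything else. A minimal sketch (the table below covers just a few codepoints; extend it from the same Wikipedia table):

```python
# Graphic characters that the stock cp437 codec maps to ASCII control codes.
# Only a few entries shown here; the full table covers 0x00-0x1f plus 0x7f.
CP437_GRAPHICS = {
    0x01: u'\u263a',  # white smiling face
    0x02: u'\u263b',  # black smiling face
    0x03: u'\u2665',  # black heart suit
    0x7f: u'\u2302',  # house
}

def decode_cp437_full(data):
    # bytearray iterates as ints on both Python 2 and 3
    return u''.join(
        CP437_GRAPHICS.get(b, bytes([b]).decode('cp437'))
        for b in bytearray(data))

assert decode_cp437_full(b'\x01\x7fA\xe0') == u'\u263a\u2302A\u03b1'
```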
Hi, can you help me decode this error message and figure out what to do:
main.py", line 1278, in post
message.body = "%s %s/%s/%s" % (msg, host, ad.key().id(), slugify(ad.title.encode('utf-8')))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
Thanks
UPDATE: having tried removing the encode call, it appears to work:
class Recommend(webapp.RequestHandler):
    def post(self, key):
        ad = db.get(db.Key(key))
        email = self.request.POST['tip_email']
        host = os.environ.get("HTTP_HOST", os.environ["SERVER_NAME"])
        senderemail = users.get_current_user().email() if users.get_current_user() else 'info@monton.cl' if host.endswith('.cl') else 'info@monton.com.mx' if host.endswith('.mx') else 'info@montao.com.br' if host.endswith('.br') else 'admin@koolbusiness.com'
        message = mail.EmailMessage(sender=senderemail, subject="%s recommends %s" % (self.request.POST['tip_name'], ad.title))
        message.to = email
        message.body = "%s %s/%s/%s" % (self.request.POST['tip_msg'], host, ad.key().id(), slugify(ad.title))
        message.send()
        matched_images = ad.matched_images
        count = matched_images.count()
        if ad.text:
            p = re.compile(r'(www[^ ]*|http://[^ ]*)')
            text = p.sub(r'\1', ad.text.replace('http://', ''))
        else:
            text = None
        self.response.out.write("Message sent<br>")
        path = os.path.join(os.path.dirname(__file__), 'market', 'market_ad_detail.html')
        self.response.out.write(template.render(path, {'user_url': users.create_logout_url(self.request.uri) if users.get_current_user() else users.create_login_url(self.request.uri),
                                                       'user': users.get_current_user(), 'ad.user': ad.user, 'count': count, 'ad': ad, 'matched_images': matched_images}))
The problem here is that your underlying model (message.body) only wants ASCII text, but you're trying to give it a Unicode string.
Since you mostly have normal ASCII strings here, you can just make Python substitute the '?' character wherever a string contains non-ASCII characters:
"UNICODE STRING".encode('ascii','replace').decode('ascii')
So like from your example above:
message.body = "%s %s/%s/%s" % \
    (msg.encode('ascii', 'replace').decode('ascii'),
     host.encode('ascii', 'replace').decode('ascii'),
     ad.key().id(),
     slugify(ad.title).encode('ascii', 'replace').decode('ascii'))
Or just encode/decode only the variable that contains the Unicode characters.
But this isn't an optimal solution. The best idea is to make message.body a Unicode string. Since that doesn't seem feasible (I'm not familiar with GAE), you can use this to at least avoid the errors.
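The effect of the 'replace' error handler can be seen in isolation (the title string here is a made-up example, not from the question):

```python
# 'replace' substitutes '?' for anything that can't be encoded in ASCII,
# so the result is plain ASCII that an ASCII-only field will accept.
title = u"Casa en Montevideo, se\u00f1al"  # hypothetical ad title with a non-ASCII char
safe = title.encode('ascii', 'replace').decode('ascii')
assert safe == "Casa en Montevideo, se?al"
```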
You've got a Unicode character in a place you're not supposed to. Most often I find this error comes from MS Word-style slanted quotes.
One of these fields has characters that cannot be encoded. If you switch to Python 3 (which has better Unicode support), or you change the encoding of the entire script, the problem should stop. The best way to change the encoding in 2.x is with an encoding comment line: add # -*- coding: utf-8 -*- to the top of your script. See http://evanjones.ca/python-utf8.html for a fuller explanation of using Python with UTF-8 support. Then handle strings like this:
s = "hello normal string"
u = unicode(s, "utf-8")
backToBytes = u.encode("utf-8")
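For comparison, the Python 3 shape of the same round trip, where str is already Unicode and only the encode step remains:

```python
s = "hello normal string"          # already a Unicode str in Python 3
back_to_bytes = s.encode("utf-8")  # explicit encode only where bytes are needed
assert back_to_bytes.decode("utf-8") == s
```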
I had a similar problem when using Django-nonrel and Google App Engine.
The problem was with the folder containing the application. This probably isn't the problem described in this question, but maybe it will help someone avoid wasting time like I did.
First try moving your application folder, for example to /home/, and run again; if that doesn't work, try something else.
I have a function accepting requests from the network. Most of the time, the string passed in is not unicode, but sometimes it is.
I have code to convert everything to unicode, but it reports this error:
message.create(username, unicode(body, "utf-8"), self.get_room_name(),\
TypeError: decoding Unicode is not supported
I think the reason is the 'body' parameter is already unicode, so unicode() raises an exception.
Is there any way to avoid this exception, e.g. judge the type before the conversion?
You do not decode to UTF-8; you encode to UTF-8 or decode from it.
You can safely decode from UTF-8 even if the input is plain ASCII, since ASCII is a subset of UTF-8.
The easiest way to detect whether the data needs decoding is:
if not isinstance(data, unicode):
    # It's not Unicode!
    data = data.decode('UTF8')
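For reference, the Python 3 shape of the same check: bytes is the only type that needs decoding, and str passes through untouched (the function name here is my own, not from a library):

```python
def ensure_text(data, encoding='utf-8'):
    # Decode bytes; leave str (already Unicode) alone.
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data

assert ensure_text(b'caf\xc3\xa9') == 'caf\xe9'  # bytes get decoded
assert ensure_text('caf\xe9') == 'caf\xe9'       # str passes through
```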
You can use either this:
try:
    body = unicode(body)
except UnicodeDecodeError:
    body = body.decode('utf8')
Or this:
try:
    body = unicode(body, 'utf8')
except TypeError:
    body = unicode(body)
Mark Pilgrim wrote a Python library to guess text encodings:
http://chardet.feedparser.org/
On Unicode and UTF-8, the first two sections of chapter 4 of his book 'Dive into Python 3' are pretty great:
http://diveintopython3.org/strings.html
This is what I use:
def to_unicode_or_bust(obj, encoding='utf-8'):
    if isinstance(obj, basestring):
        if not isinstance(obj, unicode):
            obj = unicode(obj, encoding)
    return obj
It's taken from this presentation: http://farmdev.com/talks/unicode/
And this is a sample code that uses it:
def hash_it_safe(s):
    try:
        s = to_unicode_or_bust(s)
        return hash_it_basic(s)
    except UnicodeDecodeError:
        return hash_it_basic(s)
    except UnicodeEncodeError:
        assert type(s) is unicode
        return hash_it_basic(s.encode('utf-8'))
Anyone have some thoughts on how to improve this code? ;)