Unicode issue in Python 2.7 - python

I am saving tweets to a JSON file.
This is my code:
def on_data(self, data):
    try:
        with codecs.open('python.json', 'a', encoding='utf-8') as f:
            f.write(data)
            print("Tweet added to JSON")
        return True
    except BaseException as e:
        print("Error on_data: %s" % str(e))
    return True
but I get escaped characters like this: \u0e40\u0e21\u0e19\u0e0a
I have tried everything I can think of to avoid these escapes (utf-8, latin2, ...), but nothing works.

If you want the non-ascii characters encoded directly in the JSON file, you need to encode JSON with the ensure_ascii=False option.
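A minimal sketch of that option, assuming data is the raw JSON string handed to on_data. The sample payload and the helper name append_tweet are made up for illustration. The string is parsed first, because the \uXXXX escapes are already part of the incoming text, and is then re-serialized with ensure_ascii=False. It is written for Python 3; the same calls exist in Python 2.7, but str/unicode handling differs there.

```python
import io
import json
import os
import tempfile


def append_tweet(data, path):
    """Parse a raw JSON string and append it to path, keeping non-ASCII text as-is."""
    tweet = json.loads(data)  # turns \uXXXX escapes into real characters
    with io.open(path, 'a', encoding='utf-8') as f:
        # ensure_ascii=False writes the characters themselves instead of escapes
        f.write(json.dumps(tweet, ensure_ascii=False))
        f.write(u'\n')


# Hypothetical sample payload, shaped like a very small tweet object.
raw = '{"text": "\\u0e40\\u0e21\\u0e19\\u0e0a"}'
path = os.path.join(tempfile.mkdtemp(), 'python.json')
append_tweet(raw, path)

with io.open(path, encoding='utf-8') as f:
    saved = f.read()
print(saved)
```

The file has to be opened with an explicit UTF-8 encoding, otherwise the platform default encoding may not be able to represent the characters.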

Related

PDF file encryption and decryption in python with fernet key

I'm trying to encrypt a PDF file and then decrypt it with a Fernet key to get its data back.
I can encrypt it successfully, but when I decrypt it I get a binary stream rather than the actual data. Please help. (Assume all the needed modules are imported and that the PDF contains the text "Hi, how are you" on 2 lines.)
Encryption:
def encrypt_file(file_path,file_name):
    try:
        fernet=Fernet(fernet_key)
        print("Created fernet object")
        file=os.path.join(file_path,file_name)
        with open(file,'rb') as f:
            data=f.read()
        try:
            data_enc=fernet.encrypt(data)
        except Exception as e:
            e_msg="".join(traceback.format_exception(*sys.exc_info()))
            print("An error occurred during data encryption, reason: "+str(e)+" Error: "+e_msg)
            return False
        with open(file,'wb') as f:
            f.write(data_enc)
        print("Encryption Successful")
    except Exception as e:
        # e_msg must be built here too, otherwise it may be undefined
        e_msg="".join(traceback.format_exception(*sys.exc_info()))
        print("An error occurred while encrypting the file, reason: "+str(e)+" Error: "+e_msg)
        return False
    return True
Decryption:
def decrypt_data(file_path,file_name):
    try:
        data=''
        fernet=Fernet(fernet_key)
        file=os.path.join(file_path,file_name)
        with open(file,'rb') as f:
            data_enc=f.read()
        try:
            data=fernet.decrypt(data_enc)
            data=data.decode()
        except Exception as e:
            e_msg="".join(traceback.format_exception(*sys.exc_info()))
            print("An error occurred during data decryption, reason: "+str(e)+" Error: "+e_msg)
    except Exception as e:
        e_msg="".join(traceback.format_exception(*sys.exc_info()))
        print("An error occurred while decrypting the file, reason: "+str(e)+" Error: "+e_msg)
        return False
    return data
OUTPUT (trimmed)
ZxM6cMB3Ou8xWZQ4FpZVUKelqo11TcJr_Js7LFo-0XpU05hsIX0pz88lqEfLmY_TSZQWHuYb1yulBT3FYBTd-QU0RqPlPsCSkH3z_LIHyIie5RO7Rztgxs2Y2zyAzkoNQ9M52hhqNgybTE8K_OzQGb9clOTKdkidCW4VTH77HGbSP1EK-x3lTTmVVf0m-
If you just want to encrypt and decrypt a pdf file, you don't need the data=data.decode(). Instead, you can write to an output pdf by appending the code below to your decrypt_data function.
with open(os.path.join(file_path, "output.pdf"), "wb") as f:
    f.write(data)
Now if you open output.pdf, it will be the decrypted pdf.
If you only want a string with the readable text in the pdf, then it may help to look into pdf reading libraries such as PyPDF2.
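As a rough end-to-end sketch of that advice, the round trip below keeps everything as bytes and never calls decode(). It assumes the cryptography package is installed. The key is generated on the spot as a stand-in for fernet_key, and the "PDF" is placeholder bytes rather than a real document.

```python
import os
import tempfile

from cryptography.fernet import Fernet

key = Fernet.generate_key()  # stand-in for fernet_key from the question
fernet = Fernet(key)

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "input.pdf")
dst = os.path.join(workdir, "output.pdf")

# Placeholder bytes, not a real PDF; contains bytes that are not valid UTF-8.
original = b"%PDF-1.4\n\xff\xfe binary payload\n%%EOF\n"
with open(src, "wb") as f:
    f.write(original)

# Encrypt in place, as encrypt_file does.
with open(src, "rb") as f:
    token = fernet.encrypt(f.read())
with open(src, "wb") as f:
    f.write(token)

# Decrypt and write the raw bytes to a new file, without decoding them.
with open(src, "rb") as f:
    decrypted = fernet.decrypt(f.read())
with open(dst, "wb") as f:
    f.write(decrypted)

with open(dst, "rb") as f:
    restored = f.read()
print(restored == original)
```

Calling decode() on bytes like these would either fail or produce unreadable text, because a PDF is a binary format rather than a UTF-8 document.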

How to save a path to json file as raw string

I want to save a path to a JSON file, with the code below:
import json
import os

def writeToJasonFile(results, filename):
    with open(os.path.join(filename), "w") as fp:
        try:
            fp.write(json.dumps(results))
        except Exception as e:
            # the %s placeholder was missing, which itself raised a TypeError
            raise Exception("Exception while writing results %s" % e)

if __name__ == '__main__':
    file_path = os.getcwd()
    writeToJasonFile(file_path, 'test.json')
When I open the JSON file, the string is saved with escaped backslashes: "C:\\test\\Python Script"
How can I dump it as a raw string, i.e. "C:\test\Python Script"?
I found a workaround: I replace '\' with '/' in the path string and then save it. Windows is able to open the location in this format: "C:/test/Python Script". If someone has an answer to the original question, please post it here.
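For context, the doubled backslashes belong to the JSON file format and not to the stored value: JSON requires a backslash inside a string to be escaped, and a JSON parser undoes the escaping when it reads the file. A small sketch using the sample path from the question:

```python
import json

path = r"C:\test\Python Script"  # sample path taken from the question

encoded = json.dumps(path)     # what ends up in the file
decoded = json.loads(encoded)  # what a JSON reader gets back

print(encoded)
print(decoded)
```

So as long as the file is read back with a JSON parser, the path comes out with single backslashes again. A file containing "C:\test\Python Script" with unescaped backslashes would not be valid JSON.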

Parsing XML from websites and save the code?

I would like to parse the XML from a website like
http://ops.epo.org/3.1/rest-services/published-data/publication/docdb/EP1000000/biblio
and save it in another XML or CSV file.
I tried it with this:
import urllib.request
web_data = urllib.request.urlopen("http://ops.epo.org/3.1/rest-services/published-data/publication/docdb/EP1000000/biblio")
str_data = web_data.read()
try:
    f = open("file.xml", "w")
    f.write(str(str_data))
    print("SUCCESS")
except:
    print("ERROR")
But in the saved file there is a '\n' between every element and a b' at the beginning.
How can I save the XML data without all the '\n' and the b' ?
If you write the XML file in binary mode, you don't need to convert the data you read into a string of characters first. That also gets rid of the literal '\n' and b' in the output, which were produced by calling str() on the bytes. You can process the response a line at a time as well. The logic of your code could also be structured a little better IMO, as shown below:
import urllib.request
web_data = urllib.request.urlopen("http://ops.epo.org/3.1/rest-services"
                                  "/published-data/publication"
                                  "/docdb/EP1000000/biblio")
with open("file.xml", "wb") as f:
    for line in web_data:  # the response object yields one bytes line at a time
        try:
            f.write(line)
        except Exception as exc:
            print('ERROR')
            print(str(exc))
            break
    else:
        # runs only if the loop finished without hitting break
        print('SUCCESS')
read() returns the data as bytes, but you can save it without converting it with str(). You have to open the file in binary mode ("wb") and write the data.
import urllib.request
web_data = urllib.request.urlopen("http://ops.epo.org/3.1/rest-services/published-data/publication/docdb/EP1000000/biblio")
data = web_data.read()
try:
    with open("file.xml", "wb") as f:
        f.write(data)
    print("SUCCESS")
except:
    print("ERROR")
BTW: To convert bytes to a string (unicode) you have to use decode(), e.g. decode('utf-8').
If you use str(), Python builds the string from the printable representation of the bytes object, so it adds the b' prefix to tell you that the data is bytes and shows every newline as the two characters '\n'.
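The difference is easy to see on a small bytes value. The XML snippet here is a made-up stand-in for the downloaded data:

```python
data = b"<a>\n<b/>\n</a>"  # hypothetical stand-in for web_data.read()

as_repr = str(data)             # the repr of the bytes object, not its text
as_text = data.decode("utf-8")  # the actual text, with real newlines

print(as_repr)
print(as_text)
```

str(data) keeps the b prefix and the quotes and shows each newline as a backslash followed by n, which is exactly what ended up in the saved file. decode() returns the text itself.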

Python JSON encoding Verification

I am trying to scan a text document that I have, find certain sections, and output them to a file in JSON format.
Unfortunately I am not too sure how to use json and would appreciate it if someone could tell me how to encode it as JSON properly.
Thank you everyone!
# save word and type to database
word = [{'WORD':strWrd , 'TYPE':strWrdtyp}]
with open(input_lang+'.dic', 'a') as outfile:
    try:
        json.dump(word, outfile)
        outfile.write('\n')
        # no explicit close needed: the with block closes the file
    except (TypeError, ValueError) as err:
        print('Error: %s' % err)
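One way to check that what was written is valid JSON is to read it back. The sketch below is only an illustration under assumptions: the helper names append_entry and load_entries, the file name en.dic and the sample words are all invented. It appends one JSON object per line and then parses every line again.

```python
import json
import os
import tempfile


def append_entry(path, word, word_type):
    """Append one word record as a single line of JSON."""
    with open(path, 'a') as outfile:
        json.dump({'WORD': word, 'TYPE': word_type}, outfile)
        outfile.write('\n')


def load_entries(path):
    """Parse every non-empty line; json.loads raises ValueError on bad JSON."""
    with open(path) as infile:
        return [json.loads(line) for line in infile if line.strip()]


path = os.path.join(tempfile.mkdtemp(), 'en.dic')
append_entry(path, 'run', 'verb')
append_entry(path, 'tree', 'noun')

entries = load_entries(path)
print(entries)
```

Because the file is opened in append mode, the file as a whole is a sequence of JSON documents, one per line, rather than a single JSON document. That is why it is read back line by line.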

Error handling but still getting a ValueError

I am trying to solve this challenge about error-handling. Maybe I'm way off!
The challenge description:
Write a function called "load_file" that accepts one parameter: a filename. The function should open the file and return the contents.
If the contents of the file can be interpreted as an integer, return the contents as an integer. Otherwise, if the contents of the file can be interpreted as a float, return the contents as a float. Otherwise, return the contents of the file as a string.
You may assume that the file has only one line.
I get ValueError: could not convert string to float: "b>a!\{\'"
Am I all wrong about the error-handling?
def load_file(file):
    file = open(file, "r")
    all_lines = file.read()
    try:
        return int(all_lines)
    except ValueError:
        return float(all_lines)
    else:
        return all_lines
    file.close()
You need to do something like
def load_file(file):
    with open(file, "r") as file_handle:
        all_lines = file_handle.read()  # read from the handle, not the filename
    try:
        return int(all_lines)
    except ValueError:
        pass
    try:
        return float(all_lines)
    except ValueError:
        pass
    return all_lines
The point is you don't really care about the errors at all, because they just mean you need to proceed to the next option.
I would also point out that the with construct takes care of closing the file for you. If you want to do file = open(file, "r") then you will need to store your return value to a variable, and then do file.close() before you return.
You handle the ValueError raised by the int() function, but float() can also raise such an error. The purpose of the try/except structure is to run code inside the try block that may raise an exception, such as a ValueError, and to execute "error handler" code inside the except block.
When you try to parse the contents as a float, an exception can be raised as well. You can try something like this:
def load_file(file):
    # with closes the file; a file.close() placed after the returns would never run
    with open(file, "r") as f:
        all_lines = f.read()
    try:
        return int(all_lines)
    except ValueError:
        try:
            return float(all_lines)
        except ValueError:
            return all_lines
You can nest try blocks inside except blocks to get it to do what you want.
The problem with your approach is that float(all_lines) can fail, but that exception isn't handled.
So it should be:
try:
    return int(all_lines)
except ValueError: # handle the exception if it's not an integer
    try:
        return float(all_lines)
    except ValueError: # handle the exception if it's not a float
        return all_lines
But you could also just suppress the errors (contextlib.suppress requires Python 3.4 or newer though). This could reduce the length of the code and the number of nested try and except blocks:
from contextlib import suppress
def load_file(file):
    with open(file, "r") as file: # using open with "with" closes the file automatically.
        all_lines = file.read()
    with suppress(ValueError):
        return int(all_lines)
    with suppress(ValueError):
        return float(all_lines)
    return all_lines
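To see the three cases from the challenge in action, here is a small self-contained check of the suppress-based version. The helper write_temp and the sample contents are made up for the demonstration; the third sample is a shortened form of the string from the error message.

```python
import os
import tempfile
from contextlib import suppress


def load_file(filename):
    with open(filename, "r") as handle:
        contents = handle.read()
    with suppress(ValueError):
        return int(contents)    # int() tolerates surrounding whitespace
    with suppress(ValueError):
        return float(contents)
    return contents             # neither conversion worked: return the text


def write_temp(text):
    """Hypothetical helper: write text to a new temporary file, return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    return path


samples = ["42\n", "3.5\n", "b>a!\n"]
results = [load_file(write_temp(text)) for text in samples]
print(results)
```

Note that the string case returns the contents unchanged, including the trailing newline, while the numeric cases ignore it because int() and float() strip whitespace.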
