Heroku Changes Encoding Automatically

I have created a Python/Selenium app that runs perfectly on my local machine. The app reads information from a file called 'config.txt'. When I deploy the files to Heroku and run my app, I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
This is the function I use to open the text file:
def config_parse(path):
    with open(path, "r+") as file:
        lines = file.readlines()
        if lines:
            for line in lines:
                splitted = line.replace('"', "").replace("\n", "").split(",")
                data = {'name': splitted[0], 'city': splitted[1], 'date': splitted[2],
                        "time": splitted[3], 'priority': splitted[4], 'guests': splitted[5]}
                try:
                    data['frequency'] = splitted[6]
                    data['ping_date'] = splitted[7]
                    data['ping_time'] = splitted[8]
                except IndexError:
                    # the last three fields are optional
                    pass
                entries.append(data)
            #print("Parsed Data: %s" % entries)
            file.truncate(0)
The error is raised by the readlines() call.
I checked the file's charset through the Heroku console, and it is reported as utf-16-le, whereas it is utf-8 on my local machine; the file is also already empty. I tried changing the locale and adding config vars on Heroku, but neither resolved it. Why does Heroku make all the files either us-ascii or utf-16-le?
When I edit the file in the Heroku console and open it from the Python console, it opens successfully, but when I run the script, it fails to open it.
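Byte 0xff at position 0 is consistent with a UTF-16 little-endian byte-order mark (FF FE), which would match the utf-16-le charset reported above. The following is a minimal sketch, assuming the file really does start with a BOM, that picks a codec from the BOM before decoding. The helper names detect_bom_encoding and read_text_with_bom are hypothetical and not part of the original app.

```python
import codecs


def detect_bom_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return a codec name based on a leading byte-order mark, if any."""
    # Check UTF-32 first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
    if raw.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # No BOM found: fall back to the caller's default (an assumption).
    return default


def read_text_with_bom(path: str) -> str:
    """Read a file as bytes, then decode it with the BOM-derived codec."""
    with open(path, "rb") as fh:
        raw = fh.read()
    return raw.decode(detect_bom_encoding(raw))
```

The "utf-16", "utf-32" and "utf-8-sig" codecs consume the BOM themselves, so the returned text does not start with a stray U+FEFF character.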

Related

Stuck porting an FTP upload script from Python 2.x to Python 3.x

This Python script uploads various types of files via FTP from a local Raspberry Pi to a remote web server.
The original runs without any problems on several Raspberry Pis under Python 2.x and Raspbian Buster (and earlier Raspbian versions).
The txt file for this upload is generated by a Lua script like the one below:
file = io.open("/home/pi/PVOutput_Info.txt", "w+")
-- Opens a file named PVOutput_Info.txt (stored under the designated sub-folder of Domoticz)
file:write(" === PV-generatie & Consumptie === \n")
file:write(" Datum = " .. WSDatum .. "\n")
file:write(" Tijd = " .. WSTijd .. "\n")
file:close() -- closes the open file
os.execute("chmod a+rw /home/pi/PVTemp_Info.txt")
I am trying to port this simplest version to Python 3.x and Raspbian Bullseye, but I am stuck on the reported error.
It looks as if the codec now has a problem with a byte 0xb0 in the txt file.
Is there any remedy or hint to work around this problem?
#!/usr/bin/python3
# (c)2017 script compiled by Toulon7559 from various material from forums, version 0.1 for upload of *.txt to /
# Original script running under Python2.x and Raspian_Buster
# Version 0165P3 of 20230201 is an experimental adaptation towards Python3.x and Raspian_Bullseye
# --------------------------------------------------
# Line006 = Function for FTP_UPLOAD to Server
# --------------------------------------------------
# Imports for script-operation
import ftplib
import os
# Definition of Upload_function
def upload(ftp, file):
    ext = os.path.splitext(file)[1]
    if ext in (".txt", ".htm", ".html"):
        ftp.storlines("STOR " + file, open(file))
    else:
        ftp.storbinary("STOR " + file, open(file, "rb"), 1024)
# --------------------------------------------------
# Line020 = Actual FTP-Login & -Upload
# --------------------------------------------------
ftp = ftplib.FTP("<FTP_server>")
ftp.login("<Login_UN>", "<login_PW>")
# set path to destination directory
ftp.cwd('/')
# set path to source directory
os.chdir("/home/pi/")
# upload of TXT-files
upload(ftp, "PVTemp_Info.txt")
upload(ftp, "PVOutput_Info.txt")
# reset path to root
ftp.cwd('/')
print ('End of script Misc_Upload_0165P3')
print()
PuTTY CLI command:
sudo python3 /home/pi/domoticz/scripts/python/Misc_upload_0165P3a.py
Resulting report at PuTTY's CLI:
Start of script Misc_Upload_0165P3
Traceback (most recent call last):
  File "/home/pi/domoticz/scripts/python/Misc_upload_0165P3a.py", line 39, in <module>
    upload(ftp, "PVTemp_Info.txt")
  File "/home/pi/domoticz/scripts/python/Misc_upload_0165P3a.py", line 25, in upload
    ftp.storlines("STOR " + file, open(file))
  File "/usr/lib/python3.9/ftplib.py", line 519, in storlines
    buf = fp.readline(self.maxline + 1)
  File "/usr/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 175: invalid start byte

I'm afraid there is no easy 1:1 mapping to Python 3. Two simple, though not exactly equivalent, solutions for Python 3 would be:
Consider uploading all files in binary mode, i.e. get rid of the
if ext in (".txt", ".htm", ".html"):
    ftp.storlines("STOR " + file, open(file))
else:
Or open the text file using the encoding that the files actually use (you have to find out which one):
open(file, encoding='cp1252')
See Error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
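Finding the actual encoding can be approached by trial. The sketch below tries a short list of candidate codecs in order and reports which one succeeded. The function name decode_with_fallback and the candidate list are assumptions for illustration; in cp1252 the byte 0xb0 is the degree sign, which would be plausible for a temperature report, but that is only a guess about the asker's file.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252")):
    """Try each candidate codec in order; return (text, codec_name)."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            # This codec rejects the bytes; try the next candidate.
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Hypothetical sample line containing the problematic byte 0xb0.
text, used = decode_with_fallback(b"Temp = 21 \xb0C")
print(used, text)
```

A successful decode only shows that the bytes are valid in that codec, not that the codec is the right one, so the result is worth checking by eye.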
If you really need the exact functionality that you had in Python 2 (that is, uploading any text file, in whatever encoding, using FTP text transfer mode), it is more complicated. Python 2 basically just translates any CR/LF EOL sequence in the file to CRLF (which is what the FTP specification requires), keeping the rest of the file intact.
You can copy the FTP.storbinary code and implement the above translation on buf byte-wise (without the decoding/re-encoding that Python 3's FTP.storlines/readline does).
If the files are not huge, a simple implementation is to load the whole file into memory, convert it in memory, and upload it. This is not difficult if you know that all your files use the same EOL sequence. If not, the translation might be more difficult.
Or you may even give up on the translation, as most FTP servers do not care (they can handle any common EOL sequence). Just use the FTP.storbinary code as it is, only changing TYPE I to TYPE A (which you need to do even if you implement the translation as per the previous point).
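The in-memory variant described above could look roughly like this. It is a sketch, not a drop-in replacement: normalize_eol_to_crlf and upload_text_type_a are hypothetical names, the second function mirrors the structure of FTP.storbinary with TYPE A instead of TYPE I, and it assumes a plain (non-TLS) connection and files small enough to hold in memory.

```python
import re


def normalize_eol_to_crlf(data: bytes) -> bytes:
    """Turn CRLF, lone CR and lone LF into CRLF; leave other bytes as they are."""
    # CRLF is listed first so an existing CRLF is matched as one unit.
    return re.sub(rb"\r\n|\r|\n", b"\r\n", data)


def upload_text_type_a(ftp, filename):
    """Send a file in ASCII transfer mode without decoding its contents.

    `ftp` is expected to be a logged-in ftplib.FTP instance.
    """
    with open(filename, "rb") as fh:
        payload = normalize_eol_to_crlf(fh.read())
    ftp.voidcmd("TYPE A")
    with ftp.transfercmd("STOR " + filename) as conn:
        conn.sendall(payload)
    return ftp.voidresp()
```

Because the data stays as bytes from disk to socket, a byte such as 0xb0 is sent unchanged regardless of the file's encoding.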
By the way, you also need to close the file in any case, and Python 3's storlines reads from a file object opened in binary mode, so the correct code would be:
with open(file, "rb") as f:
    ftp.storlines("STOR " + file, f)
Likewise for storbinary.
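Putting the points about closing the file together, the asker's upload function might be rewritten as below. This is a sketch under one assumption worth checking against your Python version: the Python 3 documentation describes the fp argument of FTP.storlines as a file object opened in binary mode, so both branches open the file with "rb" and no text decoding takes place.

```python
import os


def upload(ftp, filename):
    """Upload one file; text-like extensions use ASCII mode, others binary mode."""
    ext = os.path.splitext(filename)[1]
    # Binary mode in both branches: the file's bytes are never decoded.
    with open(filename, "rb") as fh:
        if ext in (".txt", ".htm", ".html"):
            return ftp.storlines("STOR " + filename, fh)
        return ftp.storbinary("STOR " + filename, fh, 1024)
```

The with block closes the file even if the transfer raises an exception.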

Read and upload to GitHub non UTF-8 file Python

I have code that uploads a SQLite3 file to GitHub (using the PyGithub module).
import github
with open('server.db', 'r') as file:
    content = file.read()
g = github.Github('token')
repo = g.get_user().get_repo("my-repo")
file = repo.get_contents("server.db")
repo.update_file("server.db", "Python Upload", content, file.sha, branch="main")
If you open this file in a text editor, you will see bytes that are not valid UTF-8, since this is a database file. I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd9 in position 99: invalid continuation byte
How can I fix it?
Maybe I can upload the file to GitHub as a non-text file, like a PNG?

I use this:
f = open("yourtextfile", encoding="utf8")
contents = get_blob_content(repo, branch="main",path_name="yourfile")
repo.delete_file("yourfile", "test title", contents.sha)
repo.create_file("yourfile", "test title", f.read())
and this helper function:
def get_blob_content(repo, branch, path_name):
    # first get the branch reference
    ref = repo.get_git_ref(f'heads/{branch}')
    # then get the tree
    tree = repo.get_git_tree(ref.object.sha, recursive='/' in path_name).tree
    # look for path in tree
    sha = [x.sha for x in tree if x.path == path_name]
    if not sha:
        # well, not found..
        return None
    # we have sha
    return repo.get_git_blob(sha[0])
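For a database file, the UnicodeDecodeError in the question comes from opening the file in text mode. A possible sketch is to read the file in binary mode and hand the raw bytes to update_file. This assumes that the installed PyGithub version accepts bytes for the content argument, which is worth verifying in its documentation; upload_binary_file is a hypothetical helper name, and repo is the object returned by get_repo in the question.

```python
def upload_binary_file(repo, local_path, repo_path, message, branch="main"):
    """Replace repo_path on the given branch with the raw bytes of local_path."""
    # "rb" avoids any UTF-8 decoding of the database contents.
    with open(local_path, "rb") as fh:
        content = fh.read()
    # The current blob SHA is required to update an existing file.
    existing = repo.get_contents(repo_path, ref=branch)
    return repo.update_file(repo_path, message, content, existing.sha, branch=branch)
```

With the names from the question, a call would look like upload_binary_file(repo, "server.db", "server.db", "Python Upload").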

Save unicode text from response without encoding into file

I want to download the config file from my router via web scraping. The procedure I want to achieve is this:
Save the config file to disk
Send a factory reset
Load the previously downloaded config file.
So far, I have this code:
import requests

with requests.Session() as s:  # To log in to the modem
    pagePostBackUp = 'https://192.168.1.1/goform/BackUp'
    s.post(urlLogin, data=loginCredentials, verify=False, timeout=5)
    dataBackUp = {'dir': 'admin/', 'file': 'cmconfig.cfg'}
    resultBackUp = s.post(pagePostBackUp, data=dataBackUp, verify=False, timeout=10)
    print(resultBackUp.text)
The last line prints what I want to save into a file. But when I try to do it with this code:
f = open('/Users/user/Desktop/file.cfg', 'w')
it throws an error saying that the ascii codec can't encode a character. If I save the file with an explicit encoding, for example utf16, the result differs from the file I download manually.
So the question is: how can I save this file with the same encoding the router gives me via the web (as unicode)? The content of the file looks like this:
�����g���m��� ������Z������ofpqJ
U\V,.o/����zf��v���~W3=,�D};y�tL�cJ

Change the last line of your code to the following:
with open('/Users/user/Desktop/file.cfg', 'wb') as f:
    f.write(resultBackUp.content)
This will treat the payload as data (bytes), not text: the file is opened in binary mode, and the content is taken as-is.
There's no encoding/decoding happening.
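The difference can be seen without a router. The sketch below uses a made-up payload (an assumption, not the real config file) to show that writing bytes in binary mode round-trips exactly, while decoding to text with replacement characters and encoding again does not. The replacement characters shown in the question's sample output are consistent with such a lossy decode.

```python
import os
import tempfile

# Hypothetical binary payload standing in for resultBackUp.content.
payload = bytes([0xFF, 0x00, 0x8B, 0x67, 0xB0])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.cfg")
    # Binary mode: bytes go to disk exactly as received.
    with open(path, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:
        round_tripped = f.read()

# Text route: invalid bytes become U+FFFD, so the original bytes are lost.
lossy = payload.decode("utf-8", errors="replace").encode("utf-8")

print(round_tripped == payload)
print(lossy == payload)
```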

How to return file contents from controller?

I'm trying to return the contents of an image file via a Python Connexion application generated from an OpenAPI v2 spec file using swagger-codegen and the python-flask language setting. In my controller module, I simply do the following:
def file_contents_get(file_id):
    file = app.datastore.get_instance().get_file(file_id)
    with open(file.path, "rb") as f:
        return f.read()
However, this results in the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
What is the proper way to return a file's contents? Note that I don't want the file as an attachment but rather inline.

How to decompress a GZIP file pulled from SFTP in Python3 the same way Mac OS's gunzip does it?

Okay, I've been stuck on this one for hours which should have only taken a few minutes of work.
I have the following code which pulls a gzipped CSV file from a datastore:
from ftplib import FTP_TLS
import gzip
import csv
ftps = FTP_TLS('waws-prod.net')
ftps.login(user='foo', passwd='bar')
resp = ftps.retrbinary('RETR data/WFSIV0606201701.700.csv.gz', gzip.open('WFSIV0606201701.700.csv.gz', 'wb').write)
The file appears in the working directory, and I can open it with my Mac decompression tool, which decompresses the original CSV perfectly.
However, if I try to decompress this file using the gzip library, I can't get a UTF-8 encoded string that parses:
f=gzip.GzipFile('WFSIV0606201701.700.csv.gz', 'rb')
s = f.read()
I get what appear to be UTF-8 byte strings, but the utf-8 decoder can't parse them.
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
But if I download the file directly from the SFTP server using FileZilla and run the gzip.GzipFile code above, it reads perfectly. Something must be wrong with my downloader/reader, but I haven't a clue what.

resp = ftps.retrbinary('RETR data/WFSIV0606201701.700.csv.gz', gzip.open('WFSIV0606201701.700.csv.gz', 'wb').write)
This line downloads a compressed file, and then compresses it again when writing it to disk.
Replace gzip.open(...).write with open(...).write to write the compressed file directly.
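The effect can be reproduced locally. In the sketch below (sample data and the helper name download_gz are made up for illustration), compressing already-compressed data means a single decompression yields gzip bytes again. Those bytes start with the gzip magic number 1f 8b, which fits the error message about byte 0x8b in position 1.

```python
import gzip

original = b"id,name\n1,foo\n"      # hypothetical CSV content
once = gzip.compress(original)       # what the server already stores
twice = gzip.compress(once)          # what writing through gzip.open(..., 'wb') produces

# One round of decompression gives back gzip data, not the CSV.
after_one_pass = gzip.decompress(twice)
after_two_passes = gzip.decompress(after_one_pass)


def download_gz(ftps, remote_path, local_path):
    """Save the remote .gz file byte-for-byte, without recompressing it."""
    with open(local_path, "wb") as fh:
        return ftps.retrbinary("RETR " + remote_path, fh.write)
```

Opening the local file with a with block also makes sure it is flushed and closed before gzip.GzipFile reads it.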
