decode subprocess.Popen and store in file - python

I wrote a script (an addon) for pyLoad.
Basically, it executes FileBot with arguments.
I am trying to capture FileBot's output and store it in the pyLoad log file.
That works until a single character needs to be decoded:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 5: ordinal not in range(128)
I don't know how to handle that.
I hope you can help.
try:
    if self.getConfig('output_to_log') is True:
        log = open('Logs/log.txt', 'a')
        subprocess.Popen(args, stdout=log, stderr=log, bufsize=-1)
Thanks in advance
[edit]
28.05.2015 12:34:06 DEBUG FileBot-Hook: MKV-Checkup (package_extracted)
28.05.2015 12:34:06 DEBUG Hier sind keine Archive
28.05.2015 12:34:06 INFO FileBot: executed
28.05.2015 12:34:06 INFO FileBot: cleaning
Locking /usr/share/filebot/data/logs/amc.log
Done ヾ(@⌒ー⌒@)ノ
Parameter: exec = cd / && ./filebot.sh "{file}"
Parameter: clean = y
Parameter: skipExtract = y
Parameter: reportError = n
Parameter: storeReport = n
Parameter: artwork = n
Parameter: subtitles = de
Parameter: movieFormat = /mnt/HD/Medien/Movies/{n} ({y})/{n} ({y})
Parameter: seriesFormat = /mnt/HD/Medien/TV Shows/{n}/Season {s.pad(2)}/{n} - {s00e00} - {t}
Parameter: extras = n
So I'm guessing this line
Done ヾ(@⌒ー⌒@)ノ
is causing the issue.
When I open the log view in the web GUI, this is the traceback:
Traceback (most recent call last):
File "/usr/share/pyload/module/lib/bottle.py", line 733, in _handle
return route.call(**args)
File "/usr/share/pyload/module/lib/bottle.py", line 1448, in wrapper
rv = callback(*a, **ka)
File "/usr/share/pyload/module/web/utils.py", line 113, in _view
return func(*args, **kwargs)
File "/usr/share/pyload/module/web/pyload_app.py", line 464, in logs
[pre_processor])
File "/usr/share/pyload/module/web/utils.py", line 30, in render_to_response
return t.render(**args)
File "/usr/share/pyload/module/lib/jinja2/environment.py", line 891, in render
return self.environment.handle_exception(exc_info, True)
File "/usr/share/pyload/module/web/templates/Next/logs.html", line 1, in top-level template code
{% extends 'Next/base.html' %}
File "/usr/share/pyload/module/web/templates/Next/base.html", line 179, in top-level template code
{% block content %}
File "/usr/share/pyload/module/web/templates/Next/logs.html", line 30, in block "content"
<tr><td class="logline">{{line.line}}</td><td>{{line.date}}</td><td class="loglevel">{{line.level}}</td><td>{{line.message}}</td></tr>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 5: ordinal not in range(128)

I found a solution: read the output through a pipe and decode each line before logging it.
proc = subprocess.Popen(args, stdout=subprocess.PIPE)
for line in proc.stdout:
    self.logInfo(line.decode('utf-8').rstrip('\r\n'))
proc.wait()
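For anyone doing the same on Python 3, here is a minimal sketch of that idea. It assumes the tool emits UTF-8 and that replacing undecodable bytes is acceptable; the helper name iter_decoded_lines is made up for illustration and is not part of pyLoad or FileBot.

```python
import subprocess
import sys


def iter_decoded_lines(args, encoding="utf-8"):
    """Run a command and yield its output lines as text.

    Hypothetical helper: stderr is merged into stdout, and bytes that are
    not valid in `encoding` become U+FFFD instead of raising an error.
    """
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    try:
        for raw_line in proc.stdout:
            # Decode explicitly so nothing falls back to the ASCII codec later.
            yield raw_line.decode(encoding, errors="replace").rstrip("\r\n")
    finally:
        proc.stdout.close()
        proc.wait()


if __name__ == "__main__":
    # Stand-in command; a real addon would pass the FileBot argument list.
    demo = [sys.executable, "-c", "print('Done')"]
    for line in iter_decoded_lines(demo):
        print(line)
```

Each yielded line is already text, so it can be passed to a logger without a second decode step.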

Related

ignore encoding error when parsing pdf with pdfminer

from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdftypes import resolve1
fn='test.pdf'
with open(fn, mode='rb') as fp:
    parser = PDFParser(fp)
    doc = PDFDocument(parser)
    fields = resolve1(doc.catalog['AcroForm'])['Fields']
    item = {}
    for i in fields:
        field = resolve1(i)
        name, value = field.get('T'), field.get('V')
        item[name] = value
Hello, I need help with this code; it raises a Unicode error on some characters:
Traceback (most recent call last):
File "<stdin>", line 7, in <module>
File "/home/timmy/.local/lib/python3.8/site-packages/pdfminer/pdftypes.py", line 80, in resolve1
x = x.resolve(default=default)
File "/home/timmy/.local/lib/python3.8/site-packages/pdfminer/pdftypes.py", line 67, in resolve
return self.doc.getobj(self.objid)
File "/home/timmy/.local/lib/python3.8/site-packages/pdfminer/pdfdocument.py", line 673, in getobj
stream = stream_value(self.getobj(strmid))
File "/home/timmy/.local/lib/python3.8/site-packages/pdfminer/pdfdocument.py", line 676, in getobj
obj = self._getobj_parse(index, objid)
File "/home/timmy/.local/lib/python3.8/site-packages/pdfminer/pdfdocument.py", line 648, in _getobj_parse
raise PDFSyntaxError('objid mismatch: %r=%r' % (objid1, objid))
File "/home/timmy/.local/lib/python3.8/site-packages/pdfminer/psparser.py", line 85, in __repr__
return self.name.decode('ascii')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)
Is there anything I can add so that it ignores the characters it cannot decode, or at least returns the name with a blank value in name, value = field.get('T'), field.get('V')?
Any help is appreciated.
Here is one way you can fix it. Open the installed file:
nano "/home/timmy/.local/lib/python3.8/site-packages/pdfminer/psparser.py"
Then, at line 85, change __repr__ so that it ignores undecodable bytes:
def __repr__(self):
    return self.name.decode('ascii', 'ignore')  # this fixes it
Editing an installed library's source is generally not recommended, so you should also open an issue on GitHub.
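To see what the second argument to decode() changes, here is a small standalone sketch. It does not touch pdfminer; the function name decode_name and the sample bytes are made up for illustration.

```python
def decode_name(raw, errors="replace"):
    """Decode bytes as ASCII using the given error handler (hypothetical helper)."""
    return raw.decode("ascii", errors=errors)


# Sample value only: 0xe5 is the byte reported in the traceback, the rest is invented.
sample = b"\xe5abc"

for handler in ("ignore", "replace", "backslashreplace"):
    # 'ignore' drops the byte, 'replace' substitutes U+FFFD,
    # 'backslashreplace' keeps it visible as an escape sequence.
    print(handler, repr(decode_name(sample, handler)))
```

The 'ignore' handler silently loses data, so 'replace' or 'backslashreplace' may be the safer choice when the result is only used for display.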

Python3 Tarfile: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position

I am trying to port a piece of python2 code to python3. The code works perfectly in python2, but fails in python3. In the original python2 code, data is being compressed into a tarfile as follows:
_tar = tarfile.open(name, mode="w|")
data = StringIO()
data.write(compress(dumps(probe, HIGHEST_PROTOCOL)))
data.seek(0)
info = tarfile.TarInfo()
info.name = 'Probe_%s.lzo' % dest
info.uid = 0
info.gid = 0
info.size = len(data.buf)
info.mode = S_IMODE(0o0444)
info.mtime = mktime(probe.circs[0].created.timetuple())
_tar.addfile(tarinfo=info, fileobj=data)
Now, in another script, this code is being read in the following way:
with tarfile.open(fileobj=stdin, mode="r|") as tar:
    while True:
        cprobe = tar.next()
        if not cprobe:
            raise StopIteration()
        tarx = tar.extractfile(cprobe)
        if not tarx:
            continue
        yield tarx.read()
The second script is intended to be called in the following way:
cat outputOfFirst | python ./second.py 1> outputOfSecond
This works fine in Python 2. If I take the output of the first script generated with Python 2 and pass it to the second script under Python 3, I get the following error:
with tarfile.open(fileobj=stdin, mode="r|") as tar:
File "/usr/lib/python3.6/tarfile.py", line 1601, in open
t = cls(name, filemode, stream, **kwargs)
File "/usr/lib/python3.6/tarfile.py", line 1482, in __init__
self.firstmember = self.next()
File "/usr/lib/python3.6/tarfile.py", line 2297, in next
tarinfo = self.tarinfo.fromtarfile(self)
File "/usr/lib/python3.6/tarfile.py", line 1092, in fromtarfile
buf = tarfile.fileobj.read(BLOCKSIZE)
File "/usr/lib/python3.6/tarfile.py", line 539, in read
buf = self._read(size)
File "/usr/lib/python3.6/tarfile.py", line 547, in _read
return self.__read(size)
File "/usr/lib/python3.6/tarfile.py", line 572, in __read
buf = self.fileobj.read(self.bufsize)
File "/usr/lib/python3.6/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position 512: invalid continuation byte
What would be the Python 3 equivalent of this? My understanding is that I have to somehow encode the stdin part to something like "latin-1", but I am not sure how that would be done.
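A hedged note on the Python 3 side: sys.stdin there is a text stream that decodes its input (as UTF-8 in the traceback above), while tarfile needs raw bytes. The underlying binary stream is available as sys.stdin.buffer. The sketch below assumes that is the only change the reader needs; the names iter_member_payloads and build_demo_tar are made up for illustration, and the demo archive is built in memory instead of being read from a pipe.

```python
import io
import tarfile


def iter_member_payloads(stream):
    """Yield the raw bytes of every regular file in a tar stream.

    `stream` must be a binary file object. In the piped script this would be
    sys.stdin.buffer (an assumption about how the script is wired up).
    """
    with tarfile.open(fileobj=stream, mode="r|") as tar:
        for member in tar:
            extracted = tar.extractfile(member)
            if extracted is None:
                # Directories and other non-file members have no payload.
                continue
            yield extracted.read()


def build_demo_tar():
    """Build a small in-memory archive whose payload is not valid text."""
    payload = b"\xf0\x9f\x98\x80 binary payload"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w|") as tar:
        info = tarfile.TarInfo(name="Probe_demo.lzo")
        info.size = len(payload)
        tar.addfile(tarinfo=info, fileobj=io.BytesIO(payload))
    buf.seek(0)
    return buf


for payload in iter_member_payloads(build_demo_tar()):
    print(len(payload), "bytes read")
```

Iterating over the archive ends on its own when the members run out, so no explicit StopIteration is needed inside the generator.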

UnicodeDecodeError: 'utf8' codec can't decode byte 0x89 in position 51: invalid start byte

An error occurred when compiling "QRCODE.py"
from pyshorteners import Shortener
class Shortening():
    def __init__(self):
        self.shortener=Shortener('Tinyurl')
        fo = open('/home/jayu/Desktop/qr.png','r+')
        apiKey = fo.read()
        self.shortener = Shortener('Google',api_key = apiKey)
    def shortenURL(self):
        self.url = raw_input("Enter The Url to shortener : ");
        shortener = self.shortener.short(self.url)
        print ("the short url : " +shortenURL)
    def decodeURL(self):
        self.url = raw_input("Enter The Url to expand: ");
        expandURL = self.shortener.expand(self.url)
        print ("the short url : " +expandURL);
    def generateQRcode(self):
        self.url = raw_input("Enter the URL to get QR code :")
        self.shortener.short(self.url)
        print (self.shortener.qrcode(150,150))
app = Shortening()
option = int (input("Enter ur choice : "))
if option==1:
    app.shortenURL()
elif option==2:
    decodeURL()
elif option==3:
    app.generateQRcode()
else:
    print ("wrong ")
jayu#jayu:~/Desktop$ python QRCODE.py
Enter ur choice : 3
Enter the URL to get QR code :http://www.google.com
Traceback (most recent call last):
File "QRCODE.py", line 29, in <module>
app.generateQRcode()
File "QRCODE.py", line 19, in generateQRcode
self.shortener.short(self.url)
File "/home/jayu/.local/lib/python2.7/site-packages/pyshorteners/shorteners/__init__.py", line 115, in short
self.shorten = self._class(**self.kwargs).short(url)
File "/home/jayu/.local/lib/python2.7/site-packages/pyshorteners/shorteners/googl.py", line 25, in short
response = self._post(url, data=params, headers=headers)
File "/home/jayu/.local/lib/python2.7/site-packages/pyshorteners/shorteners/base.py", line 32, in _post
timeout=self.kwargs['timeout'])
File "/home/jayu/.local/lib/python2.7/site-packages/requests/api.py", line 112, in post
return request('post', url, data=data, json=json, **kwargs)
File "/home/jayu/.local/lib/python2.7/site-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/home/jayu/.local/lib/python2.7/site-packages/requests/sessions.py", line 498, in request
prep = self.prepare_request(req)
File "/home/jayu/.local/lib/python2.7/site-packages/requests/sessions.py", line 441, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "/home/jayu/.local/lib/python2.7/site-packages/requests/models.py", line 309, in prepare
self.prepare_url(url, params)
File "/home/jayu/.local/lib/python2.7/site-packages/requests/models.py", line 359, in prepare_url
url = url.decode('utf8')
File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x89 in position 51: invalid start byte
What is the cause of the error? The Python version is 2.7.15rc1.
Each time I run python QRCODE.py, I get the same byte position in the traceback.
Can anyone correct me?
If the problem comes from the open(...) call, you need to set the encoding there:
fo = open(filename, something_else, encoding = 'UTF-8')
That only works in Python 3, though. In Python 2 you need to use io.open:
fo = io.open(filename, something_else, encoding = 'UTF-8')
I don't remember the full syntax, so look it up, but I have already answered a similar question here: unable to decode this string using python
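One possible reading of the traceback, offered as a guess and not a confirmed diagnosis: the script reads qr.png and passes its contents as the API key, and every PNG file begins with the byte 0x89, which is exactly the byte the error complains about. The sketch below (Python 3, helper name made up) shows that this byte cannot start a UTF-8 sequence, so no encoding argument to open() would turn the image into a usable key.

```python
# The fixed 8-byte signature at the start of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def first_undecodable_byte(data, encoding="utf-8"):
    """Return (position, byte) of the first decoding failure, or None.

    Hypothetical helper, written only to illustrate the error message.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start:exc.start + 1]
    return None


print(first_undecodable_byte(PNG_SIGNATURE))
print(first_undecodable_byte(b"plain ascii key"))
```

If that reading is right, the fix would be to read the key from a text file that actually contains it, not from the image.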

rendering a file in python using pygal - ascii code error

I am trying to create a pygal chart in Python and save it to an .svg file.
#Creating pygal charts
pie_chart = pygal.Pie(style=DarkSolarizedStyle, legend_box_size = 20, pretty_print=True)
pie_chart.title = 'Github-Migration Status Chart (in %)'
pie_chart.add('Intro', int(intro))
pie_chart.add('Parallel', int(parallel))
pie_chart.add('In Progress', int(in_progress) )
pie_chart.add('Complete', int(complete))
pie_chart.render_to_file('../../../../../usr/share/nginx/html/TeamFornax/githubMigration/OverallProgress/overallProgress.svg')
This simple piece of code seems to give the error -
> Traceback (most recent call last):
> File "/home/ec2-user/githubr/migrationcharts.py", line 161, in <module>
> pie_chart.render_to_file('../../../../../usr/share/nginx/html/TeamFornax/githubMigration/OverallProgress/overallProgress.svg')
> File "/usr/lib/python2.6/site-packages/pygal/ghost.py", line 149, in render_to_file
> f.write(self.render(is_unicode=True, **kwargs))
> File "/usr/lib/python2.6/site-packages/pygal/ghost.py", line 112, in render
> .render(is_unicode=is_unicode))
> File "/usr/lib/python2.6/site-packages/pygal/graph/base.py", line 293, in render
> is_unicode=is_unicode, pretty_print=self.pretty_print)
> File "/usr/lib/python2.6/site-packages/pygal/svg.py", line 271, in render
> self.root, **args)
> File "/usr/lib64/python2.6/xml/etree/ElementTree.py", line 1010, in tostring
> return string.join(data, "")
> File "/usr/lib64/python2.6/string.py", line 318, in join
> return sep.join(words)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 40: ordinal not in range(128)
Any idea why?
Try decoding the path string that you pass to render_to_file into unicode, for example:
pie_chart.render_to_file('path/to/overallProgress.svg'.decode('utf-8'))
The charset you decode with should match the encoding of your source file.

Google App Engine: UnicodeDecode Error in bulk data upload

I'm getting an odd error with Google App Engine devserver 1.3.5, and Python 2.5.4, on Windows.
A sample row in the CSV:
EQS,550,foobar,"<some><html><garbage /></html></some>",odp,Ti4=,http://url.com,success
The error:
..................................................................................................................[ERROR ] [Thread-1] WorkerThread:
Traceback (most recent call last):
File "C:\Program Files\Google\google_appengine\google\appengine\tools\adaptive_thread_pool.py", line 150, in WorkOnItems
status, instruction = item.PerformWork(self.__thread_pool)
File "C:\Program Files\Google\google_appengine\google\appengine\tools\bulkloader.py", line 695, in PerformWork
transfer_time = self._TransferItem(thread_pool)
File "C:\Program Files\Google\google_appengine\google\appengine\tools\bulkloader.py", line 852, in _TransferItem
self.request_manager.PostEntities(self.content)
File "C:\Program Files\Google\google_appengine\google\appengine\tools\bulkloader.py", line 1296, in PostEntities
datastore.Put(entities)
File "C:\Program Files\Google\google_appengine\google\appengine\api\datastore.py", line 282, in Put
req.entity_list().extend([e._ToPb() for e in entities])
File "C:\Program Files\Google\google_appengine\google\appengine\api\datastore.py", line 687, in _ToPb
properties = datastore_types.ToPropertyPb(name, values)
File "C:\Program Files\Google\google_appengine\google\appengine\api\datastore_types.py", line 1499, in ToPropertyPb
pbvalue = pack_prop(name, v, pb.mutable_value())
File "C:\Program Files\Google\google_appengine\google\appengine\api\datastore_types.py", line 1322, in PackString
pbvalue.set_stringvalue(unicode(value).encode('utf-8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 36: ordinal not in range(128)
[INFO ] Unexpected thread death: Thread-1
[INFO ] An error occurred. Shutting down...
..[ERROR ] Error in Thread-1: 'ascii' codec can't decode byte 0xe8 in position 36: ordinal not in range(128)
Is the error being generated by an issue with a base64 string, of which there is one in every row?
KGxwMAoobHAxCihTJ0JJT0VFJwpwMgpJMjYxMAp0cDMKYWEu
KGxwMAoobHAxCihTJ01BVEgnCnAyCkkyOTQwCnRwMwphYS4=
The data loader:
class CourseLoader(bulkloader.Loader):
    def __init__(self):
        bulkloader.Loader.__init__(self, 'Course',
                                   [('dept_code', str),
                                    ('number', int),
                                    ('title', str),
                                    ('full_description', str),
                                    ('unparsed_pre_reqs', str),
                                    ('pickled_pre_reqs', lambda x: base64.b64decode(x)),
                                    ('course_catalog_url', str),
                                    ('parse_succeeded', lambda x: x == 'success')
                                    ])
loaders = [CourseLoader]
Is there a way to tell from the traceback which row caused the error?
UPDATE: It looks like there are two characters causing errors: è, and ®. How can I get Google App Engine to handle them?
Looks like some row of the CSV has some non-ascii data (maybe a LATIN SMALL LETTER E WITH GRAVE -- that's what 0xe8 would be in ISO-8859-1, for example) and yet you're mapping it to str (should be unicode, and I believe the CSV should be in utf-8).
To find if any row of a text file has non-ascii data, a simple Python snippet will help, e.g.:
>>> f = open('thefile.csv')
>>> prob = []
>>> for i, line in enumerate(f):
...     try: unicode(line)
...     except UnicodeDecodeError: prob.append(i)
...
>>> print 'Problems in %d lines:' % len(prob)
>>> print prob
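The session above is Python 2. On Python 3 the same check could look roughly like the sketch below. The function name find_non_ascii_lines is made up, and it reports 1-based line numbers, whereas the snippet above collects 0-based indexes.

```python
def find_non_ascii_lines(path):
    """Return the 1-based numbers of lines that contain non-ASCII bytes."""
    problems = []
    # Open in binary mode so Python does no decoding on its own.
    with open(path, "rb") as handle:
        for number, raw_line in enumerate(handle, start=1):
            try:
                raw_line.decode("ascii")
            except UnicodeDecodeError:
                problems.append(number)
    return problems
```

Calling find_non_ascii_lines('thefile.csv') returns a list that can be printed or used to pull up the offending rows.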
