Trying to use jsonschema with just file paths - Python

I've uploaded a JSON file from the user and now I'm trying to compare that JSON to a schema using the jsonschema validator. I'm getting an error:
ValidationError: is not of type u'object'
Failed validating u'type' in schema
This is my code so far:
from __future__ import unicode_literals
from django.shortcuts import render, redirect
import jsonschema
import json
import os
from django.conf import settings
# File to store all the parsers
def jsonVsSchemaParser(project, file):
    baseProjectURL = 'src\media\json\schema'
    projectSchema = project.lower() + '.schema'
    projectPath = os.path.join(baseProjectURL, projectSchema)
    filePath = os.path.join(settings.BASE_DIR, 'src\media\json', file)
    actProjectPath = os.path.join(settings.BASE_DIR, projectPath)
    print filePath, actProjectPath
    schemaResponse = open(actProjectPath)
    schema = json.load(schemaResponse)
    response = open(filePath)
    jsonFile = json.load(response)
    jsonschema.validate(jsonFile, schema)
I'm trying to do something similar to this question except instead of using a url I'm using my filepath.
Also I'm using python 2.7 and Django 1.11 if that is helpful at all.
Also I'm pretty sure I don't have a problem with my filepaths because I printed them and it outputted what I was expecting. I also know that my schema and json can be read by jsonschema since I used it on the command line as well.
EDIT: That validation error seems to have been a fluke; the error I consistently get is "-1 is not of type u'string'". The annoying thing is that this is expected: sessionid really isn't a string in the uploaded file, and I do want jsonschema to flag it. I just don't want my validation errors delivered in that raw format. What I want to do is collect all validation errors in an array and then present them to the user on the next page.

I just ended up putting a try/except around my validate method. Here's what it looks like:
validationErrors = []
try:
    jsonschema.validate(jsonFile, schema)
except jsonschema.exceptions.ValidationError as error:
    validationErrors.append(error)
EDIT: This solution only works if you have one error, because the first validation error raised breaks out of the validate method. In order to present every error you need to use lazy validation. This is how it looks in my code if you need another example:
v = jsonschema.Draft4Validator(schema)
for error in v.iter_errors(jsonFile):
    validationErrors.append(error)
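If you then want something readable to show on the next page, here is a small sketch (my addition, not part of the original code) that pulls a human-friendly message and the offending path off each ValidationError before handing the list to a template; error.message and error.path are part of jsonschema's public API:

v = jsonschema.Draft4Validator(schema)
validationErrors = []
for error in v.iter_errors(jsonFile):
    # e.g. "sessionid: -1 is not of type u'string'"
    path = '.'.join(str(p) for p in error.path) or '<root>'
    validationErrors.append('%s: %s' % (path, error.message))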

The try-except-else-finally statement is a great way to catch and handle exceptions (runtime errors) in Python.
So if you want to catch exceptions and store them in an array, the solution is a try-except statement. That way you can catch each exception, store it in any data structure you like (a list, for example), and your program will continue executing instead of terminating.
Below is a modified version of your code where I have used a for loop which catches the error 5 times and stores it in a list.
validationErrors = []
for i in range(5):
    try:
        jsonschema.validate(jsonFile, schema)
    except jsonschema.exceptions.ValidationError as error:
        validationErrors.append(error)
Finally, you can have a look at the code sample below, where I have stored ZeroDivisionError and its related string message in 2 different lists by iterating over a for loop 5 times.
You can pass the second list, ZeroDivisionErrorMessagesList, to your template if you want to print the messages on a web page; the first list works as well.
ZeroDivisionErrorsList = []
ZeroDivisionErrorMessagesList = list()  # list() is the same as []
for i in range(5):
    try:
        a = 10 / 0  # it will raise an exception
        print(a)    # it will not execute
    except ZeroDivisionError as error:
        ZeroDivisionErrorsList.append(error)
        ZeroDivisionErrorMessagesList.append(str(error))
print(ZeroDivisionErrorsList)
print()  # new line
print(ZeroDivisionErrorMessagesList)
» Output:
[ZeroDivisionError('division by zero',),
ZeroDivisionError('division by zero',),
ZeroDivisionError('division by zero',),
ZeroDivisionError('division by zero',),
ZeroDivisionError('division by zero',)]
['division by zero', 'division by zero', 'division by zero', 'division by zero', 'division by zero']
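Applied back to the original question, here is a minimal sketch (my assumption; 'validation_errors.html' is a hypothetical template name, and jsonFile and schema are the parsed objects from the question's code) of a Django view that hands the collected jsonschema messages to the next page:

from django.shortcuts import render
import jsonschema

def showValidationErrors(request, jsonFile, schema):
    # jsonFile and schema are assumed to be the already-parsed objects from above
    v = jsonschema.Draft4Validator(schema)
    messages = [error.message for error in v.iter_errors(jsonFile)]
    # 'validation_errors.html' is a hypothetical template that loops over {{ errors }}
    return render(request, 'validation_errors.html', {'errors': messages})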

Related

How does django really handle multiple requests on development server?

I am making a little Django app to serve translations for my React frontend. The way it works is as follows:
The frontend tries to find a translation using a key.
If the translation for that key is not found, it sends a request to the backend with the missing key.
On the backend, the missing key is appended to a JSON file.
Everything works just fine when the requests are sent one at a time (when one finishes, the next is sent). But when multiple requests are sent at the same time, everything breaks: the JSON file gets corrupted. It looks like all the requests are changing the file at the same time, which would explain the corruption. I'm not sure that's really what happens, because I thought a file could not be edited by two processes at the same time (correct me if I am wrong), but I don't receive any such error, which suggests the requests are handled one at a time according to this and this.
Also, I tried something which, to my surprise, worked: adding time.sleep(1) to the top of my API view. When I did this, everything worked as expected.
What is going on ?
Here is the code, just in case it matters:
@api_view(['POST'])
def save_missing_translation_keys(request, lng, ns):
    time.sleep(1)
    missing_trans_path = MISSING_TRANS_DIR / f'{lng}.json'
    # Read lng file and get current missing keys for given ns
    try:
        with open(missing_trans_path, 'r', encoding='utf-8') as missing_trans_file:
            if is_file_empty(missing_trans_path):
                missing_keys_dict = {}
            else:
                missing_keys_dict = json.load(missing_trans_file)
    except FileNotFoundError:
        missing_keys_dict = {}
    except Exception as e:
        # Even if file is not empty, we might not be able to parse it for some reason, so we log any errors in log file
        with open(MISSING_LOG_FILE, 'a', encoding='utf-8') as logFile:
            logFile.write(
                f'could not save missing keys {str(list(request.data.keys()))}\nnamespace {lng}/{ns} file can not be parsed because\n{str(e)}\n\n\n')
        raise e
    # Add new missing keys to the list above.
    ns_missing_keys = missing_keys_dict.get(ns, [])
    for missing_key in request.data.keys():
        if missing_key and isinstance(missing_key, str):
            ns_missing_keys.append(missing_key)
        else:
            raise ValueError('Missing key not allowed')
    missing_keys_dict.update({ns: list(set(ns_missing_keys))})
    # Write new missing keys to the file
    with open(missing_trans_path, 'w', encoding='utf-8') as missing_trans_file:
        json.dump(missing_keys_dict, missing_trans_file, ensure_ascii=False)
    return Response()
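A minimal sketch of one common fix (my assumption, not an answer from the thread): Django's development server handles requests in multiple threads, so two requests can interleave the read-modify-write on the same file; serialising that critical section with a module-level lock stops the corruption for a single-process server (multiple worker processes would additionally need file locking or a database). The names json, api_view, Response and MISSING_TRANS_DIR are reused from the question's code:

import threading

# Module-level lock shared by every request handled by this process
# (assumption: a single-process server such as runserver).
_missing_keys_lock = threading.Lock()

@api_view(['POST'])
def save_missing_translation_keys(request, lng, ns):
    missing_trans_path = MISSING_TRANS_DIR / f'{lng}.json'
    with _missing_keys_lock:
        # The whole read-modify-write cycle happens under the lock,
        # so concurrent requests can no longer interleave their writes.
        try:
            with open(missing_trans_path, 'r', encoding='utf-8') as f:
                missing_keys_dict = json.load(f)
        except (FileNotFoundError, ValueError):
            missing_keys_dict = {}
        ns_missing_keys = set(missing_keys_dict.get(ns, []))
        # Non-string or empty keys are simply skipped here, purely to keep the sketch short
        ns_missing_keys.update(k for k in request.data.keys() if isinstance(k, str) and k)
        missing_keys_dict[ns] = sorted(ns_missing_keys)
        with open(missing_trans_path, 'w', encoding='utf-8') as f:
            json.dump(missing_keys_dict, f, ensure_ascii=False)
    return Response()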

How to skip one part of a single loop iteration in Python

I am creating about 200 variables within a single iteration of a Python loop (extracting fields from Excel documents and pushing them to a SQL database) and I am trying to figure something out.
Let's say that a single iteration is a single Excel workbook that I am looping through in a directory. I am extracting around 200 fields from each workbook.
If one of the fields I extract (let's say field #56 out of 200) isn't in the proper format (let's say the date was filled out wrong, i.e. 9/31/2015, which isn't a real date), the operation I am performing on it errors out.
I want the loop to skip that variable and proceed to creating variable #57. I don't want the loop to jump to the next iteration or workbook entirely; I just want it to ignore the error on that variable and continue with the rest of the variables for that single loop iteration.
How would I go about doing something like this?
In this sample code I would like to continue extracting "PolicyState" even if ExpirationDate has an error.
Some sample code:
import datetime as dt
import os as os
import xlrd as rd

files = os.listdir(path)
for file in files:  # Loop through all files in path directory
    filename = os.fsdecode(file)
    if filename.startswith('~'):
        continue
    elif filename.endswith(('.xlsx', '.xlsm')):
        try:
            book = rd.open_workbook(os.path.join(path, file))
        except KeyError:
            print("Error opening file for " + file)
            continue
        SoldModelInfo = book.sheet_by_name("SoldModelInfo")
        AccountName = str(SoldModelInfo.cell(1, 5).value)
        ExpirationDate = dt.datetime.strftime(xldate_to_datetime(SoldModelInfo.cell(1, 7).value), '%Y-%m-%d')
        PolicyState = str(SoldModelInfo.cell(1, 6).value)
        print("Insert data of " + file + " was successful")
    else:
        continue
Use multiple try blocks. Wrap each decode operation that might go wrong in its own try block to catch the exception, do something, and carry on with the next one.
try:
    book = rd.open_workbook(os.path.join(path, file))
except KeyError:
    print("Error opening file for " + file)
    continue

errors = []

SoldModelInfo = book.sheet_by_name("SoldModelInfo")
AccountName = str(SoldModelInfo.cell(1, 5).value)

try:
    ExpirationDate = dt.datetime.strftime(xldate_to_datetime(SoldModelInfo.cell(1, 7).value), '%Y-%m-%d')
except WhateverError as e:
    # do something, maybe set a default date?
    ExpirationDate = default_date
    # and/or record that it went wrong?
    errors.append(["ExpirationDate", e])

PolicyState = str(SoldModelInfo.cell(1, 6).value)

...

# at the end
if not errors:
    print("Insert data of " + file + " was successful")
else:
    # things went wrong somewhere above.
    # the contents of errors will let you work out what
    pass
As suggested, you could use multiple try blocks around each of your extracted variables, or you could streamline it with your own custom function that handles the try for you:
from functools import reduce

def try_funcs(cell, default, funcs):
    try:
        # apply each function in funcs to the cell value in turn
        return reduce(lambda val, func: func(val), funcs, cell)
    except Exception as e:
        # do something with your Exception if necessary, like logging.
        return default

# Usage:
AccountName = try_funcs(SoldModelInfo.cell(1, 5).value, "some default str value", [str])
ExpirationDate = try_funcs(SoldModelInfo.cell(1, 7).value, "some default date",
                           [xldate_to_datetime, lambda d: dt.datetime.strftime(d, '%Y-%m-%d')])
PolicyState = try_funcs(SoldModelInfo.cell(1, 6).value, "some default str value", [str])
Here we use reduce to apply each function in the list in turn; functools.partial (or a lambda, as used above for strftime, which needs the datetime object as its first argument) lets you freeze extra arguments onto a function.
This can keep your code tidy without cluttering it up with lots of try blocks. But the better, more explicit way is to handle the fields you anticipate might error out individually.
So, basically you need to wrap your xldate_to_datetime() call in a try ... except:
import datetime as dt

v = SoldModelInfo.cell(1, 7).value
try:
    d = dt.datetime.strftime(xldate_to_datetime(v), '%Y-%m-%d')
except TypeError as e:
    print('Could not parse "{}": {}'.format(v, e))

parsing API with Python - how to handle JSON with BOM

I'm using Python 2.7.11 on Windows to get JSON data from an API (data on trees in Warsaw, Poland, but never mind that). I want to generate an output CSV file with all the data provided by the API, for further analysis. I started with a script I used for another project (also discussed here on Stack Overflow and corrected for me by @Martin Taylor). That script didn't work, so I tried to modify it using my very basic understanding, googling around and applying the pdb debugger. At the moment, the result looks like this:
import pdb
import json
import urllib2
import csv

pdb.set_trace()
url = "https://api.um.warszawa.pl/api/action/datastore_search/?resource_id=ed6217dd-c8d0-4f7b-8bed-3b7eb81a95ba"
myfile = 'C:/dane/drzewa.csv'
csv_myfile = csv.writer(open(myfile, 'wb'))
cols = ['numer_adres', 'stan_zdrowia', 'y_wgs84', 'dzielnica', 'adres', 'lokalizacja', 'wiek_w_dni', 'srednica_k', 'pnie_obwod', 'miasto', 'jednostka', 'x_pl2000', 'wysokosc', 'y_pl2000', 'numer_inw', 'x_wgs84', '_id', 'gatunek_1', 'gatunek', 'data_wyk_pom']
csv_myfile.writerow(cols)

def api_iterate(myfile):
    while True:
        global url
        print url
        json_page = urllib2.urlopen(url)
        data = json.load(json_page)
        json_page.close()
        for data_object in data['result']['records']:
            csv_myfile.writerow([data_object[col] for col in cols])
        try:
            url = data['_links']['next']
        except KeyError as e:
            break

with open(myfile, 'wb'):
    api_iterate(myfile)
I'm a very fresh Python user, so I get confused all the time. Now I've got to the point where, while reading the objects in the JSON dictionary, I get a KeyError associated with the 'x_wgs84' element. I suppose it has something to do with the fact that in the source URL this element is preceded by a U+FEFF Unicode character. I tried to get around this but I got stuck and would appreciate assistance.
I suspect the code may be broken in several other ways as well - as I mentioned, I'm a very unskilled programmer (yet).
You need to use the key with the Unicode character included.
To see what the keys actually look like, one easy way is to print them:
>>> import requests
>>> res = requests.get('https://api.um.warszawa.pl/api/action/datastore_search/?resource_id=ed6217dd-c8d0-4f7b-8bed-3b7eb81a95ba')
>>> data = res.json()
>>> records = data['result']['records']
>>> records[0]
{u'numer_adres': u'', u'stan_zdrowia': u'dobry', u'y_wgs84': u'52.21865', u'y_pl2000': u'5787241.04475524', u'adres': u'ul. ALPEJSKA', u'x_pl2000': u'7511793.96937063', u'lokalizacja': u'Ulica ALPEJSKA', u'wiek_w_dni': u'60', u'miasto': u'Warszawa', u'jednostka': u'Dzielnica Wawer', u'pnie_obwod': u'73', u'wysokosc': u'14', u'data_wyk_pom': u'20130709', u'dzielnica': u'Wawer', u'\ufeffx_wgs84': u'21.172584', u'numer_inw': u'D386200', u'_id': 125435, u'gatunek_1': u'Quercus robur', u'gatunek': u'd\u0105b szypu\u0142kowy', u'srednica_k': u'7'}
>>> records[0].keys()
[u'numer_adres', u'stan_zdrowia', u'y_wgs84', u'y_pl2000', u'adres', u'x_pl2000', u'lokalizacja', u'wiek_w_dni', u'miasto', u'jednostka', u'pnie_obwod', u'wysokosc', u'data_wyk_pom', u'dzielnica', u'\ufeffx_wgs84', u'numer_inw', u'_id', u'gatunek_1', u'gatunek', u'srednica_k']
>>> records[0][u'\ufeffx_wgs84']
u'21.172584'
As you can see, to get your key you need to write it as u'\ufeffx_wgs84', with the Unicode character that is causing trouble included.
Note: I don't know if you are using Python 2 or 3, but in Python 2 you might need to put a u before your string literal to declare it as a Unicode string.
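If you would rather keep referring to the plain 'x_wgs84' name in your cols list, another option (a sketch, not part of the original answer) is to strip the BOM character from every key as you read each record:

# Normalise the keys of each record so the BOM-prefixed key
# u'\ufeffx_wgs84' becomes plain u'x_wgs84' before the csv lookup.
for data_object in data['result']['records']:
    clean_object = {key.lstrip(u'\ufeff'): value for key, value in data_object.items()}
    csv_myfile.writerow([clean_object[col] for col in cols])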

Pylons: response renaming? Is there a better way?

I've got a Pylons controller with an action called serialize returning content_type=text/csv. I'd like the response of the action to be named based on the input parameter, i.e. for the following route the produced CSV file should be named {id}.csv: /app/PROD/serialize => PROD.csv (so a user can open the file in Excel with a proper name directly via a web browser).
map.connect('/app/{id}/serialize',controller = 'csvproducer',action='serialize')
I've tried to set different HTTP headers and properties of the webob's response object with no luck. However, I figured out a workaround by simply adding a new action to the controller and dynamically redirecting the original action to that new action, i.e.:
map.connect('/app/{id}/serialize',controller = 'csvproducer',action='serialize')
map.connect('/app/csv/{foo}',controller = 'csvproducer', action='tocsv')
The controller's snippet:
def serialize(self, id):
    try:
        session['rfa.enviornment.serialize'] = self.service.serialize(id)  # produces csv content
        session.save()
        redirect_to(str("/app/csv/%s.csv" % id))
    except Exception, e:
        log.error(e)
        abort(503)

def tocsv(self):
    try:
        csv = session.pop("rfa.enviornment.serialize")
    except Exception, e:
        log.error(e)
        abort(503)
    if csv:
        response.content_type = 'text/csv'
        response.status_int = 200
        response.write(csv)
    else:
        abort(404)
The above setup works perfectly fine; however, is there a better/slicker/neater way of doing it? Ideally I wouldn't like to redirect the request; instead I'd like to either rename the location or set Content-Disposition: attachment; filename='XXX.csv' [unsuccessfully tried both :(]
Am I missing something obvious here?
Cheers
UPDATE:
Thanks to ebo I've managed to fix Content-Disposition. I should read the W3C specs more carefully next time ;)
You should be able to set the content-disposition header on a response object.
If you have already tried that, it may not have worked because the HTTP standard says the filename should be quoted with double-quote marks.
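A minimal sketch of what that could look like inside the serialize action (my assumption about the surrounding Pylons/WebOb usage, not code from the answer; note the double quotes around the filename):

def serialize(self, id):
    csv_data = self.service.serialize(id)  # produces csv content, as in the question
    response.content_type = 'text/csv'
    # Double quotes around the filename, per the HTTP spec
    response.headers['Content-Disposition'] = 'attachment; filename="%s.csv"' % id
    response.write(csv_data)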

How do you debug Mako templates?

So far I've found it impossible to produce usable tracebacks when Mako templates aren't coded correctly.
Is there any way to debug templates besides iterating for every line of code?
Mako actually provides a VERY nice way to track down errors in a template:
from mako import exceptions

try:
    template = lookup.get_template(uri)
    print template.render()
except:
    print exceptions.html_error_template().render()
Looking at the Flask-Mako source, I found an undocumented configuration parameter called MAKO_TRANSLATE_EXCEPTIONS.
Set this to False in your Flask app config and you'll get nice exceptions bubbling up from the template. This accomplishes the same thing as @Mariano suggested, without needing to edit the source. Apparently, this parameter was added after Mariano's answer.
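For example, a minimal sketch (assuming a standard Flask + Flask-Mako setup, where MakoTemplates is Flask-Mako's extension class):

from flask import Flask
from flask_mako import MakoTemplates

app = Flask(__name__)
app.config['MAKO_TRANSLATE_EXCEPTIONS'] = False  # let template exceptions bubble up unwrapped
mako = MakoTemplates(app)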
I break them down into pieces, and then reassemble the pieces when I've found the problem.
Not good, but it's really hard to tell what went wrong in a big, complex template.
My main frustration with Mako was that it was hard to see what was happening in the template. As the template code is a runnable object that is in-memory, no debugger can look into it.
One solution is to write the template code to a file, and re-run the template using this file as a standard Python module. Then you can debug to your heart's content.
An example:
import sys
from mako import exceptions, template
from mako.template import DefTemplate
from mako.runtime import _render

# <Do Great Stuff>

try:
    template.render(**arguments)
except:
    # Try to re-create the error using a proper file template.
    # This will give a clearer error message.
    with open('failed_template.py', 'w') as out:
        out.write(template._code)
    import failed_template
    data = dict(callable=failed_template.render_body, **arguments)
    try:
        _render(DefTemplate(template, failed_template.render_body),
                failed_template.render_body,
                [],
                data)
    except:
        msg = '<An error occurred when rendering template for %s>\n' % arguments
        msg += exceptions.text_error_template().render()
        print(msg, file=sys.stderr)
        raise
Using flask_mako, I find it's easier to skip over the TemplateError generation and just pass up the exception. I.e. in flask_mako.py, comment out the part that makes the TemplateError and just do a raise:
def _render(template, context, app):
    """Renders the template and fires the signal"""
    app.update_template_context(context)
    try:
        rv = template.render(**context)
        template_rendered.send(app, template=template, context=context)
        return rv
    except:
        # translated = TemplateError(template)
        # raise translated
        raise
Then you'll see a regular python exception that caused the problem along with line numbers in the template.
Combining the two top answers with my own special sauce:
from flask.ext.mako import render_template as render_template_1
from mako import exceptions

app.config['MAKO_TRANSLATE_EXCEPTIONS'] = False  # seems to be necessary

def render_template(*args, **kwargs):
    kwargs2 = dict(**kwargs)
    kwargs2['config'] = app.config  # this is irrelevant, but useful
    try:
        return render_template_1(*args, **kwargs2)
    except:
        if app.config.get('DEBUG'):
            return exceptions.html_error_template().render()
        raise
It wraps the stock "render_template" function:
catch exceptions, and
if debugging, render a backtrace
if not debugging, raise the exception again so it will be logged
make config accessible from the page (irrelevant)
