Error message when submitting HIT to Amazon Mechanical Turk - python

I have a problem submitting a HIT to Amazon Mechanical Turk sandbox.
I'm using the following code to submit a HIT:
external_content = """"
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
<ExternalURL>https://MY_HOST_GOES_HERE/</ExternalURL>
<FrameHeight>400</FrameHeight>
</ExternalQuestion>
"""
import boto3
import os
region_name = 'us-east-1'
aws_access_key_id = 'MYKEY'
aws_secret_access_key = 'MYSECRETKEY'
endpoint_url = 'https://mturk-requester-sandbox.us-east-1.amazonaws.com'
# Uncomment this line to use in production
# endpoint_url = 'https://mturk-requester.us-east-1.amazonaws.com'
client = boto3.client('mturk',
    endpoint_url=endpoint_url,
    region_name=region_name,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
)
# This will return $10,000.00 in the MTurk Developer Sandbox
print(client.get_account_balance()['AvailableBalance'])
response = client.create_hit(Question=external_content,
    LifetimeInSeconds=60 * 60 * 24,
    Title="Answer a simple question",
    Description="Help research a topic",
    Keywords="question, answer, research",
    AssignmentDurationInSeconds=120,
    Reward='0.05')
# The response included several helpful fields
hit_group_id = response['HIT']['HITGroupId']
hit_id = response['HIT']['HITId']
# Let's construct a URL to access the HIT
sb_path = "https://workersandbox.mturk.com/mturk/preview?groupId={}"
hit_url = sb_path.format(hit_group_id)
print(hit_url)
The error message I get is:
botocore.exceptions.ClientError: An error occurred (ParameterValidationError) when calling the CreateHIT operation: There was an error parsing the XML question or answer data in your request. Please make sure the data is well-formed and validates against the appropriate schema. Details: Content is not allowed in prolog. (1493572622889 s)
What might be the reason here? The XML fully conforms to the XML schema hosted on Amazon's servers.
The html returned by the external host is:
<!DOCTYPE html>
<head>
<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>
<script src='https://s3.amazonaws.com/mturk-public/externalHIT_v1.js' type='text/javascript'></script>
</head>
<body>
<!-- HTML to handle creating the HIT form -->
<form name='mturk_form' method='post' id='mturk_form' action='https://workersandbox.mturk.com/mturk/externalSubmit'>
<input type='hidden' value='' name='assignmentId' id='assignmentId'/>
<!-- This is where you define your question(s) -->
<h1>Please name the company that created the iPhone</h1>
<p><textarea name='answer' rows=3 cols=80></textarea></p>
<!-- HTML to handle submitting the HIT -->
<p><input type='submit' id='submitButton' value='Submit' /></p></form>
<script language='Javascript'>turkSetAssignmentID();</script>
</body>
</html>
Thank you

This message "Details: Content is not allowed in prolog." is the clue. It means the XML parser found content before the start of the document, where none is allowed. This usually happens when a junk character (think smart quotes or a non-printable character) appears ahead of the first tag. These can be a real pain in the butt to diagnose.
In your case, it's a little easier to debug but still just as frustrating. Check out this line:
external_content = """"
It turns out that Python only needs three quotes (""") to open a multi-line string. Thus your fourth " became the first character of the string and was sent as part of the XML. Change that line to this:
external_content = """
And you're golden. I just tested it and it works. Sorry for all the frustration, but hopefully this unblocks you. Happy Sunday!
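A quick way to catch this kind of problem before calling the API is to parse the string locally. The following is only a sketch: is_well_formed is a hypothetical helper (not part of boto3), and it checks well-formedness with the standard library, not validity against the MTurk schema.

```python
import xml.etree.ElementTree as ET

# Four quotes: the first three open the string, the fourth becomes its first character.
broken = """"
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
<ExternalURL>https://MY_HOST_GOES_HERE/</ExternalURL>
<FrameHeight>400</FrameHeight>
</ExternalQuestion>
"""

# Same text without the stray leading quote.
fixed = broken.lstrip('"')


def is_well_formed(xml_text):
    """Return True if the text parses as XML (hypothetical helper, not a boto3 API)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(repr(broken[:2]))        # repr() makes the stray quote visible
print(is_well_formed(broken))  # the parser rejects content before the root element
print(is_well_formed(fixed))
```

Printing the string with repr() is often enough on its own, since it exposes stray quotes and invisible characters at the start of the payload.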

Related

Python Flask Constantly Refresh/Update page? [duplicate]

I have a view that generates data and streams it in real time. I can't figure out how to send this data to a variable that I can use in my HTML template. My current solution just outputs the data to a blank page as it arrives, which works, but I want to include it in a larger page with formatting. How do I update, format, and display the data as it is streamed to the page?
import flask
import time, math
app = flask.Flask(__name__)
@app.route('/')
def index():
    def inner():
        # simulate a long process to watch
        for i in range(500):
            j = math.sqrt(i)
            time.sleep(1)
            # this value should be inserted into an HTML template
            yield str(i) + '<br/>\n'
    return flask.Response(inner(), mimetype='text/html')
app.run(debug=True)
You can stream data in a response, but you can't dynamically update a template the way you describe. The template is rendered once on the server side, then sent to the client.
One solution is to use JavaScript to read the streamed response and output the data on the client side. Use XMLHttpRequest to make a request to the endpoint that will stream the data. Then periodically read from the stream until it's done.
This introduces complexity, but allows updating the page directly and gives complete control over what the output looks like. The following example demonstrates that by displaying both the current value and the log of all values.
This example assumes a very simple message format: a single line of data, followed by a newline. This can be as complex as needed, as long as there's a way to identify each message. For example, each loop could return a JSON object which the client decodes.
from math import sqrt
from time import sleep
from flask import Flask, render_template
app = Flask(__name__)
@app.route("/")
def index():
    return render_template("index.html")
@app.route("/stream")
def stream():
    def generate():
        for i in range(500):
            yield "{}\n".format(sqrt(i))
            sleep(1)
    return app.response_class(generate(), mimetype="text/plain")
<p>This is the latest output: <span id="latest"></span></p>
<p>This is all the output:</p>
<ul id="output"></ul>
<script>
  var latest = document.getElementById('latest');
  var output = document.getElementById('output');
  var xhr = new XMLHttpRequest();
  xhr.open('GET', '{{ url_for('stream') }}');
  xhr.send();
  var position = 0;
  function handleNewData() {
    // the response text includes the entire response so far
    // split the messages, then take the messages that haven't been handled yet
    // position tracks how many messages have been handled
    // messages end with a newline, so split will always show one extra empty message at the end
    var messages = xhr.responseText.split('\n');
    messages.slice(position, -1).forEach(function(value) {
      latest.textContent = value;  // update the latest value in place
      // build and append a new item to a list to log all output
      var item = document.createElement('li');
      item.textContent = value;
      output.appendChild(item);
    });
    position = messages.length - 1;
  }
  var timer;
  timer = setInterval(function() {
    // check the response for new data
    handleNewData();
    // stop checking once the response has ended
    if (xhr.readyState == XMLHttpRequest.DONE) {
      clearInterval(timer);
      latest.textContent = 'Done';
    }
  }, 1000);
</script>
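The JSON-per-message idea mentioned above could look roughly like this. It is a sketch in plain Python rather than Flask or JavaScript, so it can be run on its own: encode_message and decode_new_messages are hypothetical names, and the decoder mirrors the position-tracking logic of handleNewData.

```python
import json


def encode_message(payload):
    """Serialize one message as a single JSON line ending in a newline."""
    return json.dumps(payload) + "\n"


def decode_new_messages(response_text, position):
    """Return (messages, new_position) for complete lines not yet handled.

    `position` counts the lines already processed, as in the JavaScript client.
    """
    lines = response_text.split("\n")
    # The last element is either empty or an incomplete message, so skip it.
    complete = lines[position:-1]
    return [json.loads(line) for line in complete], len(lines) - 1


# Simulate the response text growing as the server streams.
buffer = encode_message({"i": 0, "sqrt": 0.0})
messages, position = decode_new_messages(buffer, 0)
print(messages)

# A second full message arrives, followed by a partial third one.
buffer += encode_message({"i": 4, "sqrt": 2.0}) + '{"i": 9, "sq'
messages, position = decode_new_messages(buffer, position)
print(messages)
```

Because json.dumps escapes newlines inside strings, a literal newline can safely act as the message delimiter, and a partially received message is simply left for the next poll.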
An <iframe> can be used to display streamed HTML output, but it has some downsides. The frame is a separate document, which increases resource usage. Since it's only displaying the streamed data, it might not be easy to style it like the rest of the page. It can only append data, so long output will render below the visible scroll area. It can't modify other parts of the page in response to each event.
index.html renders the page with a frame pointed at the stream endpoint. The frame has fairly small default dimensions, so you may want to style it further. Use render_template_string, which knows to escape variables, to render the HTML for each item (or use render_template with a more complex template file). An initial line can be yielded to load CSS in the frame first.
from flask import render_template_string, stream_with_context
@app.route("/stream")
def stream():
    @stream_with_context
    def generate():
        yield render_template_string('<link rel=stylesheet href="{{ url_for("static", filename="stream.css") }}">')
        for i in range(500):
            yield render_template_string("<p>{{ i }}: {{ s }}</p>\n", i=i, s=sqrt(i))
            sleep(1)
    return app.response_class(generate())
<p>This is all the output:</p>
<iframe src="{{ url_for("stream") }}"></iframe>
5 years late, but this actually can be done the way you were initially trying to do it; JavaScript is totally unnecessary (Edit: the author of the accepted answer added the iframe section after I wrote this). You just have to embed the output as an <iframe>:
from flask import Flask, render_template, Response
import time, math
app = Flask(__name__)
@app.route('/content')
def content():
    """
    Render the content at a URL different from index
    """
    def inner():
        # simulate a long process to watch
        for i in range(500):
            j = math.sqrt(i)
            time.sleep(1)
            # this value should be inserted into an HTML template
            yield str(i) + '<br/>\n'
    return Response(inner(), mimetype='text/html')
@app.route('/')
def index():
    """
    Render a template at the index. The content will be embedded in this template
    """
    return render_template('index.html.jinja')
app.run(debug=True)
Then the 'index.html.jinja' file will include an <iframe> with the content URL as the src, which would look something like:
<!doctype html>
<head>
<title>Title</title>
</head>
<body>
<div>
<iframe frameborder="0"
onresize="noresize"
style='background: transparent; width: 100%; height:100%;'
src="{{ url_for('content')}}">
</iframe>
</div>
</body>
When rendering user-provided data, render_template_string() should be used to render the content to avoid injection attacks. However, I left this out of the example because it adds complexity, is outside the scope of the question, and isn't relevant to the OP, who isn't streaming user-provided data. It also won't be relevant for the vast majority of people seeing this post, since streaming user-provided data is a far edge case that few if any people will ever have to do.
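For readers who do stream untrusted values, the escaping step might be sketched as follows. Note the assumption: this uses the standard library's html.escape as a stand-in so the snippet runs without Flask; it is not the render_template_string() approach itself, and render_item and stream_items are hypothetical names.

```python
from html import escape


def render_item(index, value):
    """Build one streamed HTML fragment, escaping the untrusted value first."""
    return "<p>{}: {}</p>\n".format(index, escape(str(value)))


def stream_items(values):
    """Generator yielding escaped fragments, shaped like the inner() generator."""
    for index, value in enumerate(values):
        yield render_item(index, value)


# The second value would run as a script in the frame if it were not escaped.
for chunk in stream_items(["plain text", "<script>alert(1)</script>"]):
    print(chunk, end="")
```

In a Flask view, the generator would be passed to Response in the same way as inner(); the only change is that each value is escaped before it is written into the markup.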
Originally I had a similar problem to the one posted here: a model is being trained, and the update should stay in one place on the page and be formatted in HTML. The following answer is for future reference, or for people trying to solve the same problem who need inspiration.
A good solution to achieve this is to use an EventSource in JavaScript, as described here. This listener can be started using a context variable, such as from a form or other source. The listener is stopped by sending a stop command. A sleep command is used for visualization without doing any real work in this example. Lastly, HTML formatting can be achieved using JavaScript DOM manipulation.
Flask Application
import flask
import time
app = flask.Flask(__name__)
@app.route('/learn')
def learn():
    def update():
        yield 'data: Prepare for learning\n\n'
        # Prepare model
        time.sleep(1.0)
        for i in range(1, 101):
            # Perform update
            time.sleep(0.1)
            yield f'data: {i}%\n\n'
        yield 'data: close\n\n'
    return flask.Response(update(), mimetype='text/event-stream')
@app.route('/', methods=['GET', 'POST'])
def index():
    train_model = False
    if flask.request.method == 'POST':
        if 'train_model' in list(flask.request.form):
            train_model = True
    return flask.render_template('index.html', train_model=train_model)
app.run(threaded=True)
HTML Template
<form action="/" method="post">
<input name="train_model" type="submit" value="Train Model" />
</form>
<p id="learn_output"></p>
{% if train_model %}
<script>
var target_output = document.getElementById("learn_output");
var learn_update = new EventSource("/learn");
learn_update.onmessage = function (e) {
if (e.data == "close") {
learn_update.close();
} else {
target_output.innerHTML = "Status: " + e.data;
}
};
</script>
{% endif %}
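The 'data: ...\n\n' strings yielded by update() follow the server-sent events wire format: each line of the payload gets its own "data:" field, and a blank line ends the event. A small formatting helper can keep that in one place. This is a sketch only; format_sse and progress_events are hypothetical names, not Flask APIs.

```python
def format_sse(data, event=None):
    """Format one server-sent event (hypothetical helper, not a Flask API).

    Each line of `data` becomes its own "data:" field; a blank line ends the event.
    """
    lines = []
    if event is not None:
        lines.append("event: {}".format(event))
    for line in str(data).split("\n"):
        lines.append("data: {}".format(line))
    return "\n".join(lines) + "\n\n"


def progress_events(steps):
    """Yield the same sequence of messages as update(), without the sleeps."""
    yield format_sse("Prepare for learning")
    for i in range(1, steps + 1):
        yield format_sse("{}%".format(i))
    yield format_sse("close")


print("".join(progress_events(3)), end="")
```

One caveat if the optional event argument is used: named events are delivered to listeners registered with addEventListener for that name, so the onmessage handler in the template would not see them.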

How to use Python script in Django using AJAX or something like that?

I am working on an IoT project: a smart attendance and cashless payment system for schools and colleges. I am using a Raspberry Pi 3 as my client machine and DigitalOcean to host PostgreSQL and the core Django project. In addition to the Raspberry Pi, I am using an EM-18 RFID reader and an R305 fingerprint reader. So far, I have built a CLI utility to use the project, but it's not really convenient for a person who is not familiar with either Linux or the shell in general.
Now my problem lies in my current code logic. For instance, if I want to add a student's information along with their RFID/fingerprint information to the database, I first scan the RFID tag assigned to them and check whether that tag is already in the database; if not, I ask for their information. After this step, I ask them to tap the same card again to enroll their fingerprint. All of these processes work without any issues on the CLI. What I am not able to figure out is how to let the operator know which step to execute next. Although I have built a web interface in Django, it's neither user-friendly nor interactive. I think I have to use JSON/AJAX to do that, but I have no idea how to do it.
Please read this not as an answer but more like a longer comment ;)
So first read up on the MVC concept (for example: https://djangobook.com/model-view-controller-design-pattern/) to understand how web interfaces interact with the rest of the code.
The classical Django way would be to:
1. Create a form (in forms.py) based on your model.
2. Write some HTML to display the form on a website.
3. Validate the form input in views.py.
4. Send back the results and display them (again as HTML).
When you would like to do that with AJAX, it's a little extra work, but the result will be worth it (for example: How to POST a django form with AJAX & jQuery).
1. You call a specific URL, which you first register in your urls.py file (path('checkForm', views.checkForm, name='checkForm'), in Django 2.0).
2. The URL points to a function defined in views.py with the name checkForm (same as the URL path in this example, but you can call it whatever you want). Inside this function you can write your Python code, or call other functions from there, to execute your DB queries, validations, etc.
3. Send back a response with the results of your code.
success: function(data, status) {
    $('#whatever').html(data);
    alert("whatever");
}
Here "data" is the response from the server.
I hope I gave you a hint in the right direction. Please ask if you get stuck somewhere.
P.S. I have no idea how to send the fingerprint data via HTTP; check if there is some library for it.
x=data.read(12).decode("utf-8")
tag_id_in = x
This is bad code; please get rid of x. First, x is a bad variable name, and second, it's completely redundant.
You also want to convert your print statements into log calls or similar; printing won't get you far.
I finally figured it out. I have divided the process into multiple parts, and I then make AJAX calls to access each of them.
My urls.py file:
from django.conf.urls import url
from .views import tag, check_tagid, scan_tag
urlpatterns = [
    url(r'^tag/', tag, name="tag"),
    url(r'^check_tagid/', check_tagid, name="check_tagid"),
    url(r'^scan_tag/', scan_tag, name="scan_tag"),
]
My views.py file:
from django.shortcuts import render
from django.http import HttpResponse, JsonResponse
from .models import Student
import serial
def check_tagid(request):
    """Check if scanned tag id is already associated or not"""
    tagid = request.GET.get('tagid', None)
    if Student.objects.filter(tag_id__iexact=tagid).exists():
        data = {
            'is_taken': Student.objects.filter(tag_id__iexact=tagid).exists(),
            'student': Student.objects.get(tag_id=tagid).name
        }
    else:
        data = {
            'is_taken': Student.objects.filter(tag_id__iexact=tagid).exists(),
        }
    return JsonResponse(data)
def scan_tag(request):
    """Function to check if tag is scanned or not"""
    if request.is_ajax():
        ## Init RFID reader
        data = serial.Serial(
            port='/dev/ttyUSB1',
            baudrate = 9600,
            parity=serial.PARITY_NONE,
            stopbits=serial.STOPBITS_ONE,
            bytesize=serial.EIGHTBITS
        )
        context = {'tagid': data.read(12).decode("utf-8")}
        return JsonResponse(context)
    else:
        return HttpResponse("This route only handles AJAX requests")
def tag(request):
    return render(request, 'students/tag.html', {})
my tag.html file:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
</head>
<body>
<center><h1><div id="tag_id">Scanning for tag</div></h1></center>
<div id="if_exists"></div>
<script src="https://code.jquery.com/jquery-3.1.0.min.js"></script>
<script type="text/javascript">
$(document).ready(function() {
$.get("/scan_tag/", function(data) {
console.log(data.tagid);
var x = document.createElement("INPUT");
x.setAttribute("id", "tag_id_in");
x.setAttribute("type", "hidden");
x.setAttribute("value", data.tagid);
document.body.appendChild(x);
// check tag
tagid = $("#tag_id_in").val();
console.log(tagid);
$.ajax({
type: "GET",
url: "{% url 'check_tagid' %}",
data: {
'tagid': tagid
},
success: function(data) {
if (data.is_taken) {
document.getElementById('tag_id').innerHTML = "Welcome "+data.student;
} else {
document.getElementById('tag_id').innerHTML = "No student with this tag id found";
}
}
});
});
});
</script>
</body>
</html>
All of this is now working as expected.
So, /tag/ is the main entry point. In tag.html I use the jQuery get method on /scan_tag/, which returns the RFID tag number. Then I make an AJAX call to /check_tagid/, which returns true/false depending on whether the student is in the database or not.

Python Bottle SSE

I'm trying to get Server Sent Events to work from Python, so I found a little demo code and to my surprise, it only partly works and I can't figure out why. I got the code from here and put in just a couple little changes so I could see what was working (I included a print statement, an import statement which they clearly forgot, and cleaned up their HTML to something I could read a little easier). It now looks like this:
# Bottle requires gevent.monkey.patch_all() even if you don't like it.
from gevent import monkey; monkey.patch_all()
from gevent import sleep
from bottle import get, post, request, response
from bottle import GeventServer, run
import time
sse_test_page = """
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8" />
<script src="http://cdnjs.cloudflare.com/ajax/libs/jquery/1.8.3/jquery.min.js "></script>
<script>
var es = new EventSource("/stream");
es.onmessage = function(e) {
document.getElementById("log").innerHTML = e.data;
}
</script>
</head>
<body>
<h1>Server Sent Events Demo</h1>
<p id="log">Response Area</p>
</body>
</html>
"""
@get('/')
def index():
    return sse_test_page
@get('/stream')
def stream():
    # "Using server-sent events"
    # https://developer.mozilla.org/en-US/docs/Server-sent_events/Using_server-sent_events
    # "Stream updates with server-sent events"
    # http://www.html5rocks.com/en/tutorials/eventsource/basics/
    response.content_type = 'text/event-stream'
    response.cache_control = 'no-cache'
    # Set client-side auto-reconnect timeout, ms.
    yield 'retry: 100\n\n'
    n = 1
    # Keep connection alive no more than... (s)
    end = time.time() + 60
    while time.time() < end:
        yield 'data: %i\n\n' % n
        print n
        n += 1
        sleep(1)
if __name__ == '__main__':
    run(server=GeventServer, port = 21000)
So here's what ends up happening: I can see the original header and paragraph on the website, but the response area never changes. On the Python side, it prints n once per second, but I never see that change on the web page. I get the feeling that I just lack a fundamental understanding of what I'm trying to do, but I can't find anything missing.
I'm running Python 2.7, windows 7, chrome 43.0.2357.81 m.
EDIT: I got rid of the extra quotation mark. Now it only seems to update when it gets to 60 (which I guess is better than not at all...)
Why would it wait until the end of the function to send the event?
You've got 2 sets of quotes after p id="log""

AJAX Submission Form using Bottle (Python)

I'm having some issues getting AJAX communication working using the Bottle framework. This is my first time using AJAX, so it's likely I just have the basics wrong. Hopefully a Bottle/AJAX guru can point this novice in the right direction. Here is the code I'm using:
#!/usr/bin/env python
from bottle import route, request, run, get
# Form constructor route
@route('/form')
def construct_form():
    return '''
<html>
<head>
<script type="text/javascript">
function loadXMLDoc()
{
xmlhttp = new XMLHTTPRequest();
xmlhttp.onReadyStateChange = function()
{
if(xmlhttp.readyState == 4 && xmlhttp.status == 200)
{
document.getElementById("responseDiv").innerHTML = xmlhttp.responseText;
}
}
xmlhttp.open("GET", "/ajax", true);
xmlhttp.send();
}
</script>
</head>
<body>
<form>
<input name="username" type="text"/>
<input type="button" value="Submit" onclick="loadXMLDoc()"/>
</form>
<div id="responseDiv">Change this text to what you type in the box above.</div>
</body>
</html>
'''
# Server response generator
@route('/ajax', method='GET')
def ajaxtest():
    inputname = request.forms.username
    if inputname:
        return 'You typed %s.' % (inputname)
    return "You didn't type anything."
run(host = 'localhost', port = '8080')
There are a few issues here.
Javascript is case sensitive. XMLHTTPRequest should be XMLHttpRequest. You should have seen an error about this in your Javascript console.
onReadyStateChange should be onreadystatechange.
If you fix the above two issues, your AJAX call will work, but you will only ever get the 'You didn't type anything.' response. This is because you are using GET: the request never carries the username value, and request.forms only reads form data from a request body. You need to change your code so the form values are posted using the POST method.
Also, why aren't you using jQuery to do AJAX? It would make your life much easier. :)
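To make the GET/POST difference concrete, here is a sketch that only builds the two requests with the standard library and never sends them. The URL matches the host and route from the question, and "alice" is a made-up sample value. On the Bottle side, query-string values are read through request.query and body values through request.forms.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

base_url = "http://localhost:8080/ajax"  # same host and route as the question
fields = {"username": "alice"}           # "alice" is a made-up sample value

# GET: the values are appended to the URL as a query string; the body is empty.
get_request = Request(base_url + "?" + urlencode(fields))

# POST: the values are sent in the request body; the URL stays unchanged.
post_request = Request(base_url, data=urlencode(fields).encode("utf-8"))

for req in (get_request, post_request):
    print(req.get_method(), req.full_url, req.data)

# Decoding either location gives the same field values back.
print(parse_qs(urlsplit(get_request.full_url).query))
print(parse_qs(post_request.data.decode("utf-8")))
```

The JavaScript in the question does neither: it opens "/ajax" with no query string and calls send() with no body, so the server has nothing to read in either place.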

thread posting script, invision powerboard

So there are two things that we do on a certain site, which are downloading new files and posting files, and they're easily definable (and tedious), so I've been trying to script them.
So anyway, I have the file downloading script done. I'm able to reuse the login code:
import http.cookiejar
import urllib.parse
import urllib.request
#Cookies
cookies = http.cookiejar.LWPCookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cookies))
urllib.request.install_opener(opener)
#Authenticate user
print("logging in")
url = "http://someurl.com/index.php?app=core&module=global&section=login&do=process"
values = {"username" : USERNAME,
          "password" : PASSWORD}
# POST data must be bytes in Python 3, so encode the urlencoded string
data = urllib.parse.urlencode(values).encode("utf-8")
req = urllib.request.Request(url, data)
urllib.request.urlopen(req)
But to post a thread with said file, you have to attach it. I know how to send the proper title and text for the thread from what I learned about logging in. My problem is that to attach files you have to send a request to a form, not a page request.
E.g. you select the file you want from the file dialog and then click attach files, which uploads it; you then finish writing up your thread and THEN submit the page.
Here's relevant html
<fieldset class='attachments'>
<script type='text/javascript'>
//<![CDATA[
ipb.lang['used_space'] = "Used <strong>[used]</strong> of your <strong>[total]</strong> global upload quota (Max. single file size: <strong>256MB</strong>)";
//]]>
</script>
<h3 class='bar'>Attachments</h3>
<!--SKINNOTE: traditional uploader needs this. -->
<div id='attach_error_box' class='message error' style='display:none'></div>
<input type='file' id='nojs_attach_0_1' class='input_upload' name='FILE_UPLOAD' tabindex='1' />
<input type='file' id='nojs_attach_0_2' class='input_upload' name='FILE_UPLOAD' tabindex='1' />
<ul id='attachments'><li style='display: none'></li></ul>
<br />
<span id='buttonPlaceholder'></span>
<input type='button' id='add_files_attach_0' class='input_submit' value='Attach This File' style='display: none; clear: both' tabindex='1' /> <span class='desc' id='space_info_attach_0'>Used <strong>9.45MB</strong> of your <strong>976.56MB</strong> global upload quota (Max. single file size: <strong>256MB</strong>)</span>
I have no idea how to code this, so I'm looking for direction.
Also, on a side note, if this sort of script would have been easier in another language, tell me which. I only used Python because I knew it.
Thanks a lot.
