Getting the upvotes of an answer using Py-Stackexchange - python

I'm using Py_Stackexchange to pull data from Stackoverflow for some statistical analysis, and stumbled upon a problem.
I need to retrieve the upvotes and downvotes on an answer. I have the stackexchange.Answer object, and it has a field called 'transfers' which is a tuple of strings like:
'is_accepted', 'locked_date', 'question_id', 'up_vote_count', 'down_vote_count',
'view_count', 'score', 'community_owned', 'title', 'body'
How do I get the actual numerical values corresponding to these fields?

I used the question demo provided with Py-Stackexchange for this answer.
The main thing you need to do is ensure that your filter includes the up_vote_count and down_vote_count attributes.
Once you have this filter, you can access the value as question.up_vote_count (or answer.up_vote_count if you are checking an answer).
As an example, I modified line 22 in the demo to include these two attributes in the filter:
question = site.question(id, filter="!b0OfMwwD.s*79x")
Filters can be created here.
Then I added this line at the very end of the script:
print('%d Upvotes.' % question.up_vote_count)
When I run it against this question, I get this output:
Please enter an API key if you have one (Return for none):
Enter a question ID: 26143702
--- Getting the upvotes of an answer using Py-Stackexchange ---
<p>I'm using <code>Py_Stackexchange</code> to pull data from <code>Stackoverflow</code> for some statistical analysis, and stumbled upon a problem.</p>
<p>I need to retrieve the upvotes and downvotes on an answer. I have the <code>stackexchange.Answer</code> object, and it has a field called 'transfers' which is a tuple of strings like:</p>
<pre><code>'is_accepted', 'locked_date', 'question_id', 'up_vote_count', 'down_vote_count',
'view_count', 'score', 'community_owned', 'title', 'body'
</code></pre>
<p>How do I get the actual numerical values corresponding to these fields?</p>
0 answers.
1 Upvotes.
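To illustrate how the names in a tuple like transfers relate to actual values, here is a minimal sketch. It does not call the real library: FakeAnswer is a hypothetical stand-in for stackexchange.Answer, and its field values are made up. The assumption is that each name in the tuple becomes an attribute only when the API response (and therefore the filter) includes it.

```python
# Hypothetical stand-in for stackexchange.Answer; not the real class.
class FakeAnswer:
    transfers = ('is_accepted', 'question_id', 'up_vote_count',
                 'down_vote_count', 'score')

    def __init__(self, **fields):
        # Only fields present in the (made-up) response become attributes.
        for name, value in fields.items():
            setattr(self, name, value)


def field_values(obj):
    """Map each name in obj.transfers to its value, or None if it is missing."""
    return {name: getattr(obj, name, None) for name in obj.transfers}


answer = FakeAnswer(is_accepted=True, question_id=26143702,
                    up_vote_count=3, down_vote_count=1, score=2)
values = field_values(answer)
print(values['up_vote_count'], values['down_vote_count'])

# A response fetched without the vote counts in its filter leaves them unset.
sparse = field_values(FakeAnswer(score=2))
print(sparse['up_vote_count'])
```

If a value comes back as None with the real library, the filter is the first thing to check.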

Related

How to print particular id(i.e word ) from the result

Hi, I have the following code:
Z = [ [<Entity:0*7fasdas55c:type1101(1101,NGRID)id:-2600>, <Entity:0*5fafaef45c:type1101(1101,NGRID)id:-3665>]
, [<Entity:0*7fasdas55c:type1101(1101,NGRID)id:-5600>, <Entity:0*5fafaef45c:type1101(1101,NGRID)id:-545465>] ]
edge1= ansa.basecollectentity(constant.nastran, Z[0],'NODE')
print(edge1)
and my result is
[<Entity:0*7fasdas55c:type1101(1101,NGRID)id:-2600>, <Entity:0*5fafaef45c:type1101(1101,NGRID)id:-3665>]
Even though the code is written in ANSA Python, my question is general.
I would like to write code that goes through 'edge1' and prints the number after "id:" under two different names, like:
Node1= 2600
Node2= 3665
Please help me with writing the code. Thanks in advance.
Each class controls its own printable representation with the __repr__() special method.
The number you're looking at, id:, could potentially be anywhere in the Entity: in any field, somewhere in an internal data structure, or nowhere at all and calculated at display time. It might easily be an id property, as @PM2Ring's comment suggests, but it might not be.
So it's either a very specific question, in which case you need to examine the Entity for an appropriate field or method to get the ID (and you haven't said what it is, so that could be anything),
or it's a general question about processing the repr() value, which is probably not what you ever want to do.
But if you did want to, it would be:
for count, item in enumerate(edge1):
    # Take everything after the last ':' in the repr and drop the trailing '>'
    entity_id = repr(item).split(':')[-1].rstrip('>')
    print("Node" + str(count), entity_id)
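A self-contained sketch of the repr-parsing approach follows. FakeEntity is a hypothetical stand-in whose repr only imitates the output shown in the question (it is not the ANSA class), and the regular expression assumes the repr always ends in "id:<number>>". The sign is dropped and numbering starts at 1 to match the output the question asks for.

```python
import re


class FakeEntity:
    """Hypothetical stand-in; its repr imitates the output in the question."""

    def __init__(self, entity_id):
        self._entity_id = entity_id

    def __repr__(self):
        return '<Entity:0x7f55c:type1101(1101,NGRID)id:%d>' % self._entity_id


def ids_from_repr(entities):
    """Extract the number after 'id:' from each repr, ignoring any minus sign."""
    ids = []
    for item in entities:
        match = re.search(r'id:-?(\d+)>$', repr(item))
        if match:
            ids.append(int(match.group(1)))
    return ids


edge1 = [FakeEntity(-2600), FakeEntity(-3665)]
node_ids = ids_from_repr(edge1)
for count, node_id in enumerate(node_ids, start=1):
    print('Node%d= %d' % (count, node_id))
```

Parsing a repr is fragile, so a real attribute or method on the entity is preferable whenever one exists.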
In ANSA, say you have an entity, e.g.:
nod=<Entity:0*7fasdas55c:type1101(1101,NGRID)id:-2600>
To print its id you can use:
print(nod._id)
Result:
2600
You can also use ._type to get the type of entity you are dealing with.
Hope it helps.

Sort by a value in a many to one field with a django queryset?

I have a data model like this:
class Post(models.Model):
    name = models.CharField(max_length=255)

class Tag(models.Model):
    name = models.CharField(max_length=255)
    rating = models.FloatField(max_length=255)
    parent = models.ForeignKey(Post, related_name="tags")
I want to get Posts that have a given tag, and order them by that tag's rating.
Something like:
Posts.objects.filter(tags__name="exampletag").order_by("tags(name=exampletag)__rating")
Currently, I am thinking it makes sense to do something like
tags = Tags.objects.filter(name="sometagname").order_by("rating")[0:10]
posts = [t.parent for t in tags]
But I'd like to know if there is a better way, preferably one that queries Post and gives me back a queryset.
Edit:
I don't think this: (Edit 2 - this does give the correct sorting!)
Posts.objects.filter(tags__name="exampletag").order_by("tags__rating")
will give the correct sorting, as it does not sort only by the related item with name "exampletag"
Something like the following would be needed
Posts.objects.filter(tags__name="exampletag").order_by("tags(name=exampletag)__rating")
I've been looking over the Django docs, and it seems "annotate" nearly works, but I don't see a way to use it to select a tag by name.
Edit 2
Both answers are correct! See my comments to observe some epic brain-farts (in one test the results WERE in order; in the other I filtered and sorted by different tags!).
How it works
The queries
Posts.objects.filter(tags__name="exampletag").order_by("tags__rating")
and
Posts.objects.filter(tags__name="exampletag").filter(tags__name="someothertag").order_by("tags__rating")
will work correctly and be sorted by the rating of "exampletag".
It seems the tag (from a ForeignKey back-reference set) used for sorting when calling order_by is the one in the first filter.
You can do something like:
tags = Tag.objects.filter(name="sometagname")
posts = Post.objects.filter(tags__in=tags).order_by('tags__rating')
Even shorter than Anush's, with a JOIN rather than a subquery:
Post.objects.filter(tags__name='exampletag').order_by('tags__rating')
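To make the join behaviour concrete, here is a plain-Python analogy. It does not use Django at all: the Tag tuple and the sample rows are hypothetical, and the function only imitates what a filter on the tag name followed by an ordering on the same joined rows would produce, which is the behaviour the asker reports in Edit 2.

```python
from collections import namedtuple

# Hypothetical rows standing in for the Tag table; parent is the post's name.
Tag = namedtuple('Tag', ['name', 'rating', 'parent'])

tags = [
    Tag('exampletag', 2.5, 'post-a'),
    Tag('othertag', 9.0, 'post-a'),
    Tag('exampletag', 1.0, 'post-b'),
    Tag('othertag', 0.5, 'post-b'),
]


def posts_ordered_by_tag_rating(rows, tag_name):
    """Keep only rows for tag_name, sort them by rating, return their posts."""
    matching = [t for t in rows if t.name == tag_name]
    matching.sort(key=lambda t: t.rating)
    return [t.parent for t in matching]


ordered = posts_ordered_by_tag_rating(tags, 'exampletag')
print(ordered)
```

The ratings of 'othertag' play no part in the result because those rows are filtered out before sorting.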

How to extract FreeText answer from an assignment using boto

I'm trying to extract free-text answer submitted by workers of Amazon Mechanical Turk using the boto library.
assignments = conn.get_assignments(hit_id)
for assignment in assignments:
    worker = assignment.WorkerId
    answer = assignment.Answer
Here I expect answer to be a free-text string (the only thing that the HIT asks workers to submit) submitted by a worker, however, the code above doesn't give me that. What am I missing here?
In boto, to get the FreeText information you are looking for, you'll need to iterate over the assignment's answers property. Unless you have submitted multiple forms, your form should be at the first index.
That form is a list of QuestionFormAnswer objects.
Here is the boto documentation on QuestionFormAnswer:
http://sourcecodebrowser.com/python-boto/2.3.0/classboto_1_1mturk_1_1connection_1_1_question_form_answer.html
You can see that the properties you actually want are qid and fields
Here is some updated code that should make better sense.
assignments = conn.get_assignments(hit_id)
for assignment in assignments:
    worker_id = assignment.WorkerId
    # Iterate through question forms answers which are our fields
    for question_form_answer in assignment.answers[0]:
        field_id = question_form_answer.qid
        field_value = question_form_answer.fields
I think the assignment object in the above example will have an attribute called answers which is a list of QuestionFormAnswer objects. Each of these objects should have an attribute called FreeText.
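The nested shape described above can be sketched without a live connection. The classes below are hypothetical stand-ins, not boto's own, and the sketch assumes that fields is a list whose first element is the free-text string.

```python
class FakeQuestionFormAnswer:
    """Hypothetical stand-in exposing the qid and fields attributes."""

    def __init__(self, qid, fields):
        self.qid = qid
        self.fields = fields


class FakeAssignment:
    """Hypothetical stand-in: answers is a list of forms, each a list of answers."""

    def __init__(self, worker_id, answers):
        self.WorkerId = worker_id
        self.answers = answers


def free_text_by_worker(assignments):
    """Return {worker id: {question id: first field value}} for the first form."""
    results = {}
    for assignment in assignments:
        first_form = assignment.answers[0]
        results[assignment.WorkerId] = {
            answer.qid: answer.fields[0]
            for answer in first_form
            if answer.fields
        }
    return results


assignments = [
    FakeAssignment('W1', [[FakeQuestionFormAnswer('comment', ['Looks good'])]]),
    FakeAssignment('W2', [[FakeQuestionFormAnswer('comment', ['Needs work'])]]),
]
texts = free_text_by_worker(assignments)
print(texts['W1']['comment'])
```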

Can a formfield be selected w/mechanize based on the type of the field (eg. TextControl, TextareaControl)?

I'm trying to parse an HTML form using mechanize. The form itself has an arbitrary number of hidden fields, and the field names and ids are randomly generated, so I have no obvious way to select them directly. Clearly using a name or id is out, and because the number of hidden fields is random I cannot select them by sequence number either, since that always changes too.
However, there are always two TextControl fields right after each other, and below them is a TextareaControl. These are the 3 fields I need access to; basically I need to parse their names and all is well. I've been looking through the mechanize documentation for the past couple of hours and haven't come up with anything that seems able to do this, however simple it should seem to be (to me anyway).
I have come up with an alternate solution that involves making a list of the form controls, iterating through it to find the controls that contain the string 'Text', returning a new list of those, and then finally stripping out the name using a regular expression. While this works, it seems unnecessary, and I'm wondering if there's a more elegant solution. Thanks guys.
edit: Here's what I'm currently doing to extract that info if anyone's curious. I think I'm probably just going to stick with this. It seems unnecessary but it gets the job done and it's nothing intensive so I'm not worried about efficiency or anything.
def formtextFieldParse(browser):
    '''Expects a mechanize.Browser object with a form already selected. Parses
    through the fields returning a tuple of the name of those fields. There
    SHOULD only be 3 fields. 2 text followed by 1 textarea corresponding to
    Posting Title, Specific Location, and Posting Description'''
    import re
    pattern = r'\(.*\)'
    fields = str(browser).split('\n')
    textfields = []
    for field in fields:
        if 'Text' in field:
            textfields.append(field)
    titleFieldName = re.findall(pattern, textfields[0])[0][1:-2]
    locationFieldName = re.findall(pattern, textfields[1])[0][1:-2]
    descriptionFieldName = re.findall(pattern, textfields[2])[0][1:-2]
    # Return the tuple that the docstring promises
    return titleFieldName, locationFieldName, descriptionFieldName
I don't think mechanize has the exact functionality you require; could you use mechanize to get the HTML page, then parse the latter for example with BeautifulSoup?
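Another option that may be worth checking is selecting by control type instead of parsing the printed form. The sketch below assumes that each control on the selected form exposes type and name attributes (for example when iterating something like browser.form.controls); verify that against your mechanize version. FakeControl and the field names are hypothetical, used here so the example runs on its own.

```python
from collections import namedtuple

# Hypothetical stand-in for a form control with type and name attributes.
FakeControl = namedtuple('FakeControl', ['type', 'name'])


def text_field_names(controls, wanted_types=('text', 'textarea')):
    """Return the names of controls whose type is in wanted_types, in form order."""
    return [control.name for control in controls if control.type in wanted_types]


controls = [
    FakeControl('hidden', 'a1'),
    FakeControl('hidden', 'b2'),
    FakeControl('text', 'x9title'),
    FakeControl('text', 'k3loc'),
    FakeControl('textarea', 'q7desc'),
    FakeControl('submit', 'go'),
]
title_name, location_name, description_name = text_field_names(controls)
print(title_name, location_name, description_name)
```

Because the hidden fields are skipped by type, their number and position no longer matter.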

Aggregating across columns in Django

I'm trying to figure out if there's a way to do a somewhat-complex aggregation in Django using its ORM, or if I'm going to have to use extra() to stick in some raw SQL.
Here are my object models (stripped to show just the essentials):
class Submission(models.Model):
    favorite_of = models.ManyToManyField(User, related_name="favorite_submissions")

class Response(models.Model):
    submission = models.ForeignKey(Submission)
    voted_up_by = models.ManyToManyField(User, related_name="voted_up_responses")
What I want to do is sum all the votes for a given submission: that is, all of the votes for any of its responses, and then also including the number of people who marked the submission as a favorite.
I have the first part working using the following code; this returns the total votes for all responses of each submission:
submission_list = Response.objects\
    .values('submission')\
    .annotate(votes=Count('voted_up_by'))\
    .filter(votes__gt=0)\
    .order_by('-votes')[:TOP_NUM]
(So after getting the vote total, I sort in descending order and return the top TOP_NUM submissions, to get a "best of" listing.)
That part works. Is there any way you can suggest to include the number of people who have favorited each submission in its votes? (I'd prefer to avoid extra() for portability, but I'm thinking it may be necessary, and I'm willing to use it.)
EDIT: I realized after reading the suggestions below that I should have been clearer in my description of the problem. The ideal solution would be one that allowed me to sort by total votes (the sum of voted_up_by and favorited) and then pick just the top few, all within the database. If that's not possible then I'm willing to load a few of the fields of each response and do the processing in Python; but since I'll be dealing with 100,000+ records, it'd be nice to avoid that overhead. (Also, to Adam and Dmitry: I'm sorry for the delay in responding!)
One possibility would be to re-arrange your current query slightly. What if you tried something like the following:
submission_list = Response.objects\
    .annotate(votes=Count('voted_up_by'))\
    .filter(votes__gt=0)\
    .order_by('-votes')[:TOP_NUM]
submission_list.query.group_by = ['submission_id']
This will return a queryset of Response objects (objects with the same Submission will be lumped together). In order to access the related submission and/or the favorite_of list/count, you have two options:
num_votes = submission_list[0].votes
submission = submission_list[0].submission
num_favorite = submission.favorite_of.count()
or...
submissions = []
for response in submission_list:
    submission = response.submission
    submission.votes = response.votes
    submissions.append(submission)
num_votes = submissions[0].votes
submission = submissions[0]
num_favorite = submission.favorite_of.count()
Basically the first option has the benefit of still being a queryset, but you have to be sure to access the submission object in order to get any info about the submission (since each object in the queryset is technically a Response). The second option has the benefit of being a list of the submissions with both the favorite_of list as well as the votes, but it is no longer a queryset (so be sure you don't need to alter the query anymore afterwards).
You can count favorites in another query like
favorite_list = Submission.objects.annotate(favorites=Count('favorite_of'))
After that you add the values from the two lists:
total_votes = {}
for item in submission_list:
    # votes is the annotation added to submission_list above
    total_votes[item.submission.id] = item.votes
for item in favorite_list:
    has_votes = total_votes.get(item.id, 0)
    total_votes[item.id] = has_votes + item.favorites
I am using ids in the dictionary because Submission objects will not be identical. If you need the Submissions themselves, you may use one more dictionary or store tuple (submission, votes) instead of just votes.
Added: this solution is better than the previous because you have only two DB requests.
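The merge-and-rank step can be sketched in plain Python once both sets of counts have been loaded. The dictionaries below are made-up data keyed by submission id, standing in for the results of the two queries. This does the ranking in Python, not in the database, so it does not address the asker's preference for an all-database solution.

```python
from collections import Counter

# Hypothetical per-submission counts, as if read from the two querysets.
response_votes = {1: 5, 2: 3}
favorites = {1: 1, 2: 4, 3: 2}


def top_submissions(votes, favs, top_num):
    """Add the two counts per submission id and return the top_num totals."""
    totals = Counter(votes)
    totals.update(favs)  # Counter.update adds counts; it does not overwrite
    return totals.most_common(top_num)


top = top_submissions(response_votes, favorites, 2)
print(top)
```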
