I have a large number of users (over 400k) who have been sent a survey to complete. As part of logging into my site, I'm using the SurveyMonkey API to check whether they completed their assigned survey. I'm keying on email address. I'm thinking of using:
https://developer.surveymonkey.com/mashery/get_respondent_list
however, I don't want to page through all 400k users to find a specific email. Is there any way to do this search more efficiently?
I'm using a Django backend to call the SurveyMonkey API.
get_respondent_list allows you to search for respondents by modified date/time range. For 400K respondents, you should store the results in a local database and only query the API when the email address you're looking for isn't found locally.
To avoid having to parse the whole list every time, you should only fetch respondents added since the last time you checked, using that date/time range feature, and add the new respondents to your DB. There is example code illustrating polling for new respondents based on a date/time range on SurveyMonkey's public GitHub here:
https://github.com/SurveyMonkey/python_guides/blob/master/guides/polling.py
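A minimal sketch of that approach on the Django side, combining a local-cache lookup with a fallback poll of the v2 get_respondent_list endpoint. The model name SurveyRespondent, its fields, and the request body (including start_modified_date, which follows the polling guide linked above) are assumptions rather than a definitive implementation:

    import requests
    from myapp.models import SurveyRespondent  # hypothetical model caching email + completion status

    API_URL = "https://api.surveymonkey.net/v2/surveys/get_respondent_list"

    def poll_new_respondents(api_key, access_token, survey_id, last_poll):
        """Fetch respondents modified since last_poll and upsert them into the local cache."""
        page = 1
        while True:
            response = requests.post(
                API_URL,
                params={"api_key": api_key},
                headers={"Authorization": "Bearer %s" % access_token,
                         "Content-Type": "application/json"},
                json={
                    "survey_id": survey_id,
                    "fields": ["email", "status", "date_modified"],
                    "start_modified_date": last_poll.strftime("%Y-%m-%d %H:%M:%S"),
                    "page": page,
                    "page_size": 1000,
                },
            )
            respondents = response.json().get("data", {}).get("respondents", [])
            if not respondents:
                break
            for r in respondents:
                SurveyRespondent.objects.update_or_create(
                    email=r.get("email"),
                    defaults={"completed": r.get("status") == "completed"},
                )
            page += 1

    def has_completed_survey(email, api_key, access_token, survey_id, last_poll):
        """Check the local cache first; only poll the API when the email isn't cached yet."""
        if SurveyRespondent.objects.filter(email=email, completed=True).exists():
            return True
        poll_new_respondents(api_key, access_token, survey_id, last_poll)
        return SurveyRespondent.objects.filter(email=email, completed=True).exists()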
I use an API named Sheetsu to retrieve data from a Google Sheet.
Here is an example of using Sheetsu in Python to retrieve data filtered by the date parameter:
data = client.search(sheet="Session", date="03-06-2018")
This code allows me to retrieve every row from my sheet called Session where the date column equals 03-06-2018.
What I can't manage to do is retrieve every row where a time value like 16:30:00 falls between two datetimes.
So I would like to know whether this is possible with Sheetsu, whether I should use another API instead, or whether I could use a library like datetime to filter the data fetched from Sheetsu.
The Google Sheets API is available, and there's a Dialogflow-Google Sheets sample on GitHub that I'd recommend looking at to get started. You'll need to ensure that your service account's client_email has permission to access the specific spreadsheetId of interest. The sample goes through the necessary auth steps, but take a look at the Sheets documentation as well.
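Whichever API ends up supplying the rows, the "between two datetimes" part can be handled client-side with Python's datetime module once the data is in hand. A rough sketch, assuming rows come back as dicts with date and time columns in the formats used in the question (the column names and formats are assumptions):

    from datetime import datetime

    def rows_between(rows, start, end):
        """Keep only rows whose combined date + time falls inside [start, end]."""
        selected = []
        for row in rows:
            # combine the sheet's date and time columns into a single datetime
            stamp = datetime.strptime("%s %s" % (row["date"], row["time"]), "%d-%m-%Y %H:%M:%S")
            if start <= stamp <= end:
                selected.append(row)
        return selected

    # usage, e.g. with the Sheetsu call from the question:
    # rows = client.search(sheet="Session", date="03-06-2018")
    # rows_between(rows, datetime(2018, 6, 3, 16, 0, 0), datetime(2018, 6, 3, 17, 0, 0))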
I am trying to access historical data of mentions of a few keywords for a data analysis project using the Reddit API, fetching the data with Python's wonderful, easy-to-use PRAW package. Does anyone know if the Reddit API has any functionality that allows historical access to data from a subreddit?
You can only get the last 1000 items for a specific view. Use the Subreddit's submissions property.
You can get the different views. I describe some of the other views in a reddit comment I made:
yes. generally speaking you can get the last 1000 items in a listing (/r/all and /r/popular listings are higher), regardless of how long ago it is.
to get more than 1000 items:
check all of the views (/r/subreddit/top, etc) and over all of the time scales
check all of the moderation queues (with parameter only=links):
unmoderated (/about/unmoderated)
moderation queue (/about/modqueue)
spam (/about/spam)
edited (/about/edited)
reports (/about/reports)
if this is a public subreddit, consider also using pushshift.io
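A rough sketch of checking several of the public views with PRAW and de-duplicating by submission id (the credentials are placeholders, each listing is still capped at roughly 1000 items as described above, and the moderation-queue views require moderator access so they are left out here):

    import praw

    reddit = praw.Reddit(client_id="CLIENT_ID",
                         client_secret="CLIENT_SECRET",
                         user_agent="history-collector")

    def collect_submissions(subreddit_name):
        """Union the ~1000-item listings across views and time filters."""
        subreddit = reddit.subreddit(subreddit_name)
        listings = [subreddit.new(limit=None), subreddit.hot(limit=None)]
        for time_filter in ("all", "year", "month", "week", "day"):
            listings.append(subreddit.top(time_filter=time_filter, limit=None))
            listings.append(subreddit.controversial(time_filter=time_filter, limit=None))
        seen = {}
        for listing in listings:
            for submission in listing:
                seen[submission.id] = submission  # de-duplicate across views
        return list(seen.values())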
I've been using the Twitter API (with Python) for quite some time, but I'm unable to search for Twitter users matching a specific criterion. For example, the API returns several user attributes in the user JSON data, like statuses_count or profile_link_color. But how do I do a reverse search using such parameters, like searching for users who have tweeted more than 1000 times, or users who created their accounts last week?
Based on the documentation, it looks like you can search for users that fulfill certain criteria with a GET users/search query:
Provides a simple, relevance-based search interface to public user accounts on Twitter. Try querying by topical interest, full name, company name, location, or other criteria.
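Since GET users/search only takes a free-text query, attribute-based filters like statuses_count still have to be applied client-side after the results come back. A minimal sketch assuming Tweepy (the query string, threshold, and page count are placeholders):

    import tweepy

    auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
    api = tweepy.API(auth)

    def prolific_users(query, min_statuses=1000, pages=5):
        """Search public accounts by keyword, then filter on profile attributes locally."""
        matches = []
        for page in range(1, pages + 1):
            for user in api.search_users(q=query, page=page):
                if user.statuses_count > min_statuses:
                    matches.append(user.screen_name)
        return matches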
We have a table in Azure Table Storage that is storing a LOT of data in it (IoT stuff). We are attempting a simple migration away from Azure Tables Storage to our own data services.
I'm hoping to get a rough idea of how much data we are migrating exactly.
E.g. 2,000,000 records for IoT device #1234.
The problem I am facing is getting a count of all the records present in the table under some constraints (e.g. count all records pertaining to IoT device #1234).
I did a fair amount of research and found posts saying that this count feature is not implemented in ATS. Those posts, however, were from circa 2010 to 2014.
I assumed (hoped) that this feature has been implemented by now, since it's 2017, and I'm trying to find docs for it.
I'm using Python to interact with our ATS.
Could someone please post the link to the docs here that show how I can get the count of records using python (or even HTTP / rest etc)?
Or if someone knows for sure that this feature is still unavailable, that would help me move on as well and figure another way to go about things!
Thanks in advance!
Returning the number of entities in a table is definitely not available in the Azure Table Storage SDK and service. You could make a table-scan query to return all entities from your table, but if you have millions of entities the query will probably time out, and it is also going to have a pretty big perf impact on your table. Alternatively, you could make segmented queries in a loop until you reach the end of the table.
Or if someone knows for sure that this feature is still unavailable, that would help me move on as well and figure another way to go about things!
This feature is still not available or in other words as of today there's no API which will give you a count of total number of rows in a table. You would have to write your own code to do so.
Could someone please post the link to the docs here that show how I can get the count of records using python (or even HTTP / rest etc)?
For this you would need to list all entities in a table. Since you're only interested in the count, you can reduce the size of the response data by making use of Query Projection and fetching just one or two attributes of the entities (maybe PartitionKey and RowKey). Please see my answer here for more details: Count rows within partition in Azure table storage.
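A rough sketch of that approach with the azure-storage Python SDK of that era: filter to one device, project down to PartitionKey only, and let the generator follow the continuation tokens segment by segment. The table name, the DeviceId filter column, and the credentials are assumptions:

    from azure.storage.table import TableService

    table_service = TableService(account_name="myaccount", account_key="MY_KEY")

    def count_device_records(table_name, device_id):
        """Count rows for one device; only PartitionKey is fetched to keep responses small."""
        entities = table_service.query_entities(
            table_name,
            filter="DeviceId eq '%s'" % device_id,  # assumed property holding the device id
            select="PartitionKey",                  # projection: one property per row
        )
        count = 0
        # iterating the generator transparently follows continuation tokens
        for _ in entities:
            count += 1
        return count

    # e.g. count_device_records("IoTReadings", "1234")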
Is it possible to get lifetime data using the facebookads API in Python? I tried using date_preset:lifetime and time_increment:1, but got a server error instead. Then I found this on their website:
"We use data-per-call limits to prevent a query from retrieving too much data beyond what the system can handle. There are 2 types of data limits:
By number of rows in response, and
By number of data points required to compute the total, such as summary row."
Is there any way I can do this? And another question: is there any way to pull raw data from a Facebook ad account, like a dump of all the data that resides on Facebook for that account?
The first thing to try is adding the limit parameter, which limits the number of results returned per page.
However, if the account has a large amount of history, the likelihood is that the total amount of data is too great, and in this case, you'll have to query ranges of data.
As you're looking for data by individual day, I'd start by querying for month-long blocks, and if that is still too much data, query each date individually.
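A rough sketch of that month-by-month pattern with the facebookads SDK, pulling daily rows for one time_range at a time with a smaller limit per page (the account id, fields, and limit value are placeholders):

    from facebookads.api import FacebookAdsApi
    from facebookads.adobjects.adaccount import AdAccount

    FacebookAdsApi.init(access_token="ACCESS_TOKEN")
    account = AdAccount("act_<AD_ACCOUNT_ID>")

    def monthly_insights(month_ranges):
        """Query insights one month range at a time instead of date_preset=lifetime."""
        rows = []
        for since, until in month_ranges:
            insights = account.get_insights(
                fields=["ad_id", "impressions", "spend"],
                params={
                    "level": "ad",
                    "time_range": {"since": since, "until": until},
                    "time_increment": 1,  # one row per day within the range
                    "limit": 500,         # smaller pages to stay under data-per-call limits
                },
            )
            rows.extend(insights)  # each element is an AdsInsights row
        return rows

    # e.g. monthly_insights([("2018-01-01", "2018-01-31"), ("2018-02-01", "2018-02-28")])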