I would like to show Google Analytics and Google Search Console data directly in Superset through their APIs. Specifically, I want to:
Make direct queries to the Google Analytics API in JSON (instead of storing the results in my database and then showing them in Superset) and display the result in Superset
Make direct queries to the Google Search Console API in JSON and display the result in Superset
Make direct queries to other amazing JSON APIs and display the result in Superset
How can I do so?
I couldn't find a Google Analytics datasource. I couldn't find a Google Search Console datasource either.
I can't find a way to display data retrieved from an API in Superset, only data stored in a database. I must be missing something, but I can't find anything in the docs related to authenticating with and querying external APIs.
Superset can’t query external data APIs directly. Superset has to work with a supported database or data engine (https://superset.incubator.apache.org/installation.html#database-dependencies). This means that you need to find a way to fetch data out of the API and store it in a supported database / data engine. Some options:
Build a little Python pipeline that will query the data API, flatten the data to something tabular / relational, and upload that data to a supported data source - https://superset.incubator.apache.org/installation.html#database-dependencies - and set up Superset so it can talk to that database / data engine (a minimal sketch of such a pipeline follows this list).
For more robust solutions, you may want to work with a devops / infrastructure team to stand up a workflow scheduler like Apache Airflow (https://airflow.apache.org/) to regularly ping this API and store the results in a database of some kind that Superset can talk to (a bare-bones DAG sketch appears further below).
If you want to regularly query data from a popular 3rd party API, I also recommend checking out Meltano and learning more about Singer taps. These will handle some of the heavy lifting of fetching data from an API regularly and storing it in a database like Postgres. The good news is that there's a Singer tap for Google Analytics - https://github.com/singer-io/tap-google-analytics
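To make the first option concrete, here's a minimal sketch of such a pipeline, assuming the API returns a list of flat JSON records and that a Postgres database is reachable at the made-up connection string below:

```python
import requests
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical endpoint and connection string -- replace with your own.
API_URL = "https://api.example.com/reports"
DB_URI = "postgresql://superset:secret@localhost:5432/analytics"

def run_pipeline():
    # 1. Query the data API.
    response = requests.get(API_URL, headers={"Authorization": "Bearer <token>"})
    response.raise_for_status()
    records = response.json()          # assume a list of flat JSON objects

    # 2. Flatten the data to something tabular / relational.
    df = pd.json_normalize(records)

    # 3. Upload it to a database Superset can talk to.
    engine = create_engine(DB_URI)
    df.to_sql("api_report", engine, if_exists="replace", index=False)

if __name__ == "__main__":
    run_pipeline()
```

Run it from cron (or any scheduler) and point a Superset dataset at the resulting api_report table.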
Either way, Superset is just a thin layer above your database / data engine. So there’s no way around the reality that you need to find a way to extract data out of an API and store it in a compatible data source.
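If you go the Airflow route, the scheduling piece is just a thin wrapper around the same fetch-and-load function. A bare-bones sketch, assuming Airflow 2.x and reusing the hypothetical run_pipeline function from the snippet above:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

from my_pipeline import run_pipeline  # the hypothetical fetch/flatten/load function above

default_args = {
    "owner": "data-team",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="api_to_postgres",
    default_args=default_args,
    start_date=datetime(2021, 1, 1),
    schedule_interval="@hourly",   # ping the API every hour
    catchup=False,
) as dag:
    load_api_data = PythonOperator(
        task_id="load_api_data",
        python_callable=run_pipeline,
    )
```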
There is a project named shillelagh by one of Superset's contributors. It gives a SQL interface to REST APIs, and the same package is used in Apache Superset to connect with gsheets.
New adapters are relatively easy to implement. There's a step-by-step tutorial that explains how to create a new adapter to an API or filetype in shillelagh.
Under the hood, shillelagh uses SQLite virtual tables via the SQLite wrapper APSW.
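Once an adapter exists, querying the API looks like plain SQL. As a rough illustration, here's the pattern with shillelagh's built-in Google Sheets adapter (the spreadsheet URL and column names are placeholders, and the gsheets adapter extra must be installed):

```python
from shillelagh.backends.apsw.db import connect

# In-memory SQLite database; the "tables" are virtual tables backed by adapters.
connection = connect(":memory:")
cursor = connection.cursor()

# The table name is just the URL; the adapter handles fetching the data.
SQL = '''
    SELECT country, SUM(cnt)
    FROM "https://docs.google.com/spreadsheets/d/<spreadsheet-id>/edit#gid=0"
    GROUP BY country
'''
for row in cursor.execute(SQL):
    print(row)
```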
Redash is an alternative to Superset for this task, but it doesn't have the same features. Here is a comparison of the integrations available in both tools: https://discuss.redash.io/t/a-comparison-of-redash-and-superset/1503
A quick alternative is paying for a third party service like: https://www.stitchdata.com/integrations/google-analytics/superset/
There is no such connector available by default.
A recommended solution would be to store your Google Analytics and Search Console data in a database; you could write a script that pulls data every 4 hours, or whatever interval works for you.
Also, you shouldn't store all the data, only the dimensions/metrics you wish to see in your reports.
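As an illustration of such a pull, here's a sketch using the Google Analytics Reporting API v4 via google-api-python-client, fetching only a couple of dimensions/metrics (the view ID and key file path are placeholders):

```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/analytics.readonly"]
KEY_FILE = "service-account.json"   # placeholder path to a service account key
VIEW_ID = "123456789"               # placeholder GA view ID

credentials = service_account.Credentials.from_service_account_file(KEY_FILE, scopes=SCOPES)
analytics = build("analyticsreporting", "v4", credentials=credentials)

# Request only the dimensions/metrics you actually want in your reports.
response = analytics.reports().batchGet(
    body={
        "reportRequests": [
            {
                "viewId": VIEW_ID,
                "dateRanges": [{"startDate": "7daysAgo", "endDate": "today"}],
                "metrics": [{"expression": "ga:sessions"}],
                "dimensions": [{"name": "ga:date"}],
            }
        ]
    }
).execute()

# ...flatten response["reports"] and write the rows to your database here.
```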
I am currently getting a lot of data via API and I would like to display it on a dynamic dashboard.
So far, I saw that I could use Grafana, but it seems to require a database such as InfluxDB.
Is it possible to use Grafana without storing the data I get via the API into a database, and instead display only the data I get with each request?
You can define a RESTful API endpoint as a datasource using the SimpleJson datasource plugin. In this way, you can remove the direct dependency on a database. However, your back-end needs to implement certain URLs and conform to the plugin's request/response formats. I would recommend that you have a look at this link for a sample implementation and see if it really meets your specific requirement.
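For reference, the plugin expects a health-check route plus /search and /query endpoints (and optionally /annotations). A rough Flask sketch of such a back-end, with the metric name and values hard-coded as placeholders:

```python
import time
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/")
def health():
    # SimpleJson hits this route to test the datasource connection.
    return "OK"

@app.route("/search", methods=["POST"])
def search():
    # Return the list of metrics the user can pick in Grafana.
    return jsonify(["my_api_metric"])   # placeholder metric name

@app.route("/query", methods=["POST"])
def query():
    req = request.get_json()
    now_ms = int(time.time() * 1000)
    results = []
    for target in req.get("targets", []):
        # In a real back-end you'd call your API here instead of hard-coding values.
        results.append({
            "target": target["target"],
            "datapoints": [[42, now_ms - 60000], [43, now_ms]],  # [value, timestamp in ms]
        })
    return jsonify(results)

if __name__ == "__main__":
    app.run(port=5000)
```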
I am looking to back up user-generated data (user profiles, that may change from time to time) from my AppEngine python application into Google Cloud Storage. I could easily periodically brute-force back up all of the user-generated data, but it probably makes more sense to only update data that has changed (only writing it to the cloud storage if the user has changed their data). Later, in the case that data needs to be restored, I would like to take advantage of the object-versioning functionality of the Cloud Storage service to determine which objects need to be restored.
I am trying to understand exactly how Google Cloud Storage interacts with App Engine based on the information regarding cloudstorage.open() found at https://developers.google.com/appengine/docs/python/googlecloudstorageclient/functions. However, there is no indication of how this service interacts with versioned objects that are stored in the cloud (versioned objects are documented here: https://developers.google.com/storage/docs/object-versioning).
So, my question is: how can an application running on App Engine access specific versions of objects that are stored in Google Cloud Storage?
If there is a better way of doing this, I would be interested in hearing about it as well.
The AppEngine GCS Client Library doesn't support versioning at this time. If you enable versioning on a bucket through other channels, the GCS Client Library will keep working fine, but in order to access or delete older generations of objects, you'll need to use either the XML API or the JSON API (as opposed to the appengine-specific API). There is a Python client for the JSON API that works fine from within appengine, but you'll lose a few of appengine's niceties by using it. See https://developers.google.com/appengine/docs/python/googlecloudstorageclient/#gcs_rest_api for more details.
Here's a bit of info on how to use versioning from the XML and JSON APIs: https://developers.google.com/storage/docs/generations-preconditions
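For example, listing and reading older generations through the JSON API from Python could look roughly like this (a sketch using google-api-python-client and assuming default credentials are available; the bucket and object names are placeholders):

```python
from googleapiclient.discovery import build

BUCKET = "my-backup-bucket"          # placeholder bucket name
OBJECT = "profiles/user-123.json"    # placeholder object name

# Uses default credentials (e.g. the App Engine service account).
service = build("storage", "v1")

# List every generation of the object (requires versioning enabled on the bucket).
listing = service.objects().list(bucket=BUCKET, prefix=OBJECT, versions=True).execute()
for item in listing.get("items", []):
    print(item["name"], item["generation"], item["updated"])

# Fetch the metadata (or the content, via get_media) of one specific generation.
old_version = service.objects().get(
    bucket=BUCKET, object=OBJECT, generation=listing["items"][0]["generation"]
).execute()
```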
At my company we want to build an application in Google App Engine that will manage user provisioning to Google Apps, but we do not really know what data source to use.
We have two proposals:
A spreadsheet that will contain users' data; we will use the Spreadsheet API to get this data and use it for user provisioning.
The Datastore, which will also contain users' data; this time we will use the Datastore API.
Please note that my company has 3493 users, and we do not know many of the advantages and disadvantages of each solution.
Any suggestions please?
If you use the Datastore API, you will also need to build out a way to manage the users' data in the system.
If you use Spreadsheets, that will serve as your way to manage the users' data, so in that sense managing the data would be taken care of for you.
The benefit of using the Datastore API is that managing the user data is seamlessly integrated into your application; spreadsheet integration would remain separate from your main application.
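To give a feel for the Datastore side, a user entity in the App Engine (Python) ndb API is only a few lines; the properties below are just placeholders:

```python
from google.appengine.ext import ndb

class User(ndb.Model):
    email = ndb.StringProperty(required=True)
    full_name = ndb.StringProperty()
    is_provisioned = ndb.BooleanProperty(default=False)
    updated = ndb.DateTimeProperty(auto_now=True)

# Create / update a user record.
user = User(id="jane@example.com", email="jane@example.com", full_name="Jane Doe")
user.put()

# Look up users that still need provisioning.
pending = User.query(User.is_provisioned == False).fetch(100)
```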
My friend has a website built using the Pyramid framework with MongoDB to store data. If I want to build an iPhone app, how do I access the data from that database?
I know Obj-C and have built simple iOS apps, but none of them used non-local data. I've googled but found no good results. I just don't know where to start. Any good tutorial or sample code on this issue would be appreciated!
As far as best practices go, you would not want to be accessing MongoDB (or any database) directly over the internet without appropriate security considerations.
The most straightforward option from iOS would probably be either to add a RESTful interface to your own application, or to use a third-party hosted solution that provides an API. In either case I would recommend using https in addition to authentication, as the MongoDB wire protocol is not currently encrypted.
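As a sketch of that first option, a small JSON view in the existing Pyramid app could expose the MongoDB data (the route, database, and collection names are placeholders; add real authentication before exposing this):

```python
from pyramid.view import view_config
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
db = client["myapp"]                               # placeholder database name

@view_config(route_name="api_users", renderer="json", request_method="GET")
def list_users(request):
    # Return a JSON array; exclude Mongo's _id since ObjectId isn't JSON-serializable.
    return list(db.users.find({}, {"_id": 0}))

# In the app's configurator (e.g. __init__.py), assuming a config object exists:
#     config.add_route("api_users", "/api/users")
#     config.scan()
```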
For iOS I would consider using the RestKit framework as a handy helper. It includes reasonable documentation and examples to get you started.
How can I store data on a local machine using Python (and its libraries, extensions, ...) to easily access the data (in an OOP manner), similar to the Google Python datastore for App Engine?
You should define "similar" to get more accurate answers... but here's my first attempt: use mongodb in conjunction with pymongo.
My answer is based on the idea that both Google's data store and mongodb are schemaless databases, and that mongodb uses BSON (Binary JSON), where the "O" stands for "Object".
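For instance, a minimal pymongo session that stores and retrieves objects without any schema looks like this (the database and collection names are placeholders):

```python
from pymongo import MongoClient

client = MongoClient()          # defaults to localhost:27017
db = client["local_store"]      # placeholder database name

# Store an arbitrary (schemaless) document, much like putting an entity.
db.profiles.insert_one({"name": "Ada", "tags": ["admin", "beta"], "logins": 3})

# Retrieve it back as a plain Python dict.
profile = db.profiles.find_one({"name": "Ada"})
print(profile["tags"])
```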
EDIT: Apparently the very genesis of mongodb is based on an attempt to imitate the Google stack.
Does this help?
What about installing the App Engine dev SDK and using it locally? Internally it stores the data in an SQLite db, but the Python usage is the same.
Recommended only if you just want to mimic the App Engine-style usage.