I'm brand new to Python and I'm trying to write an extension to an app that imports GA information and parses it into MySQL. There is a shamefully sparse amount of information on the topic. The Google docs only seem to have examples in JS and Java...
...I have gotten to the point where my user can authenticate into GA using AuthSub. That code is here:
import gdata.service
import gdata.analytics
from django import http
from django import shortcuts
from django.shortcuts import render_to_response

def authorize(request):
    next = 'http://localhost:8000/authconfirm'
    scope = 'https://www.google.com/analytics/feeds'
    secure = False  # set secure=True to request secure AuthSub tokens
    session = False
    auth_sub_url = gdata.service.GenerateAuthSubRequestUrl(next, scope, secure=secure, session=session)
    return http.HttpResponseRedirect(auth_sub_url)
So, the next step is getting at the data. I have found this library: (beware, the UI is offensive) http://gdata-python-client.googlecode.com/svn/trunk/pydocs/gdata.analytics.html
However, I have found it difficult to navigate. It seems like I should be using gdata.analytics.AnalyticsDataEntry.getDataEntry(), but I'm not sure what it is asking me to pass it.
I would love a push in the right direction. I feel I've exhausted Google looking for a working example.
Thank you!!
EDIT: I have gotten farther, but my problem still isn't solved. The method below returns data (I believe)... the error I get is: "'str' object has no attribute '_BecomeChildElement'". I believe I am getting back a feed? However, I don't know how to drill into it. Is there a way for me to inspect this object?
def auth_confirm(request):
    gdata_service = gdata.service.GDataService('iSample_acctSample_v1.0')
    feedUri = 'https://www.google.com/analytics/feeds/accounts/default?max-results=50'
    # request feed
    feed = gdata.analytics.AnalyticsDataFeed(feedUri)
    print str(feed)
Maybe this post can help out. It seems like there are no Analytics-specific bindings yet, so you are working with the generic gdata.
I've been using GA for a little over a year now, and since about April 2009 I have used the Python bindings supplied in a package called python-googleanalytics by Clint Ecker et al. So far, it works quite well.
Here's where to get it: http://github.com/clintecker/python-googleanalytics.
Install it the usual way.
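For example, one plausible route is installing from a checkout of the repository (the package predates modern packaging conventions, so adjust as needed):
git clone http://github.com/clintecker/python-googleanalytics.git
cd python-googleanalytics
python setup.py install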
To use it: First, so that you don't have to manually pass in your login credentials each time you access the API, put them in a config file like so:
[Credentials]
google_account_email = youraccount@gmail.com
google_account_password = yourpassword
Name this file '.pythongoogleanalytics' and put it in your home directory.
And from an interactive prompt type:
from googleanalytics import Connection
import datetime

connection = Connection()  # pass in id & pw as strings **if** not in config file
account = connection.get_account(<your GA profile ID goes here>)
start_date = datetime.date(2009, 12, 1)
end_date = datetime.date(2009, 12, 13)
# The account object does the work; specify what data you want with
# 'metrics' & 'dimensions'; see the 'USAGE.md' file for examples.
account.get_data(start_date=start_date, end_date=end_date, metrics=['visits'])
The get_data method returns a Python list containing your data; the get_account call above returns the account object (bound to the variable account) that does the querying.
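To poke at what comes back, a minimal sketch (assuming the result behaves like a list; attribute details vary by package version):
data = account.get_data(start_date=start_date, end_date=end_date, metrics=['visits'])
print len(data)   # number of rows returned
for row in data:  # each row pairs your dimensions with metric values
    print row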
You need three files within the app: client_secrets.json, analytics.dat, and google_auth.py.
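client_secrets.json is your OAuth client definition and analytics.dat is the cached credential store. For google_auth.py, here is a hedged sketch along the lines of Google's HelloAnalytics v3 sample (verify the names against the oauth2client version you have installed; older versions use tools.run instead of tools.run_flow):
import httplib2
from apiclient.discovery import build
from oauth2client import tools
from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage

def initialize_service():
    # Set up an OAuth flow from the downloaded client secrets.
    flow = flow_from_clientsecrets(
        'client_secrets.json',
        scope='https://www.googleapis.com/auth/analytics.readonly')
    # Reuse cached credentials if we have them, otherwise run the flow.
    storage = Storage('analytics.dat')
    credentials = storage.get()
    if credentials is None or credentials.invalid:
        credentials = tools.run_flow(flow, storage)
    # Build an authorized Analytics v3 service object.
    http = credentials.authorize(httplib2.Http())
    return build('analytics', 'v3', http=http)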
Create a module Query.py within the app:
class Query(object):
    def __init__(self, startdate, enddate, filter, metrics):
        self.startdate = startdate.strftime('%Y-%m-%d')
        self.enddate = enddate.strftime('%Y-%m-%d')
        self.filter = "ga:medium=" + filter
        self.metrics = metrics
Example models.py, which has the following function:
import google_auth

service = google_auth.initialize_service()

def total_visit(self):
    obj = AnalyticsData.objects.get(utm_source=self.utm_source)
    trial = Query(obj.date.startdate, obj.date.enddate, obj.utm_source, "ga:sessions")
    result = service.data().ga().get(ids='ga:<your-profile-id>', start_date=trial.startdate, end_date=trial.enddate, filters=trial.filter, metrics=trial.metrics).execute()
    total_visit = result.get('rows')
    # <your save command, ColumnName.objects.create(data=total_visit), goes here>
I'm trying to set up stream-framework (the one here, not the newer getstream). I've set up the Redis server and the environment properly; the issue I'm facing is in creating the activities for a user.
I've been trying to create activities, following the documentation for adding an activity, but it gives me an error message as follows:
...
File "/Users/.../stream_framework/activity.py", line 110, in serialization_id
if self.object_id >= 10 ** 10 or self.verb.id >= 10 ** 3:
AttributeError: 'int' object has no attribute 'id'
Here is the code
from stream_framework.activity import Activity
from stream_framework.feeds.redis import RedisFeed

class PinFeed(RedisFeed):
    key_format = 'feed:normal:%(user_id)s'

class UserPinFeed(PinFeed):
    key_format = 'feed:user:%(user_id)s'

feed = UserPinFeed(13)
print(feed)

activity = Activity(
    actor=13,  # Thierry's user id
    verb=1,    # The id associated with the Pin verb
    object=1,  # The id of the newly created Pin object
)

feed.add(activity)  # Error at this line
I think there is something missing in the documentation or maybe I'm doing something wrong. I'll be very grateful if anyone helps me get the stream framework working properly.
The documentation is inconsistent. The verb you pass to the activity should be (an instance of?*) a subclass of stream_framework.verbs.base.Verb. Check out this documentation page on custom verbs and the tests for this class.
The following should fix the error you posted:
from stream_framework.activity import Activity
from stream_framework.feeds.redis import RedisFeed
from stream_framework.verbs import register
from stream_framework.verbs.base import Verb

class PinFeed(RedisFeed):
    key_format = 'feed:normal:%(user_id)s'

class UserPinFeed(PinFeed):
    key_format = 'feed:user:%(user_id)s'

class Pin(Verb):
    id = 5
    infinitive = 'pin'
    past_tense = 'pinned'

register(Pin)

feed = UserPinFeed(13)

activity = Activity(
    actor=13,
    verb=Pin,
    object=1,
)

feed.add(activity)
I quickly looked over the code for Activity and it looks like passing ints for actor and object should work. However, it is possible that these parameters are also outdated in the documentation.
* The tests pass in classes as verb. However, the Verb base class has the methods serialize and __str__ that can only be meaningfully invoked if you have an object of this class. So I'm still unsure which is required here. It seems like in the current state, the framework never calls these methods, so classes still work, but I feel like the author originally intended to pass instances.
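In concrete terms, both of these forms currently appear to go through, though only the class form is exercised by the tests:
activity = Activity(actor=13, verb=Pin, object=1)    # passing the class, as the tests do
activity = Activity(actor=13, verb=Pin(), object=1)  # passing an instance, as serialize/__str__ suggest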
With the help of the great answer by @He3lixxx, I was able to solve it partially. As the package is no longer maintained, it pulls in the latest Redis client for Python, which was creating too many issues; installing redis-2.10.5 alongside stream-framework-1.3.7 should fix them.
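For example, with pip that pinning looks like this:
pip install stream-framework==1.3.7 redis==2.10.5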
I would also like to add a complete guide to properly add activity to a user feed.
Key points:
If you are not using a feed manager, then make sure to insert the activity with the feed.insert_activity(activity) method before you add it to the user feed.
In case getting feeds with feed[:] throws an error like the one below:
File "/Users/.../stream_framework/activity.py", line 44, in get_hydrated
activity = activities[int(self.serialization_id)]
KeyError: 16223026351730000000001005L
then you need to clear the data for that user, using the key format for it. In my case the key is feed:user:13 for user 13; delete it with DEL feed:user:13. If that doesn't fix the issue, you can FLUSHALL, which will delete everything from Redis.
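If you'd rather do that cleanup from Python instead of redis-cli, a small sketch (the connection parameters are assumptions; adjust to your setup):
import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)
r.delete('feed:user:13')  # same as DEL feed:user:13 in redis-cli
# r.flushall()            # last resort: wipes everything in Redis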
Sample code:
from stream_framework.activity import Activity
from stream_framework.feeds.redis import RedisFeed
from stream_framework.verbs import register
from stream_framework.verbs.base import Verb

class PinFeed(RedisFeed):
    key_format = 'feed:normal:%(user_id)s'

class UserPinFeed(PinFeed):
    key_format = 'feed:user:%(user_id)s'

class Pin(Verb):
    id = 5
    infinitive = 'pin'
    past_tense = 'pinned'

register(Pin)

feed = UserPinFeed(13)
print(feed[:])

activity = Activity(
    actor=13,
    verb=Pin,
    object=1)

feed.insert_activity(activity)
activity_id = feed.add(activity)
print(activity_id)
print(feed[:])
Trying to use the Bing Ads API to duplicate what I see on the Hourly report.
Unfortunately, even though I'm properly authenticated, the data I'm getting back is only for one campaign (one which has about 1 impression per day). I can see the data in the UI just fine, but authenticated as the same user via the API, I can only seem to get back the smaller data set. I'm using https://github.com/BingAds/BingAds-Python-SDK and basing my code on the example:
def get_hourly_report(
        account_id,
        report_file_format,
        return_only_complete_data,
        time):
    report_request = reporting_service.factory.create('CampaignPerformanceReportRequest')
    report_request.Aggregation = 'Hourly'
    report_request.Format = report_file_format
    report_request.ReturnOnlyCompleteData = return_only_complete_data
    report_request.Time = time
    report_request.ReportName = "Hourly Bing Report"

    scope = reporting_service.factory.create('AccountThroughCampaignReportScope')
    scope.AccountIds = {'long': [account_id]}
    # scope.Campaigns = reporting_service.factory.create('ArrayOfCampaignReportScope')
    # scope.Campaigns.CampaignReportScope.append()
    report_request.Scope = scope

    report_columns = reporting_service.factory.create('ArrayOfCampaignPerformanceReportColumn')
    report_columns.CampaignPerformanceReportColumn.append([
        'TimePeriod',
        'CampaignId',
        'CampaignName',
        'DeviceType',
        'Network',
        'Impressions',
        'Clicks',
        'Spend'
    ])
    report_request.Columns = report_columns

    return report_request
I'm not super familiar with these ad data APIs, so any insight will be helpful, even if you don't have a solution.
I spent weeks back and forth with Microsoft Support. Here's the result:
You can get logs out of the examples by adding this code:
import logging
logging.basicConfig(level=logging.INFO)
logging.getLogger('suds.client').setLevel(logging.DEBUG)
logging.getLogger('suds.transport').setLevel(logging.DEBUG)
The issue was related to the way the example is built. In the auth_helper.py file there is a method named authenticate that looks like this:
def authenticate(authorization_data):
    # import logging
    # logging.basicConfig(level=logging.INFO)
    # logging.getLogger('suds.client').setLevel(logging.DEBUG)
    # logging.getLogger('suds.transport.http').setLevel(logging.DEBUG)

    customer_service = ServiceClient(
        service='CustomerManagementService',
        version=13,
        authorization_data=authorization_data,
        environment=ENVIRONMENT,
    )

    # You should authenticate for Bing Ads services with a Microsoft Account.
    authenticate_with_oauth(authorization_data)

    # Set to an empty user identifier to get the current authenticated Bing Ads user,
    # and then search for all accounts the user can access.
    user = get_user_response = customer_service.GetUser(
        UserId=None
    ).User

    accounts = search_accounts_by_user_id(customer_service, user.Id)

    # For this example we'll use the first account.
    authorization_data.account_id = accounts['AdvertiserAccount'][0].Id
    authorization_data.customer_id = accounts['AdvertiserAccount'][0].ParentCustomerId
As you can see, at the very bottom it says "For this example we'll use the first account." It turns out that my company had two accounts. This was not configurable anywhere, and I had no idea this code was here, but you can add a breakpoint here to see your full list of accounts. We only had two, so I flipped the 0 to a 1 and everything started working.
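If you hit the same thing, a minimal sketch of selecting the account explicitly instead of taking index 0 (TARGET_ACCOUNT_ID is a placeholder for your real account id):
TARGET_ACCOUNT_ID = 123456789  # placeholder: the account you actually want
for account in accounts['AdvertiserAccount']:
    if account.Id == TARGET_ACCOUNT_ID:
        authorization_data.account_id = account.Id
        authorization_data.customer_id = account.ParentCustomerId
        break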
I'm trying to copy all my LiveJournal posts to my new blog on blogger.com. I do so by using a slightly modified example that ships with the gdata Python client. I have a JSON file with all of my posts imported from LiveJournal. The issue is that blogger.com has a daily limit of 50 new blog entries, so you can imagine that the 1300+ posts I have will take a month to copy, since I can't programmatically enter a captcha after 50 imports.
I recently learned that there's also a batch operation mode somewhere in gdata, but I couldn't figure out how to use it. Googling didn't really help.
Any advice or help will be highly appreciated.
Thanks.
Update
Just in case, I use the following code
#!/usr/local/bin/python

import json
import requests
from gdata import service
import gdata
import atom
import getopt
import sys
from datetime import datetime as dt
from datetime import timedelta as td
from datetime import tzinfo as tz
import time

allEntries = json.load(open("todays_copy.json", "r"))

class TZ(tz):
    def utcoffset(self, dt):
        return td(hours=-6)

class BloggerExample:
    def __init__(self, email, password):
        # Authenticate using ClientLogin.
        self.service = service.GDataService(email, password)
        self.service.source = "Blogger_Python_Sample-1.0"
        self.service.service = "blogger"
        self.service.server = "www.blogger.com"
        self.service.ProgrammaticLogin()

        # Get the blog ID for the first blog.
        feed = self.service.Get("/feeds/default/blogs")
        self_link = feed.entry[0].GetSelfLink()
        if self_link:
            self.blog_id = self_link.href.split("/")[-1]

    def CreatePost(self, title, content, author_name, label, time):
        LABEL_SCHEME = "http://www.blogger.com/atom/ns#"
        # Create the entry to insert.
        entry = gdata.GDataEntry()
        entry.author.append(atom.Author(atom.Name(text=author_name)))
        entry.title = atom.Title(title_type="xhtml", text=title)
        entry.content = atom.Content(content_type="html", text=content)
        entry.published = atom.Published(time)
        entry.category.append(atom.Category(scheme=LABEL_SCHEME, term=label))

        # Ask the service to insert the new entry.
        return self.service.Post(entry,
            "/feeds/" + self.blog_id + "/posts/default")

    def run(self, data):
        for year in allEntries:
            for month in year["yearlydata"]:
                for day in month["monthlydata"]:
                    for entry in day["daylydata"]:
                        # print year["year"], month["month"], day["day"], entry["title"].encode("utf-8")
                        atime = dt.strptime(entry["time"], "%I:%M %p")
                        hr = atime.hour
                        mn = atime.minute
                        ptime = dt(year["year"], int(month["month"]), int(day["day"]), hr, mn, 0, tzinfo=TZ()).isoformat("T")
                        public_post = self.CreatePost(entry["title"],
                                                      entry["content"],
                                                      "My name",
                                                      ",".join(entry["tags"]),
                                                      ptime)
                        print "%s, %s - published, Waiting 30 minutes" % (ptime, entry["title"].encode("utf-8"))
                        time.sleep(30 * 60)

def main(data):
    email = "my@email.com"
    password = "MyPassW0rd"
    sample = BloggerExample(email, password)
    sample.run(data)

if __name__ == "__main__":
    main(allEntries)
I would recommend using Google Blog Converters instead (https://code.google.com/archive/p/google-blog-converters-appengine/).
To get started you will have to go through
https://github.com/google/gdata-python-client/blob/master/INSTALL.txt - Steps for setting up Google GData API
https://github.com/pra85/google-blog-converters-appengine/blob/master/README.txt - Steps for using Blog Convertors
Once you have everything set up, you would have to run the following command (with your LiveJournal username and password):
livejournal2blogger.sh -u <username> -p <password> [-s <server>]
Redirect its output into a .xml file. This file can now be imported into a Blogger blog directly by going to the Blogger Dashboard, then your blog > Settings > Other > Blog tools > Import Blog.
Remember to check the "Automatically publish all imported posts and pages" option. I have tried this once before with a blog of over 400 posts, and Blogger successfully imported & published them without issue.
In case you have doubts that Blogger might have some issues (because the number of posts is quite high), or you have other Blogger blogs in your account, then just as a precaution, create a separate Blogger (Google) account and try importing the posts there. After that you can transfer the admin controls to your real Blogger account (to transfer, you will first have to send an author invite, then raise your real Blogger account to admin level, and lastly remove the dummy account. The option for sending an invite is at Settings > Basic > Permissions > Blog Authors).
Also make sure that you are using Python 2.5, otherwise these scripts will not run. Before running livejournal2blogger.sh, change the following line (thanks to Michael Fleet for this fix: http://michael.f1337.us/2011/12/28/google-blog-converters-blogger2wordpress/)
PYTHONPATH=${PROJ_DIR}/lib python ${PROJ_DIR}/src/livejournal2blogger/lj2b.py $*
to
PYTHONPATH=${PROJ_DIR}/lib python2.5 ${PROJ_DIR}/src/livejournal2blogger/lj2b.py $*
P.S. I am aware this is not the answer to your question, but as the objective of this answer is the same as your question (to import more than 50 posts in a day), that's why I shared it. I don't have much knowledge of Python or the GData API; I just set up the environment and followed these steps to answer this question (and I was able to import posts from LiveJournal to Blogger with it).
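As for the batch mode itself, here is what a batch insert looks like with the generic gdata client. Note that this snippet follows the Google Base batch example, so the entry types and feed URI would need to be adapted for Blogger: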
# build feed
request_feed = gdata.base.GBaseItemFeed(atom_id=atom.Id(text='test batch'))
# format each object
entry1 = gdata.base.GBaseItemFromString('--XML for your new item goes here--')
entry1.title.text = 'first batch request item'
entry2 = gdata.base.GBaseItemFromString('--XML for your new item here--')
entry2.title.text = 'second batch request item'
# Add each blog item to the request feed
request_feed.AddInsert(entry1)
request_feed.AddInsert(entry2)
# Execute the batch processes through the request_feed (all items)
result_feed = gd_client.ExecuteBatch(request_feed)
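If you go this route, each entry in the result feed carries its own status; a hedged sketch of checking them (attribute names follow the old gdata batch docs; verify against your client version):
for entry in result_feed.entry:
    # batch_id ties the result back to the request; batch_status holds the outcome
    print entry.batch_id.text, entry.batch_status.code, entry.batch_status.reason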
Can you tell me how to apply this patch to Google App Engine, i.e. where to put it? Thank you.
def _user_init(self, email=None, _auth_domain=None,
               _user_id=None, federated_identity=None, federated_provider=None):
    if not _auth_domain:
        _auth_domain = os.environ.get('AUTH_DOMAIN')
    assert _auth_domain

    if email is None and federated_identity is None:
        email = os.environ.get('USER_EMAIL', email)
        _user_id = os.environ.get('USER_ID', _user_id)
        federated_identity = os.environ.get('FEDERATED_IDENTITY',
                                            federated_identity)
        federated_provider = os.environ.get('FEDERATED_PROVIDER',
                                            federated_provider)

    if not email and not federated_identity:
        raise UserNotFoundError

    self.__email = email
    self.__federated_identity = federated_identity
    self.__federated_provider = federated_provider
    self.__auth_domain = _auth_domain
    self.__user_id = _user_id or None

users.User.__init__ = _user_init
Just use it as-is: put that code in a module that gets imported before you use the relevant User module or datastore functionality. The snippet includes the patch itself plus the line that applies it (the last line).
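Concretely, the import-order requirement might look like this (the module names here are hypothetical):
# patches.py -- holds the _user_init function above plus the final
# users.User.__init__ = _user_init line.

# main.py (or wherever your handlers live):
import patches                          # apply the monkey patch first
from google.appengine.api import users  # User construction now uses the patched __init__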
Overriding the constructor like this is not safe. If the internal implementation of the Users API changes in production, your application could break without warning.
What are you trying to accomplish here? I don't see any custom logic; it looks like you've just copied the constructor from the SDK verbatim. If you need to add custom logic, try subclassing UserProperty and/or wrapping the users API calls instead.
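As a sketch of the "wrap the users API calls" suggestion (the helper name is hypothetical; only users.get_current_user is a real call here):
from google.appengine.api import users

def get_current_user_with_fallback():
    # Keep custom logic in one wrapper instead of a patched constructor.
    user = users.get_current_user()
    if user is None:
        pass  # custom fallback logic would go here
    return user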
I think this belongs to some application, as a grep within the App Engine SDK for 'federated_identity' does not turn up any clues. BTW, you should do the same: grep (or WinGrep) for terms like 'federated' to see whether this partial patch can be applied to any source.
Thanks for the updated link. The patch can be added to the file google/appengine/api/users.py
You might just need to add the last line: users.User.__init__ = _user_init
I was able to figure this out after checking the latest code in the SVN.
I am trying to access the Library class of pylast, but I must be doing something wrong. I can get most other features to work. The following is a code example which just takes the standard working example and adds what I believe to be the correct way of adding an album to my Last.fm library:
import pylast
# You have to have your own unique two values for API_KEY and API_SECRET
# Obtain yours from http://www.last.fm/api/account for Last.fm
API_KEY = "80a1c765efb52869575821c03d93a30e" # this is a sample key
API_SECRET = "2ba567f5b0d74c6cc6a8d07ef2cbc2d"
# In order to perform a write operation you need to authenticate yourself
username = "astroid0"
password_hash = pylast.md5("xxx")
network = pylast.LastFMNetwork(api_key=API_KEY, api_secret=API_SECRET,
                               username=username, password_hash=password_hash)
# now you can use that object everywhere
artist = network.get_artist("System of a Down")
artist.shout("<3")
track = network.get_track("Iron Maiden", "The Nomad")
track.love()
track.add_tags(("awesome", "favorite"))
## This is the area causing trouble
library1 = pylast.Library(user = "astroid0", network = "LastFM")
album1 = network.get_album("The Rolling Stones", "Sticky Fingers")
library1.add_album(album1)
I am new to Python, so I am sorry if this is obvious. I have just been stuck for days now and decided to ask.
It's a bug in pylast.
Line 1957 (from trunk) should be:
params["artist"] = album.get_artist().get_name()
instead of:
params["artist"] = album.get_artist.get_name()
You can report the issue to the author here.
The answer by miles82 shows the bug, and it's been reported to pylast.
Unfortunately there have been no updates in a few years, so I've fixed this in my fork of pylast.