Python Mechanize + GAE python code - python

I am aware of previous questions regarding mechanize + Google App Engine:
"What pure Python library should I use to scrape a website?"
and "Mechanize and Google App Engine".
There is also some code here, which I cannot get to work on App Engine. It throws:
File "D:\data\eclipse-php\testpy4\src\mechanize\_http.py", line 43, in socket._fileobject("fake socket", close=True)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\dist\socket.py", line 42, in _fileobject
fp.fileno = lambda: None
AttributeError: 'str' object has no attribute 'fileno'
INFO 2009-12-14 09:37:50,405 dev_appserver.py:3178] "GET / HTTP/1.1" 500 -
Is anybody willing to share their working mechanize + App Engine code?

I solved this problem by changing the code in mechanize/_http.py, at about line 43,
from:
try:
    socket._fileobject("fake socket", close=True)
except TypeError:
    # python <= 2.4
    create_readline_wrapper = socket._fileobject
else:
    def create_readline_wrapper(fh):
        return socket._fileobject(fh, close=True)
to:
try:
    # fixed start -- fixed for gae
    class x:
        pass
    # x must be an object that accepts new attributes, not a string;
    # this is the key
    socket._fileobject(x, close=True)
    # fixed ended
except TypeError:
    # python <= 2.4
    create_readline_wrapper = socket._fileobject
else:
    def create_readline_wrapper(fh):
        return socket._fileobject(fh, close=True)
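To illustrate why the string fails, here is a minimal sketch. The function stub_fileobject is a hypothetical stand-in modelled only on the line shown in the traceback (fp.fileno = lambda: None); it is not the actual App Engine code. A str cannot take new attributes, so the assignment raises AttributeError, which the surrounding "except TypeError" does not catch. A plain class accepts the attribute.

```python
def stub_fileobject(fp, close=False):
    """Hypothetical stand-in for the stub in the traceback: it sets an attribute on fp."""
    fp.fileno = lambda: None
    return fp


class Placeholder:
    """Plain class; unlike a str, it accepts new attributes."""


try:
    stub_fileobject("fake socket", close=True)
except AttributeError as exc:
    # Same kind of failure as in the question's traceback.
    print("string argument fails:", exc)

# Passing the class itself works, as in the fix above.
stub_fileobject(Placeholder, close=True)
print("class argument works:", Placeholder.fileno())
```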

I managed to get mechanize code that runs on GAE, many thanks to MStodd from the GAEMechanize project: http://code.google.com/p/gaemechanize/
If anybody needs the code, you can contact MStodd!
PS: the code is not on Google Code, so you have to contact the owner.
Cheers
don

I've uploaded the source of the gaemechanize project to a new project: http://code.google.com/p/gaemechanize2/
Insert usual caveats.

Related

Got Index out of Bound while trying to get status of VM compute_client.virtual_machines.get(resrc, vm, expand="instanceView")

I am getting the error below when trying to get the status of the VM inside a while loop:
File "C:\Users\RohitMishra\Documents\cost-controller-engine\services\service.py", line 533, in schedule
compute_client.virtual_machines.get(resrc, vm, expand="instanceView")
IndexError: list index out of range
Below is my code:
Status_vm = (
    compute_client.virtual_machines.get(resrc, vm, expand="instanceView").instance_view.statuses[1].display_status)
print("STATUS VMMMM---",Status_vm)
if Status_vm =="VM deallocated":
    compute_client.virtual_machines.begin_start(resrc, vm)
    while True:
        logger.info("enterd into while loop for starting VM..")
        print("STATATAus >>>>>: ",Status_vm)
        Status_vm = (
            compute_client.virtual_machines.get(resrc, vm, expand="instanceView")
            .instance_view.statuses[1]
            .display_status
        )
        logger.info(f"Status.. #: {Status_vm}")
        if Status_vm =="VM running":
            break
        else:
            time.sleep(constants.SLEEP_TIME)
    return "Successfully running your VM"
Could you help me with any code or logic in Python to resolve this issue?
As Laurent Mazuel said, it would be great to have an issue with a full stack trace at https://github.com/Azure/azure-sdk-for-python/issues. I'm also wondering whether the culprit is the .instance_view.statuses[1] lookup on the object returned by compute_client.virtual_machines.get. Are you sure there are at least 2 statuses on the instance view?
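If the instance view sometimes carries fewer than two statuses, indexing with [1] will raise IndexError. A rough sketch of a more defensive lookup follows. It assumes each status object has code and display_status attributes and that the power-state entry has a code starting with "PowerState/". The helper name power_state and the SimpleNamespace objects are made up for the demo and stand in for the real SDK objects.

```python
from types import SimpleNamespace


def power_state(statuses):
    """Return the display status of the power-state entry, or None if absent."""
    for status in statuses or []:
        code = getattr(status, "code", None) or ""
        if code.startswith("PowerState/"):
            return status.display_status
    return None


# Hypothetical stand-ins for the entries of instance_view.statuses.
provisioning = SimpleNamespace(code="ProvisioningState/updating", display_status="Updating")
running = SimpleNamespace(code="PowerState/running", display_status="VM running")

print(power_state([provisioning]))           # only one status, no IndexError
print(power_state([provisioning, running]))  # power state found by its code
```

In the polling loop, a None result can then be treated as "not ready yet" and followed by another sleep instead of crashing.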

Problem using Yandex translater API in Python

I've been asked to translate some words, and I'm using Python to do it. Yandex has an API that is supposed to be usable from Python; the documentation is here:
https://pypi.org/project/yandex-translater/1.0/
I followed the steps, but I always get the same error, which seems to come from within the API, or maybe I'm not setting something right in my code.
The code is as follows:
from yandex import Translater
tr = Translater()
tr.set_key('my API key not given here')
tr.set_text("Hello World")
tr.set_from_lang('en')
tr.set_to_lang('fr')
result = tr.translate()
print(result)
I then get this error :
File "C:\Users\BMQT\Desktop\Scraping\test.py", line 2, in <module>
tr = Translater()
File "C:\Program Files\Python37\lib\site-packages\yandex\Translater.py", line 23, in __init__
self.default_ui = locale.getlocale()[0].split('_')[0]
AttributeError: 'NoneType' object has no attribute 'split'
For reference, line 23 of Translater.py reads:
self.default_ui = locale.getlocale()[0].split('_')[0]
Is the API broken or am I wrong in my code? Thanks for the answers!
I've used another api module called yandex_translate, and it works fine.
from yandex_translate import YandexTranslate
translate = YandexTranslate('mykey')
traduction =('Translate:', translate.translate('bonjour', 'fr-ar'))
print(traduction)
Don't know what was wrong with the previous one, maybe outdated.
The Translater object needs to be created like this: tr = Translater.Translater()
from yandex import Translater
tr = Translater.Translater()
tr.set_key('my API key not given here')
tr.set_text("Hello World")
tr.set_from_lang('en')
tr.set_to_lang('fr')
result = tr.translate()
print(result)
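For what it's worth, the traceback in the question points at locale.getlocale()[0] being None, which can happen when no locale is configured for the process. A small sketch of a defensive version of that line is below. The helper name ui_language_from and the "en" fallback are my own assumptions, not part of the yandex-translater package.

```python
import locale


def ui_language_from(locale_tuple, fallback="en"):
    """Return the language part of a (language_code, encoding) tuple, or a fallback."""
    language_code = locale_tuple[0]  # may be None when no locale is set
    if not language_code:
        return fallback
    return language_code.split("_")[0]


print(ui_language_from((None, None)))        # the case that crashed in the question
print(ui_language_from(("fr_FR", "UTF-8")))
print(ui_language_from(locale.getlocale()))  # whatever this machine reports
```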

Why is my try catch solution not working as I expect it

I am trying to validate a form in a django project and part of the validation is to check if a project exists.
The Environment:
python 3.6.3
django 1.10.8
python-keystoneclient 3.14.0
I have this check currently
def clean_projectname(self):
    submitted_data = self.cleaned_data['projectname']
    newproj = "PROJ-FOO" + submitted_data.upper()
    keystone = osauth.connect()
    try:
        project = keystone.projects.find(name=newproj)
        raise forms.ValidationError('The project name is already taken')
    except NotFound:
        return submitted_data
The try block will either return a project object or raise a 404 Not Found exception.
I have tried to catch NotFound, but Django gives me an error:
name 'NotFound' is not defined
I would appreciate help with this.
Have you imported NotFound from python-keystoneclient? The only way your code would work is if you had this line somewhere else in your file:
from keystoneclient.exceptions import NotFound
I'm not aware of the NotFound exception. Is it something you've written yourself, or perhaps you meant to use a similar-sounding Django exception?
https://docs.djangoproject.com/en/2.0/ref/exceptions/
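Once the exception name is importable, the check can also be written with try/except/else, which keeps the "name is taken" branch out of the try body. This is only a sketch of the pattern: NotFound, ValidationError, find_project and EXISTING_PROJECTS below are stand-ins defined for the demo, not the real keystoneclient or Django objects.

```python
class NotFound(Exception):
    """Stand-in for the keystoneclient NotFound exception (hypothetical here)."""


class ValidationError(Exception):
    """Stand-in for forms.ValidationError (hypothetical here)."""


EXISTING_PROJECTS = {"PROJ-FOOALPHA"}  # made-up data for the demo


def find_project(name):
    """Mimic a lookup that raises NotFound when nothing matches."""
    if name not in EXISTING_PROJECTS:
        raise NotFound(name)
    return name


def clean_projectname(submitted_data):
    newproj = "PROJ-FOO" + submitted_data.upper()
    try:
        find_project(newproj)
    except NotFound:
        # No such project: the submitted name is free to use.
        return submitted_data
    else:
        # The lookup succeeded, so the name is already taken.
        raise ValidationError("The project name is already taken")


print(clean_projectname("beta"))
```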

Create blob from URL in gae using python

I'm trying to create a blob from an image URL, but I can't figure out how to do that.
I've already read Google's documentation about creating blobs, but it only talks about creating a blob from a form using blobstore.create_upload_url('/upload_photo').
I read some questions here, but I didn't find anything that could help me.
My app has a list of image URLs, and I want to save these images into the blobstore so I will be able to serve them afterwards. I think the blobstore is the solution, but if there is a better way to do this, please tell me!
EDIT
I'm trying to use Google Cloud Storage:
class Upload(webapp2.RequestHandler):
    def get(self):
        self.response.write('Upload blob from url')
        url = 'https://500px.com/photo/163151085/evening-light-by-jack-resci'
        url_preview = 'https://drscdn.500px.org/photo/163151085/q%3D50_w%3D140_h%3D140/d3b8d92296f9381a91f6d41b1c607c92?v=3'
        result = urlfetch.fetch(url_preview)
        if result.status_code == 200:
            doSomethingWithResult(result.content)
        self.response.write(url_preview)
def doSomethingWithResult(content):
    gcs_file_name = '/%s/%s' % ('provajack-1993', 'prova.jpg')
    content_type = mimetypes.guess_type('prova.jpg')[0]
    with cloudstorage.open(gcs_file_name, 'w', content_type=content_type, options={b'x-goog-acl': b'public-read'}) as f:
        f.write(content)
    return images.get_serving_url(blobstore.create_gs_key('/gs' + gcs_file_name))
(found on Stack Overflow), but this code gives me an error:
File "/base/data/home/apps/e~places-1993/1.394547465865256081/main.py", line 54, in doSomethingWithResult
with cloudstorage.open(gcs_file_name, 'w', content_type=content_type, options={b'x-goog-acl': b'public-read'}) as f:
AttributeError: 'module' object has no attribute 'open'
I can't understand why. Do I have to set something up in Cloud Storage?
The cloudstorage.open() clearly exists, so the error likely indicates some library/naming conflict in your environment.
A method for debugging such a conflict is described in this post: google-cloud-sdk - Fatal errors on both update and attempted reinstall Mac OSX 10.7.5
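One quick way to check for such a conflict is to look at which file the module was actually loaded from and whether it has the attribute you expect. The sketch below uses the stdlib json module and its dumps function purely as stand-ins for cloudstorage and open. The helper name describe_module is made up for the demo.

```python
import importlib


def describe_module(module_name, attribute):
    """Report where a module was loaded from and whether it has an attribute."""
    module = importlib.import_module(module_name)
    return {
        "file": getattr(module, "__file__", None),
        "has_attribute": hasattr(module, attribute),
    }


# 'json' and 'dumps' stand in for 'cloudstorage' and 'open'.
print(describe_module("json", "dumps"))
```

If the reported file is not inside the library you meant to import (for example, it is a file or folder of the same name in your own project), that is the likely source of the missing attribute.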

Google Analytics and Python

I'm brand new to Python and I'm trying to write an extension to an app that imports GA information and parses it into MySQL. There is a shamefully sparse amount of information on the topic. The Google docs only seem to have examples in JS and Java...
...I have gotten to the point where my user can authenticate into GA using AuthSub. That code is here:
import gdata.service
import gdata.analytics
from django import http
from django import shortcuts
from django.shortcuts import render_to_response
def authorize(request):
    next = 'http://localhost:8000/authconfirm'
    scope = 'https://www.google.com/analytics/feeds'
    secure = False # set secure=True to request secure AuthSub tokens
    session = False
    auth_sub_url = gdata.service.GenerateAuthSubRequestUrl(next, scope, secure=secure, session=session)
    return http.HttpResponseRedirect(auth_sub_url)
So, the next step is getting at the data. I have found this library: (beware, UI is offensive) http://gdata-python-client.googlecode.com/svn/trunk/pydocs/gdata.analytics.html
However, I have found it difficult to navigate. It seems like I should be using gdata.analytics.AnalyticsDataEntry.getDataEntry(), but I'm not sure what it is asking me to pass it.
I would love a push in the right direction. I feel I've exhausted Google looking for a working example.
Thank you!!
EDIT: I have gotten farther, but my problem still isn't solved. The method below returns data (I believe). The error I get is: "'str' object has no attribute '_BecomeChildElement'". I believe I am returning a feed, but I don't know how to drill into it. Is there a way for me to inspect this object?
def auth_confirm(request):
    gdata_service = gdata.service.GDataService('iSample_acctSample_v1.0')
    feedUri='https://www.google.com/analytics/feeds/accounts/default?max-results=50'
    # request feed
    feed = gdata.analytics.AnalyticsDataFeed(feedUri)
    print str(feed)
Maybe this post can help out. It seems there are no Analytics-specific bindings yet, so you are working with the generic gdata.
I've been using GA for a little over a year now, and since about April 2009 I have used the Python bindings supplied in a package called python-googleanalytics by Clint Ecker et al. So far, it works quite well.
Here's where to get it: http://github.com/clintecker/python-googleanalytics.
Install it the usual way.
To use it: First, so that you don't have to manually pass in your login credentials each time you access the API, put them in a config file like so:
[Credentials]
google_account_email = youraccount@gmail.com
google_account_password = yourpassword
Name this file '.pythongoogleanalytics' and put it in your home directory.
And from an interactive prompt type:
from googleanalytics import Connection
import datetime
connection = Connection() # pass in id & pw as strings **if** not in config file
account = connection.get_account('<your GA profile ID goes here>')
start_date = datetime.date(2009, 12, 1)
end_date = datetime.date(2009, 12, 13)
# account object does the work, specify what data you want w/
# 'metrics' & 'dimensions'; see 'USAGE.md' file for examples
account.get_data(start_date=start_date, end_date=end_date, metrics=['visits'])
The 'get_data' method will return a Python list containing your data. In the example above the result is not bound to a variable, so assign it to one if you need it later.
You need 3 files within the app. client_secrets.json, analytics.dat and google_auth.py.
Create a module Query.py within the app:
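class Query(object):
    def __init__(self, startdate, enddate, filter, metrics):
        # dates are sent to the API as 'YYYY-MM-DD' strings
        self.startdate = startdate.strftime('%Y-%m-%d')
        self.enddate = enddate.strftime('%Y-%m-%d')
        # the Core Reporting API uses '==' for an exact match in filters
        self.filter = "ga:medium==" + filter
        self.metrics = metrics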
Example models.py, which has the following function:
import google_auth
service = google_auth.initialize_service()
def total_visit(self):
    object = AnalyticsData.objects.get(utm_source=self.utm_source)
    trial = Query(object.date.startdate, object.date.enddate, object.utm_source, "ga:sessions")
    result = service.data().ga().get(ids = 'ga:<your-profile-id>', start_date = trial.startdate, end_date = trial.enddate, filters= trial.filter, metrics = trial.metrics).execute()
    total_visit = result.get('rows')
    # your save command, e.g. ColumnName.objects.create(data=total_visit), goes here
