Multiple Term search by following multiple users using Streaming API

I am trying to retrieve tweets that match multiple keyword terms from a specific group of users, using the code below.
I posted an earlier question about a ValueError in related code.
I figured that one out, but now I am stuck on the traceback shown after the code.
import tweepy
from tweepy.error import TweepError
consumer_key=('ABC'),
consumer_secret=('ABC'),
access_key=('ABC'),
access_secret=('ABC')
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api=tweepy.API(auth)
class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        try:
            print "%s\t%s\t%s\t%s" % (status.text,
                                      status.author.screen_name,
                                      status.created_at,
                                      status.source,)
        except Exception, e:
            print error
    #def filter(self, follow=None, track=None, async=False, locations=None):
    #self.parameters = {}
    #self.headers['Content-type'] = "application/x-www-form-urlencoded"
    #if self.running:
    #raise TweepError('Stream object already connected!')
    #self.url = '/%i/statuses/filter.json?delimited=length' % STREAM_VERSION
    def filter(self, follow=None, track=None, async=False, locations=None):
        self.parameters = {}
        self.headers['Content-type'] = "application/x-www-form-urlencoded"
        if self.running:
            raise TweepError('Stream object already connected!')
        self.url = '/%i/statuses/filter.json?delimited=length' % STREAM_VERSION
        if obey:
            self.parameters['follow'] = ' '.join(map(str, obey))
        if track:
            self.parameters['track'] = ' '.join(map(str, track))
        if locations and len(locations) > 0:
            assert len(locations) % 4 == 0
            self.parameters['locations'] = ' '.join('%.2f' % l for l in locations)
        self.body = urllib.urlencode(self.parameters)
        self.parameters['delimited'] = 'length'
        self._start(async)
    def on_error(self, status_code):
        return True
streaming_api = tweepy.streaming.Stream(auth, CustomStreamListener(), timeout=60)
list_users = ['17006157','59145948','157009365','16686144','68044757','33338729']#Some ids
list_terms = ['narendra modi','robotics']#Some terms
streaming_api.filter(follow=[list_users])
streaming_api.filter(track=[list_terms])
I am getting a traceback:
Traceback (most recent call last):
File "C:\Python27\nytimes\26052014\Multiple term search with multiple addreses.py", line 49, in <module>
streaming_api.filter(follow=[list_users])
File "build\bdist.win32\egg\tweepy\streaming.py", line 296, in filter
encoded_follow = [s.encode(encoding) for s in follow]
AttributeError: 'list' object has no attribute 'encode'
Please help me resolve the issue.

You define list_users here:
list_users = ['17006157','59145948','157009365','16686144','68044757','33338729']
and then you pass it to streaming_api.filter like this:
streaming_api.filter(follow=[list_users])
When streaming_api.filter iterates over the value you pass as follow, it raises the error
AttributeError: 'list' object has no attribute 'encode'
The reason is as follows. In the call
streaming_api.filter(follow=[list_users])
you intend to pass your list as the value for follow, but because you wrap list_users in an extra pair of [], you create a list inside a list. streaming_api.filter then iterates over follow, calling .encode on each entry, as we see here:
[s.encode(encoding) for s in follow]
The entry s is a list, but it should be a string.
The solution is simple. Change
streaming_api.filter(follow=[list_users])
to
streaming_api.filter(follow=list_users)
To pass a list to a function, just give its name; there is no need to enclose it in [].
The same applies to the last line. Change
streaming_api.filter(track=[list_terms])
to
streaming_api.filter(track=list_terms)
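The effect of the extra brackets can be reproduced without tweepy. The sketch below uses a hypothetical helper, encode_all, that only mirrors the list comprehension shown in the traceback; it is not tweepy's actual implementation.

```python
def encode_all(values, encoding="utf-8"):
    # Same shape as the comprehension in the traceback:
    # every entry must be a string so that .encode exists.
    return [s.encode(encoding) for s in values]

list_users = ['17006157', '59145948']

# Passing the list itself: each entry is a string, so this works.
flat = encode_all(list_users)

# Wrapping the list in [] makes the only entry a list, which has no .encode.
try:
    encode_all([list_users])
except AttributeError as exc:
    nested_error = str(exc)

print(flat)
print(nested_error)
```

The second call fails for the same reason as follow=[list_users]: the function iterates over the outer list and receives the inner list as its single element.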

Related

Python - IBM Watson Speech to Text 'NoneType' object has no attribute 'get_result'

I'm developing a program with IBM Watson Speech to Text and currently using Python 2.7. Here's a stub of some code for development:
class MyRecognizeCallback(RecognizeCallback):
    def __init__(self):
        RecognizeCallback.__init__(self)
    def on_data(self, data):
        pass
    def on_error(self, error):
        pass
    def on_inactivity_timeout(self, error):
        pass
speech_to_text = SpeechToTextV1(username='*goes here*', password='*goes here*')
speech_to_text.set_detailed_response(True)
f = '/home/user/file.wav'
rate, data = wavfile.read(f)
work = data.tolist()
with open(f, 'rb') as audio_file:
    # Get IBM Watson analytics
    currentModel = "en-US_NarrowbandModel" if rate <= 8000 else "en-US_BroadbandModel"
    x = ""
    print(" - " + f)
    try:
        # Callback info
        myRecognizeCallback = MyRecognizeCallback()
        # X represents the response from Watson
        audio_source = AudioSource(audio_file)
        my_result = speech_to_text.recognize_using_websocket(
            audio_source,
            content_type='audio/wav',
            timestamps=True,
            recognize_callback=myRecognizeCallback,
            model=currentModel,
            inactivity_timeout=-1,
            max_alternatives=0)
        x = json.loads(json.dumps(my_result, indent=2), object_hook=lambda d: n
        namedtuple('X', d.keys())(*d.values()))
What I'm expecting to be returned is a JSON object with the results for the file, given the above parameters. What I'm receiving instead is an error that looks like this:
Error received: 'NoneType' object has no attribute 'connected'
That's the entire traceback - no other errors than that. However, when I try to access the JSON object in further code, I get this error:
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/watson_developer_cloud/websocket/recognize_listener.py", line 96, in run
chunk = self.audio_source.input.read(ONE_KB)
ValueError: I/O operation on closed file
Did I forget something or put something in the wrong place?
Edit:
My original code had an error in it that I fixed myself. Regardless, I'm still getting the original error. Here's the update:
my_result = speech_to_text.recognize_using_websocket(
    audio_source,
    content_type='audio/wav',
    timestamps=True,
    recognize_callback=myRecognizeCallback,
    model=currentModel,
    inactivity_timeout=None,
    max_alternatives=None).get_result()
x = json.loads(json.dumps(my_result, indent=2), object_hook=lambda d: namedtuple('X', d.keys())(*d.values()))

Take a look at object_hook=lambda d: n. In Python, lambda d: n means "a function that takes d, ignores d, and returns n".
I'm guessing n is set to None somewhere else.
If that doesn't work, it may be easier to debug if you break your lambda into a separate function, def to_named_tuple(object): perhaps.
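As a minimal sketch of that suggestion, the lambda can be replaced by a named function passed as object_hook. The sample JSON string here is made up for illustration and is not real Watson output; the approach also assumes every JSON key is a valid Python identifier, since namedtuple rejects other field names.

```python
import json
from collections import namedtuple

def to_named_tuple(d):
    """object_hook for json.loads: convert each decoded JSON object to a namedtuple."""
    return namedtuple('X', d.keys())(*d.values())

# Hypothetical payload, only to exercise the hook.
sample = '{"results": [{"final": true, "confidence": 0.9}]}'
x = json.loads(sample, object_hook=to_named_tuple)

print(x.results[0].confidence)
```

With a named function, a breakpoint or a print statement inside to_named_tuple shows exactly which dictionary is being converted when something goes wrong.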

Handling key error in python

The function below parses Cisco command output, stores it in a dictionary, and returns the value for a given key. It works as expected when the dictionary contains output. However, if the command returns no output at all, the dictionary is empty and I get a KeyError. I have added except KeyError:, but this doesn't seem to work.
from qa.ssh import Ssh
import re
class crypto:
    def __init__(self, username, ip, password, machinetype):
        self.user_name = username
        self.ip_address = ip
        self.pass_word = password
        self.machine_type = machinetype
        self.router_ssh = Ssh(ip=self.ip_address,
                              user=self.user_name,
                              password=self.pass_word,
                              machine_type=self.machine_type
                              )
    def session_status(self, interface):
        command = 'show crypto session interface '+interface
        result = self.router_ssh.cmd(command)
        try:
            resultDict = dict(map(str.strip, line.split(':', 1))
                              for line in result.split('\n') if ':' in line)
            return resultDict
        except KeyError:
            return False
Test script:
obj = crypto('uname', 'ipaddr', 'password', 'router')
out = obj.session_status('tunnel0')
status = out['Peer']
print(status)
Error
Traceback (most recent call last):
File "./test_parser.py", line 16, in <module>
status = out['Peer']
KeyError: 'Peer'

The KeyError does not happen inside the function session_status; it happens in your script at status = out['Peer']. So the try and except in session_status never catches it. You should wrap status = out['Peer'] in a try and except instead:
try:
    status = out['Peer']
except KeyError:
    print('no Peer')
or:
status = out.get('Peer', None)

Your exception handler is not in the right place. As you said, your function just returns an empty dictionary. The exception is raised when the key is looked up on that returned empty dictionary, at status = out['Peer']. It might be easier to use the dict get method, status = out.get('Peer', False), or to improve the check inside session_status, for example by testing the length to decide what to return: return False if len(resultDict) == 0.

This explains the problem you're seeing.
The exception happens when you reference out['Peer'] because out is an empty dict. To see where the KeyError exception can come into play, this is how it operates on an empty dict:
out = {}
status = out['Peer']
Throws the error you're seeing. The following shows how to deal with an unfound key in out:
out = {}
try:
    status = out['Peer']
except KeyError:
    status = False
    print('The key you asked for is not here status has been set to False')
Even if the returned object was False, out['Peer'] still fails:
>>> out = False
>>> out['Peer']
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
out['Peer']
TypeError: 'bool' object is not subscriptable
I'm not sure how you should proceed, but dealing with the result of session_status not having the values you need is the way forward, and the try: except: block inside the session_status function isn't doing anything at the moment.
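One way to handle the missing value on the caller's side is sketched below. The function parse_output is a hypothetical stand-in for the parsing step in session_status, and the sample command output is invented for illustration; real router output may differ.

```python
def parse_output(result):
    """Split 'key: value' lines into a dict; empty output gives an empty dict."""
    return dict(
        map(str.strip, line.split(':', 1))
        for line in result.split('\n') if ':' in line
    )

# Invented sample output, not captured from a real device.
sample = "Interface: Tunnel0\nPeer: 10.0.0.1 port 500\nSession status: UP-ACTIVE"

parsed = parse_output(sample)
empty = parse_output("")

# .get returns None (or a chosen default) instead of raising KeyError.
peer = parsed.get('Peer')
missing = empty.get('Peer')

print(peer)
print(missing)
```

Because the function always returns a dict, the caller can use .get everywhere and never has to distinguish between False and a dictionary.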

skipping a json key if does not exist

I'm running the following:
for server in server_list:
    for item in required_fields:
        print item, eval(item)
Some keys may not exist, and worse, the missing value can sit on a parent key rather than on the key I'm scanning for.
So I'm scanning the json for the following key:
server['server_management']['server_total_cost_of_ownership']['description']
Which doesn't exist but it's actually the parent that is null:
server['server_management']['server_total_cost_of_ownership']
How do I write my code to account for this? It's not giving a key error. Right now I get the following traceback:
Traceback (most recent call last):
File "C:/projects/blah/scripts/test.py", line 29, in <module>
print item, eval(item)
File "<string>", line 1, in <module>
TypeError: 'NoneType' object has no attribute '__getitem__'
Full code:
import csv
import json
import os
import requests
import sys
required_fields = ["server['server_name']","server['server_info']['asset_type']['display_name']",
                   "server['asset_status']['display_name']", "server['record_owner']['group_name']",
                   "server['server_management']['server_total_cost_of_ownership']['description']",
                   "server['server_management']['primary_business_owner']['name']",
                   "server['environment']['display_name']", "server['is_virtual']",
                   "server['managed_by']['display_name']", "server['server_info']['billable_ibm']",
                   "server['server_info']['billing_sub_type']['display_name']",
                   "server['server_info']['serial_number']", "server['location']['display_name']",
                   "server['inception_date']", "server['server_info']['decommission_date']" ]
# Query API for all servers
def get_servers_info():
    servers_info = requests.get('url')
    return servers_info.json()
def get_server_info(sid):
    server_info = requests.get('url')
    return server_info.json()
server_list = get_servers_info()
for server in server_list:
    for item in required_fields:
        print item, eval(item)

In fact, you should avoid eval. After the JSON is loaded, since you know the key names, you can use a list of keys to go deeper into the tree:
server['server_management']['primary_business_owner']['name'] => ['server_management', 'primary_business_owner', 'name']
Here is a snippet that validates JSON against a list of required fields.
data = {
    "d": {
        "p": {
            "r": [
                "test"
            ]
        }
    },
    "a": 3
}
def _get_attr(dict_, attrs):
    try:
        src = attrs[:]
        root = attrs.pop(0)
        node = dict_[root]
        null = object()
        for i, attr in enumerate(attrs[:]):
            try:
                node = node.get(attr, null)
            except AttributeError:
                node = null
            if node is null:
                # i+2 pop and last element
                raise ValueError("%s not present (level %s)" % (attr, '->'.join(src[: i+2])))
        return node
    except KeyError:
        raise ValueError("%s not present" % root)
# assume list of required field
reqs = [
    ["d", "p", "r"],
    ["d"],
    ["k"],
    ["d", "p", "r", "e"],
]
for req in reqs:
    try:
        _get_attr(data, req)
    except ValueError as E:
        print(E)
# prints
# k not present
# e not present (level d->p->r->e)

Ignoring the context of the code and not understanding the use of eval here, the way to do this is to use .get() and seed it with reasonable defaults.
For example:
server['server_management']['server_total_cost_of_ownership']['description']
Can be:
server.get('server_management', {}).get('server_total_cost_of_ownership', {}).get('description', '')
Then if any of the keys do not exist you will always get back an empty description ''.
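Note that the default passed to .get() is used only when the key is absent. When the key is present but its value is null (None in Python), as with server_total_cost_of_ownership in the question, .get() returns None and the next .get() in the chain fails. A minimal sketch of a helper that covers both cases follows; get_path and the sample server dict are hypothetical names and data introduced here for illustration.

```python
def get_path(mapping, keys, default=''):
    """Follow keys through nested dicts; return default if a step is missing or None."""
    node = mapping
    for key in keys:
        if not isinstance(node, dict):
            # Parent was None (JSON null) or some other non-dict value.
            return default
        node = node.get(key)
    return default if node is None else node

# Made-up record with a null parent, similar to the case in the question.
server = {
    'server_management': {
        'server_total_cost_of_ownership': None,
        'primary_business_owner': {'name': 'ops'},
    }
}

description = get_path(server, ['server_management', 'server_total_cost_of_ownership', 'description'])
owner = get_path(server, ['server_management', 'primary_business_owner', 'name'])
absent = get_path(server, ['environment', 'display_name'])

print(repr(description), owner, repr(absent))
```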

Your problem here is totally unrelated to using eval[1]. The exception you get is the same as if the code would have been there directly. What you are running (via eval) is:
a = server['server_management']
b = a['server_total_cost_of_ownership']
c = b['description']
Yet, b is None, so resolving it to c will fail. Like a KeyError, you can also catch a TypeError:
for server in server_list:
    for item in required_fields:
        try:
            print item, eval(item)
        except TypeError:
            print("Guess you're lucky you didn't include a fork bomb in your own code to eval.")
You may of course alternatively pass, print the offending item, open a browser to some page or do whatever error handling is appropriate given your input data.
[1] While not bickering around, I've made a new answer that works without eval. You can use precisely the same error handling:
for server in server_list:
    for item in required_fields:
        value = server
        for key in parse_fields(item):
            try:
                value = value[key]
            except TypeError:
                print("Remember Kiddo: Eval is Evil!")
                break
        else: # for: else: triggers only if no break was issued
            print item, value
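The helper parse_fields is not defined in this answer. One possible version, assuming the field strings keep the server['a']['b'] form used in required_fields, extracts the quoted keys with a regular expression. Both parse_fields as written here and the lookup wrapper are guesses at the intent, not the original author's code.

```python
import re

def parse_fields(item):
    """Turn "server['a']['b']" into ['a', 'b'] (assumes single-quoted keys)."""
    return re.findall(r"\['([^']+)'\]", item)

def lookup(server, item):
    """Walk the parsed keys; return None if a key is missing or a parent is null."""
    value = server
    for key in parse_fields(item):
        try:
            value = value[key]
        except (TypeError, KeyError):
            return None
    return value

# Made-up record for illustration.
server = {'server_name': 'web01', 'server_management': {'server_total_cost_of_ownership': None}}

keys = parse_fields("server['server_info']['asset_type']['display_name']")
name = lookup(server, "server['server_name']")
cost = lookup(server, "server['server_management']['server_total_cost_of_ownership']['description']")

print(keys, name, cost)
```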

AttributeError: 'unicode' object has no attribute 'success'

I have a simple script using requests to validate a list of emails. Relevant code:
def ___process_email(email, output_file=None):
    profile = request(email)
    if profile and profile.success != 'nothing_useful':
        logger.info('Found match for {0}'.format(email))
        print(profile)
        if output_file:
            output_file.write(str(profile) + '\n')
    else:
        print("No information found\n")
This ran through 5 loops successfully then threw:
Traceback (most recent call last):
  File "app.py", line 147, in <module>
    main()
  File "app.py", line 141, in main
    ___process_email(arg, output)
  File "app.py", line 107, in ___process_email
    if profile and profile.success != 'nothing_useful':
AttributeError: 'unicode' object has no attribute 'success'
Here's the model:
class Profile(object):
    def __init__(self, person):
        if person:
            self.name = person.get('name')
            self.jobinfo = [
                (occupation.get('job_title'), occupation.get('company'))
                for occupation in person.get('occupations', [])
            ]
            self.memberships = [
                (membership.get('site_name'), membership.get('profile_url'))
                for membership in person.get('memberships', [])
            ]
            self.success = person.get('success')
    def __str__(self):
        return dedent("""
            Name: {0}
            {1}
            {2}
        """).format(
            self.name,
            "\n".join(
                "{0} {1}".format(title, company)
                for title, company in self.jobinfo),
            "\n".join(
                "\t{0} {1}".format(site_name, url)
                for site_name, url in self.memberships)
        )
Request:
import requests
def request(email):
    status_url = STATUS_URL.format(email)
    response = requests.get(status_url).json()
    session_token = response.get('session_token')
    # fail gracefully if there is an error
    if 'error' in response:
        return response['error']
    elif response['status'] == 200 and session_token:
        logger.debug('Session token: {0}'.format(session_token))
        url = URL.format(email)
        headers = {'X-Session-Token': session_token}
        response = requests.get(url, headers=headers).json()
        if response.get('success') != 'nothing_useful':
            return Profile(response.get('contact'))
    return {}
Anyone see why my strings are unicode? thanks

If there is an error in the response, you return the error string:
if 'error' in response:
return response['error']
That's your unicode value there. Note that the same function returns either the 'error' value, a new Profile() instance, or an empty dictionary. You may want to make this more consistent and return only Profile() instances or None.
Instead of returning the error string, raise an exception and handle it in your ___process_email function:
class EmailValidationError(Exception):
    pass
and in your request() function:
if 'error' in response:
    raise EmailValidationError(response['error'])
then handle this in ___process_email() with something like:
try:
    profile = request(email)
    if profile and profile.success != 'nothing_useful':
        logger.info('Found match for {0}'.format(email))
        print(profile)
        if output_file:
            output_file.write(str(profile) + '\n')
    else:
        print("No information found\n")
except EmailValidationError:
    # Do something here
    pass
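The pattern can be tried without any network access. In the sketch below, check_response and process are hypothetical stand-ins for request() and ___process_email(), and the response dictionaries are invented; only the raise-and-catch structure is the point.

```python
class EmailValidationError(Exception):
    """Raised when the service reports an error for an email lookup."""

def check_response(response):
    # Raise instead of returning the error string, so callers
    # never receive a str where they expect a profile.
    if 'error' in response:
        raise EmailValidationError(response['error'])
    return response.get('contact')

def process(response):
    try:
        contact = check_response(response)
    except EmailValidationError as exc:
        return 'error: {0}'.format(exc)
    return 'found' if contact else 'nothing'

print(process({'error': 'rate limited'}))
print(process({'contact': {'name': 'A'}}))
print(process({}))
```

The caller now sees exactly two kinds of outcome: a normal return value or an exception, which keeps the attribute access on the result safe.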

Python Invalid Syntax

Below is the code I have been working on.
The very last line, write_csv('twitter_gmail.csv', messages, append=True), throws a NameError:
[ec2-user@ip-172-31-46-164 ~]$ ./twitter_test16.sh
Traceback (most recent call last):
File "./twitter_test16.sh", line 53, in <module>
write_csv('twitter_gmail.csv', messages, append=True)
NameError: name 'messages' is not defined
I have messages defined, so I don't understand why it would do that.
import csv
import json
import oauth2 as oauth
import urllib
import sys
import requests
import time
CONSUMER_KEY = "
CONSUMER_SECRET = "
ACCESS_KEY = "
ACCESS_SECRET = "
class TwitterSearch:
    def __init__(self, ckey=CONSUMER_KEY, csecret=CONSUMER_SECRET,
                 akey=ACCESS_KEY, asecret=ACCESS_SECRET,
                 query='https://api.twitter.com/1.1/search/tweets.{mode}?{query}'
                 ):
        consumer = oauth.Consumer(key=ckey, secret=csecret)
        access_token = oauth.Token(key=akey, secret=asecret)
        self.client = oauth.Client(consumer, access_token)
        self.query = query
    def search(self, q, mode='json', **queryargs):
        queryargs['q'] = q
        query = urllib.urlencode(queryargs)
        return self.client.request(self.query.format(query=query, mode=mode))
def write_csv(fname, rows, header=None, append=False, **kwargs):
    filemode = 'ab' if append else 'wb'
    with open(fname, filemode) as outf:
        out_csv = csv.writer(outf, **kwargs)
        if header:
            out_csv.writerow(header)
        out_csv.writerows(rows)
def main():
    ts = TwitterSearch()
    response, data = ts.search('#gmail.com', result_type='recent')
    js = json.loads(data)
    messages = ([msg['created_at'], msg['txt'], msg['user']['id']] \
                for msg in js.get('statuses', []))
write_csv('twitter_gmail.csv', messages, append=True)

The previous line is missing a parenthesis.
messages = ([msg['created_at'], msg['txt'], msg['user']['id']] for msg in js.get('statuses', [])
Should be:
messages = ([msg['created_at'], msg['txt'], msg['user']['id']] for msg in js.get('statuses', []))
I'm surprised that it works when you change to print? Are you also changing the comprehension when you do that?
You asked why the line number of the error was after the bad syntax?
Try putting this in line one of a file and running it, and note the line of the SyntaxError.
a = (]
Then try this and check out the line number:
a = (
b = "some stuff"
Finally, try this:
a = (
b = "some stuff"
Think about when you would know that the programmer had made a Python-illegal typo if you were reading the code and carrying it out with pen and paper.
Basically, a SyntaxError is raised as soon as it can be unambiguously determined that invalid syntax was used, which is often immediately after the statement where the mistake was made rather than exactly at it.
You'll frequently get line numbers on SyntaxErrors that are a line (or several lines, if there are empty lines or a corner case) below the actual typo.
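To check which line your own interpreter reports, the snippets can be compiled from strings and the line number read from the exception. This is a small sketch using the built-in compile(); the helper name syntax_error_line is made up here. The reported line depends on the Python version: older interpreters tend to point at a later line, while newer releases may point back at the unclosed bracket, so no particular output is claimed.

```python
def syntax_error_line(source):
    """Compile source and return the line number Python reports, or None if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.lineno
    return None

# The typo (the unclosed parenthesis) is on line 1 in both snippets.
adjacent_lines = 'a = (\nb = "some stuff"\n'
with_blank_lines = 'a = (\n\n\n\nb = "some stuff"\n'

for label, source in [("adjacent", adjacent_lines), ("blank lines", with_blank_lines)]:
    print(label, "-> reported line", syntax_error_line(source))
```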
