Extract timestamp and username when streaming tweets using tweepy

I have the following class, which extracts tweets in real time containing a given hashtag, #Today:
import logging
from tweepy import StreamingClient, StreamRule
class TweetListener(StreamingClient):
    def on_data(self, raw_data):
        # Log the raw payload and forward it to Kafka unchanged
        logging.info(raw_data)
        producer.send(topic_name, value=raw_data)
        return True
    def on_error(self, status_code):
        if status_code == 420:
            return False
    def start_streaming_tweets(self):
        rule = StreamRule(value="#Today lang:en")
        self.add_rules(rule)
        self.filter()
However, the object sent this way looks something like:
ConsumerRecord(topic='twitter', partition=0, offset=46, timestamp=1675201799030, timestamp_type=0, key=None, value=b'{"data":{"edit_history_tweet_ids":["16205398989347923"],"id":"16205398989347923","text":"#Today is a great day!"},"matching_rules":[{"id":"16238748236833856","tag":""}]}', headers=[], checksum=None, serialized_key_size=-1, serialized_value_size=196, serialized_header_size=-1)
And so, I don't have any info about the user, the time of publication, the number of likes... Is there any way to get this info?
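One possible direction, sketched here under assumptions: the Twitter API v2 only returns id and text by default, and in tweepy 4.x StreamingClient.filter() accepts keyword arguments such as expansions, tweet_fields and user_fields (for example self.filter(expansions=["author_id"], tweet_fields=["created_at", "public_metrics"], user_fields=["username"])); this is worth checking against the installed tweepy version. With those fields requested, the raw payload should carry an "includes" section next to "data". The helper name extract_tweet_info and the sample payload below are hypothetical, made up only to show the parsing step:

```python
import json


def extract_tweet_info(raw_data):
    """Pull text, timestamp, username and like count out of a v2 stream payload."""
    payload = json.loads(raw_data)
    tweet = payload["data"]
    # Expanded user objects arrive separately under "includes"; index them by id.
    users = {u["id"]: u for u in payload.get("includes", {}).get("users", [])}
    author = users.get(tweet.get("author_id"), {})
    return {
        "text": tweet.get("text"),
        "created_at": tweet.get("created_at"),
        "username": author.get("username"),
        "likes": tweet.get("public_metrics", {}).get("like_count"),
    }


# Hypothetical payload, shaped like a v2 response with the extra fields requested.
sample = json.dumps({
    "data": {
        "id": "1",
        "text": "#Today is a great day!",
        "author_id": "42",
        "created_at": "2023-01-31T21:49:59.000Z",
        "public_metrics": {"like_count": 3},
    },
    "includes": {"users": [{"id": "42", "username": "alice"}]},
}).encode()

info = extract_tweet_info(sample)
print(info["username"], info["created_at"], info["likes"])
```

The same function could be called inside on_data before producer.send, so that only the fields of interest are sent to Kafka.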

Related

Is it possible to test tweepy function module with pytest

I'm using tweepy for an application that uses Twitter. To check that my code does what it is expected to do, I want to test it. But here's the thing: the tweepy module ultimately consists of API requests. For example, I want to test the following is_already_liked function:
import tweepy
class twitter:
    def __init__(self):
        self.auth = tweepy.OAuth1UserHandler(
            API_KEY, API_KEY_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET
        )
        self.api = tweepy.API(self.auth)
    def is_already_liked(self, tweet_id: int) -> bool:
        """Checks if the tweet is already liked
        Args:
            tweet_id (int): id of the tweet
        Returns:
            bool: True if the tweet is already liked else False
        """
        return self.api.get_status(tweet_id).favorited
I would like to test it on a fake tweet_id that I would create. Is there a way to do something like that with pytest and some mocking?

Here is one way to do it, using the mocker fixture from pytest-mock.
In test_script.py:
# module.submodule.script: where you define or import twitter
from module.submodule.script import twitter
class Tweet:
    def __init__(self, favorited):
        self.favorited = favorited
def test_is_already_liked(mocker):
    # Patch tweepy.API in the module where the twitter class looks it up
    api = mocker.patch("module.submodule.script.tweepy.API")
    api.return_value.get_status.side_effect = (
        lambda x: Tweet(True) if x == "123456" else Tweet(False)
    )
    t = twitter()
    assert t.is_already_liked("123456")
    assert not t.is_already_liked("654321")
Then run:
pytest .\tests\test_script.py
# Output: 1 passed
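If pytest-mock is not available, a similar idea can be sketched with the standard library's unittest.mock alone. This sketch assumes the class is changed so that the API client is passed in rather than built in __init__; the Twitter class and the make_fake_api helper below are hypothetical stand-ins, not the code from the question:

```python
from unittest.mock import MagicMock


class Twitter:
    """Stand-in for the question's class; the API client is injected (an assumption)."""

    def __init__(self, api):
        self.api = api

    def is_already_liked(self, tweet_id):
        return self.api.get_status(tweet_id).favorited


def make_fake_api(liked_ids):
    """Build a fake client whose get_status() reports 'favorited' for the given ids."""
    api = MagicMock()
    api.get_status.side_effect = lambda tweet_id: MagicMock(
        favorited=tweet_id in liked_ids
    )
    return api


def test_is_already_liked():
    t = Twitter(make_fake_api({123456}))
    assert t.is_already_liked(123456)
    assert not t.is_already_liked(654321)


test_is_already_liked()
```

Injecting the client keeps the test free of patch-target strings, at the cost of a small change to the constructor.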

Python: why robinhood API doesn't response to request to push ticker into watchlist?

Hey guys, I have a problem with the robin-stocks library. Authentication is working fine, so I didn't post the authentication lines. What I'm trying to do is push a list of stocks into a watchlist in my RH account. The library has a function called "robin_stocks.account.post_symbols_to_watchlist(inputSymbols, name='Default')", documented here: https://robin-stocks.readthedocs.io/en/latest/functions.html#robin_stocks.account.unlink_bank_account
Here's the code I'm trying:
import robin_stocks.helper as helper
import robin_stocks.urls as urls
inputSymbolslist=['NKLA']
def post_symbols_to_watchlist(inputSymbols, name='Andre3'):
    """Posts multiple stock tickers to a watchlist.
    :param inputSymbols: May be a single stock ticker or a list of stock tickers.
    :type inputSymbols: str or list
    :param name: The name of the watchlist to post data to.
    :type name: Optional[str]
    :returns: Returns result of the post request.
    """
    symbols = helper.inputs_to_set(inputSymbols)
    payload = {
        'symbols': ','.join(symbols)
    }
    print(payload)
    url = urls.watchlists(name, True)
    print(url)
    data = helper.request_post(url, payload)
    print(data)
    return(data)
post_symbols_to_watchlist(inputSymbolslist,name='Andre3')
OUTPUT from command prompt line:
{'symbols': 'NKLA'}
https://api.robinhood.com/watchlists/Andre3/bulk_add/
{'detail': 'Not found.'}
Can you guys take a look and let me know what I'm doing wrong? It's possible the API has changed recently and the library wasn't updated accordingly. Appreciate your help!
Andre

The documentation you refer to and quoted in your code clearly states:
Parameters:
inputSymbols (str or list) – May be a single stock ticker or a list of stock tickers.
not JSON.
Try to call Robinhood's function with string or list of strings as first parameter.
post_symbols_to_watchlist(['NKLA'], name='Andre3')

Adding more information on my question.
The code I posted calls into the library's helper and urls modules.
HELPER CODE found here: https://robin-stocks.readthedocs.io/en/latest/_modules/
URLS CODE found here:
"""Contains all the url endpoints for interacting with Robinhood API."""
from robin_stocks.helper import id_for_chain, id_for_stock
# Login
def login_url():
    return('https://api.robinhood.com/oauth2/token/')
def challenge_url(challenge_id):
    return('https://api.robinhood.com/challenge/{0}/respond/'.format(challenge_id))
# Profiles
def account_profile():
    return('https://api.robinhood.com/accounts/')
def basic_profile():
    return('https://api.robinhood.com/user/basic_info/')
def investment_profile():
    return('https://api.robinhood.com/user/investment_profile/')
def portfolio_profile():
    return('https://api.robinhood.com/portfolios/')
def security_profile():
    return('https://api.robinhood.com/user/additional_info/')
def user_profile():
    return('https://api.robinhood.com/user/')
def portfolis_historicals(account_number):
    return('https://api.robinhood.com/portfolios/historicals/{0}/'.format(account_number))
# Stocks
def earnings():
    return('https://api.robinhood.com/marketdata/earnings/')
def events():
    return('https://api.robinhood.com/options/events/')
def fundamentals():
    return('https://api.robinhood.com/fundamentals/')
def historicals():
    return('https://api.robinhood.com/quotes/historicals/')
def instruments():
    return('https://api.robinhood.com/instruments/')
def news(symbol):
    return('https://api.robinhood.com/midlands/news/{0}/?'.format(symbol))
def popularity(symbol):
    return('https://api.robinhood.com/instruments/{0}/popularity/'.format(id_for_stock(symbol)))
def quotes():
    return('https://api.robinhood.com/quotes/')
def ratings(symbol):
    return('https://api.robinhood.com/midlands/ratings/{0}/'.format(id_for_stock(symbol)))
def splits(symbol):
    return('https://api.robinhood.com/instruments/{0}/splits/'.format(id_for_stock(symbol)))
# account
def phoenix():
    return('https://phoenix.robinhood.com/accounts/unified')
def positions():
    return('https://api.robinhood.com/positions/')
def banktransfers():
    return('https://api.robinhood.com/ach/transfers/')
def cardtransactions():
    return('https://minerva.robinhood.com/history/transactions/')
def daytrades(account):
    return('https://api.robinhood.com/accounts/{0}/recent_day_trades/'.format(account))
def dividends():
    return('https://api.robinhood.com/dividends/')
def documents():
    return('https://api.robinhood.com/documents/')
def linked(id=None, unlink=False):
    if unlink:
        return('https://api.robinhood.com/ach/relationships/{0}/unlink/'.format(id))
    if id:
        return('https://api.robinhood.com/ach/relationships/{0}/'.format(id))
    else:
        return('https://api.robinhood.com/ach/relationships/')
def margin():
    return('https://api.robinhood.com/margin/calls/')
def margininterest():
    return('https://api.robinhood.com/cash_journal/margin_interest_charges/')
def notifications(tracker=False):
    if tracker:
        return('https://api.robinhood.com/midlands/notifications/notification_tracker/')
    else:
        return('https://api.robinhood.com/notifications/devices/')
def referral():
    return('https://api.robinhood.com/midlands/referral/')
def stockloan():
    return('https://api.robinhood.com/stock_loan/payments/')
def subscription():
    return('https://api.robinhood.com/subscription/subscription_fees/')
def wiretransfers():
    return('https://api.robinhood.com/wire/transfers')
def watchlists(name=None, add=False):
    if add:
        return('https://api.robinhood.com/watchlists/{0}/bulk_add/'.format(name))
    if name:
        return('https://api.robinhood.com/midlands/lists/items/')
    else:
        return('https://api.robinhood.com/midlands/lists/default/')
def watchlist_delete(name, instrument):
    return('https://api.robinhood.com/watchlists/{}/{}/instruments'.format(name,instrument))
    #return('https://api.robinhood.com/watchlists/{}/{}/'.format(name,instrument))
# markets
def currency():
    return('https://nummus.robinhood.com/currency_pairs/')
def markets():
    return('https://api.robinhood.com/markets/')
def market_hours(market, date):
    return('https://api.robinhood.com/markets/{}/hours/{}/'.format(market, date))
def movers_sp500():
    return('https://api.robinhood.com/midlands/movers/sp500/')
def get_100_most_popular():
    return('https://api.robinhood.com/midlands/tags/tag/100-most-popular/')
def movers_top():
    return('https://api.robinhood.com/midlands/tags/tag/top-movers/')
def market_category(category):
    return('https://api.robinhood.com/midlands/tags/tag/{}/'.format(category))
# options
def aggregate():
    return('https://api.robinhood.com/options/aggregate_positions/')
def chains(symbol):
    return('https://api.robinhood.com/options/chains/{0}/'.format(id_for_chain(symbol)))
def option_historicals(id):
    return('https://api.robinhood.com/marketdata/options/historicals/{0}/'.format(id))
def option_instruments(id=None):
    if id:
        return('https://api.robinhood.com/options/instruments/{0}/'.format(id))
    else:
        return('https://api.robinhood.com/options/instruments/')
def option_orders(orderID=None):
    if orderID:
        return('https://api.robinhood.com/options/orders/{0}/'.format(orderID))
    else:
        return('https://api.robinhood.com/options/orders/')
def option_positions():
    return('https://api.robinhood.com/options/positions/')
def marketdata_options(id):
    return('https://api.robinhood.com/marketdata/options/{0}/'.format(id))
# pricebook
def marketdata_quotes(id):
    return ('https://api.robinhood.com/marketdata/quotes/{0}/'.format(id))
def marketdata_pricebook(id):
    return ('https://api.robinhood.com/marketdata/pricebook/snapshots/{0}/'.format(id))
# crypto
def order_crypto():
    return('https://nummus.robinhood.com/orders/')
def crypto_account():
    return('https://nummus.robinhood.com/accounts/')
def crypto_currency_pairs():
    return('https://nummus.robinhood.com/currency_pairs/')
def crypto_quote(id):
    return('https://api.robinhood.com/marketdata/forex/quotes/{0}/'.format(id))
def crypto_holdings():
    return('https://nummus.robinhood.com/holdings/')
def crypto_historical(id):
    return('https://api.robinhood.com/marketdata/forex/historicals/{0}/'.format(id))
def crypto_orders(orderID=None):
    if orderID:
        return('https://nummus.robinhood.com/orders/{0}/'.format(orderID))
    else:
        return('https://nummus.robinhood.com/orders/')
def crypto_cancel(id):
    return('https://nummus.robinhood.com/orders/{0}/cancel/'.format(id))
# orders
def cancel(url):
    return('https://api.robinhood.com/orders/{0}/cancel/'.format(url))
def option_cancel(id):
    return('https://api.robinhood.com/options/orders/{0}/cancel/'.format(id))
def orders(orderID=None):
    if orderID:
        return('https://api.robinhood.com/orders/{0}/'.format(orderID))
    else:
        return('https://api.robinhood.com/orders/')

Try sending a comma-delimited string of ticker symbols as the parameter instead of just one ticker symbol. When I played around with this functionality I could only get it to work if I had more than one ticker in the string. Try "NKLA,AAPL" instead of "NKLA" and see if that works. If it does then maybe you can send a parameter string like "NKLA,NKLA" to get it to register the ticker in the watchlist.

Python : storing class instance function in dict

I am trying to write a ChatBot program that responds to each user differently.
So I implemented it like this: when a new user asks the bot to do something and the bot needs to ask the user for more information and wait for the reply, my code registers a dict entry whose key is the user_id and whose value is a callback method of class User, as in the example code below.
class User:
    api_dict = {}
    def __init__(self, user_id):
        self.user_id = user_id
    def ask_username(self,chat_env):
        chat_env.send_msg(self.user_id,"Please enter your username")
        api_dict[self.user_id] = self.ask_birth_date
    def ask_birth_date(self,message,chat_env):
        chat_env.send_msg(self.user_id,"Mr. {} what is your birth date".format(message))
        # do some thing
def hook_function(user_id,message,chat_env):
    if is_first_hook(user_id):
        user = User(user_id)
        user.ask_username()
    else:
        User.api_dict[user_id](message,chat_env)
But it does not work: Python throws an error saying that the chat_env parameter was not received, which I think means self wasn't passed to ask_birth_date().
So is there any way to keep self attached to ask_birth_date()?

I think that you must be storing all the instances of User somewhere to be able to call ask_username when a connection is first made. Therefore you can transform api_dict into a state pattern.
Users = {}  # I'm guessing you already have this!
class User:
    def __init__(self, user_id):
        self.user_id = user_id
    def ask_username(self, chat_env):
        chat_env.send_msg(self.user_id, "Please enter your username")
        self.current = self.ask_birth_date
    def ask_next_question(self, message, chat_env):
        self.current(message, chat_env)
    def ask_birth_date(self, message, chat_env):
        chat_env.send_msg(self.user_id, "Mr. {} what is your birth date".format(message))
        self.current = self.record_birth_date  # for example
# There must be code that does this already
def new_connection(user_id, chat_env):
    Users[user_id] = User(user_id)
    Users[user_id].ask_username(chat_env)
# I assume this is called for every message that arrives from a user
def hook_function(user_id, message, chat_env):
    Users[user_id].ask_next_question(message, chat_env)
Update:
Your hook function doesn't call ask_username() properly. That is why you are getting the error. That is why you should post all your code and the whole of the stack trace!
This code should fix your call site:
def hook_function(user_id, message, chat_env):
    if is_first_hook(user_id):
        user = User(user_id)
        user.ask_username(chat_env)  # add param here!
        # btw the user instance here is thrown away!
    else:
        User.api_dict[user_id](message, chat_env)
If the above fixes your problems, then that means that the User class is unnecessary. You could just have api_dict as a global and the methods can become free functions.
Your code can be reduced to this:
api_dict = {}
def ask_username(chat_env, user_id):
    chat_env.send_msg(user_id, "Please enter your username")
    api_dict[user_id] = ask_birth_date
def ask_birth_date(chat_env, user_id, message):
    chat_env.send_msg(user_id, "Mr. {} what is your birth date".format(message))
    # do some thing
def hook_function(user_id, message, chat_env):
    if is_first_hook(user_id):
        ask_username(chat_env, user_id)
    else:
        api_dict[user_id](chat_env, user_id, message)
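On the original worry about self: a bound method such as self.ask_birth_date already carries its instance, so storing it in a dict and calling it later needs no extra argument for self. The following is a small runnable sketch of that behaviour. FakeChatEnv and the callbacks dict are hypothetical names introduced for the demonstration, and the dict membership check stands in for is_first_hook, whose implementation is not shown in the question:

```python
class FakeChatEnv:
    """Hypothetical chat environment that just records outgoing messages."""

    def __init__(self):
        self.sent = []

    def send_msg(self, user_id, text):
        self.sent.append((user_id, text))


class User:
    callbacks = {}  # user_id -> bound method to call on the next message

    def __init__(self, user_id):
        self.user_id = user_id

    def ask_username(self, chat_env):
        chat_env.send_msg(self.user_id, "Please enter your username")
        # The bound method keeps a reference to self, so the instance stays alive.
        User.callbacks[self.user_id] = self.ask_birth_date

    def ask_birth_date(self, message, chat_env):
        chat_env.send_msg(
            self.user_id, "Mr. {} what is your birth date".format(message)
        )
        del User.callbacks[self.user_id]


def hook_function(user_id, message, chat_env):
    if user_id not in User.callbacks:  # stands in for is_first_hook()
        User(user_id).ask_username(chat_env)
    else:
        User.callbacks[user_id](message, chat_env)


env = FakeChatEnv()
hook_function(7, "hi", env)
hook_function(7, "Smith", env)
print(env.sent)
```

Only message and chat_env are passed at the call site; self comes from the stored bound method.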

Set tweets counts for each items in tweepy stream

I have a problem that I can't find a solution to.
I have written a Python script to stream tweets from Twitter.
My issue is that I need to read 5 tweets for each word in the given list.
Below is the code:
import sys
import time
from tweepy import Stream
from tweepy.streaming import StreamListener
class TweetListener(StreamListener):
    def on_status(self,status):
        print "TWEET ARRIVED!!!"
        print "Tweet Text : %s" % status.text
        print "Author's name : %s" % status.author.screen_name
        print "Time of creation : %s" % status.created_at
        print "Source of Tweet : %s" % status.source
        time.sleep(10)
        return True
    def on_error(self, status):
        print status
        if status == 420:
            print "Too soon reconnected, Exiting!!"
            return False
        sys.exit()
def search_tweets():
    twitterStream = Stream(connect().auth, TweetListener())
    twitterStream.filter(track=['Cricket','Maths','Army','Sports'],languages = ["en"],async=True)
Here I need to get 5 tweets each for Cricket, Maths, Army & Sports
What I am getting is an infinite number of tweets for the above elements.
Any help will be highly appreciated.
Thanks & regards.

class TweetListener(StreamListener):
    def __init__(self, list_=None,dict_= None):
        self.keys_= list_
        self.dict = dict_
    def on_status(self, status):
        str_ = status.text.lower()
        for key in self.dict.keys():
            if key.lower() in str_.lower():
                if self.dict[key] <= 0:
                    return True
                else:
                    self.dict[key] -=1
                    self.performAction(key,status)
        if all(value == 0 for value in self.dict.values()):
            return False
    def on_error(self, status):
        print status
        if status == 420:
            print "Too soon reconnected . Will terminate the program"
            return False
        sys.exit()
def create_dict(list_):
    no_of_tweets = 5
    dict_ = {k:no_of_tweets for k in list_ }
    return dict_
def search_tweets():
    search_word = ['Cricket','Maths','Army','Sports']
    twitterStream = Stream(connect().auth, TweetListener(list_=search_word , dict_=create_dict(search_word)))
    twitterStream.filter(track=search_word ,languages = ["en"],async=True)
Here I initialize a list with all the required words that are to be searched for tweets, then I create a dictionary with key:value as word_to_be_searched:count_as_5 in the create_dict(list_) function, like Cricket:5, Maths:5, Army:5, Sports:5 and so on. Then I pass the list along with the dictionary to the TweetListener class.
I override the on_status function to retrieve tweets and then compare the tweets with the key field of my dictionary. It is obvious there will be a match and then, in that case, I decrease the value(as counter here) by 1.
When all the values become 0, then I return false to break the loop and close the thread.
[Note, if any value corresponding to a key has become zero, it indicates that the required no of tweets are already captured so we will not proceed with any more tweets on that word.]
Then in the performAction(key, status) function {key=one of the searched words and status = tweet captured} I perform my required task.
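The counting logic described above can be tried without a live stream. The sketch below is a library-free approximation, not the answer's exact code: KeywordQuota and offer are hypothetical names, the quota is 2 rather than 5 to keep the demo short, and a plain list of strings plays the role of the incoming tweets. Its offer() method returns False once every keyword has reached its quota, which corresponds to returning False from on_status to stop the stream.

```python
class KeywordQuota:
    """Track how many more texts are still wanted for each keyword."""

    def __init__(self, keywords, per_keyword=5):
        self.remaining = {k: per_keyword for k in keywords}
        self.collected = {k: [] for k in keywords}

    def offer(self, text):
        """Record text under each keyword with quota left; False once all are full."""
        lowered = text.lower()
        for key, left in self.remaining.items():
            if left > 0 and key.lower() in lowered:
                self.remaining[key] -= 1
                self.collected[key].append(text)
        return any(left > 0 for left in self.remaining.values())


quota = KeywordQuota(["Cricket", "Maths"], per_keyword=2)
stream = [
    "cricket one", "maths one", "cricket two",
    "cricket three", "maths two", "maths three",
]
for text in stream:
    if not quota.offer(text):
        break  # every keyword has reached its quota

print(quota.collected)
```

Unlike the listener above, this version keeps scanning the remaining keywords after one of them is full, so a tweet that mentions two words can still count toward the one with quota left.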

Factoring out asynchronous code involving tornado.gen.Task

I have numerous tornado.web.RequestHandler classes that test for authorized access using id and access key secure cookies. I access mongodb asynchronously with inline callbacks using gen.Task. I am having trouble figuring out a way to factor out the repetitive code because of its asynchronicity. How can I do this?
class MyHandler(RequestHandler):
    @tornado.web.asynchronous
    @gen.engine
    def get(self):
        id = self.get_secure_cookie('id', None)
        accesskey = self.get_secure_cookie('accesskey', None)
        if not id or not accesskey:
            self.redirect('/a_public_area')
            return
        try:
            # convert to bson id format to access mongodb
            bson.objectid.ObjectId(id)
        except:
            # if not valid object id
            self.redirect('/a_public_area')
            return
        found_id, error = yield gen.Task(asyncmong_client_inst.collection.find_one,
                                         {'_id': id, 'accesskey': accesskey}, fields={'_id': 1})
        if error['error']:
            raise HTTPError(500)
            return
        if not found_id[0]:
            self.redirect('/a_public_area')
            return
        # real business code follows
I would like to factor the above into a function that yields perhaps an HTTP status code.

Tornado has the @tornado.web.authenticated decorator. Let's use it.
class BaseHandler(RequestHandler):
    def get_login_url(self):
        return u"/a_public_area"
    @gen.engine  # Not sure about this step
    def get_current_user(self):
        id = self.get_secure_cookie('id', None)
        accesskey = self.get_secure_cookie('accesskey', None)
        if not id or not accesskey:
            return False
        # Are you sure you need this?
        try:
            # convert to bson id format to access mongodb
            bson.objectid.ObjectId(id)
        except:
            # if not valid object id
            return False
        # I believe that you don't need asynchronous mongo on the auth query, so if it's not working, replace it with a sync call
        found_id, error = yield gen.Task(asyncmong_client_inst.collection.find_one,
                                         {'_id': id, 'accesskey': accesskey}, fields={'_id': 1})
        if error['error']:
            raise HTTPError(500)
        if not found_id[0]:
            return False
        return found_id
class MyHandler(BaseHandler):
    @tornado.web.asynchronous
    @tornado.web.authenticated
    @gen.engine
    def get(self):
        # real business code follows
        pass
Using gen everywhere is not good practice. It can turn your code into one big plate of spaghetti. Think about it.

Perhaps a decorator (not tested or anything, just some ideas):
def sanitize(fn):
    def _sanitize(self, *args, **kwargs):
        id = self.get_secure_cookie('id', None)
        accesskey = self.get_secure_cookie('accesskey', None)
        if not id or not accesskey:
            self.redirect('/a_public_area')
            return
        try:
            # convert to bson id format to access mongodb
            bson.objectid.ObjectId(id)
        except:
            # if not valid object id
            self.redirect('/a_public_area')
            return
        return fn(self, *args, **kwargs)
    return _sanitize
Dunno if you can make check_errors work with the business logic, but maybe:
def check_errors(fn):
    # self is taken explicitly so that self.redirect below resolves
    def _check_errors(self, *args, **kwargs):
        found_id, error = fn(self, *args, **kwargs)
        if error['error']:
            raise HTTPError(500)
            return
        if not found_id[0]:
            self.redirect('/a_public_area')
            return
    return _check_errors
then
class MyHandler(RequestHandler):
    @tornado.web.asynchronous
    @gen.engine
    @sanitize
    @check_errors  # ..O.o decorators
    def get(self):
        found_id, error = yield gen.Task(asyncmong_client_inst.collection.find_one,
                                         {'_id': id, 'accesskey': accesskey}, fields={'_id': 1})
        return found_id, error

I'd like to address this general problem with gen.Task, which is that factoring out code is either impossible or extremely clumsy.
You can only do "yield gen.Task(...)" within the get() or post() method. If you want to have get() call another function foo(), and do the work in foo(), well: You can't, unless you want to write everything as a generator and chain them together in some unwieldy way. As your project gets bigger, this is going to be a huge problem.
This is a much better alternative: https://github.com/mopub/greenlet-tornado
We used this to convert a large synchronous codebase to Tornado, with almost no changes.
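For comparison, later Tornado releases let handler methods be written as native coroutines (async def), and with that style the repeated check can live in an ordinary awaitable helper that returns a status code, much as the question asks. The sketch below uses only the standard library's asyncio to show the shape of that factoring; find_one, check_access and handle_get are hypothetical stand-ins, not Tornado or asyncmongo APIs, and the hard-coded credentials exist only for the demo:

```python
import asyncio

PUBLIC_AREA = "/a_public_area"


async def find_one(query):
    """Hypothetical stand-in for an asynchronous MongoDB lookup."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    known = {("abc", "secret")}
    if (query["_id"], query["accesskey"]) in known:
        return {"_id": query["_id"]}
    return None


async def check_access(cookies):
    """Return an HTTP-style status: 200 if authorized, 302 if a redirect is needed."""
    user_id = cookies.get("id")
    accesskey = cookies.get("accesskey")
    if not user_id or not accesskey:
        return 302
    found = await find_one({"_id": user_id, "accesskey": accesskey})
    return 200 if found else 302


async def handle_get(cookies):
    """Plays the role of a handler's get(): the shared check is a single await."""
    status = await check_access(cookies)
    if status != 200:
        return "redirect to " + PUBLIC_AREA
    return "real business code runs here"


print(asyncio.run(handle_get({"id": "abc", "accesskey": "secret"})))
print(asyncio.run(handle_get({})))
```

Each handler then needs one line to run the check, and the helper can itself await further helpers without any generator chaining.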
