I have a Django management command, launched via supervisord, that uses tweepy to consume the twitter streaming API.
The agent works quite well; however, I notice an SSLError in the logs every 10-15 minutes, after which supervisord re-launches the agent.
The tweepy package is the latest version, 1.11. The server is Ubuntu 12.04 LTS. I've tried installing the cacert into the keychain as mentioned in the link below, but no luck.
Twitter API SSL Root CA Certificate
Any suggestions?
[2012-08-26 19:28:15,656: ERROR] Error establishing the connection
Traceback (most recent call last):
File ".../.../datasinks.py", line 102, in start
stream.filter(locations=self.locations)
File "/site/pythonenv/local/lib/python2.7/site-packages/tweepy/streaming.py", line 228, in filter
self._start(async)
File "/site/pythonenv/local/lib/python2.7/site-packages/tweepy/streaming.py", line 172, in _start
self._run()
File "/site/pythonenv/local/lib/python2.7/site-packages/tweepy/streaming.py", line 117, in _run
self._read_loop(resp)
File "/site/pythonenv/local/lib/python2.7/site-packages/tweepy/streaming.py", line 150, in _read_loop
c = resp.read(1)
File "/usr/lib/python2.7/httplib.py", line 541, in read
return self._read_chunked(amt)
File "/usr/lib/python2.7/httplib.py", line 574, in _read_chunked
line = self.fp.readline(_MAXLINE + 1)
File "/usr/lib/python2.7/socket.py", line 476, in readline
data = self._sock.recv(self._rbufsize)
File "/usr/lib/python2.7/ssl.py", line 241, in recv
return self.read(buflen)
File "/usr/lib/python2.7/ssl.py", line 160, in read
return self._sslobj.read(len)
SSLError: The read operation timed out
The following is an outline of the code:
from tweepy import API, OAuthHandler
from tweepy.streaming import StreamListener, Stream
# snip other imports
class TwitterSink(StreamListener, TweetSink):

    def __init__(self):
        self.auth = OAuthHandler(settings.TWITTER_OAUTH_CONSUMER_KEY, settings.TWITTER_OAUTH_CONSUMER_SECRET)
        self.auth.set_access_token(settings.TWITTER_OAUTH_ACCESS_TOKEN_KEY, settings.TWITTER_OAUTH_ACCESS_TOKEN_SECRET)
        self.locations = ''  # snip for brevity

    def start(self):
        try:
            stream = Stream(self.auth, self, timeout=60, secure=True)
            stream.filter(locations=self.locations)
        except SSLError as e:
            logger.exception("Error establishing the connection")
        except IncompleteRead as r:
            logger.exception("Error with HTTP connection")

    # snip on_data()
    # snip on_timeout()
    # snip on_error()
The certificate doesn't seem to be the problem; the error is just a timeout. It looks like an issue with tweepy's SSL handling to me: the code is equipped to handle socket.timeout and reopen the connection, but not a timeout arriving as an SSLError.
Looking at the ssl module code (and docs), though, I don't see a clean way to catch that: the SSLError object is raised without any arguments, just a string description. For lack of a better solution, I'd suggest adding the following right before line 118 of tweepy/streaming.py:
except SSLError, e:
    if 'timeout' not in e.message.lower():  # support all timeouts
        exception = e
        break
    if self.listener.on_timeout() == False:
        break
    if self.running is False:
        break
    conn.close()
    sleep(self.snooze_time)
Why it's timing out in the first place is a good question. I have nothing better than repeating Travis Mehlinger's suggestion of setting a higher timeout.
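For reference, here is what that suggestion looks like against the Stream constructor used in the question; this is only a sketch, and the 300-second value is an arbitrary illustration, not a number from tweepy's documentation:

# Same call as in TwitterSink.start(), with a larger timeout;
# 300 seconds is an arbitrary example value.
stream = Stream(self.auth, self, timeout=300, secure=True)
stream.filter(locations=self.locations)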
Here is how I have it (a modified version of the solution from https://groups.google.com/forum/?fromgroups=#!topic/tweepy/80Ayu1joGJ4):
l = MyListener()
auth = OAuthHandler(settings.CONSUMER_KEY, settings.CONSUMER_SECRET)
auth.set_access_token(settings.ACCESS_TOKEN, settings.ACCESS_TOKEN_SECRET)

# connect to stream
stream = Stream(auth, l, timeout=30.0)

while True:
    # Call tweepy's filter method with async=False to prevent
    # creation of another thread.
    try:
        stream.filter(follow=reporters, async=False)
        # Normal exit: end the thread
        break
    except Exception, e:
        # Abnormal exit: reconnect
        logger.error(e)
        nsecs = random.randint(60, 63)
        logger.error('{0}: reconnect in {1} seconds.'.format(
            datetime.datetime.utcnow(), nsecs))
        time.sleep(nsecs)
There is another alternative solution provided on Github:
https://github.com/tweepy/tweepy/pull/132
I'm trying to set up a gRPC client in Python to hit a particular server. The server is set up to require authentication via access token. Therefore, my implementation looks like this:
def create_connection(target, access_token):
    credentials = composite_channel_credentials(
        ssl_channel_credentials(),
        access_token_call_credentials(access_token))

    target = target if target else DEFAULT_ENDPOINT
    return secure_channel(target=target, credentials=credentials)

conn = create_connection(svc="myservice", session=Session(client_id=id, client_secret=secret))

stub = FakeStub(conn)
stub.CreateObject(CreateObjectRequest())
The issue I'm having is that, when I attempt to use this connection, I get the following error:
File "<stdin>", line 1, in <module>
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 216, in __call__
response, ignored_call = self._with_call(request,
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 257, in _with_call
return call.result(), call
File "anaconda3\envs\test\lib\site-packages\grpc\_channel.py", line 343, in result
raise self
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 241, in continuation
response, call = self._thunk(new_method).with_call(
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 266, in with_call
return self._with_call(request,
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 257, in _with_call
return call.result(), call
File "\anaconda3\envs\test\lib\site-packages\grpc\_channel.py", line 343, in result
raise self
File "\anaconda3\envs\test\lib\site-packages\grpc\_interceptor.py", line 241, in continuation
response, call = self._thunk(new_method).with_call(
File "\anaconda3\envs\test\lib\site-packages\grpc\_channel.py", line 957, in with_call
return _end_unary_response_blocking(state, call, True, None)
File "\anaconda3\envs\test\lib\site-packages\grpc\_channel.py", line 849, in _end_unary_response_blocking
raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{
"created":"#1633399048.828000000",
"description":"Failed to pick subchannel",
"file":"src/core/ext/filters/client_channel/client_channel.cc",
"file_line":3159,
"referenced_errors":[
{
"created":"#1633399048.828000000",
"description":
"failed to connect to all addresses",
"file":"src/core/lib/transport/error_utils.cc",
"file_line":147,
"grpc_status":14
}
]
}"
I looked up the status code associated with this response and it seems that the server is unavailable. So, I tried waiting for the connection to be ready:
channel_ready_future(conn).result()
but this hangs. What am I doing wrong here?
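As an aside, result() with no arguments waits forever; a bounded wait (sketched below with an arbitrary 10-second timeout) would at least raise instead of hanging:

from grpc import channel_ready_future, FutureTimeoutError

try:
    # wait at most 10 seconds for the channel to become ready
    channel_ready_future(conn).result(timeout=10)
except FutureTimeoutError:
    print("Channel never became ready")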
UPDATE 1
I converted the code to use the async connection instead of the synchronous connection, but the issue still persists. I also saw that this question had been posted on SO, but none of the solutions presented there fixed the problem I'm having.
UPDATE 2
I assumed that this issue was occurring because the client couldn't find the TLS certificate issued by the server, so I added the following code:
def _get_cert(target: str) -> bytes:
    split_around_port = target.split(":")
    data = ssl.get_server_certificate((split_around_port[0], split_around_port[1]))
    return str.encode(data)
and then changed ssl_channel_credentials() to ssl_channel_credentials(_get_cert(target)). However, this also hasn't fixed the problem.
The issue here was actually fairly deep. First, I turned on tracing and set the gRPC log level to debug, and then found this line:
D1006 12:01:33.694000000 9032 src/core/lib/security/transport/security_handshaker.cc:182] Security handshake failed: {"created":"#1633489293.693000000","description":"Cannot check peer: missing selected ALPN property.","file":"src/core/lib/security/security_connector/ssl_utils.cc","file_line":160}
This led me to this GitHub issue, which stated that the problem was grpcio not inserting the h2 protocol into requests, which would cause ALPN-enabled servers to return that specific error. Some further digging led me to this issue, and since the server I connected to also uses Envoy, it was just a matter of modifying the Envoy deployment file so that:
clusters:
  - name: my-server
    connect_timeout: 10s
    type: strict_dns
    lb_policy: round_robin
    http2_protocol_options: {}
    hosts:
      - socket_address:
          address: python-server
          port_value: 1337
    tls_context:
      common_tls_context:
        tls_certificates:
        alpn_protocols: ["h2"]   # <====== Add this.
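For anyone who wants to reproduce the debugging step above, gRPC tracing is controlled through environment variables that the gRPC core reads at startup, so they must be set before any channel is created. A minimal sketch (GRPC_TRACE=all is the broadest setting; narrower comma-separated tracer names also work):

import os

# Set these before grpc creates any channel.
os.environ["GRPC_VERBOSITY"] = "DEBUG"
os.environ["GRPC_TRACE"] = "all"

conn = create_connection(target, access_token)  # as defined in the question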
ElasticSearch 7.10.2
Python 3.8.5
elasticsearch-py 7.12.1
I'm trying to do a bulk insert of 100,000 records to ElasticSearch using elasticsearch-py bulk helper.
Here is the Python code:
import sys
import datetime
import json
import os
import logging

from elasticsearch import Elasticsearch
from elasticsearch.helpers import streaming_bulk

# ES Configuration start
es_hosts = [
    "http://localhost:9200",
]
es_api_user = 'user'
es_api_password = 'pw'

index_name = 'index1'
chunk_size = 10000
errors_before_interrupt = 5
refresh_index_after_insert = False
max_insert_retries = 3
yield_ok = False  # if set to False will skip successful documents in the output

# ES Configuration end
# =======================

filename = 'file.json'

logging.info('Importing data from {}'.format(filename))

es = Elasticsearch(
    es_hosts,
    #http_auth=(es_api_user, es_api_password),
    sniff_on_start=True,  # sniff before doing anything
    sniff_on_connection_fail=True,  # refresh nodes after a node fails to respond
    sniffer_timeout=60,  # and also every 60 seconds
    retry_on_timeout=True,  # should timeout trigger a retry on different node?
)

def data_generator():
    f = open(filename)
    for line in f:
        yield {**json.loads(line), **{
            "_index": index_name,
        }}

errors_count = 0

for ok, result in streaming_bulk(es, data_generator(), chunk_size=chunk_size, refresh=refresh_index_after_insert,
                                 max_retries=max_insert_retries, yield_ok=yield_ok):
    if ok is not True:
        logging.error('Failed to import data')
        logging.error(str(result))
        errors_count += 1
        if errors_count == errors_before_interrupt:
            logging.fatal('Too many import errors, exiting with error code')
            exit(1)

print("Documents loaded to Elasticsearch")
When the json file contains a small number of documents (~100), this code runs without issue. But when I tested it with a file of 100k documents, I got this error:
WARNING:elasticsearch:POST http://127.0.0.1:9200/_bulk?refresh=false [status:N/A request:10.010s]
Traceback (most recent call last):
File "/Users/me/opt/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py", line 426, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/Users/me/opt/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py", line 421, in _make_request
httplib_response = conn.getresponse()
File "/Users/me/opt/anaconda3/lib/python3.8/http/client.py", line 1347, in getresponse
response.begin()
File "/Users/me/opt/anaconda3/lib/python3.8/http/client.py", line 307, in begin
version, status, reason = self._read_status()
File "/Users/me/opt/anaconda3/lib/python3.8/http/client.py", line 268, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/Users/me/opt/anaconda3/lib/python3.8/socket.py", line 669, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/me/opt/anaconda3/lib/python3.8/site-packages/elasticsearch/connection/http_urllib3.py", line 251, in perform_request
response = self.pool.urlopen(
File "/Users/me/opt/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py", line 726, in urlopen
retries = retries.increment(
File "/Users/me/opt/anaconda3/lib/python3.8/site-packages/urllib3/util/retry.py", line 386, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/Users/me/opt/anaconda3/lib/python3.8/site-packages/urllib3/packages/six.py", line 735, in reraise
raise value
File "/Users/me/opt/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
httplib_response = self._make_request(
File "/Users/me/opt/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py", line 428, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "/Users/me/opt/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py", line 335, in _raise_timeout
raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='127.0.0.1', port=9200): Read timed out. (read timeout=10)
I have to admit this one is a bit over my head. I don't typically like to paste large error messages here, but I'm not sure which part of this message is relevant.
I can't help but think that I need to adjust some of the params in the es object, or the configuration variables, but I don't know enough about them to make an educated decision on my own.
And last but certainly not least, it looks like some documents were loaded into the ES index nonetheless. Even stranger, the count shows 110k when the json file only has 100k.
TL;DR:
Reduce the chunk_size from 10000 to the default of 500 and I'd expect it to work. You probably also want to disable automatic retries, since those can give you duplicates.
What happened?
In your streaming_bulk call, you specified chunk_size=10000. This means the call will try to insert chunks of 10000 elements. The connection to Elasticsearch has a configurable timeout, which defaults to 10 seconds. So if your Elasticsearch server takes more than 10 seconds to process the 10000 elements you want to insert, a timeout happens and is handled as an error.
When creating your Elasticsearch object, you also specified retry_on_timeout=True, and in the streaming_bulk call you set max_retries=max_insert_retries, which is 3.
This means that when such a timeout happens, the library will try reconnecting 3 times, however, when the insert still has a timeout after that, it will give you the error you noticed. (Documentation)
Also, when the timeout happens, the library cannot know whether the documents were inserted successfully, so it has to assume they were not and will try to insert the same documents again. I don't know what your input lines look like, but if they do not contain an _id field, this would create duplicates in your index. You probably want to prevent this -- either by adding some kind of _id, or by disabling the automatic retry and handling it manually.
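As an illustration of the first option, the data_generator from the question could derive _id from a unique field in each document; the doc_id field here is a hypothetical stand-in for whatever unique key your data actually has:

def data_generator():
    with open(filename) as f:
        for line in f:
            doc = json.loads(line)
            yield {
                **doc,
                "_index": index_name,
                # "doc_id" is a hypothetical unique field; with a stable
                # _id, a retried insert overwrites instead of duplicating
                "_id": doc["doc_id"],
            }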
What to do?
There are two ways you can go about this:
Increase the timeout
Reduce the chunk_size
streaming_bulk by default has chunk_size set to 500. Your 10000 is much higher. I wouldn't expect a big performance gain from going above 500, so I'd advise you to just use the default of 500 here, as shown below. If 500 still fails with a timeout, you may even want to reduce it further; this can happen if the documents you want to index are very complex.
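In the configuration block from the question, that is just:

chunk_size = 500  # streaming_bulk's default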
You could also increase the timeout for the streaming_bulk call, or, alternatively, for your es object. To only change it for the streaming_bulk call, you can provide the request_timeout keyword argument:
for ok, result in streaming_bulk(
        es,
        data_generator(),
        chunk_size=chunk_size,
        refresh=refresh_index_after_insert,
        request_timeout=60 * 3,  # 3 minutes
        yield_ok=yield_ok):
    # handle like you did
    pass
However, this also means that an Elasticsearch node failure will only be detected after this higher timeout. See the documentation for more details.
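For completeness, the alternative of raising the timeout on the es object itself would look like the sketch below; the timeout argument sets the default request timeout, in seconds, for every request made through this client:

es = Elasticsearch(
    es_hosts,
    sniff_on_start=True,
    sniff_on_connection_fail=True,
    sniffer_timeout=60,
    retry_on_timeout=True,
    timeout=60 * 3,  # default request timeout for all requests
)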
I have a text-based interface (asciimatics module) for my program that uses asyncio and the discord.py module. Occasionally, when my wifi adapter goes down, I get an exception like so:
Task exception was never retrieved
future: <Task finished coro=<WebSocketCommonProtocol.run() done, defined at /home/mike/.local/lib/python3.5/site-packages/websockets/protocol.py:428> exception=ConnectionResetError(104, 'Connection reset by peer')>
Traceback (most recent call last):
File "/usr/lib/python3.5/asyncio/tasks.py", line 241, in _step
result = coro.throw(exc)
File "/home/mike/.local/lib/python3.5/site-packages/websockets/protocol.py", line 434, in run
msg = yield from self.read_message()
File "/home/mike/.local/lib/python3.5/site-packages/websockets/protocol.py", line 456, in read_message
frame = yield from self.read_data_frame(max_size=self.max_size)
File "/home/mike/.local/lib/python3.5/site-packages/websockets/protocol.py", line 511, in read_data_frame
frame = yield from self.read_frame(max_size)
File "/home/mike/.local/lib/python3.5/site-packages/websockets/protocol.py", line 546, in read_frame
self.reader.readexactly, is_masked, max_size=max_size)
File "/home/mike/.local/lib/python3.5/site-packages/websockets/framing.py", line 86, in read_frame
data = yield from reader(2)
File "/usr/lib/python3.5/asyncio/streams.py", line 670, in readexactly
block = yield from self.read(n)
File "/usr/lib/python3.5/asyncio/streams.py", line 627, in read
yield from self._wait_for_data('read')
File "/usr/lib/python3.5/asyncio/streams.py", line 457, in _wait_for_data
yield from self._waiter
File "/usr/lib/python3.5/asyncio/futures.py", line 361, in __iter__
yield self # This tells Task to wait for completion.
File "/usr/lib/python3.5/asyncio/tasks.py", line 296, in _wakeup
future.result()
File "/usr/lib/python3.5/asyncio/futures.py", line 274, in result
raise self._exception
File "/usr/lib/python3.5/asyncio/selector_events.py", line 662, in _read_ready
data = self._sock.recv(self.max_size)
ConnectionResetError: [Errno 104] Connection reset by peer
This exception is non-fatal and the program is able to reconnect despite it. What I want to do is prevent this exception from dumping to stdout and mucking up my text interface.
I tried using ensure_future to handle it, but it doesn't seem to work. Am I missing something?
@asyncio.coroutine
def handle_exception():
    try:
        yield from WebSocketCommonProtocol.run()
    except Exception:
        print("SocketException-Retrying")

asyncio.ensure_future(handle_exception())

# start discord client
client.run(token)
Task exception was never retrieved is not actually an exception propagated to stdout, but a log message that warns you that you never retrieved the exception in one of your tasks. You can find details here.
I guess the easiest way to avoid this message in your case is to retrieve the exception from the task manually:
coro = WebSocketCommonProtocol.run()  # you don't need any wrapper
task = asyncio.ensure_future(coro)
try:
    # start discord client
    client.run(token)
finally:
    # retrieve exception, if any:
    if task.done() and not task.cancelled():
        task.exception()  # this doesn't raise anything, it just marks the exception as retrieved
The answer provided by Mikhail is perfectly acceptable, but I realized it wouldn't work for me since the task that raises the exception is buried deep in some module, so retrieving its exception is kind of difficult. I found that I could instead simply set a custom exception handler for my asyncio loop (the loop is created by the discord client):
def exception_handler(loop, context):
    print("Caught the following exception")
    print(context['message'])

client.loop.set_exception_handler(exception_handler)
client.run(token)
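A slightly more defensive variant of the same idea (my own embellishment, not part of the original answer) only swallows connection resets and hands everything else back to the loop's default handling:

def exception_handler(loop, context):
    exc = context.get('exception')
    if isinstance(exc, ConnectionResetError):
        print("Caught the following exception")
        print(context['message'])
    else:
        # anything unexpected still gets the default treatment
        loop.default_exception_handler(context)

client.loop.set_exception_handler(exception_handler)
client.run(token)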
I am using python-requests and noticed that it returns unrelated errors after failing to fetch a web page when not connected to the Internet.
The documentation mentions Exceptions, but not how to use them. How should the program verify that it is indeed connected, and fail nicely if not?
I currently have no error-handling system in place, and this is what I get:
File "mem.py", line 78, in <module>
login()
File "mem.py", line 38, in login
csrf = s.cookies['csrftoken']
File "/usr/local/lib/python2.7/site-packages/requests/cookies.py", line 276, in __getitem__
return self._find_no_duplicates(name)
File "/usr/local/lib/python2.7/site-packages/requests/cookies.py", line 331, in _find_no_duplicates
raise KeyError('name=%r, domain=%r, path=%r' % (name, domain, path))
KeyError: "name='csrftoken', domain=None, path=None"
It appears that HTTP errors are not raised by default in python-requests. This answer sums it up nicely: https://stackoverflow.com/a/24460981/908703
import requests

def connected_to_internet(url='http://www.google.com/', timeout=5):
    try:
        _ = requests.get(url, timeout=timeout)
        return True
    except requests.ConnectionError:
        print("No internet connection available.")
        return False
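Beyond probing for connectivity up front, the request flow itself can fail nicely by catching the requests exception hierarchy and calling raise_for_status(); a sketch, with example.com as a placeholder URL:

import requests

def fetch(url='http://www.example.com/', timeout=5):
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx responses into exceptions
        return response.text
    except requests.exceptions.ConnectionError:
        print("Could not connect - check the internet connection.")
    except requests.exceptions.Timeout:
        print("The request timed out.")
    except requests.exceptions.HTTPError as err:
        print("HTTP error: {}".format(err))
    return None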
I have a web service deployed on my box. I want to check the result of this service with various inputs. Here is the code I am using:
import sys
import httplib
import urllib

apUrl = "someUrl:somePort"
fileName = sys.argv[1]
conn = httplib.HTTPConnection(apUrl)

titlesFile = open(fileName, 'r')
try:
    for title in titlesFile:
        title = title.strip()
        params = urllib.urlencode({'search': 'abcd', 'text': title})
        conn.request("POST", "/somePath/", params)
        response = conn.getresponse()
        data = response.read().strip()
        print data + "\t" + title
        conn.close()
finally:
    titlesFile.close()
This code gives an error after the same number of lines is printed each time (28233). The error message:
Traceback (most recent call last):
File "testService.py", line 19, in ?
conn.request("POST", "/somePath/", params)
File "/usr/lib/python2.4/httplib.py", line 810, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python2.4/httplib.py", line 833, in _send_request
self.endheaders()
File "/usr/lib/python2.4/httplib.py", line 804, in endheaders
self._send_output()
File "/usr/lib/python2.4/httplib.py", line 685, in _send_output
self.send(msg)
File "/usr/lib/python2.4/httplib.py", line 652, in send
self.connect()
File "/usr/lib/python2.4/httplib.py", line 636, in connect
raise socket.error, msg
socket.error: (99, 'Cannot assign requested address')
I am using Python 2.4.3, and I am calling conn.close() as well. So why is this error raised?
This is not a Python problem.
In Linux kernel 2.4, the ephemeral port range is from 32768 through 61000, so the number of available ports is 61000 - 32768 + 1 = 28233. From what I understood, because the web service in question is quite fast (<5 ms actually), all the ports get used up, and the program has to wait a minute or two for the ports to close.
What I did was count the number of conn.close() calls; when the count reached 28000, I waited 90 seconds and reset the counter.
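A rough sketch of that workaround applied to the loop from the question; the 28000 threshold and the 90-second sleep are the values I used, not hard limits:

import time

closed = 0
for title in titlesFile:
    title = title.strip()
    params = urllib.urlencode({'search': 'abcd', 'text': title})
    conn.request("POST", "/somePath/", params)
    response = conn.getresponse()
    data = response.read().strip()
    print data + "\t" + title
    conn.close()
    closed += 1
    if closed >= 28000:
        time.sleep(90)  # let TIME_WAIT sockets drain so ports free up
        closed = 0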
BIGYaN identified the problem correctly, and you can verify it by running "netstat -tn" right after the exception occurs; you will see a great many connections in state "TIME_WAIT".
The alternative to waiting for port numbers to become available again is to simply use one connection for all requests. You are not required to call conn.close() after each call to conn.request(); you can simply leave the connection open until you are done with your requests.
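A sketch of that against the question's code -- one HTTPConnection reused for every request and closed once at the end; note the response body must be read in full before the connection can be reused:

conn = httplib.HTTPConnection(apUrl)
try:
    for title in titlesFile:
        title = title.strip()
        params = urllib.urlencode({'search': 'abcd', 'text': title})
        conn.request("POST", "/somePath/", params)
        response = conn.getresponse()
        data = response.read().strip()  # read fully before reusing the connection
        print data + "\t" + title
finally:
    conn.close()
    titlesFile.close()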
I too faced a similar issue while executing multiple POST requests using Python's requests library in Spark. To make it worse, I used multiprocessing on each executor to post to a server, so thousands of connections were created in seconds, and each took a few seconds to leave the TIME_WAIT state and release its port for the next set of connections.
Of all the solutions available on the internet that speak of disabling keep-alive, using requests.Session(), et al., I found that the one that worked makes use of a 'Connection': 'close' header. You may need to put the header content on a separate line outside the post call, though.
headers = {
    'Connection': 'close'
}

with requests.Session() as session:
    response = session.post('https://xx.xxx.xxx.x/xxxxxx/x', headers=headers, files=files, verify=False)
    results = response.json()
    print(results)
This is how I solved the same issue, using the solution above.