I'm having trouble with one of my scripts seemingly disconnecting from my FTP during long batches of jobs. To counter this, I've attempted to make a module as shown below:
import time
import sys

def connect_ftp(ftp):
    print "ftp1"
    starttime = time.time()
    retry = False
    try:
        ftp.voidcmd("NOOP")
        print "ftp2"
    except:
        retry = True
        print "ftp3"
    print "ftp4"
    while retry:
        try:
            print "ftp5"
            ftp.connect()
            ftp.login('LOGIN', 'CENSORED')
            print "ftp6"
            retry = False
            print "ftp7"
        except IOError as e:
            print "ftp8"
            retry = True
            sys.stdout.write("\rTime disconnected - " + str(time.time() - starttime))
            sys.stdout.flush()
            print "ftp9"
I call the function using only:
ftp = ftplib.FTP('CENSORED')
connect_ftp(ftp)
However, I've traced how the code runs using the print lines, and on the first use of the module (before the FTP is even connected to) my script runs ftp.voidcmd("NOOP") and does not raise an exception, so no attempt is made to connect to the FTP server initially.
The output is:
ftp1
ftp2
ftp4
ftp success # this is printed after the module is called
I admit my code isn't the best or prettiest, and I haven't yet added anything to stop it reconnecting constantly if the reconnect keeps failing, but I can't work out for the life of me why this isn't working, so I don't see a point in expanding the module yet. Is this even the best approach for connecting/reconnecting to an FTP server?
Thank you in advance
This connects to the server:
ftp = ftplib.FTP('CENSORED')
So, naturally the NOOP command succeeds, as it does not need an authenticated connection.
Your connect_ftp is correct, except that you need to specify a hostname in your connect call.
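For illustration, here is a minimal sketch of that fix, keeping the question's structure and placeholders; the ftplib.all_errors catches and the short sleep between retries are additions of this sketch, not part of the original code:

import time
import ftplib

def connect_ftp(ftp, host='CENSORED'):
    # If the control connection still answers a NOOP, there is nothing to do.
    try:
        ftp.voidcmd("NOOP")
        return
    except ftplib.all_errors:
        pass
    starttime = time.time()
    while True:
        try:
            ftp.connect(host)                    # the hostname has to be passed again here
            ftp.login('LOGIN', 'CENSORED')
            return
        except ftplib.all_errors:
            print("Time disconnected - %.0fs" % (time.time() - starttime))
            time.sleep(5)                        # back off instead of retrying immediately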
I am writing a Python script to fully boot up a handful of ESXI hosts remotely, and I am having trouble determining when ESXI has finished booting and is ready to receive commands sent over SSH. I am running the script on a Windows host that is hardwired to each ESXI host, and the system is air-gapped, so there are no firewalls in the way and no security software to interfere.
Currently I am doing this: I remote into the chassis through SSH and send the power commands to the ESXI host - this works and has always worked. Then I attempt to SSH into each blade and send the following command:
esxcli system stats uptime get
The command doesn't matter, I just need a response to make sure that the host is up. Below is the function I am using to send the SSH commands in hopes of getting a response
import time
import socket
import paramiko
from socket import error as socket_error

def send_command(ip, port, timeout, retry_interval, cmd, user, password):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    retry_interval = float(retry_interval)
    timeout = int(timeout)
    timeout_start = time.time()
    worked = False
    while worked == False:
        time.sleep(retry_interval)
        try:
            ssh.connect(ip, port, user, password, timeout=5)
            stdin, stdout, stderr = ssh.exec_command(cmd)
            outlines = stdout.readlines()
            resp = ''.join(outlines)
            print(resp)
            worked = True
            return resp
        except socket_error as e:
            worked = False
            print(e)
            continue
        except paramiko.ssh_exception.SSHException as e:
            worked = False
            # socket is open, but no SSH service responded
            print(e)
            continue
        except TimeoutError as e:
            print(e)
            worked = False
            pass
        except socket.timeout as e:
            print(e)
            worked = False
            continue
        except paramiko.ssh_exception.NoValidConnectionsError as e:
            print(e)
            worked = False
            continue
        except socket.error as serr:
            print(serr)
            worked = False
            continue
        except IOError as e:
            print(e)
            worked = False
            continue
        except Exception as e:
            print(e)
            worked = False
            continue
My goal here is to catch all of the exceptions long enough for the host to finish booting and then receive a response. The issue is that sometimes it will loop for several minutes (as expected when booting a system like this), and then it will print
IO error: [Errno 111] Connection refused
And then it drops out of the function/try-catch block and never establishes the connection. I know that this is a fault of my exception handling, because when this happens I can stop the script, wait a few minutes, run it again without touching anything else, and the esxcli command will work perfectly and the script will work great.
How do I prevent the Errno 111 error from breaking my loop? Any help is greatly appreciated
Edit: One possible duct-tape solution could be changing the command to "esxcli system hostname get" and checking the response for the word "Domain". This might work because the IOError seems to be a response and not an exception. I'll have to wait until Monday to test that solution though.
I solved it. It occurred to me that I was handling all possible exceptions that any Python code could possibly throw, so my defect wasn't a Python error, which explains why I wasn't finding anything online about the relationship between Python, SSH and the Errno 111 error.
The printout is in fact a response from the ESXI host, and my code is looking for any response. So I simply changed the esxcli command from requesting the uptime to
esxcli system hostname get
and then threw this into the function:
substring = "Domain"
if substring not in resp:
print(resp)
continue
I am looking for the word "Domain" because that must be there if that call is successful.
How I figured it out: I installed ESXI 7 on an old Intel NUC, turned on SSH in the kickstart script, started the script and then turned on the NUC. The reason I used the NUC is that a fresh install on simple hardware boots up much faster and more quietly than Dell blades! Also, I wrapped the resp variable in a print(type(OBJECT)) line and was able to determine that it was in fact a string and not an error object.
This may not help someone who has a legitimate Errno 111 error; I knew I was going to run into this error each and every time I ran the code, and I just needed to know how to handle it and hold the loop until I got the response I wanted.
Edit: I suppose it would be easier to just filter for the word "errno" and then continue the loop, instead of using a different substring. That would handle all of my use cases and eliminate the need for a different function.
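To make the idea concrete, here is a rough sketch of how the response check could sit inside a retry loop. The wait_until_ready name is made up and the body is condensed from the question's send_command; it is illustrative, not the exact final code:

import time
import paramiko

def wait_until_ready(ip, port, user, password, retry_interval=10):
    # Keep trying until the host returns real esxcli output (which contains "Domain").
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    while True:
        time.sleep(retry_interval)
        try:
            ssh.connect(ip, port, user, password, timeout=5)
            stdin, stdout, stderr = ssh.exec_command("esxcli system hostname get")
            resp = ''.join(stdout.readlines())
        except Exception as e:          # broad catch, as in the question
            print(e)
            continue
        if "Domain" not in resp:        # something answered, but not real esxcli output yet
            print(resp)
            continue
        return resp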
I wrote a bot for TeamSpeak 3 that runs over ServerQuery (a telnet interface).
But the bot keeps responding later and later: in the beginning it takes about 0.1 sec, after about a minute the bot takes about 10 seconds to respond, and issuing commands makes the delay grow even faster.
Any idea why?
So basically the telnet interface sends data from the TS3 server to my Python script, the ts3 module receives and processes the data, and then the script decides what action to take.
As modules I am using MySQLdb and ts3 (https://github.com/benediktschmitt/py-ts3).
My sourcecode is here: https://pastebin.com/cJuyB9ZH
Another script, which just takes all clients and pushes them into a database every 5 min, runs multiple days without any issues.
I checked the code multiple times now, and even deleted variables right after they were used, but it still has the same issue.
My guess would be that it sort of clogs the RAM, so I looked through the code multiple times, but couldn't find out why or where.
Sidenote: I know I sometimes call a commit() when it's totally not necessary; I don't know if that might cause problems, but I don't see how.
Short(er) version of my code:
import ts3
import MySQLdb
# Some other imports like time and threading and such

## Connect to TS3
tsConn = ts3.query.TS3Connection(tsAddr, tsPort)
try:
    tsConn.login(client_login_name=tsUser, client_login_password=tsPass)
    tsConn.use(sid=tsSID, virtual=True)
    print(" ==>> CONNECTED TO TS3 SERVER: " + tsAddr)
except ts3.query.TS3QueryError as e:
    print("Login to TS Server failed! Aborting...")
    exit(1)

## Connect to mySQL
try:
    qConn = MySQLdb.connect(host=qHost, user=qUser, passwd=qPass, db=qDB)
    qServer = qConn.cursor()
    print(" ==>> CONNECTED TO mySQL SERVER: " + qHost)
except MySQLdb.OperationalError:
    print("Cannot connect to mySQL Database! Aborting...")
    exit(1)

running = True
while running:
    tsConn.send_keepalive()
    qServer.execute("SELECT 1")  # keepalive
    try:
        event = tsConn.wait_for_event(timeout=1)
    except TS3TimeoutError:
        pass
    else:
        try:
            pass  # <some command processing here>
        except KeyError:
            try:
                if event[0]["reasonid"] == "0":
                    tsConn.sendtextmessage(targetmode=1, target=event[0]["clid"], msg=greetingmsg.format(event[0]["client_nickname"]))
            except:
                pass
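One way to test the "clogs the RAM" guess would be to log memory usage from inside the main loop, for example with tracemalloc; the following is only an illustrative sketch, not part of the original code:

import time
import tracemalloc

tracemalloc.start()
last_report = time.time()

# ... inside the "while running:" loop, every few minutes:
if time.time() - last_report > 300:
    last_report = time.time()
    current, peak = tracemalloc.get_traced_memory()
    print("memory: current=%.1f MiB, peak=%.1f MiB" % (current / 2.0**20, peak / 2.0**20))
    for stat in tracemalloc.take_snapshot().statistics("lineno")[:5]:
        print(stat)  # top allocation sites; steady growth here points at the culprit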
I have the simple minimal 'working' example below that opens a connection to Google every two seconds. When I run this script while I have a working internet connection, I get the Success message; when I then disconnect, I get the Fail message, and when I reconnect again I get Success again. So far, so good.
However, when I start the script while the internet is disconnected, I get the Fail messages, and when I connect later, I never get the Success message. I keep getting the error:
urlopen error [Errno -2] Name or service not known
What is going on?
import urllib2, time

while True:
    try:
        print('Trying')
        response = urllib2.urlopen('http://www.google.com')
        print('Success')
        time.sleep(2)
    except Exception, e:
        print('Fail ' + str(e))
        time.sleep(2)
This happens because the DNS name "www.google.com" cannot be resolved. If there is no internet connection the DNS server is probably not reachable to resolve this entry.
It seems I misread your question the first time. The behaviour you describe is, on Linux, a peculiarity of glibc. It only reads "/etc/resolv.conf" once, when loading. glibc can be forced to re-read "/etc/resolv.conf" via the res_init() function.
One solution would be to wrap the res_init() function and call it before calling getaddrinfo() (which is indirectly used by urllib2.urlopen()).
You might try the following (still assuming you're using Linux):
import ctypes
libc = ctypes.cdll.LoadLibrary('libc.so.6')
res_init = libc.__res_init
# ...
res_init()
response = urllib2.urlopen('http://www.google.com')
This might of course be optimized by waiting until "/etc/resolv.conf" is modified before calling res_init().
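A rough sketch of that optimization might look like this (the maybe_res_init name and the mtime caching are illustrative, under the same Linux/glibc assumption):

import os
import ctypes

libc = ctypes.cdll.LoadLibrary('libc.so.6')
res_init = libc.__res_init

_last_mtime = None

def maybe_res_init():
    # Only force the resolver to re-read its configuration when
    # /etc/resolv.conf has actually changed since the last check.
    global _last_mtime
    try:
        mtime = os.path.getmtime('/etc/resolv.conf')
    except OSError:
        return
    if mtime != _last_mtime:
        _last_mtime = mtime
        res_init()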
Another solution would be to install e.g. nscd (name service cache daemon).
For me, it was a proxy problem.
Running the following before import urllib.request helped
import os
os.environ['http_proxy'] = ''

import urllib.request
response = urllib.request.urlopen('http://www.google.com')
I'm currently doing this with my script:
Get the body (from the source code) and search for a string; it keeps doing this until the string is found (i.e. when the site updates).
However, if the connection is lost, the script stops.
My 'connection' code looks something like this (this keeps repeating in a while loop every 20 seconds):
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
url = ('url')
openUrl = opener.open(url).read()
soup = BeautifulSoup(openUrl)
I've used urllib2 & BeautifulSoup.
Can anyone tell me how I could tell the script to "freeze" if the connection is lost, check whether the internet connection is alive, and then continue based on the answer? (So, to check whether the script CAN connect, not whether the site is up. If it keeps checking the way it does now, the script will stop with a bunch of errors.)
Thank you!
Found the solution!
So, I need to check the connection every LOOP, before actually doing stuff.
So I created this function:
def check_internet(self):
    try:
        header = {"pragma": "no-cache"}
        req = urllib2.Request("http://www.google.ro", headers=header)
        response = urllib2.urlopen(req, timeout=2)
        return True
    except urllib2.URLError as err:
        return False
And it works, tested it with my connection down & up!
For the other newbies wondering:
while True:
    conn = check_internet('Site or just Google, just checking for connection.')
    try:
        if conn is True:
            pass  # code goes here
        else:
            # need to make it wait and re-do the while
            time.sleep(30)
    except urllib2.URLError as err:
        # need to wait
        time.sleep(20)
Works perfectly, the script has been running for about 10 hours now and it handles errors perfectly! It also works with my connection off and shows proper messages.
Open to suggestions for optimization!
Rather than "freeze" the script, I would have the script continue to run only if the connection is alive. If it's alive, run your code. If it's not alive, either attempt to reconnect, or halt execution.
while keepRunning:
    if connectionIsAlive():
        run_your_code()
    else:
        reconnect_maybe()
One way to check whether the connection is alive is described here: Checking if a website is up via Python.
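For example, connectionIsAlive() could be a small urllib2 probe along those lines (the URL and timeout below are arbitrary choices, not taken from the linked question):

import urllib2

def connectionIsAlive(url='http://www.google.com', timeout=5):
    # Any successful HTTP response counts as "the connection is alive".
    try:
        urllib2.urlopen(url, timeout=timeout)
        return True
    except urllib2.URLError:
        return False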
If your program "stops with a bunch of errors", that is likely because you're not properly handling the situation where you're unable to connect to the site (for various reasons, such as you not having internet or their website being down).
You need to use a try/except block to make sure that you catch any errors that occur because you were unable to open a live connection.
try:
    openUrl = opener.open(url).read()
except urllib2.URLError:
    pass  # something went wrong, how to respond?
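One possible way to fill that in, reusing the opener and url setup from the question (the 20-second wait is an arbitrary choice):

import time
import urllib2

opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
url = ('url')   # placeholder, as in the question

while True:
    try:
        openUrl = opener.open(url).read()
    except urllib2.URLError as err:
        # could not reach the site (no internet, site down, ...): wait and retry
        print('Connection lost (%s), retrying in 20 seconds' % err)
        time.sleep(20)
        continue
    # the connection worked: hand openUrl to BeautifulSoup and do the normal checks here
    break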
I'm writing code that will run on Linux, OS X, and Windows. It downloads a list of approximately 55,000 files from the server, then steps through the list of files, checking if the files are present locally. (With SHA hash verification and a few other goodies.) If the files aren't present locally or the hash doesn't match, it downloads them.
The server-side is plain-vanilla Apache 2 on Ubuntu over port 80.
The client side works perfectly on Mac and Linux, but gives me this error on Windows (XP and Vista) after downloading a number of files:
urllib2.URLError: <urlopen error <10048, 'Address already in use'>>
This link: http://bytes.com/topic/python/answers/530949-client-side-tcp-socket-receiving-address-already-use-upon-connect points me to TCP port exhaustion, but "netstat -n" never showed me more than six connections in "TIME_WAIT" status, even just before it errored out.
The code (called once for each of the 55,000 files it downloads) is this:
request = urllib2.Request(file_remote_path)
opener = urllib2.build_opener()
datastream = opener.open(request)
outfileobj = open(temp_file_path, 'wb')
try:
    while True:
        chunk = datastream.read(CHUNK_SIZE)
        if chunk == '':
            break
        else:
            outfileobj.write(chunk)
finally:
    outfileobj = outfileobj.close()
    datastream.close()
UPDATE: By grepping the log I find that it enters the download routine exactly 3998 times. I've run this multiple times and it fails at 3998 each time. Given that the linked article states that the available ports number 5000-1025=3975 (and some are probably expiring and being reused), it's starting to look a lot more like the linked article describes the real issue. However, I'm still not sure how to fix this. Making registry edits is not an option.
If it is really a resource problem (the OS running out of socket resources), try this:
request = urllib2.Request(file_remote_path)
opener = urllib2.build_opener()
datastream = None
retry = 3  # 3 tries
while retry:
    try:
        datastream = opener.open(request)
    except urllib2.URLError, ue:
        if str(ue.reason).find('10048') > -1:
            if retry:
                retry -= 1
            else:
                raise urllib2.URLError("Address already in use / retries exhausted")
    else:
        retry = 0
if datastream:
    retry = 0
    outfileobj = open(temp_file_path, 'wb')
    try:
        while True:
            chunk = datastream.read(CHUNK_SIZE)
            if chunk == '':
                break
            else:
                outfileobj.write(chunk)
    finally:
        outfileobj = outfileobj.close()
        datastream.close()
If you want, you can insert a sleep, or make the retry behaviour OS-dependent.
On my Win XP the problem doesn't show up (I reached 5000 downloads).
I watch my processes and network connections with Process Hacker.
Thinking outside the box, the problem you seem to be trying to solve has already been solved by a program called rsync. You might look for a Windows implementation and see if it meets your needs.
You should seriously consider copying and modifying this pyCurl example for efficient downloading of a large collection of files.
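This is not the example referred to above, but a minimal illustration of the idea with pycurl: one handle is reused across requests so connections can be kept alive (work_list stands in for the (remote path, local path) pairs):

import pycurl

curl = pycurl.Curl()   # one handle, reused across all downloads
for file_remote_path, temp_file_path in work_list:
    with open(temp_file_path, 'wb') as outfileobj:
        curl.setopt(pycurl.URL, file_remote_path)
        curl.setopt(pycurl.WRITEDATA, outfileobj)
        curl.perform()
curl.close()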
Instead of opening a new TCP connection for each request you should really use persistent HTTP connections - have a look at urlgrabber (or alternatively, just at keepalive.py for how to add keep-alive connection support to urllib2).
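For the keepalive.py route, the wiring looks roughly like this, based on keepalive.py's own usage notes (file_remote_path is reused from the question):

import urllib2
from keepalive import HTTPHandler   # keepalive.py, shipped with urlgrabber

opener = urllib2.build_opener(HTTPHandler())
urllib2.install_opener(opener)

# later urlopen() calls can now reuse the same TCP connection per host
datastream = urllib2.urlopen(file_remote_path)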
All indications point to a lack of available sockets. Are you sure that only 6 are in TIME_WAIT status? If you're running so many download operations, it's very likely that netstat overruns your terminal buffer. I find that netstat output overruns my terminal during normal usage periods.
The solution is to either modify the code to reuse sockets or to introduce a timeout. It also wouldn't hurt to keep track of how many open sockets you have, to optimize the waiting. The default TIME_WAIT timeout on Windows XP is 120 seconds, so you want to sleep for at least that long if you run out of sockets. Unfortunately, it doesn't look like there's an easy way to check from Python when a socket has closed and left the TIME_WAIT status.
Given the asynchronous nature of the requests and timeouts, the best way to do this might be in a thread. Make each thread sleep for 2 minutes before it finishes. You can either use a Semaphore or limit the number of active threads to ensure that you don't run out of sockets.
Here's how I'd handle it. You might want to add an exception clause to the inner try block of the fetch section, to warn you about failed fetches.
import time
import threading
import Queue
import urllib2

# assumes url_queue is a Queue object populated with tuples in the form of (url_to_fetch, temp_file)

class urlfetcher(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        try:  # needed to handle the Empty exception raised by an empty queue.
            file_remote_path, temp_file_path = self.queue.get_nowait()
        except Queue.Empty:
            return
        request = urllib2.Request(file_remote_path)
        opener = urllib2.build_opener()
        datastream = opener.open(request)
        outfileobj = open(temp_file_path, 'wb')
        try:
            while True:
                chunk = datastream.read(CHUNK_SIZE)
                if chunk == '':
                    break
                else:
                    outfileobj.write(chunk)
        finally:
            outfileobj = outfileobj.close()
            datastream.close()
        time.sleep(120)  # hold the thread for the TIME_WAIT period before it finishes
        self.queue.task_done()

# elsewhere:
while not url_queue.empty():
    if threading.active_count() < 3975:  # hard limit of available ports
        t = urlfetcher(url_queue)
        t.start()
    else:
        time.sleep(2)
url_queue.join()
Sorry, my python is a little rusty, so I wouldn't be surprised if I missed something.