Here are some bits of code I use to download files through FTP. I was trying to stop a download and then continue or restart it afterwards. I've tried ftp.abort(), but it only hangs and then returns a timeout:
ftplib.error_proto: 421 Data timeout. Reconnect. Sorry.
SCENARIO:
The scenario is that the user chooses a file to download; while it is downloading, the user can stop the current download and download a new file. The line 'if os.path.getsize(self.file_path) > 117625:' is just my stand-in for the user stopping the download; it's not the full size of the file.
Thanks.
import os
from ftplib import FTP

# myhost, myusername, mypassword and self.file_path are set elsewhere in my real code.
class ftpness:
    def __init__(self):
        self.connect(myhost, myusername, mypassword)

    def handleDownload(self, block):
        self.f.write(block)
        if os.path.getsize(self.file_path) > 117625:  # stand-in for "the user pressed stop"
            self.ftp.abort()

    def connect(self, host, username, password):
        self.ftp = FTP(host)
        self.ftp.login(username, password)
        self.get(self.file_path)

    def get(self, filename):
        self.f = open(filename, 'wb')
        self.ftp.retrbinary('RETR ' + filename, self.handleDownload)
        self.f.close()
        self.ftp.close()

a = ftpness()
Error 421 is the standard timeout error, so you need to keep the connection alive until the file has been downloaded.
def handleDownload(self, block):
    self.f.write(block)
    if os.path.getsize(self.file_path) > 117625:
        self.ftp.abort()
    else:
        self.ftp.sendcmd('NOOP')  # try to add this line just to keep the connection alive
Hope this helps. :)
Here's a way to do it with a watchdog timer. This involves creating a separate thread, which, depending on the design of your application, may not be acceptable.
Killing a download on a user event works the same way: if the GUI runs in a separate thread, that thread can reach inside the FTP instance and close its socket directly (see the sketch after the watchdog example below).
from threading import Timer

class ftpness:
    ...

    def connect(self, host, username, password):
        self.ftp = FTP(host)
        self.ftp.login(username, password)
        watchdog = Timer(self.timeout, self.ftp.sock.close)
        watchdog.start()
        self.get(self.file_path)
        watchdog.cancel()  # if the file transfer succeeds, cancel the timer
This way, if the file transfer runs longer than your preset timeout, the timer thread closes the socket underneath the transfer, forcing the get call to raise an exception. Only when the transfer succeeds is the watchdog timer cancelled.
And though this has nothing to do with your question, a connect call normally should not transfer payload data.
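For the user-triggered case, a minimal sketch might look like the following. The stop_download method name is my own placeholder, and it assumes the transfer runs in a worker thread while the GUI thread calls this method:

class ftpness:
    ...

    def stop_download(self):
        # Called from the GUI thread when the user presses "stop".
        # Closing the control socket makes the retrbinary call in the
        # worker thread fail with an exception, which ends the transfer.
        try:
            self.ftp.sock.close()
        except Exception:
            pass  # the socket may already be closed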
This means your session has been idle for too long. You can re-instantiate the ftplib connection after the timeout; otherwise, modify the FTP server's configuration.
For example, if you use vsftpd, you can add the following configuration to vsftpd.conf:
idle_session_timeout=60000 # The default is 600 seconds
I'm trying to write an HTML file and then upload it to my website using the following code:
webpage = open('testfile.html',"w")
webpage.write(contents)
webpage.close
server = 'ftp.xxx.be'
username = 'userxxx'
password = 'topsecret'
ftp_connection = ftplib.FTP(server, username, password)
remote_path = "/"
ftp_connection.cwd(remote_path)
fh = open("testfile.html", 'rb')
ftp_connection.storbinary('STOR testfile.html', fh)
fh.close()
The problem is that the .close call seems to be slower than the FTP transfer, and the file that is sent over FTP is empty. A few seconds after the FTP code has run, I see the file correctly on my local PC.
Any hints on how to make certain the .close has finished before the FTP upload starts (apart from using time.sleep())?
Running Python 3.xx on W7pro
Try blocking on the close call:
Blocking until a file is closed in python
By the way, are the parentheses missing on your close call?
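In other words, webpage.close without parentheses only refers to the method and never calls it, so the data can still be sitting in Python's write buffer when the upload starts. A minimal sketch of the likely fix, using the same file name as in the question:

# Either call close() explicitly (note the parentheses: this flushes and closes the file)...
webpage = open('testfile.html', "w")
webpage.write(contents)
webpage.close()

# ...or let a with-block close the file before the upload code runs.
with open('testfile.html', "w") as webpage:
    webpage.write(contents)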
Briefing
I am currently building a Python SMTP mail sender program.
I added a feature so that the user cannot log in if there is no active internet connection. I tried many solutions/variations to make the real-time connection checking as swift as possible, and ran into several problems:
The thread where the connection handler was running suddenly lagged when I pulled out the ethernet cable (to test how it would handle a sudden disconnect).
The whole program crashed.
It took several seconds for the program to detect the change.
My current solution
I set up a data-handling class which contains all the necessary info (so the modules can share state effectively):
import smtplib
from socket import gaierror, timeout

class DataHandler:
    is_logged_in = None
    is_connected = None
    server_conn = None
    user_address = ''
    user_passwd = ''

    @staticmethod
    def try_connect():
        try:
            # The place where the connection is checked
            DataHandler.server_conn = smtplib.SMTP('smtp.gmail.com', 587, timeout=1)
            DataHandler.is_connected = True
        except (smtplib.SMTPException, gaierror, timeout):
            DataHandler.is_connected = False  # Connection status changed upon a connection error
I put the connection handler on a second thread, because the server connection process slowed down the GUI when everything ran on one thread:
from root_gui import Root
import threading
from time import sleep
from data_handler import DataHandler

def handle_conn():
    DataHandler.try_connect()
    smtp_client.refresh()  # Refreshes the gui according to the current status

def conn_manager():  # Working pretty well
    while 'smtp_client' in globals():
        sleep(0.6)
        try:
            handle_conn()  # Calls the connection
        except NameError:  # If the user quits the tkinter gui
            break

smtp_client = Root()
handle_conn()

MyConnManager = threading.Thread(target=conn_manager)
MyConnManager.start()

smtp_client.mainloop()
del smtp_client  # The connection manager will detect this and stop running
My question is:
Is this good practice, or a terrible waste of resources? Is there a better way to do it? No matter what I tried, this was the only solution that worked.
From what I know, the try_connect() function creates a completely new SMTP object each time it is run (which is once every 0.6 seconds!).
Resources/observations
The project on git: https://github.com/cernyd/smtp_client
Observation: the timeout parameter when creating the SMTP object improved response times drastically. Why is that?
I am trying to download a file from an FTP server. I am able to connect to the server, but not able to change the directory.
#! /user/bin/python33
import os
import ftplib
ftp = ftplib.FTP("ftp.sec.gov")
ftp.login("anonymous", "abcd#yahoo.com")
data = []
ftp.dir(data.append)
ftp.quit()
for line in data:
    print("-", line)
print(os.getcwd())
path = "/edgar/full-index/2013/"
print(path)
ftp.cwd(path)
It fails on the last line. Can someone suggest what needs to be done?
Thanks a lot in advance.
Your cwd call fails because you previously called ftp.quit().
The docs for that method say:
Send a QUIT command to the server and close the connection. This is the “polite” way to close a connection, but it may raise an exception if the server responds with an error to the QUIT command. This implies a call to the close() method which renders the FTP instance useless for subsequent calls (see below).
(The "below" reference is to the next part of the documentation which says you can't do any operations on a closed FTP object.)
I am trying to import this file
http://pastebin.com/bEss4J6Q
Into this file
def MainLoop(self):  # MainLoop is used to make the commands executable ie !google !say etc;
    try:
        while True:
            # This method sends a ping to the server and if it pings it will send a pong back
            # in other clients they keep receiving till they have a complete line, however mine does not as of right now
            # The PING command is used to test the presence of an active client or
            # server at the other end of the connection. Servers send a PING
            # message at regular intervals if no other activity detected coming
            # from a connection. If a connection fails to respond to a PING
            # message within a set amount of time, that connection is closed. A
            # PING message MAY be sent even if the connection is active.
            # PONG message is a reply to PING message. If parameter <server2> is
            # given, this message will be forwarded to given target. The <server>
            # parameter is the name of the entity who has responded to PING message
            # and generated this message.
            self.data = self.irc.recv(4096)
            print self.data
            if self.data.find('PING') != -1:
                self.irc.send(("PONG %s \r\n") % (self.data.split()[1]))  # Possible overflow problem
            if "!chat" in self.data:
                .....
so that I can successfully call the imported file (ipibot) whenever '!chat' in self.data is true.
But I'm not sure how to write it. This is what I have so far:
if "!chat" in self.data:
user = ipibot.ipibot()
user.respond
I'd like to state that I have taken a look at the module and importing sections of the Python documentation; I just can't seem to grasp it, I guess.
file -> class -> function is what I understand it to be.
A module is nothing but a Python source file. You keep that source file in the same directory as your other source files, and you can then import it from them. When you import the module, the classes and functions defined in it become available for you to use.
For example, in your case you would just put
import ipibot
at the top of your source file, provided that ipibot.py (your pastebin file) is present in the same directory or on the PYTHONPATH (the set of directories where Python looks for modules), and then start using ipibot.ipibot() to call the ipibot() defined in that module. That's it.
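Applied to the bot loop from the question, and assuming ipibot.py defines a class (or callable) named ipibot with a respond method (the respond name comes from the question's attempt; its real signature may differ), a sketch would be:

import ipibot  # ipibot.py must sit next to this file or on the PYTHONPATH

# ... later, inside MainLoop from the question ...
        if "!chat" in self.data:
            user = ipibot.ipibot()           # instantiate whatever ipibot.py defines
            reply = user.respond(self.data)  # hypothetical call; adjust to the real method signature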
I'm writing code that will run on Linux, OS X, and Windows. It downloads a list of approximately 55,000 files from the server, then steps through the list of files, checking if the files are present locally. (With SHA hash verification and a few other goodies.) If the files aren't present locally or the hash doesn't match, it downloads them.
The server-side is plain-vanilla Apache 2 on Ubuntu over port 80.
The client side works perfectly on Mac and Linux, but gives me this error on Windows (XP and Vista) after downloading a number of files:
urllib2.URLError: <urlopen error <10048, 'Address already in use'>>
This link: http://bytes.com/topic/python/answers/530949-client-side-tcp-socket-receiving-address-already-use-upon-connect points me to TCP port exhaustion, but "netstat -n" never showed me more than six connections in "TIME_WAIT" status, even just before it errored out.
The code (called once for each of the 55,000 files it downloads) is this:
request = urllib2.Request(file_remote_path)
opener = urllib2.build_opener()
datastream = opener.open(request)
outfileobj = open(temp_file_path, 'wb')
try:
    while True:
        chunk = datastream.read(CHUNK_SIZE)
        if chunk == '':
            break
        else:
            outfileobj.write(chunk)
finally:
    outfileobj = outfileobj.close()
    datastream.close()
UPDATE: I find by grepping the log that it enters the download routine exactly 3998 times. I've run this multiple times and it fails at 3998 each time. Given that the linked article states that the available ports are 5000 - 1025 = 3975 (and some are probably expiring and being reused), it's starting to look a lot more like the linked article describes the real issue. However, I'm still not sure how to fix this. Making registry edits is not an option.
If it is really a resource problem (freeing OS socket resources), try this:
request = urllib2.Request(file_remote_path)
opener = urllib2.build_opener()

retry = 3  # 3 tries
while retry:
    try:
        datastream = opener.open(request)
    except urllib2.URLError, ue:
        if ue.reason.find('10048') > -1:
            if retry:
                retry -= 1
            else:
                raise urllib2.URLError("Address already in use / retries exhausted")
        else:
            retry = 0
    if datastream:
        retry = 0
        outfileobj = open(temp_file_path, 'wb')
        try:
            while True:
                chunk = datastream.read(CHUNK_SIZE)
                if chunk == '':
                    break
                else:
                    outfileobj.write(chunk)
        finally:
            outfileobj = outfileobj.close()
            datastream.close()
If you want, you can insert a sleep, or make it OS-dependent.
On my Win XP the problem doesn't show up (I reached 5000 downloads).
I watch my processes and network activity with Process Hacker.
Thinking outside the box, the problem you seem to be trying to solve has already been solved by a program called rsync. You might look for a Windows implementation and see if it meets your needs.
You should seriously consider copying and modifying this pyCurl example for efficient downloading of a large collection of files.
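For a flavour of what that looks like, here is a minimal pycurl sketch that reuses a single Curl handle across downloads so the underlying connection can be kept alive; the files_to_fetch iterable is a placeholder, not something from the linked example:

import pycurl

c = pycurl.Curl()  # one handle reused for every file keeps the connection open
for remote_url, local_path in files_to_fetch:  # placeholder list of (url, path) pairs
    with open(local_path, 'wb') as f:
        c.setopt(c.URL, remote_url)
        c.setopt(c.WRITEFUNCTION, f.write)  # stream the response body straight into the file
        c.perform()
c.close()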
Instead of opening a new TCP connection for each request you should really use persistent HTTP connections - have a look at urlgrabber (or alternatively, just at keepalive.py for how to add keep-alive connection support to urllib2).
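To illustrate the keep-alive idea with nothing but the standard library (this is a sketch of the concept, not the urlgrabber API; the host name and the files_to_fetch iterable are placeholders), httplib lets you issue many requests over one TCP connection:

import httplib

conn = httplib.HTTPConnection("your.server.example")  # placeholder host
for remote_path, local_path in files_to_fetch:         # placeholder list of (path, file) pairs
    conn.request("GET", remote_path)
    response = conn.getresponse()
    with open(local_path, 'wb') as f:
        f.write(response.read())  # read the full body before reusing the connection
conn.close()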
All indications point to a lack of available sockets. Are you sure that only 6 are in TIME_WAIT status? If you're running so many download operations, it's very likely that netstat overruns your terminal buffer; I find that netstat overruns my terminal during normal usage periods.
The solution is to either modify the code to reuse sockets or introduce a timeout. It also wouldn't hurt to keep track of how many open sockets you have, to optimize waiting. The default timeout on Windows XP is 120 seconds, so you want to sleep for at least that long if you run out of sockets. Unfortunately it doesn't look like there's an easy way to check from Python when a socket has closed and left the TIME_WAIT status.
Given the asynchronous nature of the requests and timeouts, the best way to do this might be in a thread. Make each thread sleep for 2 minutes before it finishes. You can either use a semaphore or limit the number of active threads to ensure that you don't run out of sockets.
Here's how I'd handle it. You might want to add an exception clause to the inner try block of the fetch section, to warn you about failed fetches.
import time
import threading
import Queue
import urllib2

# assumes url_queue is a Queue object populated with tuples in the form of (url_to_fetch, temp_file)
# also assumes that TotalUrls is the size of the queue before any threads are started.

class urlfetcher(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        try:  # needed to handle the Empty exception raised by an empty queue.
            file_remote_path, temp_file_path = self.queue.get()
        except Queue.Empty:
            return
        request = urllib2.Request(file_remote_path)
        opener = urllib2.build_opener()
        datastream = opener.open(request)
        outfileobj = open(temp_file_path, 'wb')
        try:
            while True:
                chunk = datastream.read(CHUNK_SIZE)
                if chunk == '':
                    break
                else:
                    outfileobj.write(chunk)
        finally:
            outfileobj = outfileobj.close()
            datastream.close()
        time.sleep(120)  # wait out the TIME_WAIT period before letting the thread finish
        self.queue.task_done()

Elsewhere (in the main code):

while url_queue.qsize() < TotalUrls:  # hard limit of available ports.
    if threading.active_count() < 3975:  # hard limit of available ports
        t = urlfetcher(url_queue)
        t.start()
    else:
        time.sleep(2)

url_queue.join()
Sorry, my python is a little rusty, so I wouldn't be surprised if I missed something.