I want to download .torrent file from the download link . I want the said file to be saved in .torrent format in the project folder
I've tried the following code
import urllib.request
url = 'https://torcache.net/torrent/92B4D5EA2D21BC2692A2CB1E5B9FBECD489863EC.torrent?title=[kat.cr]avengers.age.of.ultron.2015.1080p.brrip.x264.yify'
def download_torrent(url):
name= "movie"
full_name = str(name) + ".torrent"
urllib.request.urlretrieve(url, full_name)
download_torrent(url)
It shows the following error:
Traceback (most recent call last):
File "/home/taarush/PycharmProjects/untitled/fkn.py", line 10, in <module>
download_torrent(url)
File "/home/taarush/PycharmProjects/untitled/fkn.py", line 7, in download_torrent
urllib.request.urlretrieve(url, full_name)
File "/usr/lib/python3.5/urllib/request.py", line 187, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/usr/lib/python3.5/urllib/request.py", line 162, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.5/urllib/request.py", line 465, in open
response = self._open(req, data)
File "/usr/lib/python3.5/urllib/request.py", line 483, in _open
'_open', req)
File "/usr/lib/python3.5/urllib/request.py", line 443, in _call_chain
result = func(*args)
File "/usr/lib/python3.5/urllib/request.py", line 1286, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/usr/lib/python3.5/urllib/request.py", line 1246, in do_open
r = h.getresponse()
File "/usr/lib/python3.5/http/client.py", line 1197, in getresponse
response.begin()
File "/usr/lib/python3.5/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.5/http/client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
Where did it go wrong? is there a way to use magnet links? (urllib library only)
Add User-Agent http header as #Alik suggested. The question you've linked shows how. Here's an adaptation for Python 3:
#!/usr/bin/env python3
import urllib.request
opener = urllib.request.build_opener()
opener.addheaders = [('User-Agent', 'CERN-LineMode/2.15 libwww/2.17b3')]
urllib.request.install_opener(opener) #NOTE: global for the process
urllib.request.urlretrieve(url, filename)
See the specification for User-Agent. If the site rejects your custom User-Agent, you could send User-Agent produced by ordinary browsers such as fake_useragent.UserAgent().random.
Related
I'm using SMTP lib to send emails with Python, and am suddenly encountering errors connecting to the mail server. I have never run into errors before and have sent many emails using this code - the only possibility I can think of is that I've recently updated to Mac OS 12.3.1 and that this may have modified my python installation in some way. Has anyone encountered (and resolved) a similar error?
Traceback (most recent call last):
File "subtest4.py", line 25, in <module>
userIP = urlopen('http://ip.42.pl/raw').read()
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 525, in open
response = self._open(req, data)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 543, in _open
'_open', req)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 503, in _call_chain
result = func(*args)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1345, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1320, in do_open
r = h.getresponse()
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1321, in getresponse
response.begin()
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 296, in begin
version, status, reason = self._read_status()
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 257, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/socket.py", line 589, in readinto
return self._sock.recv_into(b)
ConnectionResetError: [Errno 54] Connection reset by peer
My code is as follows:
import urllib.request as urllib
def read_text():
quotes = open(r"C:\Users\hjayasinghe2\Desktop\Hasara Work\Learn\demofile.txt")
contents_of_file = quotes.read()
print(contents_of_file)
quotes.close()
check_pofanity(contents_of_file)
def check_pofanity(text_to_check):
connection = urllib.urlopen("http://www.wdyl.com/profanity?q= " + text_to_check)
output = connection.read()
print(output)
connection.close()
read_text()
errors i get is this:
Traceback (most recent call last):
File "C:/Users/hjayasinghe2/Desktop/Hasara Work/Learn/check_profanity.py", line 16, in <module>
read_text()
File "C:/Users/hjayasinghe2/Desktop/Hasara Work/Learn/check_profanity.py", line 8, in read_text
check_pofanity(contents_of_file)
File "C:/Users/hjayasinghe2/Desktop/Hasara Work/Learn/check_profanity.py", line 11, in check_pofanity
connection = urllib.urlopen("http://www.wdyl.com/profanity?q= " + text_to_check)
File "C:\Users\hjayasinghe2\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\hjayasinghe2\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "C:\Users\hjayasinghe2\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 543, in _open
'_open', req)
File "C:\Users\hjayasinghe2\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\hjayasinghe2\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 1345, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "C:\Users\hjayasinghe2\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 1317, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "C:\Users\hjayasinghe2\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 1244, in request
self._send_request(method, url, body, headers, encode_chunked)
File "C:\Users\hjayasinghe2\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 1255, in _send_request
self.putrequest(method, url, **skips)
File "C:\Users\hjayasinghe2\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 1117, in putrequest
raise InvalidURL(f"URL can't contain control characters. {url!r} "
http.client.InvalidURL: URL can't contain control characters. '/profanity?q= Video provides a powerful way to help you prove your point. When you click Online Video, you can paste in the embed code for the video you want to add.' (found at least ' ')
You'll need to URL encode the query. Try something like:
import urllib, urllib.parse
url = "http://www.wdyl.com/profanity?q=" + urllib.parse.quote(text_to_check)
connection = urllib.urlopen(url)
See https://docs.python.org/3/library/urllib.parse.html#urllib.parse.quote
I am trying to make a function that will download file from the internet. If I go to the direct web address I do get the images or download the files. But, when I am running my code, it just hangs and then I get the timeout error. Is there any particular reason why that might be happening?
#fpp = "http://www.blog.pythonlibrary.org/wp-content/uploads/2012/06/wxDbViewer.zip"
fpp = "http://www.gunnerkrigg.com//comics/00000001.jpg"
download_file(fpp)
This is my function:
import urllib2
def download_file(url_path):
response = urllib2.urlopen(url_path)
data = response.read()
Is there any particular reason why might work from the browser but not in the code?
this is the error i get:
Traceback (most recent call last):
File "/Users/dk/testing/myfile.py", line 42, in <module>
download_file(fpp)
File "/Users/dk/Documents/testing/code_project.py", line 154, in download_file
response = urllib2.urlopen(url_path)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1227, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1197, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 60] Operation timed out>
I want to create a web based scraper using Python, Selenium and PhantomJS where you can input a url into a form and the results from the scrape will be returned to the webpage. I can run it on my PC and I can also get it to work through the terminal.
It is located in a virtual environment on Dreamhost shared hosting with Python3.5 installed. I have tested that the parameters are being passed in fine, and it does work using just lxml and requests. However, when I try to run the script from the form on the webpage using PhantomJS then it doesn't work properly. The following error in returned...
Traceback (most recent call last):
File "testscrape.py", line 140, in <module>
driver = init_driver()
File "testscrape.py", line 69, in init_driver
driver = webdriver.PhantomJS(executable_path=phantomPATH,desired_capabilities=dcap)
File "/home/paul/.python35/bin/magenv/lib/python3.5/site-packages/selenium/webdriver/phantomjs/webdriver.py", line 56, in __init__
desired_capabilities=desired_capabilities)
File "/home/paul/.python35/bin/magenv/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 91, in __init__
self.start_session(desired_capabilities, browser_profile)
File "/home/paul/.python35/bin/magenv/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 173, in start_session
'desiredCapabilities': desired_capabilities,
File "/home/paul/.python35/bin/magenv/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 231, in execute
response = self.command_executor.execute(driver_command, params)
File "/home/paul/.python35/bin/magenv/lib/python3.5/site-packages/selenium/webdriver/remote/remote_connection.py", line 395, in execute
return self._request(command_info[0], url, body=data)
File "/home/paul/.python35/bin/magenv/lib/python3.5/site-packages/selenium/webdriver/remote/remote_connection.py", line 463, in _request
resp = opener.open(request, timeout=self._timeout)
File "/home/paul/.python35/lib/python3.5/urllib/request.py", line 465, in open
response = self._open(req, data)
File "/home/paul/.python35/lib/python3.5/urllib/request.py", line 483, in _open
'_open', req)
File "/home/paul/.python35/lib/python3.5/urllib/request.py", line 443, in _call_chain
result = func(*args)
File "/home/paul/.python35/lib/python3.5/urllib/request.py", line 1268, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/home/paul/.python35/lib/python3.5/urllib/request.py", line 1243, in do_open
r = h.getresponse()
File "/home/paul/.python35/lib/python3.5/http/client.py", line 1174, in getresponse
response.begin()
File "/home/paul/.python35/lib/python3.5/http/client.py", line 282, in begin
version, status, reason = self._read_status()
File "/home/paul/.python35/lib/python3.5/http/client.py", line 243, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/home/paul/.python35/lib/python3.5/socket.py", line 575, in readinto
return self._sock.recv_into(b)
ConnectionResetError: [Errno 104] Connection reset by peer
I have tried a few different variations of desired_capabilities and even changing file permissions of everything in the virtual environment but to no avail. I must be missing something, or is it just not possible? Any suggestions gratefully received.
i'm trying to open an url http://الاعلي-للاتصالات.قطر/ar/news-events/event/future-internet-privacy
with the urllib2.urlopen but it reports always an error.
The similar occurs to http://الاعلي-للاتصالات.قطر/ar ... other pages (chinese ones) are opened ok.
Any ideas to point me to the right way to open those urls?
urllib2.urlopen("http://الاعلي-للاتصالات.قطر/ar/news-events/event/future-internet-privacy").read()
urllib2.urlopen('http://الاعلي-للاتصالات.قطر').read()
[Edited]
the error is :
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.6/urllib2.py", line 391, in open
response = self._open(req, data)
File "/usr/lib/python2.6/urllib2.py", line 409, in _open
'_open', req)
File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/usr/lib/python2.6/urllib2.py", line 1170, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib/python2.6/urllib2.py", line 1142, in do_open
h.request(req.get_method(), req.get_selector(), req.data, headers)
File "/usr/lib/python2.6/httplib.py", line 914, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python2.6/httplib.py", line 951, in _send_request
self.endheaders()
File "/usr/lib/python2.6/httplib.py", line 908, in endheaders
self._send_output()
File "/usr/lib/python2.6/httplib.py", line 780, in _send_output
self.send(msg)
File "/usr/lib/python2.6/httplib.py", line 759, in send
self.sock.sendall(str)
I also tried with the u'http://الاعلي-للاتصالات.قطر'.encode('utf-8') but the result url can't be opened too.
As #Donal says, the URL has to be punycoded. Luckily Python includes this already. Here is a sample Python code
domain = "الاعلي-للاتصالات.قطر"
domain_unicode = unicode(domain, "utf8")
domain_idna = domain_unicode.encode("idna")
urllib2.urlopen("http://" + domain_idna).read()
Hope this helps.