Python program to find active ports on a website?

My college website is served on several ports, something like this:
http://www.college.in:913
I want a program to find the active ones, i.e. the port numbers on which the website is working.
Here is my code, but it takes a lot of time:
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

for i in range(1, 10000):
    req = Request("http://college.edu.in:" + str(i))
    try:
        response = urlopen(req)
    except URLError as e:
        print("Error at port" + str(i))
    else:
        print('Website is working fine' + str(i))

It might be faster to try opening a socket connection to each port in the range and only make an HTTP request if the socket is actually open. But iterating through thousands of ports is slow either way: if each attempt takes 0.5 seconds and you're scanning 10,000 ports, that's a lot of time spent waiting.
import socket

# create an INET, STREAMing socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# now connect to the web server on port 80 - the normal http port
s.connect(("www.python.org", 80))
s.close()
from https://docs.python.org/3/howto/sockets.html
You might also consider profiling the code and finding out where the slow parts are.
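If all you need is a list of ports that accept TCP connections, a short timeout per attempt plus a thread pool keeps the scan tolerable. A minimal sketch, assuming the college.edu.in hostname from the question and an illustrative port range:
import socket
from concurrent.futures import ThreadPoolExecutor

HOST = "college.edu.in"  # hostname taken from the question
PORTS = range(1, 10000)  # illustrative range

def port_is_open(port, timeout=0.5):
    # A plain TCP connect; the short timeout keeps closed/filtered ports cheap.
    try:
        with socket.create_connection((HOST, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run many connection attempts concurrently instead of one at a time.
with ThreadPoolExecutor(max_workers=100) as pool:
    results = pool.map(port_is_open, PORTS)

open_ports = [port for port, ok in zip(PORTS, results) if ok]
print(open_ports)
You could then run the urlopen() check from the question against only the ports in open_ports.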

You can use python-nmap, a Python wrapper around the nmap port scanner.
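For example, a rough sketch of what that might look like (this assumes the nmap binary is installed on the system; the host name and port range are just placeholders):
import nmap  # pip install python-nmap

nm = nmap.PortScanner()
nm.scan('college.edu.in', '1-1024')  # placeholder host and port range

for host in nm.all_hosts():
    for port in nm[host].all_tcp():
        if nm[host]['tcp'][port]['state'] == 'open':
            print(host, port)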

Related

How would I increase DNS resolution timeout in a Python script?

The server that runs my Python code, which gets some data off a website, has to jump through so many hoops just to reach the DNS server that the latency gets to the point where the DNS resolution times out (this is out of my control). Sometimes it works, sometimes it doesn't.
So I am trying to do some exception handling and try to ensure that it works.
Goal:
1. Increase the DNS timeout. I am unsure of a good time, but let's go with 30 seconds.
2. Try to resolve the website 5 times; if it resolves, proceed to scrape the website. If it doesn't, keep trying until the 5 attempts are up.
Here is the code using google.com as an example.
import socket
import http.client

# Confirm that DNS resolution is successful.
def dns_lookup(host):
    try:
        socket.getaddrinfo(host, 80)
    except socket.gaierror:
        return "DNS resolution to the host failed."
    return True

# Scrape the targeted website.
def request_website_data():
    conn = http.client.HTTPConnection("google.com")
    conn.request("GET", "/")
    res = conn.getresponse()
    if res.status == 200:
        print("Connection to the website worked! Do some more stuff...")
    else:
        print("Connection to the website did not work. Terminating.")

# Attempt DNS resolution 5 times; if it succeeds, immediately request the website and break the loop.
for x in range(5):
    dns_resolution = dns_lookup('google.com')
    if dns_resolution == True:
        request_website_data()
        break
    else:
        print(dns_resolution)
I have been looking at socket.settimeout(value) in the socket library, and I am unsure whether that is what I'm looking for. What would I insert into my code to get a longer, more forgiving DNS resolution time?
Thank you.
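For what it's worth, as far as I know the operating system's resolver (not Python) decides how long getaddrinfo() waits, so socket.settimeout() won't lengthen the DNS lookup itself. A pure-Python way to be more forgiving is simply to pause between the retry attempts the question already makes. A rough sketch along those lines (the 5 attempts and the roughly 30-second overall budget are the numbers from the question):
import socket
import time

def dns_lookup_with_retries(host, attempts=5, wait_seconds=6):
    # Retry getaddrinfo() a few times, sleeping between attempts.
    # 5 attempts with 6-second pauses roughly matches the 30-second budget above.
    for attempt in range(attempts):
        try:
            socket.getaddrinfo(host, 80)
            return True
        except socket.gaierror:
            if attempt < attempts - 1:
                time.sleep(wait_seconds)
    return False

if dns_lookup_with_retries('google.com'):
    request_website_data()  # function from the question's code above
else:
    print("DNS resolution failed after all attempts.")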

How do I gracefully close a socket with a persistent HTTP connection?

I'm writing a very simple client in Python that fetches an HTML page from the WWW. This is the code I've come up with so far:
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("www.mywebsite.com", 80))
sock.send(b"GET / HTTP/1.1\r\nHost:www.mywebsite.com\r\n\r\n")

while True:
    chunk = sock.recv(1024)  # (1)
    if len(chunk) == 0:
        break
    print(chunk)

sock.close()
The problem is: since an HTTP/1.1 connection is persistent by default, the code gets stuck at # (1) waiting for more data from the server once the transmission is over.
I know I can solve this by a) adding the Connection: close request header, or b) setting a timeout on the socket. A non-blocking socket would not help here, as the select() syscall would still hang (unless I set a timeout on it, but that's just another form of case b)).
So is there another way to do it, while keeping the connection persistent?
As has already been said in the comments, there's a lot to consider if you're trying to write an all-singing, all-dancing HTTP processor. However, if you're just practising with sockets then consider this.
Let's assume that you know how the response will end. For example, if we send essentially the request from your code to the main Google page, we know that the response headers will end with '\r\n\r\n'. So what we can do is just read 1 byte at a time and look out for that terminating sequence.
This code will NOT give you the full Google main page because, as you will see, the response is chunked - and that's a whole new ball game.
Having said all of that, you may find this instructive:
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    sock.connect(('www.google.com', 80))
    sock.send(b'GET / HTTP/1.1\r\nHost:www.google.com\r\n\r\n')
    end = [b'\r', b'\n', b'\r', b'\n']
    d = []
    while d[-len(end):] != end:
        d.append(sock.recv(1))
    print(''.join(b.decode() for b in d))
finally:
    sock.close()
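If the server sends a Content-Length header instead of a chunked body, another way to keep the connection persistent and still know when to stop reading is to parse the headers yourself and then read exactly that many body bytes. A rough sketch, reusing the host from the question:
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("www.mywebsite.com", 80))
sock.send(b"GET / HTTP/1.1\r\nHost: www.mywebsite.com\r\n\r\n")

# Read until the blank line that ends the headers.
data = b""
while b"\r\n\r\n" not in data:
    data += sock.recv(1024)
headers, _, body = data.partition(b"\r\n\r\n")

# Find the announced body length and keep reading until we have it all.
content_length = 0
for line in headers.split(b"\r\n"):
    if line.lower().startswith(b"content-length:"):
        content_length = int(line.split(b":", 1)[1])
while len(body) < content_length:
    body += sock.recv(1024)

print(body)
# The socket is still open at this point and could be reused for another request.
sock.close()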

Simple Python Web Server trouble

I'm trying to write a Python web server using the socket library. I've been through several sources and can't figure out why the code I've written doesn't work. Others have run very similar code and claim it works. I'm new to Python, so I might be missing something simple.
The only way it works right now is if I send the data variable back to the client, in which case the browser just prints the original GET request. When I try to send an actual HTTP response, the connection times out.
import socket

## Creates several variables, including the host name, the port to use,
## the size of a transmission, and how many requests can be handled at once
host = ''
port = 8080
backlog = 5
size = 1024

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((host, port))
s.listen(backlog)

while 1:
    client, address = s.accept()
    data = client.recv(16)
    if data:
        client.send('HTTP/1.0 200 OK\r\n')
        client.send("Content-Type: text/html\r\n\r\n")
        client.send('<html><body><h1>Hello World</body></html>')
    client.close()
    s.close()
You need to consume the input before responding, and you shouldn't close the socket in your while loop:
Replace client.recv(16) with client.recv(size), to consume the request.
Move your last line, s.close() back one indent, so that it is not in your while loop. At the moment you are closing the connection, then trying to accept from it again, so your server will crash after the first request.
Unless you are doing this as an exercise, you should extend SimpleHTTPServer instead of using sockets directly.
Also, adding this line after you create the socket (before bind) fixes any "Address already in use" errors you might be getting.
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
Good luck!
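Putting those suggestions together, a sketch of how the corrected server might look (the byte-string literals are just so the send() calls also work on Python 3):
import socket

host = ''
port = 8080
backlog = 5
size = 1024

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # avoid "Address already in use"
s.bind((host, port))
s.listen(backlog)

while 1:
    client, address = s.accept()
    data = client.recv(size)  # consume the full request before replying
    if data:
        client.send(b'HTTP/1.0 200 OK\r\n')
        client.send(b'Content-Type: text/html\r\n\r\n')
        client.send(b'<html><body><h1>Hello World</h1></body></html>')
    client.close()  # close the per-client connection each time around the loop

s.close()  # only close the listening socket once the loop ends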

Python: Downloading files from HTTP server

I have written some Python scripts to download images off an HTTP website, but because I'm using urllib2, it closes the connection after each download and then opens a new one for the next. I don't really understand networking all that much, but this probably slows things down considerably, and grabbing 100 images at a time would take a considerable amount of time.
I started looking at other alternatives like pycurl or httplib, but found them complicated to figure out compared to urllib2 and haven't found a lot of code snippets that I could just take and use.
Simply, how would I establish a persistent connection to a website and download a number of files and then close the connection only when I am done? (probably an explicit call to close it)
Since you asked for an httplib snippet:
import httplib

images = ['img1.png', 'img2.png', 'img3.png']

conn = httplib.HTTPConnection('www.example.com')
for image in images:
    conn.request('GET', '/images/%s' % image)
    resp = conn.getresponse()
    data = resp.read()
    with open(image, 'wb') as f:
        f.write(data)
conn.close()
This would issue multiple (sequential) GET requests for the images in the list, then close the connection.
I found urllib3, which claims to reuse the existing TCP connection.
As I already stated in a comment on the question, I disagree with the claim that this will not make a big difference: because of the TCP slow-start algorithm, every newly created connection is slow at first, so reusing the same TCP socket makes a difference if the data is big enough. And I think for 100 images the data will be between 10 and 100 MB.
Here is a code sample from http://code.google.com/p/urllib3/source/browse/test/benchmark.py
TO_DOWNLOAD = [
    'http://code.google.com/apis/apps/',
    'http://code.google.com/apis/base/',
    'http://code.google.com/apis/blogger/',
    'http://code.google.com/apis/calendar/',
    'http://code.google.com/apis/codesearch/',
    'http://code.google.com/apis/contact/',
    'http://code.google.com/apis/books/',
    'http://code.google.com/apis/documents/',
    'http://code.google.com/apis/finance/',
    'http://code.google.com/apis/health/',
    'http://code.google.com/apis/notebook/',
    'http://code.google.com/apis/picasaweb/',
    'http://code.google.com/apis/spreadsheets/',
    'http://code.google.com/apis/webmastertools/',
    'http://code.google.com/apis/youtube/',
]

from urllib3 import HTTPConnectionPool

pool = HTTPConnectionPool.from_url(TO_DOWNLOAD[0])
for url in TO_DOWNLOAD:
    r = pool.get_url(url)
If you are not going to make any complicated requests, you could open a socket and make the requests yourself, like:
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((server_name, server_port))
for url in urls:
    sock.send('GET %s HTTP/1.1\r\nHost: %s\r\n\r\n' % (url, server_name))
    # Parse HTTP header
    # Download picture (Size should be in the HTTP header)
sock.close()
But I do not think establishing 100 TCP sessions adds much overhead in general.

Python send cmd on socket

I have a simple question about Python:
I have another Python script listening on a port on a Linux machine.
I have made it so I can send a request to it, and it will inform another system that it is alive and listening.
My problem is that I don't know how to send this request from another Python script running on the same machine (blush).
I have a script running every minute, and I would like to expand it to also send this request. I don't expect to get a response back; my listening script posts to a database.
In Internet Explorer, I would write it like this: http://192.168.1.46:8193/?Ping
I would like to know how to do this from Python, preferably just sending the request and not hanging if the other script is not running.
thanks
Michael
It looks like you are doing an HTTP request, rather than an ICMP ping.
urllib2, built into Python, can help you do that.
You'll need to override the timeout so you aren't hanging too long. Straight from that article, above, here is some example code for you to tweak with your desired time-out and URL.
import socket
import urllib2
# timeout in seconds
timeout = 10
socket.setdefaulttimeout(timeout)
# this call to urllib2.urlopen now uses the default timeout
# we have set in the socket module
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)
import urllib2

try:
    response = urllib2.urlopen('http://192.168.1.46:8193/?Ping', timeout=2)
    print 'response headers: "%s"' % response.info()
except IOError, e:
    if hasattr(e, 'code'):  # HTTPError
        print 'http error code: ', e.code
    elif hasattr(e, 'reason'):  # URLError
        print "can't connect, reason: ", e.reason
    else:
        raise  # don't know what it is
This is a bit outside my knowledge, but maybe this question might help?
Ping a site in Python?
Considered Twisted? What you're trying to achieve could be taken straight out of their examples. It might be overkill, but if you'll eventually want to start adding authentication, authorization, SSL, etc. you might as well start in that direction.
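For what it's worth, a rough sketch of such a fire-and-forget request with Twisted's Agent might look like this (the URL is the one from the question; the 2-second connect timeout is an arbitrary choice):
from twisted.internet import reactor
from twisted.web.client import Agent

agent = Agent(reactor, connectTimeout=2)

def on_response(response):
    print('ping delivered, status %d' % response.code)

def on_failure(failure):
    print('listener not reachable: %s' % failure.getErrorMessage())

d = agent.request(b'GET', b'http://192.168.1.46:8193/?Ping')
d.addCallbacks(on_response, on_failure)
d.addBoth(lambda _: reactor.stop())
reactor.run()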
