I am trying to write a simple HTTP client program using raw sockets in Python 3. However, the server does not return a response despite having been sent a simple HTTP request. My question is why the server doesn't return a response.
Here is my code:
from socket import *
BUF_LEN = 8192 * 100000
info = getaddrinfo('google.com', 80, AF_INET)
addr = info[-1][-1]
print(addr)
client = socket(AF_INET, SOCK_STREAM)
client.connect(addr)
client.send(b"GET /index.html HTTP1.1\r\nHost: www.google.com\r\n")
print(client.recv(BUF_LEN).decode("utf-8")) # print nothing
You've missed a blank line at the end and mis-specified the HTTP version without a slash:
>>> client.send(b"GET /index.html HTTP1.1\r\nHost: www.google.com\r\n")
Should be:
>>> client.send(b"GET /index.html HTTP/1.1\r\nHost: www.google.com\r\n\r\n")
50
>>> client.recv(BUF_LEN).decode("utf-8")
u'HTTP/1.1 302 Found\r\nCache-Control: private\r\nContent-Type: text/html; charset=UTF-8\r\nLocation: http://www.google.co.uk/index.html?gfe_rd=cr&ei=fIR7WJ7QGejv8AeZzbWgCw\r\nContent-Length: 271\r\nDate: Sun, 15 Jan 2017 14:17:32 GMT\r\n\r\n<HTML><HEAD><meta http-equiv....
The blank line tells the server it's the end of the headers, and since this is a GET request there's no payload, so it can then return the content.
Without the / in HTTP/1.1, Google's servers return a 400 Bad Request response.
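As a minimal sketch of the fix above, the request can be assembled by a small helper so that the slash in the version and the terminating blank line are not forgotten. The name build_get_request is hypothetical (it is not part of the socket module), and the Connection: close header is an added assumption, not something the original request sent.

```python
def build_get_request(host, path="/"):
    """Return a minimal HTTP/1.1 GET request as bytes (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",   # the version needs the slash: HTTP/1.1
        f"Host: {host}",
        "Connection: close",      # assumption: ask the server to close after one response
    ]
    # Every line ends with CRLF; one extra CRLF (an empty line) ends the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_get_request("www.google.com", "/index.html")
print(request)
```

The resulting bytes can be passed to client.sendall(request) in place of the hand-written literal.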
Related
I'm sending data in a TCP client in python and the tutorial I'm following is telling me to send this:
"GET / HTTP/1.1\r\nHost: google.com\r\n\r\n"
I've tried looking up information about the formatting here and I'm confused about what the GET is actually requesting or what data would be sent back by this request, and also what is the purpose of the carriage returns and newlines?
If you want to write a low-level HTTP GET in Python, you can create a TCP socket, write the GET command (optionally with headers), and then read the response.
The HTTP request starts with a Request-line (e.g. GET / HTTP/1.1 with a terminating CRLF or "\r\n"). The request line is followed by zero or more headers, each ending with a CRLF. A final CRLF sequence marks the end of the request line and header part of the HTTP request, followed by an optional message body. The request structure is defined in section 5 of the HTTP 1.1 spec.
import socket
# host and port map to URL http://localhost:8000/
host = "localhost"
port = 8000
try:
    sock = socket.socket()
    sock.connect((host, port))
    sock.sendall("GET / HTTP/1.1\r\nHost: google.com\r\n\r\n".encode())
    # keep reading from socket until no more data in response
    while True:
        response = sock.recv(8096)
        if len(response) == 0:
            break
        print(response)
except Exception as ex:
    print("I/O Error:", ex)
The first line of the HTTP response is the status line including status code terminated with \r\n and followed by response headers.
HTTP/1.1 200 OK\r\n
Content-type: text/plain\r\n
Content-length: 14\r\n
\r\n
This is a test
You need to parse the status line and headers to determine how to decode the message body of the HTTP response.
Details of the HTTP response are in section 6 of the HTTP 1.1 Spec.
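To illustrate what parsing the status line and headers can look like, here is a minimal sketch that splits a raw response into its parts. The function name parse_response_head is hypothetical, and the sketch assumes the whole header section has already been received; it ignores details such as repeated header names and chunked transfer encoding.

```python
def parse_response_head(raw):
    """Split raw response bytes into (status_code, headers, body_so_far)."""
    # The first empty line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like: HTTP/1.1 200 OK
    version, status_code, *reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return int(status_code), headers, body


sample = (b"HTTP/1.1 200 OK\r\n"
          b"Content-type: text/plain\r\n"
          b"Content-length: 14\r\n"
          b"\r\n"
          b"This is a test")
status, headers, body = parse_response_head(sample)
print(status, headers["content-length"], body)
```

With the Content-length value in hand, a client knows how many body bytes to keep reading from the socket before the response is complete.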
Alternatively, the requests module implements the HTTP spec behind a simple API.
Example of making an HTTP GET with the requests API:
import requests
url = 'http://localhost:8000/'
response = requests.get(url)
print("Status code:", response.status_code)
print("Content:", response.text)
I'm trying to send an HTTP request to Google, but all I receive is empty (b""). Here is my code:
import socket
target_host = "www.google.com"
target_port = 80
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((target_host, target_port))
print("Connected...")
request = "GET / HTTP/1.1\r\nHost:%s\r\n\r\n" % target_host
response = client.recv(4096)
http_response = repr(response)
http_response_len = len(http_response)
print("[+RECV+] - length %d" % http_response_len)
print(http_response)
Here is my response:
[+RECV+] - length 3
b''
(also it took like 240 seconds to complete the request, is that normal?)
Thanks!
My bad, I forgot to send the data with
client.send(request.encode())
I have some code here that sends an HTTPS request.
My problem is handling redirects using the same socket connection.
I know that the requests module handles these redirects very well, but this code
is for a proxy server that I'm developing.
Any help would be appreciated. Thanks!
import socket, ssl
from ssl import SSLContext
HOST = "www.facebook.com"
PORT = 443
ContextoSSL = SSLContext(protocol=ssl.PROTOCOL_SSLv23)
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sslSocket = ContextoSSL.wrap_socket(sock, server_hostname=HOST)
sslSocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sslSocket.connect((HOST, PORT))
sslSocket.do_handshake()
der = sslSocket.getpeercert(binary_form=True)
pem_data = ssl.DER_cert_to_PEM_cert(der)
#print(pem_data) # print certificate
'''1st request'''
headers = \
"GET / HTTP/1.1\r\n"\
"Host: www.facebook.com\r\n"\
"User-Agent: python-requests/2.22.0\r\n"\
"Accept-Encoding: gzip, deflate\r\nAccept: */*\r\n"\
"Connection: keep-alive\r\n\r\n"
print("\n\n" + headers)
sslSocket.send(headers.encode()) # send request
response = sslSocket.recv(9999)
print(response) # print receive response
'''2nd request''' # on this redirect with cookie set, response should be 200 OK
cookie, location = "", ""
for line in response.decode().splitlines():
    if "Set-Cookie" in line:
        cookie = line.replace("Set-Cookie: ", "").split(";")[0]
    if "Location" in line:
        location = line.replace("Location: ", "").split("/")[3]
print(cookie, location)
headers = \
f"GET /{location} HTTP/1.1\r\n"\
"Host: www.facebook.com\r\n"\
"User-Agent: python-requests/2.22.0\r\n"\
"Accept-Encoding: gzip, deflate\r\nAccept: */*\r\n"\
"Connection: keep-alive\r\n"\
f"Cookie: {cookie}\r\n\r\n"
print("\n\n" + headers)
sslSocket.send(headers.encode()) # send request
response = sslSocket.recv(9999)
print(response) # print received response
To handle a redirect you must first get the new location:
- first properly read the response as specified in the HTTP standard, i.e. read the full body based on the length declared in the response
- parse the response
- check for a response code which indicates a redirect
- in case of a redirect, extract the new location from the Location field in the response header
Once you have the new location you can issue the new request for this location. If the new location is for the same domain and if both request and response indicated that the TCP connection can be reused you can try to issue the new request on the same TCP connection. But you need to handle the case that the server might close the connection anyway since this is explicitly allowed.
In all other cases you have to create a new TCP connection for the new request.
Note that showing you how you exactly can code this would be too broad. There is a reason HTTP libraries exist which you'd better use for this purpose instead of implementing all the complexity yourself.
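That said, the decision logic described above (check for a redirect code, extract Location, decide whether the connection could be reused) can be sketched in a few lines. This is only an illustration under simplifying assumptions: next_request_target is a hypothetical name, the response head is assumed to be fully read and decoded already, and only an explicit Connection: close is treated as ruling out reuse. The URLs in the demo are placeholders.

```python
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def next_request_target(request_url, response_head):
    """Return (new_url, can_reuse_connection) for a redirect, else None.

    response_head is the decoded status line plus headers (no body).
    """
    lines = response_head.split("\r\n")
    status_code = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # an empty line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    if status_code not in REDIRECT_CODES or "location" not in headers:
        return None

    # Location may be relative, so resolve it against the requested URL.
    new_url = urljoin(request_url, headers["location"])
    old, new = urlsplit(request_url), urlsplit(new_url)
    same_origin = (old.scheme, old.netloc) == (new.scheme, new.netloc)
    # Simplification: only an explicit "Connection: close" rules out reuse.
    keep_alive = headers.get("connection", "").lower() != "close"
    return new_url, same_origin and keep_alive


# Placeholder response head, for illustration only.
head = "HTTP/1.1 302 Found\r\nLocation: /login/\r\nConnection: keep-alive\r\n"
print(next_request_target("https://www.example.com/", head))
```

Even when the second value is True, the caller still has to read the complete body of the first response before sending the next request, and must be ready to reconnect if the server has closed the connection anyway.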
I am trying to issue a simple HTTP GET request to a website through Python socket library. The sample website I used here is https://azlyrics.com/lyrics/charlieputh/attention.html. My code is:
from socket import *
serverName = 'azlyrics.com';
serverPort = 80;
clientSocket = socket(AF_INET, SOCK_STREAM);
print(clientSocket.connect((serverName, serverPort)));
message = '''GET /lyrics/charlieputh/attention.html HTTP/1.1
Host: www.azlyrics.com
Connection: keep-alive
''';
print(message);
clientSocket.send(message.encode());
modifiedMessage = clientSocket.recv(2048).decode();
print(modifiedMessage);
clientSocket.close();
But I get no response message in return. Also the object I get in return of connect() is None. I have tried the same URL with Python Request Library and it works fine. What am I doing wrong here?
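The header section of an HTTP request must end with an empty line, and every line must end with \r\n (CR-LF). A triple-quoted string only contains \n line breaks, so write the line endings explicitly. (connect() returning None is normal; it raises an exception on failure.)
Please try this:
from socket import *
serverName = 'azlyrics.com';
serverPort = 80;
clientSocket = socket(AF_INET, SOCK_STREAM);
print(clientSocket.connect((serverName, serverPort)));
# Explicit CRLF line endings, with an empty line terminating the headers
message = ('GET /lyrics/charlieputh/attention.html HTTP/1.1\r\n'
           'Host: www.azlyrics.com\r\n'
           'Connection: keep-alive\r\n'
           '\r\n');
print(message);
clientSocket.send(message.encode());
modifiedMessage = clientSocket.recv(2048).decode();
print(modifiedMessage);
clientSocket.close();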
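One thing to watch for with the triple-quoted string in the question: Python stores its line breaks as a bare \n, whereas HTTP expects \r\n. The short check below demonstrates this. The replace call is just one possible way to normalize the line endings, and the variable name wire_message is made up for the example.

```python
message = '''GET /lyrics/charlieputh/attention.html HTTP/1.1
Host: www.azlyrics.com
Connection: keep-alive

'''

# The literal only contains bare "\n" characters, never "\r\n".
print("\r\n" in message)

# Convert every line break to CRLF before sending the bytes on the wire.
wire_message = message.replace("\n", "\r\n").encode()
print(wire_message.endswith(b"\r\n\r\n"))
```

Note that the string has to contain the empty line before the closing quotes; without it there is nothing to mark the end of the headers.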
I'm building a proxy server in Python and I have a question.
First, here is the part of my code that handles data received from the client. If there is data from the client, it downloads the content of the requested website (using the urllib library) and then sends the client a 200 OK with the content length and the content itself:
data = currentSocket.recv(4096)
if data == "":
    open_client_sockets.remove(currentSocket)
    print 'Conn is closed'
else:
    dataSplit = data.split("\r\n")
    Host = HostFliter(dataSplit)
    print Host, " Host"
    if Host == "":
        break
    contentURL = urllib.urlopen(Host)
    content_to_send = contentURL.read()
    currentSocket.send("HTTP/1.1 200 OK\r\nContent-Length:"+str(len(content_to_send))+"\r\n\r\n"+str(content_to_send))
    contentURL.close()
**The variable "Host" contains the url of the website.
Now for the question:
Where do I get the headers from the server and then send them to the client?
**The libraries I use: socket, select, urllib.
**This is for the select library:
rlist, wlist, xlist = select.select([serverSocket] + open_client_sockets, open_client_sockets, [])
The HTTP response syntax is as follows:
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 38

<html><body>Hello world!</body></html>
So you need to send the headers right after the status line, each terminated by \r\n, followed by an empty line (\r\n) before the body, as in the format above.
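As a rough sketch of that layout, the helper below serializes a status line, a set of headers, and a body into the bytes a proxy would send to the client. It is written for Python 3, build_response is a hypothetical name, and the headers dict is assumed to come from the upstream response (for instance via the info() method of the object that urlopen returns). Content-Length is recomputed from the body rather than copied.

```python
def build_response(body, headers, status_line="HTTP/1.1 200 OK"):
    """Serialize a response: status line, headers, empty line, then body."""
    head_lines = [status_line]
    for name, value in headers.items():
        if name.lower() == "content-length":
            continue  # recomputed below from the actual body
        head_lines.append(f"{name}: {value}")
    head_lines.append(f"Content-Length: {len(body)}")
    # CRLF after every line, plus one empty line before the body.
    head = "\r\n".join(head_lines) + "\r\n\r\n"
    return head.encode("iso-8859-1") + body


raw = build_response(b"<html><body>Hello world!</body></html>",
                     {"Content-Type": "text/html"})
print(raw)
```

Recomputing Content-Length avoids sending a stale value if the proxy has decoded or otherwise changed the body it fetched.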