I am trying to get the content of the body, but when I call sock.recv I always get 0 bytes back. I already received the header, reading it byte by byte, and that worked fine. My problem now is this: I have the content length, the length of the header and also the header. Now I want to get the body separately.
Task 3d
PS: I am aware that it can't work as it is on the screenshot but I haven't found another solution yet
# -*- coding: utf-8 -*-
"""
task3.simple_web_browser
XX-YYY-ZZZ
<Your name>
"""
from socket import gethostbyname, socket, timeout, AF_INET, SOCK_STREAM
from sys import argv
HTTP_HEADER_DELIMITER = b'\r\n\r\n'
CONTENT_LENGTH_FIELD = b'Content-Length:'
HTTP_PORT = 80
ONE_BYTE_LENGTH = 1
def create_http_request(host, path, method='GET'):
    '''
    Create a sequence of bytes representing an HTTP/1.1 request of the given method.
    :param host: the string contains the hostname of the remote server
    :param path: the string contains the path to the document to retrieve
    :param method: the string contains the HTTP request method (e.g., 'GET', 'HEAD', etc...)
    :return: a bytes object contains the HTTP request to send to the remote server
    e.g.,) An HTTP/1.1 GET request to http://compass.unisg.ch/
    host: compass.unisg.ch
    path: /
    return: b'GET / HTTP/1.1\nHost: compass.unisg.ch\r\n\r\n'
    '''
    ### Task 3(a) ###
    # Hint 1: see RFC7230-7231 for the HTTP/1.1 syntax and semantics specification
    # https://tools.ietf.org/html/rfc7230
    # https://tools.ietf.org/html/rfc7231
    # Hint 2: use str.encode() to create an encoded version of the string as a bytes object
    # https://docs.python.org/3/library/stdtypes.html#str.encode
    r = '{} {} HTTP/1.1\nHost: {}\r\n\r\n'.format(method, path, host)
    response = r.encode()
    return response
    ### Task 3(a) END ###
def get_content_length(header):
    '''
    Get the integer value from the Content-Length HTTP header field if it
    is found in the given sequence of bytes. Otherwise returns 0.
    :param header: the bytes object contains the HTTP header
    :return: an integer value of the Content-Length, 0 if not found
    '''
    ### Task 3(c) ###
    # Hint: use CONTENT_LENGTH_FIELD to find the value
    # Note that the Content-Length field may not be always at the end of the header.
    for line in header.split(b'\r\n'):
        if CONTENT_LENGTH_FIELD in line:
            return int(line[len(CONTENT_LENGTH_FIELD):])
    return 0
    ### Task 3(c) END ###
def receive_body(sock, content_length):
    '''
    Receive the body content in the HTTP response
    :param sock: the TCP socket connected to the remote server
    :param content_length: the size of the content to receive
    :return: a bytes object contains the remaining content (body) in the HTTP response
    '''
    ### Task 3(d) ###
    body = bytes()
    data = bytes()
    while True:
        data = sock.recv(content_length)
        if len(data)<=0:
            break
        else:
            body += data
    return body
    ### Task 3(d) END ###
def receive_http_response_header(sock):
    '''
    Receive the HTTP response header from the TCP socket.
    :param sock: the TCP socket connected to the remote server
    :return: a bytes object that is the HTTP response header received
    '''
    ### Task 3(b) ###
    # Hint 1: use HTTP_HEADER_DELIMITER to determine the end of the HTTP header
    # Hint 2: use sock.recv(ONE_BYTE_LENGTH) to receive the chunk byte-by-byte
    header = bytes()
    chunk = bytes()
    try:
        while HTTP_HEADER_DELIMITER not in chunk:
            chunk = sock.recv(ONE_BYTE_LENGTH)
            if not chunk:
                break
            else:
                header += chunk
    except socket.timeout:
        pass
    return header
    ### Task 3(b) END ###
def main():
    # Change the host and path below to test other web sites!
    host = 'example.com'
    path = '/index.html'
    print(f"# Retrieve data from http://{host}{path}")
    # Get the IP address of the host
    ip_address = gethostbyname(host)
    print(f"> Remote server {host} resolved as {ip_address}")
    # Establish the TCP connection to the host
    sock = socket(AF_INET, SOCK_STREAM)
    sock.connect((ip_address, HTTP_PORT))
    print(f"> TCP Connection to {ip_address}:{HTTP_PORT} established")
    # Uncomment this comment block after Task 3(a)
    # Send an HTTP GET request
    http_get_request = create_http_request(host, path)
    print('\n# HTTP GET request ({} bytes)'.format(len(http_get_request)))
    print(http_get_request)
    sock.sendall(http_get_request)
    # Comment block for Task 3(a) END
    # Uncomment this comment block after Task 3(b)
    # Receive the HTTP response header
    header = receive_http_response_header(sock)
    print(type(header))
    print('\n# HTTP Response Header ({} bytes)'.format(len(header)))
    print(header)
    # Comment block for Task 3(b) END
    # Uncomment this comment block after Task 3(c)
    content_length = get_content_length(header)
    print('\n# Content-Length')
    print(f"{content_length} bytes")
    # Comment block for Task 3(c) END
    # Uncomment this comment block after Task 3(d)
    body = receive_body(sock, content_length)
    print('\n# Body ({} bytes)'.format(len(body)))
    print(body)
    # Comment block for Task 3(d) END

if __name__ == '__main__':
    main()
I have the content length the length of the header and also the header
You don't. In receive_http_response_header you always check HTTP_HEADER_DELIMITER against only the latest byte (chunk instead of header), which means that you'll never match the end of the header:
    while HTTP_HEADER_DELIMITER not in chunk:
        chunk = sock.recv(ONE_BYTE_LENGTH)
        if not chunk:
            break
        else:
            header += chunk
Then you just assume that you've read only the header, while in reality you've read the full response. This means that any further recv you do when trying to read the response body will only return 0 bytes, since no more data is left, i.e. the body was already included in what you consider the HTTP header.
Apart from that, receive_body is wrong too, since it makes a mistake similar to the one in receive_http_response_header: the goal is not to recv content_length bytes again and again until no more bytes are available, as you do currently. The goal is to keep reading the remaining data as long as the body is not fully read, and to return once len(body) matches content_length.
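The two fixes described above can be sketched as follows. This is a minimal sketch, not the assignment's reference solution: it keeps the function names from the question, checks the delimiter against the accumulated header rather than the latest byte, and stops reading the body once content_length bytes have arrived. The demo function and its socketpair-based stand-in for a server are assumptions added purely for illustration.

```python
import socket

HTTP_HEADER_DELIMITER = b'\r\n\r\n'
ONE_BYTE_LENGTH = 1


def receive_http_response_header(sock):
    """Read byte by byte until the accumulated header ends with the delimiter."""
    header = b''
    while not header.endswith(HTTP_HEADER_DELIMITER):
        chunk = sock.recv(ONE_BYTE_LENGTH)
        if not chunk:  # peer closed the connection before the header ended
            break
        header += chunk
    return header


def receive_body(sock, content_length):
    """Read until content_length bytes are collected or the peer closes early."""
    body = b''
    while len(body) < content_length:
        # Never ask for more than what is still missing from the body.
        chunk = sock.recv(content_length - len(body))
        if not chunk:
            break
        body += chunk
    return body


def demo():
    """Exercise both helpers over a local socket pair instead of a real server."""
    client, server = socket.socketpair()
    server.sendall(b'HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello')
    server.close()
    header = receive_http_response_header(client)
    body = receive_body(client, 5)
    client.close()
    return header, body
```

Because the header loop stops exactly at the delimiter, the body bytes are still waiting in the socket when receive_body is called.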
I am trying to understand the resolving process in dnslib. Specifically, I am using the proxy.py example to implement a local DNS proxy which will send a request to specific servers based on the query.
(copy of proxy.py):
# -*- coding: utf-8 -*-
from __future__ import print_function
import binascii,socket,struct
from dnslib import DNSRecord,RCODE
from dnslib.server import DNSServer,DNSHandler,BaseResolver,DNSLogger
class ProxyResolver(BaseResolver):
"""
Proxy resolver - passes all requests to upstream DNS server and
returns response
Note that the request/response will be each be decoded/re-encoded
twice:
a) Request packet received by DNSHandler and parsed into DNSRecord
b) DNSRecord passed to ProxyResolver, serialised back into packet
and sent to upstream DNS server
c) Upstream DNS server returns response packet which is parsed into
DNSRecord
d) ProxyResolver returns DNSRecord to DNSHandler which re-serialises
this into packet and returns to client
In practice this is actually fairly useful for testing but for a
'real' transparent proxy option the DNSHandler logic needs to be
modified (see PassthroughDNSHandler)
"""
def __init__(self,address,port,timeout=0):
self.address = address
self.port = port
self.timeout = timeout
def resolve(self,request,handler):
try:
if handler.protocol == 'udp':
proxy_r = request.send(self.address,self.port,
timeout=self.timeout)
else:
proxy_r = request.send(self.address,self.port,
tcp=True,timeout=self.timeout)
reply = DNSRecord.parse(proxy_r)
except socket.timeout:
reply = request.reply()
reply.header.rcode = getattr(RCODE,'NXDOMAIN')
return reply
class PassthroughDNSHandler(DNSHandler):
"""
Modify DNSHandler logic (get_reply method) to send directly to
upstream DNS server rather then decoding/encoding packet and
passing to Resolver (The request/response packets are still
parsed and logged but this is not inline)
"""
def get_reply(self,data):
host,port = self.server.resolver.address,self.server.resolver.port
request = DNSRecord.parse(data)
self.server.logger.log_request(self,request)
if self.protocol == 'tcp':
data = struct.pack("!H",len(data)) + data
response = send_tcp(data,host,port)
response = response[2:]
else:
response = send_udp(data,host,port)
reply = DNSRecord.parse(response)
self.server.logger.log_reply(self,reply)
return response
def send_tcp(data,host,port):
"""
Helper function to send/receive DNS TCP request
(in/out packets will have prepended TCP length header)
"""
sock = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
sock.connect((host,port))
sock.sendall(data)
response = sock.recv(8192)
length = struct.unpack("!H",bytes(response[:2]))[0]
while len(response) - 2 < length:
response += sock.recv(8192)
sock.close()
return response
def send_udp(data,host,port):
"""
Helper function to send/receive DNS UDP request
"""
sock = socket.socket(socket.AF_INET,socket.SOCK_DGRAM)
sock.sendto(data,(host,port))
response,server = sock.recvfrom(8192)
sock.close()
return response
if __name__ == '__main__':
import argparse,sys,time
p = argparse.ArgumentParser(description="DNS Proxy")
p.add_argument("--port","-p",type=int,default=53,
metavar="<port>",
help="Local proxy port (default:53)")
p.add_argument("--address","-a",default="",
metavar="<address>",
help="Local proxy listen address (default:all)")
p.add_argument("--upstream","-u",default="8.8.8.8:53",
metavar="<dns server:port>",
help="Upstream DNS server:port (default:8.8.8.8:53)")
p.add_argument("--tcp",action='store_true',default=False,
help="TCP proxy (default: UDP only)")
p.add_argument("--timeout","-o",type=float,default=5,
metavar="<timeout>",
help="Upstream timeout (default: 5s)")
p.add_argument("--passthrough",action='store_true',default=False,
help="Dont decode/re-encode request/response (default: off)")
p.add_argument("--log",default="request,reply,truncated,error",
help="Log hooks to enable (default: +request,+reply,+truncated,+error,-recv,-send,-data)")
p.add_argument("--log-prefix",action='store_true',default=False,
help="Log prefix (timestamp/handler/resolver) (default: False)")
args = p.parse_args()
args.dns,_,args.dns_port = args.upstream.partition(':')
args.dns_port = int(args.dns_port or 53)
print("Starting Proxy Resolver (%s:%d -> %s:%d) [%s]" % (
args.address or "*",args.port,
args.dns,args.dns_port,
"UDP/TCP" if args.tcp else "UDP"))
resolver = ProxyResolver(args.dns,args.dns_port,args.timeout)
handler = PassthroughDNSHandler if args.passthrough else DNSHandler
logger = DNSLogger(args.log,args.log_prefix)
udp_server = DNSServer(resolver,
port=args.port,
address=args.address,
logger=logger,
handler=handler)
udp_server.start_thread()
if args.tcp:
tcp_server = DNSServer(resolver,
port=args.port,
address=args.address,
tcp=True,
logger=logger,
handler=handler)
tcp_server.start_thread()
while udp_server.isAlive():
time.sleep(1)
I have successfully injected the business logic of my interactions in the get_reply method of PassthroughDNSHandler:
    def get_reply(self, data):
        host, port = self.server.resolver.address, self.server.resolver.port
        request = DNSRecord.parse(data)
        query = str(request.questions[0].qname)
        if query.endswith('.example.info.'):
            server = "192.168.10.1"
        elif any(query.endswith(x) for x in ["example.net.", "example.com."]):
            server = "10.24.131.10"
        else:
            server = "1.1.1.1"
        log.debug(f"{query} redirected to {server}")
        response = send_udp(data, server, port)
        reply = DNSRecord.parse(response)
This works as expected, the right DNS servers are queried depending on the request.
The part which I do not understand is the involvement of ProxyResolver in the initialization of the server.
resolver = ProxyResolver(args.dns, args.dns_port, args.timeout)
udp_server = DNSServer(resolver, port=53, address="127.0.0.1", handler=PassthroughDNSHandler)
What is resolver needed for?
As far as I understand, the packet received on 127.0.0.1:53 is passed, via handler, to PassthroughDNSHandler and actually processed in get_reply().
It is then further sent to the relevant upstream server via send_udp() and the response is forwarded back to the requesting client.
At what point does resolver get into the picture, and what is its role?
I put a breakpoint in the resolve() method of ProxyResolver and it is never hit.
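One reading of the proxy.py copy above, offered as a sketch and not as a statement about dnslib internals: its docstrings say that the stock DNSHandler passes the parsed request to the resolver, while PassthroughDNSHandler replaces get_reply and only reads address and port from self.server.resolver. The stand-in classes below (RecordingResolver, FakeServer, BaseHandler and PassthroughHandler are hypothetical names, not dnslib classes) mimic that arrangement to show why a breakpoint in resolve() would not be hit once get_reply is overridden.

```python
class RecordingResolver:
    """Stand-in for ProxyResolver: holds the upstream address and counts resolve() calls."""

    def __init__(self, address, port):
        self.address = address
        self.port = port
        self.calls = 0

    def resolve(self, request, handler):
        self.calls += 1
        return f"resolved {request} via {self.address}:{self.port}"


class FakeServer:
    """Stand-in for the server object: it only carries the resolver."""

    def __init__(self, resolver):
        self.resolver = resolver


class BaseHandler:
    """Stand-in for the default handler: get_reply delegates to the resolver."""

    def __init__(self, server):
        self.server = server

    def get_reply(self, data):
        return self.server.resolver.resolve(data, self)


class PassthroughHandler(BaseHandler):
    """Stand-in for PassthroughDNSHandler: the resolver is used only as a config holder."""

    def get_reply(self, data):
        host, port = self.server.resolver.address, self.server.resolver.port
        return f"forwarded {data} to {host}:{port}"
```

In the modified get_reply from the question the upstream host is chosen per query, so only the port is still taken from the resolver.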
I need to build a http server without using an HTTP library.
I have the server running and an HTML page being loaded, but my <img src="..."/> tags are not being loaded: I receive the call but cannot present the PNG/JPEG in the page.
httpServer.py
# Define socket host and port
SERVER_HOST = '0.0.0.0'
SERVER_PORT = 8000
# Create socket
server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server_socket.bind((SERVER_HOST, SERVER_PORT))
server_socket.listen(1)
print('Listening on port %s ...' % SERVER_PORT)
while True:
    # Wait for client connections
    client_connection, client_address = server_socket.accept()
    # Handle client request
    request = client_connection.recv(1024).decode()
    content = handle_request(request)
    # Send HTTP response
    if content:
        response = 'HTTP/1.1 200 OK\n\n'
        response += content
    else:
        response = 'HTTP/1.1 404 NOT FOUND\n\nFile Not Found'
    client_connection.sendall(response.encode())
    client_connection.close()
# Close socket
server_socket.close()
The function that handles the call:
def handle_request(request):
http = HttpHandler.HTTPHandler
# Parse headers
print(request)
headers = request.split('\n')
get_content = headers[0].split()
accept = headers[6].split()
type_content = accept[1].split('/')
try:
# Filename
filename = get_content[1]
if get_content[0] == "GET":
content = http.get(None, get_content[1], type_content[0])
return content
except FileNotFoundError:
return None
The class that handles the HTTP verbs:
class HTTPHandler:
def get(self, args, type):
if args == '/':
args = '/index.html'
fin = open('htdocs' + args)
if type != "image":
fin = open('htdocs/' + args)
if type == "image":
fin = open('htdocs/' + args, 'rb')
# Read file contents
content = fin.read()
fin.close()
return content
Note that I'm trying to implement HTTP/1.1; if you see anything that deviates from the standard, feel free to say so. Thanks in advance.
I don't know where you learned how HTTP works, but I'm pretty sure that you did not study the actual standard, which you should do when implementing a protocol. Some notes about your implementation:
Line endings should be \r\n, not \n. This is true both for responses from the server and for requests from the client.
You are assuming that the client's request is never larger than 1024 bytes and that it can be read with a single recv. But requests can have arbitrary length, and there is no guarantee that you get everything in a single recv (TCP is a stream protocol, not a message protocol).
While it is more or less OK to simply close the TCP connection after the body, it would be better to include the length of the body in the Content-Length header or to use chunked transfer encoding.
The type of the content should be given in the Content-Type header, i.e. Content-Type: text/html for HTML and Content-Type: image/jpeg for JPEG images. Without it, the browser might guess the type correctly or wrongly, or, depending on the context, might insist on a proper Content-Type header.
Apart from that, when you debug such problems it is helpful to find out what actually gets exchanged between client and server. You may have checked this yourself, but you did not include that information in your question. Your only error description is "...I receive the call but cannot present the PNG/JPEG in the page", followed by a dump of your code.
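The notes above about CRLF line endings, Content-Length and Content-Type can be sketched in one small response builder. This is a minimal sketch, not a complete HTTP/1.1 server: build_response and content_type_for are hypothetical helper names, and guessing the type from the file extension with the standard mimetypes module is an assumption about how the server would pick the header value.

```python
import mimetypes


def content_type_for(path):
    """Guess a Content-Type from the file extension, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or 'application/octet-stream'


def build_response(status, body, content_type):
    """Assemble a full response as bytes.

    status is the code and reason phrase (e.g. '200 OK'); body must be bytes,
    so text files should be encoded (or opened in 'rb' mode) before calling.
    """
    header_lines = [
        f'HTTP/1.1 {status}',
        f'Content-Type: {content_type}',
        f'Content-Length: {len(body)}',
        'Connection: close',
    ]
    # Every header line ends with CRLF; an empty line separates head and body.
    head = '\r\n'.join(header_lines) + '\r\n\r\n'
    return head.encode('ascii') + body
```

Reading every file in binary mode and sending the result of build_response with sendall avoids having separate code paths for HTML and images.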
httpServer.py
Ended up like:
while True:
    # Wait for client connections
    client_connection, client_address = server_socket.accept()
    # Handle client request
    request = client_connection.recv(10240).decode()
    content = handle_request(request)
    # Send HTTP response
    if content:
        if str(content).find("html") > 0:
            # The header section must end with an empty CRLF-terminated line
            client_connection.sendall('HTTP/1.1 200 OK\r\n\r\n'.encode())
            client_connection.sendall(content.encode())
        else:
            client_connection.sendall('HTTP/1.1 200 OK\r\n'.encode())
            client_connection.sendall("Content-Type: image/jpeg\r\n".encode())
            client_connection.sendall("Accept-Ranges: bytes\r\n\r\n".encode())
            client_connection.sendall(content)
    else:
        # The 404 response has to be sent as well, not only assigned
        response = 'HTTP/1.1 404 NOT FOUND\r\n\r\nFile Not Found'
        client_connection.sendall(response.encode())
    client_connection.close()
And the Get method like:
class HTTPHandler:
def get(self, args, type):
if args == '/':
args = '/index.html'
fin = open('htdocs' + args)
if type != "image":
fin = open('htdocs/' + args)
if type.find("html") == -1:
image_data = open('htdocs/' + args, 'rb')
bytes = image_data.read()
# Content-Type: image/jpeg, image/png \n\n
content = bytes
fin.close()
return content
# Read file contents
content = fin.read()
fin.close()
return content
I'm trying to get all HTTP GET/POST incoming requests.
I've found this code which seems promising, but I've noticed that it only works on standard HTTP ports. If I use another port (say 8080) scapy can't find the HTTP layer (packet.haslayer(http.HTTPRequest) == False).
This is the code:
from scapy.all import IP, sniff
from scapy.layers import http
def process_tcp_packet(packet):
    '''
    Processes a TCP packet, and if it contains an HTTP request, it prints it.
    '''
    if not packet.haslayer(http.HTTPRequest):
        # This packet doesn't contain an HTTP request so we skip it
        return
    http_layer = packet.getlayer(http.HTTPRequest)
    ip_layer = packet.getlayer(IP)
    print('\n{0[src]} just requested a {1[Method]} {1[Host]}{1[Path]}'.format(ip_layer.fields, http_layer.fields))

# Start sniffing the network.
sniff(filter='tcp', prn=process_tcp_packet)
Any idea about what I'm doing wrong?
**** UPDATE ****
I got rid of scapy_http and just looked at the raw data.
I'm posting here the code I'm using - it works fine for me as I'm debugging a strange problem I'm having on Apache Solr - but your mileage may vary.
import datetime
from scapy.all import IP, Raw

def process_tcp_packet(packet):
    msg = []
    try:
        if packet.dport == 8983 and packet.haslayer(Raw):
            # Raw.load is bytes on Python 3; HTTP header lines end with CRLF
            lines = packet.getlayer(Raw).load.decode('latin-1').split('\r\n')
            # GET or POST requests ?
            if lines[0].lower().startswith(('get /', 'post /')):
                # request forwarded by a proxy ?
                _ = [line.split()[1] for line in lines if line.startswith('X-Forwarded-For:')]
                s_ip = (_[0] if _ else packet.getlayer(IP).src)
                # collect info
                d_port = packet.getlayer(IP).dport
                now = datetime.datetime.now()
                msg.append('%s: %s > :%i -> %s' % (now, s_ip, d_port, lines[0]))
    except Exception as e:
        msg.append('%s: ERROR! %s' % (datetime.datetime.now(), str(e)))
    with open('http.log', 'a') as out:
        for m in msg:
            out.write(m + '\n')
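The raw-payload check from the update can be isolated into a pure function, which makes it testable without sniffing any traffic. This is a minimal sketch, assuming the payload is the bytes of a TCP segment that begins an HTTP request; parse_request_line is a hypothetical helper, not part of scapy.

```python
def parse_request_line(payload, methods=('GET', 'POST')):
    """Return (method, target, version) if payload starts an HTTP request, else None."""
    # Only the first CRLF-terminated line is inspected.
    first_line = payload.split(b'\r\n', 1)[0].decode('latin-1')
    parts = first_line.split(' ')
    if len(parts) != 3:
        return None
    method, target, version = parts
    if method.upper() not in methods or not version.startswith('HTTP/'):
        return None
    return method, target, version
```

Inside the sniff callback this would be called with packet.getlayer(Raw).load. A request whose headers span several TCP segments is only recognised in its first segment.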
I am sending some data after the HTML content (with a small delay) in the same response during a keep-alive session, and I want the browser to show the HTML before the whole response has been downloaded.
For example, I have the text 'hello, ' and a function that computes 'world' with a delay (let it be 1 sec). So I want the browser to show 'hello, ' immediately and 'world' after its delay. Is this possible within one request (so, without AJAX)?
Here is example python code of what I do (highlighted: https://pastebin.com/muUJyR36):
import socket
from time import sleep
sock = socket.socket()
sock.bind(('', 9090))
sock.listen(1)
conn, addr = sock.accept()
def give_me_a_world():
sleep(1)
return b'world'
while True:
data = conn.recv(1024)
response = b'HTTP/1.1 200 OK\r\n'\
b'Content-Length: 12\r\n'\
b'Connection: keep-alive\r\n'\
b'\r\n'\
b'hello, '
conn.send(response) # send first part
conn.send(give_me_a_world()) # make a delay and send other part
conn.close()
First and foremost, read "How the web works: HTTP and CGI explained" to understand why and where your current code violates HTTP and thus doesn't and shouldn't work.
Now, as per "Is Content-Length or Transfer-Encoding is mandatory in a response when it has body", after fixing the violation, you should either
omit the Content-Length header and close the socket after sending all the data, OR
calculate the length of the entire data to send beforehand and specify it in the Content-Length header.
You could use Transfer-Encoding: chunked and omit Content-Length.
This works fine in text-mode clients such as curl and the Links WWW Browser. Modern graphical browsers, however, don't really start rendering until the received data reaches some sort of buffer boundary.
import socket
from time import sleep
sock = socket.socket()
sock.bind(('', 9090))
sock.listen(1)
conn, addr = sock.accept()
def give_me_a_world():
sleep(1)
return b'5\r\n'\
b'world\r\n'\
b'0\r\n'\
b'\r\n'
while True:
data = conn.recv(1024)
response = b'HTTP/1.1 200 OK\r\n'\
b'Transfer-Encoding: chunked\r\n'\
b'Connection: keep-alive\r\n'\
b'\r\n'\
b'7\r\n'\
b'hello, \r\n'
conn.send(response) # send first part
conn.send(give_me_a_world()) # make a delay and send other part
conn.close()
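The chunk framing that is hard-coded in the example above (hexadecimal length, CRLF, data, CRLF, then a zero-length chunk to finish) can also be generated by a small helper. This is a minimal sketch; encode_chunk, chunked_body and LAST_CHUNK are hypothetical names introduced for illustration.

```python
LAST_CHUNK = b'0\r\n\r\n'  # zero-length chunk that terminates the body


def encode_chunk(data):
    """Frame one chunk: length in hex, CRLF, payload, CRLF."""
    return b'%x\r\n%s\r\n' % (len(data), data)


def chunked_body(parts):
    """Yield each non-empty part as a chunk, followed by the terminating chunk."""
    for part in parts:
        # An empty part would look like the terminating chunk, so it is skipped.
        if part:
            yield encode_chunk(part)
    yield LAST_CHUNK
```

Each yielded chunk can be passed to conn.sendall as soon as its data is available, so the delayed part is sent only when it has been computed.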