Record streaming and saving internet radio in Python

I am looking for a Python snippet to read an internet radio stream (.asx, .pls, etc.) and save it to a file.
The final project is a cron'ed script that will record an hour or two of internet radio and then transfer it to my phone for playback during my commute. (3G is kind of spotty along my commute.)
Any snippets or pointers are welcome.

The following has worked for me, using the requests library to handle the HTTP request:
import requests

stream_url = 'http://your-stream-source.com/stream'

r = requests.get(stream_url, stream=True)
with open('stream.mp3', 'wb') as f:
    try:
        for block in r.iter_content(1024):
            f.write(block)
    except KeyboardInterrupt:
        pass
That will save the stream to the stream.mp3 file until you interrupt it with Ctrl+C.
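If you want the cron'ed recording to stop on its own after a set time instead of waiting for Ctrl+C, a time-limited variant is straightforward. This is just a minimal sketch, assuming the same placeholder stream_url and output file name as above:
import time
import requests

stream_url = 'http://your-stream-source.com/stream'   # placeholder URL from above
duration = 2 * 60 * 60                                # record for two hours

r = requests.get(stream_url, stream=True)
end_time = time.time() + duration
with open('stream.mp3', 'wb') as f:
    for block in r.iter_content(1024):
        f.write(block)
        if time.time() > end_time:
            break
r.close()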

So after tinkering and playing with it, I've found Streamripper to work best. This is the command I use:
streamripper http://yp.shoutcast.com/sbin/tunein-station.pls?id=1377200 -d ./streams -l 10800 -a tb$FNAME

If you find that your requests or urllib.request call in Python 3 fails to save a stream because you receive "ICY 200 OK" in return instead of an "HTTP/1.0 200 OK" header, you need to tell the underlying functions that ICY 200 OK is OK!
What you can effectively do is intercept the routine that handles reading the status after opening the stream, just before processing the headers.
Simply put a routine like this above your stream-opening code.
def NiceToICY(self):
    class InterceptedHTTPResponse():
        pass
    import io
    line = self.fp.readline().replace(b"ICY 200 OK\r\n", b"HTTP/1.0 200 OK\r\n")
    InterceptedSelf = InterceptedHTTPResponse()
    InterceptedSelf.fp = io.BufferedReader(io.BytesIO(line))
    InterceptedSelf.debuglevel = self.debuglevel
    InterceptedSelf._close_conn = self._close_conn
    return ORIGINAL_HTTP_CLIENT_READ_STATUS(InterceptedSelf)
Then put these lines at the start of your main routine, before you open the URL.
ORIGINAL_HTTP_CLIENT_READ_STATUS = urllib.request.http.client.HTTPResponse._read_status
urllib.request.http.client.HTTPResponse._read_status = NiceToICY
They will override the standard routine (this one time only) and run the NiceToICY function in place of the normal status check when it has opened the stream. NiceToICY replaces the unrecognised status response, then copies across the relevant bits of the original response which are needed by the 'real' _read_status function. Finally the original is called and the values from that are passed back to the caller and everything else continues as normal.
I have found this to be the simplest way to get round the problem of the status message causing an error. Hope it's useful for you, too.
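For completeness, here is a minimal sketch of reading and saving a stream once the patch is in place; the stream URL is a placeholder, not a real station:
import urllib.request

# assumes the NiceToICY patch above has already been applied
stream_url = 'http://example.com/icy-stream'   # placeholder URL

with urllib.request.urlopen(stream_url) as response, open('stream.mp3', 'wb') as f:
    try:
        while True:
            block = response.read(1024)
            if not block:
                break
            f.write(block)
    except KeyboardInterrupt:
        pass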

I am aware this is a year old, but this is still a viable question, which I have recently been fiddling with.
Most internet radio stations will give you a choice of stream type; I choose the MP3 version, then read the data from a raw socket and write it to a file. The trick is figuring out how fast your download is compared to playback speed so you can strike a balance on the read/write size. This would be in your buffer def.
Now that you have the file, it is fine to simply leave it on your drive (record), but most players will delete the already-played chunk from the file and clear the file off the drive and RAM when streaming is stopped.
I have used some code snippets from a file-archiving (without compression) app to handle a lot of the file handling, playing, and buffering magic. The process flow is very similar. If you write up some pseudo-code (which I highly recommend) you can see the similarities.
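To make the read/write balance concrete, here is a rough sketch (not my original code) of the chunked socket read and file write described above; the host, path and chunk size are placeholders you would tune yourself:
import socket

CHUNK = 4096    # tune this against your download speed vs. playback rate

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('stream.example.com', 80))    # placeholder host
s.sendall(b'GET /mp3-stream HTTP/1.0\r\nHost: stream.example.com\r\n\r\n')

with open('recording.mp3', 'wb') as f:
    while True:
        data = s.recv(CHUNK)
        if not data:        # server closed the connection
            break
        f.write(data)       # note: a real version would strip the HTTP response headers first
s.close()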

I'm only familiar with how Shoutcast streaming works (which would be the .pls file you mention):
You download the .pls file, which is just a playlist. Its format is fairly simple: it's just a text file that points to where the real stream is.
You can connect to that stream, as it's just HTTP, and it streams either MP3 or AAC. For your use, just save every byte you get to a file and you'll end up with an MP3 or AAC file you can transfer to your MP3 player.
Shoutcast has one addition that is optional: metadata. You can find how that works here, but it is not really needed.
If you want a sample application that does this, let me know and I'll make up something later.
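In the meantime, here is a rough sketch of the two steps (fetch the .pls playlist, then save the raw MP3/AAC bytes). The playlist URL is a placeholder, the parsing is deliberately simplistic, and note that some Shoutcast servers answer with "ICY 200 OK", which needs the workaround from the other answer:
import urllib.request

pls_url = 'http://example.com/station.pls'   # placeholder playlist URL

# step 1: fetch the .pls playlist and pull out the first FileN= entry
playlist = urllib.request.urlopen(pls_url).read().decode('utf-8', 'replace')
stream_url = next(line.split('=', 1)[1].strip()
                  for line in playlist.splitlines()
                  if line.lower().startswith('file'))

# step 2: connect to the stream and save every byte to disk
with urllib.request.urlopen(stream_url) as stream, open('station.mp3', 'wb') as f:
    try:
        while True:
            block = stream.read(1024)
            if not block:
                break
            f.write(block)
    except KeyboardInterrupt:
        pass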

In line with the answer from dingles (https://stackoverflow.com/users/1543257/dingles, https://stackoverflow.com/a/41338150), here's how you can achieve the same result with the asynchronous HTTP client library, aiohttp:
import functools

import aiohttp
from aiohttp.client_proto import ResponseHandler
from aiohttp.http_parser import HttpResponseParserPy


class ICYHttpResponseParser(HttpResponseParserPy):
    def parse_message(self, lines):
        if lines[0].startswith(b"ICY "):
            lines[0] = b"HTTP/1.0 " + lines[0][4:]
        return super().parse_message(lines)


class ICYResponseHandler(ResponseHandler):
    def set_response_params(
        self,
        *,
        timer=None,
        skip_payload=False,
        read_until_eof=False,
        auto_decompress=True,
        read_timeout=None,
        read_bufsize=2 ** 16,
        timeout_ceil_threshold=5,
    ) -> None:
        # this is a copy of the implementation from here:
        # https://github.com/aio-libs/aiohttp/blob/v3.8.1/aiohttp/client_proto.py#L137-L165
        self._skip_payload = skip_payload
        self._read_timeout = read_timeout
        self._reschedule_timeout()
        self._timeout_ceil_threshold = timeout_ceil_threshold
        self._parser = ICYHttpResponseParser(
            self,
            self._loop,
            read_bufsize,
            timer=timer,
            payload_exception=aiohttp.ClientPayloadError,
            response_with_body=not skip_payload,
            read_until_eof=read_until_eof,
            auto_decompress=auto_decompress,
        )
        if self._tail:
            data, self._tail = self._tail, b""
            self.data_received(data)


class ICYConnector(aiohttp.TCPConnector):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._factory = functools.partial(ICYResponseHandler, loop=self._loop)
This can then be used as follows:
session = aiohttp.ClientSession(connector=ICYConnector())
async with session.get("url") as resp:
    print(resp.status)
Yes, it's using a few private classes and attributes, but this is the only way to change the handling of something that's part of the HTTP spec and (theoretically) should never need to be changed by the library's user...
All things considered, I would say this is still rather clean in comparison to monkey patching, which would change the behavior for all requests (especially true for asyncio, where setting the patch before a request and resetting it after does not guarantee that something else won't make a request while the request to the ICY server is in flight). This way, you can dedicate a ClientSession object specifically to requests against servers that respond with the ICY status line.
Note that this comes with a performance penalty for requests made with ICYConnector: to make this work, I am using the pure-Python implementation of HttpResponseParser, which is going to be slower than the C implementation that aiohttp uses by default. This cannot really be done differently without vendoring the whole library, as the status-line parsing behavior is deeply hidden in the C code.
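For reference, a minimal sketch of actually saving the stream to a file with this connector; the URL is a placeholder and resp.content.iter_chunked() is aiohttp's standard way of reading the body in pieces:
import asyncio
import aiohttp

async def record(url, path):
    # a session dedicated to servers that answer with the ICY status line
    async with aiohttp.ClientSession(connector=ICYConnector()) as session:
        async with session.get(url) as resp:
            print(resp.status)
            with open(path, 'wb') as f:
                async for chunk in resp.content.iter_chunked(1024):
                    f.write(chunk)

asyncio.run(record('http://example.com/icy-stream', 'stream.mp3'))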

Related

Broken Pipe with socket connection

I am using a socket connection to download data through a third party API. It works fine for a while but every now and then my script will crash giving the following error: BrokenPipeError: [Errno 32] Broken pipe
After some research it seems the suggestion (link here) is to do the following:
from signal import signal, SIGPIPE, SIG_DFL
signal(SIGPIPE,SIG_DFL)
However, I'm firstly not sure what this actually does (I'm still confused after reading the Python manual on signal), and I also don't know where to put the code.
If anyone is familiar with this error, please could you advise whether this is in fact the correct solution and where the signal(SIGPIPE, SIG_DFL) call should be placed. Should it go inside a try/except block, or is it simply placed at the start of the program? I'm confused.
Here's some of the relevant code. I basically have a dataframe consisting of several thousand items. I loop through each item, passing it to the download method. The download method downloads the data via the API and then writes it to a database. I then move on to the next item to download.
def recv_data(sock, recv_buffer=4096, delim='\n'):
    buffer = ''
    data = True
    while data:
        data = sock.recv(recv_buffer)
        buffer += str(data.decode('latin-1'))
        while buffer.find(delim) != -1:
            line, buffer = buffer.split('\n', 1)
            yield line

def update_existing_symbol_data(engine, sock, exchange, exchange_id, symbol, symbol_id, start_date):
    data = ''
    message = ...  # request data message
    sock.sendall(message.encode())
    for line in recv_data(sock):
        if "!ENDMSG!" in line:
            break
        data += line[:-2] + '\n'
    df = pd.read_csv(io.StringIO(data))
    df.set_index('date', inplace=True)
    df.to_sql('daily', engine, if_exists='append')

def main():
    df = ...  # dataframe of all symbols that need to be downloaded
    for index, row in df.iterrows():
        update_existing_symbol_data(args)
SIGPIPE is a POSIX thing that gets sent when a socket write operation fails. The default behavior is for the signal (this is an OS/socket thing, not a Python thing) to just kill your process. Python instead gives it to you as an exception so that it's possible to write more robust programs. But if you don't need to handle that event, which it sounds like you don't considering your use case, you can safely ignore it. There's no logic you need to do when you receive the signal, so the solution from that blog post should be fine. No try/except needed.
If your use case changes at a later date and you do need to handle the SIGPIPE, then wrapping that in a try/except and handling it there would be the way to go.
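In other words, a couple of lines at module level, before any socket work, are all you need; a minimal sketch:
from signal import signal, SIGPIPE, SIG_DFL

# restore default SIGPIPE handling once, at program start (POSIX only),
# before any sockets are opened; no try/except is needed around it
signal(SIGPIPE, SIG_DFL)

def main():
    ...  # build the dataframe, open the socket and run the download loop as before

if __name__ == '__main__':
    main()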

Is it possible to loop over an httplib.HTTPResponse's data?

I'm trying to develop a very simple proof-of-concept to retrieve and process data in a streaming manner. The server I'm requesting from will send data in chunks, which is good, but I'm having issues using httplib to iterate through the chunks.
Here's what I'm trying:
import httplib

def getData(src):
    d = src.read(1024)
    while d and len(d) > 0:
        yield d
        d = src.read(1024)

if __name__ == "__main__":
    con = httplib.HTTPSConnection('example.com', port='8443', cert_file='...', key_file='...')
    con.putrequest('GET', '/path/to/resource')
    response = con.getresponse()
    for s in getData(response):
        print s
        raw_input() # Just to give me a moment to examine each packet
Pretty simple. Just open an HTTPS connection to server, request a resource, and grab the result, 1024 bytes at a time. I'm definitely making the HTTPS connection successfully, so that's not a problem at all.
However, what I'm finding is that the call to src.read(1024) returns the same thing every time. It only ever returns the first 1024 bytes of the response, apparently never keeping track of a cursor within the file.
So how am I supposed to receive 1024 bytes at a time? The documentation on read() is pretty sparse. I've thought about using urllib or urllib2, but neither seems to be able to make an HTTPS connection.
HTTPS is required, and I am working in a rather restricted corporate environment where packages like Requests are a bit tough to get my hands on. If possible, I'd like to find a solution within Python's standard lib.
// Big Old Fat Edit
Turns out in my original code I had simply forgotten to update the d variable. I initialized it with a read outside the yield loop and never changed it inside the loop. Once I added that back in, it worked perfectly.
So, in short, I'm just a big idiot.
Is your con.putrequest() actually working? Doing a request with that method requires you to also call a bunch of other methods as you can see in the official httplib documentation:
http://docs.python.org/2/library/httplib.html
As an alternative to using the request() method described above, you
can also send your request step by step, by using the four functions
below.
putrequest()
putheader()
endheaders()
send()
Is there any reason why you're not using the default HTTPConnection.request() function?
Here's a working version for me, using request() instead:
import httplib

def getData(src, chunk_size=1024):
    d = src.read(chunk_size)
    while d:
        yield d
        d = src.read(chunk_size)

if __name__ == "__main__":
    con = httplib.HTTPSConnection('google.com')
    con.request('GET', '/')
    response = con.getresponse()
    for s in getData(response, 8):
        print s
        raw_input() # Just to give me a moment to examine each packet
You can use the seek command to move the cursor along with your read.
This is my attempt at the problem. I apologize if I made it less pythonic in the process.
if __name__ == "__main__":
    con = httplib.HTTPSConnection('example.com', port='8443', cert_file='...', key_file='...')
    con.putrequest('GET', '/path/to/resource')
    response = con.getresponse()
    c = 0
    while True:
        response.seek(c * 1024, 0)
        data = response.read(1024)
        c += 1
        if len(data) == 0:
            break
        print data
        raw_input()
I hope it is at least helpful.

My python program is running really slow

I'm making a program that (at least right now) retrieves stream information from TwitchTV (a streaming platform). This program is to self-educate myself, but when I run it, it takes two minutes just to print the name of the streamer.
I'm using Python 2.7.3 64-bit on Windows 7, if that is important in any way.
classes.py:
#imports:
import urllib2
import re

#classes:
class Streamer:
    #constructor:
    def __init__(self, name, mode, link):
        self.name = name
        self.mode = mode
        self.link = link

class Information:
    #constructor:
    def __init__(self, TWITCH_STREAMS, GAME, STREAMER_INFO):
        self.TWITCH_STREAMS = TWITCH_STREAMS
        self.GAME = GAME
        self.STREAMER_INFO = STREAMER_INFO

    def get_game_streamer_names(self):
        "Connects to the Twitch.TV API, extracts and returns all streams for a specific game."
        #start connection
        self.con = urllib2.urlopen(self.TWITCH_STREAMS + self.GAME)
        self.info = self.con.read()
        self.con.close()
        #regular expressions to get all the stream names
        self.info = re.sub(r'"teams":\[\{.+?"\}\]', '', self.info) #remove all team names (they have the same name: parameter as streamer names)
        self.streamers_names = re.findall('"name":"(.+?)"', self.info) #looks for the name of each streamer in the pile of info
        #run in a for loop to remove all "live_user_NAME" values
        for name in self.streamers_names:
            if name.startswith("live_user_"):
                self.streamers_names.remove(name)
        #end method
        return self.streamers_names

    def get_streamer_mode(self, name):
        "Returns a streamer's mode (on/off)"
        #start connection
        self.con = urllib2.urlopen(self.STREAMER_INFO + name)
        self.info = self.con.read()
        self.con.close()
        #check if stream is online or offline ("stream":null indicates offline stream)
        if self.info.count('"stream":null') > 0:
            return "offline"
        else:
            return "online"
main.py:
#imports:
from classes import *

#consts:
TWITCH_STREAMS = "https://api.twitch.tv/kraken/streams/?game=" #add the game name at the end of the link (space = "+", eg: Game+Name)
STREAMER_INFO = "https://api.twitch.tv/kraken/streams/" #add streamer name at the end of the link
GAME = "League+of+Legends"

def main():
    #create an information object
    info = Information(TWITCH_STREAMS, GAME, STREAMER_INFO)
    streamer_list = [] #create a streamer list
    for name in info.get_game_streamer_names():
        #run for every streamer name, create a streamer object and place it in the list
        mode = info.get_streamer_mode(name)
        streamer_name = Streamer(name, mode, 'http://twitch.tv/' + name)
        streamer_list.append(streamer_name)
    #this line is just to try and print something
    print streamer_list[0].name, streamer_list[0].mode

if __name__ == '__main__':
    main()
The program itself works perfectly, it's just really slow.
Any ideas?
Program efficiency typically falls under the 80/20 rule (or what some people call the 90/10 rule, or even the 95/5 rule). That is, 80% of the time the program is actually running in 20% of the code. In other words, there is a good shot that your code has a "bottleneck": a small area of the code that is running slow, while the rest runs very fast. Your goal is to identify that bottleneck (or bottlenecks), then fix it (them) to run faster.
The best way to do this is to profile your code. This means logging the time when a specific action occurs with the logging module, using timeit as a commenter suggested, using one of the built-in profilers, or simply printing out the current time at various points of the program. Eventually, you will find one part of the code that seems to be taking the most time.
Experience will tell you that I/O (stuff like reading from a disk, or accessing resources over the internet) will take longer than in-memory calculations. My guess as to the problem is that you're using 1 HTTP connection to get a list of streamers, and then one HTTP connection to get the status of that streamer. Let's say that there are 10000 streamers: your program will need to make 10001 HTTP connections before it finishes.
There would be a few ways to fix this if this is indeed the case:
See if Twitch.TV has some alternatives in their API that allows you to retrieve a list of users WITH their streaming mode so that you don't need to call an API for each streamer.
Cache results. This won't actually help your program run faster the first time it runs, but you might be able to make it so that if it runs a second time within a minute, it can reuse results.
Limit your application to only dealing with a few streamers at a time. If there are 10000 streamers, what exactly does your application do that it really needs to look at the mode of all 10000 of them? Perhaps it's better to just grab the top 20, at which point the user can press a key to get the next 20, or close the application. Oftentimes, programming is not just about writing code, but about managing the expectations of what your users want. This seems to be a pet project, so there might not be "users", meaning you have free rein to change what the app does.
Use multiple connections. Right now, your app makes one connection to the server, waits for the results to come back, parses the results, saves it, then starts on the next connection. This process might take an entire half a second. If there were 250 streamers, running this process for each of them would take a little over two minutes total. However, if you could run four of them at a time, you could potentially reduce your time to just under 30 seconds total. Check out the multiprocessing module. Keep in mind that some APIs might have limits to how many connections you can make at a certain time, so hitting them with 50 connections at a time might irk them and cause them to forbid you from accessing their API. Use caution here.
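As a rough illustration of that last point, a small thread pool from multiprocessing.dummy can fetch several streamer modes at once. This is only a sketch: it assumes the Information object from the question (called info here) and keeps the Python 2 style of the rest of the code:
from multiprocessing.dummy import Pool    # thread pool; fine for I/O-bound work

def fetch_mode(name):
    # one HTTP request per streamer, but several run concurrently
    return name, info.get_streamer_mode(name)

pool = Pool(4)    # four concurrent connections; be polite to the API
results = pool.map(fetch_mode, info.get_game_streamer_names())
pool.close()
pool.join()

for name, mode in results:
    print name, mode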
You are using the wrong tool here to parse the JSON data returned by your URL. You need to use the json library provided by default rather than parsing the data with regex.
This will give you a boost in your program's performance
Change the regex parser
#regular expressions to get all the stream names
self.info = re.sub(r'"teams":\[\{.+?"\}\]', '', self.info) #remove all team names (they have the same name: parameter as streamer names)
self.streamers_names = re.findall('"name":"(.+?)"', self.info) #looks for the name of each streamer in the pile of info
To json parser
self.info = json.loads(self.info) #this will parse the JSON data into a Python object (import json at the top of classes.py)
#parse the names and return a generator
return (stream['name'] for stream in self.info[u'streams'])

"Snooping" python's telnetlib

I have an application that calls telnetlib.read_until(). For the most part, it works fine.
However when my app's telnet connection fails, it's hard to debug the exact cause. Is it my script or is the server connection dodgy? (This is a development lab, so there are a lot of dodgy servers).
What I would like to do is to be able to easily snoop the data placed into the cooked queue before my app calls telnetlib.read_until() (thereby hopefully avoiding impacting my app's operation.)
Poking around in telnetlib.py, I found that 'buf[0]' is just the data I want: the newly-added data without the repetition caused by snooping 'cookedq'.
I can insert a line right before the end of telnetlib.process_rawq() to print out the processed data as it is received from the server.
telnetlib.process_rawq ...
...
self.cookedq = self.cookedq + buf[0]
print("Dbg: Cooked Queue contents = %r" % buf[0] <= my added debug line
self.sbdataq = self.sbdataq + buf[1]
This works well. I can see the data almost exactly as received by my app without impacting its operation at all.
Here's the question: Is there a snazzier way to accomplish this? This approach is basic and works, but I'll have to remember to re-make this change every time I upgrade Python's libraries.
My attempts to simply extend telnet.process_rawq() were unsuccessful, as buf is internal to telnet.process_rawq()
Is there a (more pythonic) way to snoop this telnetlib.process_rawq()-internal value without modifying telnetlib.py?
Thanks.
I just found a much better solution (by reading the code, duh!)
telnetlib has a debugging output option already built-in. Just call set_debuglevel(1) and Bob's your uncle.
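For example (a minimal sketch; the host name is a placeholder):
import telnetlib

tn = telnetlib.Telnet('device.example.com')   # placeholder host
tn.set_debuglevel(1)                          # received data is echoed to stdout
tn.read_until(b'login: ', timeout=5)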
The easy hack is to monkey patch the library. Copy and paste the function you want to change into your source (unfortunately, process_rawq is a rather large function), and modify it as you need. You can then replace the method in the class with your own.
import telnetlib

def process_rawq(self):
    #existing stuff
    self.cookedq = self.cookedq + buf[0]
    print("Dbg: Cooked Queue contents = %r" % buf[0])
    self.sbdataq = self.sbdataq + buf[1]

telnetlib.Telnet.process_rawq = process_rawq
You could alternatively try the debugging built-in to the telnetlib module with set_debuglevel(1), which prints a lot of info to stdout.
In this situation, I would tend to just grab wireshark/tshark/tcpdump and directly inspect the network session.

How to solve a Python memory leak when using urllib2?

I'm trying to write a simple Python script for my mobile phone to periodically load a web page using urllib2. In fact I don't really care about the server response; I'd only like to pass some values in the URL to the PHP. The problem is that Python for S60 uses the old 2.5.4 Python core, which seems to have a memory leak in the urllib2 module. As I read, there seem to be such problems in every type of network communication as well. This bug has been reported here a couple of years ago, and some workarounds were posted as well. I've tried everything I could find on that page, and with the help of Google, but my phone still runs out of memory after ~70 page loads. Strangely, the garbage collector does not seem to make any difference either, except making my script much slower. It is said that the newer (3.1) core solves this issue, but unfortunately I can't wait a year (or more) for the S60 port to come.
here's how my script looks after adding every little trick I've found:
import urllib2, httplib, gc

while(True):
    url = "http://something.com/foo.php?parameter=" + value
    f = urllib2.urlopen(url)
    f.read(1)
    f.fp._sock.recv=None # hacky avoidance
    f.close()
    del f
    gc.collect()
Any suggestions on how to make it work forever without getting the "cannot allocate memory" error?
Thanks in advance,
cheers, b_m
update:
I've managed to connect 92 times before it ran out of memory, but it's still not good enough.
update2:
Tried the socket method as suggested earlier, this is the second best (wrong) solution so far:
class UpdateSocketThread(threading.Thread):
    def run(self):
        global data
        while 1:
            url = "/foo.php?parameter=%d" % data
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.connect(('something.com', 80))
            s.send('GET ' + url + ' HTTP/1.0\r\n\r\n')
            s.close()
            sleep(1)
I tried the little tricks from above, too. The thread closes after ~50 uploads (the phone has 50MB of memory left; obviously the Python shell does not).
UPDATE:
I think I'm getting closer to the solution! I tried sending multiple data without closing and reopening the socket. This may be the key since this method will only leave one open file descriptor. The problem is:
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("something.com", 80))
s.send("test") #returns 4 (sent bytes, which is cool)
s.send("test") #4
s.send("test") #4
s.send("GET /foo.php?parameter=bar HTTP/1.0\r\n\r\n") #returns the number of sent bytes, ok
s.send("GET /foo.php?parameter=bar HTTP/1.0\r\n\r\n") #returns 0 on the phone, error on Windows 7*
s.send("GET /foo.php?parameter=bar HTTP/1.0\r\n\r\n") #returns 0 on the phone, error on Windows 7*
s.send("test") #returns 0, strange...
*: error message: 10053, software caused connection abort
Why can't I send multiple messages??
Using the test code suggested by your link, I tested my Python installation and confirmed that it indeed leaks. But if, as @Russell suggested, I put each urlopen in its own process, the OS should clean up the memory leaks. In my tests, memory, unreachable objects and open files all remained more or less constant. I split the code into two files:
connection.py
import cPickle, sys, urllib2

def connectFunction(queryString):
    conn = urllib2.urlopen('http://something.com/foo.php?parameter=' + str(queryString))
    data = conn.read()
    outfile = open('sometempfile', 'wb')
    cPickle.dump(data, outfile)
    outfile.close()

if __name__ == '__main__':
    connectFunction(sys.argv[1])
launcher.py
import subprocess, cPickle

#code from your link to check the number of unreachable objects
def print_unreachable_len():
    # check memory on memory leaks
    import gc
    gc.set_debug(gc.DEBUG_SAVEALL)
    gc.collect()
    unreachableL = []
    for it in gc.garbage:
        unreachableL.append(it)
    return len(str(unreachableL))

#my code
if __name__ == '__main__':
    print 'Before running a single process:', print_unreachable_len()
    return_value_list = []
    for i, value in enumerate(values): #where values is a list or a generator containing (or yielding) the parameters to pass to the URL
        subprocess.call(['python', 'connection.py', str(value)])
        print 'after running', i, 'processes:', print_unreachable_len()
        infile = open('sometempfile', 'rb')
        return_value_list.append(cPickle.load(infile))
        infile.close()
Obviously, this is sequential, so you will only execute a single connection at a time, which may or may not be an issue for you. If it is, you will have to find a non-blocking way of communicating with the processes you're launching, but I'll leave that as an exercise for you.
EDIT: On re-reading your question, it seems you don't care about the server response. In that case, you can get rid of all the pickling related code. And obviously, you won't have the print_unreachable_len() related bits in your final code either.
There exists a reference cycle in urllib2, created at urllib2.py:1216. The issue is ongoing and has existed since 2009.
https://bugs.python.org/issue1208304
I think this is probably your problem. To summarize that thread, there's a memory leak in Pys60's DNS lookup, and you can work around it by moving DNS lookup outside the inner loop.
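If that is indeed the culprit, the workaround would look roughly like this sketch: resolve the host once with socket.gethostbyname() and reuse the IP address inside the loop (the host name is the placeholder from the question, and value is assumed to be defined as before):
import socket, urllib2

host = 'something.com'              # placeholder host from the question
ip = socket.gethostbyname(host)     # resolve once, outside the loop

while True:
    url = 'http://%s/foo.php?parameter=%s' % (ip, value)
    req = urllib2.Request(url, headers={'Host': host})   # keep the original Host header
    f = urllib2.urlopen(req)
    f.read(1)
    f.close()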
This seems like a (very!) hacky workaround, but a bit of googling found this comment on the problem:
Apparently adding f.read(1) will stop the leaking!
import urllib2
f = urllib2.urlopen('http://www.google.com')
f.read(1)
f.close()
EDIT: oh, I see you already have f.read(1)... I'm all out of ideas then :/
Consider using the low-level socket API (related howto) instead of urllib2.
import socket

HOST = 'daring.cwi.nl'    # The remote host
PORT = 50007              # The same port as used by the server
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
s.send('GET /path/to/file/index.html HTTP/1.0\n\n')
# you'll need to figure out how much data to read and read that exactly
# or wait for read() to return data of zero length (I think!)
DATA_SZ = 1024
data = s.recv(DATA_SZ)
s.close()
print 'Received', repr(data)
How to execute and read an HTTP request via low-level sockets is a bit beyond the scope of the question (and perhaps may make a good question on its own on Stack Overflow; I searched but didn't see it), but I hope this points you in the direction of a solution that may resolve your problem!
edit An answer in here about using makefile may be helpful: HTTP basic authentication using sockets in python
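For what it's worth, here is a rough sketch of the makefile approach mentioned there, which lets you read the response line by line instead of juggling recv() sizes; the host and path are the placeholders from the question:
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('something.com', 80))                        # placeholder host
s.send('GET /foo.php?parameter=bar HTTP/1.0\r\n\r\n')   # placeholder path
f = s.makefile('rb')            # file-like wrapper around the socket
for line in f:                  # status line and headers first, then the body
    print repr(line)
f.close()
s.close()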
This does not leak for me with Python 2.6.1 on a Mac. Which version are you using?
BTW, your program doesn't work due to a few typos. Here is one that does work:
import urllib2, httplib, gc

value = "foo"
count = 0
while(True):
    url = "http://192.168.1.1/?parameter=" + value
    f = urllib2.urlopen(url)
    f.read(1)
    f.fp._sock.recv=None # hacky avoidance
    f.close()
    del f
    print "count=", count
    count += 1
Depending on the platform and Python version, Python might not release memory back to the OS. See this Stack Overflow thread. That said, Python should not endlessly consume memory. Judging from the code you use, it appears to be a bug in the Python runtime unless urllib/sockets use globals, which I don't believe they do - blame it on Python on S60!
Have you considered other sources of memory leakage? An endlessly open log file, an ever-increasing array or something like that? If it truly is a bug in the sockets interface, then your only option is to use the subprocess approach.
