How to examine a urllib2 object in Python?

I am learning Python at the moment, and I come from a Java/C++ and C background. I usually like to "examine" objects in a debugger to get a better understanding of what is going on, so excuse my question if it seems odd for Python.
I was reading the urllib2 documentation at Python's website. The following example was shown:
>>> import urllib2
>>> for line in urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl'):
...     if 'EST' in line or 'EDT' in line:  # look for Eastern Time
...         print line
I understand that urlopen will download the content of a page.
Does urlopen download the HTML content? I tried doing the following:
content = urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl')
print content
which yields an object. What is the nature of this object? Is it a dictionary-like object? If so, how can I examine what its key-values are? Would that be done using pickling in Python?
I am aware of the geturl() method, but I'd like to fully understand what urlopen() does and returns.
Thank you!

import pdb
pdb.set_trace()
Place this at any point in your source, like a breakpoint. It lets you interactively inspect names and objects. Once you're in, you can also use
import inspect
which has a number of options for inspecting the properties and methods of an object: http://docs.python.org/library/inspect.html#module-inspect
Also, dir(my_object) is a cheap way to do something similar.
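As a rough illustration of the dir() and inspect approach, here is a small Python 3 sketch. It uses io.BytesIO as a stand-in for the file-like object that urlopen() returns (that substitution is my assumption, made so the example runs without a network connection); the same calls work on any object.

```python
import inspect
import io

# Stand-in for the file-like object returned by urlopen() (an assumption,
# chosen so the sketch runs offline).
content = io.BytesIO(b"<html>...</html>")

# What kind of object is this?
type_name = type(content).__name__

# dir() lists every attribute name; drop the underscore-prefixed ones.
public_names = [name for name in dir(content) if not name.startswith("_")]

# inspect.getmembers() can filter with a predicate, here: callables only.
methods = [
    name
    for name, member in inspect.getmembers(content, callable)
    if not name.startswith("_")
]

print(type_name)
print(public_names)
print(methods)
```

Attributes such as closed show up in the dir() listing but not in the list of callables, which is a quick way to tell data attributes from methods.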

From the documentation:
This function returns a file-like object with two additional methods: ...
So you can read it like a file (as you already do).

Yes, and you can print the content using:
print content.read()
Also, I'd like to suggest IPython, where you can inspect an object's methods and attributes very easily:
dir(content)
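To make "file-like object" concrete, here is a hedged Python 3 sketch (in Python 3, urllib2 became urllib.request). To avoid depending on the original server, it opens a data: URL whose body I made up; the sample text and variable names are assumptions, not output from the real page.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Made-up body standing in for the page content (an assumption).
text = "Jan. 01, 08:00:00 PM EST\nJan. 01, 05:00:00 PM PST\n"
url = "data:text/plain," + quote(text)

content = urlopen(url)

# Printing the object only shows its type, not the downloaded data.
print(type(content))

# One of the extra methods mentioned in the documentation.
print(content.geturl())

# It is not dictionary-like: you iterate over it line by line, like a file.
# In Python 3 the lines are bytes.
eastern = [
    line.decode("ascii").strip()
    for line in content
    if b"EST" in line or b"EDT" in line
]
print(eastern)

# Like a file, the stream is consumed once read.
leftover = content.read()
content.close()
```

So there are no key-values to look up and no need for pickling: read() or iteration gives you the raw content.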

Related

A set object has no attribute 'encode' in socket module

I tried to encode a set object and it failed with an AttributeError.
Is there a way for it to work?
NOTE: I've been using the socket module.
albums = set()
for key, val in data.items():
    albums.add(val['album'])
msg = albums.encode()

Are you trying to send some form of representation of albums via a socket? Then you need a binary representation of that object first. Using .encode() suggests that you want the string representation of albums, which you can get using msg = repr(albums).encode().
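A minimal sketch of that idea, with made-up album names. The decoding side using ast.literal_eval is my own addition, not something the answer above prescribes; it is one way to turn the text back into a set without calling eval().

```python
import ast

albums = {"Abbey Road", "Revolver"}  # made-up sample data

# A set has no .encode(); its string representation does.
msg = repr(albums).encode()

# On the receiving side: bytes -> str -> set.
# ast.literal_eval only accepts Python literals, unlike eval().
decoded = ast.literal_eval(msg.decode())

print(type(msg).__name__)
print(decoded == albums)
```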

There are a few options here, but probably the easiest is to encode your set using the Python pickle protocol.
To do that, use sending code like this:
import pickle
data_to_send = pickle.dumps(albums)
mysocket.send(data_to_send)
and receiving code like this:
import pickle
albums = pickle.loads(data_received_from_socket)
However, I want to warn you: using the socket module at all is a can of worms. It is low level and meant for experienced programmers. Compared to the question you've just asked, the other problems you'll have to deal with are probably a lot harder. For example, you have to think about how to delimit your messages before you figure out what to pass as the second parameter to socket.recv (the bufsize parameter).
I suggest you try something higher-level like Python's xmlrpc modules.
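To show what "delimiting your messages" can look like, here is a sketch of one common convention: prefix each pickled message with its length as a 4-byte integer. The helper names (send_msg, recv_exact, recv_msg) are hypothetical, and socket.socketpair() stands in for a real connection. Note that unpickling data from an untrusted peer is unsafe, so this only suits trusted endpoints.

```python
import pickle
import socket
import struct


def send_msg(sock, obj):
    """Pickle obj and send it with a 4-byte big-endian length prefix."""
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Keep calling recv() until exactly n bytes have arrived."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Read the length prefix, then exactly that many payload bytes."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return pickle.loads(recv_exact(sock, length))


# Two connected sockets in one process, standing in for client and server.
a, b = socket.socketpair()
send_msg(a, {"Abbey Road", "Revolver"})
received = recv_msg(b)
a.close()
b.close()

print(received)
```

The loop in recv_exact is the important part: a single recv() call may return fewer bytes than requested, which is why bufsize alone does not tell you where a message ends.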

Python TypeError: 'TagList' object is not iterable

This happens all the time. A function returns an object that I can't read. Here:
discoverer = GstPbutils.Discoverer()
discoverer.connect('discovered', on_discovered)
info = discoverer.discover_uri(self.loaded_file)
print(vinfo.get_tags())
Returns this:
<Gst.TagList object at 0x7f00a360c0a8 (GstTagList at 0x7f00880024a0)>
But when I try to do this:
tags = vinfo.get_tags()
for tag in tags:
    print(tag)
I get this:
TypeError: 'TagList' object is not iterable
But when I read the doc of this data structure, I seem to understand it's ... a list? Can somebody, beyond telling me how to get the tags, show me how to read those docs? Also, am I missing some introspection methods and tools that I could use to discover what the objects I encounter are, and how they work?

This is all hypothetical, as I have never used Python with GStreamer.
According to the documentation, yes, it is called a list, but that may only describe its internal structure. Remember that the Python bindings are just bindings: everything works much as it does in C (if not in a nicer way). So ask yourself what you would do in C to iterate over the tags. I can't give you a recipe for finding that out; you have to look around the docs and check all the available functions.
You have to think about how the object you are using might be implemented, given what you know it represents. This is a list of tags in which each tag has a different type (one is a string, another is an int, and so on), so you cannot easily iterate over it.
So I think you have two options, depending on what you want to do with the tags.
1. Serialize to a string and work with that:
I am not sure, but in C there is to_string, which may do the same thing as to_string in Python, so try that if you are interested only in the tag names (or whatever it returns).
2. Use the built-in foreach with a callback:
tags = vinfo.get_tags()
tags.foreach(my_callback, self)
And in your callback:
def my_callback(tag_list, tag, user_data):
    print(tag)
    # do whatever you want with tag_list
    # Python needs no cast: user_data is the same object you passed to foreach()
    user_data.your_method(tag_list, tag)  # your_method is a placeholder name
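Since I can't run GStreamer here, the following sketch uses a made-up FakeTagList class (purely hypothetical, not the real Gst.TagList) to show the general pattern: check whether an object is iterable, list what it does offer, and collect results from a callback-style foreach into an ordinary list.

```python
class FakeTagList:
    """Hypothetical stand-in for a binding object with foreach() but no __iter__."""

    def __init__(self, tags):
        self._tags = dict(tags)

    def foreach(self, func, user_data):
        # Call func once per tag name, passing user_data through unchanged.
        for name in self._tags:
            func(self, name, user_data)


def collect_tag(tag_list, tag, collected):
    # user_data is just the list we passed in, so append to it directly.
    collected.append(tag)


tags = FakeTagList({"title": "Song", "bitrate": 128000})

# Introspection: is it iterable, and which public names does it expose?
is_iterable = hasattr(tags, "__iter__")
public = [name for name in dir(tags) if not name.startswith("_")]

# Turn the callback API into a plain list you can loop over.
names = []
tags.foreach(collect_tag, names)

print(is_iterable)
print(public)
print(names)
```

Once the names are in a normal list, the usual for loop works again.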

Python: save the help result to a variable

Using python and NLTK I want to save the help result to a variable.
x = nltk.help.upenn_tagset('RB')
for example.
The variable x is assigned None. The console prints the result of the help function, but the result is not saved to x.

Looking at the source file of help.py, it uses the print statement and doesn't return anything. upenn_tagset calls _format_tagset, which passes everything to _print_entries, which uses print.
So, what we really want to do is to redirect the print statement.
Quick search, and we've got https://stackoverflow.com/a/4110906/1210278 - replace sys.stdout.
As pointed out in the question linked by @mgilson, this is a permanent solution to a temporary problem. So what do we do? That should be easy: just keep the original around somewhere.
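import sys
from StringIO import StringIO  # Python 2; on Python 3 use io.StringIO

print "Hello"
cons_out = sys.stdout               # keep the original stream around
sys.stdout = captured = StringIO()  # any writable object you can read the result back from
do_printing_function()
sys.stdout = cons_out               # restore the console
print "World!"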
This is actually exactly what the accepted answer at https://stackoverflow.com/a/6796752/1210278 does, except it uses a reusable class wrapper - this is a one-shot solution.
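On Python 3.4 and later, the same swap-and-restore can be sketched with contextlib.redirect_stdout, which restores sys.stdout automatically when the block exits. The function noisy_help below is a hypothetical stand-in for something like nltk.help.upenn_tagset that prints and returns nothing, and the text it prints is made up.

```python
import contextlib
import io


def noisy_help(tag):
    """Hypothetical stand-in for a help function that only prints."""
    print(f"{tag}: adverb")


buffer = io.StringIO()

# Everything printed inside the with block goes into buffer instead of the console.
with contextlib.redirect_stdout(buffer):
    result = noisy_help("RB")

captured = buffer.getvalue()

print(result)
print(repr(captured))
```

The return value is still None; the text you wanted ends up in captured.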

The easiest way to get the explanation of a tag is to load the whole tag set and then extract the explanation for only the tags you need.
tags = nltk.data.load('help/tagsets/upenn_tagset.pickle')
tags['RB']

shelve gives strange error

I'm trying to put some sites I crawled into a shelve, but the shelve won't accept any Site objects. It will accept lists, strings, tuples, what have you, but as soon as I put in a Site object, it crashes when I try to get the contents of the shelve.
So when I fill up my shelve like this:
def add_to_shelve(self, site):
    db = shelve.open("database")
    print site, site.url
    for word in site.content:
        db[word] = site.url  # site.url is a string, word has to be one too
shelve.open("database")['whatever'] works perfectly.
But if I do this:
def add_to_shelve(self, site):
    db = shelve.open("database")
    print site, site.url
    for word in site.content:
        db[word] = site  # site is now an object of Site
shelve.open("database")['whatever'] errors out with this error message:
AttributeError: 'module' object has no attribute 'Site'
I'm completely stumped, and the Python docs, strangely, don't have much info either. All they say is that the key in a shelve has to be a string, but the value or data can be "an arbitrary object".

It looks like you refactored your code after saving objects in the shelve. When retrieving objects from the shelve, Python rebuilds the object, and it needs to find the original class that, presumably, you have moved. This problem is typical when working with pickle (as the shelve module does).
The solution, as pduel suggests, is to provide a backwards-compatibility reference to the class in the same location that it used to be, so that pickle can find it. If you re-save all the objects, thereby rebuilding the pickles, you can remove that backwards-compatibility reference.

It seems that Python is looking for a constructor for a 'Site' object, and not finding it. I have not used shelve, but I recall the rules for what can be pickled are byzantine, and suspect the shelve rules are similar.
Try adding the line:
Site = sitemodule.Site
(with the name of the module providing 'Site') before you try unshelving. This ensures that a Site class can be found.
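The mechanism can be reproduced with pickle alone, since shelve pickles its values. In this Python 3 sketch the module name sitemodule and the Site class are stand-ins I made up; the module is created in memory so the example is self-contained. A pickle stores only the module and class name, so loading fails when the name is gone and works again once an alias is put back.

```python
import pickle
import sys
import types

# Build a throwaway module in memory to play the role of the original module.
sitemodule = types.ModuleType("sitemodule")  # hypothetical module name
sys.modules["sitemodule"] = sitemodule


class Site:
    def __init__(self, url):
        self.url = url


# Make the class look as if it had been defined in sitemodule.
Site.__module__ = "sitemodule"
Site.__qualname__ = "Site"
sitemodule.Site = Site

# The pickle records "sitemodule" and "Site", not the class body itself.
blob = pickle.dumps(Site("http://example.com"))

# Simulate a refactoring that moved or renamed the class.
del sitemodule.Site
try:
    pickle.loads(blob)
    error = None
except AttributeError as exc:
    error = exc

# Backwards-compatibility alias: put the name back where pickle expects it.
sitemodule.Site = Site
restored = pickle.loads(blob)

print(error)
print(restored.url)
```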

Make python doc property display correctly in a Windows CMD window?

In Windows, if I open a command prompt, start python, and inspect something using its __doc__ property, it doesn't display correctly. Instead of the lines being separated, I see one continuous string with a newline character every once in a while.
Is there a way to make it appear correctly?
Here's an example of what I see:
>>> hashlib.__doc__
'hashlib module - A common interface to many hash functions.\n\nnew(name, string=\'\') - returns a n
ew hash object implementing the\n given hash function; initializing the hash\n
using the given string data.\n\nNamed constructor functions are also availabl
e, these are much faster\nthan using new():\n\nmd5(), sha1(), sha224(), sha256(), sha384(), and sha5
12()\n\nMore algorithms may be available on your platform but the above are\nguaranteed to exist.\n\
nNOTE: If you want the adler32 or crc32 hash functions they are available in\nthe zlib module.\n\nCh

Rather than pulling __doc__ yourself, try this:
>>> help(hashlib)
It will give you a nicely formatted summary of the module, including (but not limited to) the docstring.

Try
>>> print hashlib.__doc__
or, in Python 3:
>>> print(hashlib.__doc__)
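The difference comes from how the interactive prompt displays a bare expression: it shows repr() of the value, where newlines appear as the two characters \n, while print writes the string itself. A small Python 3 sketch (the variable names are mine):

```python
import hashlib

doc = hashlib.__doc__

# What the >>> prompt shows when you type the expression on its own.
echoed = repr(doc)

# The actual string contains real newline characters.
first_line = doc.splitlines()[0]

print(echoed[:60])   # escaped form, everything on one line
print(first_line)    # first real line of the docstring
print(doc)           # full docstring with real line breaks
```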

def help_(obj):
    if type(obj).__name__ == 'ufunc':
        print obj.__doc__
    else:
        help(obj)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.
