Disable python's help() from accessing web - python

I am using Python 2.7. When I enter help() and then enter "modules", I get the message:
>>> help()
Welcome to Python 2.7! This is the online help utility.
...
help> modules
Please wait a moment while I gather a list of all available modules...
Then I get a series of warnings
Warning: cannot register existing type 'GtkWidget'
...
Warning: cannot add class private field to invalid type '<invalid>'
...
Then the whole thing hangs... to the point where I had to start a second remote session to send a SIGKILL.
Obviously something is wrong, but what surprised me most was that it appears to reach out to the web to gather information.
Isn't Python's help documentation stored locally? How do I stop it from going out to the web? I want regular help, not online help.

The help() command does not search the internet; "online" simply means that you can use it interactively. The documentation calls it the "built-in help system", which is less ambiguous. What it does is traverse everything on the PYTHONPATH and try to import every module in order to see which modules are available on your system.
Here's the source code used to obtain the list of modules (you can find it under Lib/pydoc.py in the Python sources):
    def listmodules(self, key=''):
        if key:
            self.output.write('''
Here is a list of matching modules. Enter any module name to get more help.
''')
            apropos(key)
        else:
            self.output.write('''
Please wait a moment while I gather a list of all available modules...
''')
            modules = {}
            def callback(path, modname, desc, modules=modules):
                if modname and modname[-9:] == '.__init__':
                    modname = modname[:-9] + ' (package)'
                if modname.find('.') < 0:
                    modules[modname] = 1
            def onerror(modname):
                callback(None, modname, None)
            ModuleScanner().run(callback, onerror=onerror)
            self.list(modules.keys())
            self.output.write('''
Enter any module name to get more help. Or, type "modules spam" to search
for modules whose descriptions contain the word "spam".
''')
The ModuleScanner class simply traverses the built-in modules and the modules that pkgutil.walk_packages finds; that function ultimately calls the iter_modules method of the importer objects. The built-in importer does not support importing modules from the internet, so the internet is not searched. If you install custom importers, then help() may trigger an internet search.
If you have a lot of modules available, this operation may take some time. Some modules also take significant time to import (e.g. numpy and scipy may take on the order of seconds to load).
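If the goal is only to see which modules are installed, the following is a minimal sketch that lists top-level module names from the local search path without importing them, which avoids the import-time warnings and hangs described in the question. The function name list_top_level_modules is an illustrative assumption and is not part of pydoc; it relies only on sys.builtin_module_names and pkgutil.iter_modules from the standard library, and it does not descend into packages.

```python
import pkgutil
import sys


def list_top_level_modules():
    """Return sorted top-level module names found locally, without importing them."""
    # Modules compiled into the interpreter itself.
    names = set(sys.builtin_module_names)
    # iter_modules() asks the importers for each sys.path entry what they contain;
    # each item is a (finder, name, is_package) triple.
    for _finder, name, _is_package in pkgutil.iter_modules():
        names.add(name)
    return sorted(names)


modules = list_top_level_modules()
print("%d top-level modules found" % len(modules))
```

Nothing here touches the network: every name comes either from the interpreter or from directories and archives on sys.path.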

On the console, run export PYTHONDOCS=/usr/share/doc/python2/html/ to tell Python where it should look for the help documentation.

Related

Pythonic way to set module-wide settings from external file

Some background (not mandatory, but might be nice to know): I am writing a Python command-line module which is a wrapper around latexdiff. It basically replaces all \cite{ref1, ref2, ...} commands in LaTeX files with written-out and properly formatted references before passing the files to latexdiff, so that latexdiff will properly mark changes to references in the text (otherwise, it treats the whole \cite{...} command as a single "word"). All the code is currently in a single file which can be run with python -m latexdiff-cite, and I have not yet decided how to package or distribute it. To make the script useful for anybody else, the citation formatting needs to be configurable. I have implemented an optional command-line argument -c CONFIGFILE to allow the user to point to their own JSON config file (a default file resides in the module folder and is loaded if the argument is not used).
Current implementation: My single-file command-line Python module currently parses command-line arguments in if __name__ == '__main__', and loads the config file (specified by the user in -c CONFIGFILE) here before running the main function of the program. The config variable is thus available in the entire module and all is well. However, I'm considering publishing to PyPI by following this guide which seems to require me to put the command-line parsing in a main() function, which means the config variable will not be available to the other functions unless passed down as arguments to where it's needed. This "passing down by arguments" method seems a little cluttered to me.
Question: Is there a more pythonic way to set some configuration globals in a module or otherwise accomplish what I'm trying to? (I don't want to rely on 3rd party modules.) Am I perhaps completely off the tracks in some fundamental way?
One way to do it is to have the configurations defined in a class or a simple dict:
import json

class Config(object):
    setting1 = "default_value"
    setting2 = "default_value"

    @staticmethod
    def load_config(json_file):
        """ load settings from config file """
        with open(json_file) as f:
            config = json.load(f)
        # items() works on both Python 2 and Python 3
        for k, v in config.items():
            setattr(Config, k, v)
Then your application can access the settings via this class: Config.setting1 ...
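To connect this with the main() entry point from the question, here is a minimal sketch of how the class could be filled in once and then read from anywhere in the module. The setting name citation_style, the helper format_citation, and the output format are all hypothetical; only the -c CONFIGFILE option comes from the question.

```python
import argparse
import json


class Config(object):
    # Hypothetical default; a real tool would define its own settings here.
    citation_style = "default"

    @staticmethod
    def load_config(json_file):
        """Overwrite the class-level defaults with values from a JSON file."""
        with open(json_file) as f:
            for key, value in json.load(f).items():
                setattr(Config, key, value)


def format_citation(key):
    # Reads the module-wide setting directly; nothing is passed down as an argument.
    return "[%s:%s]" % (Config.citation_style, key)


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("-c", dest="configfile", default=None)
    args = parser.parse_args(argv)
    if args.configfile:
        Config.load_config(args.configfile)
    return format_citation("ref1")
```

Because the settings live on the class rather than in local variables of main(), helper functions do not need a config parameter. The trade-off is that this is shared mutable state, so tests should reset it between runs.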

Writing a Python module that can tell when it's hit with Spyder's UMD (user module deleter)

Spyder's UMD is usually great for me but periodically I trip myself when writing a module that I don't want to delete and reload. I know I can control the UMD via Tools > Preferences > Console > Advanced settings > User Module Deleter. But I would also like to be able to mark certain modules I write as non-UMD friendly in the code of the module itself.
In a perfect world I would just write something like
assert_no_umd()
which would throw an exception if the module is hit by the UMD. It would be fine if the code was tripped by any reloading of the module, whether by UMD or otherwise.
Note that this is different from Method that gets called on module deletion in Python because that question is about cleaning up a database connection which only needs to be done once, and therefore can be done with atexit.
(Spyder dev here) If I understand you correctly, this would be my assert_no_umd function:
import os

def assert_no_umd():
    mod = __file__
    if os.environ.get("UMD_ENABLED", "").lower() == "true":
        namelist = os.environ.get("UMD_NAMELIST", None)
        if namelist is not None:
            namelist = namelist.split(',')
            if mod not in namelist:
                raise ValueError('UMD active!!')
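Since the question says it would be fine to trip on any reload, a more generic guard is also possible. The following is a minimal sketch that does not use any Spyder API: it records the module name in a registry stored on the sys module, under the assumption that sys itself is never deleted or reloaded. The function name assert_first_import and the attribute name _no_reload_registry are hypothetical.

```python
import sys


def assert_first_import(module_name):
    """Raise if module_name has already registered itself in this interpreter."""
    # The registry is kept on sys (hypothetical attribute name) so that it
    # survives when the calling module is removed from sys.modules and
    # imported again, or when it is re-executed by reload().
    registry = getattr(sys, "_no_reload_registry", None)
    if registry is None:
        registry = set()
        sys._no_reload_registry = registry
    if module_name in registry:
        raise RuntimeError("%s was reloaded or re-imported" % module_name)
    registry.add(module_name)
```

A module that must not be reloaded would call assert_first_import(__name__) near the top of its body. The helper has to live in a separate module that is itself excluded from reloading, otherwise its own definition is simply replaced along with everything else.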

How to read LV2 ttl file in Python?

I have an LV2 plugin and I want to use Python to extract its metadata - plugin name, description, list of control and audio ports and specification of each port.
With LADSPA the instructions were pretty clear, although a bit difficult to implement in Python: I just needed to call the ladspa_descriptor() function. Now with LV2 there's a .ttl file, which is simple to access but more complicated to parse.
Is there any python library that will make this job simple?
The LV2 documentation generation tools use RDFLib. It is probably the most popular RDF interface for Python, though it does much more than just parse Turtle. It is a good choice if performance is not an issue, but it is unfortunately really slow.
If you need to actually instantiate and use plugins, you probably want to use an existing LV2 implementation. As Steve mentioned, Lilv is for this. It is not limited to any static default location, but will look in all the locations in LV2_PATH. You can set this environment variable to whatever you want before calling Lilv and it will only look in those locations. Alternatively, if you want to specifically load just one bundle at a time, there is a function for that: lilv_world_load_bundle().
There are SWIG-based Python bindings included with Lilv, but they stop short of actually allowing you to process data. However there is a project to wrap Lilv that allows processing of audio using scipy arrays: http://pyslv2.sourceforge.net/ (despite the name they are indeed Lilv bindings and not bindings for its predecessor SLV2)
That said, if you only need to get static information from the Turtle files, involving C libraries is probably more trouble than it is worth. One of the big advantages of using standard data files is ease of use with existing tools. To get the number of ports on a plugin, you simply need to count the number of triples that match the pattern (plugin, lv2:port, *). Here is an example Python script that prints the number of ports of a plugin, given the file to read and the plugin URI as command line arguments:
#!/usr/bin/env python
import rdflib
import sys

lv2 = rdflib.Namespace('http://lv2plug.in/ns/lv2core#')
path = sys.argv[1]
plugin = rdflib.URIRef(sys.argv[2])
model = rdflib.ConjunctiveGraph()
model.parse(path, format='n3')
num_ports = 0
# triples() takes a single (subject, predicate, object) tuple; None is a wildcard
for i in model.triples((plugin, lv2.port, None)):
    num_ports += 1
print('%s has %u ports' % (plugin, num_ports))
This is how to get the number of ports each plugin supports:
import lilv

w = lilv.World()
w.load_all()
for p in w.get_all_plugins():
    print p.get_name().as_string(), p.get_num_ports()
At least this is all I got while trying to figure this out.

Getting python -m module to work for a module implemented in C

I have a pure C module for Python and I'd like to be able to invoke it using the python -m modulename approach. This works fine with modules implemented in Python and one obvious workaround is to add an extra file for that purpose. However I really want to keep things to my one single distributed binary and not add a second file just for this workaround.
I don't care how hacky the solution is.
If you do try to use a C module with -m then you get an error message No code object available for <modulename>.
The -m implementation is in runpy._run_module_as_main. Its essence is:
mod_name, loader, code, fname = _get_module_details(mod_name)
<...>
exec code in run_globals
A compiled module has no code object associated with it, so the first statement fails with ImportError("No code object available for <module>"). You need to extend runpy, specifically _get_module_details, to make it work for a compiled module. I suggest returning a code object constructed from a wrapper such as "import mod; mod.main()":
(python 2.6.1)
     code = loader.get_code(mod_name)
     if code is None:
+        if loader.etc[2]==imp.C_EXTENSION:
+            code=compile("import %(mod)s; %(mod)s.main()"%{'mod':mod_name},"<extension loader wrapper>","exec")
+        else:
+            raise ImportError("No code object available for %s" % mod_name)
-        raise ImportError("No code object available for %s" % mod_name)
     filename = _get_filename(loader, mod_name)
(Update: fixed an error in format string)
Now...
C:\Documents and Settings\Пользователь>python -m pythoncom
C:\Documents and Settings\Пользователь>
This still won't work for builtin modules. Again, you'll need to invent some notion of "main code unit" for them.
Update:
I've looked through the internals called from _get_module_details and can say with confidence that they don't even attempt to retrieve a code object from a module of any type other than imp.PY_SOURCE, imp.PY_COMPILED or imp.PKG_DIRECTORY. So you have to patch this machinery one way or another for -m to work. Python fails before retrieving anything from your module (it doesn't even check whether the DLL is a valid module), so you can't do anything by building it in a special way.
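The wrapper idea from the patch can be tried in isolation, without touching runpy. The following is a minimal sketch that builds the same kind of code object with compile() and executes it. The module fake_ext is a stand-in created with types.ModuleType, since a real compiled extension exposing a main() function is an assumption about your module; make_wrapper_code is likewise a hypothetical helper name.

```python
import sys
import types


def make_wrapper_code(mod_name):
    """Build a code object that imports mod_name and calls its main()."""
    source = "import %(mod)s; %(mod)s.main()" % {"mod": mod_name}
    return compile(source, "<extension loader wrapper>", "exec")


# Stand-in for a compiled extension module that exposes main().
calls = []
fake_ext = types.ModuleType("fake_ext")
fake_ext.main = lambda: calls.append("main called")
sys.modules["fake_ext"] = fake_ext

# Roughly what runpy does with the code object it obtains: execute it in a
# fresh namespace whose __name__ is "__main__".
code = make_wrapper_code("fake_ext")
exec(code, {"__name__": "__main__"})
print(calls)
```

This shows only that a synthesized code object is enough to drive an extension's entry point; it does not make python -m accept the module, which still requires the runpy change above.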
Does your requirement of single distributed binary allow for the use of an egg? If so, you could package your module with a __main__.py with your calling code and the usual __init__.py...
If you're really adamant, maybe you could extend pkgutil.ImpLoader.get_code to return something for C modules (e.g., maybe a special __code__ function). To do that, I think you're going to have to actually change it in the Python source. Even then, pkgutil uses exec to execute the code block, so it would have to be Python code anyway.
TL;DR: I think you're euchred. While Python modules have code at the global level that runs at import time, C modules don't; they're mostly just a dict namespace. Thus, running a C module doesn't really make sense from a conceptual standpoint. You need some real Python code to direct the action.
I think that you need to start by making a separate file in Python and getting the -m option to work. Then, turn that Python file into a code object and incorporate it into your binary in such a way that it continues to work.
Look up setuptools in PyPi, download the .egg and take a look at the file. You will see that the first few bytes contain a Python script and these are followed by a .ZIP file bytestream. Something similar may work for you.
There's a brand new thing that may solve your problems easily. I've just learnt about it and it looks pretty decent to me: http://code.google.com/p/pts-mini-gpl/wiki/StaticPython

Why does calling kernel32.GetModuleHandleA() for msvcr100 fail in Python?

I am having a problem calling GetModuleHandleA() from Python. I have a module that attaches to a process as a debugger, and I'm working on a function that returns the address of a function in a specific DLL module. GetModuleHandleA("msvcr100") fails every time.
from ctypes import *
kernel32 = windll.kernel32
The function is declared as part of a larger debug class. This is the relevant part of the declaration:
    def resolve_function(self,dll,function):
        handle = kernel32.GetModuleHandleA(dll)
        if handle == False:
            print "kernel32.GetModuleNameA() failed!!!"
            return False
        address = kernel32.GetProcAddress(handle, function)
        if address == False:
            print "kernel32.GetProcAddress() failed!!!"
            return False
        kernel32.CloseHandle(handle)
        return address
The function is called as:
function_address = debug.resolve_function("msvcr100", "printf")
I run a separate process that uses printf() and then attach to it. Everything works fine until I get to GetModuleHandleA(), which returns False every time.
Code that runs printf():
from ctypes import *
import time

msvcr100 = cdll.msvcr100
counter = 0
while 1:
    msvcr100.printf("Counter = %d\n" % counter)
    time.sleep(1)
    counter += 1
Any ideas?
You've found the solution to your problem, but I'm answering anyway to explain why your original effort failed (and why your fix worked).
First, msvcrt/msvcr100 are two different versions of Microsoft's C runtime library. There are other versions as well, and all of them contain their own definitions for printf(). A given process may have any one of them loaded, or multiple versions loaded, or no versions loaded - it's possible to produce console output using only WinAPI functions! In short, if it's not your process, you can't depend on any given version of the C runtime being available.
Second, GetModuleHandle() doesn't load anything. It returns a handle to the named module only if it has already been loaded. msvcr100.dll can be sitting right there on disk, but if the process hasn't already loaded it then GetModuleHandle won't give a handle to you. LoadLibrary() is the function you'd call if you wanted to both load and retrieve a handle to a named module... But you probably don't want to do this in a process you don't own.
FWIW, Process Explorer is a handy tool for viewing the DLLs already loaded by a process.
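The point that GetModuleHandle() only reports modules already loaded into the calling process can be checked with a minimal ctypes sketch. The helper name get_loaded_module_handle is hypothetical, the wide-character GetModuleHandleW variant is used here instead of the GetModuleHandleA call from the question, and the function returns None on non-Windows platforms purely so that the snippet can be imported anywhere.

```python
import ctypes
import sys


def get_loaded_module_handle(name):
    """Return the handle of a module already loaded in THIS process.

    Returns None when not running on Windows. Raises OSError (via
    ctypes.WinError) if the module is not loaded in the current process.
    """
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetModuleHandleW.argtypes = [ctypes.c_wchar_p]
    # Module handles are pointer-sized; the default int restype could truncate them.
    kernel32.GetModuleHandleW.restype = ctypes.c_void_p
    handle = kernel32.GetModuleHandleW(name)
    if not handle:
        # Surface the Win32 error (for example 126, module not found).
        raise ctypes.WinError(ctypes.get_last_error())
    return handle
```

On Windows, get_loaded_module_handle("kernel32") should succeed because kernel32 is loaded in every process, while a runtime DLL that the current process never loaded raises an error even if the file exists on disk.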
After modifying:
...
        handle = kernel32.GetModuleHandleA(dll)
        if handle == False:
            error = GetLastError()
            print "ERROR: %d - %s" % (error, FormatError(error))
            return False
...
I get: ERROR: 126 - The specified module could not be found
I actually replaced msvcr100.dll with msvcrt.dll in my code and it worked perfectly. I found out that msvcrt.dll is a system DLL, while msvcr100.dll ships with Visual Studio 2010. They are both located in C:\Windows\system32. It is still a mystery to me why msvcr100.dll did not work.
Use GetLastError() (or WinError) from ctypes to find out why you're getting a NULL return, then add that information to your error message. Even after you figure out this specific problem, you'll want that more robust error reporting.
See the ctypes docs for details: http://docs.python.org/library/ctypes.html
Try calling:
msvcr100 = cdll.msvcr100
before calling:
function_address = debug.resolve_function("msvcr100", "printf")
to make sure the DLL is loaded in your process. msvcrt might work because it was already loaded.
