Does anyone know how pydev determines what to use for code completion? I'm trying to define a set of classes specifically to enable code completion. I've tried using __new__ to set __dict__ and also __slots__, but neither seems to get listed in pydev autocomplete.
I've got a set of enums I want to list in autocomplete, but I'd like to set them in a generator, not hardcode them all for each class.
So rather than
class TypeA(object):
    ValOk = 1
    ValSomethingSpecificToThisClassWentWrong = 4

    def __call__(self):
        return 42
I'd like to do something like
def TYPE_GEN(name, val, enums={}):
    def call(self):
        return val
    dct = {}
    dct["__call__"] = call
    dct['__slots__'] = enums.keys()
    for k, v in enums.items():
        dct[k] = v
    return type(name, (), dct)
TypeA = TYPE_GEN("TypeA", 42, {"ValOk": 1, "ValSomethingSpecificToThisClassWentWrong": 4})
What can I do to help the processing out?
edit:
The comments seem to be about questioning what I am doing. Again, a big part of what I'm after is code completion. I'm using a Python binding to a protocol to talk to various microcontrollers. Each parameter I can change (there are hundreds) has a conceptual name, but over the protocol I need to use its ID, which is effectively random. Many of the parameters accept values that are also conceptually named, but are again represented by integers. Hence the enums.
I'm trying to autogenerate a Python module for the library, so the group can specify what they want to change using the names instead of the error-prone numbers. The __call__ method will return the ID of the parameter, and the enums are the allowable values for the parameter.
Yes, I can generate the verbose version of each class. One line for each type seemed clearer to me, since the point is autocomplete, not viewing these classes.
OK, as pointed out, your code is too dynamic for this... PyDev will only analyze your code statically (i.e.: code that lives inside your project).
Still, there are some alternatives there:
Option 1:
You can force PyDev to analyze code that's in your library (i.e.: in site-packages) dynamically, in which case it could get that information dynamically through a shell.
To do that, you'd have to create a module in site-packages and in your interpreter configuration you'd need to add it to the 'forced builtins'. See: http://pydev.org/manual_101_interpreter.html for details on that.
Option 2:
Another option would be putting it into your predefined completions (but in this case it also needs to be in the interpreter configuration, not in your code -- and you'd have to make the completions explicit there anyway). See the link above for how to do this too.
Option 3:
Generate the actual code. I believe Cog (http://nedbatchelder.com/code/cog/) is the best alternative for this, as you write Python code that outputs the contents of the file, and you can later change that code and rerun Cog to update what's needed. If you want proper completions without having to treat your code as if it were a library in PyDev, I believe this would be the best alternative -- and you'd have a better grasp of what you have, since your structure would be explicit there.
Note that Cog also works if you're in other languages such as Java/C++, etc., so it's something I'd recommend adding to your tool set regardless of this particular issue.
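For instance, a Cog-annotated module might look roughly like the sketch below (the PARAMS data and the exact command used to run Cog are illustrative; Cog writes the generated classes between the ]]] and [[[end]]] markers):
# params.py -- processed with something like `cog -r params.py`
# [[[cog
# import cog
# PARAMS = {"TypeA": (42, {"ValOk": 1,
#                          "ValSomethingSpecificToThisClassWentWrong": 4})}
# for name, (pid, enums) in sorted(PARAMS.items()):
#     cog.outl("class %s(object):" % name)
#     for k, v in sorted(enums.items()):
#         cog.outl("    %s = %d" % (k, v))
#     cog.outl("    def __call__(self):")
#     cog.outl("        return %d" % pid)
# ]]]
# [[[end]]]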
Fully general code completion for Python isn't actually possible in an "offline" editor (as opposed to in an interactive Python shell).
The reason is that Python is too dynamic; basically anything can change at any time. If I type TypeA.Val and ask for completions, the system has to know what object TypeA is bound to, what its class is, and what the attributes of both are. All three of those facts can change (and do; TypeA starts undefined and is only bound to an object at some specific point during program execution).
So the system would have to know: at what point in the program run do you want the completions? And even if there were some unambiguous way of specifying that, there's no general way to know the state of everything in the program at that point without actually running it to that point, which you probably don't want your editor to do!
So what pydev does instead is guess, when it's pretty obvious. If you have a class block in a module foo defining class Bar, then it's a safe bet that the name Bar imported from foo is going to refer to that class. And so you know something about what names are accessible under Bar., or on an object created by obj = Bar(). Sure, the program could be rebinding foo.Bar (or altering its set of attributes) at runtime, or could be run in an environment where import foo is hitting some other file. But that sort of thing happens rarely, and the completions are useful in the common case.
What that means, though, is that you basically lose completions whenever you use "too much" of Python's dynamic language flexibility. Defining a class by calling a function is one of those cases. It can't reasonably guess that TypeA has the names ValOk and ValSomethingSpecificToThisClassWentWrong; after all, there are presumably lots of other objects that result from calls to TYPE_GEN, but they all have different names.
So if your main goal is to have completions, I think you'll have to make it easy for PyDev and write these classes out in full. Of course, you could use similar code to generate the Python files (textually) if you wanted. It looks like there's actually more "syntactic overhead" in defining these with dictionaries than as a class, though; you're writing "a": b, per item rather than a = b. Unless you can generate these more systematically or parse existing definition files or something, I think I'd find the static class definition easier to read and write than the dictionary driving TYPE_GEN.
The simpler your code, the more likely completion is to work. Would it be reasonable to have this as a separate tool that generates Python code files containing the class definitions like you have above? This would essentially be the best of both worlds. You could even put the name/value pairs in a JSON or INI file or what have you, eliminating the clutter of the method definitions among the name/value pairs. The only downside is needing to run the tool to regenerate the code files when the codes change, but at least that's an automated, simple process.
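A rough sketch of such a generator script, assuming the parameters live in a JSON file (the file names and the JSON layout are invented for illustration):
import json

TEMPLATE = """class {name}(object):
{enums}
    def __call__(self):
        return {pid}
"""

# params.json might look like: {"TypeA": {"id": 42, "enums": {"ValOk": 1}}}
with open("params.json") as f:
    params = json.load(f)

with open("generated_params.py", "w") as out:
    for name, spec in sorted(params.items()):
        enums = "\n".join("    %s = %d" % (k, v)
                          for k, v in sorted(spec["enums"].items()))
        out.write(TEMPLATE.format(name=name, enums=enums, pid=spec["id"]) + "\n")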
Personally, I would just go with making things more verbose and writing out the classes manually, but that's just my opinion.
On a side note, I don't see much benefit in making the classes callable vs. just having an id class variable. Both require knowing what to type: TypeA() vs TypeA.id. If you want to prevent instantiation, I think throwing an exception in __init__ would be a bit more clear about your intentions.
Related
I have an application that dynamically generates a lot of Python modules with class factories, to eliminate a lot of redundant boilerplate that makes the code hard to debug across similar implementations. It works well, except that dynamically generating the classes across the modules (hundreds of them) takes more time to load than simply importing from a file. So I would like to find a way to save the modules to a file after generation (unless reset) and then load from those files to cut down on bootstrap time for the platform.
Does anyone know how I can save/export auto-generated Python modules to a file for re-import later. I already know that pickling and exporting as a JSON object won't work because they make use of thread locks and other dynamic state variables and the classes must be defined before they can be pickled. I need to save the actual class definitions, not instances. The classes are defined with the type() function.
If you have ideas or knowledge on how to do this, I would really appreciate your input.
You’re basically asking how to write a compiler whose input is a module object and whose output is a .pyc file. (One plausible strategy is of course to generate a .py and then byte-compile that in the usual fashion; the following could even be adapted to do so.) It’s fairly easy to do this for simple cases: the .pyc format is very simple (but note the comments there), and the marshal module does all of the heavy lifting for it. One point of warning that might be obvious: if you’ve already evaluated, say, os.getcwd() when you generate the code, that’s not at all the same as evaluating it when loading it in a new process.
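A minimal sketch of that "generate a .py, then byte-compile it" route, assuming you can render the generated module back to source text (the file name and the stand-in source string are invented):
import py_compile

source = "class TypeA(object):\n    ValOk = 1\n"   # stand-in for your generated source
with open("generated_types.py", "w") as f:
    f.write(source)

py_compile.compile("generated_types.py")  # writes the cached .pyc for later imports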
The “only” other task is constructing the code objects for the module and each class: this requires concatenating a large number of boring values from the dis module, and will fail if any object encountered is non-trivial. These might be global/static variables/constants or default argument values: if you can alter your generator to produce modules directly, you can probably wrap all of these (along with anything else you want to defer) in function calls by compiling something like
my_global = (lambda: open(os.devnull, 'w'))()
so that you actually emit the function and then a call to it. If you can’t so alter it, you’ll have to have rules to recognize values that need to be constructed in this fashion so that you can replace them with such calls.
Another detail that may be important is closures: if your generator uses local functions/classes, you’ll need to create the cell objects, perhaps via “fake” closures of your own:
def cell(x): return (lambda: x).__closure__[0]
I am starting a new Python project that is supposed to run both sequentially and in parallel. However, because the behavior is entirely different, running in parallel would require a completely different set of classes than those used when running sequentially. But there is so much overlap between the two codes that it makes sense to have a unified code and defer the parallel/sequential behavior to a certain group of classes.
Coming from a C++ world, I would let the user set a Parallel or Serial class in the main file and use that as a template parameter to instantiate other classes at runtime. In Python there is no compilation time, so I'm looking for the most Pythonic way to accomplish this. Ideally, it would be great if the code determined whether the user is running sequentially or in parallel and selected the classes automatically. So if the user runs mpirun -np 4 python __main__.py, the code should behave entirely differently than when the user calls just python __main__.py. Somehow it makes no sense to me to have if statements determining the type of an object at runtime; there has to be a much more elegant way to do this. In short, I would like to avoid:
if isinstance(a, Parallel):
    m = ParallelObject()
elif isinstance(a, Serial):
    m = SerialObject()
I've been reading about this, and it seems I can use factories (which somewhat have this conditional statement buried in the implementation). Yet, using factories for this problem is not an option because I would have to create too many factories.
In fact, it would be great if I can just "mimic" C++'s behavior here and somehow use Parallel/Serial classes to choose classes properly. Is this even possible in Python? If so, what's the most Pythonic way to do this?
Another idea would be to detect whether the user is running in parallel or sequentially and then load the appropriate module (either from a parallel or sequential folder) with the appropriate classes. For instance, I could have the user type in the main script:
from myPackage.parallel import *
or
from myPackage.serial import *
and then have the parallel or serial folders import all shared modules. This would allow me to keep all classes that differentiate parallel/serial behavior with the same names. This seems to be the best option so far, but I'm concerned about what would happen when I'm running py.test because some test files will load parallel modules and some other test files would load the serial modules. Would testing work with this setup?
You may want to check how a similar issue is solved in the stdlib: https://github.com/python/cpython/blob/master/Lib/os.py - it's not a 100% match to your own problem, nor the only possible solution FWIW, but you can safely assume this to be a rather "pythonic" solution.
wrt/ the "automagic" thing depending on execution context, if you decide to go for it, by all means make sure that 1/ both implementations can still be explicitely imported (like os.ntpath and os.posixpath) so they are truly unit-testable, and 2/ the user can still manually force the choice.
EDIT:
So if I understand it correctly, the file you point to imports modules depending on (...)
What it "depends on" is actually mostly irrelevant (in this case it's a builtin name because the target OS is known when the runtime is compiled, but this could be an environment variable, a command line argument, a value in a config file etc). The point was about both conditional import of modules with same API but different implementations while still providing direct explicit access to those modules.
So in a similar way, I could let the user type from myPackage.parallel import * and then in myPackage/__init__.py I could import all the required modules for the parallel calculation. Is this what you suggest?
Not exactly. I posted this as an example of conditional imports mostly, and eventually as a way to build a "bridge" module that can automagically select the appropriate implementation at runtime (on which basis it does so is up to you).
The point is that the end user should be able to either explicitly select an implementation (by explicitly importing the right submodule - serial or parallel - and using it directly) OR - still explicitly - ask the system to select one or the other depending on the context.
So you'd have myPackage.serial and myPackage.parallel (just as they are now), and an additional myPackage.automagic that dynamically selects either serial or parallel. The "recommended" choice would then be to use the "automagic" module so the same code can be run either serial or parallel without the user having to care about it, but with still the ability to force using one or the other where it makes sense.
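A rough sketch of what that "automagic" bridge module could look like, in the spirit of os.py (how the choice is made - here an environment variable named MYPACKAGE_MODE, which is purely illustrative - is up to you):
# myPackage/automagic.py
import os as _os

if _os.environ.get("MYPACKAGE_MODE", "serial") == "parallel":
    from myPackage import parallel as _impl
else:
    from myPackage import serial as _impl

# re-export the chosen implementation's public names, os.path-style
globals().update({_k: _v for _k, _v in vars(_impl).items() if not _k.startswith("_")})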
My fear is that py.test will end up with modules from both parallel and serial loaded while testing different files and create a mess
Why and how would this happen? Remember that Python has no "process-global" namespace - "globals" are really "module-level" only - and that Python's import is absolutely nothing like C/C++ includes.
import loads a module object (which can be built directly from Python source code, or from compiled C code, or even dynamically created - remember, at runtime a module is an object, an instance of the module type) and binds this object (or attributes of this object) into the enclosing scope. Also, modules are guaranteed (with a couple of caveats, but those are to be considered error cases) to be imported only once for a given process (and then cached), so importing the same module twice in the same process will yield the same object (IOW a module is a singleton).
All this means that given something like
# module A
def foo():
    return bar(42)

def bar(x):
    return x * 2
and
# module B
def foo():
    return bar(33)

def bar(x):
    return x / 2
It's guaranteed that however you import from A and B, A.foo will ALWAYS call A.bar and NEVER call B.bar, and B.foo will only ever call B.bar (unless you explicitly monkeypatch them of course, but that's not the point).
Also, this means that within a module you cannot have access to the importing namespace (the module or function that's importing your module), so you cannot have a module depending on "global" names set by the importer.
To make a long story short, you really need to forget about C++ and learn how Python works, as those are wildly different languages with wildly different object models, execution models and idioms. A couple interesting reads are http://effbot.org/zone/import-confusion.htm and https://nedbatchelder.com/text/names.html
EDIT 2:
(about the 'automagic' module)
I would do that based on whether the user runs mpirun or just python. However, it seems it's not possible (see for instance this or this) in a portable way without a hack. Any ideas in that direction?
I've never ever had anything to do with mpi so I can't help with this - but if the general consensus is that there's no reliable portable way to detect this then obviously there's your answer.
This being said, simple stupid solutions are sometimes overlooked. In your case, explicitly setting an environment variable or passing a command-line switch to your main script would JustWork(tm), i.e. the user should for example use
SOMEFLAG=serial python main.py
vs
SOMEFLAG=parallel mpirun -np4 python main.py
or
python main.py serial
vs
mpirun -np4 python main.py parallel
(whichever works best for your needs and is the most easily portable).
This of course requires a bit more documentation and some more effort from the end-user but well...
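As a sketch, the command-line variant could be handled with a few lines in the main script (the submodule names match the discussion above; the flag handling is deliberately minimal):
# main.py
import sys

mode = sys.argv[1] if len(sys.argv) > 1 else "serial"
if mode == "parallel":
    from myPackage import parallel as impl
else:
    from myPackage import serial as impl

# the rest of the code then works against `impl`, whatever classes it defines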
I'm not really sure what you're asking here. Python classes are just (callable/instantiable) objects themselves, so you can of course select and use them conditionally. If multiple classes within multiple modules are involved, you can also make the imports conditional.
if user_says_parallel:
    from myPackage.parallel import ParallelObject
    ObjectClass = ParallelObject
else:
    from myPackage.serial import SerialObject
    ObjectClass = SerialObject

my_abstract_object = ObjectClass()
Whether that's very useful depends on your classes and the effort it takes to make sure they have the same API, so they're compatible when replacing each other. Maybe even inheritance à la ParallelObject => SerialObject is possible, or at least a common (virtual) base class to hold all the shared code. But that's just the same as in C++.
I just want to check whether a particular function is called by another function or not. If it is, I have to store the calling function in one category, and functions that do not call the particular function will be stored in a different category.
I have 3 .py files with classes and functions in them, and I need to check each and every function. E.g., let's say there is a function trial(): if a function calls this function, then that function is in the "example" category, else "non-example".
I have no idea what you are asking, but even if it is technically possible, the one and only answer is: don't do that.
If your design is such that method A needs to know whether it was called from method B or C, then your design is most likely ... broken. Having such dependencies within your code will quickly turn the whole thing unmaintainable. Simply because you will very soon be constantly asking yourself "that path seems fine, but what will happen over here?"
One way out of that: create different methods, so that B can call something other than what C calls; but of course, you should still extract the common parts into one method.
Long story short: my non-answer is: take your current design and have some other people review it. You should step back from whatever you are doing right now and find a way to do it differently! You know, most of the time, when you start thinking about strange/awkward ways to solve a problem within your current code, the real problem is your current code.
EDIT: given your comments ... The follow-up questions would be: what is the purpose of this classification? How often will it need to take place? You know, will it happen only once (then manual counting might be an option), or after each and every change? For the "manual" thing - IDEs such as PyCharm are pretty good at analyzing Python source code; you can do simple things like "search usages in workspaces" - so the IDE lists all the methods that invoke some method A. But of course, that works only for one level.
The other option I see: write some test code that imports all your methods and then see how far the inspect module helps you. Probably you could really iterate through a complete method body and simply search for matching method names.
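One rough way to do that search is to parse each file's source with the ast module (the file name and the default target name "trial" are placeholders for your actual ones; this only catches direct calls by name):
import ast

def calls_function(source, target="trial"):
    """Return names of the functions in `source` that call `target` directly."""
    tree = ast.parse(source)
    callers = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            for sub in ast.walk(node):
                if (isinstance(sub, ast.Call)
                        and isinstance(sub.func, ast.Name)
                        and sub.func.id == target):
                    callers.append(node.name)
                    break
    return callers

with open("my_module.py") as f:
    print(calls_function(f.read()))   # the "example" category; everything else is "non-example"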
I think I understand the idea of duck typing, and would like to use it more often in my code. However, I am concerned about one potential problem: name collision.
Suppose I want an object to do something. I know the appropriate method, so I simply call it and see what happens. In general, there are three possible outcomes:
The method is not found and an AttributeError exception is raised. This indicates that the object isn't what I think it is. That's fine, since with duck typing I'm either catching such an exception, or I am willing to let the outer scope deal with it (or let the program terminate).
The method is found, it does precisely what I want, and everything is great.
The method is found, but it's not the method that I want; it's a same-name method from an entirely unrelated class. The execution continues, until either inconsistent state is detected later, or, in the worst case, the program silently produces incorrect output.
Now, I can see how good quality names can reduce the chances of outcome #3. But projects are combined, code is reused, libraries are swapped, and it's quite possible that at some point two methods have the same name and are completely unrelated (i.e., they are not intended to substitute for each other in a polymorphism).
One solution I was thinking about is to add a registry of method names. Each registry record would contain:
method name (unique; i.e., only one record per name)
its generalized description (i.e., applicable to any instance it might be called on)
the set of classes which it is intended to be used in
If a method is added to a new class, the class needs to be added to the registry (by hand). At that time, the programmer would presumably notice if the method is not consistent with the meaning already attached to it, and if necessary, use another name.
Whenever a method is called, the program would automatically verify that the name is in the registry and the class of the instance is one of the classes in the record. If not, an exception would be raised.
I understand this is a very heavy approach, but in some cases where precision is critical, I can see it might be useful. Has it been tried (in Python or other dynamically typed languages)? Are there any tools that do something similar? Are there any other approaches worth considering?
Note: I'm not referring to name clashes at the global level, where avoiding namespace pollution would be the right approach. I'm referring to clashes at the method names; these are not affected by namespaces.
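For concreteness, a rough sketch of the registry idea described above might look like this (all names, the record layout, and the decorator-based enforcement are invented for illustration):
REGISTRY = {
    "callback": {"description": "invoked when the operation finishes",
                 "classes": {"Downloader", "Uploader"}},
}

def registered(method):
    def wrapper(self, *args, **kwargs):
        record = REGISTRY.get(method.__name__)
        if record is None or type(self).__name__ not in record["classes"]:
            raise TypeError("%s.%s is not in the registry"
                            % (type(self).__name__, method.__name__))
        return method(self, *args, **kwargs)
    return wrapper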
Well, if this is critical, you probably should not be using duck typing...
In practice, programs are finite systems, and the range of possible types passed into any particular routine does not cause the issues you are worrying about (most often there's only ever one type passed in).
But if you want to address this issue anyway, Python provides ABCs (abstract base classes). These allow you to associate a "type" with any set of methods and so would work something like the registry you suggest (you can either inherit from an ABC in the normal way, or simply "register" with it).
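A small sketch of that route, using Python 3 syntax (the class and method names here are invented):
from abc import ABC, abstractmethod

class SupportsCallback(ABC):
    @abstractmethod
    def callback(self):
        """Called when the operation completes."""

class LegacyHandler:
    def callback(self):
        return "handled"

# "register" an unrelated class as a virtual subclass - no inheritance required
SupportsCallback.register(LegacyHandler)

assert isinstance(LegacyHandler(), SupportsCallback)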
You can then check for these types manually or automate the checking with decorators from pytyp.
But, despite being the author of pytyp, and finding these questions interesting, I personally do not find such an approach useful. In practice, what you are worrying about simply does not happen (if you want to worry about something, focus on the lack of documentation from types when using higher order functions!).
PS note - ABCs are purely metadata. They do not enforce anything. Also, checking with pytyp decorators is horrendously inefficient - you really want to do this only where it is critical.
If you are following good programming practice, or let me rather say if your code is Pythonic, then chances are you would seldom face such issues. Refer to the FAQ What are the “best practices” for using import in a module?.
It is generally not advised to clutter the namespace, and the only time there could be a conflict is if you are trying to reuse Python reserved names or standard library names, or if a name conflicts with a module name. But if you encounter such a conflict, then there is a serious issue with the code. For example:
Why would someone name a variable as list or define a function called len?
Why would someone name a variable difflib when s/he is intending to import it in the current namespace?
To address your problem, look at abstract base classes. They're a Pythonic way to deal with this issue; you can define common behavior in a base class, and even define ways to determine if a particular object is a "virtual baseclass" of the abstract base class. This somewhat mimics the registry you're describing, without requiring all classes know about the registry beforehand.
In practice, though, this issue doesn't crop up as often as you might expect. Objects with an __iter__ method or a __str__ method are simply broken if the methods don't work the way you expect. Likewise, if you say an argument to your function requires a .callback() method defined on it, people are going to do the right thing.
If you are worried that the lack of static type checking will let some bugs get through, the answer isn't to bolt on type checking, it is to write tests.
In the presence of unit tests, a type checking system becomes largely redundant as a means of catching bugs. While it is true that a type checking system can catch some bugs, it will only catch a small subset of potential bugs. To catch the rest you'll need tests. Those unit tests will necessarily catch most of the type errors that a type checking system would have caught, as well as bugs that the type checking system cannot catch.
Let's suppose I have several functions for an RPG I'm working on...
def name_of_function():
    action
and wanted to use the axe class (see below) in each function without having to rewrite the class each time. How would I create the class as a global class? I'm not sure if I'm using the correct terminology or not on that, but please help. This has always held me back from creating text-based RPG games. An example of a global class would be awesome!
class axe:
    attack = 5
    weight = 6
    description = "A lightweight battle axe."
    level_required = 1
    price = 10
You can't create anything that's truly global in Python - that is, something that's magically available to any module no matter what. But it hardly matters once you understand how modules and importing work.
Typically, you create classes and organize them into modules. Then you import them into whatever module needs them, which adds the class to the module's symbol table.
So for instance, you might create a module called weapons.py, and create a WeaponBase class in it, and then Axe and Broadsword classes derived from WeaponBase. Then, in any module that needed to use weapons, you'd put
import weapons
at the top of the file. Once you do this, weapons.Axe returns the Axe class, weapons.Broadsword returns the Broadsword class, and so on. You could also use:
from weapons import Axe, Broadsword
which adds Axe and Broadsword to the module's symbol table, allowing code to do pretty much exactly what you are saying you want it to do.
You can also use
from weapons import *
but this generally is not a great idea for two reasons. First, it imports everything in the module whether you're going to use it or not - WeaponBase, for instance. Second, you run into all kinds of confusing problems if there's a function in weapons that's got the same name as a function in the importing module.
There are a lot of subtleties in the import system. You have to be careful to make sure that modules don't try to import each other, for instance. And eventually your project gets large enough that you don't want to put all of its modules in the same directory, and you'll have to learn about things like __init__.py. But you can worry about that down the road.
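Putting the pieces above together, weapons.py might look roughly like this (the Broadsword numbers and the exact attributes on WeaponBase are invented for illustration):
# weapons.py
class WeaponBase(object):
    attack = 0
    weight = 0
    description = ""
    level_required = 1
    price = 0

class Axe(WeaponBase):
    attack = 5
    weight = 6
    description = "A lightweight battle axe."
    price = 10

class Broadsword(WeaponBase):
    attack = 9
    weight = 12
    description = "A heavy two-handed sword."
    price = 30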
i beg to differ with the view that you can't create something truly global in python. in fact, it is easy. in Python 3.1, it looks like this:
def get_builtins():
    """Due to the way Python works, ``__builtins__`` can strangely be either a module or a dictionary,
    depending on whether the file is executed directly or as an import. I couldn’t care less about this
    detail, so here is a method that simply returns the namespace as a dictionary."""
    return getattr( __builtins__, '__dict__', __builtins__ )
like a bunch of other things, builtins are one point where Py3 differs in details from the way it used to work in Py2. read the "What's New in Python X.X" documents on python.org for details. i have no idea what the reason for the convention mentioned above might be; i just want to ignore that stuff. i think the above code should work in Py2 as well.
so the point here is there is a __builtins__ thingie which holds a lot of stuff that comes as, well, built-into Python. all the sum, max, range stuff you've come to love. well, almost everything. but you don't need the details, really. the simplest thing you could do is to say
G = get_builtins()
G[ 'G' ] = G
G[ 'axe' ] = axe
at a point in your code that is always guaranteed to execute. G stands in for the globally available namespace, and since i've registered G itself within G, G now magically transcends its existence into the background of every module. this means you should use it with care. where naming collisions occur between whatever is held in G and a module's namespace, the module's namespace should win (as it gets inspected first). also, be prepared for everybody to jump on you when you tell them you're POLLUTING THE GLOBAL NAMESPACE dammit. i'm really surprised no one has complained about that yet, here.
well, those people would be quite right, in a way. personally, however, this is one of my main application composition techniques: what you do is you take a step away from the all-purpose module (which shouldn't do such a thing) towards a fine-tuned application-specific namespace. your modules are bound not to work outside that namespace, but then they're not supposed to, either. i actually started this style of programming as an experimental rebellion against (1) established views, hah!, and (2) the desperation that befalls me whenever i want to accomplish something less than trivial using Python's import statement. these days, i only use import for standard library stuff and regularly-installed modules; for my own stuff, i use a homegrown system. what can i say, it works!
ah yes, two more points: do yourself a favor, if you like this solution, and write yourself a publish() method or the like that ensures you never publish a name that has already been taken. in most cases, you do not want that.
lastly, let me second the first commenter: i have been programming in exactly the style you show above, coz that's what you find in the textbook examples (most of the time using cars, not axes to be sure). for a rather substantial number of reasons, i've pretty much given up on that.
consider this: JSON defines but seven kinds of data: null, true, false, numbers, texts, lists, dictionaries, that's it. i claim you can model any other useful datatype with those.
there is still a lot of justification for things like sets, bags, ordered dictionaries and so on. the claim here is not that it is always convenient or appropriate to fall back on a pure, directly JSON-compatible form; the claim is only that it is possible to simulate. right now, i'm implementing a sparse list for use in a messaging system, and that data type i do implement in classical OOP. that's what it's good for.
but i never define classes that go beyond these generic datatypes. rather, i write libraries that take generic datatypes and provide the functionality you need. all of my business data (in your case probably representations of players, scenes, implements and so on) goes into generic data containers (as a rule, mostly dicts). i know there are open questions with this way of architecting things, but programming has become ever so much easier, so much more fluent since i broke thru the BS that part of OOP propaganda is (apart from the really useful and nice things that another part of OOP is).
oh yes, and did i mention that as long as you keep your business data in JSON-compatible objects you can always write them to and resurrect them from the disk? or send them over the wire so you can interact with remote players? and how incredibly twisted the serialization business can become in classical OOP if you want to do that (read this for the tip of the iceberg)? most of the technical detail you have to know in this field is completely meaningless for the rest of your life.
You can add (or change existing) Python built-in functions and classes by doing either of the following, at least in Py 2.x. Afterwards, whatever you add will be available to all code by default, although not permanently.
Disclaimer: Doing this sort of thing can be dangerous due to possible name clashes and problematic due to the fact that it's extremely non-explicit. But, hey, as they say, we're all adults here, right?
class MyClass: pass

# one way
setattr(__builtins__, 'MyClass', MyClass)

# another way
import __builtin__
__builtin__.MyClass = MyClass
Another way is to create a singleton:
class Singleton(type):
    def __init__(cls, name, bases, dct):
        super(Singleton, cls).__init__(name, bases, dct)
        cls.instance = None

    def __call__(cls, *args, **kwargs):
        # create the single shared instance on first call, then keep returning it
        if cls.instance is None:
            cls.instance = super(Singleton, cls).__call__(*args, **kwargs)
        return cls.instance

class GlobalClass(object):
    __metaclass__ = Singleton

    def __init__(self):
        print("I am global; whenever attributes are added through one reference, every other reference sees them as well.")
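A quick usage sketch of the corrected singleton above:
a = GlobalClass()
b = GlobalClass()
assert a is b        # both names refer to the single shared instance
a.value = 42
print(b.value)       # 42 - there is only one underlying object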