Hooking str.__getitem__ in Python

Is there a way of hooking str.__getitem__?
Example:
I'd like to be able to do:
>>> "this is a string"[[1,3,4]]
'hs '
passing a list to [] and getting the items at those indices.
A more realistic example:
class STR(str):
    pass

class INT(int):
    pass
It's easy to make STR("a string")[1] or STR("a string")[INT(1)] return an STR instance.
I'd like to be able to make "a non STR string"[INT(1)] return an STR instance.

Why hook an often-used internal function when you can
def get_characters(s, l):
    return "".join(s[i] for i in l)

>>> get_characters("this is a string", [1, 3, 4])
'hs '

Methods on objects defined in C cannot be monkeypatched. The best you can do is to use an external function to complete the task.
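If you control where the strings are created, a subclass gets you most of the way there. Here is a minimal sketch (my own illustration, not from the answer above): an S type whose __getitem__ accepts a list of indices and always returns S instances. It does not change what plain string literals do; you still have to construct an S explicitly.
class S(str):
    def __getitem__(self, key):
        if isinstance(key, (list, tuple)):
            # pick out the characters at the given indices
            return S("".join(str.__getitem__(self, i) for i in key))
        return S(str.__getitem__(self, key))

>>> S("this is a string")[[1, 3, 4]]
'hs '
>>> type(S("abc")[0])
<class '__main__.S'>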

What to do with the error [<__main__.Student object at 0x000001E84D968090>, <__main__.Student object at 0x000001E84D95E750>] [duplicate]

Well, this interactive Python console snippet will tell everything:
>>> class Test:
...     def __str__(self):
...         return 'asd'
...
>>> t = Test()
>>> print(t)
asd
>>> l = [Test(), Test(), Test()]
>>> print(l)
[<__main__.Test instance at 0x00CBC1E8>, <__main__.Test instance at 0x00CBC260>,
<__main__.Test instance at 0x00CBC238>]
Basically I would like to get three 'asd' strings printed when I print the list. I have also tried pprint, but it gives the same result.
Try:
class Test:
    def __repr__(self):
        return 'asd'
And read the documentation on object.__repr__.
The suggestion in other answers to implement __repr__ is definitely one possibility. If that's infeasible for whatever reason (existing type, __repr__ needed for reasons other than aesthetics, etc.), then just do
print([str(x) for x in l])
or, as some are sure to suggest, list(map(str, l)) (just a bit more compact).
You need to make a __repr__ method:
>>> class Test:
...     def __str__(self):
...         return 'asd'
...     def __repr__(self):
...         return 'zxcv'
...
>>> [Test(), Test()]
[zxcv, zxcv]
>>> print _
[zxcv, zxcv]
Refer to the docs:
object.__repr__(self)
Called by the repr() built-in function and by string conversions (reverse quotes) to compute the “official” string representation of an object. If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment). If this is not possible, a string of the form <...some useful description...> should be returned. The return value must be a string object. If a class defines __repr__() but not __str__(), then __repr__() is also used when an “informal” string representation of instances of that class is required.
This is typically used for debugging, so it is important that the representation is information-rich and unambiguous.

Functions, methods, and how many arguments do I have to give them?

Why do the following lines give me the same result?
str.upper('hello')
and
'hello'.upper()
I tried to do the same with list.append but got a TypeError.
list.append([1])
Is the str type in Python overloaded? How can this be achieved by writing a class/function? I would appreciate an example.
list.append takes two arguments - the list to modify and the element to append. So you need to do it like this:
ls = [1]
list.append(ls, 2)
which is equivalent to the much more popular:
ls.append(2)
str.upper and list.append are both functions.
str.upper takes one argument.
>>> str.upper('test')
'TEST'
list.append takes two arguments.
>>> my_list = []
>>> list.append(my_list, 1)
>>> my_list
[1]
str.upper and list.append (like other functions) are also non-data descriptors with a __get__ method, which in this context has two implications:
When you access the function through the class via the dot notation (str.upper, list.append), the function's __get__ method (i.e. str.upper.__get__ and list.append.__get__) is called, but it just returns the function itself.
When you access the function through an instance (my_string.upper, my_list.append), the function's __get__ method is called and it returns a new callable acting like the original function, but with whatever was "in front of the dot" automatically passed as the first argument.
That's why you need to pass 1 - 1 = 0 arguments when calling my_string.upper() and 2 - 1 = 1 argument when calling my_list.append(1).
>>> 'my_string'.upper()
'MY_STRING'
>>>
>>> my_list = []
>>> my_list.append(1)
>>> my_list
[1]
You could even get these modified callables (methods) by explicitly calling __get__ and passing the argument to be bound (what has been before the dot) as its argument.
>>> my_string = 'my_string'
>>> upper_maker = str.upper.__get__(my_string)
>>> upper_maker()
'MY_STRING'
>>>
>>> my_list = []
>>> appender = list.append.__get__(my_list)
>>> appender(1)
>>> my_list
[1]
Finally, here's a short example demonstrating how descriptor instances can detect whether they are being accessed via their owner-class or via an instance.
class Descriptor:
    def __get__(self, instance, owner_class):
        if instance is None:
            print('accessed through class')
            # list.append.__get__ would return list.append here
        else:
            print('accessed through instance')
            # list.append.__get__ would build a new callable here
            # that takes one argument x and that internally calls
            # list.append(instance, x)

class Class:
    attribute = Descriptor()

Class.attribute     # prints 'accessed through class'
instance = Class()
instance.attribute  # prints 'accessed through instance'
Quoting Dave Kirby's answer from Relationship between string module and str:
There is some overlap between the string module and the str type,
mainly for historical reasons. In early versions of Python str objects
did not have methods, so all string manipulation was done with
functions from the string module. When methods were added to the str
type (in Python 1.5?) the functions were left in the string module for
compatibility, but now just forward to the equivalent str method.
However the string module also contains constants and functions that
are not methods on str, such as formatting, character translation etc.
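For context (my addition, not part of the quoted answer): in Python 3 the forwarding functions are gone from the string module entirely; what remains are constants and a few helpers, while case conversion and the like live on str itself.
import string

print('hello'.upper())                  # 'HELLO' - a str method
print(string.ascii_lowercase)           # 'abcdefghijklmnopqrstuvwxyz'
print(string.capwords('hello world'))   # 'Hello World'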
There is nothing at all magical going on with str (except that we have a nice syntactic shortcut to creating one using ""). You can write a class that behaves like str and list to see more clearly what is happening here.
class MyClass():
    def __init__(self, arg):
        self.val = str(arg)

    def do_thing(self):
        self.val = "asdf"

    def do_thing_with_arg(self, arg):
        self.val = "asdf " + str(arg)

    def __repr__(self):
        return self.val
my_thing = MyClass("qwerty")
# this is like 'hello'.upper()
my_thing.do_thing()
print(my_thing)
# it prints 'asdf'
my_thing = MyClass("qwerty")
# this is like str.upper('hello')
MyClass.do_thing(my_thing)
print(my_thing)
# it prints 'asdf'
my_thing = MyClass("qwerty")
# this is like my_list.append('zxcv')
my_thing.do_thing_with_arg('zxcv')
print(my_thing)
# it prints 'asdf zxcv'
my_thing = MyClass("qwerty")
# this is like list.append(my_list, 'zxcv')
MyClass.do_thing_with_arg(my_thing, 'zxcv')
print(my_thing)
# it prints 'asdf zxcv'
The short version is, you're invoking what looks like an "instance method" on a class, but you are supplying the instance ('self') yourself as the first argument to the function call.

Is there a Python 'shortcut' to define a class variable equal to a string version of its own name?

This is a bit of a silly thing, but I want to know if there is a concise way in Python to define class variables that contain string representations of their own names. For example, one can define:
class foo(object):
    bar = 'bar'
    baz = 'baz'
    baf = 'baf'
Probably a more concise way to write it in terms of lines consumed is:
class foo(object):
    bar, baz, baf = 'bar', 'baz', 'baf'
Even there, though, I still have to type each identifier twice, once on each side of the assignment, and the opportunity for typos is rife.
What I want is something like what sympy provides in its var method:
sympy.var('a,b,c')
The above injects into the namespace the variables a, b, and c, defined as the corresponding sympy symbolic variables.
Is there something comparable that would do this for plain strings?
class foo(object):
    [nifty thing]('bar', 'baz', 'baf')
EDIT: To note, I want to be able to access these as separate identifiers in code that uses foo:
>>> f = foo(); print(f.bar)
bar
ADDENDUM: Given the interest in the question, I thought I'd provide more context on why I want to do this. I have two use-cases at present: (1) typecodes for a set of custom exceptions (each Exception subclass has a distinct typecode set); and (2) lightweight enum. My desired feature set is:
Only having to type the typecode / enum name (or value) once in the source definition. class foo(object): bar = 'bar' works fine but means I have to type it out twice in-source, which gets annoying for longer names and exposes a typo risk.
Valid typecodes / enum values exposed for IDE autocomplete.
Values stored internally as comprehensible strings:
For the Exception subclasses, I want to be able to define myError.__str__ as just something like return self.typecode + ": " + self.message + " (" + self.source + ")", without having to do a whole lot of dict-fu to back-reference an int value of self.typecode to a comprehensible and meaningful string.
For the enums, I want to just be able to obtain widget as output from e = myEnum.widget; print(e), again without a lot of dict-fu.
I recognize this will increase overhead. My application is not speed-sensitive (GUI-based tool for driving a separate program), so I don't think this will matter at all.
Straightforward membership testing, by also including (say) a frozenset of all the typecode / enum string values as a class attribute, e.g. myError.typecodes or myEnum.E. This addresses potential problems from accidental (or intentional... but why?!) use of an invalid typecode / enum string via simple sanity checks like if enumVal not in myEnum.E: raise(ValueError('Invalid enum value: ' + str(enumVal))).
Ability to import individual enum / exception subclasses via, say, from errmodule import squirrelerror, to avoid cluttering the namespace of the usage environment with non-relevant exception subclasses. I believe this prohibits any solutions requiring post-twiddling on the module level like what Sinux proposed.
For the enum use case, I would rather avoid introducing an additional package dependency since I don't (think I) care about any extra functionality available in the official enum class. In any event, it still wouldn't resolve #1.
I've already figured out implementation I'm satisfied with for all of the above but #1. My interest in a solution to #1 (without breaking the others) is partly a desire to typo-proof entry of the typecode / enum values into source, and partly plain ol' laziness. (Says the guy who just typed up a gigantic SO question on the topic.)
I recommend using collections.namedtuple:
Example:
>>> from collections import namedtuple as nifty_thing
>>> Data = nifty_thing("Data", ["foo", "bar", "baz"])
>>> data = Data(foo=1, bar=2, baz=3)
>>> data.foo
1
>>> data.bar
2
>>> data.baz
3
Side Note: If you are on Python 3.x I'd recommend Enum, as per @user2357112's comment. This is the standardized approach going forward for Python 3+.
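As a sketch of how that can satisfy the "type the name only once" requirement (my illustration; enum is in the standard library from Python 3.4, so it adds no external dependency): auto() asks _generate_next_value_ for a value, and returning the member name makes the value equal the name.
from enum import Enum, auto

class AutoName(Enum):
    def _generate_next_value_(name, start, count, last_values):
        # auto() calls this hook; returning the name makes value == name
        return name

class Typecode(AutoName):
    bar = auto()
    baz = auto()
    baf = auto()

>>> Typecode.bar.value
'bar'
>>> 'baz' in Typecode.__members__
True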
Update: Okay so if I understand the OP's exact requirement(s) here I think the only way to do this (and presumably sympy does this too) is to inject the names/variables into the globals() or locals() namespaces. Example:
#!/usr/bin/env python
def nifty_thing(*names):
    d = globals()
    for name in names:
        d[name] = None

nifty_thing("foo", "bar", "baz")
print foo, bar, baz
Output:
$ python foo.py
None None None
NB: I don't really recommend this! :)
Update #2: The other example you showed in your question is implemented like this:
#!/usr/bin/env python
import sys

def nifty_thing(*names):
    frame = sys._getframe(1)
    locals = frame.f_locals
    for name in names:
        locals[name] = None

class foo(object):
    nifty_thing("foo", "bar", "baz")

f = foo()
print f.foo, f.bar, f.baz
Output:
$ python foo.py
None None None
NB: This is inspired by zope.interface.implements().
current_list = ['bar', 'baz', 'baf']

class foo(object):
    """to be added"""

for i in current_list:
    setattr(foo, i, i)
then run this:
>>> f = foo()
>>> print(f.bar)
bar
>>> print(f.baz)
baz
This doesn't work exactly like what you asked for, but it seems like it should do the job:
class AutoNamespace(object):
    def __init__(self, names):
        try:
            # Support space-separated name strings
            names = names.split()
        except AttributeError:
            pass
        for name in names:
            setattr(self, name, name)
Demo:
>>> x = AutoNamespace('a b c')
>>> x.a
'a'
If you want to do what SymPy does with var, you can, but I would strongly recommend against it. That said, here's a function based on the source code of sympy.var:
def var(names):
    from inspect import currentframe
    frame = currentframe().f_back
    try:
        names = names.split()
    except AttributeError:
        pass
    for name in names:
        frame.f_globals[name] = name
Demo:
>>> var('foo bar baz')
>>> bar
'bar'
It'll always create global variables, even if you call it from inside a function or class. inspect is used to get at the caller's globals, whereas globals() would get var's own globals.
How about defining __getitem__ so that whatever name you look up is returned as its own string:
class foo(object):
    def __getitem__(self, item):
        return item

foo = foo()
print foo['test']
Here's an extension of bman's idea. This has its advantages and disadvantages, but at least it does work with some autocompleters.
class FooMeta(type):
    def __getattr__(self, attr):
        return attr

    def __dir__(self):
        return ['bar', 'baz', 'baf']

class foo:
    __metaclass__ = FooMeta
This allows access like foo.xxx → 'xxx' for all xxx, but also guides autocomplete through __dir__.
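For what it's worth, here is a sketch of the same idea in Python 3 syntax (my addition), where the metaclass is passed in the class header rather than via __metaclass__:
class FooMeta(type):
    def __getattr__(cls, attr):
        return attr

    def __dir__(cls):
        return ['bar', 'baz', 'baf']

class foo(metaclass=FooMeta):
    pass

>>> foo.bar
'bar'
>>> dir(foo)
['baf', 'bar', 'baz']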
Figured out what I was looking for:
>>> class tester:
...     E = frozenset(['this', 'that', 'the', 'other'])
...     for s in E:
...         exec(str(s) + "='" + str(s) + "'")  # <--- THIS
...
>>> tester()
<__main__.tester instance at 0x03018BE8>
>>> t = tester()
>>> t.this
'this'
>>> t.that in tester.E
True
Only have to define the element strings once, and I'm pretty sure it will work for all of my requirements listed in the question. In actual implementation, I plan to encapsulate the str(s) + "='" + str(s) + "'" in a helper function, so that I can just call exec(helper(s)) in the for loop. (I'm pretty sure that the exec has to be placed in the body of the class, not in the helper function, or else the new variables would be injected into the (transitory) scope of the helper function, not that of the class.)
EDIT: Upon detailed testing, this DOES NOT WORK -- the use of exec prevents the IDE's introspection from seeing that the created variables exist.
I think you can achieve a rather beautiful solution using metaclasses, but I'm not fluent enough with those to present that as an answer. I do have an option that seems to work rather nicely, though:
def new_enum(name, *class_members):
    """Builds a class <name> with <class_members> having the name as value."""
    return type(name, (object, ), {val: val for val in class_members})

Foo = new_enum('Foo', 'bar', 'baz', 'baf')
This should recreate the class you've given as an example, and if you want you can change the inheritance by changing the second parameter of the call to type(name, bases, dict).
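For illustration (my addition), the generated class behaves like the hand-written one:
>>> Foo.bar
'bar'
>>> Foo.baf
'baf'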

Python Use User Defined String Class

I know how to override the string class with:
class UserString:
    def __str__(self):
        return 'Overridden String'

if __name__ == '__main__':
    print UserString()
But how can I use this class instead of the built-in str class, without explicitly constructing a UserString? To be clear,
I want this:
>>> print "Boo boo!"
Overridden String
It is not possible. You have not overridden the string class.
You cannot override classes; you can override methods. What you have done is define a class and override its __str__() method.
But you can do something like this...
def overriden_print(x):
    print "Overriden in the past!"

# Note: in a module a __future__ import must be the first statement, so this
# ordering only works in an interactive session.
from __future__ import print_function  # import after the definition of overriden_print
print = overriden_print

print("Hello")
Output:
Overriden in the past!
It's impossible to do what you want without hacking the Python interpreter itself... after all, str is a built-in type, and the interpreter, when it sees string literals, will always create built-in strings.
However... it is possible, using delegation, to do something like this. This is slightly modified from another stackoverflow recipe (which sadly, I did not include a link to in my code...), so if this is your code, please feel free to claim it :)
def returnthisclassfrom(specials):
    specialnames = ['__%s__' % s for s in specials.split()]
    def wrapit(cls, method):
        return lambda *a: cls(method(*a))
    def dowrap(cls):
        for n in specialnames:
            method = getattr(cls, n)
            setattr(cls, n, wrapit(cls, method))
        return cls
    return dowrap
Then you use it like this:
@returnthisclassfrom('add mul mod')
class UserString(str):
    pass
In [11]: first = UserString('first')
In [12]: print first
first
In [13]: type(first)
Out[13]: __main__.UserString
In [14]: second = first + 'second'
In [15]: print second
firstsecond
In [16]: type(second)
Out[16]: __main__.UserString
One downside of this is that str has no radd support, so 'string1' + UserString('string2') will give a string, whereas UserString('string1') + 'string2' gives a UserString. Not sure if there is a way around that.
Maybe not helpful, but hopefully it puts you on the right track.
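One possible way around the missing __radd__ mentioned above (my sketch, not from the original answer): because a subclass's reflected method takes priority over str.__add__ when the left operand is a plain str, defining __radd__ on the subclass keeps mixed additions in the subclass type.
class UserString(str):
    def __radd__(self, other):
        # called for plain_str + UserString, because the right operand's
        # type is a str subclass that defines the reflected method
        return type(self)(str(other) + str(self))

>>> type('string1' + UserString('string2'))
<class '__main__.UserString'>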

Are object literals Pythonic?

JavaScript has object literals, e.g.
var p = {
    name: "John Smith",
    age: 23
}
and .NET has anonymous types, e.g.
var p = new { Name = "John Smith", Age = 23}; // C#
Something similar can be emulated in Python by (ab)using named arguments:
class literal(object):
    def __init__(self, **kwargs):
        for (k, v) in kwargs.iteritems():
            self.__setattr__(k, v)
    def __repr__(self):
        return 'literal(%s)' % ', '.join('%s = %r' % i for i in sorted(self.__dict__.iteritems()))
    def __str__(self):
        return repr(self)
Usage:
p = literal(name = "John Smith", age = 23)
print p # prints: literal(age = 23, name = 'John Smith')
print p.name # prints: John Smith
But is this kind of code considered to be Pythonic?
Why not just use a dictionary?
p = {'name': 'John Smith', 'age': 23}
print p
print p['name']
print p['age']
Have you considered using a named tuple?
Using your dict notation
>>> from collections import namedtuple
>>> L = namedtuple('literal', 'name age')(**{'name': 'John Smith', 'age': 23})
or keyword arguments
>>> L = namedtuple('literal', 'name age')(name='John Smith', age=23)
>>> L
literal(name='John Smith', age=23)
>>> L.name
'John Smith'
>>> L.age
23
It is possible to wrap this behaviour into a function easily enough
def literal(**kw):
    return namedtuple('literal', kw)(**kw)
the lambda equivalent would be
literal = lambda **kw: namedtuple('literal', kw)(**kw)
but personally I think it's silly giving names to "anonymous" functions
From ActiveState:
class Bunch:
    def __init__(self, **kwds):
        self.__dict__.update(kwds)

# that's it! Now, you can create a Bunch
# whenever you want to group a few variables:
point = Bunch(datum=y, squared=y*y, coord=x)

# and of course you can read/write the named
# attributes you just created, add others, del
# some of them, etc, etc:
if point.squared > threshold:
    point.isok = 1
I don't see anything wrong with creating "anonymous" classes/instances. It's often very convenient to create one with a simple function call in one line of code. I personally use something like this:
def make_class(*args, **attributes):
    """With fixed inability of using 'name' and 'bases' attributes ;)"""
    if len(args) == 2:
        name, bases = args
    elif len(args) == 1:
        name, bases = args[0], (object, )
    elif not args:
        name, bases = "AnonymousClass", (object, )
    return type(name, bases, attributes)

obj = make_class(something="some value")()
print obj.something
For creating dummy objects it works just fine. Namedtuple is ok, but is immutable, which can be inconvenient at times. And dictionary is... well, a dictionary, but there are situations when you have to pass something with __getattr__ defined, instead of __getitem__.
I don't know whether it's pythonic or not, but it sometimes speeds things up and for me it's good enough reason to use it (sometimes).
I'd say that the solution you implemented looks pretty Pythonic; that being said, types.SimpleNamespace (documented here) already wraps this functionality:
from types import SimpleNamespace
p = SimpleNamespace(name = "John Smith", age = 23)
print(p)
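For completeness (my addition, not part of the answer above), attributes on a SimpleNamespace can be read and reassigned freely, unlike the fields of a namedtuple:
print(p.name)  # John Smith
p.age += 1     # SimpleNamespace instances are mutable
print(p.age)   # 24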
From the Python IAQ:
As of Python 2.3 you can use the syntax
dict(a=1, b=2, c=3, dee=4)
which is good enough as far as I'm concerned. Before Python 2.3 I used the one-line function
def Dict(**dict): return dict
I think object literals make sense in JavaScript for two reasons:
In JavaScript, objects are the only way to create a “thing” with string-indexed properties. In Python, as noted in another answer, the dictionary type does that.
JavaScript‘s object system is prototype-based. There’s no such thing as a class in JavaScript (although it‘s coming in a future version) — objects have prototype objects instead of classes. Thus it’s natural to create an object “from nothing”, via a literal, because all objects only require the built-in root object as a prototype. In Python, every object has a class — you’re sort of expected to use objects for things where you’d have multiple instances, rather than just for one-offs.
Thus no, object literals aren’t Pythonic, but they are JavaScripthonic.
A simple dictionary should be enough for most cases.
If you are looking for a similar API to the one you indicated for the literal case, you can still use dictionaries and simply override the special __getattr__ function:
class CustomDict(dict):
    def __getattr__(self, name):
        return self[name]

p = CustomDict(user='James', location='Earth')
print p.user
print p.location
Note: Keep in mind though that contrary to namedtuples, fields are not validated and you are in charge of making sure your arguments are sane. Arguments such as p['def'] = 'something' are tolerated inside a dictionary but you will not be able to access them via p.def.
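One refinement worth noting (my addition, not part of the answer above): the __getattr__ shown raises KeyError for missing keys, which confuses hasattr() and getattr() with a default. Translating it to AttributeError restores the normal attribute protocol:
class CustomDict(dict):
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            # attribute lookup is expected to raise AttributeError so that
            # hasattr() and getattr(obj, name, default) behave normally
            raise AttributeError(name)

>>> p = CustomDict(user='James')
>>> getattr(p, 'location', 'unknown')
'unknown'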
