Remove characters from a string - Python

I need a function remove() that removes characters from a string.
This was my first approach:
    def remove(self, string, index):
        return string[0:index] + string[index + 1:]
    def remove_indexes(self, string, indexes):
        for index in indexes:
            string = self.remove(string, index)
        return string
I pass the indexes I want to remove as a list, but once I remove a character, all the later indexes shift.
Is there a more Pythonic way to do this? Ideally it would work like this:
    "hello".remove([1, 2])

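I don't know about a "Pythonic" way, but you can achieve this by removing from the highest index down, so that earlier removals don't shift the positions still to be removed. If you can ensure that the indexes passed to remove_indexes are always sorted in ascending order, you can do this:
    def remove_indexes(self, string, indexes):
        # reversed() returns an iterator; list.reverse() works in place and returns None
        for index in reversed(indexes):
            string = self.remove(string, index)
        return string
If you can't ensure that, sort them first:
    def remove_indexes(self, string, indexes):
        # sorted() returns a new list; list.sort() works in place and returns None
        for index in sorted(indexes, reverse=True):
            string = self.remove(string, index)
        return string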

I think the code below will work for you. It skips the indexes you want to remove and returns a string joined from the remaining characters.
    def remove_indexes(string, indexes):
        return "".join([string[i] for i in range(len(string)) if i not in indexes])
    remove_indexes("hello", [1, 2])  # returns "hlo"
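If the list of indexes is long, the `i not in indexes` test scans that list once per character. A minimal sketch of the same idea with a set for the lookup follows; the name remove_indexes_fast is only illustrative and is not from the answer above.

```python
def remove_indexes_fast(string, indexes):
    """Return `string` without the characters at the given positions."""
    drop = set(indexes)  # set membership is O(1) on average
    return "".join(ch for i, ch in enumerate(string) if i not in drop)


print(remove_indexes_fast("hello", [1, 2]))  # prints "hlo"
```

Because the string is rebuilt in a single pass, the order of the indexes and any duplicates among them do not matter.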

A more robust way may be to use regular expressions. The danger with your indexing approach is that the strings you pass in may vary in length, so fixed indexes can end up removing parts of the string you did not intend to remove.
Let's say you wanted to replace all numbers in a string:
    import re
    s = "This is a string with s0m3 numb3rs in it1 !"
    num_reg = re.compile(r"\d+")  # matches runs of one or more digits
    re.sub(num_reg, "**", s)  # substitute each run of digits in `s` with "**"; use "" to remove them
    # 'This is a string with s**m** numb**rs in it** !'
This way, you define a general expression that may appear regularly in a string (a "regular expression" or regex), and you can quickly and reliably replace all instances of that regex in the string.

You cannot add attributes to built-in types; you will get an error like this:
    TypeError: can't set attributes of built-in/extension type 'str'
You can create a class that inherits from str and add the method there:
    class String(str):
        def remove(self, index):
            if isinstance(index, list):
                # remove the biggest index first so the others stay valid
                for i in sorted(index, reverse=True):
                    self = self.remove(i)
                return self
            return String(self[0:index] + self[index + 1:])
    s = String("hello")
    print(s.remove([0, 1]))
If you want the change to happen in place, you need to create a new type, for example:
    class String:
        def __init__(self, value):
            self._str = value
        def __getattr__(self, item):
            """ delegate to str"""
            return getattr(self._str, item)
        def __getitem__(self, item):
            """ support slicing"""
            return String(self._str[item])
        def remove(self, indexes):
            indexes = indexes if isinstance(indexes, list) else [indexes]
            # remove the biggest index first so the others stay valid
            for i in sorted(indexes, reverse=True):
                self._str = self._str[0:i] + self._str[i + 1:]
            # change in place should return None
            return None
        def __str__(self):
            return str(self._str)
        def __repr__(self):
            return repr(self._str)
    s = String("hello")
    s.remove([0, 1])
    print(s.upper())  # delegate to str class
    print(s[:1])  # support slicing
    print(list(x for x in s))  # it's iterable
But this is still missing other magic methods needed to act like a real str, such as __add__, __mul__, and so on.
If you want a class like str but with a remove method that changes the instance itself, you need to create your own mutable type. str is an immutable built-in type, and self = self.remove(i) does not change the caller's variable: it only rebinds the local name self to another object, while the reference s still points to the object created by String("hello").

Related

Is there any way to print a list of objects? [duplicate]

Situation:
I'm new to python and currently trying to learn the ropes, I've attempted creating a linked list class to assist in getting a better understanding of the language and its structures. I know that the __repr__ function is basically supposed to return the same thing as __str__ but I'm unsure on what the actual difference is.
Here's my class so far:
    class LinkedList:
        class Node:
            def __init__(self, val, prior=None, next=None):
                self.val = val
                self.prior = prior
                self.next = next
        def __init__(self):
            self.head = LinkedList.Node(None)
            self.head.prior = self.head.next = self.head
            self.length = 0
        def __str__(self):
            """Implements `str(self)`. Returns '[]' if the list is empty, else
            returns `str(x)` for all values `x` in this list, separated by commas
            and enclosed by square brackets. E.g., for a list containing values
            1, 2 and 3, returns '[1, 2, 3]'."""
            if len(self) == 0:
                return '[]'
            else:
                return '[' + ', '.join(str(x) for x in self) + ']'
        def __repr__(self):
            """Supports REPL inspection. (Same behavior as `str`.)"""
            return '[' + ', '.join(str(x) for x in self) + ']'
When I test this class with the code below, I get an error saying, in effect, that the empty-list string '[]' isn't being returned by repr. How could I edit this method's body to fix the issue? I've also tried return str(self), and I'm not sure why that doesn't work either.
from unittest import TestCase
tc = TestCase()
lst = LinkedList()
tc.assertEqual('[]', str(lst))
tc.assertEqual('[]', repr(lst))
lst.append(1)
tc.assertEqual('[1]', str(lst))
tc.assertEqual('[1]', repr(lst))
The __repr__ method returns a string representation of a Python object that, ideally, can be evaluated by the Python interpreter to create an equal object. So if you had a list:
    x = ['foo', 'bar']
Its __repr__ string would be:
    x_str = repr(x)
    print(x_str)
which prints:
    ['foo', 'bar']
And you could do:
    x2 = eval(x_str)
    print(type(x2))
which prints:
    <class 'list'>
It's a way to get a string representation of a Python object that can be converted back into a new instance of said object.
Basically the difference between __str__ and __repr__ is that the former returns a string representation of the object meant to be read by a person, and the latter returns an unambiguous representation meant for developers and, where possible, parseable by the Python interpreter. Be very careful with this: eval() executes arbitrary code, so never call it on strings you don't trust.
In your example code, it appears that __str__ and __repr__ return the same string representation. That's fine. However, if you wanted, you could make your __str__ return some prettier formatted version (for instance with line breaks and no brackets), but __repr__ should, where possible, return a string that could be parsed by the Python interpreter to reconstruct the object.
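To make the distinction concrete, here is a small sketch with a hypothetical Point class (not part of the question's LinkedList). It assumes the constructor arguments themselves have eval-able reprs, which holds for plain numbers.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Looks like the constructor call, so eval() can rebuild the object.
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # Friendlier form meant for people.
        return f"({self.x}, {self.y})"

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
print(str(p))              # (1, 2)
print(repr(p))             # Point(1, 2)
print(eval(repr(p)) == p)  # True
```

The eval() round trip is shown only to illustrate the convention; it should not be used on untrusted input.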

Extend str class in Python and modify its attribute value [duplicate]

Do you know of a Python library which provides mutable strings? Google returned surprisingly few results. The only usable library I found is http://code.google.com/p/gapbuffer/ which is in C but I would prefer it to be written in pure Python.
Edit: Thanks for the responses but I'm after an efficient library. That is, ''.join(list) might work but I was hoping for something more optimized. Also, it has to support the usual stuff regular strings do, like regex and unicode.
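In Python, the built-in mutable sequence type for byte strings is bytearray.
A ctypes string buffer will allow you to efficiently change characters in a string, although you can't change the string length. (The session below is Python 2; in Python 3, create_string_buffer expects a bytes object.)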
>>> import ctypes
>>> a = 'abcdefghijklmn'
>>> mutable = ctypes.create_string_buffer(a)
>>> mutable[5:10] = ''.join( reversed(list(mutable[5:10].upper())) )
>>> a = mutable.value
>>> print `a, type(a)`
('abcdeJIHGFklmn', <type 'str'>)
    class MutableString(object):
        def __init__(self, data):
            self.data = list(data)
        def __repr__(self):
            return "".join(self.data)
        def __setitem__(self, index, value):
            self.data[index] = value
        def __getitem__(self, index):
            if isinstance(index, slice):
                return "".join(self.data[index])
            return self.data[index]
        def __delitem__(self, index):
            del self.data[index]
        def __add__(self, other):
            # return a new object; the operands are left unchanged
            return MutableString(self.data + list(other))
        def __len__(self):
            return len(self.data)
        ...
and so on, and so forth.
You could also subclass StringIO, buffer, or bytearray.
How about simply sub-classing list (the prime example for mutability in Python)?
    class CharList(list):
        def __init__(self, s):
            list.__init__(self, s)
        @property
        def list(self):
            return list(self)
        @property
        def string(self):
            return "".join(self)
        def __setitem__(self, key, value):
            if isinstance(key, int) and len(value) != 1:
                cls = type(self).__name__
                raise ValueError("attempt to assign sequence of size {} to {} item of size 1".format(len(value), cls))
            super(CharList, self).__setitem__(key, value)
        def __str__(self):
            return self.string
        def __repr__(self):
            cls = type(self).__name__
            return "{}(\'{}\')".format(cls, self.string)
This only joins the list back to a string if you want to print it or actively ask for the string representation.
Mutating and extending are trivial, and the user knows how to do it already since it's just a list.
Example usage:
s = "te_st"
c = CharList(s)
c[1:3] = "oa"
c += "er"
print(c)  # prints "toaster"
print(c.list)  # prints ['t', 'o', 'a', 's', 't', 'e', 'r']
The following is fixed, see update below.
There's one (solvable) caveat: There's no check (yet) that each element is indeed a character. It will at least fail printing for everything but strings. However, those can be joined and may cause weird situations like this: [see code example below]
With the custom __setitem__, assigning a string of length != 1 to a CharList item will raise a ValueError. Everything else can still be freely assigned but will raise a TypeError: sequence item n: expected string, X found when printing, due to the string.join() operation. If that's not good enough, further checks can be added easily (potentially also to __setslice__ or by switching the base class to collections.Sequence (performance might be different?!), cf. here)
s = "test"
c = CharList(s)
c[1] = "oa"
# with custom __setitem__ a ValueError is raised here!
# without custom __setitem__, we could go on:
c += "er"
print(c)  # prints "toaster"
# this looks right until here, but:
print(c.list)  # prints ['t', 'oa', 's', 't', 'e', 'r']
Efficient mutable strings in Python are arrays.
Python 3 example for a Unicode string, using array.array from the standard library:
>>> import array, re
>>> ua = array.array('u', 'teststring12')
>>> ua[-2:] = array.array('u', '345')
>>> ua
array('u', 'teststring345')
>>> re.search('string.*', ua.tounicode()).group()
'string345'
bytearray is predefined for bytes and is more automatic regarding conversion and compatibility.
You can also consider memoryview / buffer, numpy arrays, mmap and multiprocessing.shared_memory for certain cases.
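For byte-oriented data, the bytearray mentioned above can be edited in place and, unlike the ctypes buffer, resized. A minimal sketch, assuming the text is ASCII so that one byte corresponds to one character (the variable names are illustrative):

```python
buf = bytearray("teststring12", "utf-8")
buf[-2:] = b"345"           # slice assignment may change the length
buf[0:1] = b"T"             # replace a single byte in place
text = buf.decode("utf-8")  # convert back to str when needed
print(text)                 # Teststring345
```

Note that a bytearray indexes bytes, not characters, so with non-ASCII text a single character can span several positions.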
The FIFOStr package in pypi supports pattern matching and mutable strings. This may or may not be exactly what is wanted but was created as part of a pattern parser for a serial port (the chars are added one char at a time from left or right - see docs). It is derived from deque.
from fifostr import FIFOStr
myString = FIFOStr("this is a test")
myString.head(4) == "this" #true
myString[2] = 'u'
myString.head(4) == "thus" #true
(full disclosure I'm the author of FIFOstr)
Just do this
string = "big"
string = list(string)
string[0] = string[0].upper()
string = "".join(string)
print(string)
'''OUTPUT'''
  > Big

"subtracting" strings, classes in python

Learning about classes in python. I want the difference between two strings, a sort of subtraction. eg:
a = "abcdef"
b ="abcde"
c = a - b
This would give the output f.
I was looking at this class and I am new to this so would like some clarification on how it works.
    class MyStr(str):
        def __init__(self, val):
            return str.__init__(self, val)
        def __sub__(self, other):
            if self.count(other) > 0:
                return self.replace(other, '', 1)
            else:
                return self
and this will work in the following way:
>>> a = MyStr('thethethethethe')
>>> b = a - 'the'
>>> a
'thethethethethe'
>>> b
'thethethethe'
>>> b = a - 2 * 'the'
>>> b
'thethethe'
So a string is passed to the class and the constructor is called __init__. This runs the constructor and an object is returned, which contains the value of the string? Then a new subtraction function is created, so that when you use - with the MyStr object it is just defining how subtract works with that class? When sub is called with a string, count is used to check if that string is a substring of the object created. If that is the case, the first occurrence of the passed string is removed. Is this understanding correct?
Edit: basically this class could be reduced to:
    class MyStr(str):
        def __sub__(self, other):
            return self.replace(other, '', 1)
Yes, your understanding is entirely correct.
Python will call a .__sub__() method if present on the left-hand operand; if not, a corresponding .__rsub__() method on the right-hand operand can also hook into the operation.
See emulating numeric types for a list of hooks Python supports for providing more arithmetic operators.
Note that the .count() call is redundant; .replace() will not fail if the other string is not present; the whole function could be simplified to:
    def __sub__(self, other):
        return self.replace(other, '', 1)
The reverse version would be:
    def __rsub__(self, other):
        return other.replace(self, '', 1)
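Putting the two hooks together, a short sketch of how both operand orders behave. Wrapping the result in MyStr is an addition made here so that subtractions can be chained; the versions above return a plain str.

```python
class MyStr(str):
    def __sub__(self, other):
        # self - other: drop the first occurrence of `other` from self
        return MyStr(self.replace(other, "", 1))

    def __rsub__(self, other):
        # other - self: used when the left operand is a plain str
        return MyStr(other.replace(self, "", 1))


print(MyStr("abcdef") - "abcde")          # f
print("abcdef" - MyStr("abcde"))          # f
print(MyStr("thethethe") - "the" - "the")  # the
```

Without the MyStr wrapper, the chained example in the last line would fail, because the first subtraction would return a plain str that has no subtraction operator.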

Classes: Alternative way of adding len()

I am building a string class that behaves like a regular string except that the addition operator returns the sum of the lengths of the two strings instead of concatenating them, and the multiplication operator returns the product of the lengths of the two strings. So I was planning on doing:
    class myStr(string):
        def __add__(self):
            return len(string) + len (input)
At least that is what I have for the first part, but it is apparently not correct. Can someone help me correct it?
You need to derive from str, and you can use len(self) to get the length of the current instance. You also need to give __add__ a parameter for the other operand of the + operator.
    class myStr(str):
        def __add__(self, other):
            return len(self) + len(other)
Demo:
>>> class myStr(str):
...     def __add__(self, other):
...         return len(self) + len(other)
...
>>> foo = myStr('foo')
>>> foo
'foo'
>>> foo + 'bar'
6
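The question also asks for a multiplication operator. A sketch along the same lines, assuming the other operand is always something with a length (such as another string):

```python
class myStr(str):
    def __add__(self, other):
        # sum of the two lengths instead of concatenation
        return len(self) + len(other)

    def __mul__(self, other):
        # product of the two lengths instead of repetition
        return len(self) * len(other)


foo = myStr("foo")
print(foo + "bar")   # 6
print(foo * "quux")  # 12
```

This only covers the case where the myStr instance is the left operand, and foo * 3 would now raise a TypeError because an int has no length.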
string is not a class. It's not anything*. There is no context where len(string) will work unless you define string.
Secondly, __add__ does not have an input parameter.
You need to fix both of these issues.
* You could import a module called string, but it's not something that just exists in global scope.

How to make print call the __str__ method of Python objects inside a list?

In Java, if I call List.toString(), it will automatically call the toString() method on each object inside the List. For example, if my list contains objects o1, o2, and o3, list.toString() would look something like this:
"[" + o1.toString() + ", " + o2.toString() + ", " + o3.toString() + "]"
Is there a way to get similar behavior in Python? I implemented a __str__() method in my class, but when I print out a list of objects, using:
print 'my list is %s'%(list)
it looks something like this:
[<__main__.cell instance at 0x2a955e95f0>, <__main__.cell instance at 0x2a955e9638>, <__main__.cell instance at 0x2a955e9680>]
how can I get python to call my __str__() automatically for each element inside the list (or dict for that matter)?
Calling str() on a Python list calls the __repr__ method on each element inside. For some items, __str__ and __repr__ are the same. If you want that behavior, do:
    def __str__(self):
        ...
    def __repr__(self):
        return self.__str__()
You can use a list comprehension to generate a new list with each item str()'d automatically:
print([str(item) for item in mylist])
Two easy things you can do, use the map function or use a comprehension.
But that gets you a list of strings, not a string. So you also have to join the strings together.
s= ",".join( map( str, myList ) )
or
s= ",".join( [ str(element) for element in myList ] )
Then you can print this composite string object.
print 'my list is %s'%( s )
Depending on what you want to use that output for, perhaps __repr__ might be more appropriate:
    import unittest
    class A(object):
        def __init__(self, val):
            self.val = val
        def __repr__(self):
            return repr(self.val)
    class Test(unittest.TestCase):
        def testMain(self):
            l = [A('a'), A('b')]
            self.assertEqual(repr(l), "['a', 'b']")
    if __name__ == '__main__':
        unittest.main()
I agree with the previous answer about using list comprehensions to do this, but you could certainly hide that behind a function, if that's what floats your boat.
    def is_list(value):
        if type(value) in (list, tuple):
            return True
        return False
    def list_str(value):
        if not is_list(value):
            return str(value)
        return [list_str(v) for v in value]
Just for fun, I made list_str() recursively str() everything contained in the list.
Something like this?
a = [1, 2 ,3]
[str(x) for x in a]
# ['1', '2', '3']
This should suffice.
When printing lists as well as other container classes, the contained elements will be printed using __repr__, because __repr__ is meant to be used for internal object representation.
If we call: help(object.__repr__) it will tell us:
Help on wrapper_descriptor:
__repr__(self, /)
Return repr(self).
And if we call help(repr) it will output:
Help on built-in function repr in module builtins:
repr(obj, /)
Return the canonical string representation of the object.
For many object types, including most builtins, eval(repr(obj)) == obj.
If __str__ is implemented for an object but __repr__ is not, repr(obj) will produce the default output, just as print(obj) does when neither of them is implemented.
So to control how elements appear when a list is printed, implement __repr__ for your class. One possible way to do that is this:
    class C:
        def __str__(self):
            return f"{self.__class__.__name__} class str"
    C.__repr__ = C.__str__
    ci = C()
    print(ci)  # C class str
    print(str(ci))  # C class str
    print(repr(ci))  # C class str
The output you're getting is just the object's module name, class name, and then the memory address in hexadecimal, because the __repr__ function is not overridden.
__str__ is used for the string representation of an object when using print. But since you are printing a list of objects, and not iterating over the list to call the str method for each item, it prints out each object's representation.
To have the __str__ function invoked you'd need to do something like this:
'my list is %s' % [str(x) for x in myList]
If you override the __repr__ function you can use the print method like you were before:
    class cell:
        def __init__(self, id):
            self.id = id
        def __str__(self):
            return str(self.id)  # Or whatever
        def __repr__(self):
            return str(self)  # invoked for each element when you print the whole list
    myList = [cell(1), cell(2), cell(3)]
    'my list is %s' % myList
Then you'll get "my list is [1, 2, 3]" as your output.
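A related point: if a class defines only __repr__, str() and print() fall back to it, so a single method can cover both the list case and the single-object case. A minimal sketch with a hypothetical Cell class:

```python
class Cell:
    def __init__(self, id):
        self.id = id

    def __repr__(self):
        # Used by repr(), by containers, and (as a fallback) by str()/print().
        return f"Cell({self.id!r})"


cells = [Cell(1), Cell(2), Cell(3)]
print(cells[0])                          # Cell(1)
print("my list is %s" % cells)           # my list is [Cell(1), Cell(2), Cell(3)]
print(", ".join(str(c) for c in cells))  # Cell(1), Cell(2), Cell(3)
```

Define __str__ as well only when the human-readable form should differ from the representation shown inside containers.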
