python 2.7 remove brackets - python

I have a string that opens with { and closes with }. These braces are always present, always as the first and last characters, and never appear in the middle. For example:
{-4,10746,.....,205}
{-3,105756}
What is the most efficient way to remove the braces, so that I get:
-4,10746,.....,205
-3,105756

s[1:-1] # skip the first and last character
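A minimal sketch of the slicing approach with a guard added, assuming you want an error when the braces are missing. The name strip_braces is hypothetical, not something from the question.

```python
def strip_braces(text):
    """Return text without its enclosing braces (hypothetical helper)."""
    # Assumption: the braces must be the first and last characters.
    if len(text) < 2 or text[0] != "{" or text[-1] != "}":
        raise ValueError("expected a string wrapped in {...}")
    # Slicing copies everything except the first and last character.
    return text[1:-1]


print(strip_braces("{-3,105756}"))  # -3,105756
```

Without the guard, s[1:-1] silently drops two real characters from any input that is not wrapped in braces.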

You can also use the replace method, which removes every occurrence of a character:
In [1]: a = 'hello world'
In [3]: a.replace('l','')
Out[3]: 'heo word'

Your question doesn't say whether the value is a string or a set, so here are both cases.
If it is a set, this might work:
a = {-4, 205, 10746}
",".join([str(s) for s in a])
output = '10746,-4,205'
If it is a string, this will work:
a = '{-4, 205, 10746}'
a.replace("{", "").replace("}", "")
output = '-4, 205, 10746'
Sets are unordered, which is why the elements in the set output come out in a different order than they were written.
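If the order matters in the set case, one option is to sort before joining. This is only a sketch, and it assumes ascending numeric order is what you want; the variable names are illustrative.

```python
values = {-4, 205, 10746}

# sorted() returns a list in ascending order, so the joined text is stable
# no matter how the set happens to iterate.
joined = ",".join(str(v) for v in sorted(values))
print(joined)  # -4,205,10746
```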

Here's a rather roundabout way of doing exactly what you need:
l = {-3,105756}
new_l = []
for ch in l:
    if ch!='{' and ch!= '}':
        new_l.append(ch)
for i,val in enumerate(new_l):
    length = len(new_l)
    if(i==length-1):
        print str(val)
    else:
        print str(val)+',',
I'm sure there are numerous single line codes to give you what you want, but this is kind of what goes on in the background, and will also remove the braces irrespective of their positions in the input string.

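Just a side note: the answer by @dlask is a good way to solve your issue.
But if what you really want is to convert that string (which looks like a set) into a set object (or some other data structure), you can also use the ast.literal_eval() function. Note that literal_eval() only accepts set literals from Python 3.2 onwards; the session below is Python 3.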
>>> import ast
>>> s = '{-3,105756}'
>>> sset = ast.literal_eval(s)
>>> sset
{105756, -3}
>>> type(sset)
<class 'set'>
From documentation -
ast.literal_eval(node_or_string)
Safely evaluate an expression node or a Unicode or Latin-1 encoded string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.

The safest way would be to strip:
'{-4, 205, 10746}'.strip("{}")

Related

Does the split method in Python return something containing \u for some characters, and how do I get rid of it?

I have a unicode string:
s = "ᠤᠷᠢᠳᠤ ᠲᠠᠯ᠎ᠠ ᠶᠢᠨ ᠬᠠᠪᠲᠠᠭᠠᠢ ᠬᠡᠪᠲᠡᠭᠡ"
When I split it, the result looks changed: there is a \u180e in the second word.
>>> print(s.split())
['ᠤᠷᠢᠳᠤ', 'ᠲᠠᠯ\u180eᠠ', 'ᠶᠢᠨ', 'ᠬᠠᠪᠲᠠᠭᠠᠢ', 'ᠬᠡᠪᠲᠡᠭᠡ']
What I want to get is:
['ᠤᠷᠢᠳᠤ', 'ᠲᠠᠯ᠎ᠠ ᠶᠢᠨ', 'ᠶᠢᠨ', 'ᠬᠠᠪᠲᠠᠭᠠᠢ', 'ᠬᠡᠪᠲᠡᠭᠡ']
What causes this, and how can I fix it?
I don't think the problem is with the split function, but with the list itself.
>>> s = ["ᠤᠷᠢᠳᠤ ᠲᠠᠯ᠎ᠠ ᠶᠢᠨ ᠬᠠᠪᠲᠠᠭᠠᠢ ᠬᠡᠪᠲᠡᠭᠡ"]
>>> print(s)
['ᠤᠷᠢᠳᠤ ᠲᠠᠯ\u180eᠠ ᠶᠢᠨ ᠬᠠᠪᠲᠠᠭᠠᠢ ᠬᠡᠪᠲᠡᠭᠡ']
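You should still be able to use the list normally. Printing a list shows the repr() of each element, which escapes non-printable characters such as U+180E; the elements themselves are unchanged, and printing one directly shows the original text.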
>>> s = "ᠤᠷᠢᠳᠤ ᠲᠠᠯ᠎ᠠ ᠶᠢᠨ ᠬᠠᠪᠲᠠᠭᠠᠢ ᠬᠡᠪᠲᠡᠭᠡ"
>>> s = s.split()
>>> [print(e) for e in s]
ᠤᠷᠢᠳᠤ
ᠲᠠᠯ᠎ᠠ
ᠶᠢᠨ
ᠬᠠᠪᠲᠠᠭᠠᠢ
ᠬᠡᠪᠲᠡᠭᠡ
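A small sketch of the same effect using placeholder ASCII letters around U+180E, so it does not depend on a Mongolian font. The variable name is illustrative.

```python
# "\u180e" is the MONGOLIAN VOWEL SEPARATOR; the letters around it are placeholders.
word = "ab\u180ecd"

# Printing a list shows repr() of each element, which escapes the character.
print([word])
print(repr(word))

# The string itself is unchanged: it holds one code point, not a backslash escape.
print(len(word))  # 5
print("\\u180e" in repr(word), "\\" in word)  # True False
```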
According to Wikipedia: https://en.wikipedia.org/wiki/Whitespace_character#Unicode
U+180E was classified as a space character until Unicode 6.3.0, so if Python implements an earlier version of the Unicode spec, I guess split() would break on it along with all other space characters. You could work around this by passing split an argument so that it only splits on certain characters, e.g. s.split(" "), which would give you:
>>> s.split(" ")
['ᠤᠷᠢᠳᠤ', 'ᠲᠠᠯ\u180eᠠ\u202fᠶᠢᠨ', 'ᠬᠠᠪᠲᠠᠭᠠᠢ', 'ᠬᠡᠪᠲᠡᠭᠡ']

How to copy a changing substring from a string?

How can I copy data from a string whose contents change?
I tried slicing, but the length of the slice changes.
For example, in one case I need to copy the number 128 from the string '"edge_liked_by":{"count":128}', and in another I need to copy 15332 from '"edge_liked_by":{"count":15332}'.
You could use a regular expression:
import re
string = '"edge_liked_by":{"count":15332}'
number = re.search(r'{"count":(\d*)}', string).group(1)
Really depends on the situation, however I find regular expressions to be useful.
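One caveat with calling .group(1) directly: re.search returns None when nothing matches, which raises AttributeError. A hedged sketch with a guard follows; extract_count and COUNT_RE are hypothetical names, and it assumes the key is always spelled "count".

```python
import re

# Hypothetical pattern: digits that follow the literal key "count".
COUNT_RE = re.compile(r'"count":(\d+)')


def extract_count(text):
    """Return the count as an int, or None when the key is absent."""
    match = COUNT_RE.search(text)
    # re.search returns None on no match, so check before calling group().
    return int(match.group(1)) if match else None


print(extract_count('"edge_liked_by":{"count":15332}'))  # 15332
print(extract_count('"edge_liked_by":{}'))  # None
```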
To grab the numbers from the string without caring about their location, you would do as follows:
import re
def get_string(string):
    return re.search(r'\d+', string).group(0)
>>> get_string('"edge_liked_by":{"count":128}')
'128'
To only get numbers from the end of the string, you can use an anchor to ensure the result is pulled from the far end. The following example will grab any unbroken sequence of digits that is both preceded by a colon and ends within 5 characters of the end of the string:
import re
def get_string(string):
    rval = None
    string_match = re.search(r':(\d+).{0,5}$', string)
    if string_match:
        rval = string_match.group(1)
    return rval
>>> get_string('"edge_liked_by":{"count":128}')
'128'
>>> get_string('"edge_liked_by":{"1321":1}')
'1'
In the above example, adding the colon will ensure that we only pick values and don't match keys such as the "1321" that I added in as a test.
If you just want anything after the last colon, but excluding the bracket, try combining split with slicing:
>>> '"edge_liked_by":{"count":128}'.split(':')[-1][0:-1]
'128'
Finally, considering this looks like a JSON object, you can add curly brackets to the string and treat it as such. Then it becomes a nested dict you can query:
>>> import json
>>> string = '"edge_liked_by":{"count":128}'
>>> string = '{' + string + '}'
>>> string = json.loads(string)
>>> string.get('edge_liked_by').get('count')
128
All but the last approach return a string; the final one returns a number because the value is parsed as JSON.
It looks like the string you are working with was read from JSON. Maybe you are getting it as the output of some API?
If it is JSON, you've probably gone one step too far in atomizing it to a string like this. If possible, I'd work with the original output.
If not, I'd turn it into valid JSON by wrapping it in {} and then parse it with json.loads.
import json
string = '"edge_liked_by":{"count":15332}'
string = "{"+string+"}"
json_obj = json.loads(string)
count = json_obj['edge_liked_by']['count']
count will have the desired output. I prefer this option to using regular expressions because you can rely on the structure of the data and reuse the code in case you wish to parse out other attributes, in a very intuitive way. With regular expressions, the code you use will change if the data are decimal, or negative, or contain non-numeric characters.
Does this help ?
a='"edge_liked_by":{"count":128}'
import re
b=re.findall(r'\d+', a)[0]
b
Out[16]: '128'

Replace string content with each others

I have a string: 1x22x1x.
I need to replace every 1 with 2 and vice versa, so the example string would become 2x11x2x. I'm wondering how this is done. I tried:
a = "1x22x1x"
b = a.replace('1', '2').replace('2', '1')
print b
The output is 1x11x1x.
Maybe I should forget about using replace?
Here's a way using the translate method of a string:
>>> a = "1x22x1x"
>>> a.translate({ord('1'):'2', ord('2'):'1'})
'2x11x2x'
>>>
>>> # Just to explain
>>> help(str.translate)
Help on method_descriptor:
translate(...)
S.translate(table) -> str
Return a copy of the string S, where all characters have been mapped
through the given translation table, which must be a mapping of
Unicode ordinals to Unicode ordinals, strings, or None.
Unmapped characters are left untouched. Characters mapped to None
are deleted.
>>>
Note however that I wrote this for Python 3.x. In 2.x, you will need to do this:
>>> from string import maketrans
>>> a = "1x22x1x"
>>> a.translate(maketrans('12', '21'))
'2x11x2x'
>>>
Finally, it is important to remember that the translate method is for interchanging characters with other characters. If you want to interchange substrings, you should use the replace method as Rohit Jain demonstrated.
One way is to use a temporary string as an intermediate replacement:
b = a.replace('1', '#temp_replace#').replace('2', '1').replace('#temp_replace#', '2')
But this may fail if your string already contains #temp_replace#. This technique is also described in PEP 378.
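A placeholder-free alternative is to do the swap in a single regular-expression pass, so replaced text is never scanned again. This is a sketch, not the method from the answers above; swap_substrings is a hypothetical name, and it assumes a non-empty mapping.

```python
import re


def swap_substrings(text, mapping):
    """Replace every key of mapping with its value in one pass.

    Assumption: mapping is non-empty and has no empty-string key.
    """
    # Longer keys first, so a key that is a prefix of another cannot shadow it.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Each match is looked up once; the replacement is not matched again.
    return pattern.sub(lambda m: mapping[m.group(0)], text)


print(swap_substrings("1x22x1x", {"1": "2", "2": "1"}))  # 2x11x2x
```

Unlike the translate method, this also works when the things being swapped are longer than one character.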
If the "sources" are all one character, you can make a new string:
>>> a = "1x22x1x"
>>> replacements = {"1": "2", "2": "1"}
>>> ''.join(replacements.get(c,c) for c in a)
'2x11x2x'
In other words, make a new string using the get method, which accepts a default parameter. somedict.get(c,c) means something like somedict[c] if c in somedict else c, so if the character is in the replacements dictionary you use the associated value, otherwise you simply use the character itself.

How do I strip a string given a list of unwanted characters? Python

Is there a way to pass in a list instead of a char to str.strip() in python? I have been doing it this way:
unwanted = [c for c in '!##$%^&*(FGHJKmn']
s = 'FFFFoFob*&%ar**^'
for u in unwanted:
    s = s.strip(u)
print s
Desired output (this output is correct, but there should be a more elegant way than what I'm coding above):
oFob*&%ar
Strip and friends take a string representing a set of characters, so you can skip the loop:
>>> s = 'FFFFoFob*&%ar**^'
>>> s.strip('!##$%^&*(FGHJKmn')
'oFob*&%ar'
(The downside of this is that something like fn.rstrip(".png") seems to work for many filenames but doesn't really work: it strips any trailing run of the characters ., p, n and g, not the literal suffix.)
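To make the rstrip(".png") pitfall concrete, here is a short sketch with a suffix-aware alternative. The helper name remove_suffix and the file name are made up for illustration.

```python
def remove_suffix(name, suffix):
    """Drop suffix from name only when name really ends with it."""
    if suffix and name.endswith(suffix):
        return name[: -len(suffix)]
    return name


# rstrip treats ".png" as the character set {'.', 'p', 'n', 'g'},
# so the trailing "p" of "map" is stripped as well.
print("map.png".rstrip(".png"))  # ma
print(remove_suffix("map.png", ".png"))  # map
```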
Since you do not want to delete characters from the middle, you can just use strip:
>>> 'FFFFoFob*&%ar**^'.strip('!##$%^&*(FGHJKmn')
'oFob*&%ar'
Otherwise, use str.translate() (the two-argument form shown here is Python 2 only):
>>> 'FFFFoFob*&%ar**^'.translate(None, '!##$%^&*(FGHJKmn')
'oobar'
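For Python 3, where str.translate takes a single table argument, the same deletion can be sketched with str.maketrans. The variable names are illustrative.

```python
unwanted = "!##$%^&*(FGHJKmn"
s = "FFFFoFob*&%ar**^"

# With three arguments, str.maketrans maps every character of the third
# argument to None, and translate() deletes characters mapped to None.
delete_table = str.maketrans("", "", unwanted)
print(s.translate(delete_table))  # oobar
```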

How to convert an integer to hexadecimal without the extra '0x' leading and 'L' trailing characters in Python?

I am trying to convert a big integer to hexadecimal, but the result has an extra "0x" at the beginning and an "L" at the end. Is there any way to remove them? Thanks.
The number is:
44199528911754184119951207843369973680110397865530452125410391627149413347233422
34022212251821456884124472887618492329254364432818044014624401131830518339656484
40715571509533543461663355144401169142245599341189968078513301836094272490476436
03241723155291875985122856369808620004482511813588136695132933174030714932470268
09981252011612514384959816764532268676171324293234703159707742021429539550603471
00313840833815860718888322205486842202237569406420900108504810
In hex I get:
0x2ef1c78d2b66b31edec83f695809d2f86e5d135fb08f91b865675684e27e16c2faba5fcea548f3
b1f3a4139942584d90f8b2a64f48e698c1321eee4b431d81ae049e11a5aa85ff85adc2c891db9126
1f7f2c1a4d12403688002266798ddd053c2e2670ef2e3a506e41acd8cd346a79c091183febdda3ca
a852ce9ee2e126ca8ac66d3b196567ebd58d615955ed7c17fec5cca53ce1b1d84a323dc03e4fea63
461089e91b29e3834a60020437db8a76ea85ec75b4c07b3829597cfed185a70eeaL
The 0x is the literal prefix for hex numbers, and the L at the end means it is a long integer.
If you just want a hex representation of the number as a string without 0x and L, you can use string formatting with %x.
>>> a = 44199528911754184119951207843369973680110397
>>> hex(a)
'0x1fb62bdc9e54b041e61857943271b44aafb3dL'
>>> b = '%x' % a
>>> b
'1fb62bdc9e54b041e61857943271b44aafb3d'
Sure, go ahead and remove them.
hex(bignum).rstrip("L").lstrip("0x") or "0"
(Went the strip() route so it'll still work if those extra characters happen to not be there.)
Similar to Praveen's answer, you can also directly use built-in format().
>>> a = 44199528911754184119951207843369973680110397
>>> format(a, 'x')
'1fb62bdc9e54b041e61857943271b44aafb3d'
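A few more format() calls, as a sketch of how the format spec behaves for padding, upper case, zero and negative values. The sample value 255 is arbitrary.

```python
n = 255

print(format(n, "x"))    # ff
print(format(n, "X"))    # FF
print(format(n, "08x"))  # 000000ff
print(format(0, "x"))    # 0
print(format(-n, "x"))   # -ff
```

Because no prefix or suffix is ever produced, there is nothing to strip afterwards, and zero comes out as '0' rather than an empty string.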
I think it's a dangerous idea to use strip,
because lstrip and rstrip also strip 0.
For example:
a = '0x0'
a.lstrip('0x')
''
The result is '', not '0'.
In your case, you can simply use replace to avoid this situation.
Here's some sample code:
hex(bignum).replace("L","").replace("0x","")
Be careful when using the accepted answer as lstrip('0x') will also remove any leading zeros, which may not be what you want, see below:
>>> account = '0x000067'
>>> account.lstrip('0x')
'67'
>>>
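If leading zeros must survive, one option is to remove the prefix and suffix only when they are actually present. This is a sketch; strip_hex_prefix is a hypothetical helper, and it assumes non-negative values (no leading minus sign).

```python
def strip_hex_prefix(text):
    """Remove one leading '0x' and one trailing 'L', if present."""
    if text.startswith("0x"):
        text = text[2:]
    # Python 2 appends 'L' to the hex() of a long; Python 3 never does.
    if text.endswith("L"):
        text = text[:-1]
    return text


print(strip_hex_prefix("0x000067"))  # 000067
print(strip_hex_prefix("0x0"))       # 0
print(strip_hex_prefix("0x2aL"))     # 2a
```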
If you are sure that the '0x' prefix will always be there, it can be removed simply as follows:
>>> hex(42)
'0x2a'
>>> hex(42)[2:]
'2a'
>>>
[2:] will get every character in the string except for the first two.
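A more compact way would be
hex(_number)[2:-1]
but this only works when the result ends in 'L' (a Python 2 long); without the 'L', it cuts off the last hex digit. So be careful if you're working with gmpy mpz types:
there the 'L' doesn't exist at the end and you can just use
hex(mpz(_number))[2:]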
