Python Copying part of string - python

I have this line
Server:x.x.x.x # U:100 # P:100 # Pre:0810 # Tel:xxxxxxxxxx
and I want to copy the value 0810, which comes after Pre:.
How can I do that?

You could use the re module ('re' stands for regular expressions).
This solution assumes that your Pre: field will always have four digits.
If the length of the number varies, you could replace the {4} (expect exactly 4) with + (expect one or more).
>>> import re
>>> x = "Server:x.x.x.x # U:100 # P:100 # Pre:0810 # Tel:xxxxxxxxxx"
>>> num = re.findall(r'Pre:(\d{4})', x)[0] # re.findall returns a list
>>> print(num)
0810
You can read about it here:
https://docs.python.org/2/howto/regex.html
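If the Pre: field might be missing, re.findall(...)[0] raises an IndexError. A minimal sketch of a more defensive variant follows; the helper name extract_pre is made up for illustration, and it assumes the value is a run of digits of any length.

```python
import re

def extract_pre(line):
    """Return the digits after 'Pre:', or None if the field is absent.

    Hypothetical helper, not part of the original answer.
    """
    match = re.search(r'Pre:(\d+)', line)
    return match.group(1) if match else None

line = "Server:x.x.x.x # U:100 # P:100 # Pre:0810 # Tel:xxxxxxxxxx"
print(extract_pre(line))     # 0810
print(extract_pre("U:100"))  # None
```

The value stays a string, so the leading zero in 0810 is preserved.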

As usual in these cases, the best answer depends upon how your strings will vary, and we only have one example to generalize from.
Anyway, you could use string methods like str.split to get at it directly:
>>> s = "Server:x.x.x.x # U:100 # P:100 # Pre:0810 # Tel:xxxxxxxxxx"
>>> s.split()[6].split(":")[1]
'0810'
But I tend to prefer more general solutions. For example, depending on how the format changes, something like
>>> d = dict(x.split(":") for x in s.split(" # "))
>>> d
{'Pre': '0810', 'P': '100', 'U': '100', 'Tel': 'xxxxxxxxxx', 'Server': 'x.x.x.x'}
which makes a dictionary of all the values, after which you could access any element:
>>> d["Pre"]
'0810'
>>> d["Server"]
'x.x.x.x'
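The dictionary approach can be wrapped in a small function. This is only a sketch: parse_fields is a hypothetical name, and it assumes fields are separated by " # " and that the key ends at the first colon, so a value that itself contains a colon survives.

```python
def parse_fields(line, sep=" # "):
    """Split 'key:value' fields into a dict (hypothetical helper)."""
    fields = {}
    for part in line.split(sep):
        # partition splits on the first colon only
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()
    return fields

s = "Server:x.x.x.x # U:100 # P:100 # Pre:0810 # Tel:xxxxxxxxxx"
d = parse_fields(s)
print(d["Pre"])  # 0810
```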

Related

pattern match get list and dict from string

I have the string below, and I want to extract the lists, dicts, and plain variables from it.
How can I split this string into that format?
s = 'list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}'
import re
m1 = re.findall(r'(?=.*,)(.*?=\[.+?\],?)', s)
for i in m1:
    print('m1:', i)
Only the first result is correct.
Does anyone know how to do this?
m1: list_c=[1,2],
m1: a=3,b=1.3,c=abch,list_a=[1,2],
Use '=' to split instead; then you can work out each variable name and its value.
You still need to handle type casting for the values (regex, split, or try/except around a cast may help).
Also, as others have commented, a dict may be easier to handle.
s = 'list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}'
al = s.split('=')
var_l = [al[0]]
value_l = []
for a in al[1:-1]:
    var_l.append(a.split(',')[-1])
    value_l.append(','.join(a.split(',')[:-1]))
value_l.append(al[-1])
output = dict(zip(var_l, value_l))
print(output)
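The type casting mentioned above could look roughly like this. It is a sketch only: cast is a hypothetical helper that tries int, then float, and otherwise leaves the string untouched, so list and dict values such as '[1,2]' would still need their own parsing.

```python
def cast(value):
    """Try int, then float; fall back to the original string (hypothetical helper)."""
    for convert in (int, float):
        try:
            return convert(value)
        except ValueError:
            pass
    return value

raw = {'a': '3', 'b': '1.3', 'c': 'abch', 'list_a': '[1,2]'}
typed = {name: cast(value) for name, value in raw.items()}
print(typed)  # {'a': 3, 'b': 1.3, 'c': 'abch', 'list_a': '[1,2]'}
```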
You may have better luck if you more or less explicitly describe the right-hand side expressions: numbers, lists, dictionaries, and identifiers:
re.findall(r"([^=,]+)=" # LHS (no commas, so names come out clean) and assignment operator
+r"([+-]?\d+(?:\.\d+)?|" # Numbers
+r"[+-]?\d+\.|" # More numbers
+r"\[[^]]+\]|" # Lists
+r"{[^}]+}|" # Dictionaries
+r"[a-zA-Z_][a-zA-Z_\d]*)", # Idents
s)
# [('list_c', '[1,2]'), ('a', '3'), ('b', '1.3'), ('c', 'abch'),
# ('list_a', '[1,2]'), ('dict_a', '{a:2,b:3}')]
The answer looks like this:
import re
from pprint import pprint
s = 'list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1],Save,Record,dict_a={a:2,b:3}'
m1 = re.findall(r"([^=]+)=" # LHS and assignment operator
+r"([+-]?\d+(?:\.\d+)?|" # Numbers
+r"[+-]?\d+\.|" # More numbers
+r"\[[^]]+\]|" # Lists
+r"{[^}]+}|" # Dictionaries
+r"[a-zA-Z_][a-zA-Z_\d]*)", # Idents
s)
temp_d = {}
for i, j in m1:
    temp = i.strip(',').split(',')
    if len(temp) > 1:
        for k in temp[:-1]:
            temp_d[k] = ''
        temp_d[temp[-1]] = j
    else:
        temp_d[temp[0]] = j
pprint(temp_d)
The output is:
{'Record': '',
'Save': '',
'a': '3',
'b': '1.3',
'c': 'abch',
'dict_a': '{a:2,b:3}',
'list_a': '[1]',
'list_c': '[1,2]'}
Instead of picking out the types, you can start by capturing the identifiers. Here's a regex that captures all the identifiers in the string (for lowercase only, but see note):
regex = re.compile(r'([a-z]|_)+=')
# Note: to accept any valid variable name, use r'[A-Za-z_][A-Za-z0-9_]*=' instead
cases = [x.group() for x in re.finditer(regex, s)]
This gives a list of all the identifiers in the string:
['list_c=', 'a=', 'b=', 'c=', 'list_a=', 'dict_a=']
We can now define a function that uses the above list to chop up s
one identifier at a time:
def chop(mystr, mylist):
    temp = mystr.partition(mylist[0])[2]
    cut = temp.find(mylist[1])  # strip leading bits
    return temp[cut:], mylist[1:]
mystr = s[:]
temp = [mystr]
mylist = cases[:]
while len(mylist) > 1:
    mystr, mylist = chop(mystr, mylist)
    temp.append(mystr)
This (convoluted) slicing operation gives this list of strings:
['list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'list_a=[1,2],dict_a={a:2,b:3}',
'dict_a={a:2,b:3}']
Now cut off the ends using each successive entry:
result = []
for x in range(len(temp) - 1):
    cut = temp[x].find(temp[x+1]) - 1  # -1 to remove commas
    result.append(temp[x][:cut])
result.append(temp.pop())  # get the last item
Now we have the full list:
['list_c=[1,2]', 'a=3', 'b=1.3', 'c=abch', 'list_a=[1,2]', 'dict_a={a:2,b:3}']
Each element is easily parsable into key:value pairs (and is also executable via exec).
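A different way to reach the same list of name=value items is to split only on commas that sit outside brackets and braces. The sketch below assumes the brackets are balanced and that values never contain quoted commas; split_top_level is a hypothetical name, not something from the answers above.

```python
def split_top_level(text, sep=","):
    """Split on sep, ignoring separators nested inside [] or {} (hypothetical helper)."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch in "[{":
            depth += 1
        elif ch in "]}":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])  # whatever follows the last top-level separator
    return parts

s = 'list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}'
items = split_top_level(s)
pairs = dict(item.split('=', 1) for item in items)
print(pairs['dict_a'])  # {a:2,b:3}
```

The values are still strings at this point, so any type conversion is a separate step.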

Best way to add two numbers within a column that are separated by brackets

I have a list that contains counts for different items as strings. Sometimes a restock is given in brackets. The list looks like this:
23
21 (+3)
32 (+14)
So there is always a space between the number and the brackets. To start off, I wrote a function that I apply over the Series. I used the split method twice to return only the first number:
splitted = item.split(" ")
splitted2 = splitted[0].split("+")
return int(splitted2[0])
This solution is already kind of hacky and, on top of that, I am missing the restocks in brackets. Now I want to know how I could add both numbers together. I would therefore split each item once so that I get this as a result:
['23']
['21', '(+3)']
Then I want to select only the lists that have two values, get rid of the + and (), and add the first and second values together. How would I do that?
With the help of this question: Extract numbers from a string
import re
text = "32 (+14)"  # avoid naming variables str or sum, which shadow the builtins
data = re.findall(r'\d+', text)
# ['32', '14']
total = 0
for d in data:
    total += int(d)
print(total)
Output:
46
See how numbers are parsed using regex. Here, \d matches any digit, [0-9].
Another method (with the help of this answer):
>>> import re
>>> text = "32 (+14)"
>>> eval(re.sub(r'[()]', '', text.replace(' ', '')))
46
This also lets the string contain any other operation, which is why eval should only be used on input you trust: it executes arbitrary Python code.
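This is one way, which literally evaluates the '+' operation. Note that ast.literal_eval stopped accepting plain additions such as '21+3' in Python 3.7, so this only works on older versions.
import ast, re
lst = ['23', '21 (+3)', '32 (+14)']
lst = [ast.literal_eval(re.sub(r'[()]', '', i.replace(' ', ''))) for i in lst]
# [23, 24, 46]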
Not the best method, but this works; a regex would be the better option.
>>> text = "32 (+14)"
>>> text.replace("(+", "").replace(")", "").split(" ")
['32', '14']
Regex:
>>> import re
>>> text = "32 (+14)"
>>> nums = re.findall(r'\d+', text)
>>> print(sum(int(i) for i in nums))
46
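Since the question applies a function over a whole column, the regex approach can be packaged as one function and applied per item. A minimal sketch, assuming every item holds only non-negative integers; total_count is a hypothetical name.

```python
import re

def total_count(item):
    """Sum every run of digits in a string such as '21 (+3)' (hypothetical helper)."""
    return sum(int(n) for n in re.findall(r'\d+', item))

counts = ['23', '21 (+3)', '32 (+14)']
totals = [total_count(c) for c in counts]
print(totals)  # [23, 24, 46]
```

With a pandas Series, passing the same function to Series.apply should give the equivalent result.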

Get values in string - Python

I am new to Python so I have lots of doubts. For instance I have a string:
string = "xtpo, example1=x, example2, example3=thisValue"
Is it possible to get the values after the equals signs in example1 and example3, knowing only the keywords and not what comes after the =?
You can use regex:
>>> import re
>>> strs = "xtpo, example1=x, example2, example3=thisValue"
>>> key = 'example1'
>>> re.search(r'{}=(\w+)'.format(key), strs).group(1)
'x'
>>> key = 'example3'
>>> re.search(r'{}=(\w+)'.format(key), strs).group(1)
'thisValue'
Spacing things out for clarity:
>>> Sstring = "xtpo, example1=x, example2, example3=thisValue"
>>> items = Sstring.split(',')  # Get the comma separated items
>>> for i in items:
...     Pair = i.split('=')  # Try splitting on =
...     if len(Pair) > 1:  # Did split
...         print(Pair)  # or whatever you would like to do
...
[' example1', 'x']
[' example3', 'thisValue']
>>>
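To look values up by keyword afterwards, the pairs can be collected into a dict. This is a sketch under the assumption that items are comma-separated and that neither keys nor values contain commas; parse_pairs is a hypothetical name.

```python
def parse_pairs(text):
    """Collect 'key=value' items from a comma-separated string (hypothetical helper)."""
    pairs = {}
    for item in text.split(','):
        key, sep, value = item.partition('=')
        if sep:  # skip items that contain no '='
            pairs[key.strip()] = value.strip()
    return pairs

values = parse_pairs("xtpo, example1=x, example2, example3=thisValue")
print(values['example1'])  # x
print(values['example3'])  # thisValue
```

Stripping the key removes the leading space that shows up in the output above.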

How to convert a malformed string to a dictionary?

I have a string s (note that the a and b are not enclosed in quotation marks, so it can't directly be evaluated as a dict):
s = '{a:1,b:2}'
I want to convert this variable to a dict like this:
{'a':1,'b':2}
How can I do this?
This will work with your example:
import ast
def elem_splitter(s):
    return s.split(':', 1)
s = '{a:1,b:2}'
s_no_braces = s.strip()[1:-1] #s.translate(None,'{}') is more elegant, but can fail if you can have strings with '{' or '}' enclosed.
elements = (elem_splitter(ss) for ss in s_no_braces.split(','))
d = dict((k,ast.literal_eval(v)) for k,v in elements)
Note that this will fail if you have a string formatted as:
'{s:"foo,bar",ss:2}' #comma in string is a problem for this algorithm
or:
'{s,ss:1,v:2}'
but it will pass a string like:
'{s ss:1,v:2}' #{"s ss":1, "v":2}
You may also want to modify elem_splitter slightly, depending on your needs:
def elem_splitter(s):
    k, v = s.split(':', 1)
    return k.strip(), v  # maybe v.strip() also?
*Somebody else might cook up a better example using more of the ast module, but I don't know its internals very well, so I doubt I'll have time to make that answer.
As your string is malformed both as JSON and as a Python dict, you can use neither json.loads nor ast.literal_eval to convert the data directly.
In this particular case, you would have to translate it to a Python dictionary manually, using your knowledge of the input data:
>>> foo = '{a:1,b:2}'
>>> dict(e.split(":") for e in foo.strip("{}").split(","))
{'a': '1', 'b': '2'}
(The original used foo.translate(None, "{}"), which only works on Python 2; strip("{}") is used here so the example also runs on Python 3.)
As Tim pointed out, I short-sightedly missed the fact that the values should be integers; here is an alternative implementation:
>>> {k: int(v) for e in foo.strip("{}").split(",")
...  for k, v in [e.split(":")]}
{'a': 1, 'b': 2}
import re,ast
regex = re.compile('([a-z])')
ast.literal_eval(regex.sub(r'"\1"', s))
out:
{'a': 1, 'b': 2}
EDIT:
If you happen to have something like {foo1:1,bar:2}, add an additional capture group to the regex:
regex = re.compile(r'(\w+)(:)')
ast.literal_eval(regex.sub(r'"\1"\2', s))
You can do it simply with this:
s = "{a:1,b:2}"
content = s[s.index("{")+1:s.index("}")]
to_int = lambda x: int(x) if x.isdigit() else x
d = dict((to_int(i) for i in pair.split(":", 1)) for pair in content.split(","))
For simplicity I've omitted exception handling for strings that don't contain a valid specification, and this version doesn't strip whitespace, which you may want. If you prefer the interpretation that the key is always a string and the value is always an int, then it's even easier:
s = "{a:1,b:2}"
content = s[s.index("{")+1:s.index("}")]
d = dict((k.strip(), int(v)) for k, v in (pair.split(":", 1) for pair in content.split(",")))
As a bonus, this version also strips whitespace from the key to show how simple it is.
import simplejson
s = '{a:1,b:2}'
a = simplejson.loads(s)  # note: this raises an error on this input, because JSON requires double-quoted keys
print(a)
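A JSON parser rejects '{a:1,b:2}' because the keys are not double-quoted, so one option is to quote the bare keys first and then parse. The sketch below uses the standard json module; loads_bare_keys is a hypothetical name, and it assumes keys look like identifiers and that no string value contains text resembling ",name:".

```python
import json
import re

def loads_bare_keys(text):
    """Quote unquoted identifier keys, then parse as JSON (hypothetical helper)."""
    # A key is an identifier that follows '{' or ',' and precedes ':'
    quoted = re.sub(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)', r'\1"\2"\3', text)
    return json.loads(quoted)

print(loads_bare_keys('{a:1,b:2}'))  # {'a': 1, 'b': 2}
```

Values are parsed by json.loads, so numbers come back as int or float rather than strings.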

Python - Make sure string is converted to correct Float

I have possible strings of prices like:
20.99, 20, 20.12
Sometimes the user sends me a malformed string, something like this:
20.99.0, 20.0.0
These should be converted back to:
20.99, 20
So basically, remove everything from the second . onward, if there is one.
Just to be clear, they arrive one at a time, so there is just one price per string.
Any nice one-liner ideas?
For a one-liner, you can use .split() and .join():
>>> '.'.join('20.99.0'.split('.')[:2])
'20.99'
>>> '.'.join('20.99.1231.23'.split('.')[:2])
'20.99'
>>> '.'.join('20.99'.split('.')[:2])
'20.99'
>>> '.'.join('20'.split('.')[:2])
'20'
You could do something like this
>>> s = '20.99.0, 20.0.0'
>>> s.split(',')
['20.99.0', ' 20.0.0']
>>> list(map(lambda x: x[:x.find('.',x.find('.')+1)], s.split(',')))
['20.99', ' 20.0']
Look at the inner find expression: I find the first '.', move one past it, find the next '.', and the slice drops everything from there on. Note that if there is no second '.', find returns -1 and the slice drops the last character instead, so this only suits inputs that contain two dots.
Edit: Note that this solution will not discard everything from the second decimal point; it discards only the second point and keeps the additional digits. If you want to discard all of those digits, you could use e.g. @Blender's solution.
It only qualifies as a one-liner if two instructions per line with a ; count, but here's what I came up with:
>>> x = "20.99.1234"
>>> s = x.split("."); x = s[0] + "." + "".join(s[1:])
>>> x
'20.991234'
It should be a little faster than scanning through the string multiple times, though. At some performance cost, you can do it in a single statement:
>>> x = x.split(".")[0] + "." + "".join(x.split(".")[1:])
For a whole list:
>>> def numify(x):
...     s = x.split(".")
...     return float(s[0] + "." + "".join(s[1:]))
...
>>> x = ["123.4.56", "12.34", "12345.6.7.8.9"]
>>> [ numify(f) for f in x ]
[123.456, 12.34, 12345.6789]
>>> s = '20.99, 20, 20.99.23'
>>> ','.join(x if x.count('.') in [1,0] else x[:x.rfind('.')] for x in s.split(','))
'20.99, 20, 20.99'
If you are looking for a regex-based solution and your intended behaviour is to discard everything after the second . (decimal point), then:
>>> import re
>>> st = "20.99.123"
>>> string_decimal = re.findall(r'\d+\.\d+',st)
>>> float(''.join(string_decimal))
20.99
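Putting the split-and-join idea into a function makes it easy to convert straight to a float. A minimal sketch, assuming each string holds a single price and that everything from the second dot onward should be discarded; clean_price is a hypothetical name.

```python
def clean_price(text):
    """Keep at most the first two dot-separated parts, then convert to float.

    Hypothetical helper; raises ValueError if what remains is not a number.
    """
    parts = text.strip().split('.')
    return float('.'.join(parts[:2]))

for raw in ['20.99', '20', '20.99.0', '20.0.0']:
    print(raw, '->', clean_price(raw))
```

If exact decimal arithmetic matters for prices, decimal.Decimal could be used in place of float on the same joined string.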
