I am trying to make a list containing 2 strings:
List=["Hight = 7.2", "baselength = 8.32"]
But I am having a problem trying to extract the numbers from the strings:
For example:
If the string is "Hight = 7.2", the result should be 7.2,
or if it is "Hight= 7.3232", the result should be 7.3232.
Using re.findall :
>>> import re
>>> out = []
>>> for s in l:
...     out.append(float(re.findall(r'\d+(?:\.\d+)?', s)[0]))
...
>>> out
=> [7.2, 8.0]
Or, without regex, using split,
>>> out = []
>>> for s in l:
...     # Remove whitespace so that `n = 2` and `n=2` are handled alike.
...     num = s.replace(' ', '').split('=')[1]
...     out.append(float(num))
...
>>> out
=> [7.2, 8.0]
#driver values :
IN : l = ["Hight = 7.2","baselength = 8"]
How about this
[(item.split('=')[0],float(item.split('=')[1]) ) for item in List]
Output :
[('Hight ', 7.2), ('baselength ', 8.32)]
A label associated with a value is best managed with a dictionary. However, if you must keep each label=value pair as an entry in a list, perhaps because you are reading it into Python from elsewhere, you can use the re module to extract the numeric value from each string in the list:
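import re
dims = ["height = 7.2", "length = 8.32"]  # renamed from `list`, which shadows the built-in
for dim in dims:
    # Escape the dot so it matches only a literal decimal point.
    print(float(re.search(r'\d+\.\d+', dim).group()))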
You could convert your list to a dictionary using a comprehension:
import re
List=["Height = 7.2", "baselength = 8.32"]
rx = re.compile(r'(?P<key>\w+)\s*=\s*(?P<value>\d+(?:\.\d+)?)')
Dict = {m.group('key'): float(m.group('value'))
        for item in List
        for m in [rx.search(item)]}
print(Dict)
# {'Height': 7.2, 'baselength': 8.32}
Afterwards, you can access your values with e.g. Dict["Height"] (here: 7.2).
It's very simple. Use this method for any type of value
List=["Hight = 7.2", "baselength = 8.32"]
# showing example for one value , but you can loop the entire list
a = List[0].split("= ")[1]  # take the first element and split on "= "
print(a)
7.2
Related
I got a list of strings. Those strings have all the two markers in. I would love to extract the string between those two markers for each string in that list.
example:
markers 'XXX' and 'YYY' --> therefore i want to extract 78665786 and 6866
['XXX78665786YYYjajk', 'XXX6866YYYz6767'....]
You can just loop over your list and grab the substring. You can do something like:
import re
my_list = ['XXX78665786YYYjajk', 'XXX6866YYYz6767']
output = []
for item in my_list:
    output.append(re.search('XXX(.*)YYY', item).group(1))
print(output)
Output:
['78665786', '6866']
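One thing to watch with this pattern: `(.*)` is greedy, so if the marker pair occurs more than once in a string it captures everything up to the last `YYY`. The sketch below uses a made-up input (not from the question) to compare it with the non-greedy `(.*?)` and with `re.findall`, which returns every pair.

```python
import re

# Hypothetical input: the marker pair appears twice in one string.
item = 'XXX111YYYabcXXX222YYYz'

# Greedy: runs up to the LAST 'YYY' in the string.
greedy = re.search(r'XXX(.*)YYY', item).group(1)

# Non-greedy: stops at the FIRST 'YYY' after 'XXX'.
lazy = re.search(r'XXX(.*?)YYY', item).group(1)

# findall with the non-greedy pattern returns every marked substring.
all_lazy = re.findall(r'XXX(.*?)YYY', item)

print(greedy)    # 111YYYabcXXX222
print(lazy)      # 111
print(all_lazy)  # ['111', '222']
```

For the strings in the question, where each marker appears once, both patterns give the same result.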
import re
l = ['XXX78665786YYYjajk', 'XXX6866YYYz6767'....]
l = [re.search(r'XXX(.*)YYY', i).group(1) for i in l]
This should work
Another solution would be:
import re
test_string=['XXX78665786YYYjajk','XXX78665783336YYYjajk']
int_val=[int(re.search(r'\d+', x).group()) for x in test_string]
The split() method splits a string into parts.
list1 = ['XXX78665786YYYjajk', 'XXX6866YYYz6767']
list2 = []
for i in list1:
    after_start = i.split("XXX")[1]             # text after the 'XXX' marker
    list2.append(after_start.split("YYY")[0])   # text before the 'YYY' marker
print(list2)
# ['78665786', '6866']
The results are saved in a list.
I have the string below, and I want to extract the lists, dicts, and variables from it.
How can I split this string into that format?
s = 'list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}'
import re
m1 = re.findall (r'(?=.*,)(.*?=\[.+?\],?)',s)
for i in m1:
    print('m1:', i)
Only the first result is correct.
Does anyone know how to do this?
m1: list_c=[1,2],
m1: a=3,b=1.3,c=abch,list_a=[1,2],
Split on '=' instead; then you can work out each variable name and its value.
You still need to handle type casting for the values (a regex, split, or try-based casting may help).
Also, as others have commented, a dict may be easier to handle.
s = 'list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}'
al = s.split('=')
var_l = [al[0]]
value_l = []
for a in al[1:-1]:
    var_l.append(a.split(',')[-1])
    value_l.append(','.join(a.split(',')[:-1]))
value_l.append(al[-1])
output = dict(zip(var_l, value_l))
print(output)
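The values in `output` are all still strings. As a rough sketch of the "try with casting" idea mentioned above, the hypothetical helper `cast_value` below (not part of the answer) tries `int`, then `float`, and otherwise leaves the string untouched. List-like and dict-like values such as `'[1,2]'` stay as strings here; parsing those would need extra handling.

```python
def cast_value(text):
    """Try int, then float; fall back to the original string."""
    for converter in (int, float):
        try:
            return converter(text)
        except ValueError:
            pass
    return text

# A few entries in the same shape as the `output` dictionary, written out literally.
raw = {'list_c': '[1,2]', 'a': '3', 'b': '1.3', 'c': 'abch'}
typed = {name: cast_value(value) for name, value in raw.items()}
print(typed)
# {'list_c': '[1,2]', 'a': 3, 'b': 1.3, 'c': 'abch'}
```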
You may have better luck if you more or less explicitly describe the right-hand side expressions: numbers, lists, dictionaries, and identifiers:
re.findall(r"([^=,]+)="              # LHS (no commas) and assignment operator
           +r"([+-]?\d+(?:\.\d+)?|"  # Numbers
           +r"[+-]?\d+\.|"           # More numbers
           +r"\[[^]]+\]|"            # Lists
           +r"{[^}]+}|"              # Dictionaries
           +r"[a-zA-Z_][a-zA-Z_\d]*)",  # Idents
           s)
# [('list_c', '[1,2]'), ('a', '3'), ('b', '1.3'), ('c', 'abch'),
# ('list_a', '[1,2]'), ('dict_a', '{a:2,b:3}')]
The answer is below:
import re
from pprint import pprint
s = 'list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1],Save,Record,dict_a={a:2,b:3}'
m1 = re.findall(r"([^=]+)=" # LHS and assignment operator
+r"([+-]?\d+(?:\.\d+)?|" # Numbers
+r"[+-]?\d+\.|" # More numbers
+r"\[[^]]+\]|" # Lists
+r"{[^}]+}|" # Dictionaries
+r"[a-zA-Z_][a-zA-Z_\d]*)", # Idents
s)
temp_d = {}
for i, j in m1:
    temp = i.strip(',').split(',')
    if len(temp) > 1:
        for k in temp[:-1]:
            temp_d[k] = ''
        temp_d[temp[-1]] = j
    else:
        temp_d[temp[0]] = j
pprint(temp_d)
The output is:
{'Record': '',
'Save': '',
'a': '3',
'b': '1.3',
'c': 'abch',
'dict_a': '{a:2,b:3}',
'list_a': '[1]',
'list_c': '[1,2]'}
Instead of picking out the types, you can start by capturing the identifiers. Here's a regex that captures all the identifiers in the string (for lowercase only, but see note):
regex = re.compile(r'([a-z]|_)+=')
#note if you want all valid variable names: r'([a-z]|[A-Z]|[0-9]|_)+'
cases = [x.group() for x in re.finditer(regex, s)]
This gives a list of all the identifiers in the string:
['list_c=', 'a=', 'b=', 'c=', 'list_a=', 'dict_a=']
We can now define a function that uses the above list to chop up s
one identifier at a time:
def chop(mystr, mylist):
    temp = mystr.partition(mylist[0])[2]
    cut = temp.find(mylist[1])  # strip leading bits
    return mystr.partition(mylist[0])[2][cut:], mylist[1:]

mystr = s[:]
temp = [mystr]
mylist = cases[:]
while len(mylist) > 1:
    mystr, mylist = chop(mystr, mylist)
    temp.append(mystr)
This (convoluted) slicing operation gives this list of strings:
['list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'list_a=[1,2],dict_a={a:2,b:3}',
'dict_a={a:2,b:3}']
Now cut off the ends using each successive entry:
result = []
for x in range(len(temp) - 1):
    cut = temp[x].find(temp[x+1]) - 1  # -1 to remove commas
    result.append(temp[x][:cut])
result.append(temp.pop())  # get the last item
Now we have the full list:
['list_c=[1,2]', 'a=3', 'b=1.3', 'c=abch', 'list_a=[1,2]', 'dict_a={a:2,b:3}']
Each element is easily parsable into key:value pairs (and is also executable via exec).
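As one possible way to do that parsing without `exec`, here is a minimal sketch that splits each element at its first `=` with `str.partition`. The names `items` and `pairs` are chosen only for this illustration, and the values are kept as strings.

```python
# The final list from above, written out literally so the sketch runs on its own.
items = ['list_c=[1,2]', 'a=3', 'b=1.3', 'c=abch', 'list_a=[1,2]', 'dict_a={a:2,b:3}']

pairs = {}
for item in items:
    # partition splits at the FIRST '=' only, so any later '=' would stay in the value.
    name, _, value = item.partition('=')
    pairs[name] = value

print(pairs)
```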
I would like to split my string at every second comma, but I can't manage it. Can you help me?
This is what I want: ['nb1,nb2','nb3,nb4','nb5,nb6']
Here is what I did:
a = 'nb1,nb2,nb3,nb4,nb5,nb6'
compteur = 0
for i in a:
    if i == ',':
        compteur += 1
        if compteur % 2 == 0:
            print(compteur)
            test = a.split(',', compteur % 2 == 0)
print(a)
print(test)
The result:
2
4
nb1,nb2,nb3,nb4,nb5,nb6
['nb1', 'nb2,nb3,nb4,nb5,nb6']
Thanks in advance for your answers.
You can use regex
In [12]: re.findall(r'([\w]+,[\w]+)', 'nb1,nb2,nb3,nb4,nb5,nb6')
Out[12]: ['nb1,nb2', 'nb3,nb4', 'nb5,nb6']
A quick fix could be to simply first separate the elements by commas and then join the elements by two together again. Like:
sub_result = a.split(',')
result = [','.join(sub_result[i:i+2]) for i in range(0,len(sub_result),2)]
This gives:
>>> result
['nb1,nb2', 'nb3,nb4', 'nb5,nb6']
This will also work if the number of elements is odd. For example:
>>> a = 'nb1,nb2,nb3,nb4,nb5,nb6,nb7'
>>> sub_result = a.split(',')
>>> result = [','.join(sub_result[i:i+2]) for i in range(0,len(sub_result),2)]
>>> result
['nb1,nb2', 'nb3,nb4', 'nb5,nb6', 'nb7']
You use a zip operation of the list with itself to create pairs:
a = 'nb1,nb2,nb3,nb4,nb5,nb6'
parts = a.split(',')
# parts = ['nb1', 'nb2', 'nb3', 'nb4', 'nb5', 'nb6']
pairs = list(zip(parts, parts[1:]))
# pairs = [('nb1', 'nb2'), ('nb2', 'nb3'), ('nb3', 'nb4'), ('nb4', 'nb5'), ('nb5', 'nb6')]
Now you can simply join every other pair again for your output:
list(map(','.join, pairs[::2]))
# ['nb1,nb2', 'nb3,nb4', 'nb5,nb6']
Split the string by comma first, then apply the common idiom to partition an iterable into sub-sequences of length n (where n is 2 in your case) with zip.
>>> s = 'nb1,nb2,nb3,nb4,nb5,nb6'
>>> [','.join(x) for x in zip(*[iter(s.split(','))]*2)]
['nb1,nb2', 'nb3,nb4', 'nb5,nb6']
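To see why the idiom works: `[iter(...)]*2` holds the same iterator object twice, so `zip` pulls two consecutive items for each tuple. The sketch below spells that out. It also uses a deliberately odd-length input (an assumption for illustration) to show that plain `zip` drops the leftover item, and that `itertools.zip_longest` is one way to keep it.

```python
from itertools import zip_longest

parts = 'nb1,nb2,nb3,nb4,nb5'.split(',')  # odd number of items on purpose

# Passing the SAME iterator twice makes zip consume two items per tuple.
it = iter(parts)
pairs = list(zip(it, it))
print(pairs)   # [('nb1', 'nb2'), ('nb3', 'nb4')]  (the trailing 'nb5' is dropped)

# zip_longest pads the last tuple with None instead of dropping it.
it = iter(parts)
padded = [','.join(p for p in pair if p is not None)
          for pair in zip_longest(it, it)]
print(padded)  # ['nb1,nb2', 'nb3,nb4', 'nb5']
```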
I am new to Python so I have lots of doubts. For instance I have a string:
string = "xtpo, example1=x, example2, example3=thisValue"
Is it possible to get the values after the equals signs for example1 and example3, knowing only the keywords and not what comes after the =?
You can use regex:
>>> import re
>>> strs = "xtpo, example1=x, example2, example3=thisValue"
>>> key = 'example1'
>>> re.search(r'{}=(\w+)'.format(key), strs).group(1)
'x'
>>> key = 'example3'
>>> re.search(r'{}=(\w+)'.format(key), strs).group(1)
'thisValue'
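If the keyword might contain regex metacharacters, or might not be present at all, a slightly more defensive variant could look like the sketch below. The helper name `value_for` is hypothetical. `re.escape` makes the key match literally, `\b` avoids matching the key as the tail of a longer word, and a missing key yields `None` instead of an `AttributeError`.

```python
import re

def value_for(key, text):
    """Return the word after `key=` in text, or None if the key has no value."""
    match = re.search(r'\b{}=(\w+)'.format(re.escape(key)), text)
    return match.group(1) if match else None

strs = "xtpo, example1=x, example2, example3=thisValue"
print(value_for('example1', strs))  # x
print(value_for('example3', strs))  # thisValue
print(value_for('example2', strs))  # None (the key is present but has no '=')
```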
Spacing things out for clarity
>>> Sstring = "xtpo, example1=x, example2, example3=thisValue"
>>> items = Sstring.split(',') # Get the comma separated items
>>> for i in items:
...     Pair = i.split('=')  # Try splitting on =
...     if len(Pair) > 1:  # Did split
...         print(Pair)  # or whatever you would like to do
...
[' example1', 'x']
[' example3', 'thisValue']
>>>
If I have a list of strings:
first = []
last = []
my_list = [' abc 1..23',' bcd 34..405','cda 407..4032']
How would I append the numbers on either side of the .. to their corresponding lists, to get:
first = [1,34,407]
last = [23,405,4032]
I wouldn't mind strings either, because I can convert to int later:
first = ['1','34','407']
last = ['23','405','4032']
Use re.search to match the numbers between .. and store them in two different groups:
import re
first = []
last = []
for s in my_list:
    match = re.search(r'(\d+)\.\.(\d+)', s)
    first.append(match.group(1))
    last.append(match.group(2))
I'd use a regular expression:
import re
num_range = re.compile(r'(\d+)\.\.(\d+)')
first = []
last = []
my_list = [' abc 1..23',' bcd 34..405','cda 407..4032']
for entry in my_list:
    match = num_range.search(entry)
    if match is not None:
        f, l = match.groups()
        first.append(int(f))
        last.append(int(l))
This outputs integers:
>>> first
[1, 34, 407]
>>> last
[23, 405, 4032]
One more solution.
for string in my_list:
    numbers = string.split(" ")[-1]
    first_num, last_num = numbers.split("..")
    first.append(first_num)
    last.append(last_num)
It will raise a ValueError if the text after the last space of a string does not contain exactly one "..", that is, if there is none or more than one.
In fact, this is a good thing if you want to be sure that values were really obtained from all the strings and that all of them were placed after the last space. You can even add a try/except block to do something when a string is in an unexpected format.
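A minimal sketch of that error handling follows. The extra `'bad entry'` string and the `skipped` list are assumptions added for illustration; here malformed strings are collected rather than aborting the loop.

```python
# The question's list plus one made-up malformed entry.
my_list = [' abc 1..23', ' bcd 34..405', 'cda 407..4032', 'bad entry']

first, last, skipped = [], [], []
for string in my_list:
    numbers = string.split(" ")[-1]
    try:
        # Unpacking fails unless there is exactly one ".." in `numbers`.
        first_num, last_num = numbers.split("..")
    except ValueError:
        skipped.append(string)  # remember the entry and move on
        continue
    first.append(first_num)
    last.append(last_num)

print(first)    # ['1', '34', '407']
print(last)     # ['23', '405', '4032']
print(skipped)  # ['bad entry']
```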
first=[(i.split()[1]).split("..")[0] for i in my_list]
second=[(i.split()[1]).split("..")[1] for i in my_list]