Issue with Configobj-python and list items

I am trying to read an .ini file whose keywords hold either single items or lists. When I print single-item string and float values, they come out as h, e, l, l, o and 2, ., 1, whereas I expected just hello and 2.1. Also, when I write a new single-item string, float, or integer, a trailing comma is added. I am new to Python and to configobj. Any help is appreciated, and if this question has already been answered, please direct me to it. Thanks!
from configobj import ConfigObj
# Read
config = ConfigObj('para_file.ini')
para = config['Parameters']
print(", ".join(para['name']))
print(", ".join(para['type']))
print(", ".join(para['value']))
# Write
new_names = 'hello1'
para['name'] = [x.strip(' ') for x in new_names.split(",")]
new_types = '3.1'
para['type'] = [x.strip(' ') for x in new_types.split(",")]
new_values = '4'
para['value'] = [x.strip(' ') for x in new_values.split(",")]
config.write()
My para_file.ini looks like this,
[Parameters]
name = hello1
type = 2.1
value = 2

There are two parts to your question.
Options in ConfigObj can be either a string, or a list of strings.
[Parameters]
name = hello1 # This will be a string
pets = Fluffy, Spot # This will be a list with 2 items
town = Bismark, ND # This will also be a list of 2 items!!
alt_town = "Bismark, ND" # This will be a string
opt1 = foo, # This will be a list of 1 item (note the trailing comma)
So, if you want something to appear as a list in ConfigObj, you must make sure it includes a comma. A list of one item must have a trailing comma.
In Python, strings are iterable. So, even though they are not a list, they can be iterated over. That means in an expression like
print(", ".join(para['name']))
The string para['name'] will be iterated over character by character, yielding 'h', 'e', 'l', 'l', 'o', '1', which Python dutifully joins together with ", ", producing
h, e, l, l, o, 1
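One way to avoid the character-by-character join is to normalize each value to a list before joining. The sketch below is only an illustration: as_list is a hypothetical helper (not part of ConfigObj), and the plain dict stands in for the section that ConfigObj would return.

```python
def as_list(value):
    """Return value as a list: wrap a lone string, copy an existing list."""
    if isinstance(value, str):
        return [value]
    return list(value)

# Stand-in for a ConfigObj section: a single item is a str, several items are a list.
para = {"name": "hello1", "pets": ["Fluffy", "Spot"]}

print(", ".join(as_list(para["name"])))  # hello1
print(", ".join(as_list(para["pets"])))  # Fluffy, Spot
```

The same check works in the other direction: when writing, assign a plain string for a single item and a list only when there really are several items.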

Related

pattern match get list and dict from string

I have the string below, and I want to extract the lists, dicts, and plain variables from it.
How can I split this string into that format?
s = 'list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}'
import re
m1 = re.findall(r'(?=.*,)(.*?=\[.+?\],?)', s)
for i in m1:
    print('m1:', i)
Only the first result is correct.
Does anyone know how to do this?
m1: list_c=[1,2],
m1: a=3,b=1.3,c=abch,list_a=[1,2],

Split on '=' instead; then you can work with each variable name and its value.
You still need to handle type casting for the values (a regex, split, or try/except around the cast may help).
Also, as others commented, a dict may be easier to work with.
s = 'list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}'
al = s.split('=')
var_l = [al[0]]
value_l = []
for a in al[1:-1]:
    var_l.append(a.split(',')[-1])
    value_l.append(','.join(a.split(',')[:-1]))
value_l.append(al[-1])
output = dict(zip(var_l, value_l))
print(output)
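The type casting mentioned above could be sketched as follows. This is a rough illustration, not a complete parser: cast_value is a hypothetical helper, it assumes flat values like those in the question, and it splits dictionaries by hand because their keys are bare identifiers that ast.literal_eval would reject.

```python
import ast

def cast_value(text):
    """Best-effort conversion of a raw value string to a Python object."""
    text = text.strip()
    if text.startswith("{") and text.endswith("}"):
        # Keys are bare identifiers (a, b), so split "key:value" pairs by hand.
        inner = text[1:-1]
        pairs = (item.split(":", 1) for item in inner.split(",") if item)
        return {key.strip(): cast_value(val) for key, val in pairs}
    try:
        return ast.literal_eval(text)  # ints, floats, lists of literals
    except (ValueError, SyntaxError):
        return text  # anything else stays a string

# The name/value strings produced by splitting on '='.
raw = {'list_c': '[1,2]', 'a': '3', 'b': '1.3', 'c': 'abch', 'dict_a': '{a:2,b:3}'}
typed = {name: cast_value(value) for name, value in raw.items()}
print(typed)
```

Nested values that contain commas inside a dictionary would need a real parser; the hand split here is only good enough for the flat case.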

You may have better luck if you more or less explicitly describe the right-hand side expressions: numbers, lists, dictionaries, and identifiers:
re.findall(r"([^=,]+)="                   # LHS and assignment operator
           + r"([+-]?\d+(?:\.\d+)?|"      # Numbers
           + r"[+-]?\d+\.|"               # More numbers
           + r"\[[^]]+\]|"                # Lists
           + r"{[^}]+}|"                  # Dictionaries
           + r"[a-zA-Z_][a-zA-Z_\d]*)",   # Idents
           s)
# [('list_c', '[1,2]'), ('a', '3'), ('b', '1.3'), ('c', 'abch'),
# ('list_a', '[1,2]'), ('dict_a', '{a:2,b:3}')]

The answer looks like this:
import re
from pprint import pprint
s = 'list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1],Save,Record,dict_a={a:2,b:3}'
m1 = re.findall(r"([^=]+)=" # LHS and assignment operator
+r"([+-]?\d+(?:\.\d+)?|" # Numbers
+r"[+-]?\d+\.|" # More numbers
+r"\[[^]]+\]|" # Lists
+r"{[^}]+}|" # Dictionaries
+r"[a-zA-Z_][a-zA-Z_\d]*)", # Idents
s)
temp_d = {}
for i, j in m1:
    temp = i.strip(',').split(',')
    if len(temp) > 1:
        for k in temp[:-1]:
            temp_d[k] = ''
        temp_d[temp[-1]] = j
    else:
        temp_d[temp[0]] = j
pprint(temp_d)
The output is:
{'Record': '',
 'Save': '',
 'a': '3',
 'b': '1.3',
 'c': 'abch',
 'dict_a': '{a:2,b:3}',
 'list_a': '[1]',
 'list_c': '[1,2]'}

Instead of picking out the types, you can start by capturing the identifiers. Here's a regex that captures all the identifiers in the string (for lowercase only, but see note):
regex = re.compile(r'([a-z]|_)+=')
#note if you want all valid variable names: r'([a-z]|[A-Z]|[0-9]|_)+'
cases = [x.group() for x in re.finditer(regex, s)]
This gives a list of all the identifiers in the string:
['list_c=', 'a=', 'b=', 'c=', 'list_a=', 'dict_a=']
We can now define a function to sequentially chop up s using the
above list to partition the string sequentially:
def chop(mystr, mylist):
    temp = mystr.partition(mylist[0])[2]
    cut = temp.find(mylist[1])  # strip leading bits
    return temp[cut:], mylist[1:]
mystr = s[:]
temp = [mystr]
mylist = cases[:]
while len(mylist) > 1:
    mystr, mylist = chop(mystr, mylist)
    temp.append(mystr)
This (convoluted) slicing operation gives this list of strings:
['list_c=[1,2],a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'a=3,b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'b=1.3,c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'c=abch,list_a=[1,2],dict_a={a:2,b:3}',
'list_a=[1,2],dict_a={a:2,b:3}',
'dict_a={a:2,b:3}']
Now cut off the ends using each successive entry:
result = []
for x in range(len(temp) - 1):
    cut = temp[x].find(temp[x+1]) - 1  # -1 to remove commas
    result.append(temp[x][:cut])
result.append(temp.pop()) #get the last item
Now we have the full list:
['list_c=[1,2]', 'a=3', 'b=1.3', 'c=abch', 'list_a=[1,2]', 'dict_a={a:2,b:3}']
Each element is easily parsable into key:value pairs (and is also executable via exec).
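As a small illustration of that last step, each 'name=value' string can be split once on '=' to build a dictionary. The variable names below are hypothetical. Note that passing these strings to exec would only work where the right-hand side is valid Python on its own: 'c=abch' raises NameError unless a name abch already exists.

```python
items = ['list_c=[1,2]', 'a=3', 'b=1.3', 'c=abch', 'list_a=[1,2]', 'dict_a={a:2,b:3}']

# Split on the first '=' only, so any later '=' would stay in the value.
pairs = dict(item.split("=", 1) for item in items)

for name, value in pairs.items():
    print(name, "->", value)
```

The values are still strings at this point; converting them to numbers, lists, or dicts is a separate step.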

get an element out of a string python

I am trying to use line.strip() and line.split() to get the elements out of a file, but this always gives me a list of strings. Does line.split() always return strings? How can I get a list of elements instead of a list of 'elements'?
myfile = open('myfile.txt','r')
for line in myfile:
    line_strip = line.strip()
    myline = line_strip.split(' ')
    print(myline)
So my code gives me ['hello','hi'].
I want to get a list out of the file that looks like [hello,hi].
[2.856,9.678,6.001] 6 Mary
[8.923,3.125,0.588] 7 Louis
[7.122,9.023,4,421] 16 Ariel
So when I try
list = []
list.append((mylist[0][0],mylist[0][1]))
I actually want list = [(2.856,9.678),(8.923,3.125),(7.122,9.023)],
but it seems that mylist[0][0] refers to the '[' in my file.

my_string = 'hello'
my_list = list(my_string) # ['h', 'e', 'l', 'l', 'o']
my_new_string = ''.join(my_list) # 'hello'

I think you are looking for this
>>> print("[{}]".format(", ".join(data)))
[1, 2, 3]
To address your questions, though:
"this always gives me a list of string,"
Right, as str.split() should.
"does line.split() always return a string?"
Assuming type(line) == str, then no: it returns a list of the strings produced by splitting the line.
"how can I just get a list of elements instead of a list of 'elements'?"
Your "elements" are strings. The ' marks are only part of Python's repr of a str.
For example...
print('4') # 4
print(repr('4')) # '4'
line = "1,2,3"
data = line.split(",")
print(data) # ['1', '2', '3']
You can cast to a different data-type as you wish
print([float(x) for x in data]) # [1.0, 2.0, 3.0]

For what you posted, use a regex:
>>> s="[2.856,9.678,6.001] 6 Mary"
>>> import re
>>> [float(e) for e in re.search(r'\[([^\]]+)',s).group(1).split(',')]
[2.856, 9.678, 6.001]
For all the lines you posted (and this would be similar to a file) you might do:
>>> txt="""\
... [2.856,9.678,6.001] 6 Mary
... [8.923,3.125,0.588] 7 Louis
... [7.122,9.023,4,421] 16 Ariel"""
>>> for line in txt.splitlines():
...     print([float(e) for e in re.search(r'\[([^\]]+)',line).group(1).split(',')])
...
[2.856, 9.678, 6.001]
[8.923, 3.125, 0.588]
[7.122, 9.023, 4.0, 421.0]
You would need to add error handling to that (for instance, if the match fails), but this is the core of what you are looking for.
BTW: Don't use list as a variable name. You will shadow the built-in list type and get confusing errors later.
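Putting those pieces together, the list of tuples the question asks for could be built along these lines. This is a sketch under assumptions: first_two and BRACKETED are hypothetical names, lines without a bracketed group are simply skipped, and a non-numeric entry inside the brackets would still raise ValueError.

```python
import re

BRACKETED = re.compile(r"\[([^\]]+)\]")

def first_two(line):
    """Return the first two bracketed numbers as a tuple, or None if absent."""
    match = BRACKETED.search(line)
    if match is None:
        return None
    numbers = [float(part) for part in match.group(1).split(",")]
    if len(numbers) < 2:
        return None
    return (numbers[0], numbers[1])

lines = [
    "[2.856,9.678,6.001] 6 Mary",
    "[8.923,3.125,0.588] 7 Louis",
    "no brackets here",
]
pairs = [pair for pair in map(first_two, lines) if pair is not None]
print(pairs)  # [(2.856, 9.678), (8.923, 3.125)]
```

With a real file, the lines list would be replaced by iterating over the open file object.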

line.split() returns a list of strings.
For example:
my_string = 'hello hi'
my_string.split(' ') is equal to ['hello', 'hi']
To put a list of strings, like ['hello', 'hi'], back together, use join.
For example, ' '.join(['hello', 'hi']) is equal to 'hello hi'. The ' ' specifies to put a space between all the elements in the list that you are joining.

how to print all the elements of the split in a generic way in python

I want to print all the elements of a split, whether the result is a single-element list or has many elements. The number of splits is only known at run time.
For example-
x = "abc;bcd;def"
x1 = x.split(";")
print(x1[0], x1[1], x1[2])
However, x could sometimes be just x = "abc"; in that case x1[1] and x1[2] are invalid and it breaks my code. Is there a generic way to print the results of the split, irrespective of the number of splits that happen?
I also want to print like this-
print(blah1, x1[0], blah2)- when the split results in only one element
If split results in more than one element, it will print each additional line for the additional element-
print(blah1, x1[0], blah2) #for element0 of the split
print(blah21, x1[1], blah21) #for element1 of the split
and so on (for every additional element of the split that is generated)..

You can use str.join() to join back the split strings with whatever delimiter you want for printing. Example -
>>> x = "abc,bcd,def"
>>> x1 = x.split(',')
>>> print(' '.join(x1))
abc bcd def

Pass the list to print() as separate arguments using the *args syntax:
print(*x1)
This expands the elements, however many there are, to pass them as individual arguments to the print() function.
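If each element should go on its own line between fixed text, as the question asks, a loop over the split result handles any number of parts. A minimal sketch; print_parts and its before/after parameters are hypothetical names standing in for the question's blah1 and blah2:

```python
def print_parts(text, sep=";", before="start", after="end"):
    """Print each piece of text on its own line, wrapped in fixed text."""
    parts = text.split(sep)
    for part in parts:
        print(before, part, after)
    return parts

print_parts("abc")          # prints one line
print_parts("abc;bcd;def")  # prints three lines
```

Because the loop runs once per element, no index is ever hard-coded and a single-element result needs no special case.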

Just print the result of the split() call:
x = "abc;bcd;def"
x1 = x.split(";")
print(x1)
It will work with any number of parts:
>>> print("a;b;c".split(";"))
['a', 'b', 'c']
>>> print("a;b".split(";"))
['a', 'b']
>>> print("a".split(";"))
['a']
>>> print("".split(";"))
['']

Error handling numpy.float?

I am working on a csv file using python.
I wrote the following script to treat the file:
import pickle
import numpy as np
from csv import reader, writer
dic1 = {'a': 2, 'b': 2, 'c': 2}
dic2 = {'a': 2,'b': 2,'c': 0}
number = dict()
for k in dic1:
    number[k] = dic1[k] + dic2[k]
ctVar = {'a': [0.093323751331788565, -1.0872670058072453, '', 8.3574590513050264], 'b': [0.053169909627947334, -1.0825742255395172, '', 8.0033788558001984], 'c': [-0.44681777279768059, 2.2380488442495348]}
Var = {}
for k in number:
    Var[k] = number[k]
def findIndex(myList, number):
    n = str(number)
    m = len(n)
    for elt in myList:
        e = str(elt)
        l = len(e)
        mi = min(m,l)
        if e[:mi-1] == n[:mi-1]:
            return myList.index(elt)
def sortContent(myList):
    if '' in myList:
        result = ['']
        myList.remove('')
    else:
        result = []
    myList.sort()
    result = myList + result
    return result
An extract of the csv file follows. (Note: the blanks are important. To improve readability I wrote them as BL, but they should just be empty cells.)
The columns contain few elements (including '') repeated many times.
First column:
a
0.0933237513
-1.0872670058
0.0933237513
BL
BL
0.0933237513
0.0933237513
0.0933237513
BL
Second column:
b
0.0531699096
-1.0825742255
0.0531699096
BL
BL
0.0531699096
0.0531699096
0.0531699096
BL
Third column:
c
-0.4468177728
2.2380488443
-0.4468177728
-0.4468177728
-0.4468177728
-0.4468177728
-0.4468177728
2.2380488443
2.2380488443
I have only posted an extract of the code (the part where I am facing a problem), so its purpose may not be obvious. Basically, it is part of a larger script that I use to modify this csv file and encode it differently.
In this extract, I am trying at some point (line 68) to sort elements of a list that contains numbers and ''.
When I remove the line that does this, the elements printed are those of each column (without any repetition).
The problem is that, when I try to sort them, the '' are no longer taken into account. Yet, when I tested my function sortContent with lists that have '', it worked perfectly.
I thought this problem was related to the use of numpy.float64 elements in my list. So I converted all these elements into floats, but the problem remains.
Any help would be greatly appreciated!

I assume you mean to use sortContent on something else (as obviously if you want the values in your predefined lists in ctVar in a certain order, you can just put them in order in your code rather than sorting them at runtime).
Let's go through your sortContent piece by piece.
if '' in myList:
    result = ['']
    myList.remove('')
If the list object passed in (let's call it list 1) contains '', create a new list object (let's call it list 2) holding just '', and remove the first instance of '' from list 1.
myList.sort()
Now, sort the contents of list 1.
result = myList + result
Now create a new list object (call it list 3) with the contents of list 1 and list 2.
return result
Keep in mind that list 1 (the list object that was passed in) still has the '' removed.
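If that side effect is the source of the trouble, one option is a variant that never modifies its argument. The sketch below is an assumption about what is wanted, not a drop-in replacement: sort_content is a hypothetical name, and unlike the original it keeps every '' (not just one) at the end of the result.

```python
def sort_content(values):
    """Return a new list: non-blank values sorted, then every '' at the end."""
    blanks = [v for v in values if v == ""]
    numbers = sorted(v for v in values if v != "")
    return numbers + blanks

column = [0.0933237513, -1.0872670058, "", 8.3574590513, ""]
print(sort_content(column))  # numbers in ascending order, then the blanks
print(column)                # the input list is left untouched
```

Separating the blanks before sorting also avoids comparing strings with floats, which Python 3 refuses to do.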

Use of slice command with split command?

fh=open('asd.txt')
data=fh.read()
fh.close()
name=data.split('\n')[0][1:]
seq=''.join(data.split('\n')[1:])
print(name)
print(seq)
In this code, the 3rd line means "take only the first line with its first character removed", while the 4th line means "skip the first line and join the remaining lines".
I cannot follow the logic of these two lines.
Can anyone explain to me how these two slice operators ([0][1:]) are used together?
Thanks
Edited: renamed the file variable (file is also a built-in name) to data.

Think of it like this: file.split('\n') gives you a list of strings. So the first indexing operation, [0], gives you the first string in the list. Now, that string itself is a "list" of characters, so you can then do [1:] to get every character after the first. It's just like starting with a two-dimensional list (a list of lists) and indexing it twice.

When confused by a complex expression, do it in steps.
>>> data.split('\n')[0][1:]
>>> data
>>> data.split('\n')
>>> data.split('\n')[0]
>>> data.split('\n')[0][1:]
That should help.

Let's do it in steps (I think I know what name and seq are):
>>> file = ">Protein kinase\nADVTDADTSCVIN\nASHRGDTYERPLK"  # what you get from reading your (FASTA) file
>>> lines = file.split('\n')  # make a list of lines
>>> line_0 = lines[0]  # take the first line (indices start at 0)
>>> name = line_0[1:]  # give me the items of the line in [x:y] (from x to y)
>>> name
'Protein kinase'
>>>
>>> file = ">Protein kinase\nADVTDADTSCVIN\nASHRGDTYERPLK"
>>> lines = file.split('\n')
>>> seqs = lines[1:]  # give me lines [x:y] (from x to y)
>>> seq = ''.join(seqs)
>>> seq
'ADVTDADTSCVINASHRGDTYERPLK'
>>>
In a slice [x:y], x is included and y is not. To go all the way to the end of the list, simply omit y: [x:] means from the item at index x to the end.

Each set of [] operates on the result of the expression before it (first the list that split returns, then the string picked out of that list), and the resulting
list or string is then used without assigning it to another variable first.
Break down the third line like this:
lines = file.split('\n')
first_line = lines[0]
name = first_line[1:]
Break down the fourth line like this:
lines = file.split('\n')
all_but_first_line = lines[1:]
seq = ''.join(all_but_first_line)

Take this as an example:
myl = [["hello","world","of","python"],["python","is","good"]]
Here myl is a list of lists. myl[0] is the first element of the outer list, which is ['hello', 'world', 'of', 'python']. myl[0][1:] takes that first element and then selects every item in it except the first, so the output is ['world', 'of', 'python'].
