How to parse Python set-inside-list structure into text? - python

I have the following strange format I am trying to parse.
The data structure I am trying to parse is a "set" of key-value pairs in a list:
[{'key1:value1', 'key2:value2', 'key3:value3',...}]
That's the only data I have, and it needs to be processed. I don't think this can be described as a Python data structure, but I need to parse this to become a string like
'key1:value1, key2:value2, key3:value3'.
Is this doable?
EDIT: Yes, it is key:value, not key:value
Also, this is Python3.x

Iterating over .items() and formatting differently then previous answers.
If your data is the following: list of dict objects then
>>> data = [{'key1':'value1', 'key2':'value2', 'key3':'value3'}]
>>> ', '.join('{0}:{1}'.format(*item) for item in my_dict.items() for my_dict in data)
'key2:value2, key3:value3, key1:value1'
If you data is the list of set objects then approach is simpler
>>> from itertools import chain
>>> data = [{'key1:value1', 'key2:value2', 'key3:value3'}]
>>> ', '.join(chain.from_iterable(data))
'key1:value1, key2:value2, key3:value3'
UPD
NOTE: order can be changed, because set and dict objects are not ordered.

', '.join('{0}:{1}'.format(key, value) for key, value in my_dict.iteritems() for my_dict in my_list)
where my_list is name of your list variable

Since your structure (let's call it myStruct) is a set rather than a dict, the following code should do what you want:
result = ", ".join([x for x in myStruct[0]])
Beware, a set is not ordered, so you might end up with something like 'key2:value2, key1:value1, key3:value3'.

I would use itertools
d = {'key1':'value1', 'key2':'value2', 'key3':'value3'}
for k, v in d.iteritems():
print k + v + ',',

So, you have a list whose only element is a dictionary, and you want to get all the keys-value pairs from that dictionary and put them into a string?
If so, try something like this:
d = yourList[0]
s = ""
for key in d.keys():
s += key + ":" + d[key] + ", "
s = s[:-2] #To trim off the last comma that gets added

Try this,
print ', '.join(i for x in list_of_set for i in x)
If it's set there is no issue with the parsing. The code is equals to
output = ''
for x in list of_set:
for i in x:
output += i
print output

Related

put values to dictionary Python

I want to put the data extracted from a dictionary into another dictionary. I want tu put what I got from that loop into list_od_id = []
for item in album['tracks']['data']:
print(item['id'])
list_of_id = []
for item in album['tracks']['data']:
list_of_id.append(item['id'])
So What I think your trying to say is you want the data of the dictionary to be put in a list.
There are 2 ways of doing this:
1:
dict = {"key1":"value1","key2":"value2","key3":"value3"}
values = []
for i in dict:
values.append(dict[i])
print("values: " + values)
2:
dict = {"key1":"value1","key2":"value2","key3":"value3"}
values = []
for key, value in dict.items():
values.append(value)
print(key + ": " + value)
print(values)
To add a certain value to a list use:
yourList.append(yourValue)
For your Code that would be:
list_of_id =[]
for item in album['tracks']['data']:
list_of_id.append(item['id'])
print(list_of_id)
EDIT
As it seems you confused a list with a dictionary.
your list_of_id is has the datatype list.
That means it is a Set of values.
A dictionary on the over hand Looks Like this:
myDictionary= {"Key": "value"}
Here you have one value for one Key.
quick solution for create new list of item['id'].
list_of_id = [item['id'] for item in album['tracks']['data']]

How do I save changes made in a dictionary values to itself?

I have a dictionary where the values are a list of tuples.
dictionary = {1:[('hello, how are you'),('how is the weather'),('okay
then')], 2:[('is this okay'),('maybe It is')]}
I want to make the values a single string for each key. So I made a function which does the job, but I do not know how to get insert it back to the original dictionary.
my function:
def list_of_tuples_to_string(dictionary):
for tup in dictionary.values():
k = [''.join(i) for i in tup] #joining list of tuples to make a list of strings
l = [''.join(k)] #joining list of strings to make a string
for j in l:
ki = j.lower() #converting string to lower case
return ki
output i want:
dictionary = {1:'hello, how are you how is the weather okay then', 2:'is this okay maybe it is'}
You can simply overwrite the values for each key in the dictionary:
for key, value in dictionary.items():
dictionary[key] = ' '.join(value)
Note the space in the join statement, which joins each string in the list with a space.
It can be done even simpler than you think, just using comprehension dicts
>>> dictionary = {1:[('hello, how are you'),('how is the weather'),('okay then')],
2:[('is this okay'),('maybe It is')]}
>>> dictionary = {key:' '.join(val).lower() for key, val in dictionary.items()}
>>> print(dictionary)
{1: 'hello, how are you how is the weather okay then', 2: 'is this okay maybe It is'}
Now, let's go through the method
we loop through the keys and values in the dictionary with dict.items()
assign the key as itself together with the value as a string consisting of each element in the list.
The elemts are joined together with a single space and set to lowercase.
Try:
for i in dictionary.keys():
dictionary[i]=' '.join(updt_key.lower() for updt_key in dictionary[i])

Python Nested List Comprehension with If Else

I was trying to use a list comprehension to replace multiple possible string values in a list of values.
I have a list of column names which are taken from a cursor.description;
['UNIX_Time', 'col1_MCA', 'col2_MCA', 'col3_MCA', 'col1_MCB', 'col2_MCB', 'col3_MCB']
I then have header_replace;
{'MCB': 'SourceA', 'MCA': 'SourceB'}
I would like to replace the string values for header_replace.keys() found within the column names with the values.
I have had to use the following loop;
headers = []
for header in cursor.description:
replaced = False
for key in header_replace.keys():
if key in header[0]:
headers.append(str.replace(header[0], key, header_replace[key]))
replaced = True
break
if not replaced:
headers.append(header[0])
Which gives me the correct output;
['UNIX_Time', 'col1_SourceA', 'col2_SourceA', 'col3_SourceA', 'col1_SourceB', 'col2_SourceB', 'col3_SourceB']
I tried using this list comprehension;
[str.replace(i[0],k,header_replace[k]) if k in i[0] else i[0] for k in header_replace.keys() for i in cursor.description]
But it meant that items were duplicated for the unmatched keys and I would get;
['UNIX_Time', 'col1_MCA', 'col2_MCA', 'col3_MCA', 'col1_SourceA', 'col2_SourceA', 'col3_SourceA',
'UNIX_Time', 'col1_SourceB', 'col2_SourceB', 'col3_SourceB', 'col1_MCB', 'col2_MCB', 'col3_MCB']
But if instead I use;
[str.replace(i[0],k,header_replace[k]) for k in header_replace.keys() for i in cursor.description if k in i[0]]
#Bakuriu fixed syntax
I would get the correct replacement but then loose any items that didn't need to have an string replacement.
['col1_SourceA', 'col2_SourceA', 'col3_SourceA', 'col1_SourceB', 'col2_SourceB', 'col3_SourceB']
Is there a pythonesque way of doing this or am I over stretching list comprehensions? I certainly find them hard to read.
[str.replace(i[0],k,header_replace[k]) if k in i[0] for k in header_replace.keys() for i in cursor.description]
this is a SyntaxError, because if expressions must contain the else part. You probably meant:
[i[0].replace(k, header_replace[k]) for k in header_replace for i in cursor.description if k in i[0]]
With the if at the end. However I must say that list-comprehension with nested loops aren't usually the way to go.
I would use the expanded for loop. In fact I'd improve it removing the replaced flag:
headers = []
for header in cursor.description:
for key, repl in header_replace.items():
if key in header[0]:
headers.append(header[0].replace(key, repl))
break
else:
headers.append(header[0])
The else of the for loop is executed when no break is triggered during the iterations.
I don't understand why in your code you use str.replace(string, substring, replacement) instead of string.replace(substring, replacement). Strings have instance methods, so you them as such and not as if they were static methods of the class.
If your data is exactly as you described it, you don't need nested replacements and can boil it down to this line:
l = ['UNIX_Time', 'col1_MCA', 'col2_MCA', 'col3_MCA', 'col1_MCB', 'col2_MCB', 'col3_MCB']
[i.replace('_MC', '_Source') for i in l]
>>> ['UNIX_Time',
>>> 'col1_SourceA',
>>> 'col2_SourceA',
>>> 'col3_SourceA',
>>> 'col1_SourceB',
>>> 'col2_SourceB',
>>> 'col3_SourceB']
I guess a function will be more readable:
def repl(key):
for k, v in header_replace.items():
if k in key:
return key.replace(k, v)
return key
print map(repl, names)
Another (less readable) option:
import re
rx = '|'.join(header_replace)
print [re.sub(rx, lambda m: header_replace[m.group(0)], name) for name in names]

concatenate the values of dictionary into single string or sequence

i have a dictionary called self.__sequences reads like "ID:DNA sequence", and the following is part of that dictionary
{'1111758': ('TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA\n', ''),
'1111762': ('AGAGTTTGATCCTGGCTCAGATTGA\n', ''),
'1111763': ('AGAGTTTGATCCTGGCCTT\n', '') }
I want to concatenate the values of the dictionary into one single string or sequence (no \n and no ""), that is, I want something like
"TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAAAGAGTTTGATCCTGGCTCAGATTGAAGAGTTTGATCCTGGCCTT"
I write the following code, however, it does not give what I want. I guess it is because the value has two elements(DNA sequence and ""). I am struggling improving my code. Can anyone help me to make it work?
def sequence_statistics(self):
total_len=self.__sequences.values()[0]
for i in range(len(self.__sequences)):
total_len += self.__sequences.values()[i]
return total_len
This will iterate over the sorted keys of your sequences, extract the first value of the tuples in the dict and strip whitespaces. Mind that dicts are unordered in Python 2.7:
''.join(d[k][0].strip() for k in sorted(self.__sequences))
>>> d = {'1111758': ('TTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA\n', ''),
... '1111762': ('AGAGTTTGATCCTGGCTCAGATTGA\n', ''),
... '1111763': ('AGAGTTTGATCCTGGCCTT\n', '') }
>>>
>>> lis = []
>>> for tup in d.values():
... lis.append(tup[0].rstrip('\n'))
...
>>> ''.join(lis)
'AGAGTTTGATCCTGGCTCAGATTGAAGAGTTTGATCCTGGCCTTTTAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGCAGGCCTAA'
>>>
This is a generator that yields the first element of each value, with the "\n" stripped off:
(value[0].strip() for value in self.__sequences.values())
Since you probably want them sorted by keys, it becomes slightly more complicated:
(value[0].strip() for key, value in sorted(self.__sequences.items()))
And to turn that into a single string joined by '' (empty strings) in between, do:
''.join(value[0].strip() for key, value in sorted(self.__sequences.items()))
Try this code instead:
return "".join(v[0].strip() for k, v in self.__sequences.items())

python how to build a new list from this one

I have a list in the following format:
['CASE_1:a','CASE_1:b','CASE_1:c','CASE_1:d',
'CASE_2:e','CASE_2:f','CASE_2:g','CASE_2:h']
I want to create a new list which looks like like this:
['CASE_1:a,b,c,d','CASE_2:e,f,g,h']
Any idea how to get this done elegantly??
You can use a defaultdict by treating case as the key, and appending to the list each letter, where case and the letter are obtained by splitting the elements of your list on ':' - such as:
from collections import defaultdict
case_letters = defaultdict(list)
start = ['CASE_1:a','CASE_1:b','CASE_1:c','CASE_1:d', 'CASE_2:e','CASE_2:f','CASE_2:g','CASE_2:h']
for el in start:
case, letter = el.split(':')
case_letters[case].append(letter)
result = sorted('{case}:{letters}'.format(case=key, letters=','.join(values)) for key, values in case_letters.iteritems())
print result
As this is homework (edit: or was!!?) - I recommend looking at collections.defaultdict, str.split (and other builtin string methods), at the builtin type list and it's methods (such as append, extend, sort etc...), str.format, the builtin sorted method and generally a dict in general. Use the working example here along with the final manual for reference - all these things will come in handy later on - so it's in your best interest to understand them as best you can.
One other thing to consider is that having something like:
{1: ['a', 'b', 'c', 'd'], 2: ['e', 'f', 'g', 'h']}
is a lot more of a useful format and could be used to recreate your desired list afterwards anyway...
I've deleted my full solution since I realized this is homework, but here's the basic idea:
A dictionary is a better data structure. I would look at a collections.defaultdict. e.g.
yourdict = defaultdict(list)
You can iterate through your list (splitting each element on ':'). Something like:
#only split string once -- resulting in a list of length 2.
case, value = element.split(':',1)
Then you can add these to the dict using the list .append method:
yourdict[case].append(value)
Now, you'll have a dict which maps keys (Case_1, Case_2) to lists (['a','b','c','d'], [...]).
If you really need a list, you can sort the items of the dictionary and join appropriately.
sigh. It looks like the homework tag has been removed (here's my original solution):
from collections import defaultdict
d = defaultdict(list)
for elem in yourlist:
case, value = elem.split(':', 1)
d[case].append(value)
Now you have a dictionary as I described above. If you really want to get your list back:
new_lst = [ case+':'+','.join(values) for case,values in sorted(d.items()) ]
data = ['CASE_1:a','CASE_1:b','CASE_1:c','CASE_1:d', 'CASE_2:e','CASE_2:f','CASE_2:g','CASE_2:h']
output = {}
for item in data:
key, value = item.split(':')
if key not in output:
output[key] = []
output[key].append(value)
result = []
for key, values in output.items():
result.append('%s:%s' % (key, ",".join(values)))
print result
outputs
['CASE_2:e,f,g,h', 'CASE_1:a,b,c,d']
mydict = {}
for item in list:
key,value = item.split(":")
if key in mydict:
mydict[key].append(value)
else:
mydict[key] = [value]
[key + ":" + ",".join(value) for key, value in mydict.iteritems()]
Not much elegance, to be honest. You know, I'd store your list as a dict, cause it behaves as a dict in fact.
output is ['CASE_2:e,f,g,h', 'CASE_1:a,b,c,d']

Categories

Resources