Trying to split a list of dictionaries


Hello, I am trying to compare this list of dictionaries (call this list Animals):
[{'Fishs=16': 'Fishs=16',
'Birds="6"': 'Birds="6"',
'Dogs=5': 'Dogs=5',
'Bats=10': 'Bats=10',
'Tigers=11': 'Tigers=11',
'Cats=4': 'Cats=4'},
{'Cats=40': 'Cats=40',
'Tigers': 'Tigers = 190',
'Birds=4': 'Birds=4',
'Bats': 'Bats = Null',
'Fishs': 'Fishs = 24',
'Dogs': 'Dogs = 10'}]
I want to make the list look like this:
[{'Tigers': 'Tigers=11',
'Dogs': 'Dogs=5',
'Cats': 'Cats=4',
'Bats': 'Bats=10',
'Fishs': 'Fishs=16',
'Birds': 'Birds="6"'},
{'Tigers': 'Tigers=190',
'Dogs': 'Dogs=10',
'Cats': 'Cats=40',
'Bats': 'Bats=Null',
'Fishs': 'Fishs=24',
'Birds': 'Birds=4'}]
so that I can compare it to this other dictionary:
{'Tigers': '19',
'Dogs': '10',
'Cats': '40',
'Bats': '10',
'Fishs': '234',
'Birds': '3'}
Here's the code I've tried to use in order to split the list:
animals = []
for d in setData:
    animals.append({k: v.split('=')[1] for k, v in d.items()})
However, it will not split the list the way I want, since the keys in my dictionaries are in the format Dogs=4 rather than Dogs = 4. I need to be able to split this list even when the entries are in that format.
On a completely different side note, once this part of the code is fixed, I need to figure out how to compare the data from these keys against each other.
For example: let's say I have Dogs="23" and the compared list has Dogs="50". According to my code this should be incorrect, but because of the quotes ("23") it reports a match; it does not compare the value inside the quotes. This is the code I have for the comparison:
correct_parameters = dict(re.match(r'(\w*)="?(\d*)"?', s).group(1, 2) for s in dataDefault[1:])
print correct_parameters
count = 0
while (count < (len(setNames))):
    for number, item in enumerate(animals, 1):
        print setNames[count]
        count = count + 1
        for param, correct in correct_parameters.items():
            if item[param] == correct:
                print('{} = {} which is correct'.format(param, correct))
However, for now I am just trying to fix the list split issue I am having.
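For the comparison side note, one possible approach is to parse each entry into a (name, value) pair and drop the optional quotes before comparing. The sketch below is only an illustration: the pattern PAIR, the helper parse and the expected dictionary are made-up names, and it assumes every entry looks like Name=value, Name = value or Name="value".

```python
import re

# Hypothetical pattern: a name, '=', then a value with optional double quotes.
PAIR = re.compile(r'\s*(\w+)\s*=\s*"?([^"]*?)"?\s*$')

def parse(entry):
    """Return (name, value) from strings such as 'Dogs="23"' or 'Dogs = 23'."""
    match = PAIR.match(entry)
    if match is None:
        raise ValueError('not a name=value entry: {!r}'.format(entry))
    return match.group(1), match.group(2)

# Assumed reference values, in the same shape as the asker's comparison dict.
expected = {'Dogs': '50', 'Cats': '40'}
for entry in ['Dogs="23"', 'Cats = 40']:
    name, value = parse(entry)
    status = 'correct' if expected.get(name) == value else 'incorrect'
    print('{} = {} is {}'.format(name, value, status))
```

Because the quotes are stripped during parsing, "23" and 23 compare as the same string.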

lst = [{'Fishs=16': 'Fishs=16', 'Birds="6"': 'Birds="6"', 'Dogs=5': 'Dogs=5', 'Bats=10': 'Bats=10', 'Tigers=11': 'Tigers=11', 'Cats=4': 'Cats=4'}, {'Cats=40': 'Cats=40', 'Tigers': 'Tigers = 190', 'Birds=4': 'Birds=4', 'Bats': 'Bats = Null', 'Fishs': 'Fishs = 24', 'Dogs': 'Dogs = 10'}]
# for each element in that list, loop with index
for idx, val in enumerate(lst):
    # create temp object
    o = {}
    # loop the dictionary (items() works on Python 2 and 3)
    for k, v in val.items():
        # if = is found in key
        if '=' in k:
            # change the key
            k = k.split('=')[0]
        # insert to temp object
        o[k] = v
    # change the temp object to current element in the list
    lst[idx] = o

Just strip the string after you split it:
>>> [v.strip() for v in " a = b ".split('=')]
['a', 'b']
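As a rough sketch of how the strip-after-split idea could be applied to the whole list, the snippet below rebuilds every dictionary so that the keys are bare names and the values have no spaces around the equals sign. The function name normalize and the shortened sample data are assumptions for illustration; it also assumes every value has the form Name=value or Name = value.

```python
def normalize(dicts):
    """Rebuild each dict with bare-name keys and 'Name=value' values."""
    result = []
    for d in dicts:
        cleaned = {}
        for value in d.values():
            # partition splits on the first '=' only
            name, _, raw = value.partition('=')
            name = name.strip()
            cleaned[name] = '{}={}'.format(name, raw.strip())
        result.append(cleaned)
    return result

# Shortened sample in the same shape as the Animals list.
animals = [{'Dogs=5': 'Dogs=5', 'Birds="6"': 'Birds="6"'},
           {'Tigers': 'Tigers = 190', 'Bats': 'Bats = Null'}]
print(normalize(animals))
```

The keys are derived from the values rather than from the original keys, so it does not matter whether a key was stored as Dogs=5 or Dogs.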

Related

Fetch value from python dictionary and pass one by one

I have a dictionary as shown below.
a={'name':['test1','test2'],'regno':['123','345'],'subject':
['maths','science'],'standard':['3','4']}
I need to verify the following things.
The number of values under each key should match.
Fetch the values from each key one by one and pass them to my other function one set at a time.
name = 'test1' regno = '123' subject='maths' standard='3'
name = 'test2' regno = '345' subject='science' standard='4'
I have tried the code below, but I am stuck on finding the exact way.
a={'name':['test1','test2'],'regno':['123','345'],'subject':['maths','science'],'standard':['3','4']}
lengths = [len(v) for v in a.values()]
if len(set(lengths)) <= 1:
    print('All values are same')
else:
    print('All values are not same')
I need your help to fetch the values one by one from each key and pass them to a function.

Try looping over your dictionary items and then over the lists in values:
for key, vals_list in a.items():
    if len(set(vals_list)) <= 1:
        print(f'{key}: All values are same!')
    # Will do nothing if `vals_list` is empty
    for value in vals_list:
        your_other_func(value)

You can get it done this way:
a={'name':['test1','test2'],'regno':['123','345'],'subject':
['maths','science'],'standard':['3','4']}
w = [{'name':a['name'][i], 'regno':a['regno'][i], 'subject':a['subject'][i], 'standard':a['standard'][i]} for i
in range(len(a['name']))]
for x in range(len(w)):
    #your_func(w[x]['name'], w[x]['regno'], w[x]['subject'], w[x]['standard'])
    print(w[x]['name'], w[x]['regno'], w[x]['subject'], w[x]['standard'])

I would rebuild a into a list of dictionaries, and then use dict-unpacking to dynamically give the dictionary to the function instead:
def func(name, regno, subject, standard):
    print("name={}, regno={}, subject={}, standard={}".format(name, regno, subject, standard))
a={'name':['test1','test2'],'regno':['123','345',],'subject':
['maths','science'],'standard':['3','4']}
new_a = [dict(zip(a.keys(), x)) for x in zip(*a.values())]
print(new_a)
for d in new_a:
    func(**d)
Output:
[{'name': 'test1', 'regno': '123', 'subject': 'maths', 'standard': '3'}, {'name': 'test2', 'regno': '345', 'subject': 'science', 'standard': '4'}]
name=test1, regno=123, subject=maths, standard=3
name=test2, regno=345, subject=science, standard=4
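The two requirements from the question (check that all lists have the same length, then pass one set of values at a time) could also be combined in one small generator. This is only a sketch: rows and show are hypothetical names, and show stands in for the asker's other function.

```python
def rows(columns):
    """Yield one dict per position, after checking all lists have equal length."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError('value lists differ in length: {}'.format(sorted(lengths)))
    for combo in zip(*columns.values()):
        yield dict(zip(columns.keys(), combo))

def show(name, regno, subject, standard):
    # Hypothetical stand-in for the asker's "other function".
    return 'name={} regno={} subject={} standard={}'.format(name, regno, subject, standard)

a = {'name': ['test1', 'test2'], 'regno': ['123', '345'],
     'subject': ['maths', 'science'], 'standard': ['3', '4']}
for row in rows(a):
    print(show(**row))
```

Raising an error on mismatched lengths matters here because zip would otherwise silently stop at the shortest list.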

Create a dict from a list of tuples by summing values for the same keys [duplicate]

I have this list:
[(2018, '2', '172767270', '202', 'gege', 'French'),
(2012, '212', '56007072', '200', 'cdadcadc', 'Minangkabou'),
(2013, 'J21', '186144990', '200', 'sacacs', 'Latin'),
...
]
I want the output to be a dictionary based on the key in the last column and the sum of values in the 3rd column.
E.g. for (172767270, French) and (1374767888, French) with their sum 172767270 + 1374767888 = 1547535158 the dictionary would have the following key-value pair:
dic = {'French': 1547535158, ...}
and the final result would be something like:
dic = {'French': 324213424, 'Latin': 34234242, ...}

rows = [] #define list here (not named list, to avoid shadowing the built-in)
dict_out = {} #output dictionary
def get_sum(name):
    summed = 0
    for value in rows:
        if value[-1] == name:
            summed += int(value[2])
    return summed
for value in rows:
    if value[-1] not in dict_out:
        dict_out[value[-1]] = get_sum(value[-1])

I'm assuming that you have a list of tuples. As you've mentioned, we don't need to import any modules. Use the dict.get() method, which returns the value of a key if it is present and a default (here 0) if it is absent.
So, for example, if 'French' is not in the dictionary, .get() will return 0; otherwise it will return the value associated with 'French'.
Then we can simply add the value of the third column to the value returned by .get().
totals={}
for tup in lst:
    totals[tup[5]]=totals.get(tup[5],0)+ int(tup[2])
#to get top 5 values
top5={}
for i in sorted(totals, key=totals.get, reverse=True)[:5]:
    top5[i]=totals[i]

First we have to add all the values based on language.
lang = [(2018, '2', '172767270', '202', 'gege', 'French'),(2012, '212', '56007072', '200', 'cdadcadc', 'Minangkabou'),(2013, 'J21', '186144990', '200', 'sacacs', 'Latin')]
dic = {}
for l in lang:
    dic[l[5]] = dic.get(l[5], 0) + int(l[2])
Now we have a dictionary with the sum of the 3rd column for each language. Now let's sort it by value to get the top 5.
dic2 = dict(sorted(dic.items(), key=lambda kv: kv[1], reverse=True)[:5])
Now dic2 has only the top 5 languages with the highest 3rd-column sum.

If you want to have the sum of the third column based on language, then:
from collections import defaultdict
d = defaultdict(int)
l = [(2018, '2', '172767270', '202', 'gege', 'French'),(2018, '2', '172763270', '202', 'gege', 'English'),(2018, '2', '17167270', '202', 'gege', 'Spanish'),
(2012, '212', '56007072', '200', 'cdadcadc', 'Minangkabou'),(2018, '2', '1727672', '202', 'gege', 'Arabic'),(2013, 'J21', '186144990', '200', 'sacacs', 'Latin'),(2017, '2', '1374767888', '202', 'gege', 'French')]
for elem in l:
    d[elem[5]]+= int(elem[2])
d
Output:
defaultdict(int,
{'Arabic': 1727672,
'English': 172763270,
'French': 1547535158,
'Latin': 186144990,
'Minangkabou': 56007072,
'Spanish': 17167270})
After that, if you just want the top 5, you can do the following:
dict(sorted(list(d.items()),key= lambda x:x[1],reverse=True)[:5])
Output:
{'English': 172763270,
'French': 1547535158,
'Latin': 186144990,
'Minangkabou': 56007072,
'Spanish': 17167270}

If I understand correctly what you want, a for loop will do.
mylist = [] #your list as given above
mydict = {} #here we'll save the values
for item in mylist:
    #read out the values needed
    value = int(item[2]) #the third column is a string, so convert it
    language = item[-1] #item[5] would also work.
    #check if language is already in. If not, create it.
    if language not in mydict:
        mydict[language] = 0
    #Add value to correct dictionary item.
    mydict[language] += value
Then you have your full dictionary. Next, check the dictionary for the top 5 items based on value.
def myfunc(elem): #returns second entry of tuple.
    return elem[1]
#get the list of all the entries
allEntries = list(mydict.items()) #list of tuples
sortedList = sorted(allEntries, key=myfunc, reverse=True) #list sorted on values
print(dict(sortedList[:5])) #dictionary of first five items of the sorted list
I hope that is what you want.
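If importing from the standard library is acceptable, collections.Counter covers both the summing and the "top N" step. The sketch below is an alternative to the hand-written loops above, not a replacement for them; the sample rows are taken from the data shown in this thread, and the variable names are made up.

```python
from collections import Counter

rows = [
    (2018, '2', '172767270', '202', 'gege', 'French'),
    (2012, '212', '56007072', '200', 'cdadcadc', 'Minangkabou'),
    (2013, 'J21', '186144990', '200', 'sacacs', 'Latin'),
    (2017, '2', '1374767888', '202', 'gege', 'French'),
]

totals = Counter()
for row in rows:
    # last column is the language, third column is the numeric string to sum
    totals[row[-1]] += int(row[2])

# most_common(n) returns the n largest totals, highest first
top = dict(totals.most_common(2))
print(top)
```

Use most_common(5) for the top five; with fewer than five languages it simply returns all of them.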

Splitting a string into multiple variables that are subject to change

I have a string like this
b'***************** Winner Prediction *****************\nDate: 2019-08-27 07:00:00\nRace Key: 190827082808\nTrack Name: Mornington\nPosition Number: 8\nName: CONSIDERING\nFinal Odds: 17.3\nPool Final: 37824.7\n'
And in Python, I want to split this string into variables such as:
Date =
Race_Key =
Track_Name =
Name =
Final_Odds =
Pool_Final =
The string will always be in the same format, but the values will differ; for example, a name may contain two words, so the solution needs to handle all cases.
I have tried:
s = re.split(r'[.?!:]+', pred0)
def search(word, sentences):
    return [i for i in sentences if re.search(r'\b%s\b' % word, i)]
But no luck there.

You can split the string and parse it into a dict like this:
s = s.decode() #decode the byte string
n = s.split('\n')[1:-1] #split the string, drop the Winner Prediction line and the resulting last empty list entry
keys = [key.split(': ')[0] for key in n] #get keys
vals = [val.split(': ')[1] for val in n] #get values for keys
results = dict(zip(keys,vals)) #store in dict
Result:
Date 2019-08-27 07:00:00
Race Key 190827082808
Track Name Mornington
Position Number 8
Name CONSIDERING
Final Odds 17.3
Pool Final 37824.7

You can use the following:
return [line.split(":", 1)[-1].strip() for line in s.splitlines()[1:]]
This will return (for your example input):
['2019-08-27 07:00:00', '190827082808', 'Mornington', '8', 'CONSIDERING', '17.3', '37824.7']

Maybe you can try this:
p = b'***************** Winner Prediction *****************\nDate: 2019-08-27 07:00:00\nRace Key: 190827082808\nTrack Name: Mornington\nPosition Number: 8\nName: CONSIDERING\nFinal Odds: 17.3\nPool Final: 37824.7\n'
out = p.split(b"\n")[:-1][1:]
d = {}
for i in out:
    temp = i.split(b":", 1)  # split on the first colon only, so the time in Date stays intact
    key = temp[0].decode()
    value = temp[1].strip().decode()
    d[key] = value
output would be:
{'Date': '2019-08-27 07:00:00',
'Race Key': '190827082808',
'Track Name': 'Mornington',
'Position Number': '8',
'Name': 'CONSIDERING',
'Final Odds': '17.3',
'Pool Final': '37824.7'}
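Since the labels are fixed but the values vary, a dictionary keyed by the label is usually easier to work with than separate variables. Here is a minimal sketch along those lines; parse_prediction is a hypothetical name, and replacing spaces with underscores is just one way to get keys that resemble the variable names in the question.

```python
raw = (b'***************** Winner Prediction *****************\n'
       b'Date: 2019-08-27 07:00:00\nRace Key: 190827082808\n'
       b'Track Name: Mornington\nPosition Number: 8\nName: CONSIDERING\n'
       b'Final Odds: 17.3\nPool Final: 37824.7\n')

def parse_prediction(data):
    """Map each 'Label: value' line to a key such as 'Race_Key'."""
    fields = {}
    for line in data.decode().splitlines():
        # partition splits on the first colon only, so times stay intact
        label, sep, value = line.partition(':')
        if not sep:
            continue  # skip the banner line, which has no colon
        fields[label.strip().replace(' ', '_')] = value.strip()
    return fields

info = parse_prediction(raw)
print(info['Date'], info['Race_Key'], info['Name'])
```

Values with several words (for example a two-word horse name) are kept whole, because only the first colon is used as the separator.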

How can I find the max number with different category in this tuple?

I get some data like this
A=['A,1','A,2','A,4','A,5','B,2','B,3','B,4','B,5','C,2','C,20','C,200','C,2']
I want a result like the one below, giving the name, the min number and the max number for each category. I have 1 million records like this.
'A,1,5','B,2,5','C,2,200'
I tried in this way:
A=['A,1','A,2','A,4','A,5','B,2','B,3','B,4','B,5','C,2','C,20','C,200','C,2']
B=[]
C=[]
for r in A:
    B.append(r.split(',')[0])
B_set=list(set(B))
catagory_number=range(0,len(B_set),1)
for j in catagory_number:
    numbers = []
    for r in A:
        if B_set[j]==r.split(',')[0]:
            numbers.append(r.split(',')[1])
            print numbers
As you can see, it does not work; I have trouble collecting the data together.
['1']
['1', '2']
['1', '2', '4']
['1', '2', '4', '5']
['2']
['2', '20']
['2', '20', '200']
['2', '20', '200', '2']
['2']
['2', '3']
['2', '3', '4']
['2', '3', '4', '5']
Any suggestions?

You could iterate over your list and derive the min and max values using an OrderedDict. At the end you can re-create the string as I show, but actually you might be better off keeping the dictionary data structure (depends what you want to do next):
import collections
def sol(lst):
    d = collections.OrderedDict()
    for item in lst:
        key, value = item.split(',')
        value = int(value)
        if key in d:
            if value < d[key][0]:
                d[key][0] = value
            elif value > d[key][1]:
                d[key][1] = value
        else:
            d[key] = [value, value] # key = letter; value = [min, max]
    return ['{},{},{}'.format(key,*values) for key,values in d.items()] # in Python 2 use key,values[0],values[1]
Example:
my_lst = ['A,1','A,2','A,4','A,5','B,2','B,3','B,4','B,5','C,2','C,20','C,200','C,2']
print(sol(my_lst))
# ['A,1,5', 'B,2,5', 'C,2,200']

A defaultdict with a list as default value could help you a lot:
>>> from collections import defaultdict
>>> data = defaultdict(list)
>>> data['A']
[]
>>> data['A'].append(1)
>>> data['A'].append(2)
>>> data['B'].append(3)
>>> data
defaultdict(<type 'list'>, {'A': [1, 2], 'B': [3]})
It's probably what you wanted to write with set and multiple loops. defaultdict is a standard structure and should be fast enough, even with many values.
Here's a beginning of a solution with this data structure:
from collections import defaultdict
data = defaultdict(list)
A = ['A,1','A,2','A,4','A,5','B,2','B,3','B,4','B,5','C,2','C,20','C,200','C,2']
for couple in A:
    letter, number = couple.split(',')
    data[letter].append(int(number))
print(data)
# defaultdict(<type 'list'>, {'A': [1, 2, 4, 5], 'C': [2, 20, 200, 2], 'B': [2, 3, 4, 5]})
For each letter in A, you now have a list of corresponding values. It shouldn't be too hard to extract min and max and write the desired list.
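One way to finish from there, as a sketch: take min and max of each list and format the strings. The variable name result is an assumption, and sorting the items is only there to get a stable A, B, C order.

```python
from collections import defaultdict

A = ['A,1', 'A,2', 'A,4', 'A,5', 'B,2', 'B,3', 'B,4', 'B,5',
     'C,2', 'C,20', 'C,200', 'C,2']

data = defaultdict(list)
for couple in A:
    letter, number = couple.split(',')
    data[letter].append(int(number))  # store ints so min/max compare numerically

result = ['{},{},{}'.format(letter, min(nums), max(nums))
          for letter, nums in sorted(data.items())]
print(result)  # ['A,1,5', 'B,2,5', 'C,2,200']
```

With a million entries this keeps every number in memory; tracking only the running min and max per letter would use less.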

You can try this:
letter=[]
number=[]
A=['A,1','A,2','A,4','A,5','B,2','B,3','B,4','B,5','C,2','C,20','C,200','C,2']
for couple in A:
    a, b = couple.split(',')
    b = int(b)  # compare as numbers, not as strings
    if a not in letter:
        letter.append(a)
        number.append([b])
    else:
        ind=letter.index(a)
        number[ind].append(b)
B=[]
i=0
while i<len(letter):
    B.append(letter[i]+","+str(min(number[i]))+","+str(max(number[i])))
    i+=1
print (B)
['A,1,5', 'B,2,5', 'C,2,200']

You can achieve what you intended to do using groupby from the itertools module and a list comprehension, as in this example:
from itertools import groupby
A = ['A,1','A,2','A,4','A,5','B,2','B,3','B,4','B,5','C,2','C,20','C,200','C,2']
sub_final = (sorted(list(v), key = lambda x: int(x.split(",")[1])) for _,v in groupby(sorted(A), lambda x: x[0]))
final = ["{0},{1}".format(k[0],k[-1].split(',')[-1]) for k in sub_final]
print(final)
Output:
['A,1,5', 'B,2,5', 'C,2,200']

Might not be the fastest but I think this is easy to read. Can't offer formatting since I'm using Python 3.4.
A=['A,1','A,2','A,4','A,5','B,2','B,3','B,4','B,5','C,2','C,20','C,200','C,2']
summary = {}
for a in A:
    k, v = a.split(',')
    v = int(v)
    if k in summary:
        summary[k] = (min(v, summary[k][0]), max(v, summary[k][1]))
    else:
        summary[k] = (v, v)
for k in sorted(summary.keys()):
    print (k, summary[k])

The basic idea is to split the list on the basis of its headers, i.e. A, B, C, ..., and find the min and max for each of them. Below is one way to do so:
#!/usr/bin/python
A=['A,1','A,2','A,4','A,5','B,2','B,3','B,4','B,5','C,2','C,20','C,200','C,2']
headerList = []
assoNumList = []
finalList = []
# Iterate over the list to obtain the headers i.e. A,A,A,A,A,B,B,B....C,...
for a in range(len(A)):
    header = A[a][0]
    headerList.append(header)
# Convert the list into a set to get distinct headers i.e. A,B,C..
headerSet = set(headerList)
uniqueHeaderList = list(headerSet)
# Iterate over the unique header list to get all numbers associated
# with each header. Apply min and max functions over the number set
# to get the Header wise Min and Max numbers.
for i in range(len(uniqueHeaderList)):
    for a in range(len(A)):
        if(A[a][0] == uniqueHeaderList[i]):
            assoNum = A[a][2:]
            assoNumList.append(assoNum)
            header = A[a][0]
    # key=int compares the digit strings numerically rather than alphabetically
    result = header+","+min(assoNumList, key=int)+","+max(assoNumList, key=int)
    finalList.append(result)
    del assoNumList[:]
print(sorted(finalList))
#Output: ['A,1,5', 'B,2,5', 'C,2,200']

JSON formatting by appending dict values to list

I have a JSON object which is like this:
{ "produktNr":"1234",
"artNr_01":"12",
"artNr_02":"23",
"artNr_03":"",
"artNr_04":"14",
"name_01":"abc",
"name_02":"der",
"test":"junk"
}
I would like to convert this into a dictionary like this:
{ "produktNr":"1234", "artNr":["12","23","","14"], "name":["abc","der"], "test":"junk"}
This conversion is based on a sequence given say, seq = ["artNr","name"]. So the contents of the sequence are searched in the dictionary's keys and the values collected into a list.
My attempt so far:
tempDict = {}
for key,value in fmData.iteritems():
    for seqval in seq:
        if seqval in key:
            if seqval in tempDict:
                tempDict[seqval].append(value)
            else:
                x = []
                x.append(value)
                tempDict[seqval]=x
        else:
            tempDict[key] = value
This faces a few problems:
The lists of values are not ordered, i.e. "artNr":["","14","12","23"] instead of the values in the order [_01,_02,_03,_04].
The items cannot be popped from the dictionary, since dictionary items cannot be deleted inside the loop, resulting in:
{ "produktNr":"1234", "artNr":["12","23","","14"],"artNr_01":"12", "artNr_02":"23", "artNr_03":"","artNr_04":"14","name":["abc","der"],"name_01":"abc", "name_02":"der", "test":"junk"}
I would love to understand how to deal with this, especially if there's a Pythonic way to solve this problem.

You may use OrderedDict from the collections package:
from collections import OrderedDict
import re
input_dict = { "produktNr":"1234",
"artNr_01":"12",
"artNr_02":"23",
"artNr_03":"",
"artNr_04":"14",
"name_01":"abc",
"name_02":"der",
"test":"junk" }
# split keys on the first '_'
m = re.compile('^([^_]*)_(.*)')
def _order_by( item ):
    # helper function for ordering the dict.
    # item is split on first '_' and, if it was successful
    # the second part is returned otherwise item is returned
    # if key is something like artNr_42, return 42
    # if key is something like test, return test
    k,s = item
    try:
        return m.search(k).group(2)
    except AttributeError:
        # no '_' in the key, so search() returned None
        return k
# create ordered dict using helper function
orderedDict = OrderedDict( sorted(input_dict.items(), key=_order_by))
aggregated_dict = {}
for k, v in orderedDict.items():
    # split key
    match = m.search(k)
    if match:
        # key is splittable, i.e., key is something like artNr_42
        kk = match.group(1)
        if kk not in aggregated_dict:
            # create list and add value
            aggregated_dict[kk] = [v]
        else:
            # add value
            aggregated_dict[kk].append(v)
    else:
        # key is not splittable, i.e., key is something like produktNr
        aggregated_dict[k] = v
print(aggregated_dict)
which gives the desired output
{'produktNr': '1234', 'test': 'junk', 'name': ['abc', 'der'], 'artNr': ['12', '23', '', '14']}

You can recreate a new dictionary that will group values of keys with '_' in the keys in a list while the other keys and values are kept intact. This should do:
d = { "produktNr":"1234", "artNr_01":"12", "artNr_02":"23","artNr_03":"","artNr_04":"14","name_01":"abc","name_02":"der","test":"junk"}
new_d= {}
for k, v in d.items():
    k_new = k.split('_')[0]
    if '_' in k:
        if k_new not in new_d:
            new_d[k_new] = [v]
        else:
            new_d[k_new].append(v)
    else:
        new_d[k_new] = v
print(new_d)
# {'artNr': ['', '14', '23', '12'], 'test': 'junk', 'produktNr': '1234', 'name': ['der', 'abc']}
Before Python 3.7, dicts are unordered collections, so the order in which the values are appended to the list is indeterminate. From Python 3.7 on, dicts keep insertion order, so the values follow the order of the keys in d.

A slight modification of your code:
tempDict = {}
for key,value in fmData.iteritems():
    seqval_in_key = "no"
    for seqval in seq:
        if seqval in key:
            seqval_in_key = "yes"
    for seqval in seq:
        if seqval in key:
            if seqval in tempDict:
                tempDict[seqval].append(value)
            else:
                x = []
                x.append(value)
                tempDict[seqval]=x
        else:
            if (seqval_in_key == "no"):
                tempDict[key] = value
print tempDict
Result:
{'produktNr': '1234', 'test': 'junk', 'name': ['abc', 'der'], 'artNr': ['14', '23', '', '12']}
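Both problems from the question (unordered lists, and leftover artNr_01 style keys) can be avoided by iterating over the sorted keys and building a new dictionary instead of modifying the old one. The following is a sketch under assumptions: group_by_prefix and fm_data are made-up names, and it relies on the suffixes being zero-padded (_01, _02, ...) so that plain string sorting gives the right order.

```python
def group_by_prefix(data, seq):
    """Collect values of keys like 'artNr_01' into a list under 'artNr'."""
    grouped = {}
    # Sorting the keys puts artNr_01, artNr_02, ... in suffix order.
    for key in sorted(data):
        prefix, sep, _suffix = key.partition('_')
        if sep and prefix in seq:
            grouped.setdefault(prefix, []).append(data[key])
        else:
            # keys outside seq are copied over unchanged
            grouped[key] = data[key]
    return grouped

fm_data = {"produktNr": "1234", "artNr_01": "12", "artNr_02": "23",
           "artNr_03": "", "artNr_04": "14", "name_01": "abc",
           "name_02": "der", "test": "junk"}
print(group_by_prefix(fm_data, ["artNr", "name"]))
```

Because the result is a separate dictionary, nothing has to be deleted from the input while looping over it.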
