Splitting a string into multiple variables that are subject to change - python

I have a string like this
b'***************** Winner Prediction *****************\nDate: 2019-08-27 07:00:00\nRace Key: 190827082808\nTrack Name: Mornington\nPosition Number: 8\nName: CONSIDERING\nFinal Odds: 17.3\nPool Final: 37824.7\n'
And in Python, I want to split this string into variables such as:
Date =
Race_Key =
Track_Name =
Name =
Final_Odds =
Pool_Final =
The string will always be in the same format, but the values will differ each time; for example, a name may contain two words, so the solution needs to work in all cases.
I have tried:
s = re.split(r'[.?!:]+', pred0)
def search(word, sentences):
    return [i for i in sentences if re.search(r'\b%s\b' % word, i)]
But no luck there.

You can split the string and parse it into a dict like this:
s = s.decode() #decode the byte string
n = s.split('\n')[1:-1] #split the string, drop the Winner Prediction and resulting last empty list entry
keys = [key.split(': ')[0] for key in n] #get keys
vals = [val.split(': ')[1] for val in n] #get values for keys
results = dict(zip(keys,vals)) #store in dict
Result:
Date 2019-08-27 07:00:00
Race Key 190827082808
Track Name Mornington
Position Number 8
Name CONSIDERING
Final Odds 17.3
Pool Final 37824.7

You can use the following (inside a function, with s already decoded to a str):
return [line.split(":", 1)[-1].strip() for line in s.splitlines()[1:]]
This will return (for your example input):
['2019-08-27 07:00:00', '190827082808', 'Mornington', '8', 'CONSIDERING', '17.3', '37824.7']

Maybe you can try this:
p = b'***************** Winner Prediction *****************\nDate: 2019-08-27 07:00:00\nRace Key: 190827082808\nTrack Name: Mornington\nPosition Number: 8\nName: CONSIDERING\nFinal Odds: 17.3\nPool Final: 37824.7\n'
out = p.split(b"\n")[:-1][1:]
d = {}
for i in out:
    temp = i.split(b":", 1)  # split on the first colon only, so the time stays intact
    key = temp[0].decode()
    value = temp[1].strip().decode()
    d[key] = value
output would be:
{'Date': '2019-08-27 07:00:00',
'Race Key': '190827082808',
'Track Name': 'Mornington',
'Position Number': '8',
'Name': 'CONSIDERING',
'Final Odds': '17.3',
'Pool Final': '37824.7'}
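The answers above can be combined into one small helper. This is only a sketch: the function name parse_prediction and the choice to turn spaces in the keys into underscores are assumptions made for illustration, not part of the original answers. It relies on every field line containing ": " and on the banner line not containing it.

```python
def parse_prediction(raw):
    """Parse 'Key: value' lines from the prediction text into a dict."""
    # Accept both bytes and str (hypothetical convenience)
    text = raw.decode() if isinstance(raw, bytes) else raw
    fields = {}
    for line in text.splitlines():
        # partition splits on the first ": " only, so times such as
        # 07:00:00 and multi-word names stay intact
        key, sep, value = line.partition(": ")
        if sep:  # the banner line has no ": " and is skipped
            fields[key.replace(" ", "_")] = value.strip()
    return fields


pred0 = b'***************** Winner Prediction *****************\nDate: 2019-08-27 07:00:00\nRace Key: 190827082808\nTrack Name: Mornington\nPosition Number: 8\nName: CONSIDERING\nFinal Odds: 17.3\nPool Final: 37824.7\n'
fields = parse_prediction(pred0)

# Pull out individual variables as needed
Date = fields["Date"]
Race_Key = fields["Race_Key"]
Track_Name = fields["Track_Name"]
print(Date, Race_Key, Track_Name)
```

Keeping the values in a dict and reading individual names from it avoids creating variables dynamically, which tends to be harder to maintain.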

Related

Fetch value from python dictionary and pass one by one

I have dictionary as mentioned below.
a={'name':['test1','test2'],'regno':['123','345'],'subject':
['maths','science'],'standard':['3','4']}
I need to verify the following:
The number of values under each key should match.
Fetch the values from each key one by one and pass them to my other function, one record at a time:
name = 'test1' regno = '123' subject='maths' standard='3'
name = 'test2' regno = '345' subject='science' standard='4'
I have tried the code below, but I am stuck on finding the right way to do this.
a={'name':['test1','test2'],'regno':['123','345'],'subject':['maths','science'],'standard':['3','4']}
lengths = [len(v) for v in a.values()]
if (len(set(lengths)) <= 1) == True:
    print('All values are same')
else:
    print('All values are not same')
I need your help to fetch the values one by one from each key and pass them to a function.
Try looping over your dictionary items and then over the lists in values:
for key, vals_list in a.items():
    if len(set(vals_list)) <= 1:
        print(f'{key}: All values are same!')
    # Will do nothing if `vals_list` is empty
    for value in vals_list:
        your_other_func(value)
You can get it done this way:
a={'name':['test1','test2'],'regno':['123','345'],'subject':
['maths','science'],'standard':['3','4']}
w = [{'name':a['name'][i], 'regno':a['regno'][i], 'standard':a['standard'][i]} for i
     in range(len(a['name']))]
for x in range(len(w)):
    #your_func(w[x]['name'], w[x]['regno'], w[x]['standard'])
    print(w[x]['name'], w[x]['regno'], w[x]['standard'])
I would rebuild a into a list of dictionaries, and then use dict-unpacking to dynamically give the dictionary to the function instead:
def func(name, regno, subject, standard):
    print("name={}, regno={}, subject={}, standard={}".format(name, regno, subject, standard))
a={'name':['test1','test2'],'regno':['123','345',],'subject':
['maths','science'],'standard':['3','4']}
new_a = [dict(zip(a.keys(), x)) for x in list(zip(*a.values()))]
print(new_a)
for d in new_a:
    func(**d)
Output:
[{'name': 'test1', 'regno': '123', 'subject': 'maths', 'standard': '3'}, {'name': 'test2', 'regno': '345', 'subject': 'science', 'standard': '4'}]
name=test1, regno=123, subject=maths, standard=3
name=test2, regno=345, subject=science, standard=4
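Both requirements from the question (checking that all value lists have the same length, then passing one record at a time to a function) can be folded into a single generator. This is a sketch only; iter_records is a hypothetical name, and raising ValueError on a length mismatch is an assumed design choice rather than something the question asked for.

```python
def iter_records(columns):
    """Yield one dict per row from a dict of equal-length lists."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("value lists have different lengths: %s" % sorted(lengths))
    # zip(*...) pairs up the i-th element of every list
    for row in zip(*columns.values()):
        yield dict(zip(columns.keys(), row))


a = {'name': ['test1', 'test2'], 'regno': ['123', '345'],
     'subject': ['maths', 'science'], 'standard': ['3', '4']}

records = list(iter_records(a))
for record in records:
    # replace print with your own function, e.g. your_func(**record)
    print(record)
```

The explicit length check matters because zip silently stops at the shortest list, which would otherwise hide missing values.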

JSON formatting by appending dict values to list

I have a JSON object which is like this:
{ "produktNr":"1234",
"artNr_01":"12",
"artNr_02":"23",
"artNr_03":"",
"artNr_04":"14",
"name_01":"abc",
"name_02":"der",
"test":"junk"
}
I would like to convert this into a dictionary like this:
{ "produktNr":"1234", "artNr":["12","23","","14"], "name":["abc","der"], "test":"junk"}
This conversion is based on a sequence given say, seq = ["artNr","name"]. So the contents of the sequence are searched in the dictionary's keys and the values collected into a list.
My attempt so far:
tempDict = {}
for key,value in fmData.iteritems():
    for seqval in seq:
        if seqval in key:
            if seqval in tempDict:
                tempDict[seqval].append(value)
            else:
                x = []
                x.append(value)
                tempDict[seqval]=x
        else:
            tempDict[key] = value
This attempt has a few problems.
The lists of values are not ordered, i.e. I get "artNr":["","14","12","23"]
instead of the values in the order [_01,_02,_03,_04].
The items cannot be popped from the dictionary, since dictionary items cannot be deleted while looping over it, resulting in:
{ "produktNr":"1234", "artNr":["12","23","","14"],"artNr_01":"12", "artNr_02":"23", "artNr_03":"","artNr_04":"14","name":["abc","der"],"name_01":"abc", "name_02":"der", "test":"junk"}
Would love to understand how to deal with this, especially if there's a pythonic way to solve this problem.
You may use OrderedDict from the collections package:
from collections import OrderedDict
import re
input_dict = { "produktNr":"1234",
"artNr_01":"12",
"artNr_02":"23",
"artNr_03":"",
"artNr_04":"14",
"name_01":"abc",
"name_02":"der",
"test":"junk" }
# split keys on the first '_'
m = re.compile('^([^_]*)_(.*)')
def _order_by( item ):
    # helper function for ordering the dict.
    # item is split on first '_' and, if it was successful
    # the second part is returned otherwise item is returned
    # if key is something like artNr_42, return 42
    # if key is something like test, return test
    k,s = item
    try:
        return m.search(k).group(2)
    except AttributeError:
        # no '_' in the key, so m.search() returned None
        return k
# create ordered dict using helper function
orderedDict = OrderedDict( sorted(input_dict.items(), key=_order_by))
aggregated_dict = {}
for k, v in orderedDict.items():
    # split key
    match = m.search(k)
    if match:
        # key is splittable, i.e., key is something like artNr_42
        kk = match.group(1)
        if kk not in aggregated_dict:
            # create list and add value
            aggregated_dict[kk] = [v]
        else:
            # add value
            aggregated_dict[kk].append(v)
    else:
        # key is not splittable, i.e., key is something like produktNr
        aggregated_dict[k] = v
print(aggregated_dict)
which gives the desired output
{'produktNr': '1234', 'test': 'junk', 'name': ['abc', 'der'], 'artNr': ['12', '23', '', '14']}
You can recreate a new dictionary that will group values of keys with '_' in the keys in a list while the other keys and values are kept intact. This should do:
d = { "produktNr":"1234", "artNr_01":"12", "artNr_02":"23","artNr_03":"","artNr_04":"14","name_01":"abc","name_02":"der","test":"junk"}
new_d= {}
for k, v in d.items():
    k_new = k.split('_')[0]
    if '_' in k:
        if k_new not in new_d:
            new_d[k_new] = [v]
        else:
            new_d[k_new].append(v)
    else:
        new_d[k_new] = v
print(new_d)
# {'artNr': ['', '14', '23', '12'], 'test': 'junk', 'produktNr': '1234', 'name': ['der', 'abc']}
In Python versions before 3.7, dicts are unordered collections, so the order in which the values are appended to the list is indeterminate.
A slight modification of your code:
tempDict = {}
for key,value in fmData.iteritems():
    seqval_in_key = "no"
    for seqval in seq:
        if seqval in key:
            seqval_in_key = "yes"
    for seqval in seq:
        if seqval in key:
            if seqval in tempDict:
                tempDict[seqval].append(value)
            else:
                x = []
                x.append(value)
                tempDict[seqval]=x
        else:
            if (seqval_in_key == "no"):
                tempDict[key] = value
print tempDict
Result:
{'produktNr': '1234', 'test': 'junk', 'name': ['abc', 'der'], 'artNr': ['14', '23', '', '12']}
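The ordering problem can be avoided without relying on dict order at all by collecting the numbered items first and sorting them by their numeric suffix. The following is a Python 3 sketch; group_numbered_keys is a hypothetical name, and it assumes that the keys to be grouped have exactly the form prefix_digits, with the prefix listed in a non-empty seq.

```python
import re


def group_numbered_keys(data, seq):
    """Collect values of keys like 'artNr_01' into lists, ordered by suffix."""
    # Matches e.g. 'artNr_01' when 'artNr' is in seq; group(2) is the number
    pattern = re.compile(r"^(%s)_(\d+)$" % "|".join(re.escape(s) for s in seq))
    grouped = {}
    numbered = []
    for key, value in data.items():
        match = pattern.match(key)
        if match:
            numbered.append((match.group(1), int(match.group(2)), value))
        else:
            grouped[key] = value  # keys outside seq are copied unchanged
    # Sort by prefix, then by numeric suffix, so _01 comes before _02, and so on
    for prefix, _, value in sorted(numbered, key=lambda item: (item[0], item[1])):
        grouped.setdefault(prefix, []).append(value)
    return grouped


fmData = {"produktNr": "1234", "artNr_01": "12", "artNr_02": "23",
          "artNr_03": "", "artNr_04": "14", "name_01": "abc",
          "name_02": "der", "test": "junk"}
result = group_numbered_keys(fmData, ["artNr", "name"])
print(result)
```

Because a new dictionary is built, nothing has to be deleted from the original while iterating over it.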

Extracting columns from a string or list in python

I'm trying to extract columns from a string of values in Python. The string of values looks as follows:
CN=Unix ADISID,OU=SA,OU=DGO,DC=dom,DC=ab,DC=com,1001
CN=1002--DS,OU=Process,DC=dom,DC=ab,DC=com,1002
CN=1003--Cyb,OU=SA,OU=DGO,DC=dom,DC=ab,DC=com,1003
CN=Doe--Joe,OU=Adm,DC=dom,DC=ab,DC=com,d1004
CN=cruise--bob,OU=SA,OU=DGO,DC=dom,DC=ab,DC=com,d1005
Now I would like to extract columns from this string with column headers like CN, OU1, OU2,DC1, DC2, DC3,ID. The number of OU and DC values are different in every line so if they are not present in a line, I would like to keep that column as blank. Also, I'm using the following piece of code to generate the above string.
result = l.search_s(base, ldap.SCOPE_SUBTREE, criteria, attributes)
results=""
for i in [entry for dn, entry in result if isinstance(entry, dict)]:
    results += str(i.get('distinguishedName')[0] +","+ i.get('sAMAccountName')[0] + "\n").replace("\, ","--")
print results
Will it be easier if I create results as a list to begin with?
To get the "fields left blank" behavior, you're going to have to count the max number of each field. I believe that CN is unique, so that should always be 1.
import collections
result = l.search_s(base, ldap.SCOPE_SUBTREE, criteria, attributes)
users = []
for i in [entry for dn, entry in result if isinstance(entry, dict)]:
    dn = i.get('distinguishedName')[0].replace('\, ', '--').split(',')
    info = collections.defaultdict(list)
    info['id'] = i.get('sAMAccountName')[0]
    for part in dn:
        key,value = part.split('=',1)
        info[key].append(value)
    users.append(info)
max_cn = max(map(lambda u: len(u['CN']), users))
assert max_cn == 1
max_ou = max(map(lambda u: len(u['OU']), users))
max_dc = max(map(lambda u: len(u['DC']), users))  # the attribute in the data is DC, not DN
numflds = max_cn + max_ou + max_dc
fields = []
for u in users:
    f = [u['CN'][0]]  # CN is unique, so take the single value
    ou = u['OU'] + [''] * max_ou
    f.extend(ou[:max_ou])
    dc = u['DC'] + [''] * max_dc
    f.extend(dc[:max_dc])
    f.append(u['id'])
    fields.append(f)  # one padded row per user
For each line:
pairs = [kv.split('=') for kv in line.split(',')]
for pair in pairs:
    if len(pair) == 1:
        pair.insert(0, 'ID')
Now you have something like this:
[['CN', 'Unix ADISID'],
['OU', 'SA'],
['OU', 'DGO'],
['DC', 'dom'],
['DC', 'ab'],
['DC', 'com'],
['ID', '1001']]
Then:
from collections import defaultdict
mapping = defaultdict(list)
for k,v in pairs:
    mapping[k].append(v)
Which gives you:
{'CN': ['Unix ADISID'],
'DC': ['dom', 'ab', 'com'],
'ID': ['1001'],
'OU': ['SA', 'DGO']}
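To get from such per-line mappings to rows with blank cells, the lists can be padded to the longest length seen for each key. The sketch below works on the already generated lines rather than on the LDAP result; parse_line and to_rows are hypothetical names, and it assumes that values contain no commas (the question already replaces escaped commas with "--").

```python
from collections import defaultdict

lines = [
    "CN=Unix ADISID,OU=SA,OU=DGO,DC=dom,DC=ab,DC=com,1001",
    "CN=1002--DS,OU=Process,DC=dom,DC=ab,DC=com,1002",
    "CN=Doe--Joe,OU=Adm,DC=dom,DC=ab,DC=com,d1004",
]


def parse_line(line):
    """Group the key=value parts of one line; a bare trailing value is the ID."""
    mapping = defaultdict(list)
    for part in line.split(","):
        key, sep, value = part.partition("=")
        if sep:
            mapping[key].append(value)
        else:
            mapping["ID"].append(part)
    return mapping


def to_rows(lines, keys=("CN", "OU", "DC", "ID")):
    """Return a header and one padded row per line."""
    parsed = [parse_line(line) for line in lines]
    # widest occurrence of each key decides how many columns it gets
    widths = {k: max(len(p[k]) for p in parsed) for k in keys}
    header = []
    for k in keys:
        if widths[k] == 1:
            header.append(k)
        else:
            header.extend("%s%d" % (k, n) for n in range(1, widths[k] + 1))
    rows = []
    for p in parsed:
        row = []
        for k in keys:
            # pad with empty strings so every row has the same length
            row.extend(p[k] + [""] * (widths[k] - len(p[k])))
        rows.append(row)
    return header, rows


header, rows = to_rows(lines)
print(header)
for row in rows:
    print(row)
```

The header and rows can then be written out with the csv module or loaded into a table structure of your choice.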

Trying to split a list of dictionaries

Hello I am trying to compare this list of dictionaries:
Call this list Animals:
[{'Fishs=16': 'Fishs=16',
'Birds="6"': 'Birds="6"',
'Dogs=5': 'Dogs=5',
'Bats=10': 'Bats=10',
'Tigers=11': 'Tigers=11',
'Cats=4': 'Cats=4'},
{'Cats=40': 'Cats=40',
'Tigers': 'Tigers = 190',
'Birds=4': 'Birds=4',
'Bats': 'Bats = Null',
'Fishs': 'Fishs = 24',
'Dogs': 'Dogs = 10'}]
I want to make the list look like this
[{'Tigers': 'Tigers=11',
'Dogs': 'Dogs=5',
'Cats': 'Cats=4',
'Bats': 'Bats=10',
'Fishs': 'Fishs=16',
'Birds': 'Birds="6"'},
{'Tigers': 'Tigers=190',
'Dogs': 'Dogs=10',
'Cats': 'Cats=40',
'Bats': 'Bats=Null',
'Fishs': 'Fishs=24',
'Birds': 'Birds=4'}]
so that I can compare it to this other list:
{'Tigers': '19',
'Dogs': '10',
'Cats': '40',
'Bats': '10',
'Fishs': '234',
'Birds': '3'}
Here's the code I've tried to use in order to split the list:
animals = []
for d in setData:
    animals.append({k: v.split('=')[1] for k, v in d.items()})
However, it will not split the list, since my keys in the dictionaries are in the format Dogs=4 rather than Dogs = 4. I need to be able to split this list even if they are in that format.
On a completely different note, once this part of the code is fixed I need to figure out how to compare the data from these keys against each other.
For example, let's say I have Dogs="23" and the compared list has Dogs="50". According to my code this should be incorrect, but because of the quotes ("23") it says it is; it does not compare the value inside. This is the code I have for the comparison:
correct_parameters = dict(re.match(r'(\w*)="?(\d*)"?', s).group(1, 2) for s in dataDefault[1:])
print correct_parameters
count = 0
while (count < (len(setNames))):
    for number, item in enumerate(animals, 1):
        print setNames[count]
        count = count + 1
        for param, correct in correct_parameters.items():
            if item[param] == correct:
                print('{} = {} which is correct'.format(param, correct))
However, for now I am just trying to fix the list split issue I am having.
lst = [{'Fishs=16': 'Fishs=16', 'Birds="6"': 'Birds="6"', 'Dogs=5': 'Dogs=5', 'Bats=10': 'Bats=10', 'Tigers=11': 'Tigers=11', 'Cats=4': 'Cats=4'}, {'Cats=40': 'Cats=40', 'Tigers': 'Tigers = 190', 'Birds=4': 'Birds=4', 'Bats': 'Bats = Null', 'Fishs': 'Fishs = 24', 'Dogs': 'Dogs = 10'}]
# for each element in that list, loop with index
for idx, val in enumerate(lst):
    # create temp object
    o = {}
    # loop the dictionary
    for k,v in val.iteritems():
        # if = is found in key
        if '=' in k:
            # change the key
            k = k.split('=')[0]
        # insert to temp object
        o[k] = v
    # change the temp object to current element in the list
    lst[idx] = o
Just strip the string after you split it:
>>> [v.strip() for v in " a = b ".split('=')]
['a', 'b']
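Putting the split-and-strip idea together with the quoting problem from the question, one possible sketch is shown below. Note that it departs from the requested output: it stores only the bare value (for example '6' instead of 'Birds="6"'), which is an assumption made here so that the result can be compared directly with the reference dict. The name normalize is hypothetical.

```python
def normalize(entry):
    """Turn values like 'Birds="6"' or 'Tigers = 190' into {'Birds': '6', ...}."""
    result = {}
    for text in entry.values():
        # every value has the form name=value, with or without spaces
        name, sep, value = text.partition("=")
        if sep:
            # strip surrounding spaces first, then optional double quotes
            result[name.strip()] = value.strip().strip('"')
    return result


lst = [{'Fishs=16': 'Fishs=16', 'Birds="6"': 'Birds="6"', 'Dogs=5': 'Dogs=5',
        'Bats=10': 'Bats=10', 'Tigers=11': 'Tigers=11', 'Cats=4': 'Cats=4'},
       {'Cats=40': 'Cats=40', 'Tigers': 'Tigers = 190', 'Birds=4': 'Birds=4',
        'Bats': 'Bats = Null', 'Fishs': 'Fishs = 24', 'Dogs': 'Dogs = 10'}]
expected = {'Tigers': '19', 'Dogs': '10', 'Cats': '40',
            'Bats': '10', 'Fishs': '234', 'Birds': '3'}

animals = [normalize(d) for d in lst]
for item in animals:
    # names whose value equals the reference value
    matches = sorted(k for k, v in item.items() if expected.get(k) == v)
    print(matches)
```

Working from the values rather than the keys sidesteps the inconsistent key formats, because every value already carries both the name and the number.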

Adding multiple dictionaries to a key in python dictionary

I am trying to add multiple dictionaries to a key.
e.g.
value = { column1 : {entry1 : val1}
{entry2 : val2}
column2 : {entry3 : val3}
{entry4 : val4}
}
What exactly I am trying to do with this code is:
There is a file.txt which has columns and the valid entries for each column header. I am trying to make a dictionary with the columns as keys and, for each column, another dictionary for each valid entry.
So I am parsing the text file line by line to find the pattern for columns and entries and storing them in variables, then checking whether the column (which is a key) already exists in the dictionary; if it exists, I add another dictionary to the column, and if not, I create a new entry. I hope this makes sense.
Sample contents of file.txt
blah blah Column1 blah blah
entry1 val1
entry2 val2
blah blah Column2 blah blah
entry3 val3
entry4 val4
My code:
from __future__ import unicode_literals
import os, re, string, gzip, fnmatch, io
from array import *
header = re.compile(...) #some regex
valid_entries = re.compile(---) #some regex
matches=[]
entries=[]
value = {'MONTH OF INTERVIEW' : {'01': 'MIN VALUE'}}
counter = 0
name = ''
f =open(r'C:/file.txt')
def exists(data, name):
    for key in data.keys():
        if key == name :
            print "existing key : " + name
            return True
        else :
            return False
for line in f:
    col = ''
    ent = ''
    line = re.sub(ur'\u2013', '-', line)
    line = re.sub(ur'\u2026', '_', line)
    m = header.match(line)
    v = valid_entries.match(line)
    if m:
        name= ''
        matches.append(m.groups())
        _,_, name,_,_= m.groups()
        #print "name : " + name
    if v:
        entries.append(v.groups())
        ent,col= v.groups()
        #print v.groups()
        #print "col :" + col
        #print "ent :" + ent
    if (name is not None) and (ent is not None) and (col is not None):
        print value
        if exists(value, name):
            print 'inside existing loop'
            value[name].update({ent:col})
        else:
            value.update({name:{ent:col}})
        print value
The problem with this code is that it replaces the values of the sub-dictionary, and it also does not add all the values to the dictionary.
I am new to Python, so this could be a naive approach to this kind of situation. If you think there is a better way of getting what I want, I would really appreciate it if you told me.
Dictionaries have only one value per key. The trick is to make that value a container too, like a list:
value = {
'column1': [{entry1 : val1}, {entry2 : val2}],
'column2': [{entry3 : val3}, {entry4 : val4}]
}
Use dict.setdefault() to insert a list value when there is no value yet:
if name is not None and ent is not None and col is not None:
    value.setdefault(name, []).append({ent: col})
You could just make the values one dictionary with multiple (ent, col) key-value pairs here:
if name is not None and ent is not None and col is not None:
    value.setdefault(name, {})[ent] = col
Your exists() function was overcomplicating a task dictionaries excel at; testing for a key is done using in instead:
if name in value:
would have sufficed.
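As a small self-contained illustration of the setdefault() pattern described above, the sketch below skips the file parsing entirely. The list parsed is a hypothetical stand-in for the (name, ent, col) triples that the question's regular expressions would produce; those expressions are elided in the question and are not reconstructed here.

```python
# Hypothetical stand-in for the (name, ent, col) triples from the parser
parsed = [
    ("Column1", "entry1", "val1"),
    ("Column1", "entry2", "val2"),
    ("Column2", "entry3", "val3"),
    ("Column2", "entry4", "val4"),
]

value = {}
for name, ent, col in parsed:
    # setdefault returns the existing inner dict, or inserts and returns {}
    value.setdefault(name, {})[ent] = col

print(value)
```

No explicit existence check is needed, since setdefault() handles both the first and the later entries for a column.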
I would keep the values as lists of dictionaries, so you can extend or append:
>>> d = {}
>>> d[1] = [{'a': 1}]
>>> d[1].append({'b':2})
>>> d
{1: [{'a': 1}, {'b': 2}]}
You can use defaultdict and regex for this (demo here):
import re
from collections import defaultdict
with open('/path/to/file.txt') as f: # read the contents from the file
    lines = f.readlines()
d = defaultdict(list) # dict with default value: []
lastKey = None
for line in lines:
    m = re.search(r'Column\d', line) # search the current line for a key
    if m:
        lastKey = m.group()
    else:
        m = re.search(r'(?<=entry\d ).*', line) # search the current line for a value
        if m:
            d[lastKey].append(m.group()) # append the value
print(list(d.items()))
Output:
[('Column1', ['val1', 'val2']), ('Column2', ['val3', 'val4'])]
Note: Of course, the above code assumes your file.txt was formatted as in your example. For your real file.txt data you might have to adjust the regex.
