I am trying to print the following dictionary in a hierarchy format
fam_dict = {'6081740103':['60817401030000','60817401030100','60817401030200',
'60817401030300','60817401030400','60817401030500','60817401030600']}
as shown here:
60817401030000
    60817401030100
        60817401030200
            60817401030400
                60817401030500
                    60817401030600
So far I have the following code, which works, but I have to type the index by hand on each line. How can I restructure this code recursively, instead of counting the lines and entering each index value manually?
my_p = node(fam_dict['6081740103'][0], None)
my_c = node(fam_dict['6081740103'][1], my_p)
my_d = node(fam_dict['6081740103'][2], my_c)
my_e = node(fam_dict['6081740103'][4], my_d)
my_f = node(fam_dict['6081740103'][5], my_e)
my_g = node(fam_dict['6081740103'][6], my_f)
print (my_p.name)
print_children(my_p)
You can try this:
fam_dict = {'6081740103':['60817401030000','60817401030100','60817401030200',
'60817401030300','60817401030400','60817401030500','60817401030600']}
for i, val in enumerate(fam_dict['6081740103']):
    print(' ' * i * 4 + val)
Which outputs your desired hierarchy:
60817401030000
    60817401030100
        60817401030200
            60817401030300
                60817401030400
                    60817401030500
                        60817401030600
You can create a variable that counts the line you are on and increment it each time through the loop. Multiplying the tab character '\t' by that counter controls how many tabs are printed before each value. Here is an example:
lines = 0
fam_dict = {'6081740103': ['60817401030000','60817401030100','60817401030200',
'60817401030300','60817401030400','60817401030500','60817401030600']}
for k, val in fam_dict.items():
    for v in val:
        lines += 1
        t = '\t'
        t = t * lines
        print(t + str(v))
Here is your output:
60817401030000
60817401030100
60817401030200
60817401030300
60817401030400
60817401030500
60817401030600
You can do it this way too.
for key in fam_dict.keys():
    for i in range(len(fam_dict[key])):
        print(i*"\t"+ fam_dict[key][i])
Here is an example:
fam_dict = {'6081740103':['60817401030000','60817401030100','60817401030200','60817401030300','60817401030400','60817401030500','60817401030600']}
for k, v in fam_dict.items():
    for i, s in enumerate(v):
        print("%s%s"% ("\t"*i, s))
In case you want to make nodes for it:
fam_dict = {'6081740103':['60817401030000','60817401030100','60817401030200','60817401030300','60817401030400','60817401030500','60817401030600']}
node_list = []
for k, v in fam_dict.items():
    last_parent = None
    for i, s in enumerate(v):
        print("%s%s" % ("\t" * i, s))
        # Each node gets its own string s (not the whole list v) and the previous node as parent.
        node_list.append(node(s, last_parent))
        last_parent = node_list[-1]
The root node will be node_list[0].
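The question's `node` class and `print_children` helper are not shown, so the following is only a sketch under assumed definitions. `Node`, `build_chain` and `format_tree` are hypothetical names: `Node` stores a name, a parent and a list of children, and `format_tree` walks the children recursively, indenting one level per generation.

```python
class Node:
    """Hypothetical stand-in for the question's `node` class."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def build_chain(names):
    """Link each name as the only child of the previous one; return the root."""
    root = last = None
    for name in names:
        last = Node(name, last)
        if root is None:
            root = last
    return root


def format_tree(node, depth=0, indent="    "):
    """Recursively collect one indented line per node."""
    lines = [indent * depth + node.name]
    for child in node.children:
        lines.extend(format_tree(child, depth + 1, indent))
    return lines


fam_dict = {'6081740103': ['60817401030000', '60817401030100', '60817401030200']}
root = build_chain(fam_dict['6081740103'])
tree_lines = format_tree(root)
print("\n".join(tree_lines))
```

Because the recursion follows `children`, the same `format_tree` would also handle a node with several children, which the loop-only versions cannot express.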
Try this:
fam_dict = {'6081740103':['60817401030000','60817401030100','60817401030200',
'60817401030300','60817401030400','60817401030500','60817401030600']}
l = fam_dict['6081740103']
for i in l:
    print(' '*l.index(i)*4+i)
Output:
60817401030000
    60817401030100
        60817401030200
            60817401030300
                60817401030400
                    60817401030500
                        60817401030600
In my LIST(not dictionary) I have these strings:
"K:60",
"M:37",
"M_4:47",
"M_5:89",
"M_6:91",
"N:15",
"O:24",
"P:50",
"Q:50",
"Q_7:89"
in output I need to have
"K:60",
"M_6:91",
"N:15",
"O:24",
"P:50",
"Q_7:89"
What is a possible solution?
Or, put another way: how can I take the entry with the maximum value among strings with the same tag?
Use re.split and list comprehension as shown below. Use the fact that when the dictionary dct is created, only the last value is kept for each repeated key.
import re
lst = [
"K:60",
"M:37",
"M_4:47",
"M_5:89",
"M_6:91",
"N:15",
"O:24",
"P:50",
"Q:50",
"Q_7:89"
]
dct = dict([ (re.split(r'[:_]', s)[0], s) for s in lst])
lst_uniq = list(dct.values())
print(lst_uniq)
# ['K:60', 'M_6:91', 'N:15', 'O:24', 'P:50', 'Q_7:89']
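To illustrate the "last value wins" behaviour this answer relies on, here is a small sketch. `last_per_tag` is a hypothetical helper name, and the two short inputs are made up for the demonstration; the point is that the result depends on the order of the input, so the approach only returns the maximum when the largest value for each tag comes last.

```python
import re


def last_per_tag(items):
    """Keep the last string seen for each tag (the text before ':' or '_')."""
    return list({re.split(r'[:_]', s)[0]: s for s in items}.values())


sorted_input = ["M:37", "M_4:47", "M_6:91"]
shuffled_input = ["M_6:91", "M:37", "M_4:47"]

print(last_per_tag(sorted_input))    # ['M_6:91']
print(last_per_tag(shuffled_input))  # ['M_4:47']
```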
Probably far from the cleanest but here is a method quite easy to understand.
l = ["K:60", "M:37", "M_4:47", "M_5:89", "M_6:91", "N:15", "O:24", "P:50", "Q:50", "Q_7:89"]
reponse = []
val = []
complete_val = []
for x in l:
    if x[0] not in reponse:
        reponse.append(x[0])
        complete_val.append(x.split(':')[0])
        val.append(int(x.split(':')[1]))
    elif int(x.split(':')[1]) > val[reponse.index(x[0])]:
        val[reponse.index(x[0])] = int(x.split(':')[1])
for x in range(len(complete_val)):
    print(str(complete_val[x]) + ":" + str(val[x]))
K:60
M:91
N:15
O:24
P:50
Q:89
I do not see any straightforward technique. Other than iterating over the whole list and computing the result yourself, I do not see a built-in that can be used. I have written this so that the values in your input do not need to be sorted.
But I like the answer posted by Timur Shtatland; you can make use of that if the values in your input are already sorted.
# a is the input list of "tag:value" strings from the question
intermediate = {}
for item in a:
    key, val = item.split(':')
    key = key.split('_')[0]
    val = int(val)
    if intermediate.get(key, (float('-inf'), None))[0] < val:
        intermediate[key] = (val, item)
ans = [x[1] for x in intermediate.values()]
print(ans)
which gives:
['K:60', 'M_6:91', 'N:15', 'O:24', 'P:50', 'Q_7:89']
Let's say I have the following list.
my_list = ['4/10', '8/-', '9/2', '4/11', '-/13', '19/10', '25/-', '26/-', '4/12', '10/16']
I would like to count the occurrences of each item, and if one occurs more than once, store the matching entries in a new list.
For example, in the list above 4 occurs 3 times before the / (4/10, 4/11, 4/12), and 10 occurs twice after it (4/10, 19/10). So I would like to create a new list: new_list = ['4/10', '4/11', '4/12', '19/10'].
The position relative to the / also matters: if 10 appears as 4/10 and as 10/16, I don't want to treat it as a duplicate, since one is after the / and the other is before it.
Is there any way to count the occurrences of an item in a list and store them in a new list?
I tried the following but got an error.
new_list = []
d = Counter(my_list)
for v in d.items():
    if v > 1:
        new_list.append(v)
The error TypeError: '>' not supported between instances of 'tuple' and 'int'
Can anyone help with this?
I think the code below is quite self-explanatory and should work. If you have any issues or need clarification, feel free to ask.
NOTE: This code is not very efficient and can be improved a lot, but it will work all right as long as you are not running it on extremely large data.
my_list = ['4/10', '8/-', '9/2', '4/11', '-/13', '19/10', '25/-', '26/-', '4/12', '10/16']
frequency = {}
new_list = []
# First pass: count how often each number appears before the '/'.
for string in my_list:
    x = ''
    for j in string:
        if j == '/':
            break
        x += j
    if x.isdigit():
        frequency[x] = frequency.get(x, 0) + 1
# Second pass: keep the strings whose leading number appears more than once.
for string in my_list:
    x = ''
    for j in string:
        if j == '/':
            break
        x += j
    if x.isdigit():
        if frequency[x] > 1:
            new_list.append(string)
print(new_list)
.items() is not what you think: it returns key-value pairs (tuples), not the values alone. You want to:
d = Counter(node)
new_list = [ k for (k,v) in d.items() if v > 1 ]
Besides, I am not sure how node is related to my_list but I think there is some additional processing you didn't show.
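As a small illustration of what `.items()` yields, here is a sketch that uses `my_list` from the question. Counting only the part before the slash is an assumption made for the demo, not something the question's code does.

```python
from collections import Counter

my_list = ['4/10', '8/-', '9/2', '4/11', '-/13', '19/10', '25/-', '26/-', '4/12', '10/16']

# Count only the part before the slash (an assumption about what should be counted).
counts = Counter(item.split('/')[0] for item in my_list)

# Each element of .items() is a (key, count) tuple, which is why `v > 1` fails on it.
pairs = list(counts.items())
print(pairs[0])  # ('4', 3)

# Unpacking the tuple gives access to the count on its own.
repeated = [key for key, count in counts.items() if count > 1]
print(repeated)  # ['4']
```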
Update: after reading your comment clarifying the problem, I think it requires two separate counters:
first_parts = Counter([x.split('/')[0] for x in my_list])
second_parts = Counter([x.split('/')[1] for x in my_list])
first_duplicates = { k for (k,v) in first_parts.items() if v > 1 and k != '-' }
second_duplicates = { k for (k,v) in second_parts.items() if v > 1 and k != '-' }
new_list = [ e for e in my_list if
e.split('/')[0] in first_duplicates or e.split('/')[1] in second_duplicates ]
This might help: create a dict to hold the pairings, then extract the pairings that have more than one entry. defaultdict helps with aggregating data based on common keys.
from collections import defaultdict
d = defaultdict(list)
e = defaultdict(list)
m = [ent for ent in my_list if '-' not in ent]
for ent in m:
    front, back = ent.split('/')
    d[front].append(ent)
    e[back].append(ent)
new_list = []
for k, v in d.items():
    if len(v) > 1:
        new_list.extend(v)
for k, v in e.items():
    if len(v) > 1:
        new_list.extend(v)
# Sort numerically on both parts and drop duplicates such as '4/10',
# which is collected once by its front part and once by its back part.
sortr = lambda x: [int(ent) for ent in x.split("/")]
new_list = sorted(set(new_list), key=sortr)
print(new_list)
['4/10', '4/11', '4/12', '19/10']
I have that weird string (single line) where first field is a key, second is a value. It looks like this:
key1\val1\key2\val2\key3\val3\...\keyn\valn
What would be the best way to convert such notation to python dictionary?
Just split your string into a temporary list:
s = 'key1\\val1\\key2\\val2\\key3\\val3'
temp = s.split('\\')
d = {k: v for k, v in zip(temp[0::2], temp[1::2])}
Simple answer.
a = "key1\\val1\\key2\\val2\\key3\\val3"
b = a.split('\\')
dc = {}
for i in range(0,len(b), 2):
    dc[b[i]] = b[i+1]
Here is what I came up with:
import re
string = 'key1\\val1\\key2\\val2\\key3\\val3'
dictionary = {match.group(1): match.group(2) for match in re.finditer(r'(\w+)\\(\w+)', string)}
print(dictionary)
However, note that this works only if the keys and values consist solely of word characters (letters, digits and underscores; no spaces or other punctuation), because that is what \w matches. To accommodate other cases, you would have to modify the simple regex used in the code above.
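One possible modification, offered only as a sketch: replace `\w+` with `[^\\]+`, which matches any run of characters other than a backslash. The sample string is made up for the demo, and empty keys or values are not handled by this pattern.

```python
import re

s = 'first key\\val-1\\key_2\\val 2'

# [^\\]+ matches one or more characters that are not a backslash,
# so spaces and hyphens inside a key or value are accepted.
pattern = re.compile(r'([^\\]+)\\([^\\]+)')
parsed = {m.group(1): m.group(2) for m in pattern.finditer(s)}
print(parsed)  # {'first key': 'val-1', 'key_2': 'val 2'}
```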
This does it without any imports:
s = """key1\\val1\\key2\\val2\\key3\\val3\\...\\keyn\\valn"""
spl = s.split("\\")
m = {}
for i in range(0, len(spl)-1, 2):
    m[spl[i]] = spl[i+1]
print(m)
Use split and itertools.islice:
import itertools
def parse(ss):
    inp = ss.split('\\')
    keys, vals = itertools.tee(inp)
    keys = itertools.islice(keys,0,None,2)
    vals = itertools.islice(vals,1,None,2)
    nd = {}
    for key,val in zip(keys,vals):
        nd[key] = val
    return nd
In Python 2.7.12:
line = "key1\\val1\\key2\\val2\\key3\\val3"
line_data = line.split("\\")
line_dict = {}
print line_data
for i in range(0, len(line_data), 2):
    key = line_data[i]
    value = line_data[i+1]
    line_dict[key] = value
print line_dict
Below I am trying to read a file and store part of each repeating block of lines (the code uses the third line of each block).
The file has 4000 lines: a pattern of 4 lines that repeats 1000 times.
I read the lines and split them into three lists x, y, z, but these hold strings. In the next for-loop I try to convert the lists into numpy arrays, using a dictionary. However, when I print the type of y[0] at the end of the code, it is still str. As far as I understand, Python did not store the numpy array p as y, although I loop over x, y, z.
#!/usr/bin/env python
import numpy as np
import matplotlib.pyplot as plt
fl = open('input.sis','r')
lines = []
x = []
y = []
z = []
for i in range(1000):
    line = []
    for j in range(4):
        f = fl.readline()
        line.append(f)
    lines.append(line)
    xyz = lines[i][2].split(' ')
    x.append(xyz[0])
    y.append(xyz[1])
    z.append(xyz[2])
fl.close()
dic = {'x':x,'y':y,'z':z}
for k in dic:
    p = dic[k]
    p = np.asfarray(p)
print(type(p))
print(type(y[0]))
Any idea how to tell python to recognize that p = np.asfarray(p) is actually y = np.asfarray(y) and when I print the type of y at the end to be float instead of str? Your help will be highly appreciated!
A way to understand what happens is to replace "=" by "is now a name for".
What you do in your loop is:
p is now a name for dic[k]
p is now a name for the output of np.asfarray(p)
and so on, in each loop. When you leave the for loop, p refers to the output of np.asfarray(dic['z']). And that's all that happened here.
If you want to update the values in your dict, you should do:
dic = {'x':x,'y':y,'z':z}
for k in dic:
    dic[k] = np.asfarray(dic[k])
or, a bit nicer:
for key, value in dic.items():
    dic[key] = np.asfarray(value)
Now, dic['y'] refers to the array returned by np.asfarray.
But you haven't done anything to y, so it still refers to the same object as before. If you want to change that, you must write something like y = ....
You could for example do:
y = dic['y']
For a more thorough explanation, have a look at Facts and myths about Python names and values
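The "is now a name for" reading can be checked with a short sketch. To keep it free of dependencies, a list comprehension with `float` stands in for `np.asfarray` here; that substitution and the sample values are assumptions made for the demo.

```python
y = ['1.0', '2.5']
dic = {'y': y}

for k in dic:
    p = dic[k]                    # p is now a name for the same list as y
    p = [float(s) for s in p]     # p is now a name for a NEW list; y and dic[k] are untouched

print(type(y[0]).__name__)         # str
print(type(dic['y'][0]).__name__)  # str

dic['y'] = [float(s) for s in dic['y']]  # rebinding the dict entry changes dic, not y
print(type(dic['y'][0]).__name__)  # float
print(type(y[0]).__name__)         # str

y = dic['y']                       # only an assignment to y changes what y names
print(type(y[0]).__name__)         # float
```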
for k in dic:
    p = dic[k]
    p = np.asfarray(p)
print(type(p))
Should be
for k in dic:
    p = dic[k]
    dic[k] = np.asfarray(p)
    print(type(dic[k]))
When you set p = dic[k], you look up dic[k] and bind its value to p; since dic['x'] is x, p now refers to the same list as x. To store the converted value under its key, assign to dic[k] = np.asfarray(p); for k = 'x' this amounts to dic['x'] = np.asfarray(x), so the array becomes the value for that key.
Here's a small example that breaks down what's happening:
dicta = {'a': 1 }
for k in dicta:
    print(dicta[k])
    p = dicta[k]
    print(p)
    dicta[k] = 3*p
print(dicta['a'])
1
1
3
The answer to the question I asked is to use globals(). Some people would discourage using it, but it is the only solution I could find. Let me explain:
Step 1. In the first for-loop, y.append(xyz[1]) builds y as a list of str; the same holds for x and z.
Step 2. I create a dictionary of the variables x, y and z and their values.
Step 3. I want to loop over each variable in dic and change x, y and z from lists of str to numpy arrays.
However, the loop only rebinds the temporary name p, so when I print the type of y[0] it is still str.
Now if the second loop is replaced by:
dic = {'x':x,'y':y,'z':z}
for k in sorted(dic):
    globals()[k] = np.asfarray(dic[k])
print(type(y[0]))
I get for type(y[0]):
<type 'numpy.float64'>
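For comparison, here is a sketch of one way to rebind x, y and z without writing into globals(): assign to all three names in a single unpacking statement. `to_floats` is a hypothetical stand-in for `np.asfarray` so that the example runs without numpy, and the sample values are made up.

```python
x, y, z = ['1', '2'], ['3.5', '4.5'], ['5', '6']


def to_floats(values):
    """Stand-in for np.asfarray: convert a list of numeric strings to floats."""
    return [float(v) for v in values]


# Rebind all three names in one statement instead of writing into globals().
x, y, z = (to_floats(seq) for seq in (x, y, z))

print(type(y[0]).__name__)  # float
```

This keeps the assignment visible in the code, at the cost of listing the three names explicitly.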
I have a list containing strings as ['Country-Points'].
For example:
lst = ['Albania-10', 'Albania-5', 'Andorra-0', 'Andorra-4', 'Andorra-8', ...other countries...]
I want to calculate the average for each country without creating a new list. So the output would be (in the case above):
lst = ['Albania-7.5', 'Andorra-4.25', ...other countries...]
I would really appreciate it if anyone could help me with this.
EDIT:
This is what I've got so far. "data" is actually a dictionary, where each key is a country and the value is the list of points that other countries gave to this country (the one used as key). Again, I'm new to Python, so I don't really know all the built-in functions.
for key in self.data:
    lst = []
    index = 0
    score = 0
    cnt = 0
    s = str(self.data[key][0]).split("-")[0]
    for i in range(len(self.data[key])):
        if s in self.data[key][i]:
            a = str(self.data[key][i]).split("-")
            score += int(float(a[1]))
            cnt+=1
            index+=1
        if i+1 != len(self.data[key]) and not s in self.data[key][i+1]:
            lst.append(s + "-" + str(float(score/cnt)))
            s = str(self.data[key][index]).split("-")[0]
            score = 0
    self.data[key] = lst
itertools.groupby with a suitable key function can help:
import itertools
def get_country_name(item):
    return item.split('-', 1)[0]
def get_country_value(item):
    return float(item.split('-', 1)[1])
def country_avg_grouper(lst):
    for ctry, group in itertools.groupby(lst, key=get_country_name):
        values = list(get_country_value(c) for c in group)
        avg = sum(values)/len(values)
        yield '{country}-{avg}'.format(country=ctry, avg=avg)
lst[:] = country_avg_grouper(lst)
The key here is that I wrote a function to do the change out of place and then I can easily make the substitution happen in place by using slice assignment.
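The difference between slice assignment and plain assignment can be seen in a short sketch. The `alias` name and the upper/lower-casing are only there for the demonstration: slice assignment replaces the contents of the existing list object, so every other reference to it sees the change, whereas plain assignment binds the name to a new list.

```python
lst = ['Albania-10', 'Albania-5', 'Andorra-0', 'Andorra-4', 'Andorra-8']
alias = lst             # a second name for the same list object

lst[:] = [s.upper() for s in lst]   # slice assignment: contents replaced in place
print(alias is lst)     # True
print(alias[0])         # ALBANIA-10

lst = [s.lower() for s in lst]      # plain assignment: lst now names a new list
print(alias is lst)     # False
print(alias[0])         # ALBANIA-10
```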
I would probably do this with an intermediate dictionary.
def country(s):
    return s.split('-')[0]
def value(s):
    return float(s.split('-')[1])
def country_average(lst):
    # Maps each country to a running (total, count) pair.
    country_map = {}
    for point in lst:
        c = country(point)
        v = value(point)
        old = country_map.get(c, (0, 0))
        country_map[c] = (old[0]+v, old[1]+1)
    return ['%s-%f' % (name, total/count)
            for (name, (total, count)) in country_map.items()]
It traverses the original list only once, at the expense of quite a few tuple allocations.