Below I am trying to read a file and keep one line out of every group of four (the third line of each group).
The file has 4000 lines, made up of a 4-line pattern that repeats 1000 times.
I read the file and split each selected line into three lists x, y and z, but their elements are strings. In the second for-loop I try to convert the lists into numpy arrays, using a dictionary. However, when I print the type of y[0] at the end of the code, it is still str. As far as I can tell, Python did not store the numpy array p as y, even though I loop over x, y and z.
#!/usr/bin/env python
import numpy as np
import matplotlib.pyplot as plt
fl = open('input.sis','r')
lines = []
x = []
y = []
z = []
for i in range(1000):
    line = []
    for j in range(4):
        f= fl.readline()
        line.append(f)
    lines.append(line)
    xyz = lines[i][2].split(' ')
    x.append(xyz[0])
    y.append(xyz[1])
    z.append(xyz[2])
fl.close()
dic = {'x':x,'y':y,'z':z}
for k in dic:
    p = dic[k]
    p = np.asfarray(p)
    print(type(p))
print(type(y[0]))
Any idea how to tell Python that p = np.asfarray(p) should actually mean y = np.asfarray(y), so that the elements of y are float instead of str when I print their type at the end? Your help will be highly appreciated!
A way to understand what happens is to replace = by "is now a name for".
What you do in your loop is:
p is now a name for dic[k]
p is now a name for the output of np.asfarray(p)
and so on, in each iteration. When you leave the for loop, p refers to the output of np.asfarray(dic['z']) (assuming 'z' is the last key visited). And that's all that happened here: neither dic nor x, y, z was changed.
If you want to update the values in your dict, you should do:
dic = {'x':x,'y':y,'z':z}
for k in dic:
    dic[k] = np.asfarray(dic[k])
or, a bit nicer:
for key, value in dic.items():
    dic[key] = np.asfarray(value)
Now, dic['y'] refers to the array returned by np.asfarray.
But you haven't done anything to y, so it still refers to the same object as before. If you want to change that, you must write something like y = ....
You could for example do:
y = dic['y']
For a more thorough explanation, have a look at Facts and myths about Python names and values
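The three situations described above can be reproduced in a few lines. This is a minimal sketch, not the original poster's code: `to_floats` is a hypothetical plain-Python stand-in for np.asfarray so that the example runs without numpy, and the sample data is made up.

```python
# Hypothetical plain-Python stand-in for np.asfarray: strings -> floats.
def to_floats(strings):
    return [float(s) for s in strings]

y = ["1.5", "2.5"]          # made-up sample data
dic = {"y": y}

# 1) Rebinding the loop-local name p changes neither dic nor y.
for k in dic:
    p = dic[k]
    p = to_floats(p)
assert dic["y"] is y and isinstance(y[0], str)

# 2) Assigning to dic[k] updates the dict entry, but y is still the old list.
for k in dic:
    dic[k] = to_floats(dic[k])
print(type(dic["y"][0]).__name__, type(y[0]).__name__)   # float str

# 3) Only an explicit assignment makes y a name for the converted data.
y = dic["y"]
print(type(y[0]).__name__)   # float
```

The same behaviour applies when `to_floats` is swapped for a numpy conversion, since it comes from how names are bound and not from the conversion function.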
for k in dic:
    p = dic[k]
    p = np.asfarray(p)
    print(type(p))
Should be
for k in dic:
    p = dic[k]
    dic[k] = np.asfarray(p)
    print(type(dic[k]))
When you set p = dic[k] you look up dic[k] and retrieve a value; since dic['x'] is x, p is now x. To store the converted value under its key, you want to call dic[k] = np.asfarray(p). For k = 'x' this translates to dic['x'] = np.asfarray(x), so the array is now assigned as the value for that key.
Here's a visual to break down what's happening:
dicta = {'a': 1 }
for k in dicta:
    print(dicta[k])
    p = dicta[k]
    print(p)
    dicta[k] = 3*p
print(dicta['a'])
1
1
3
for i in range(1000):
    line = []
    for j in range(4):
        f= fl.readline()
        line.append(f)
    lines.append(line)
    xyz = lines[i][2].split(' ')
    x.append(xyz[0])
    y.append(xyz[1])
    z.append(xyz[2])
fl.close()
dic = {'x':x,'y':y,'z':z}
for k in dic:
    p = dic[k]
    p = np.asfarray(p)
    print(type(p))
print(type(y[0]))
The answer to the question I asked is to use globals(). Some people discourage using it, but it is the only solution I could find. Let me explain:
Step 1. In the first for-loop, y.append(xyz[1]) builds y as a list of str; the same holds for x and z.
Step 2. I create a dictionary of these variables x, y and z and their values.
Step 3. I want to loop over each variable in dic and change x, y and z from lists of str to numpy arrays. The loop only rebinds p, however, so x, y and z are left unchanged.
Therefore when I print the type of y[0] it is still str.
Now if the second loop is replaced by:
dic = {'x':x,'y':y,'z':z}
for k in sorted(dic):
    globals()[k] = np.asfarray(dic[k])
print(type(y[0]))
I get for type(y[0]):
<type 'numpy.float64'>
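For comparison, here is a sketch of an alternative that rebinds x, y and z without touching globals(), using tuple unpacking. It is not the poster's code: the sample data is made up, and `to_floats` is a hypothetical plain-Python stand-in for the numpy conversion so the example runs without numpy. With numpy installed, np.asarray(v, dtype=float) could take its place; np.asfarray itself was removed in NumPy 2.0.

```python
# Made-up sample data standing in for the lists read from the file.
x = ["1.0", "2.0"]
y = ["3.0", "4.0"]
z = ["5.0", "6.0"]

def to_floats(strings):
    """Hypothetical stand-in for np.asarray(strings, dtype=float)."""
    return [float(s) for s in strings]

# Rebind all three names in a single statement; no globals() lookup needed.
x, y, z = (to_floats(v) for v in (x, y, z))

print(type(y[0]).__name__)   # float
```

Because the assignment names x, y and z explicitly, this also works inside a function, where writing to globals() would not update local variables.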
In my list (not a dictionary) I have these strings:
"K:60",
"M:37",
"M_4:47",
"M_5:89",
"M_6:91",
"N:15",
"O:24",
"P:50",
"Q:50",
"Q_7:89"
In the output I need to have:
"K:60",
"M_6:91",
"N:15",
"O:24",
"P:50",
"Q_7:89"
What is a possible solution?
Or, more generally, how can I keep the entry with the maximum value among strings that share the same tag?
Use re.split and list comprehension as shown below. Use the fact that when the dictionary dct is created, only the last value is kept for each repeated key.
import re
lst = [
"K:60",
"M:37",
"M_4:47",
"M_5:89",
"M_6:91",
"N:15",
"O:24",
"P:50",
"Q:50",
"Q_7:89"
]
dct = dict([ (re.split(r'[:_]', s)[0], s) for s in lst])
lst_uniq = list(dct.values())
print(lst_uniq)
# ['K:60', 'M_6:91', 'N:15', 'O:24', 'P:50', 'Q_7:89']
Probably far from the cleanest but here is a method quite easy to understand.
l = ["K:60", "M:37", "M_4:47", "M_5:89", "M_6:91", "N:15", "O:24", "P:50", "Q:50", "Q_7:89"]
reponse = []
val = []
complete_val = []
for x in l:
    if x[0] not in reponse:
        reponse.append(x[0])
        complete_val.append(x.split(':')[0])
        val.append(int(x.split(':')[1]))
    elif int(x.split(':')[1]) > val[reponse.index(x[0])]:
        val[reponse.index(x[0])] = int(x.split(':')[1])
for x in range(len(complete_val)):
    print(str(complete_val[x]) + ":" + str(val[x]))
K:60
M:91
N:15
O:24
P:50
Q:89
I do not see any straightforward technique: other than iterating over the whole list and computing the result yourself, I do not see a built-in that can be used. I have written this, which does not require the values in your input to be sorted.
But I like the answer posted by Timur Shtatland; you can make use of that if your values are already sorted in the input.
# a is the input list from the question
a = ["K:60", "M:37", "M_4:47", "M_5:89", "M_6:91", "N:15", "O:24", "P:50", "Q:50", "Q_7:89"]
intermediate = {}
for item in a:
    key, val = item.split(':')
    key = key.split('_')[0]
    val = int(val)
    # keep the item only if its value beats the best seen so far for this tag
    if intermediate.get(key, (float('-inf'), None))[0] < val:
        intermediate[key] = (val, item)
ans = [x[1] for x in intermediate.values()]
print(ans)
which gives:
['K:60', 'M_6:91', 'N:15', 'O:24', 'P:50', 'Q_7:89']
Hi, I'm new to Kattis. I've done the assignment "oddmanout"; it works when I run it locally, but I get a runtime error when I submit it to Kattis. I'm not sure why.
from collections import Counter
cases = int(input())
i = 0
case = 0
while cases > i:
    list = []
    i = 1 + i
    case = case + 1
    guests = int(input())
    f = 0
    while f < guests:
        f = f + 1
        invitation_number = int(input())
        list.append(invitation_number)
    d = Counter(list)
    res = [k for k, v in d.items() if v == 1]
    resnew = str(res)[1:-1]
    print(f'Case#{case}: {resnew}')
Looking at the input data on Kattis: invitation_number = int(input()) expects a single integer, but the third line of the input holds all the invitation numbers at once. int() is handed that whole line, and a ValueError is the result.
With invitation_numbers = list(map(int, input().split())) or alternatively invitation_numbers = [int(x) for x in input().split()] you get the numbers as a list of integers directly.
You may have to rework your approach afterwards, since the second while loop is no longer needed. Additionally, you don't have to use a Counter: running through a sorted list and comparing the entries pairwise may give you the solution as well.
Also try to avoid naming your variables after built-in types (list = []), since that shadows the built-in.
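The sorted, pairwise comparison mentioned above could look roughly like this. It is only a sketch under one assumption: every invitation number appears exactly twice except for a single number that appears once. The function name `odd_one_out` and the sample line are made up for illustration, and reading the cases and printing in the format Kattis expects is left out.

```python
def odd_one_out(numbers):
    """Return the value without a partner (assumes exactly one such value)."""
    ordered = sorted(numbers)
    # After sorting, equal values sit next to each other, so step in pairs.
    for i in range(0, len(ordered) - 1, 2):
        if ordered[i] != ordered[i + 1]:
            return ordered[i]
    # Every earlier pair matched, so the unpaired value is the last one.
    return ordered[-1]

line = "1 2147483647 2147483647"   # made-up line of invitation numbers
invitation_numbers = [int(x) for x in line.split()]
print(odd_one_out(invitation_numbers))
```

Sorting costs O(n log n) where a Counter is linear, so this is an alternative rather than an improvement; it simply avoids the extra dictionary.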
I have that weird string (single line) where first field is a key, second is a value. It looks like this:
key1\val1\key2\val2\key3\val3\...\keyn\valn
What would be the best way to convert such notation to python dictionary?
Just split your string into a temporary list:
s = 'key1\\val1\\key2\\val2\\key3\\val3'
temp = s.split('\\')
d = {k: v for k, v in zip(temp[0::2], temp[1::2])}
Simple answer.
a = "key1\\val1\\key2\\val2\\key3\\val3"
b = a.split('\\')
dc = {}
for i in range(0,len(b), 2):
    dc[b[i]]=b[i+1]
Here is what I came up with:
import re
string = 'key1\\val1\\key2\\val2\\key3\\val3'
dictionary = {match.group(1): match.group(2) for match in re.finditer(r'(\w+)\\(\w+)', string)}
print(dictionary)
However, note that this works only if the keys and values consist solely of word characters (letters, digits and underscores, which is what \w matches); spaces, hyphens and the like will break it. To accommodate such cases, you would have to modify the simple regex used in the code above.
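One possible way to loosen the pattern is to match "anything that is not a backslash" instead of \w. The sketch below makes that assumption; the input string is made up, and empty keys or values (two backslashes in a row) are still not handled.

```python
import re

# Made-up input whose keys and values contain spaces and hyphens.
string = 'key 1\\val-1\\key_2\\val 2'

# [^\\]+ matches one or more characters that are not a backslash.
pattern = re.compile(r'([^\\]+)\\([^\\]+)')
dictionary = {m.group(1): m.group(2) for m in pattern.finditer(string)}
print(dictionary)
```

For this input the keys are 'key 1' and 'key_2'. If fields can be empty, splitting on the backslash is probably the simpler route.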
This does it without any imports:
s = """key1\\val1\\key2\\val2\\key3\\val3\\...\\keyn\\valn"""
spl = s.split("\\")
m = {}
for i in range(0, len(spl)-1, 2):
    m[spl[i]] = spl[i+1]
print(m)
Use split and itertools.islice:
import itertools
def parse(ss):
    inp = ss.split('\\')
    keys, vals = itertools.tee(inp)
    keys = itertools.islice(keys,0,None,2)
    vals = itertools.islice(vals,1,None,2)
    nd = {}
    for key,val in zip(keys,vals):
        nd[key] = val
    return nd
In Python 2.7.12:
line = "key1\\val1\\key2\\val2\\key3\\val3"
line_data = line.split("\\")
line_dict = {}
print line_data
for i in range(0, len(line_data), 2):
    key = line_data[i]
    value = line_data[i+1]
    line_dict[key] = value
print line_dict
I am trying to print the following dictionary in a hierarchy format:
fam_dict = {'6081740103':['60817401030000','60817401030100','60817401030200',
    '60817401030300','60817401030400','60817401030500','60817401030600']}
as shown here:
60817401030000
    60817401030100
        60817401030200
            60817401030400
                60817401030500
                    60817401030600
So far I have the following code, which works, but I have to type the i-th index manually on each line. How can I rewrite this code recursively, instead of counting the lines of code and entering the index value by hand each time?
my_p = node(fam_dict['6081740103'][0], None)
my_c = node(fam_dict['6081740103'][1], my_p)
my_d = node(fam_dict['6081740103'][2], my_c)
my_e = node(fam_dict['6081740103'][4], my_d)
my_f = node(fam_dict['6081740103'][5], my_e)
my_g = node(fam_dict['6081740103'][6], my_f)
print (my_p.name)
print_children(my_p)
You can try this:
fam_dict = {'6081740103':['60817401030000','60817401030100','60817401030200',
    '60817401030300','60817401030400','60817401030500','60817401030600']}
for i, val in enumerate(fam_dict['6081740103']):
    print(' ' * i * 4 + val)
Which outputs your desired hierarchy:
60817401030000
    60817401030100
        60817401030200
            60817401030300
                60817401030400
                    60817401030500
                        60817401030600
You can create a variable that counts the line you are on and increment it each time through the loop. Multiplying the tab character '\t' by that counter controls how many tabs are printed before each value. Here is an example:
lines = 0
fam_dict = {'6081740103': ['60817401030000','60817401030100','60817401030200',
    '60817401030300','60817401030400','60817401030500','60817401030600']}
for k, val in fam_dict.items():
    for v in val:
        lines += 1
        t = '\t'
        t = t * lines
        print(t + str(v))
Here is your output:
60817401030000
60817401030100
60817401030200
60817401030300
60817401030400
60817401030500
60817401030600
You can do it this way too.
for key in fam_dict.keys():
    for i in range(len(fam_dict[key])):
        print(i*"\t"+ fam_dict[key][i])
Here is an example:
fam_dict = {'6081740103':['60817401030000','60817401030100','60817401030200','60817401030300','60817401030400','60817401030500','60817401030600']}
for k, v in fam_dict.items():
    for i, s in enumerate(v):
        print("%s%s"% ("\t"*i, s))
In case you want to make nodes for it:
fam_dict = {'6081740103':['60817401030000','60817401030100','60817401030200','60817401030300','60817401030400','60817401030500','60817401030600']}
node_list = []
for k, v in fam_dict.items():
    last_parent = None
    for i, s in enumerate(v):
        print("%s%s"% ("\t"*i, s))
        # each node takes its own name s and the previous node as parent
        node_list.append(node(s, last_parent))
        last_parent=node_list[-1]
The parent node will be node_list[0].
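Since the question asks for a recursive version, here is one way the node approach could be combined with recursive printing. The question never shows its `node` class or `print_children`, so the `Node` class, `build_chain` and `render` below are hypothetical stand-ins, and the sketch assumes each entry in the list is the child of the entry before it.

```python
class Node:
    """Hypothetical stand-in for the question's `node(name, parent)` class."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def build_chain(names):
    """Link the names so that each one is the child of the previous one."""
    root = None
    parent = None
    for name in names:
        current = Node(name, parent)
        if root is None:
            root = current
        parent = current
    return root

def render(node, depth=0, indent="    "):
    """Recursively collect one indented line per node, depth first."""
    lines = [indent * depth + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1, indent))
    return lines

fam_dict = {'6081740103': ['60817401030000', '60817401030100', '60817401030200',
                           '60817401030300', '60817401030400', '60817401030500',
                           '60817401030600']}
root = build_chain(fam_dict['6081740103'])
print("\n".join(render(root)))
```

For a flat list like this one the loop-based answers are simpler; the recursion only pays off once a node can have more than one child.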
Try this:
fam_dict = {'6081740103':['60817401030000','60817401030100','60817401030200',
    '60817401030300','60817401030400','60817401030500','60817401030600']}
l = fam_dict['6081740103']
for i in l:
    print(' '*l.index(i)*4+i)
Output:
60817401030000
    60817401030100
        60817401030200
            60817401030300
                60817401030400
                    60817401030500
                        60817401030600
This is a simple program, but I am having difficulty understanding how it actually works.
I have a data file in which each line is a tuple of 3 tab-separated fields.
import matplotlib.pyplot as plt
queries = {}
rewrites = {}
urls = {}
for line in open("data.tsv"):
    q, r, u = line.strip().split("\t")
    queries.setdefault(q,0)
    queries[q] += 1
    rewrites.setdefault(r,0)
    rewrites[r] += 1
    urls.setdefault(u,0)
    urls[u] += 1
sQueries = []
sQueries = [x for x in rewrites.values()]
sQueries.sort()
x = range(len(sQueries))
line, = plt.plot(x, sQueries, '-' ,linewidth=2)
plt.show()
This is the whole program.
Now
queries.setdefault(q,0)
This command sets the value to 0 if the key q is not found.
queries[q] += 1
This command increments the value stored under the key q by 1.
We do the same for the other two fields.
Then,
sQueries = [x for x in rewrites.values()]
stores the values from the dictionary rewrites in the list sQueries.
x = range(len(sQueries))
I do not understand what this command does. Can anyone please explain?
len(sQueries)
gives the number of elements in your list sQueries
x = range(len(sQueries))
will make x the sequence of integers 0, 1, ... up to (but not including) the length of your sQueries list (a list in Python 2, a range object in Python 3)
This:
sQueries = []
sQueries = [x for x in rewrites.values()]
sQueries.sort()
is an obtuse way of writing
sQueries = rewrites.values()
sQueries = sorted(sQueries)
in other words, sort the values of the rewrites dictionary. If, for the sake of argument, sQueries == [2, 3, 7, 9], then len(sQueries) == 4 and list(range(4)) == [0, 1, 2, 3].
So, now you're plotting (0,2), (1,3), (2,7), (3,9), which doesn't seem very useful to me. It seems more likely that you would want the keys of rewrites on the x-axis, which would be the distinct values of r that you read from the TSV file.
length = len(sQueries) # this is length of sQueries
r = range(length) # this one means from 0 to length-1
so
x = range(len(sQueries)) # means x is from 0 to sQueries length - 1
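To see the pieces together without a data file or matplotlib, here is a small sketch. The column contents are made up, and Counter is used in place of the setdefault(key, 0) followed by += 1 pattern from the question, since it does the same counting.

```python
from collections import Counter

# Made-up stand-in for the second column (r) of data.tsv.
rewrite_column = ["a", "b", "a", "c", "a", "b"]

# Counter does the same job as setdefault(key, 0) followed by += 1.
rewrites = Counter(rewrite_column)

s_queries = sorted(rewrites.values())   # the counts in ascending order
x = range(len(s_queries))               # positions 0, 1, 2

# These (x, y) pairs are what plt.plot(x, sQueries) would draw.
points = list(zip(x, s_queries))
print(points)   # [(0, 1), (1, 2), (2, 3)]
```

So x only numbers the sorted counts; it carries no information about which rewrite each count belongs to.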