I have a dictionary that I want to filter, storing the results in another dictionary. The condition is: if the difference between two consecutive keys (for example 31 - 30 = 1) is smaller than 5, add their values together and store the sum in the new dictionary under the larger key; otherwise keep the key with its original value.
a = {"20" : "1.5", "30" : "2.0", "31" : "1.0", "40" : "1", "50" : "1.5"}
listb = []
listc = []
newdict = {}
for key in a:
    b = key
    c = a[key]
    listb.append(b)
    listc.append(c)
for i in range(len(listb)):
    low = listb[i]
    high = listb[i+1]
    diff = int(high) - int(low)
    # print(low)
    if (diff > 5):
        num = listc[i]
        # print(num)
        num_a = listb[i]
        # print(num_a)
        newdict[[num_a][i]] = num
        print((newdict))
    else:
        num = listc[i] + listc[i+1]
        print(num)
        num_a = listb[i+1]
        print(num_a)
        newdict[[num_a][i]] = num
        print(newdict)
The output of this should look something like
a = {"20" : "1.5", "31" : "3.0", "40" : "1", "50" : "1.5"}
Since you are comparing each element with the one before or after it, you want a data structure whose order you control. Dictionaries keep insertion order (Python 3.7+), but they are not sorted by key and cannot be indexed by position, so checking an item against the one right after it is awkward. A list of tuples is easier to work with. I'm not quite sure what you are trying to do, but I tried to interpret it with this code. I hope this helps :)
# Creating a as a list of tuples so that they are ordered
a = [(20, 1.5), (30, 2.0), (31, 1.0), (40, 1), (50, 1.5)]
new_list = []
# you looped through len(a) and read index i+1, which raises an IndexError on the last item,
# so check that a next item exists before looking at it
i = 0
while i < len(a):
    # The first element of each tuple is the key, the second the value
    low_key, low_value = a[i]
    if i + 1 < len(a) and a[i+1][0] - low_key < 5:
        high_key, high_value = a[i+1]
        new_list.append((high_key, low_value + high_value))
        # the next item has been merged, so skip it to avoid counting it twice
        i += 2
    else:
        new_list.append((low_key, low_value))
        i += 1
print(new_list)
If the only other answer requires the use of Pandas then I feel compelled to offer an alternative (I hate Pandas).
This should give what you describe. I'm not currently able to test though.
a = {"20" : "1.5", "30" : "2.0", "31" : "1.0", "40" : "1", "50" : "1.5"}
# your listb and listc are just the keys and values of a. So I'm going to delete all of this listb listc setup stuff.
keys = list(a.keys())  # dict views cannot be indexed in Python 3, so copy them into a list
values = [float(v) for v in a.values()]  # convert, otherwise + would concatenate the strings
newdict = {}
skip = False # This is a pretty brute force way to just check whether we've already accounted for the "next" value. Otherwise you will double count.
for i in range(len(keys)):
    if skip:
        skip = False
        continue
    # The last key has no "next" key, so only compare when i + 1 exists.
    # If "diff" is actually meant to be a diff, then we need to use abs
    if i + 1 < len(keys) and abs(int(keys[i+1]) - int(keys[i])) <= 5:
        # store the sum under the higher key, as in your expected output
        newdict[keys[i+1]] = values[i] + values[i+1]
        skip = True
    else:
        newdict[keys[i]] = values[i]
print(newdict)
Please note that if you have several keys in a row that are all < 5 apart, this might not behave as expected. It also isn't clear from the description what you would actually want in a case where the keys were, eg, 40, 44, 48 (group 40 and 44 or group all 3 numbers?). But based on what you describe the above implements it.
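To make the open question about runs of close keys concrete, here is a minimal sketch of one possible interpretation: every run of keys whose neighbours are less than 5 apart collapses into a single entry under the last key of the run. The function name merge_close_keys and the max_gap parameter are made up for this illustration, and the choice to merge whole runs is an assumption, not something the question specifies.

```python
def merge_close_keys(pairs, max_gap=5):
    """Merge runs of (key, value) pairs whose neighbouring keys differ by less than max_gap."""
    merged = []
    prev_key = None
    for key, value in sorted(pairs):
        if prev_key is not None and key - prev_key < max_gap:
            # Same run: replace the run's entry with the new key and the running total
            _, total = merged.pop()
            merged.append((key, total + value))
        else:
            merged.append((key, value))
        prev_key = key
    return merged


a = {"20": "1.5", "30": "2.0", "31": "1.0", "40": "1", "50": "1.5"}
pairs = [(int(k), float(v)) for k, v in a.items()]
print(merge_close_keys(pairs))
# [(20, 1.5), (31, 3.0), (40, 1.0), (50, 1.5)]
print(merge_close_keys([(40, 1.0), (44, 2.0), (48, 3.0)]))
# [(48, 6.0)]
```

With this reading, 40, 44 and 48 end up as one group. If only pairs should be merged, a skip flag like the one above is the simpler route.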
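One way to do this is to convert the dictionary to a pandas object, do the calculation there, and convert the result back to a dictionary. Note that the code below only drops the lower key of each close pair (30 here); it does not add the two values together.
import pandas as pd
d = {"20" : "1.5", "30" : "2.0", "31" : "1.0", "40" : "1", "50" : "1.5"}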
df = pd.Series(d)
df = df.reset_index().astype(float)
df['id']= df['index'].diff().shift(-1).fillna(10).values
df = df[df['id']>5]
df = df.set_index(['index'])
df = df.drop('id', axis=1)
df.to_dict()
{0: {20.0: 1.5, 31.0: 1.0, 40.0: 1.0, 50.0: 1.5}}
I'm not really clear on what you're trying to do, but I think with a couple of comments, you might be able to fix your code to achieve your objective, even if I don't fully understand what that objective is.
A dict keeps its insertion order (Python 3.7+) but is not sorted by key, and I believe your algorithm requires that the keys be in increasing order.
I'd change the second and third line to:
listb = sorted(a.keys(), key=int)
listc = [a[k] for k in listb]
The key=int argument makes the string keys sort numerically rather than alphabetically. Next you probably want to loop to len(listb) - 1. Otherwise listb[i + 1] will be out of bounds. Maybe you could check out the enumerate function, but then you'd need to check for being on the last iteration, and handle accordingly.
Finally, you could use some better variable names. a, listb, and listc don't really convey much meaning. Even a, a_keys, and a_values would have been easier to follow, but a name describing what a represents would be better still.
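As a small sketch of the two points about ordering and bounds: string keys sort alphabetically unless you pass key=int, and pairing each key with its successor via zip avoids indexing past the end. The dictionary here uses made-up keys chosen to make the difference visible; it is not the data from the question.

```python
a = {"50": "1.5", "20": "1.5", "100": "2.0", "9": "1.0"}  # made-up keys for illustration

print(sorted(a))             # ['100', '20', '50', '9']  (text order)
keys = sorted(a, key=int)
print(keys)                  # ['9', '20', '50', '100']  (numeric order)

# Pair each key with its successor; zip stops one short, so there is no i + 1 to overflow
gaps = [(low, high, int(high) - int(low)) for low, high in zip(keys, keys[1:])]
print(gaps)                  # [('9', '20', 11), ('20', '50', 30), ('50', '100', 50)]
```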
In my LIST(not dictionary) I have these strings:
"K:60",
"M:37",
"M_4:47",
"M_5:89",
"M_6:91",
"N:15",
"O:24",
"P:50",
"Q:50",
"Q_7:89"
in output I need to have
"K:60",
"M_6:91",
"N:15",
"O:24",
"P:50",
"Q_7:89"
What is the possible decision?
Or even maybe, how to take tag with the maximum among strings with the same tag.
Use re.split and list comprehension as shown below. Use the fact that when the dictionary dct is created, only the last value is kept for each repeated key.
import re
lst = [
"K:60",
"M:37",
"M_4:47",
"M_5:89",
"M_6:91",
"N:15",
"O:24",
"P:50",
"Q:50",
"Q_7:89"
]
dct = dict([ (re.split(r'[:_]', s)[0], s) for s in lst])
lst_uniq = list(dct.values())
print(lst_uniq)
# ['K:60', 'M_6:91', 'N:15', 'O:24', 'P:50', 'Q_7:89']
Probably far from the cleanest but here is a method quite easy to understand.
l = ["K:60", "M:37", "M_4:47", "M_5:89", "M_6:91", "N:15", "O:24", "P:50", "Q:50", "Q_7:89"]
reponse = []
val = []
complete_val = []
for x in l:
    if x[0] not in reponse:
        reponse.append(x[0])
        complete_val.append(x.split(':')[0])
        val.append(int(x.split(':')[1]))
    elif int(x.split(':')[1]) > val[reponse.index(x[0])]:
        val[reponse.index(x[0])] = int(x.split(':')[1])
for x in range(len(complete_val)):
    print(str(complete_val[x]) + ":" + str(val[x]))
K:60
M:91
N:15
O:24
P:50
Q:89
I do not see any straightforward technique. Other than iterating over the whole list and computing the result yourself, I do not see a built-in that can be used. I have written this so that the values in your input do not need to be sorted.
But I like the answer posted by Timur Shtatland; you can make use of that if the values in your input are already sorted.
a = ["K:60", "M:37", "M_4:47", "M_5:89", "M_6:91", "N:15", "O:24", "P:50", "Q:50", "Q_7:89"]  # the input list from the question
intermediate = {}
for item in a:
    key, val = item.split(':')
    key = key.split('_')[0]
    val = int(val)
    # keep the (value, original string) pair with the largest value seen for this tag
    if intermediate.get(key, (float('-inf'), None))[0] < val:
        intermediate[key] = (val, item)
ans = [x[1] for x in intermediate.values()]
print(ans)
which gives:
['K:60', 'M_6:91', 'N:15', 'O:24', 'P:50', 'Q_7:89']
My objective is to use an insertion sort to sort the contents of a CSV file by the numbers in the first column. For example, I want to take this:
[[7831703, Christian, Schmidt]
[2299817, Amber, Cohen]
[1964394, Gregory, Hanson]
[1984288, Aaron, White]
[9713285, Alexander, Kirk]
[7025528, Janice, Lee]
[6441979, Sarah, Browning]
[8815776, Rick, Wallace]
[2395480, Martin, Weinstein]
[1927432, Stephen, Morrison]]
and sort it to:
[[1927432, Stephen, Morrison]
[1964394, Gregory, Hanson]
[1984288, Aaron, White]
[2299817, Amber, Cohen]
[2395480, Martin, Weinstein]
[6441979, Sarah, Browning]
[7025528, Janice, Lee]
[7831703, Christian, Schmidt]
[8815776, Rick, Wallace]
[9713285, Alexander, Kirk]]
based on the numbers in the first column. My current Python code looks like this:
import csv
with open('EmployeeList.csv', newline='') as File:
    reader = csv.reader(File)
    readList = list(reader)
    for row in reader:
        print(row)
def insertionSort(readList):
    #Traverse through 1 to the len of the list
    for row in range(len(readList)):
        # Traverse through 1 to len(arr)
        for i in range(1, len(readList[row])):
            key = readList[row][i]
            # Move elements of arr[0..i-1], that are
            # greater than key, to one position ahead
            # of their current position
            j = i-1
            while j >=0 and key < readList[row][j] :
                readList[row] = readList[row]
                j -= 1
            readList[row] = key
insertionSort(readList)
print ("Sorted array is:")
for i in range(len(readList)):
    print ( readList[i])
The code can already sort the contents of a 2d array, but as it is it tries to sort everything.
I think if I got rid of the [] it would work but in testing it hasn't given what I needed.
To clarify again: I want to sort the rows by the numerical value in the first column.
Sorry if I didn't understand your need correctly. But you have a list and you need to sort it? Why don't you just use the sort method of the list object?
>>> data = [[7831703, "Christian", "Schmidt"],
... [2299817, "Amber", "Cohen"],
... [1964394, "Gregory", "Hanson"],
... [1984288, "Aaron", "White"],
... [9713285, "Alexander", "Kirk"],
... [7025528, "Janice", "Lee"],
... [6441979, "Sarah", "Browning"],
... [8815776, "Rick", "Wallace"],
... [2395480, "Martin", "Weinstein"],
... [1927432, "Stephen", "Morrison"]]
>>> data.sort()
>>> from pprint import pprint
>>> pprint(data)
[[1927432, 'Stephen', 'Morrison'],
[1964394, 'Gregory', 'Hanson'],
[1984288, 'Aaron', 'White'],
[2299817, 'Amber', 'Cohen'],
[2395480, 'Martin', 'Weinstein'],
[6441979, 'Sarah', 'Browning'],
[7025528, 'Janice', 'Lee'],
[7831703, 'Christian', 'Schmidt'],
[8815776, 'Rick', 'Wallace'],
[9713285, 'Alexander', 'Kirk']]
>>>
Note that here the first element is parsed as an integer. That matters if you want to sort by numerical value (99 comes before 100 as a number, but "100" sorts before "99" as a string).
And don't be confused by the pprint import. You don't need it to sort. I just used it to get nicer output in the console.
Also note that list.sort() is an in-place method. It doesn't return a sorted list but sorts the list itself.
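Since csv.reader returns every field as a string, the numeric-versus-text point matters when the rows come straight from a file. Here is a small sketch using an in-memory CSV; the rows are made up to show the difference and are not the employee data from the question.

```python
import csv
import io

# Made-up rows standing in for the CSV file; csv.reader yields every field as a string
csv_text = "100,Amber,Cohen\n99,Gregory,Hanson\n1000,Aaron,White\n"
rows = list(csv.reader(io.StringIO(csv_text)))

by_text = sorted(rows)                                 # compares the first column as text
by_number = sorted(rows, key=lambda row: int(row[0]))  # compares the first column as numbers

print([row[0] for row in by_text])    # ['100', '1000', '99']
print([row[0] for row in by_number])  # ['99', '100', '1000']
```

sorted() returns a new list and leaves rows untouched; rows.sort(key=...) takes the same key argument and sorts in place.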
*** EDIT ***
Here are two different approaches to the sort function. Both could be heavily optimized, but I hope they give you some ideas about how this can be done. Both should work, and you can add some print calls in the loops to see what happens there.
First, a recursive version. It orders the list a little bit more on every run until it is fully ordered.
def recursiveSort(readList):
    # You don't want to mess with the original data, so we work on a copy of it
    data = readList.copy()
    changed = False
    res = []
    while len(data): # while 1 should work here as well because eventually we break the loop
        if len(data) == 1:
            # There is only one element left. Let's add it to the end of our result.
            res.append(data[0])
            break
        if data[0][0] > data[1][0]:
            # We compare the first two elements in the list.
            # If the first one is bigger, we remove the second element from the original list and add it next to the result set.
            # Then we raise the changed flag to tell that we changed the order of the original list.
            res.append(data.pop(1))
            changed = True
        else:
            # otherwise we remove the first element from the list and add it next to the result list.
            res.append(data.pop(0))
    if not changed:
        # if no changes have been made, the list is in order
        return res
    else:
        # if we made changes, we sort the list one more time.
        return recursiveSort(res)
And here is an iterative version, closer to your original function.
def iterativeSort(readList):
    res = []
    for i in range(len(readList)):
        print (res)
        #loop through the original list
        if len(res) == 0:
            # if we don't have any items in our result list, we add first element here.
            res.append(readList[i])
        else:
            done = False
            for j in range(len(res)):
                #loop through the result list this far
                if res[j][0] > readList[i][0]:
                    #if our item in list is smaller than element in res list, we insert it here
                    res.insert(j, readList[i])
                    done = True
                    break
            if not done:
                #if our item in list is bigger than all the items in result list, we put it last.
                res.append(readList[i])
    print(res)
    return res
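For comparison, here is a minimal sketch of an in-place insertion sort that moves whole rows and compares only the first column. It assumes the rows come from csv.reader, so the first column is a string and is converted with int() for the comparison; the name insertion_sort_rows is made up for this example.

```python
def insertion_sort_rows(rows):
    """Sort rows in place by the integer value of their first column."""
    for i in range(1, len(rows)):
        current = rows[i]
        key = int(current[0])
        j = i - 1
        # Shift every row with a larger first column one position to the right
        while j >= 0 and int(rows[j][0]) > key:
            rows[j + 1] = rows[j]
            j -= 1
        # Drop the current row into the gap that opened up
        rows[j + 1] = current
    return rows


rows = [
    ["7831703", "Christian", "Schmidt"],
    ["2299817", "Amber", "Cohen"],
    ["1964394", "Gregory", "Hanson"],
]
insertion_sort_rows(rows)
for row in rows:
    print(row)
```

The difference from the code in the question is that the inner loop walks over rows rather than over the fields inside one row, so names stay attached to their numbers.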
I would like to use a for loop to get inputs for many questions I have to receive.
I managed to make some code, but it seems there should be
a better way.
Maybe I can reduce the number of variables I'm using?
## <Desired Result>
## onset : 3mo
## location : earth
## character : red
checks = ['onset','location','character']
l1 = ['onset','location','character']
l2 = ['onset','location','character']
for i in range(len(checks)):
    l2[i] = input(checks[i])
for i in range(len(checks)):
    print(l1[i]+" : "+l2[i])
A few observations on your code:
Note that you never actually change l1, so it is unnecessary: wherever you use l1, replace it with checks.
It is not necessary to define l2 this way, as you overwrite all its values anyway. You could just define l2 = [] and then use append in your loop:
for i in range(len(checks)):
    l2.append(input(checks[i]))
Both your loops have exactly the same range, so you could combine them into one:
for i in range(len(checks)):
    l2[i] = input(checks[i])
    print(l1[i]+" : "+l2[i])
Now, using list-comprehension and the join method of strings, you could actually reduce this code to 3 lines (and get rid of l1):
checks = ['onset', 'location', 'character']
l2 = [input(x) for x in checks]
print("\n".join(checks[i]+" : "+l2[i] for i in range(len(checks))))
Or more neatly using zip:
print("\n".join(check + " : " + ans for check, ans in zip(checks, l2)))
Lastly, to reduce even one more line (and get rid of l2):
checks = ['onset', 'location', 'character']
print("\n".join(check + " : " + input(check) for check in checks))
We could also avoid using join and use the chance to further reduce to one line (getting rid of checks) using print's extra arguments and list-unpacking:
print(*(check + " : " + input(check) for check in ['onset', 'location', 'character']), sep='\n')
What you are trying to achieve can be done with a list comprehension.
In your case you can do that in a single line.
l2 = [input(x) for x in checks]
You don't need to initialize a list of the desired length and then assign an input to each element. You can use the append method for that.
The following code will help you:
checks = ['onset','location','character']
arr = []
for i in checks:
    arr.append(input(i + ' : '))
If you want to reduce the number of lines, you can try the following:
arr = [ input(i + ' : ') for i in ['onset','location','character']]
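If the goal is fewer variables, another option is to keep each answer next to its label in a dictionary. The sketch below is only one way to do it: collect_answers and its ask parameter are names invented for this example, and ask exists so the function can be driven by scripted answers instead of the keyboard.

```python
def collect_answers(checks, ask=input):
    """Ask one question per label and return a {label: answer} dict."""
    return {check: ask(check + " : ") for check in checks}


# Scripted answers stand in for the keyboard so the example runs unattended;
# drop the ask argument to prompt with the real input().
scripted = iter(["3mo", "earth", "red"])
answers = collect_answers(["onset", "location", "character"],
                          ask=lambda prompt: next(scripted))

for check, answer in answers.items():
    print(check + " : " + answer)
# onset : 3mo
# location : earth
# character : red
```

A dict comprehension preserves the order of checks, so the printout follows the order in which the questions were asked.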
For a truly 1-line solution to your for-loops, you could do your list comprehension like this:
l2 = [(n, print(l1[i]+" : "+n))[0] for i, n in enumerate([input(x + ": ") for x in checks])]
Output:
onseta
locationb
characterc
onset : a
location : b
character : c
EDIT
As others mentioned, this is not best practice, so use something like this:
checks = ['onset','location','character']
l2 = [input(f"Check {n}:\n > ") for n in checks]
print(*(f"{j}: {l2[i]}\n" for i, j in enumerate(checks)), sep="")
Output:
Check onset:
> ok
Check location:
> ok
Check character:
> ok
onset: ok
location: ok
character: ok
I am trying to write a loop in Python that reads a text file of 50 numbers, sorts through the numbers, and assigns them to different variables based on their values. I would like variable a to contain values less than 2, variable b to contain values from 2 to 2.1, variable c to contain values from 2.1 to 2.25, variable d to contain values from 2.25 to 2.5, and variable e to contain values from 2.25 to 2.5.
My code so far looks like this:
import os
import numpy
os.chdir('/Users/DevEnv')
dispFile = open('output1.txt')
displacement = dispFile.readlines()
dispFile.close()
displacement = [float(i.strip()) for i in displacement]
for i in range (0,50):
    displacementval = displacement[i]
    if i<2 displacementval()=a():
However, when I try to run this I get an invalid syntax error. I am new to python and programming and would appreciate any help!
You need to initialize the variables a, b, c, d, and e as lists:
import os
import numpy
os.chdir('/Users/DevEnv')
dispFile = open('output1.txt')
displacement = dispFile.readlines()
dispFile.close()
displacement = [float(i.strip()) for i in displacement]
a = []
b = []
c = []
d = []
e = []
for displacementval in displacement:
    if displacementval < 2:
        a.append(displacementval)
    elif displacementval < 2.1:
        b.append(displacementval)
    # ... the rest is similar, so omitted
print(a)
print(b)
I would suggest reading the official Python tutorial, so you get an idea of Python syntax first.
When you get the hang of the above beginner friendly solutions, consider taking a look at bisect.
You can solve your problem by doing a little semantic modification of the linked example.
from bisect import bisect
def grade(score, breakpoints=[2, 2.1, 2.25, 2.5], marks='abcde'):
    i = bisect(breakpoints, score)
    return marks[i]
a,b,c,d,e = [], [], [] ,[], []
lists = [a,b,c,d,e]
marks='abcde'
r = map(lambda x: x / 10.0, range(0, 501, 1))
for item in [(grade(score),score) for score in r ]:
    l = marks.index(item[0])
    lists[l].append(item[1])
grade() asks, for a given number, at what index it would go if it were inserted into breakpoints, and returns marks[index]. For example, given a score of 1, it works out that 1 belongs at index 0 of breakpoints, before 2, so it returns marks[0], which is 'a'.
The list comprehension in the for loop builds a list of tuples containing (letter mark, value that mark is assigned to), and the loop body appends each value to the list for its mark.
As I said, this is more advanced, and you should try to understand it after you have grasped the basic concepts, but it's worth taking a look at to bend your mind a little.
import os
import numpy
os.chdir('/Users/DevEnv')
dispFile = open('output1.txt')
displacement = dispFile.readlines()
dispFile.close()
displacement = [float(i.strip()) for i in displacement]
a = []
b = []
c = []
d = []
e = []
for i in displacement:
    if i < 2:
        a.append(i)
    elif i < 2.1:
        b.append(i)
    elif i < 2.25:
        c.append(i)
    elif i < 2.5:
        d.append(i)
    else:
        e.append(i)
Don't forget to instantiate the lists. The line if i<2 displacementval()=a(): is not valid Python, which is why you get the syntax error.
There's a couple of issues here, but I'll try my best to walk you through them.
First off, you said you wanted 5 variables: a, b, c, d, e, and that each should hold a list of values. To do this we must first declare each as a list, which, in Python is just a collection of values. This looks like this:
a = []
b = []
c = []
d = []
e = []
Next during our for loop, we need to compare the values and sort them accordingly, like this:
for i in range(0, 50):
    displacementval = displacement[i]
    if displacementval < 2:
        a.append(displacementval)
    elif displacementval < 2.1:
        b.append(displacementval)
    elif displacementval < 2.25:
        c.append(displacementval)
    elif displacementval < 2.5:
        d.append(displacementval)
    else:
        e.append(displacementval)
A little explanation: when we use a_list.append(a_value) we are adding that value to the end of the list. Also, our if statement checks whether its condition is true; if it is not, Python moves on to the next elif. So if a value is not < 2, it goes to the next check and tests whether it's less than 2.1; if it is, it must be greater than or equal to 2 and less than 2.1, and so on. Anything that reaches the final else is 2.5 or more.
Hopefully this helps!
Alternatively, you could save all the categories in one dictionary of lists instead of having a separate variable for each category. Here is the code.
from collections import defaultdict
def get_category(i):
    if i < 2:
        return 'a'
    elif i < 2.1:
        return 'b'
    elif i < 2.25:
        return 'c'
    elif i < 2.5:
        return 'd'
    else:
        return 'e'
displ_categories = defaultdict(list)
for i in displacement:
    category = get_category(i)
    displ_categories[category].append(i)
The resulting dictionary is of the form:
{'a': [0.42, 1.01, 0.118, 0.807, 0.225, 1.307, 1.151, 0.824],
'd': [2.255],
'e': [2.716]}
Let's say I have this list:
s = ["data_s01", "data_s99", "data_s133"]
I want to add a "0" after the "s" if there are only two digits, so the result is:
["data_s001", "data_s099", "data_s133"]
I have this now:
for v in s:
    data = v.split('_s')
    if "0" in data:
        out_s = data[0] + "0" + data[1]
        print(out_s)
But nothing is printed?
>>> ["data_s{:0>3}".format(x[6:]) for x in s]
['data_s001', 'data_s099', 'data_s133']
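x=["data_s01", "data_s99", "data_s133"]
print(["".join(["data_s",k.split("_s")[1].zfill(3)]) for k in x])
Try this.
Nothing is printed because data is a list such as ['data', '01'], and "0" in data tests whether one of its elements is exactly "0", which is never true, so the body of the if never runs. The print call should not be inside the if in any case, and there is no need to check for a 0 at all: pad the number to three digits instead.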
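Putting that together, here is a small sketch that splits each name at the last "_s" and pads the digits with str.zfill. The helper name pad_suffix and its width and sep parameters are made up for this example.

```python
def pad_suffix(name, width=3, sep="_s"):
    """Left-pad the digits after the last sep with zeros up to width characters."""
    prefix, found, digits = name.rpartition(sep)
    if not found:
        # No separator in the name: leave it unchanged
        return name
    return prefix + sep + digits.zfill(width)


s = ["data_s01", "data_s99", "data_s133"]
print([pad_suffix(v) for v in s])
# ['data_s001', 'data_s099', 'data_s133']
```

zfill only ever adds zeros, so names that already have three or more digits pass through unchanged.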