I am having a problem with my python code, but I am not sure what it is. I am creating a program that creates a table with all possible combinations of four digits provided the digits do not repeat, which I know is successful. Then, I create another table and attempt to add to this secondary table all of the values which use the same numbers in a different order (so I do not have, say, 1234, 4321, 3241, 3214, 1324, 2413, etc. on this table.) However, this does not seem to be working, as the second table has only one value. What have I done wrong? My code is below. Oh, and I know that the one value comes from appending the 1 at the top.
combolisttwo = list()
combolisttwo.append(1)
combolist = {(a, b, c, d) for a in {1, 2, 3, 4, 5, 6, 7, 8, 9, 0} for b in {1, 2, 3, 4, 5, 6, 7, 8, 9, 0} for c in {1, 2, 3, 4, 5, 6, 7, 8, 9, 0} for d in {1, 2, 3, 4, 5, 6, 7, 8, 9, 0} if a != b and a != c and a != d and b != c and b != d and c!=d}
for i in combolist:
    x = 0
    letternums = str(i)
    letters = list(letternums)
    for g in letters:
        n = 0
        hits = 0
        nonhits = 0
        letterstwo = str(combolisttwo[n])
        if g == letterstwo[n]:
            hits = hits + 1
        if g != letterstwo[n]:
            nonhits = nonhits + 1
        if hits == 4:
            break
        if hits + nonhits == 4:
            combolisttwo.append(i)
            break
x = len(combolisttwo)
print(x)
All possible combinations of four digits provided the digits do not repeat:
import itertools as IT
combolist = list(IT.combinations(range(10), 4))
Then, I create another table and attempt to add to this secondary table all of the values which use the same numbers in a different order (so I do not have, say, 1234, 4321, 3241, 3214, 1324, 2413, etc. on this table.):
combolist2 = [item for combo in combolist
              for item in IT.permutations(combo, len(combo))]
Useful references:
combinations -- for enumerating collections of elements without replacement
permutations -- for enumerating collections of elements in all possible orders
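As a quick sanity check (a sketch, not part of the answer above), the set comprehension from the question should contain exactly the ordered arrangements that itertools.permutations produces, while combinations keeps a single tuple per group of digits. The names question_set, ordered and unordered are made up for this illustration.

```python
import itertools as IT

# Same construction as the question: ordered 4-tuples of distinct digits.
digits = range(10)
question_set = {(a, b, c, d)
                for a in digits for b in digits
                for c in digits for d in digits
                if len({a, b, c, d}) == 4}

ordered = set(IT.permutations(range(10), 4))     # order matters
unordered = list(IT.combinations(range(10), 4))  # order ignored

print(question_set == ordered)       # True
print(len(ordered), len(unordered))  # 5040 210
```

Each of the 210 combinations has 4! = 24 orderings, which accounts for the 5040 permutations.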
This code is pretty confused ;-) For example, you have n = 0 in your inner loop, but never set n to anything else. For another, you have x = 0, but never use x. Etc.
Using itertools is really best, but if you're trying to learn how to do these things yourself, that's fine. For a start, change your:
letters = list(letternums)
to
letters = list(letternums)
print(letters)
break
I bet you'll be surprised at what you see! The elements of your combolist are tuples, so when you do letternums = str(i) you get a string with a mix of digits, spaces, parentheses and commas. I don't think you're expecting anything but digits.
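To make that concrete, here is a small sketch of what str() does to one tuple, plus one possible way to get only the digits. The digits_only name is just for illustration.

```python
i = (1, 2, 3, 4)

# str() on a tuple keeps the parentheses, commas and spaces.
letters = list(str(i))
print(letters)
# ['(', '1', ',', ' ', '2', ',', ' ', '3', ',', ' ', '4', ')']

# Converting each element separately yields only the digits.
digits_only = [str(d) for d in i]
print(digits_only)  # ['1', '2', '3', '4']
```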
Your letterstwo is the string "1" (always, because you never change n). But it doesn't much matter, because you set hits and nonhits to 0 every time your for g in letters loop iterates. So hits and nonhits can never be bigger than 1.
Which answers your literal question ;-) combolisttwo.append(i) is never executed because hits + nonhits == 4 is never true. That's why combolisttwo remains at its initial value ([1]).
Put some calls to print() in your code? That will help you see what's going wrong.
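If the goal is to do this by hand rather than with itertools, one possible approach (a sketch under my reading of the question, not a fix of the original loop) is to sort each tuple to get a canonical form and keep only the first tuple seen for each form. The seen set and the key variable are names introduced here.

```python
# All ordered 4-tuples of distinct digits, as in the question.
digits = range(10)
combolist = {(a, b, c, d)
             for a in digits for b in digits
             for c in digits for d in digits
             if len({a, b, c, d}) == 4}

seen = set()        # canonical (sorted) forms already recorded
combolisttwo = []   # one representative per group of reorderings
for combo in combolist:
    key = tuple(sorted(combo))  # (4, 3, 2, 1) and (1, 2, 3, 4) share a key
    if key not in seen:
        seen.add(key)
        combolisttwo.append(combo)

print(len(combolist))     # 5040
print(len(combolisttwo))  # 210
```

Which representative of each group ends up in combolisttwo depends on set iteration order, so only the count is predictable.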
I am comparing two lists together and if they match I want to increment a counter.
Right now the counter is saying 0 each time I print it out even though there should be some matches. Both lists have data within them as well because I can print them out. Below is the code that I am using to find a match in the lists and increment if they do match. What could be going wrong?
numCorrect = sum(1 for a, b in zip(trueLabels, predLabels) if a == b)
Any advice helps, Thanks
Your code works well:
trueLabels = [1, 2, 3, 4, 5]
predLabels = [1, 2, 4, 4, 5]
numCorrect = sum(1 for a, b in zip(trueLabels, predLabels) if a == b)
print(numCorrect)
# 4
You may have shifted indices in your list(s).
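Two ways a count of 0 could happen even though the lists look alike when printed are sketched below. Both are guesses, since the question does not show the data: the shifted and as_text lists are invented for the example.

```python
trueLabels = [1, 2, 3, 4, 5]

# Hypothetical cause 1: predictions shifted by one position.
shifted = [9, 1, 2, 3, 4]
n_shifted = sum(1 for a, b in zip(trueLabels, shifted) if a == b)
print(n_shifted)  # 0

# Hypothetical cause 2: same values, different types (str vs int).
as_text = ["1", "2", "3", "4", "5"]
n_text = sum(1 for a, b in zip(trueLabels, as_text) if a == b)
print(n_text)  # 0

# Converting to a common type before comparing restores the matches.
n_fixed = sum(1 for a, b in zip(trueLabels, as_text) if a == int(b))
print(n_fixed)  # 5
```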
Let's say that two lists of elements are given, A and B. I'm interested in checking if A contains all the elements of B. Specifically, the elements must appear in the same order and they do not need to be consecutive. If this is the case, we say that B is a subsequence of A.
Here are some examples:
A = [4, 2, 8, 2, 7, 0, 1, 5, 3]
B = [2, 2, 1, 3]
is_subsequence(A, B) # True
A = [4, 2, 8, 2, 7, 0, 1, 5, 3]
B = [2, 8, 2]
is_subsequence(A, B) # True
A = [4, 2, 8, 2, 7, 0, 1, 5, 3]
B = [2, 1, 6]
is_subsequence(A, B) # False
A = [4, 2, 8, 2, 7, 0, 1, 5, 3]
B = [2, 7, 2]
is_subsequence(A, B) # False
I found a very elegant way to solve this problem (see this answer):
def is_subsequence(A, B):
    it = iter(A)
    return all(x in it for x in B)
I am now wondering how this solution behaves with possibly very very large inputs. Let's say that my lists contain billions of numbers.
What's the complexity of the code above? What's its worst case? I have tried to test it with very large random inputs, but its speed mostly depends on the automatically generated input.
Most importantly, are there more efficient solutions? Why are these solutions more efficient than this one?
The code you found creates an iterator for A; you can see this as a simple pointer to the next position in A to look at, and in moves the pointer forward across A until a match is found. The iterator can be used multiple times, but it only ever moves forward; when you run several in containment tests against the same iterator, it can't go backwards, so each test can only check whether values not yet visited are equal to the left-hand operand.
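A minimal sketch of that forward-only behaviour, using a shortened list invented for the example:

```python
it = iter([4, 2, 8, 2, 7])
found = 2 in it        # scans 4, then 2, and stops there
remaining = list(it)   # whatever the scan did not reach
print(found, remaining)  # True [8, 2, 7]

it = iter([4, 2, 8, 2, 7])
missing = 9 in it      # no match, so the whole iterator is consumed
leftover = list(it)
print(missing, leftover)  # False []
```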
Given your last example, with B = [2, 7, 2], what happens is this:
it = iter(A) creates an iterator object for the A list, and stores 0 as the next position to look at.
The all() function tests each element in an iterable and returns False early, if such a result was found. Otherwise it keeps testing every element. Here the tests are repeated x in it calls, where x is set to each value in B in turn.
x is first set to 2, and so 2 in it is tested.
it is set to next look at A[0]. That's 4, not equal to 2, so the internal position counter is incremented to 1.
A[1] is 2, and that's equal, so 2 in it returns True at this point, but not before incrementing the 'next position to look at' counter to 2.
2 in it was true, so all() continues on.
The next value in B is 7, so 7 in it is tested.
it is set to next look at A[2]. That's 8, not 7. The position counter is incremented to 3.
it is set to next look at A[3]. That's 2, not 7. The position counter is incremented to 4.
it is set to next look at A[4]. That's 7, equal to 7. The position counter is incremented to 5 and True is returned.
7 in it was true, so all() continues on.
The next value in B is 2, so 2 in it is tested.
it is set to next look at A[5]. That's 0, not 2. The position counter is incremented to 6.
it is set to next look at A[6]. That's 1, not 2. The position counter is incremented to 7.
it is set to next look at A[7]. That's 5, not 2. The position counter is incremented to 8.
it is set to next look at A[8]. That's 3, not 2. The position counter is incremented to 9.
There is no A[9] because there are not that many elements in A, and so False is returned.
2 in it was False, so all() ends by returning False.
You could verify this with an iterator with a side effect you can observe; here I used print() to write out what the next value is for a given input:
>>> A = [4, 2, 8, 2, 7, 0, 1, 5, 3]
>>> B = [2, 7, 2]
>>> with_sideeffect = lambda name, iterable: (
...     print(f"{name}[{idx}] = {value}") or value
...     for idx, value in enumerate(iterable)
... )
>>> is_subsequence(with_sideeffect(" > A", A), with_sideeffect("< B", B))
< B[0] = 2
 > A[0] = 4
 > A[1] = 2
< B[1] = 7
 > A[2] = 8
 > A[3] = 2
 > A[4] = 7
< B[2] = 2
 > A[5] = 0
 > A[6] = 1
 > A[7] = 5
 > A[8] = 3
False
Your problem requires that you test every element of B in order; there are no shortcuts here. You must also scan through A to test for the elements of B being present, in the right order. You can only declare victory when all elements of B have been found (partial scan), and defeat when all elements in A have been scanned and the current value in B you are testing for has not been found.
So, assuming B is never longer than A, the best-case scenario is where all K elements in B are equal to the first K elements of A. The worst case is any case where not all of the elements of B are present in A, which requires a full scan through A. It doesn't matter how many elements are present in B; if you are testing element K out of K, you have already scanned part-way through A and must complete your scan through A to find that the last element is missing.
So the best case with N elements in A and K elements in B, takes O(K) time. The worst case, using the same definitions of N and K, takes O(N) time.
There is no faster algorithm to test for this condition, so all you can hope for is lowering your constant times (the time taken to complete each of the N steps). Here that'd be a faster way to scan through A as you search for the elements of B. I am not aware of a better way to do this than by using the method you already found.
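To make the single pass over A visible, the same check can be written with an explicit index into B. This is only a sketch that should behave like the iterator version, not a faster algorithm; the name is_subsequence_explicit is made up here.

```python
def is_subsequence_explicit(A, B):
    """Return True if B's elements appear in A in the same order."""
    j = 0                      # index of the next element of B to find
    for value in A:            # each element of A is visited at most once
        if j == len(B):        # everything in B was found: stop early
            break
        if value == B[j]:
            j += 1
    return j == len(B)


A = [4, 2, 8, 2, 7, 0, 1, 5, 3]
print(is_subsequence_explicit(A, [2, 2, 1, 3]))  # True
print(is_subsequence_explicit(A, [2, 8, 2]))     # True
print(is_subsequence_explicit(A, [2, 1, 6]))     # False
print(is_subsequence_explicit(A, [2, 7, 2]))     # False
```

The loop body runs at most N times regardless of K, which is where the O(N) worst case comes from.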
As mentioned, a run is a sequence of consecutive repeated values. Implement a Python function called longest_run that takes a list of numbers and returns the length of the longest run. For example in the sequence:
2, 7, 4, 4, 2, 5, 2, 5, 10, 12, 5, 5, 5, 5, 6, 20, 1 the longest run has length 4. Then, in the main, your program should ask the user to input the list, then it should call longest_run function, and print the result.
This is what I tried but it only returns 1 and I don't understand why. I can't import any modules for this question.
def longest_run(aList):
    '''(list)->int
    Returns length of the longest run
    Precondition: aList is a list of a len of at least 2 and elements of list are ints
    '''
    count=0
    bucket=[]
    for i in aList:
        if bucket==i:
            count=count+1
        else:
            bucket=i
            count=1
    return count
The biggest mistake of your code is to set bucket=[] (which is a list) and later to an integer.
Also, you need to store the longest sequence and the current sequence length (initialized to 1) and the last seen value, so more variables than you're storing.
Each time the value is the same as the previous one, increase the counter. If it's different, first check whether the counter is greater than the max (and update the max if so), then reset the counter. At the end, perform the max test again in case the longest sequence is at the end of the list (classic mistake)
like this:
seq = [2, 7, 4, 4, 2, 5, 2, 5, 10, 12, 5, 5, 5, 5, 6, 20, 1]
result=1
max_result=0
last_seen=seq[0]
for v in seq[1:]:
    if v==last_seen:
        result += 1
    else:
        if result > max_result:
            max_result = result
        last_seen = v
        result = 1
# just in case the longest sequence would be at the end of your list...
if result > max_result:
    max_result = result
print(max_result)
When you're finally allowed to use python batteries, use itertools.groupby and compute the max of the sequence lengths:
import itertools
max(sum(1 for x in v) for _, v in itertools.groupby(seq))
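For anyone unfamiliar with groupby, this sketch shows the (value, run length) pairs it yields for the example list before the max is taken. The runs variable is introduced only for this illustration.

```python
from itertools import groupby

seq = [2, 7, 4, 4, 2, 5, 2, 5, 10, 12, 5, 5, 5, 5, 6, 20, 1]

# groupby starts a new group every time the value changes,
# so each group is exactly one run.
runs = [(value, len(list(group))) for value, group in groupby(seq)]
print(runs[:4])  # [(2, 1), (7, 1), (4, 2), (2, 1)]

longest = max(length for _, length in runs)
print(longest)   # 4
```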
You get 1 because in its final iteration the loop has bucket = 20 and i = 1, which means bucket != i, so the loop enters the else clause and assigns count = 1. The loop then exits and the function returns count, which is 1.
Suggestions:
1) When you encounter a bug like this try running through the code/logic manually - it helps.
2) For this question specifically - whenever a run ends you forget the last run length, think about how you can "remember" the longest run.
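One way to "remember" the longest run while keeping the shape of the original function is sketched below. It uses no imports; the longest variable is an addition, and returning 0 for an empty list is my own choice, since the precondition in the question rules that case out.

```python
def longest_run(aList):
    """Return the length of the longest run of equal consecutive values."""
    longest = 0    # best run length seen so far
    count = 0      # length of the run currently being scanned
    bucket = None  # previous value; only consulted once count > 0
    for i in aList:
        if count > 0 and i == bucket:
            count += 1
        else:
            bucket = i
            count = 1
        # Checking on every step also covers a run at the very end.
        if count > longest:
            longest = count
    return longest


print(longest_run([2, 7, 4, 4, 2, 5, 2, 5, 10, 12, 5, 5, 5, 5, 6, 20, 1]))  # 4
```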
So you're trying to find the longest run of the same number in your list? Because it's kinda confusing, what you were trying to code.
You should keep two counts: maxCount (the one you're going to return) and actualCount (the one you're going to increment). Iterate through the list and compare each number with the next one. If they are the same, actualCount += 1; if not, reset actualCount = 1. At the end of every iteration, compare maxCount and actualCount, and if actualCount is bigger, set maxCount = actualCount.
def longest_run(aList):
    maxCount = 1
    actualCount = 1
    for i in range(len(aList)-1):
        if aList[i] == aList[i+1]:
            actualCount += 1
        else:
            actualCount = 1
        if actualCount > maxCount:
            maxCount = actualCount
    return maxCount
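You can use the following method to count how many times each number occurs in the whole list (note that this is the total number of occurrences, not the length of a consecutive run):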
mylist = [2, 7, 4, 4, 2, 5, 2, 5, 10, 12, 5, 5, 5, 5, 6, 20, 1]
my_dict = {i:mylist.count(i) for i in mylist}
mylist = list(dict.fromkeys(mylist))
R_list=[]
for i in mylist:
    print("%s repeated %s" %(i,my_dict[i]))
    R_list = R_list+[my_dict[i]]
print(R_list)
print(max(R_list))
No this isn't just as simple as using count()
I've got a list of 5 random integers from 0-9. I want to check if 3 of these integers are identical, and if the remaining 2 are identical but different from the other 3.
My idea was to use set() to count the occurrences that exist in the list, but this also includes cases where there are 4 identical integers and 1 lone integer, like so:
nums = [4, 4, 4, 2, 2]
if len(set(nums)) == 2:
    print(set(nums))
>> {4, 2}
nums = [4, 4, 4, 4, 2]
if len(set(nums)) == 2:
    print(set(nums))
>> {4, 2}
I've been trying to find a way to exclude the 4x1 cases, but everything I've come up with seems convoluted and like bad practice, and this needs to be clean. I'm wondering if maybe there is a better way to do this without using set()?
Thanks for any help.
You can use a dictionary having the number as key and value as its frequency
from collections import Counter
nums = [4, 4, 4, 2, 2]
freq_dict = dict(Counter(nums))
print(freq_dict)
would give you this:
{4: 3, 2: 2}
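and then we can check that the length of the dictionary is 2; any other length means the list does not contain exactly two distinct values. With exactly two distinct values among 5 items, the counts are either 3 and 2 or 4 and 1, so we also check the counts.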
if len(freq_dict) == 2:
    for key, value in freq_dict.items():
        if value == 2 or value == 3:
            print("Given that there are 5 items and the dictionary has 2 keys, "
                  "the other integer must appear {} times".format(5 - value))
            print(key)
            break
    else:
        # no break: the counts are 4 and 1
        print("Nope")
else:
    print("Nope")
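A more compact variant of the same idea, offered as a sketch: compare the sorted frequency counts against [2, 3] directly. The function name is_three_plus_two is made up for this example.

```python
from collections import Counter


def is_three_plus_two(nums):
    """True if nums has one value exactly 3 times and another exactly 2 times."""
    return sorted(Counter(nums).values()) == [2, 3]


print(is_three_plus_two([4, 4, 4, 2, 2]))  # True
print(is_three_plus_two([4, 4, 4, 4, 2]))  # False
print(is_three_plus_two([1, 2, 3, 4, 5]))  # False
```

Because the counts must be exactly 2 and 3, the 4x1 case and the all-identical case are rejected without a separate length check.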
I'm trying to make a histogram in Python. I am starting with the following snippet:
def histogram(L):
    d = {}
    for x in L:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d
I understand it's using dictionary function to solve the problem.
But I'm just confused about the 4th line: if x in d:
d is to be constructed, there's nothing in d yet, so how come if x in d?
Keep in mind, that if is inside a for loop.
So, when you're looking at the very first item in L there is nothing in d, but when you get to the next item in L, there is something in d, so you need to check whether to make a new bin on the histogram (d[x] = 1), or add the item to an existing bin (d[x] += 1).
In Python, we actually have some shortcuts for this:
from collections import defaultdict
def histogram(L):
    d = defaultdict(int)
    for x in L:
        d[x] += 1
    return d
This automatically starts each bin in d at zero (what int() returns) so you don't have to check if the bin exists. On Python 2.7 or higher:
from collections import Counter
d = Counter(L)
Will automatically make a mapping of the frequencies of each item in L. No other code required.
The code inside of the for loop will be executed once for each element in L, with x being the value of the current element.
Let's look at the simple case where L is the list [3, 3]. The first time through the loop d will be empty, x will be 3, and 3 in d will be false, so d[3] will be set to 1. The next time through the loop x will be 3 again, and 3 in d will be true, so d[3] will be incremented by 1.
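The walk-through can be reproduced with a small tracing variant of the function. This is a sketch: histogram_trace and its steps list are invented here to record what x in d evaluates to on each pass.

```python
def histogram_trace(L):
    """Like histogram(), but records (x, was x already in d, copy of d) per step."""
    d = {}
    steps = []
    for x in L:
        seen_before = x in d   # False the first time a value appears
        if seen_before:
            d[x] += 1
        else:
            d[x] = 1
        steps.append((x, seen_before, dict(d)))
    return steps


for step in histogram_trace([3, 3]):
    print(step)
# (3, False, {3: 1})
# (3, True, {3: 2})
```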
You can create a histogram with a dict comprehension:
histogram = {key: L.count(key) for key in set(L)}
I think the others have explained why if x in d is there. But here is a hint at how this code could be written following "ask forgiveness, not permission":
...
try:
    d[x] += 1
except KeyError:
    d[x] = 1
The reason for this is that you expect this error to appear only once per distinct value of x (the first time that value is seen). Thus, there is no need to check if x in d.
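Another option that avoids both the membership test and the exception is dict.get with a default of 0. This is a sketch of an alternative, not the approach described above.

```python
def histogram(L):
    d = {}
    for x in L:
        # d.get(x, 0) returns 0 when x has not been seen yet.
        d[x] = d.get(x, 0) + 1
    return d


print(histogram([3, 1, 3, 3]))  # {3: 3, 1: 1}
```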
You can use a Counter, available from Python 2.7 and Python 3.1+.
>>> # init empty counter
>>> from collections import Counter
>>> c = Counter()
>>> # add a single sample to the histogram
>>> c.update([4])
>>> # add several samples at once
>>> c.update([4, 2, 2, 5])
>>> # print content
>>> print(c)
Counter({4: 2, 2: 2, 5: 1})
The module brings several nice features, like addition, subtraction, intersection and union on counters. The Counter can count anything which can be used as a dictionary key.
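A short sketch of those operations on two small counters (the strings are arbitrary examples):

```python
from collections import Counter

a = Counter("aab")   # a: 2, b: 1
b = Counter("abc")   # a: 1, b: 1, c: 1

print(a + b)  # Counter({'a': 3, 'b': 2, 'c': 1})  addition
print(a - b)  # Counter({'a': 1})                  subtraction drops counts <= 0
print(a & b)  # Counter({'a': 1, 'b': 1})          intersection: minimum of counts
print(a | b)  # Counter({'a': 2, 'b': 1, 'c': 1})  union: maximum of counts
```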
If x isn't in d, then it gets put into d with d[x] = 1. Basically, if x shows up in L more than once, each later occurrence increases the number matched with x.
Try using this to step through the code: http://people.csail.mit.edu/pgbovine/python/
You can create your own histogram in Python using for example matplotlib. If you want to see one example about how this could be implemented, you can refer to this answer.
In this specific case, assuming the hist3d_bubble function from that answer is available, you can do:
temperature = [4, 3, 1, 4, 6, 7, 8, 3, 1]
radius = [0, 2, 3, 4, 0, 1, 2, 10, 7]
density = [1, 10, 2, 24, 7, 10, 21, 102, 203]
points, sub = hist3d_bubble(temperature, density, radius, bins=4)
sub.axes.set_xlabel('temperature')
sub.axes.set_ylabel('density')
sub.axes.set_zlabel('radius')