Grouping tuple columns so their sum is less than 1


I need to create a list of groups of items, grouped so that the sum of the negative logarithms of the probabilities is roughly 1.
So far I've come up with
probs = np.random.dirichlet(np.ones(50)*100.,size=1).tolist()
logs = [-1 * math.log(1-x,2) for x in probs[0]]
zipped = zip(range(0,50), logs)
for key, igroup in iter.groupby(zipped, lambda x: x[1] < 1):
    print(list(igroup))
I.e. I create a list of random numbers, take their negative logarithms, then zip these probabilities together with the item number.
I then want to create groups by adding together the numbers in the second column of the tuple until the sum is 1 (or slightly above it).
I've tried:
for key, igroup in iter.groupby(zipped, lambda x: x[1]):
    for thing in igroup:
        print(list(iter.takewhile(lambda x: x < 1, iter.accumulate(igroup))))
and various other variations on using itertools.accumulate, but I can't get it to work.
Does anyone have an idea of what could be going wrong? (I think I'm doing too much work.)
Ideally, the output should be something like
groups = [[1,2,3], [4,5], [6,7,8,9]]
etc., i.e. these are the groups that satisfy this property.

Using numpy.ufunc.accumulate and a simple loop:
import numpy as np
def group(xs, start=1):
    last_sum = 0
    for stop, acc in enumerate(np.add.accumulate(xs), start):
        if acc - last_sum >= 1:
            yield list(range(start, stop))
            last_sum = acc
            start = stop
    if start < stop:
        yield list(range(start, stop))
probs = np.random.dirichlet(np.ones(50) * 100, size=1)
logs = -np.log2(1 - probs[0])
print(list(group(logs)))
Sample output:
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]]
ALTERNATIVE
Using numpy.searchsorted:
def group(xs, idx_start=1):
    xs = np.add.accumulate(xs)
    idxs = np.searchsorted(xs, np.arange(xs[-1]) + 1, side='left').tolist()
    return [list(range(i+idx_start, j+idx_start)) for i, j in zip([0] + idxs, idxs)]
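For comparison, the grouping rule from the question (keep adding item numbers until the running sum reaches 1 or slightly more) can also be sketched in plain Python. This is only an illustration, not the code from the answers above: the function name `group_by_threshold`, its parameters, and the fixed probabilities are assumptions chosen so that the result is reproducible.

```python
import math

def group_by_threshold(values, threshold=1.0, first_index=1):
    """Greedily collect consecutive item numbers until their values sum to >= threshold."""
    groups, current, running = [], [], 0.0
    for index, value in enumerate(values, first_index):
        current.append(index)
        running += value
        if running >= threshold:
            # The current item pushed the sum to the threshold: close this group.
            groups.append(current)
            current, running = [], 0.0
    if current:
        # Leftover items whose sum never reached the threshold.
        groups.append(current)
    return groups

# Hypothetical fixed probabilities (not from the question) for a reproducible run.
probs = [0.5, 0.5, 0.25, 0.25, 0.5, 0.75]
logs = [-math.log2(1 - p) for p in probs]
print(group_by_threshold(logs))  # [[1], [2], [3, 4, 5], [6]]
```

In this sketch the item that crosses the threshold is included in its group, and the last group may fall short of the threshold.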

Related

How to generate sequential subsets of integers?

I have the following start and end values:
start = 0
end = 54
I need to generate subsets of 4 sequential integers starting from start until end with a space of 20 between each subset. The result should be this one:
0, 1, 2, 3, 24, 25, 26, 27, 48, 49, 50, 51
In this example, we obtained 3 subsets:
0, 1, 2, 3
24, 25, 26, 27
48, 49, 50, 51
How can I do it using numpy or pandas?
If I do r = [i for i in range(0,54,4)], I get [0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52].

This should get you what you want:
j = 20
k = 4
result = [split for i in range(0,55, j+k) for split in range(i, k+i)]
print (result)
Output:
[0, 1, 2, 3, 24, 25, 26, 27, 48, 49, 50, 51]

Maybe something like this:
r = [j for i in range(0, 54, 24) for j in range(i, i + 4)]
print(r)
[0, 1, 2, 3, 24, 25, 26, 27, 48, 49, 50, 51]

You can use numpy.arange, which returns an ndarray containing evenly spaced values within a given range:
import numpy as np
r = np.arange(0, 54, 4)
print(r)
Result
[ 0  4  8 12 16 20 24 28 32 36 40 44 48 52]

Numpy approach
You can use np.arange to generate the starting numbers with a step of 20 + 4, where 20 is the space between intervals and 4 is the length of each sequential subarray.
start = 0
end = 54
out = np.arange(0, 54, 24)  # array([ 0, 24, 48]): the starting points for each subarray
step = np.tile(np.arange(4), (len(out), 1))
# [[0 1 2 3]
#  [0 1 2 3]
#  [0 1 2 3]]
res = out[:, None] + step
# array([[ 0,  1,  2,  3],
#        [24, 25, 26, 27],
#        [48, 49, 50, 51]])
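The same broadcasting idea can be wrapped in a small helper so that the subset size and the gap are parameters rather than literals. This is a minimal sketch: the name `sequential_subsets` and its `size` and `gap` parameters are made up for illustration, and the helper does not trim a final subset that would run past `end`.

```python
import numpy as np

def sequential_subsets(start, end, size=4, gap=20):
    """Return a 2-D array with one row per subset of `size` consecutive integers."""
    # First value of every subset: start, start + size + gap, ...
    starts = np.arange(start, end, size + gap)
    # Broadcasting a (n, 1) column against a (size,) row gives an (n, size) array.
    return starts[:, None] + np.arange(size)

rows = sequential_subsets(0, 54)
print(rows.ravel().tolist())
# [0, 1, 2, 3, 24, 25, 26, 27, 48, 49, 50, 51]
```

Calling `ravel()` flattens the rows into the single sequence asked for in the question; without it, each subset stays on its own row.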

This can be done with plain Python:
rangeStart = 0
rangeStop = 54
setLen = 4
step = 20
stepTot = step + setLen
a = list( list(i+s for s in range(setLen)) for i in range(rangeStart,rangeStop,stepTot))
In this case you will get the subsets as sublists of the outer list.

I don't think you need numpy or pandas to do what you want. I achieved it with a simple while loop:
num = 0
end = 54
sequence = []
while num <= end:
    sequence.append(num)
    num += 1
    if num % 4 == 0:  # four numbers have been added
        num += 20
print(sequence)
# output: [0, 1, 2, 3, 24, 25, 26, 27, 48, 49, 50, 51]

How can I find the longest contiguous subsequence in a rising sequence in Python?

I need to find the longest contiguous subsequence in a rising sequence in Python.
For example if I have A = [1, 2, 3, 5, 8, 9, 11, 13, 17, 18, 19, 20, 21, 25, 27, 28, 29, 30]
The answer would be [17, 18, 19, 20, 21] because it's the longest contiguous subsequence with 5 numbers (whereas [1, 2, 3] is 3 numbers long and [27, 28, 29, 30] is 4 numbers long.)
My code is stuck in an endless loop
num_list = [1, 2, 3, 5, 8, 9, 11, 13, 17, 18, 19, 20, 21, 23, 25, 26, 27]
longest_sequence = {}
longest_sequence_length = 1
for num in num_list:
    sequence_length = 1
    while True:
        if (num + sequence_length) in num_list:
            sequence_length += 1
        else:
            if sequence_length > longest_sequence_length:
                longest_sequence_length_length = sequence_length
                longest_sequence = {"start": num, "end": num + (sequence_length - 1)}
            break
print(f"The longest sequence is {longest_sequence_length} numbers long"
      f" and it's between {longest_sequence['start']} and {longest_sequence['end']}")

You can use numpy to solve it in one line:
import numpy as np
A = [1, 2, 3, 5, 8, 9, 11, 13, 17, 18, 19, 20, 21, 25, 27, 28, 29, 30]
out = max(np.split(A, np.where(np.diff(A) != 1)[0] + 1), key=len).tolist()
You can also find the same outcome by running 3 iterations.
(i) First you need to find the differences between consecutive elements in A; that's found in diff (with zip(A,A[1:]), you can access consecutive elements).
(ii) Then you split A on indices where the difference is not 1; that's being done in the second iteration. Basically, if a difference is 1, append the value in A to the running sublist, if not, create a new sublist and put the corresponding value to this new sublist.
(iii) Finally, using max() function, you can find the longest sublist using key=len.
This exact same job is done by the numpy code above.
diff = [j-i for i,j in zip(A, A[1:])]
splits = [[A[0]]]
for x,d in zip(A[1:], diff):
    if d == 1:
        splits[-1].append(x)
    else:
        splits.append([x])
out = max(splits, key=len)
Output:
[17, 18, 19, 20, 21]
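If building every sublist is not needed, the same result can be found in a single pass that only tracks where the current run started. This is a sketch rather than part of the answer above; the function name `longest_consecutive_run` is made up for illustration.

```python
def longest_consecutive_run(values):
    """Return the longest slice of `values` whose neighbours differ by exactly 1."""
    if not values:
        return []
    best_start, best_len = 0, 1
    run_start = 0
    for i in range(1, len(values)):
        if values[i] - values[i - 1] != 1:
            # The run is broken here, so a new run starts at position i.
            run_start = i
        run_len = i - run_start + 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return values[best_start:best_start + best_len]

A = [1, 2, 3, 5, 8, 9, 11, 13, 17, 18, 19, 20, 21, 25, 27, 28, 29, 30]
print(longest_consecutive_run(A))  # [17, 18, 19, 20, 21]
```

When two runs tie for the longest, this sketch returns the first one, which matches what max() with key=len does.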

In line 13 you need a break instead of a continue statement.
Also, in line 11 you made a small mistake: you added an extra "_length" to your variable name (longest_sequence_length_length).
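With both of those points applied (the break placed so that the while loop always exits, and the variable name corrected), the loop from the question could look like the sketch below. It keeps the question's variable names and data; nothing else is changed.

```python
num_list = [1, 2, 3, 5, 8, 9, 11, 13, 17, 18, 19, 20, 21, 23, 25, 26, 27]
longest_sequence = {}
longest_sequence_length = 1

for num in num_list:
    sequence_length = 1
    while True:
        if (num + sequence_length) in num_list:
            sequence_length += 1
        else:
            if sequence_length > longest_sequence_length:
                # Fixed name: this now updates the variable that is compared above.
                longest_sequence_length = sequence_length
                longest_sequence = {"start": num, "end": num + (sequence_length - 1)}
            # Leave the while loop whether or not a new maximum was found.
            break

print(f"The longest sequence is {longest_sequence_length} numbers long"
      f" and it's between {longest_sequence['start']} and {longest_sequence['end']}")
```

Each `in num_list` test scans the whole list, so for long inputs it may be worth testing membership against `set(num_list)` instead.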

How to print the sum of the current and previous element in a list

I am trying to iterate through a list of numbers and print the sum of the current element and the previous element using python. For example,
Given numbers = [5,10,15,20,25,30,30], the output should be 5, 15, 25, 35, 45, 55, 60,. This is the code that I have tried; it is very close to the answer, but the first element is wrong.
numbers = [5, 10, 15, 20, 25, 30, 30]
i = 0
for x in range(1, 8):
    print(numbers[i] + numbers[i - 1], end=", ")
    i += 1
I am getting the output 35, 15, 25, 35, 45, 55, 60,. What am I doing wrong?

You can pair adjacent items of numbers by zipping it with itself but padding one with a 0, so that you can iterate through the pairs to output the sums in a list comprehension:
[a + b for a, b in zip([0] + numbers, numbers)]
or by mapping the pairs to the sum function:
list(map(sum, zip([0] + numbers, numbers)))
Both would return:
[5, 15, 25, 35, 45, 55, 60]

You are starting at index 0, where it seems your intended output starts at index 1:
Here is a better solution:
numbers = [5, 10, 15, 20, 25, 30, 30]
for i in range(len(numbers)):
    if i == 0:
        print(numbers[i])
    else:
        print(numbers[i - 1] + numbers[i])
Outputs:
5
15
25
35
45
55
60

This should work:
numbers = [5, 10, 15, 20, 25, 30, 30]
output = [numbers[i]+numbers[i-1] if i > 0 else numbers[i] for i in range(len(numbers))]
print(output)

You are starting at i = 0, so the first sum adds numbers[0] and numbers[-1], and index -1 is the last element of the list. That's why you are getting 35 (5 + 30).
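The wrap-around is easy to check directly, and one way to avoid it is to carry the previous value in a variable instead of indexing backwards. The following is a small sketch along those lines; the names `previous` and `sums` are made up for illustration.

```python
numbers = [5, 10, 15, 20, 25, 30, 30]

# Index -1 wraps around to the last element, so numbers[0 - 1] is 30.
assert numbers[0 - 1] == numbers[-1] == 30

previous = 0  # treat the element "before" the first one as 0
sums = []
for current in numbers:
    sums.append(previous + current)
    previous = current

print(*sums, sep=", ")  # 5, 15, 25, 35, 45, 55, 60
```

Because no index is used, the first element needs no special case.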

This list comprehension works:
numbers = [5, 10, 15, 20, 25, 30, 30]
output = [value + numbers[i-1] if i else value for i, value in enumerate(numbers)]
print(output)
>>> [5, 15, 25, 35, 45, 55, 60]

Cheat by adding a [0] at the start to prevent the first sum from being wrong.
You'll run into problems at the end, though, because the list passed to enumerate is then one item longer than the original, so also clip off its last number:
print ([a+numbers[index] for index,a in enumerate([0]+numbers[:-1])])
Result:
[5, 15, 25, 35, 45, 55, 60]
If you want to see how it works, print the original numbers before addition:
>>> print ([(a,numbers[index]) for index,a in enumerate([0]+numbers[:-1])])
[(0, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30), (30, 30)]
The enumerate loops over the changed list [0, 5, 10, ..., 30], where everything is shifted up one place, while numbers[index] still returns the item at that index in the original list. Adding the two yields the correct result.

Print list in specified range Python

I'm new to Python and I have this problem
I have a list of numbers like this:
n = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
I want to print from 11 to 37, that means the output = 11, 13,.... 37.
I tried print(n[11:37]), but of course it prints [37, 41, 43, 47], because the slice bounds are index positions, not values.
Any ideas or does Python have any built-in method for this ?

This should do the job...
n = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
n.sort()
mylist = [x for x in n if x in range(11, 38)]
print(mylist)
To print that as a comma-separated string (a list has no strip method, so convert it to a string first):
print(str(mylist).strip('[]'))

This will work (assuming the list is sorted and contains both values):
print(n[n.index(11): n.index(37)+1])
Output:
[11, 13, 17, 19, 23, 29, 31, 37]

Considering your list is ordered and it has no duplicates:
n = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print(",".join(map(str,n[n.index(11): n.index(37)+1])))

Using numpy:
import numpy as np
narr = np.array(n)
m = (narr >= 11) & (narr <= 37)
for v in narr[m]:
    print(v)
# or, to get rid of the loop:
print('\n'.join(map(str, narr[m])))

It's pretty simple. Since your list is already sorted, you can write:
my_list = [x for x in n if x in range(11, 38)]
print(*my_list)
The '*' unpacks the list into individual arguments, a technique known as unpacking. This prints the values themselves rather than the list.

If your data is sorted, you can use a generator expression with either a range object or chained comparisons:
n = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print(*(i for i in n if i in range(11, 38)), sep=', ')
print(*(i for i in n if 11 <= i <= 37), sep=', ')
If your data is unsorted and you can use indices of the first occurrences of each value, you can slice your list:
print(*n[n.index(11): n.index(37)+1], sep=', ')
Result with the data you have provided:
11, 13, 17, 19, 23, 29, 31, 37
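For sorted data there is also the standard-library bisect module, which finds the slice bounds by binary search and does not require 11 or 37 to be present in the list. This is a sketch, not something proposed in the answers above; the helper name `values_between` is made up for illustration.

```python
from bisect import bisect_left, bisect_right

def values_between(sorted_values, low, high):
    """Return the items of an ascending list that satisfy low <= item <= high."""
    lo = bisect_left(sorted_values, low)    # first position with item >= low
    hi = bisect_right(sorted_values, high)  # first position with item > high
    return sorted_values[lo:hi]

n = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print(*values_between(n, 11, 37), sep=", ")
# 11, 13, 17, 19, 23, 29, 31, 37
```

Unlike n.index(...), this does not raise an error when a bound is missing: values_between(n, 10, 38) returns the same eight numbers.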

Count total number of occurrences of given list of integers in another

How do I count the number of times the same integer occurs?
My code so far:
def searchAlgorithm (target, array):
    i = 0 #iterating through elements of target list
    q = 0 #iterating through lists sublists via indexes
    while q < 4:
        x = 0 #counting number of matches
        for i in target:
            if i in array[q]:
                x += 1
            else:
                x == 0
        print(x)
        q += 1
a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47], [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]
searchAlgorithm(a, b)
The output of this is:
2
2
1
3
What I want to achieve is counting how many times a result of 1, 2 or 3 matches occurs.
I have tried:
v = 0
if searchAlgorithm(a, b) == 2:
    v += 1
print(v)
But that results in 0.

You can use intersection of sets to find elements that are common in both lists. Then you can get the length of the sets. Here is how it looks:
num_common_elements = (len(set(a).intersection(i)) for i in b)
You can then iterate over the generator num_common_elements to use the values. Or you can cast it to a list to see the results:
print(list(num_common_elements))
[Out]: [2, 2, 1, 3]
If you want to implement the intersection count yourself, you can use the built-in sum function. As long as x contains no duplicates, this is equivalent to len(set(x).intersection(set(y))):
sum(i in y for i in x)
This works because it generates values such as [True, False, False, True, True] representing where the values in the first list are present in the second list. The sum method then treats the Trues as 1s and Falses as 0s, thus giving you the size of the intersection set
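The two approaches can be checked side by side on the data from the question. The sketch below also shows the one case where they differ, namely when the target list contains duplicates; the names `by_set`, `by_sum` and `dup` are made up for illustration.

```python
a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47],
     [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]

# Size of the set intersection for every sublist.
by_set = [len(set(a).intersection(row)) for row in b]
# Each membership test yields True/False, and sum() counts the True values.
by_sum = [sum(value in row for value in a) for row in b]

print(by_set)  # [2, 2, 1, 3]
print(by_sum)  # [2, 2, 1, 3]

# With duplicates in the target the two counts diverge:
dup = [12, 12]
print(len(set(dup).intersection(b[0])))   # 1
print(sum(value in b[0] for value in dup))  # 2
```

Which behaviour is wanted depends on whether a repeated target value should count once or every time it appears.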

This is based on what I understand from your question. Probably you are looking for this:
from collections import Counter
def searchAlgorithm (target, array):
    i = 0 #iterating through elements of target list
    q = 0 #iterating through lists sublists via indexes
    lst = []
    while q < 4:
        x = 0 #counting number of matches
        for i in target:
            if i in array[q]:
                x += 1
            else:
                x == 0
        lst.append(x)
        q += 1
    print(Counter(lst))
a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47], [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]
searchAlgorithm(a, b)
# Counter({2: 2, 1: 1, 3: 1})

Thanks to some helpful feedback, I have since come up with a simpler solution that does exactly what I want.
By storing the results of the matches in a list, I can return the list from the searchAlgorithm function and simply use .count() to count all the matches of a specific number within the list.
def searchAlgorithm (target, array):
    i = 0
    q = 0
    results = []
    while q < 4:
        x = 0 #counting number of matches
        for i in target:
            if i in array[q]:
                x += 1
            else:
                x == 0
        results.append(x)
        q += 1
    return results
a = [8, 12, 14, 26, 27, 28]
b = [[4, 12, 17, 26, 30, 45], [8, 12, 19, 24, 33, 47], [3, 10, 14, 31, 39, 41], [4, 12, 14, 26, 30, 45]]
d2 = searchAlgorithm(a, b).count(2)  # number of sublists with exactly 2 matches
