Python data structure index starting at 1 instead of 0?

I have a weird question: I have this list of 64 numbers that will never change:
(2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128)
I need a data structure in Python that will allow me to access these numbers using a 1-64 index as opposed to the standard 0-63. Is this possible? Would the best way to accomplish this be to build a dictionary?

Just insert a 0 at the beginning of the structure:
(0, 2, 4, 6, 8, ...)
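For instance, a sketch of that approach with the question's numbers:

```python
# Pad the constant tuple with a dummy 0 so that index 1 maps to the first real value
nums = (0,) + tuple(range(2, 129, 2))

print(nums[1])   # 2   (first value)
print(nums[64])  # 128 (last value)
```

The one-off dummy element costs nothing and keeps ordinary tuple indexing, at the price of `nums[0]` being a meaningless value you must remember never to use.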

You could override the item getter and make a specialized tuple:

class BaseOneTuple(tuple):
    __slots__ = ()  # Space optimization, see: http://stackoverflow.com/questions/472000/python-slots
    def __new__(cls, *items):
        return tuple.__new__(cls, items)  # Creates a new instance of tuple
    def __getitem__(self, n):
        return tuple.__getitem__(self, n - 1)

b = BaseOneTuple(*range(2, 129, 2))
b[2] == 4

You could use a dictionary, or you could simply subtract one from your index before accessing it.
Also, I note that your 64 numbers form a simple arithmetic progression. Why store them at all? You can use this:

def my_number(i):
    return 2 * i

If the list you showed was actually just an example, and the real numbers are more complicated, then use a list with a dummy first element:

my_nums = [0, 2, 4, 6, 8, ....]

Then you can get 2 as my_nums[1].
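If you'd still rather have the dictionary the question asks about, a minimal sketch could build one with enumerate, starting the count at 1:

```python
# Build a 1-indexed lookup table from the constant sequence
values = tuple(range(2, 129, 2))
by_index = {i: v for i, v in enumerate(values, start=1)}

print(by_index[1])   # 2
print(by_index[64])  # 128
```

A dict works, but for a dense 1..64 range it buys you nothing over a padded tuple except extra memory; it mainly makes sense if the valid indexes were ever sparse.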

You could use range(2, 129, 2) to generate the numbers 2 through 128 in increments of 2, and convert the result into a tuple if it's not going to change:

t = tuple(range(2, 129, 2))

def numbers(n):
    return t[n - 1]

Given the global tuple t, the function numbers retrieves elements using a 1-based (instead of 0-based) index.

Related

How to check if two lists don't have similarities?

I need to find a method to check that two JSON files have no values in common.
Here is an example of an array of two json files:
[32, 19, 1, 2, 71, 171, 95, 92, 38, 3]
[196, 167, 67, 112, 114, 25, 105, 7, 26, 32]
As you can see, both of these arrays contain "32".
How can I check that there are no common values between the two arrays?
Convert your JSON to lists using json.load(file).
Then add one list to a set and check all elements of the other list against it:
>>> l1 = [32, 19, 1, 2, 71, 171, 95, 92, 38, 3]
>>> l2 = [196, 167, 67, 112, 114, 25, 105, 7, 26, 32]
>>> s1 = set(l1)
>>> any(x in s1 for x in l2)
True
You can do the same without a set (change x in s1 to x in l1), but it will be slower, since membership tests on a list are O(n) rather than O(1).
You can use a set's isdisjoint method:

if set(list1).isdisjoint(list2):
    print("there are no commonalities")
else:
    print("there is at least one common element")
The two answers above build a set and then still iterate over the other list; building the set itself takes linear time before any membership checks even start. For a one-off comparison you can skip the set entirely. Here is a simple solution doing the same thing:

l1 = [32, 19, 1, 2, 71, 171, 95, 92, 38, 3]
l2 = [196, 167, 67, 112, 114, 25, 105, 7, 26]

print(any(x in l1 for x in l2))
import json

def compare(list1, list2):
    for i in list1:
        if i in list2:
            return True
    return False

list1 = '{"a":[32, 19, 1, 2, 71, 171, 95, 92, 38, 3]}'
list2 = '{"b":[196, 167, 67, 112, 114, 25, 105, 7, 26, 32]}'
list1 = list(json.loads(list1).values())
list2 = list(json.loads(list2).values())
print(compare(list1[0], list2[0]))

Something wrong with my quick sort Python code?

I've run into an issue with my quicksort code.
class Sort:
    def quickSort(self, unsortedlist):
        if len(unsortedlist) <= 1:
            return unsortedlist
        pivot = unsortedlist[0]
        unsortedlist.remove(unsortedlist[0])
        left, right = [], []
        for num in unsortedlist:
            if num < pivot:
                left.append(num)
            else:
                right.append(num)
        return self.quickSort(left) + [pivot] + self.quickSort(right)

if __name__ == "__main__":
    a = [76, 76, 65, 72, 58, 64, 82, 3, 22, 31]
    print(Sort().quickSort(a))
    print(Sort().quickSort(a))
    print(Sort().quickSort(a))
The result will be:
[3, 22, 31, 58, 64, 65, 72, 76, 76, 82]
[3, 22, 31, 58, 64, 65, 72, 76, 82]
[3, 22, 31, 58, 64, 65, 72, 82]
Why does the sorted list keep getting shorter?
unsortedlist.remove(unsortedlist[0])
So every time you call quickSort on a list you remove the first element from the source list.
It doesn't really matter for the recursive calls because you "control" the list, but for the "top-level" calls after every call to quickSort you've "lost" one of the elements.
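One way to fix it, sketched here, is to stop mutating the input: slice past the pivot instead of removing it, so each top-level call sees the full list:

```python
class Sort:
    def quickSort(self, unsortedlist):
        if len(unsortedlist) <= 1:
            return unsortedlist
        pivot = unsortedlist[0]
        rest = unsortedlist[1:]  # slice copy: the caller's list is left untouched
        left = [num for num in rest if num < pivot]
        right = [num for num in rest if num >= pivot]
        return self.quickSort(left) + [pivot] + self.quickSort(right)

a = [76, 76, 65, 72, 58, 64, 82, 3, 22, 31]
print(Sort().quickSort(a))  # [3, 22, 31, 58, 64, 65, 72, 76, 76, 82]
print(Sort().quickSort(a))  # same result: a is unchanged
```

The slice makes a copy at every level of recursion, which costs extra memory, but it keeps the function free of side effects so repeated calls behave identically.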

Select elements in such a way that most of the elements are in the beginning of a list

I have a list of values and I want to select n values in such a way that most of the elements come from the beginning of the list, with fewer selected the further into the list you go (as shown in the link below).
np.random.seed(0)
a = pd.Series(range(100))
np.random.shuffle(a)
a.values
array([26, 86, 2, 55, 75, 93, 16, 73, 54, 95, 53, 92, 78, 13, 7, 30, 22,
24, 33, 8, 43, 62, 3, 71, 45, 48, 6, 99, 82, 76, 60, 80, 90, 68,
51, 27, 18, 56, 63, 74, 1, 61, 42, 41, 4, 15, 17, 40, 38, 5, 91,
59, 0, 34, 28, 50, 11, 35, 23, 52, 10, 31, 66, 57, 79, 85, 32, 84,
14, 89, 19, 29, 49, 97, 98, 69, 20, 94, 72, 77, 25, 37, 81, 46, 39,
65, 58, 12, 88, 70, 87, 36, 21, 83, 9, 96, 67, 64, 47, 44])
What is a good way to select those numbers?
http://www.bydatabedriven.com/wp-content/uploads/2012/12/Screen-Shot-2012-12-03-at-8.12.36-PM.png
As an example, if n = 10, then the returned values might be (with more numbers picked from the beginning of the list than from the end):
26, 2, 16, 92, 8, 45, 61, 99, 94, 39
You can use np.random.choice and pass appropriately-shaped weights, either directly or using pd.Series.sample since you're using pandas. For example:
In [59]: s = pd.Series(range(100))
In [60]: chosen = s.sample(10**6, replace=True, weights=1.0/(1+np.arange(len(s)))) #typecast the weight to float
In [61]: chosen.hist(bins=50).get_figure().savefig("out.png")
gives me a histogram that is heavily concentrated at the low end of the series.
You can tweak the weights function to your heart's content. Here I used basically 1/i, so that the 4th element is 4 times less likely to be selected than the first. You could take that expression to some power, with **2 making the 4th element 16 times less likely to be selected, or **0.5 making the 4th element half as likely to be selected as the first. Entirely up to you to find a behaviour you're happy with.
Also note that here I'm using replace=True, because I wanted to select a large number of values to make the plot look better. If you don't want the same element to be selected twice, use replace=False.
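If you'd rather call np.random.choice directly instead of going through pandas, a minimal sketch (variable names here are my own):

```python
import numpy as np

values = np.arange(100)                       # the population to sample from
weights = 1.0 / (1 + np.arange(len(values)))  # same 1/i decay as above
weights /= weights.sum()                      # np.random.choice wants probabilities summing to 1

picked = np.random.choice(values, size=10, replace=False, p=weights)
print(picked)
```

The only extra step versus Series.sample is the explicit normalization: np.random.choice requires the p vector to sum to 1, whereas pandas normalizes weights for you.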
Solving by Re-inventing the Wheel
Here is how you do it from the first principles.
random.random() returns a random number between 0 and 1. This means that the expression random.random() < x becomes true less frequently as x becomes closer to 0.
For each element in the array, say array[i], let us define the odds of the element getting picked as
odds_pick(i) = (1 - i / len(array)) * DAMP.
Here, DAMP is a number between 0 and 1, which is used to diminish the odds of a number being picked. Thus,
when i = 0 (the first element), the odds of the element being picked are just = DAMP.
For the last element, it's DAMP / len(array).
For others, the odds are between these extremes, and they diminish as i gets larger.
Finally, to get this appropriate sample, iterate over the array, and check if random.random() < odds_pick(i). If so, pick the element.
Because of how we have defined odds_pick(i), it will become more and more close to 0 as i increases, and thus random.random() < odds_pick(i) will be less and less true towards the end. Ultimately, this means we end up picking elements more frequently from the front than from the end.
Code:
import random

def sample_with_bias(arr, n, damping_factor=.3):
    res = []
    indexes_picked = []
    n_picked = 0
    while n_picked < n:
        for i, x in enumerate(arr):
            if n_picked == n:
                break  # stop as soon as n elements have been picked
            odds_pick = damping_factor * (1 - i * 1. / len(arr))
            if i not in indexes_picked and random.random() < odds_pick:
                n_picked += 1
                indexes_picked.append(i)
                res.append(x)
    return res
Note that multiple passes over the array may be needed, to cover the corner case where n unique elements cannot be sampled in a single pass.
Let's run some experiments:
from collections import Counter
import matplotlib.pyplot as plt

def run_experiment(arr, damping_factor, num_pick=10, num_runs=100):
    all_samples = []
    for i in range(num_runs):
        all_samples.extend(sample_with_bias(arr, num_pick, damping_factor=damping_factor))
    dist = Counter(all_samples)
    dist = sorted(list(dist.items()), key=lambda k: k[0])
    k, v = zip(*dist)
    plt.bar(k, v)
    plt.title("Damping Factor = {0}".format(damping_factor))
    plt.show()
and
for df in [0.10, 0.50, 0.99]:
    np.random.seed(0)
    a = pd.Series(range(100))
    run_experiment(a, damping_factor=df)
Results
For lots of damping, almost uniform with still a bias for the elements in the beginning:
Let's see what happens when we decrease the damping:
With almost no damping, only the elements in the front are picked:

Python algorithm for prime numbers

I'm trying to filter off the prime numbers from 1 to 100, and here is the code. However, it turns out that many numbers are missing from the output.
def isnot_prime(x):
    if x == 1:
        return True
    if x == 2:
        return False
    for i in range(2, int(x**0.5)+1):
        if x % i == 0:
            return True
        else:
            return False
print filter(isnot_prime, range(1,101))
The output is [1, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100].
There must be something wrong with the algorithm. How can I improve it?
Thank you.
Modify your for loop to this:

for i in range(2, int(round(x**0.5 + 1))+1):
    if x % i == 0:
        return True

Remove the else: and remember that int(float) just takes the integral part (it does not round). With the else in place, your loop returned after testing only i = 2, which is why every odd composite slipped through.
Also, keep in mind that there are faster algorithms to do this. For example the Sieve of Eratosthenes is a fast and simple algorithm.
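A minimal sketch of that sieve (written with Python 3's print; the question's code is Python 2):

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: return all primes <= n."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]          # 0 and 1 are not prime
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Mark every multiple of i, starting at i*i, as composite
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i in range(n + 1) if is_prime[i]]

print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Unlike trial division per number, the sieve does all the work in one shared table, which is why it is much faster when you need every prime in a range rather than a single primality test.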
I would do it this way:
print filter(lambda x: len(['' for y in range(2,x) if x%y==0])==0,range(1,101) )

Errors when modifying a dictionary in python

Suppose I want to create a dictionary that maps digits to numbers less than 100 ending in those digits as follows:
d = {}
for i in range(100):
    r = i % 10
    if r in d:
        d[r] = d[r].append(i)
    else:
        d[r] = [i]
print d
First of all, when i is 20, d[r] is apparently a NoneType when I try to append to it, which throws an error. Why would this be? Secondly, I feel like my approach is inefficient, since the result of checking r in d isn't reused for the access that follows. Something like this would be better, I feel:
case(d[r]) of
    SOME(L) => d[r] = L.append(i)
  | NONE    => d[r] = [i]
Is there a way to have that logic in python?
First of all, when i is 20, d[r] is apparently a NoneType when I try to append to it, throwing an error. Why would this be?
This is because the following code is wrong:
d[r] = d[r].append(i)
.append modifies the list as a side effect, and returns None. So after the list is appended to, it gets thrown away and replaced with the None value now being re-assigned into d[r].
Is there a way to have that logic in python?
There are a variety of hacks that can be used, but none of them are appropriate here.
Instead, solve the specific problem: "modify a dictionary value if present, or create a new value otherwise". This can be refined into "create an empty default value if absent, and then modify the value now guaranteed to be present".
You can do that using .setdefault, or more elegantly, you can replace the dictionary with a collections.defaultdict:
from collections import defaultdict

d = defaultdict(list)
for i in range(100):
    r = i % 10
    d[r].append(i)
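For completeness, the .setdefault route mentioned above might look like this:

```python
d = {}
for i in range(100):
    # setdefault returns the existing list for this key,
    # or inserts and returns [] if the key is absent
    d.setdefault(i % 10, []).append(i)

print(d[0])  # [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
```

setdefault folds the "check, insert default, then access" sequence into a single lookup, which also answers the question's efficiency worry about testing r in d separately.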
Or you can solve the even more specific problem: "create a dictionary with a given pattern", i.e. from applying a rule or formula to an input sequence (in this case, the input is range(100)):
from itertools import groupby
def last_digit(i): return i % 10
d = {k: list(v) for k, v in groupby(sorted(range(100), key=last_digit), last_digit)}
Or you can solve the even more specific problem, by taking advantage of the fact that range takes another argument to specify a step size:
d = {i: range(i, 100, 10) for i in range(10)}
With Andrew's suggestion to use d[r].append(i), you get the desired answer:
In [3]: d
Out[3]:
{0: [0, 10, 20, 30, 40, 50, 60, 70, 80, 90],
1: [1, 11, 21, 31, 41, 51, 61, 71, 81, 91],
2: [2, 12, 22, 32, 42, 52, 62, 72, 82, 92],
3: [3, 13, 23, 33, 43, 53, 63, 73, 83, 93],
4: [4, 14, 24, 34, 44, 54, 64, 74, 84, 94],
5: [5, 15, 25, 35, 45, 55, 65, 75, 85, 95],
6: [6, 16, 26, 36, 46, 56, 66, 76, 86, 96],
7: [7, 17, 27, 37, 47, 57, 67, 77, 87, 97],
8: [8, 18, 28, 38, 48, 58, 68, 78, 88, 98],
9: [9, 19, 29, 39, 49, 59, 69, 79, 89, 99]}
You could do this:
In [7]: for onesdigit in range(10):
...: d[onesdigit] = range(onesdigit, 100, 10)
