How to shuffle my array with no repetitions? - python

I'm trying to create an array of 256 stimuli that represents the frequency value to input into my sound stimuli. So far I have created an array of 4 numbers representing the 4 different frequency levels for my audio tones:
import random
import numpy as np

# Pitch list - create an array from 1 to 4 repeated for 256 stimuli
pitch_list = [1, 2, 3, 4]
new_pitch_list = np.repeat(pitch_list, 64)
random.shuffle(new_pitch_list)
print(new_pitch_list)
# Replace 1-4 integers in new_pitch_list with frequency values
for x in range(0, len(new_pitch_list)):
    if new_pitch_list[x] == 1:
        new_pitch_list[x] = 500
    elif new_pitch_list[x] == 2:
        new_pitch_list[x] = 625
    elif new_pitch_list[x] == 3:
        new_pitch_list[x] = 750
    else:  # new_pitch_list[x] == 4
        new_pitch_list[x] = 875
My code works for randomly producing an array of 256 numbers of which there are 4 possibilities (500, 625, 750, 875). However, my problem is that I need to create the new_pitch_list so there are no repetitions of 2 numbers. I need to do this so the frequency of the audio tones isn't the same for consecutive audio tones.
I understand that I may need to change the way I use the random.shuffle function, however, I'm not sure if I also need to change my for loop as well to make this work.
So far I have tried to replace the random.shuffle function with the random.choice function, but I'm not sure if I'm going in the wrong direction. Because I'm still fairly new to Python coding, I'm not sure if I can solve this problem without having to change my for loop, so any help would be greatly appreciated!

I would make it so that you populate your array with 3 of your 4 values, and then each time you see consecutive duplicate values you replace the second one with the 4th value. Something like this (untested, but you get the gist).
Also - I'd cut out some of the lines you don't need:
new_pitch_list = np.repeat([500, 625, 750], 64)
random.shuffle(new_pitch_list)
print(new_pitch_list)
# Replace the second value of each consecutive duplicate pair with 875
for x in range(1, len(new_pitch_list)):
    if new_pitch_list[x-1] == new_pitch_list[x]:
        new_pitch_list[x] = 875
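A quick way to check any candidate sequence is to compare each element with its neighbour and to count how often each value occurs. The sketch below is not part of the answer, and `has_adjacent_repeats` is a hypothetical helper name. Applied to the approach above, it also shows that the result has 192 entries (3 × 64) rather than 256, and that the four frequencies no longer occur equally often.

```python
import random
from collections import Counter

import numpy as np


def has_adjacent_repeats(seq):
    """Return True if any two consecutive elements of seq are equal."""
    arr = np.asarray(seq)
    return bool(np.any(arr[1:] == arr[:-1]))


# Rebuild the sequence as in the answer above (625 assumed for the second tone).
new_pitch_list = np.repeat([500, 625, 750], 64)
random.shuffle(new_pitch_list)
for x in range(1, len(new_pitch_list)):
    if new_pitch_list[x - 1] == new_pitch_list[x]:
        new_pitch_list[x] = 875

print(has_adjacent_repeats(new_pitch_list))  # False
print(len(new_pitch_list))                   # 192
print(Counter(new_pitch_list.tolist()))      # counts per frequency (uneven)
```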

After you assign each value, remove that value from the list of choices, and use random.choice().
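import random

pitches = [500, 625, 750, 875]
last_pitch = random.choice(pitches)
new_pitch_list = [last_pitch]
for _ in range(255):
    pitches.remove(last_pitch)      # exclude the previous pitch from the choices
    pitch = random.choice(pitches)
    new_pitch_list.append(pitch)
    pitches.append(last_pitch)      # make the previous pitch available again
    last_pitch = pitch              # remember the pitch that was just chosen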
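Neither answer keeps exactly 64 of each tone, which the original np.repeat(pitch_list, 64) suggests may have been wanted. A minimal sketch of one way to get both properties, assuming that balance matters: draw the next value at random from the values that still have copies left and differ from the previous one, and start over if the draw runs into a dead end. The function name `balanced_no_repeat` is hypothetical, and the result is not guaranteed to be uniformly distributed over all valid orderings.

```python
import random
from collections import Counter


def balanced_no_repeat(values, per_value, rng=None):
    """Return a list with per_value copies of each value and no equal neighbours.

    Greedy draw with restarts: pick the next value at random (weighted by how
    many copies remain) from those that differ from the previous one; if none
    is available, start over.
    """
    rng = rng or random.Random()
    while True:
        remaining = Counter({v: per_value for v in values})
        sequence = []
        while sum(remaining.values()) > 0:
            last = sequence[-1] if sequence else None
            candidates = [v for v, n in remaining.items() if n > 0 and v != last]
            if not candidates:
                break  # dead end: only the previous value is left, so retry
            weights = [remaining[v] for v in candidates]
            choice = rng.choices(candidates, weights=weights, k=1)[0]
            sequence.append(choice)
            remaining[choice] -= 1
        else:
            return sequence  # every copy was placed without a dead end


new_pitch_list = balanced_no_repeat([500, 625, 750, 875], 64)
print(len(new_pitch_list))  # 256
```

Weighting the draw by the remaining counts makes dead ends less likely than a plain random.choice over the candidates, so restarts should be rare.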

Related

How to compare specific elements in 2D array in Python

I'm trying to compare these specific elements to find the highest number:
Q_2 = [[5,0,41],[6,3,5],[7,4,3],[8,5,40]]
This is my 2D array in Python. I want to compare the Q_2[i][2] values with each other: for example, 41 gets compared to 5, 3 and 40, and the result is the highest number.
I came up with 2 ways:
I store Q_2[i][2] of every item in a new list (which doesn't work, and I don't know why)
Or I use a loop to compare them
from array import *
# These two define the columns and rows for the other 2D arrays (all arrays have the same columns and rows)
n = int(3)
m = int(input("Enter number of processes \n"))  # I always type 4 for this variable
Q_2 = [[5,0,41],[6,3,5],[7,4,3],[8,5,40]]
for i in range(m):
    for j in range(1,3,1):
        if(Q_2[i][2]>=Q_2[j][2]):
            Max_Timers = Q_2[i]
print(Max_Timers)  # to check if the true value is returned or not
The result it returns is 40
This worked when the value at index 0 was lower than the others, but once I changed the first one to 41 it no longer works.
There is no need for two 'for' loops, as you are after just one element from a 2D array rather than a complete 1D array.
Here is working code for a 2D array that gets the 1D array (row) with the highest element at index 2:
max_index = 0
Q_2 = [[5,0,41],[6,3,5],[7,4,3],[8,5,40]]
for i in range(len(Q_2)):
    if(Q_2[i][2]>=Q_2[max_index][2]):
        max_index = i
print(Q_2[max_index])
The reason your code doesn't work can be easily found out by working out a dry run using the values you are using.
According to your logic,
number 41 gets compared to 5 and 3 and 40 and the result is the highest number
and this is achieved by the two for loops with i and j. But what you overlooked was that the max-value that you calculated was just between the current values and so, you are saving the max value for the current iteration only. So, instead of the Global maximum, only the local/current maximum was being stored.
A quick dry-run of your code (I modified a few lines for the dry-run but with no change in logic) will work like this:
41 is compared to 5 (Max_Timers = 41)
41 is compared to 3 (Max_Timers = 41)
5 is compared to 3 (Max_Timers = 5)
40 is compared to 5 (Max_Timers = 40)
40 is compared to 3 (Max_Timers = 40)
>>> print(Max_Timers)
40
So, this is the reason you are getting 40 as a result.
The solution to this is shown in @Rajani B's post, which performs a global comparison by storing the index of the maximum found so far and comparing against that.
Note: I didn't mention this above, but even with the 2nd for loop, which was not required in the first place, there was even less reason to use range(1,3,1) in it. As you can see in the dry-run, this skips some of the checks that you probably intended.
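To see every comparison the question's nested loops actually perform, including the ones the abbreviated dry-run leaves out, the loops can be instrumented with a print. This is only a tracing sketch: `m` is fixed to 4 (the value the asker says they type in) and `is_ge` is a name introduced here.

```python
Q_2 = [[5, 0, 41], [6, 3, 5], [7, 4, 3], [8, 5, 40]]
m = 4  # number of rows, as typed in by the asker

max_timers = None
for i in range(m):
    for j in range(1, 3):
        is_ge = Q_2[i][2] >= Q_2[j][2]
        if is_ge:
            max_timers = Q_2[i]  # overwritten whenever the test passes
        print(f"{Q_2[i][2]} >= {Q_2[j][2]}: {is_ge} -> Max_Timers = {max_timers}")

print(max_timers)  # [8, 5, 40]
```

Because Max_Timers is overwritten whenever the current row passes the test against row 1 or row 2, the last row that passes (here the one containing 40) wins, regardless of the earlier 41.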
You can use numpy to significantly simplify this task and for performance as well. Convert your list into an np.array(), then select the column of interest, which is Q_2[:,2] and apply numpy's .max() method.
import numpy as np
Q_2 = [[5,0,41],[6,3,5],[7,4,3],[8,5,40]]
Q_2 = np.array(Q_2)
mx2 = Q_2[:,2].max()
which gives the desired output:
print(mx2)
41
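If NumPy is not otherwise needed, the built-in max() covers both variants of the question. A short sketch (the names `best_row` and `best_value` are introduced here for illustration):

```python
Q_2 = [[5, 0, 41], [6, 3, 5], [7, 4, 3], [8, 5, 40]]

# Row whose element at index 2 is the largest
best_row = max(Q_2, key=lambda row: row[2])
# Just the largest value in that column
best_value = max(row[2] for row in Q_2)

print(best_row)    # [5, 0, 41]
print(best_value)  # 41
```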

compare values (strings) within one column of a dataframe

I have a dataframe with twelve rows and two columns, one with the length of a sound (400ms and 600ms) and one with the direction of a sound (up and down). Right now, I'm only interested in the second column called 'direction'.
I want to randomize the values in the column and then compare one value with the preceding value (so n and n-1). For every two adjacent identical values I want my counter_c to add 1, for every two adjacent non-identical values I want my counter_f to add 1.
In the end, I want to have a randomized order with counter_c = 5 and counter_f = 6.
I'm fairly new to Python so I tried to keep it really simple. The code itself works but the randomized order it gives me does not meet my conditions. I'm not sure what the problem is, does anyone have an idea?
import pandas as pd
sounds = pd.read_excel('sounds.xlsx')
counter_c = 0
counter_f = 0
while counter_c < 6 and counter_f < 7:
    sounds_rand = sounds.sample(frac = 1)
    print(sounds_rand)
    for x in range(len([sounds_rand[['direction']]])):
        if x == x - 1:
            counter_c += 1
        else:
            counter_f += 1
The Excel file has two columns, 'name' and 'direction', with 12 rows.
A few notes on your problem:
Reshuffling until the counts happen to match is a hard way to find such an order. Also, notice that it will not work for every distribution of "up"s and "down"s that you give it. If you have 10 "up"s and 2 "down"s, for example, it will never converge, and I doubt it will work well for most other distributions.
Your loop does not reflect the comparison you described. Notice that x takes the values generated by range, not the elements of sounds_rand. You can fix it using the code below (I have used a loop because you said that you are learning Python, but there are more idiomatic ways to do it using vectorization):
counter_f, counter_c = 0, 0
# Reshuffle until the order has exactly 5 identical and 6 non-identical adjacent pairs
while not (counter_c == 5 and counter_f == 6):
    # The steps below use the index of the shuffled data, so reset it to 0..n-1
    sounds_rand = sounds.sample(frac = 1).reset_index(drop = True)
    print(sounds_rand)
    counter_f, counter_c = 0, 0
    for index in sounds_rand.index:
        if index == 0:  # There is no previous row to compare with at index 0
            continue
        else:
            if sounds_rand['direction'][index] == sounds_rand['direction'][index - 1]:
                counter_c += 1
            else:
                counter_f += 1
    print(counter_c, counter_f)
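A vectorized version of the same idea, as a rough sketch. Since sounds.xlsx is not shown, the DataFrame below is a stand-in that assumes six 'up' and six 'down' rows; the helper names `count_adjacent` and `shuffle_until`, the seed and the `max_tries` cap are all introduced here for illustration.

```python
import numpy as np
import pandas as pd


def count_adjacent(directions):
    """Return (identical, different) counts over consecutive pairs."""
    values = directions.to_numpy()
    same = values[1:] == values[:-1]
    return int(same.sum()), int((~same).sum())


def shuffle_until(df, target, seed=0, max_tries=10_000):
    """Reshuffle df until count_adjacent(df['direction']) equals target."""
    rng = np.random.RandomState(seed)
    for _ in range(max_tries):
        shuffled = df.sample(frac=1, random_state=rng).reset_index(drop=True)
        if count_adjacent(shuffled["direction"]) == target:
            return shuffled
    raise RuntimeError("no matching order found; the target may be unreachable")


# Stand-in for sounds.xlsx: 12 rows, six 'up' and six 'down' (an assumption).
sounds = pd.DataFrame({
    "name": [f"sound_{i}" for i in range(12)],
    "direction": ["up"] * 6 + ["down"] * 6,
})
sounds_rand = shuffle_until(sounds, target=(5, 6))
print(count_adjacent(sounds_rand["direction"]))  # (5, 6)
```

The `max_tries` cap turns an unreachable target, such as the 10 "up" / 2 "down" case mentioned above, into an error instead of an endless loop.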

Conditional list generation from two different lists in Python

Apologies in advance as I am quite new to this all and I'm not certain on the correct terms for everything.
I have a list of differences between two lists, called Change. I would like to generate a list of percentage changes Percentage by dividing each value in Change by the highest of the respective values in the lists it references (Before and After). If the wrong value is used, a percentage change is showing at -660% in some cases.
Before and After are generated from image files through PIL, but a small section of the output is below.
Before = [135,160,199,]
After = [146,174,176,]
Change = list(After-Before for Before,After in zip(Before,After))
PercentageA = list(Change/After*100 for After,Change in zip(After,Change))
PercentageB = list(Change/Before*100 for Before,Change in zip(Before,Change))
for x in Change:
    if x <0:
        Percentage = PercentageA
    else:
        Percentage = PercentageB
    print(Percentage)
This code generates:
In [1]:
[8.148148148148149, 8.75, -11.557788944723619]
[8.148148148148149, 8.75, -11.557788944723619]
[7.534246575342466, 8.045977011494253, -13.068181818181818]
However, the result should be:
[7.534246575342466, 8.045977011494253, -11.557788944723619]
My main question is; how can I generate a Percentage list from Change divided by the highest of Before and After for each value in the list?
Edit: Reduced and generalised some of the code, and deleted background. Apologies for putting too much background in.
Edit2: Apologies for misunderstanding what was asked of me.
You are calculating something wrong - this works:
# zip() stops at the shortest list, so the lists should have the same length
before = [135, 199]
after = [146, 176]
change = [ 11, -23]
print(f"{'before':<10} {'after':<10} {'percent':<10} {'change':<10}")
for b,a,c in zip(before, after, change):
    print (f'{b:<10} {a:<10} {c/max(b,a)*100:<10.6} {c:<10}')
Output:
before     after      percent    change
135        146        7.53425    11
199        176        -11.5578   -23
Zipping 3 lists creates tuples with one element from each list, and this code iterates over those tuples. See, for example, "The zip() function in Python 3".
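To get the Percentage list itself rather than a printed table, the same max(b, a) idea fits in a single list comprehension. A small sketch using the three sample values from the question:

```python
before = [135, 160, 199]
after = [146, 174, 176]

# Divide each change by the larger of the two values it was computed from.
percentage = [(a - b) / max(a, b) * 100 for b, a in zip(before, after)]
print(percentage)
```

For these inputs the list matches the result the question asks for: roughly 7.53, 8.05 and -11.56.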

enumerate in dictionary loop takes a long time; how to improve the speed

I am using python-3.x and I would like to speed up my code. In every loop iteration I create new values and check (with an if) whether they already exist in the dictionary; if a value exists, I keep the index at which it was found. I am using enumerate, but it takes a long time and is not a very clean way to do it. Is there a way to speed up my code, or is enumerate the only option in my case? I am also not sure whether numpy would be better here.
Here is my code:
# import numpy
import numpy as np
# my first array
my_array_1 = np.random.choice ( np.linspace ( -1000 , 1000 , 2 ** 8 ) , size = ( 100 , 3 ) , replace = True )
my_array_1 = np.array(my_array_1)
# here I want to find the unique values from my_array_1
indx = np.unique(my_array_1, return_index=True, return_counts= True,axis=0)
# then save the result to a dictionary
dic_t= {"my_array_uniq":indx[0], # unique values in my_array_1
        "counts":indx[2]} # how many times each unique element appears in my_array_1
# here I want to create a random array 100 times
for i in range (100):
    print (i)
    # my 2nd array
    my_array_2 = np.random.choice ( np.linspace ( -1000 , 1000 , 2 ** 8 ) , size = ( 100 , 3 ) , replace = True )
    my_array_2 = np.array(my_array_2)
    # I would like to check whether the values in my_array_2 exist in the dictionary ("my_array_uniq":indx[0])
    # if a value exists, I want to hold its index in the dictionary and
    # add 1 to dic_t["counts"], which means this value appeared again, and count how many times.
    # if it does not exist, add this value to the dictionary ("my_array_uniq":indx[0])
    # and also add 1 to dic_t["counts"]
    for i, a in enumerate(my_array_2):
        ix = [k for k,j in enumerate(dic_t["my_array_uniq"]) if (a == j).all()]
        if ix:
            print (50*"*", i, "Yes", "at", ix[0])
            dic_t["counts"][ix[0]] +=1
        else:
            # print (50*"*", i, "No")
            dic_t["counts"] = np.hstack((dic_t["counts"],1))
            dic_t["my_array_uniq"] = np.vstack((dic_t["my_array_uniq"], my_array_2[i]))
Explanation:
1- I create an initial array.
2- Then I find the unique values, index and counts of the initial array using np.unique.
3- I save the result to the dictionary (dic_t).
4- Then I start the loop, creating random values 100 times.
5- I would like to check whether the random values in my_array_2 exist in the dictionary ("my_array_uniq":indx[0]).
6- If one of them exists, I want to hold the index of that value in the dictionary.
7- Add 1 to dic_t["counts"], which means this value appeared again, and count how many times.
8- If it does not exist, add this value to the dictionary as a new unique value ("my_array_uniq":indx[0]).
9- Also add 1 to dic_t["counts"].
So from what I can see, you are:
Creating 256 evenly spaced numbers between -1000 and 1000 (np.linspace)
Generating 100 random triplets from those (it could be fewer than 100 due to unique but with overwhelming probability it will be exactly 100)
Then doing pretty much the same thing 100 times, each time checking for each of the triplets in the new list whether it exists in the old list.
You're then trying to get a count of how often each element occurs.
I'm wondering why you're trying to do this, because it doesn't make much sense to me, but I'll give a few pointers:
There's no reason to make a dictionary dic_t if you're only going to hold two objects in it; just use two variables, my_array_uniq and counts.
You're dealing with triplets of floating point numbers. In the given range, that should give you about 10^48 different possible triplets (I may be wrong on the exact number but it's an absurdly large number either way). The way you're generating them does reduce the total phase space a fair bit (drawing from 256 values leaves 256^3, roughly 1.7 × 10^7, possible triplets), but the probability that any one new triplet matches an earlier one is still very low.
If you have a set of objects (in this case number triplets) and you want to determine whether you have seen a given one before, you want to use sets. Sets can only contain immutable objects, so you want to turn your triplets into tuples. Determining whether a given triplet is already contained in your set is then an O(1) operation.
For counting the number of occurrences of something, collections.Counter is the natural data structure to use.
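The last two pointers can be combined: turn each row into a tuple and let a Counter do both the membership test and the counting. This is a minimal sketch rather than a drop-in replacement, and the helper names `random_triplets` and `as_tuples` and the fixed seed are assumptions made here.

```python
from collections import Counter

import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(-1000, 1000, 2 ** 8)


def random_triplets(n=100):
    """Draw n triplets from the grid, as in the question."""
    return rng.choice(grid, size=(n, 3), replace=True)


def as_tuples(array):
    """Turn each row into a hashable tuple of plain Python floats."""
    return map(tuple, array.tolist())


counts = Counter(as_tuples(random_triplets()))   # initial array
for _ in range(100):
    counts.update(as_tuples(random_triplets()))  # hash lookup per triplet

repeated = {t: n for t, n in counts.items() if n > 1}
print(len(counts), "unique triplets,", len(repeated), "seen more than once")
```

If the position of each unique triplet is also needed, a plain dict that maps each tuple to its index serves the same purpose as the enumerate search, with a hash lookup instead of a scan over all stored rows.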

Weighted random numbers in Python from a list of values

I am trying to create a list of 10,000 random numbers between 1 and 1000. But I want 80-85% of the numbers to come from the same category (I mean that some 100 numbers out of these should appear 80% of the time in the list of random numbers), with the rest appearing around 15-20% of the time. Any idea if this can be done in Python/NumPy/SciPy? Thanks.
This can easily be done using one call to random.randint() to select a list and another call to random.choice() on the chosen list. I'll assume the list frequent contains the 100 elements which are to be chosen 80 percent of the time, and rare contains the 900 elements to be chosen 20 percent of the time.
import random
a = random.randint(1,5)
if a == 1:
    # Case for rare numbers
    choice = random.choice(rare)
else:
    # case for frequent numbers
    choice = random.choice(frequent)
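The snippet above draws a single number and leaves frequent and rare undefined. A runnable sketch of the full 10,000-number list, assuming for illustration that the frequent numbers are simply 1 to 100 and the rare ones 101 to 1000 (the question does not say which numbers belong to which group):

```python
import random

rng = random.Random(42)          # seeded only to make the sketch repeatable
frequent = list(range(1, 101))   # assumed: the 100 favoured numbers
rare = list(range(101, 1001))    # assumed: the remaining 900 numbers


def draw():
    """Return a rare number about 20% of the time, a frequent one otherwise."""
    if rng.randint(1, 5) == 1:
        return rng.choice(rare)
    return rng.choice(frequent)


numbers = [draw() for _ in range(10_000)]
share = sum(n <= 100 for n in numbers) / len(numbers)
print(f"{share:.1%} of the draws came from the frequent list")
```

The 80/20 split holds on average only, so the printed share will be close to, but usually not exactly, 80%.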
Here's an approach -
import numpy as np

a = np.arange(1,1001) # Input array to extract numbers from
# Select 100 random unique numbers from input array and also store the leftovers
p1 = np.random.choice(a,size=100,replace=False)
p2 = np.setdiff1d(a,p1)
# Get random indices for indexing into p1 and p2
p1_idx = np.random.randint(0,p1.size,(8000))
p2_idx = np.random.randint(0,p2.size,(2000))
# Index and concatenate and randomize their positions
out = np.random.permutation(np.hstack((p1[p1_idx], p2[p2_idx])))
Let's verify after run -
In [78]: np.in1d(out, p1).sum()
Out[78]: 8000
In [79]: np.in1d(out, p2).sum()
Out[79]: 2000
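If an exact 8000/2000 split is not required, a single weighted draw is an alternative. The sketch below passes a probability vector to the `p` argument of the generator's choice method; the variable names and the fixed seed are assumptions made for illustration, and the frequent share is then 80% in expectation rather than exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
a = np.arange(1, 1001)

# Assumed split: 100 randomly chosen "frequent" numbers share 80% of the
# probability mass, the other 900 share the remaining 20%.
frequent = rng.choice(a, size=100, replace=False)
is_frequent = np.isin(a, frequent)
p = np.where(is_frequent, 0.8 / 100, 0.2 / 900)

out = rng.choice(a, size=10_000, p=p)
print(np.isin(out, frequent).mean())  # close to 0.8, not exactly 0.8
```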
