I have two arrays of temperature with different times!
x = array(['1999-03-06T12:00:00.000000000', '1999-03-07T12:00:00.000000000',
'1999-03-08T12:00:00.000000000', ..., '2021-10-09T12:00:00.000000000',
'2021-10-10T12:00:00.000000000', '2021-10-11T12:00:00.000000000'],
dtype='datetime64[ns]')
This one is daily from 1999 to 2021
y = array(['2002-07-22T09:03:54.000000000', '2004-11-03T12:36:57.000000000',
'2004-11-04T05:19:08.000000000', '2004-12-13T11:50:36.000000000',
'2005-06-07T13:16:41.000000000', '2006-07-12T12:31:53.000000000',
'2006-07-22T11:43:24.000000000', '2006-09-10T14:08:57.000000000',...]
This one is random from 2002 to 2021
I would like to know how I can select from x (daily) only the dates that also appear in y
(so that x and y end up with the same dates).
You can try intersect1d method in numpy.
import numpy as np
array_common_dates = np.intersect1d(x, y)
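Note that intersect1d compares full timestamps, so a daily value at 12:00:00 will not match a reading taken at 12:36:57 on the same day. If the goal is to match on the calendar day only, one option is to truncate both arrays to day precision first. The following is a minimal sketch under that assumption; the small sample arrays and the names x_days, y_days and x_selected are made up for illustration:

```python
import numpy as np

# Small stand-ins for the arrays in the question (illustrative values only)
x = np.array(['2004-11-02T12:00:00', '2004-11-03T12:00:00',
              '2004-11-04T12:00:00', '2004-11-05T12:00:00'],
             dtype='datetime64[ns]')
y = np.array(['2004-11-03T12:36:57', '2004-11-04T05:19:08'],
             dtype='datetime64[ns]')

# Drop the time-of-day part so that only the calendar day is compared
x_days = x.astype('datetime64[D]')
y_days = y.astype('datetime64[D]')

# True for every daily timestamp whose day also occurs in y
mask = np.isin(x_days, y_days)
x_selected = x[mask]
print(x_selected)
```

The boolean mask can also be applied to the temperature array that belongs to x, so that the values stay aligned with the selected dates.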
For larger arrays, sorted lookup will be faster than linear lookup:
# if x is not sorted, fix it
ind = np.searchsorted(x, y)
mask = ind < x.size
mask[mask] &= y[mask] == x[ind[mask]]
Another advantage of this method is that it gives you a two-way mapping, in which only the masked elements match:
y[mask] == x[ind[mask]]
Maybe you can do something like this
x = ['1999-03-06T12:00:00.000000000', '1999-03-07T12:00:00.000000000',
'1999-03-08T12:00:00.000000000', '2021-10-09T12:00:00.000000000',
'2021-10-10T12:00:00.000000000', '2021-10-11T12:00:00.000000000']
y = ['1999-03-06T12:00:00.000000001', '1999-03-07T12:00:00.00000004',
'1999-03-08T12:00:00.000000002', '2021-10-09T12:00:00.000001004',
'2021-10-10T12:00:00.000000003', '2021-10-11T12:00:00.000002004']
z = []
for i in x:
    n = i.find('T')
    z.append(i[:n])
new_list = []
for i in z:
    for j in y:
        if i in j:
            new_list.append(j)
print(new_list)
Related
I'm trying to make a loop that finds distances of values of one list, to the values of another list.
The data itself is of varying dimensions in a coordinates layout. Here is an example
x = ['1.23568 1.589887', '1.989 1.689']
y = ['2.5689 1.5789', '2.898 2.656']
I would like to be able to make a separate list for each y value and its distance from each x value.
There are always more x values than y values.
This is what I have so far:
def distances_y(x,y):
    for i in y:
        ix = [i.split(' ',)[0] for i in y]
        for z in x:
            zx = [z.split('',1)[0] for z in x]
            distances_1 = [zx - ix for z in x]
            return distances_1
        print(i +"_"+"list") = [distance_1]
But I'm stuck on how to create individual lists for each y value.
Each distance also needs to be a list in itself, a list in a list so to speak.
The largest problem is that I am unable to use packages besides tkinter for this.
Try using a dictionary instead:
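def distances_y(x, y):
    dct = {}
    # first coordinate of every x value, converted to a number
    zx = [float(z.split(' ', 1)[0]) for z in x]
    for i in y:
        # first coordinate of this y value
        iy = float(i.split(' ', 1)[0])
        distances_1 = [v - iy for v in zx]
        # one dictionary entry per y value, stored as a list in a list
        dct[i + "_" + "list"] = [distances_1]
    return dct
And to get the values, do:
dct = distances_y(x, y)
print(dct)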
And if you want to get a specific key name, try:
print(dct[<key name over here>])
And if you want a 2-d array:
for each x you would add
my2d.append([])
and for each y
my2d[i].append(x[i] - y[j])
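The question does not say which distance is meant. If it is the straight-line (Euclidean) distance between two-dimensional points, the idea can be sketched with the standard library only, which should fit the "no extra packages" restriction. The function name distance_lists and the helper parse are hypothetical, and the Euclidean interpretation is an assumption:

```python
import math

def distance_lists(x, y):
    """Return {y_value: [[d1], [d2], ...]} with one distance per x value."""
    def parse(point):
        # '1.989 1.689' -> (1.989, 1.689)
        a, b = point.split()
        return float(a), float(b)

    x_points = [parse(p) for p in x]
    result = {}
    for label in y:
        yx, yy = parse(label)
        # each distance is wrapped in its own list, as the question asks
        result[label] = [[math.hypot(px - yx, py - yy)] for px, py in x_points]
    return result

x = ['1.23568 1.589887', '1.989 1.689']
y = ['2.5689 1.5789', '2.898 2.656']
for label, dists in distance_lists(x, y).items():
    print(label, dists)
```

Using the original y string as the dictionary key avoids having to create a separately named variable for each list.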
I took a large data set, removed any numbers that are not within 2 SD of the mean of a specific column, and created an array from the result. Now I want to remove any numbers that are not in that array from the columns without messing up the index. Preferably, any values that are not present would be converted to NaN.
Code used to remove values outside of 2 SD:
pupil_area_array = numpy.array(part_data['pupil_area'])
mean = numpy.mean(part_data['pupil_area'], axis=0)
sd = numpy.std(part_data['pupil_area'], axis=0)
final_list = [x for x in part_data['pupil_area'] if (x > mean - 2 * sd)]
final_list = [x for x in final_list if (x < mean + 2 * sd)]
print(final_list)
If you are not restricted to using a generator, you should be able to use map() https://www.geeksforgeeks.org/python-map-function/:
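def filter_sd(value):
    # keep values strictly within 2 SD of the mean
    if mean - 2 * sd < value < mean + 2 * sd:
        return value
    return numpy.nan  # placeholder that keeps the original position
final = list(map(filter_sd, part_data['pupil_area']))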
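The same replacement can also be written without an explicit function by building a boolean mask and passing it to numpy.where. This is only a sketch: the array values stands in for the pupil_area column, and its numbers are invented for the example.

```python
import numpy as np

# Stand-in for part_data['pupil_area'] (made-up numbers, one clear outlier)
values = np.array([10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 11.0, 9.0, 10.0, 50.0])

mean = values.mean()
sd = values.std()

# True where a value lies strictly within 2 SD of the mean
inside = (values > mean - 2 * sd) & (values < mean + 2 * sd)

# Keep values inside the band, put NaN everywhere else; length is unchanged
filtered = np.where(inside, values, np.nan)
print(filtered)
```

Because the result has the same length as the input, it can be assigned back to the column without disturbing the index.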
I apologise for the terrible description, and if this is a duplicate; I have no idea how to phrase this question. Let me explain what I am trying to do. I have a list of 0s and 1s that is 3600 elements long (1 hour of time series data). I used itertools.groupby() to get a list of (value, run length) tuples for consecutive values. I need (0,1) to be counted as (1,1) and summed with the flanking tuples.
so
[(1,8),(0,9),(1,5),(0,1),(1,3),(0,3)]
becomes
[(1,8),(0,9),(1,5),(1,1),(1,3),(0,3)]
which should become
[(1,8),(0,9),(1,9),(0,3)]
right now, what i have is
def counter(file):
    list1 = list(dict[file]) #make a list of the data currently working on
    graph = dict.fromkeys(list(range(0,3601))) #make a graphing dict, x = key, y = value
    for i in list(range(0,3601)):
        graph[i] = 0 # set all the values/ y from none to 0
    for i in list1:
        graph[i] +=1 #populate the values in graphing dict
    x,y = zip(*graph.items()) # unpack graphing dict into list, x = 0 to 3600 and y = time where it bite
    z = [(x[0], len(list(x[1]))) for x in itertools.groupby(y)] #make a new list z where consecutive y is in format (value, count)
    z[:] = [list(i) for i in z]
    for i in z[:]:
        if i == [0,1]:
            i[0]=1
    return(z)
dict is a dictionary where the keys are filenames and the values are lists of numbers to be used in the function counter(). This gives me something like the following, but much longer:
[[1,8],[0,9],[1,5], [1,1], [1,3],[0,3]]
edits:
solved it with the help of a friend,
while (0,1) in z:
    idx=z.index((0,1))
    if idx == len(z)-1:
        break
    z[idx] = (1,1+z[idx-1][1] + z[idx+1][1])
    del z[idx+1]
    del z[idx-1]
I'm not sure exactly what you need, but this is my best attempt at understanding it.
def do_stuff(original_input):
    new_original = []
    new_original.append(original_input[0])
    for el in original_input[1:]:
        if el == (0, 1):
            el = (1, 1)
        if el[0] != new_original[-1][0]:
            new_original.append(el)
        else:
            (a, b) = new_original[-1]
            new_original[-1] = (a, b + el[1])
    return new_original
# check
print (do_stuff([(1,8),(0,9),(1,5),(0,1),(1,3),(0,3)]))
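Since the question already uses itertools.groupby(), the two steps (relabel every (0,1) run as (1,1), then add up neighbouring runs with the same value) can also be sketched with a second groupby pass. The function name merge_runs is hypothetical, and the sketch assumes the input is a list of (value, count) tuples:

```python
from itertools import groupby

def merge_runs(runs):
    """Treat every (0, 1) run as (1, 1), then sum adjacent runs of equal value."""
    relabelled = [(1, n) if (k, n) == (0, 1) else (k, n) for k, n in runs]
    merged = []
    # groupby collects consecutive tuples that share the same first element
    for value, group in groupby(relabelled, key=lambda run: run[0]):
        merged.append((value, sum(count for _, count in group)))
    return merged

print(merge_runs([(1, 8), (0, 9), (1, 5), (0, 1), (1, 3), (0, 3)]))
# [(1, 8), (0, 9), (1, 9), (0, 3)]
```

Unlike the index-based loop, this version also handles a (0,1) run at the very start or end of the list, where it only has one neighbour to merge with.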
I've managed to create a for-loop, which provides me with the results that I want, but I'm struggling to collate these results into a single array, so that I can plot it as my x value on a graph.
I have considered collating them into a single list first (but am also struggling to do this).
I have also tried to append, extend, and stack the array below, but nothing seems to work.
When I tried to append, I got an error message that appears to say that there is no 'value' present.
a = 0.1
x = 0.2
for i in range(1,10):
    a = a**3
    x = x**2
    array = np.array ([a, x])
    print (array)
The code above provides 9 individual arrays, as opposed to just 1.
i.e. [(a1, x1), (a2, x2), ... (a9, x9)]
Any suggestions to fix this or alternative methods would be greatly appreciated! Thank you!
So you want to store both variable values in the pattern (a1, x1), (a2, x2), ...
This can be done by first building two separate lists for a and x and then merging them into the desired format.
The whole code is shown here:
import numpy as np
a = 0.1
x = 0.2
list1= []
list2=[]
for i in range(1,10):
    a = a**3
    x = x**2
    list1.append(a)
    list2.append(x)
merged_list = [(list1[i], list2[i]) for i in range(0, len(list1))]
print(merged_list)
This will give you the desired output. Thanks for asking.
Use append to add values to a list:
a = 0.1
x = 0.2
array = []
for i in range(1,10):
    a = a**3
    x = x**2
    array.append([a, x])
print(array)
If you want a numpy.array:
import numpy as np
n = np.arange(1, 10)
a = 0.1 ** (3.0 ** n)  # same as cubing 0.1 repeatedly, n times
x = 0.2 ** (2.0 ** n)  # same as squaring 0.2 repeatedly, n times
print(np.column_stack((a, x)))
Do you want to append multiple items to a list?
First solution:
a = 0.1
x = 0.2
l = []
for i in range(1,10):
    a = a**3
    x = x**2
    l.extend([a, x])
print(l)
Second solution:
a = 0.1
x = 0.2
l = []
for i in range(1,10):
    a = a**3
    x = x**2
    l+= [a, x]
print(l)
Do you want to append multiple items to a numpy array?
import numpy as np
a = 0.1
x = 0.2
array = np.array([])
for i in range(1,10):
    a = a**3
    x = x**2
    array = np.append(array, [a,x])
print(array)
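Note that np.append with a list flattens everything into one long 1-D array. If the goal is one row per iteration, so that a column can be used directly as the x values of a plot, one possible sketch is to collect the pairs in a plain list and convert once at the end. The names pairs, result, a_values and x_values are made up for this example:

```python
import numpy as np

a = 0.1
x = 0.2
pairs = []
for _ in range(1, 10):
    a = a ** 3
    x = x ** 2
    pairs.append((a, x))      # one (a, x) pair per iteration

result = np.array(pairs)       # 2-D array with one row per iteration
a_values = result[:, 0]        # first column: all a values
x_values = result[:, 1]        # second column: all x values
print(result.shape)
```

Converting once after the loop also avoids copying the whole array on every iteration, which is what np.append does.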
I need to get groups of highly correlated variables from a correlation coefficient matrix, keep one member of each group, and exclude the others. But I don't know how to do it gracefully and efficiently.
Here's a similar answer, but I hope this can be done in a vectorized way on the matrix:
Merge arrays if they contain one or more of the same value
For example:
a = np.array([[1,0,0,0,0,1],
[0,1,0,1,0,0],
[0,0,1,0,1,1],
[0,1,0,1,0,0],
[0,0,1,0,1,0],
[1,0,1,0,0,1]])
Diagonal:
(0,0),(1,1),(2,2)...(5,5)
Other:
(0,5),(1,3),(2,4),(2,5)
These three pairs share elements with one another, so they are merged into one group:
(0,2,4,5) = (0,5),(2,4),(2,5)
So ultimately I need the output:
(I will use the results to index other data and therefore decide to keep the largest index in each group)
out = [(0,2,4,5),(1,3)]
I think the simplest approach is a nested loop that iterates through all the elements multiple times. I would like a more concise and efficient way to achieve this, thank you.
This is a loop implementation; I'm sorry that it is hard to read:
a = np.array([[1,0,0,0,0,1],
              [0,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,1,0,1,0,0],
              [0,0,1,0,1,0],
              [1,0,1,0,0,1]])
a[np.tril_indices(6, -1)]= 0
a[np.diag_indices(6)] = 0
g = list(np.c_[np.where(a)])
p = {}; index = 1
while len(g)>0:
    x = g.pop(0)
    if not p:
        p[index] = list(x)
        for i,l in enumerate(g):
            if np.in1d(l,x[0]).any()|np.in1d(l,x[1]).any():
                n = list(g.pop(i))
                p[index].extend(n)
    else:
        T = False
        for key,v in p.items():
            if np.in1d(v,x[0]).any()|np.in1d(v,x[1]).any():
                v.extend(list(x))
                T = True
        if T==False:
            index += 1; p[index] = list(x)
            for i,l in enumerate(g):
                if np.in1d(l,x[0]).any()|np.in1d(l,x[1]).any():
                    n = list(g.pop(i))
                    p[index].extend(n)
for key,v in p.items():
    print(key,np.unique(v))
out:
1 [0 2 4 5]
2 [1 3]
The central problem of merging / consolidating the pairs with common extrema can be solved using this answer.
Hence, the above code may be rewritten like:
a = np.array([[1,0,0,0,0,1],
[0,1,0,1,0,0],
[0,0,1,0,1,1],
[0,1,0,1,0,0],
[0,0,1,0,1,0],
[1,0,1,0,0,1]])
a[np.tril_indices(6, -1)]= 0
a[np.diag_indices(6)] = 0
g = np.c_[np.where(a)].tolist()
def consolidate(items):
    items = [set(item.copy()) for item in items]
    for i, x in enumerate(items):
        for j, y in enumerate(items[i + 1:]):
            if x & y:
                items[i + j + 1] = x | y
                items[i] = None
    return [sorted(x) for x in items if x]
p = {i + 1: x for i, x in enumerate(sorted(consolidate(g)))}
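Grouping indices that are linked through shared pairs is the same as finding the connected components of a graph whose adjacency matrix is a. As an alternative to the set-merging approach, here is a minimal union-find sketch; it is a different technique from the one in the linked answer, and the function name group_indices is hypothetical:

```python
import numpy as np

def group_indices(adj):
    """Group indices linked by nonzero off-diagonal entries (union-find)."""
    n = adj.shape[0]
    parent = list(range(n))

    def find(i):
        # follow parent links to the root, shortening the path on the way
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # only the upper triangle is needed because the matrix is symmetric
    rows, cols = np.nonzero(np.triu(adj, k=1))
    for r, c in zip(rows.tolist(), cols.tolist()):
        parent[find(r)] = find(c)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(tuple(g) for g in groups.values())

a = np.array([[1,0,0,0,0,1],
              [0,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,1,0,1,0,0],
              [0,0,1,0,1,0],
              [1,0,1,0,0,1]])
print(group_indices(a))
# [(0, 2, 4, 5), (1, 3)]
```

An index with no off-diagonal link would come out as a group of its own. Taking max(group) for each tuple gives the largest index per group, which is what the question says it wants to keep.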