Assume that I have a list data structure: list.
And I see this code: list[:,0]>5
I don't know what it means, but I know what list[:,0] means.
I googled it and read a lot on python.org, but I couldn't find an appropriate answer.
I realized it's a very simple thing:
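list > 5 compares every element of list with 5; where an element is larger than 5 the result is True, otherwise False. This element-wise behaviour belongs to NumPy arrays (which is also what the list[:,0] indexing implies), not to plain Python lists.
So if
import numpy as np
list = np.array([1,2,6])
list > 5
# -> array([False, False, True])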
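For a plain Python list, which does not support list > 5 in Python 3 (the comparison raises a TypeError), roughly the same result can be sketched with a list comprehension. The names values, rows, mask and first_col_mask below are illustrative and do not come from the original post.

```python
# Element-wise comparison on a plain (1-D) Python list.
values = [1, 2, 6]
mask = [v > 5 for v in values]

# Rough equivalent of arr[:, 0] > 5 for a list of rows:
# take the first item of every row, then compare it with 5.
rows = [[1, 9], [7, 3], [6, 0]]
first_col_mask = [row[0] > 5 for row in rows]

print(mask)            # [False, False, True]
print(first_col_mask)  # [False, True, True]
```

With a NumPy array the comparison operator does this looping for you and returns a boolean array instead of a list.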
I have a collection containing different values. I tried to check the type of each value and got an unexpected output. It was the Booleans True and False that threw me off. The code and the results follow.
trial_set = {1,2,3,4,None, True, False}
new_list = [type(x) for x in trial_set]
new_set = {type(x) for x in trial_set}
print(new_list)
print(new_set)
Questions:
Why does the bool datatype appear first in the output even though the Booleans are the last two elements in the set?
Why is there only one bool datatype in the output when I have two Booleans in the set?
I understand that the bool datatype is a subclass of the integer datatype, with values that behave like 0 and 1, and I tried to figure this out from that angle but came up empty. Please help clarify.
Thank you in advance.
The result comes from the way Python sets work combined with how Booleans work under the hood. Since True == 1 in Python (and both hash to the same value), a set keeps only a single occurrence of 1 or True, in this case 1, because it was added first.
Also, sets in Python are unordered, so when you iterate over one with a comprehension to make a list, the order is not guaranteed to match the order you typed.
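A small sketch that makes the deduplication visible. It reuses trial_set from the question; the names type_names, kept_true and kept_false are illustrative additions. The type names are sorted only to get a stable order, since a set does not guarantee one.

```python
trial_set = {1, 2, 3, 4, None, True, False}

# bool is a subclass of int, and True compares (and hashes) equal to 1.
print(issubclass(bool, int))               # True
print(True == 1, hash(True) == hash(1))    # True True

# 1 was inserted before True, so the set keeps 1 and drops True.
# False stays because no 0 is present to collide with it.
print(len(trial_set))                      # 6

kept_true = any(x is True for x in trial_set)
kept_false = any(x is False for x in trial_set)
print(kept_true, kept_false)               # False True

# Sort the type names so the output does not depend on set order.
type_names = sorted(type(x).__name__ for x in trial_set)
print(type_names)
```

This is why only one bool shows up in the list of types: the only Boolean object left in the set is False.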
I think all will skip the rest of an array of boolean values as soon as it encounters a False. Can anyone please confirm the same?
array = [False, True, True ...(1000s of values of True)]
all(array)
The time complexity to run the above set of statements would/should be constant, i.e. O(1), right?
all will stop. Since it works with any arbitrary iterable, you can see this using a list iterator.
>>> i = iter([False, True, True])
>>> all(i)
False
>>> list(i)
[True, True]
If all hadn't stopped, list(i) would have returned the empty list.
Another way to see this is to see that all will terminate when given an infinite sequence.
>>> from itertools import repeat
>>> all(repeat(False))
False
all(repeat(True)), on the other hand, will never terminate.
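The short circuit can also be observed by counting how many items all actually pulls from its input. This is a minimal sketch; the helper tracked and the lists seen_early and seen_full are made up for illustration.

```python
def tracked(values, seen):
    """Yield each value, recording it in `seen` as it is consumed."""
    for value in values:
        seen.append(value)
        yield value

# A False in first position: all() stops after a single item.
seen_early = []
result_early = all(tracked([False] + [True] * 1000, seen_early))
print(result_early, len(seen_early))   # False 1

# No False anywhere: all() has to look at every item.
seen_full = []
result_full = all(tracked([True] * 1000, seen_full))
print(result_full, len(seen_full))     # True 1000
```

So the cost is constant only when a False appears early; in the worst case (no False, or a False near the end) all still takes time proportional to the length of the input.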
Trying to create multiple lists that are dependent on the previous list.
So for example list 1 would read a specific file and return either a number or the boolean false based on a comparison.
The second list would then compare the number that appears in the same position as those in the previous list (if the value from the previous list was not false) and return the value or false based on the same comparison as the first list
I created a function that carries out these comparisons and creates a list
def generic_state_machine(file,obs_nums):
    return file.ix[:,0][obs_nums] if file.ix[:,0][obs_nums] > 0.2 else False
Note: obs_nums looks at the position of the item in a list
I then created the lists that look at different files
session_to_leads = []
lead_to_opps = []
for i in range(1,len(a)):
    session_to_leads.append(generic_state_machine(file=a,obs_nums=i))
    lead_to_opps.append(generic_state_machine(file=b,obs_nums=i)) if session_to_leads != False else lead_to_opps.append(False)
Given
a = pd.DataFrame([0,0.9,0.6,0.7,0.8])
b = pd.DataFrame([0.7,0.51,0.3,0.7,0.2])
I managed to sort out the initial error I encountered, the only problem now is that list lead_to_opps is not dependent on session_to_leads so if there is a False value in position 1, lead_to_opps will not automatically return a False in the same position. So assuming that random.uniform(0,1) generates 0.5 all the time, this is my current outcome:
session_to_leads = [False,0.9,0.6,0.7,0.8]
lead_to_opps = [0.7,0.51,False,0.7,False]
whereas my desired outcome would be
session_to_leads = [False,0.9,0.6,0.7,0.8]
lead_to_opps = [False,0.51,False,0.7,False]
"During handling of the above exception, another exception occurred:"
This is not an error in itself; it basically means "while handling the previous error, this new error occurred."
Please post the error that comes before this one; it will help a lot.
Also, I did not get what [obs_nums] is.
It looks like
file.ix[:, 0][obs_nums]
is the problem, assuming .ix behaves like .loc (.ix is deprecated and was removed in pandas 1.0).
>>> help(pd.DataFrame.loc)
Allowed inputs are...
- A slice object with labels, e.g. 'a':'f'
warning:: Note that contrary to usual python slices,
**both** the start and the stop are included
It's a bit difficult to follow the indexing but do you need to slice at all? Would just:
file.loc[obs_nums]
return the number or Boolean you are looking for?
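Independently of the indexing, the second list only becomes dependent on the first if the check looks at the value just computed for the same position, rather than at the whole list (session_to_leads != False compares the list itself with False, which is always true). Below is a minimal sketch of that idea under several assumptions: plain Python lists stand in for the first column of each DataFrame, the threshold is 0.5 because that is what the outputs shown in the question imply (the posted function uses 0.2), the loop starts at position 0 so that all five values are covered, and the helper name state is hypothetical.

```python
# Plain lists standing in for the first column of DataFrames a and b.
a = [0, 0.9, 0.6, 0.7, 0.8]
b = [0.7, 0.51, 0.3, 0.7, 0.2]

THRESHOLD = 0.5  # assumption: inferred from the outputs in the question


def state(column, position, threshold=THRESHOLD):
    """Return the value at `position` if it exceeds the threshold, else False."""
    value = column[position]
    return value if value > threshold else False


session_to_leads = []
lead_to_opps = []
for i in range(len(a)):
    lead = state(a, i)
    session_to_leads.append(lead)
    # Depend on the value at this position, not on the whole list.
    if lead is False:
        lead_to_opps.append(False)
    else:
        lead_to_opps.append(state(b, i))

print(session_to_leads)  # [False, 0.9, 0.6, 0.7, 0.8]
print(lead_to_opps)      # [False, 0.51, False, 0.7, False]
```

The test uses lead is False rather than lead != False so that a legitimate numeric value is never mistaken for the False marker.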
My function takes a number, and a list of numbers.
It returns any 2 numbers in the list that add up to the original number, in the form [Num1, Num2].
Now I don't want any "duplicates" i.e. I only want [4, -7] returned and not [4, -7], [-7, 4].
def pairs(n, num_list):
    newest_list = []
    for j in range(len(num_list)):
        for i in range(len(num_list)-1):
            if num_list[j] + num_list[i+1] == n:
                newest_list.append([num_list[j], num_list[i+1]])
    return newest_list
Now I'd like a simple hint rather than posted code.
My question is:
Do I have the ability to do that within my code, and if so, a hint would be great, or will I need to define another function to do that for me?
You definitely have the ability to do that within your code.
A hint to complete this would be to think about at what point in your code it makes sense to stop searching for further matches and to return what you've found. Let me know if that's too cryptic!
You can still do that in your current code by simply putting these two numbers into a set. For more info, this will help you.
If you have 2 lists, l1 and l2, where:
l1=[1,2]
l2=[2,1]
If you convert them to sets, you can compare them and they will evaluate to True if they have the same elements, no matter what the order is:
set(l1) == set(l2) # this evaluates to True
In your if condition, before appending the numbers, you can check if the set set([num_list[j], num_list[i+1]]) is already in newest_list.
I am tempted to write some code, but you said not to, so I'll leave it here :p
You can leave your code the way it is, but before you return the list, you can filter the list with a predicate that the pair [a,b] is only accepted if pair [b,a] is not in the list
When adding a pair [a, b] to the result list, sort the pair, then see if it's in the result list. If so, don't add it.
Also, consider using a Python set.
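For readers who do want to see it spelled out, here is one possible sketch of the "sort the pair and remember it in a set" idea. It is not the asker's final code: the seen set is an addition, and the inner loop is changed to start at j + 1 so that an element is never paired with itself, which the original loop allows.

```python
def pairs(n, num_list):
    """Return pairs [x, y] from num_list with x + y == n, ignoring order."""
    seen = set()        # sorted tuples of pairs already reported
    newest_list = []
    for j in range(len(num_list)):
        # Start after j so each position pair is visited once.
        for i in range(j + 1, len(num_list)):
            x, y = num_list[j], num_list[i]
            if x + y == n:
                key = tuple(sorted((x, y)))
                if key not in seen:
                    seen.add(key)
                    newest_list.append([x, y])
    return newest_list


print(pairs(-3, [4, -7, 1, 2]))  # [[4, -7]]
print(pairs(5, [1, 4, 4, 1]))    # [[1, 4]]
```

Tuples are used as set members because lists are not hashable; sorting makes [4, -7] and [-7, 4] map to the same key.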
I have a field that contains an array with three values that are either set to null or true:
"evidence": [true, null, true]
I want to make a query that will match a couple of different combinations of these values, such as:
"evidence": [true,true,null]
"evidence": [true,true,true]
I am currently doing this by doing two different queries. Can I match an array with a regex like:
"evidence": [true,true,/true|null/]
My attempts at doing this have returned zero results.
In your query object, you can refer to specific elements in your evidence array by their numeric index. If you combine this ability with an $in operator to match one of a set of values, you can do the query like this:
In the shell:
db.test.find({
    'evidence.0': true,
    'evidence.1': true,
    'evidence.2': {$in: [true, null]}
})
In Python:
db.test.find({
    'evidence.0': True,
    'evidence.1': True,
    'evidence.2': {'$in': [True, None]}
})
A simple way to do this is with set operations, as was noted. For a single field in that array, you can do something like:
container['evidence'][index] in set([True, None])
If you are trying to match all of the fields in the array, you could do something like making a list of those sets of acceptable values, then doing something like:
query = (set([True]), set([True]), set([True, None]))
all(x in query[i] for i, x in enumerate(container['evidence']))
This construct will check each one in a row and if one fails, it will bail out with a False. Without more info about what you're trying to do, it's hard to improve on these general comments.
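The client-side check described above can be sketched as a small function. This runs in Python on a document that has already been fetched, not inside the database; the function name matches and the variable allowed are illustrative, and the accepted values are taken from the two combinations listed in the question.

```python
def matches(evidence, allowed):
    """True if every item of `evidence` is in the set allowed for its position."""
    return len(evidence) == len(allowed) and all(
        value in options for value, options in zip(evidence, allowed)
    )


# Positions 0 and 1 must be True; position 2 may be True or None.
allowed = ({True}, {True}, {True, None})

print(matches([True, True, None], allowed))  # True
print(matches([True, True, True], allowed))  # True
print(matches([True, None, True], allowed))  # False
```

Because all short-circuits, the check stops at the first position that fails. Filtering this way means fetching every document first, so the query with $in is usually the better choice when the database can do the work.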