Match all fields of an array using a regex - python

I have a field that contains an array with three values that are either set to null or true:
"evidence": [true, null, true]
I want to make a query that will match a couple of different combinations of these values, such as:
"evidence": [true,true,null]
"evidence": [true,true,true]
I currently do this with two separate queries. Can I match an array with a regex like:
"evidence": [true,true,/true|null/]
My attempts at doing this have returned zero results.

In your query object, you can refer to specific elements in your evidence array by their numeric index. If you combine this ability with an $in operator to match one of a set of values, you can do the query like this:
In the shell:
db.test.find({
'evidence.0': true,
'evidence.1': true,
'evidence.2': {$in: [true, null]}
})
In Python:
db.test.find({
'evidence.0': True,
'evidence.1': True,
'evidence.2': {'$in': [True, None]}
})
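If the allowed values per position change often, the query document can be generated instead of written by hand. The following is only a sketch: build_positional_query is a hypothetical helper (it is not part of PyMongo), and it merely builds the dictionary that would be passed to find().

```python
def build_positional_query(field, allowed_per_position):
    """Build a MongoDB query dict that checks each array position.

    Hypothetical helper for illustration only; not part of PyMongo.
    """
    query = {}
    for index, allowed in enumerate(allowed_per_position):
        key = f"{field}.{index}"
        if len(allowed) == 1:
            # A single allowed value becomes a plain equality match.
            query[key] = allowed[0]
        else:
            # Several allowed values become an $in match.
            query[key] = {"$in": list(allowed)}
    return query


query = build_positional_query("evidence", [[True], [True], [True, None]])
print(query)
# The result can then be passed to find(), e.g. db.test.find(query)
```

The generated dictionary is the same one shown in the Python query above, so nothing changes on the server side.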

A simple way to do this in plain Python is with set membership, as was noted. For a single element of the array, you can do something like:
container['evidence'][index] in {True, None}
To match every element of the array, make a tuple of sets of acceptable values, one per position, then check each element against its own set:
query = ({True}, {True}, {True, None})
all(x in query[i] for i, x in enumerate(container['evidence']))
This checks the elements in order and short-circuits, returning False at the first one that fails. Without more info about what you're trying to do, it's hard to improve on these general comments.
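As a self-contained sketch of the set-based check, using the combinations from the question (true, true, then true or null): the function name matches and the sample data are illustrative assumptions, not part of the original answer.

```python
def matches(values, allowed_per_position):
    """Return True if every value is in the allowed set for its position."""
    if len(values) != len(allowed_per_position):
        return False
    # all() stops at the first position that fails.
    return all(
        value in allowed
        for value, allowed in zip(values, allowed_per_position)
    )


# One set of acceptable values per array position.
allowed = ({True}, {True}, {True, None})

container = {"evidence": [True, True, None]}
print(matches(container["evidence"], allowed))  # True
print(matches([True, None, True], allowed))     # False
```

Note that this runs on documents already loaded into Python; it does not filter on the database side. Also, because 1 == True in Python, the integer 1 would pass a check against {True}.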

Related

Comparison Operation of List Data Structure on Python

Assume I have a list data structure: list.
I came across this code: list[:,0]>5
I don't know what it means, although I know what list[:,0] means.
I googled it and read a lot on python.org, but I couldn't find a suitable answer.
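I realized it's very simple, provided list is a NumPy array rather than a built-in Python list (a built-in list supports neither the [:,0] index nor comparison with an integer):
list > 5 compares every element of list with 5; where the element is larger than 5 the result is True, otherwise False.
So if
import numpy as np
list = np.array([1, 2, 6])
list > 5
# -> array([False, False,  True])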
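In Python 3, comparing a built-in list with an integer raises TypeError, so for a plain list the same element-wise result has to be spelled out, for example with a list comprehension. A minimal sketch with illustrative variable names:

```python
values = [1, 2, 6]

# Element-wise comparison of a plain list with a scalar.
mask = [x > 5 for x in values]
print(mask)  # [False, False, True]

# Plain-list counterpart of "first column > 5" for a list of rows:
# compare the first item of each row.
rows = [[7, 1], [3, 9], [6, 0]]
first_column_mask = [row[0] > 5 for row in rows]
print(first_column_mask)  # [True, False, True]
```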

Deleting rows in a pandas dataframe if they contain a certain string

I have a list of columns in a dataframe that either contains a hashmark followed by a string or two hashmarks followed by a string. I wanted to eliminate the rows that contain only one hashmark.
df[df["column name"].str.contains("#") == False]
I've tried using the code above but it erased the entire column. I hoped that it would erase only the rows including only one hashmark. I do not know what to do.
Can you try this:
df['len'] = df['column name'].str.count('#')  # number of '#' characters in each value
df = df[df['len'] > 1]
# or in one line
df = df[df['column name'].str.count('#') > 1]
If every value contains at least one '#', and it is either '#' or '##', then
df[df["column name"].str.contains("##") == False]
will give you the rows with a single '#', and
df[df["column name"].str.contains("##") == True]
will drop the single-'#' rows and give you the '##' rows.
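To see the difference on concrete data, here is a small sketch. The sample values are made up for illustration; only the column name comes from the question.

```python
import pandas as pd

# Illustrative data: values start with one or two hashmarks.
df = pd.DataFrame({"column name": ["#alpha", "##beta", "#gamma", "##delta"]})

# The attempt from the question: every value contains "#", so nothing survives.
original_attempt = df[~df["column name"].str.contains("#")]
print(len(original_attempt))  # 0

# Count the '#' characters per value and keep rows with more than one.
hash_counts = df["column name"].str.count("#")
result = df[hash_counts > 1]
print(result)
```

This suggests why the original filter appeared to erase everything: the condition excluded any row containing a hashmark at all, and every row has at least one.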

How to show all rows of a filter containing Boolean values

In a given pandas dataframe, is there a way to see all the Booleans present in filt in the code below:
filt = dataframe['tag1'] =='ABC'
filt
TLDR
It's possible. I think you should use indexing, which is described extensively in the pandas documentation. More specifically, you can use boolean indexing.
The code should look like this:
filt = df[df.loc[:, "tag1"] == 'ABC']
Here is what actually happens:
df.loc[:, "tag1"] selects all rows (the : character) but limits the columns to just "tag1". Next, df.loc[:, "tag1"] == 'ABC' compares each returned value with "ABC", producing a Series of True/False values: True where the row equals "ABC", False otherwise. Now the grand finale: whenever you pass such a Series of boolean values to a dataframe, each value indicates whether the corresponding row is included in the result. So if the value at position 0 of the mask is True, row 0 will be included in the result.
I understand it's hard to wrap one's head around this concept, but once you get it, it's super useful. The best way is to just play around with the iloc[] and loc[] indexers.
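A short sketch of both steps, the boolean mask itself and the rows it selects. The sample dataframe is an assumption made for illustration; only the column name tag1 comes from the question.

```python
import pandas as pd

# Illustrative data.
df = pd.DataFrame({"tag1": ["ABC", "XYZ", "ABC"], "value": [1, 2, 3]})

# The comparison yields a boolean Series with one entry per row.
filt = df["tag1"] == "ABC"

# to_string() renders every entry, so a long Series is not truncated when printed.
print(filt.to_string())

# Passing the mask back to the dataframe keeps only the rows marked True.
selected = df[filt]
print(selected)
```

Printing filt.to_string() is one way to inspect every Boolean in the mask, which is what the question asks for; df[filt] is the boolean indexing step described above.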

Is there a script for extracting distinct and not null values in python?

I'm doing data profiling. I want to extract only the distinct values and values that are not null in Python. I have tried creating open lists and appending all new values to the list but that was completely unsuccessful.
Assuming you have your values in a list:
(b := set(List)).discard(None)
set(List) produces a mathematical set, which is basically the list without repeated values; discard(None) then gets rid of the null value and, unlike remove(None), does not raise KeyError when the list contains no None.
I would suggest:
filtered_list = {value for value in your_list if value != '' and value is not None}
A set is an unordered collection that holds only unique values. I do not have enough information on what your null/empty values look like, but you can easily adjust the conditions as I did in the example above.

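Because a set is unordered, the approaches above lose the original order of the values. If order matters for the profiling report, a sketch like the following keeps the first occurrence of each value. The function name and sample data are illustrative assumptions, and the values are assumed to be hashable.

```python
def distinct_non_null(values):
    """Return unique values in first-seen order, skipping None and empty strings."""
    seen = set()
    result = []
    for value in values:
        if value is None or value == "":
            continue
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


data = ["a", None, "b", "a", "", "c", None, "b"]
print(distinct_non_null(data))  # ['a', 'b', 'c']
```
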
Why does Python have an __ne__ operator method instead of just __eq__?

The answer here gives a handwaving reference to cases where you'd want __ne__ to return something other than just the logical inverse of __eq__, but I can't imagine any such case. Any examples?
SQLAlchemy is a great example. For the uninitiated, SQLAlchemy is an ORM and uses Python expressions to generate SQL statements. In an expression such as
meta.Session.query(model.Theme).filter(model.Theme.id == model.Vote.post_id)
the model.Theme.id == model.Vote.post_id does not return a boolean, but an object that eventually produces a SQL query like WHERE theme.id = vote.post_id. The inverse would produce something like WHERE theme.id <> vote.post_id, so both methods need to be defined.
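The idea can be illustrated with a toy version. This is not SQLAlchemy's implementation; the Column and Expression classes below are hypothetical stand-ins written only to show why __ne__ needs its own definition.

```python
class Expression:
    """A rendered SQL fragment (toy example)."""

    def __init__(self, sql):
        self.sql = sql

    def __repr__(self):
        return f"Expression({self.sql!r})"


class Column:
    """A toy column whose comparisons build SQL text instead of returning bool."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return Expression(f"{self.name} = {other.name}")

    def __ne__(self, other):
        return Expression(f"{self.name} <> {other.name}")

    # Defining __eq__ sets __hash__ to None; restore identity-based hashing.
    __hash__ = object.__hash__


theme_id = Column("theme.id")
post_id = Column("vote.post_id")

print((theme_id == post_id).sql)  # theme.id = vote.post_id
print((theme_id != post_id).sql)  # theme.id <> vote.post_id
```

Without the explicit __ne__, Python 3 would fall back to inverting the result of __eq__, which here would yield a plain False instead of an expression object.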
Some libraries do fancy things and don't return a bool from these operations. For example, with numpy:
>>> import numpy as np
>>> np.array([1,2,5,4,3,4,5,4,4])==4
array([False, False, False, True, False, True, False, True, True], dtype=bool)
>>> np.array([1,2,5,4,3,4,5,4,4])!=4
array([ True, True, True, False, True, False, True, False, False], dtype=bool)
When you compare an array to a single value or another array you get back an array of bools of the results of comparing the corresponding elements. You couldn't do this if x!=y was simply equivalent to not (x==y).
More generally, in many-valued logic systems, equals and not equals are not necessarily exact inverses of each other.
The obvious example is SQL, where comparisons use three-valued logic: TRUE = TRUE and FALSE = FALSE are true, but both NULL = NULL and NULL <> NULL evaluate to NULL (unknown) rather than to true or false. Although I don't know of any specific Python examples, I can imagine it being implemented in places.
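As a sketch of how such three-valued behaviour could be modelled in Python: the Unknown and SqlNull classes below are hypothetical and written only for illustration, not taken from any library.

```python
class Unknown:
    """Third truth value, standing in for the result of a SQL NULL comparison."""

    def __repr__(self):
        return "UNKNOWN"


UNKNOWN = Unknown()


class SqlNull:
    """Toy NULL: both == and != yield UNKNOWN, so neither inverts the other."""

    def __eq__(self, other):
        return UNKNOWN

    def __ne__(self, other):
        return UNKNOWN

    # Defining __eq__ sets __hash__ to None; restore identity-based hashing.
    __hash__ = object.__hash__


NULL = SqlNull()

print(NULL == NULL)  # UNKNOWN
print(NULL != NULL)  # UNKNOWN
```

Here x != y cannot be derived as not (x == y), because both comparisons return the same third value.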
