Suppose I have a list-like data structure called list.
I came across this code: list[:,0]>5
I don't know what it means, although I know what list[:,0] means.
I googled it and read a lot on python.org, but I couldn't find a suitable answer.
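I realized it's a very simple thing:
list > 5 compares every element of list with 5; where the element is larger than 5 the result is True, otherwise False. Note that this only works if list is a NumPy array: a plain Python list does not support list[:,0], and comparing one with an integer raises a TypeError in Python 3.
So, with NumPy imported as np, if
list = np.array([1, 2, 6])
list > 5
# -> array([False, False,  True])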
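To tie this back to the original expression, here is a minimal sketch of list[:,0]>5 on a two-dimensional NumPy array. The array name arr and its values are made up for illustration.

```python
import numpy as np

# Hypothetical 2-D array; the values are made up for illustration.
arr = np.array([[1, 10],
                [7, 20],
                [6, 30]])

first_col = arr[:, 0]      # all rows, column 0
mask = first_col > 5       # element-wise comparison gives a boolean array
selected = arr[mask]       # keep only the rows where the mask is True

print(mask.tolist())       # [False, True, True]
print(selected.tolist())   # [[7, 20], [6, 30]]
```

The boolean array is most often used this way, as a row filter on the array it was computed from.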
I have a column in a dataframe whose values contain either one hashmark followed by a string or two hashmarks followed by a string. I want to eliminate the rows that contain only one hashmark.
df[df["column name"].str.contains("#") == False]
I've tried the code above, but it erased the entire column. I expected it to erase only the rows with a single hashmark. I don't know what to do.
You can try this:
df['len'] = df['column name'].str.count('#')  # number of '#' characters in each value
df = df[df['len'] > 1]
# or in one line
df = df[df['column name'].str.count('#') > 1]
If every value contains at least one '#', and it is either ## or #:
df[df["column name"].str.contains("##") == False]
The code above will give you the rows with a single #.
df[df["column name"].str.contains("##") == True]
The code above will drop the single-# rows and give you the ## ones.
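The following sketch puts the answers above next to the original attempt. The dataframe contents are invented for illustration; only the column name comes from the question. It also shows why the original filter left nothing behind: every value contains at least one "#".

```python
import pandas as pd

# Hypothetical data: the values are made up for illustration.
df = pd.DataFrame({"column name": ["#one", "##two", "#three", "##four"]})

# Count the '#' characters in each value, then keep rows with more than one.
hash_counts = df["column name"].str.count("#")
double_only = df[hash_counts > 1]

# Every value contains "#", so the original filter removes every row.
none_left = df[df["column name"].str.contains("#") == False]

print(double_only["column name"].tolist())  # ['##two', '##four']
print(len(none_left))                       # 0
```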
In a given pandas dataframe, is there a way to see all the Booleans present in filt in the code below?
filt = dataframe['tag1'] =='ABC'
filt
TLDR
It's possible. I think you should use indexing, which is extensively described here. More specifically, you can use boolean indexing.
The code should look like this:
filt = df[df.loc[:,"tag1"] == 'ABC']
Now, what actually happens here:
df.loc[:,"tag1"] returns all rows (the : character) but limits the columns to just "tag1". Next, df.loc[:,"tag1"] == 'ABC' compares the returned values with "ABC", producing a Series of True/False values: True where the row was equal to "ABC", False otherwise. Now the grand finale. Whenever you pass such a boolean Series to a dataframe, each value is treated as an indicator of whether to include the corresponding row in the result. So if the value at position 0 of the boolean Series is True, row 0 will be included in the result.
I understand it's hard to wrap one's head around this concept, but once you get it, it's super useful. The best approach is to just play around with the iloc[] and loc[] indexers.
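As a small illustration of the explanation above, the sketch below builds the boolean Series, shows its values, and then uses it to select rows. The column "tag1" comes from the question; the "value" column and all the data are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical data; "tag1" comes from the question, the rest is made up.
df = pd.DataFrame({"tag1": ["ABC", "XYZ", "ABC"], "value": [1, 2, 3]})

filt = df.loc[:, "tag1"] == "ABC"   # boolean Series, one entry per row
matches = df[filt]                  # rows where filt is True

print(filt.tolist())                # [True, False, True]
print(matches["value"].tolist())    # [1, 3]
```

Printing filt itself, or calling filt.tolist(), is enough to see every Boolean it holds.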
I'm doing data profiling. I want to extract only the distinct values and values that are not null in Python. I have tried creating open lists and appending all new values to the list but that was completely unsuccessful.
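Assuming you have your values in a list:
(b := set(List)).discard(None)
You can call set(List) to produce a mathematical set, which is basically a collection without repeated values (and without a guaranteed order), and then call discard(None) on it to get rid of null values. Unlike remove, discard does not raise a KeyError when None is not in the set.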
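Here is a minimal sketch of the set-based approach, with made-up input values. It uses discard rather than remove so that it also works when the list contains no None, and it adds an order-preserving variant as an optional extra.

```python
# Hypothetical input; the values are made up for illustration.
values = ["a", None, "b", "a", None, "c", "b"]

distinct = set(values)    # drops repeated values; order is not guaranteed
distinct.discard(None)    # drops the null; no error if None is absent

# If first-seen order matters, dict.fromkeys keeps it while deduplicating.
ordered = list(dict.fromkeys(v for v in values if v is not None))

print(sorted(distinct))   # ['a', 'b', 'c']
print(ordered)            # ['a', 'b', 'c']
```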
I would suggest:
filtered_list = {value for value in your_list if value != '' and value is not None}
"set" is a class that is unordered and only holds unique values. I do not have enough information on what your null/empty values look like, but you can easily adjust the conditions as I did in the example above.
The answer here gives a handwaving reference to cases where you'd want __ne__ to return something other than just the logical inverse of __eq__, but I can't imagine any such case. Any examples?
SQLAlchemy is a great example. For the uninitiated, SQLAlchemy is an ORM and uses Python expressions to generate SQL statements. In an expression such as
meta.Session.query(model.Theme).filter(model.Theme.id == model.Vote.post_id)
the model.Theme.id == model.Vote.post_id does not return a boolean, but an object that eventually produces a SQL query like WHERE theme.id = vote.post_id. The inverse would produce something like WHERE theme.id <> vote.post_id, so both methods need to be defined.
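To make the idea concrete without depending on SQLAlchemy, here is a toy sketch. The Column class below is hypothetical and is not SQLAlchemy's real implementation; it only shows how __eq__ and __ne__ can each build a different SQL fragment instead of returning a bool.

```python
class Column:
    """Toy stand-in for an ORM column; not SQLAlchemy's real class."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Build a SQL fragment instead of returning a bool.
        return f"{self.name} = {other.name}"

    def __ne__(self, other):
        # A separate method is needed to emit the inverse operator.
        return f"{self.name} <> {other.name}"

    # Defining __eq__ disables the default hash; restore identity hashing.
    __hash__ = object.__hash__


theme_id = Column("theme.id")
post_id = Column("vote.post_id")

print(theme_id == post_id)   # theme.id = vote.post_id
print(theme_id != post_id)   # theme.id <> vote.post_id
```

Without the explicit __ne__, Python 3 would fall back to negating the result of __eq__, which here would turn the non-empty string into False rather than a "<>" expression.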
Some libraries do fancy things and don't return a bool from these operations. For example, with numpy:
>>> import numpy as np
>>> np.array([1,2,5,4,3,4,5,4,4])==4
array([False, False, False, True, False, True, False, True, True], dtype=bool)
>>> np.array([1,2,5,4,3,4,5,4,4])!=4
array([ True, True, True, False, True, False, True, False, False], dtype=bool)
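When you compare an array to a single value or another array, you get back an array of bools holding the results of comparing the corresponding elements. You couldn't do this if x != y were simply equivalent to not (x == y), because applying not to a multi-element array raises a ValueError instead of inverting each element.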
More generally, in many-valued logic systems, equals and not-equals are not necessarily exact inverses of each other.
The obvious example is SQL, where TRUE = TRUE and FALSE = FALSE hold, but NULL = NULL and NULL <> NULL both evaluate to NULL (unknown) rather than to true or false. Although I don't know of any specific Python examples, I can imagine it being implemented in places.
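As a rough sketch of how such three-valued behaviour could be modelled in Python, the hypothetical Null class below makes both == and != return None, used here as a stand-in for SQL's "unknown". It is an illustration only, not something taken from an existing library.

```python
class Null:
    """Toy SQL-like NULL: every comparison yields None, standing in for 'unknown'."""

    def __eq__(self, other):
        return None

    def __ne__(self, other):
        return None

    # Defining __eq__ disables the default hash; restore identity hashing.
    __hash__ = object.__hash__


a, b = Null(), Null()

print(a == b)   # None
print(a != b)   # None
```

Here neither result is the logical inverse of the other, which is only possible because __ne__ is defined independently of __eq__.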