I'd like to drop a '.' from a column name using regex, and want the code to be applied to many column names that end in '.', so that each pair of like-named columns can be merged into one.
For example, the column names 'Fund' and 'Fund.' are different and have different values, but should become just 'Fund'.
What would be the best regex to use for this?
Try this, anchoring the pattern so only a trailing dot is stripped (a bare '.' would match any character as a regex, or remove dots from the middle of a name):
df = pd.DataFrame([1], columns=['Fund.'])
df.columns = df.columns.str.replace(r'\.$', '', regex=True)
Output:
print(df.columns)
Index(['Fund'], dtype='object')
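The question also asks about merging each pair of like-named columns after the rename. A minimal sketch of one way to do that, assuming (hypothetically) that for each row only one of the pair holds a value, so the first non-null entry per row can be kept:

```python
import pandas as pd
import numpy as np

# Hypothetical frame with a 'Fund'/'Fund.' pair holding complementary values
df = pd.DataFrame({'Fund': [1.0, np.nan], 'Fund.': [np.nan, 2.0]})

# Strip only a trailing dot from each column name
df.columns = df.columns.str.replace(r'\.$', '', regex=True)

# Merge same-named columns: group the transposed frame by label and
# keep the first non-null value per row, then transpose back
merged = df.T.groupby(level=0).first().T
print(merged)
```

Grouping on the transpose avoids the deprecated `axis=1` argument to `groupby`; `first()` skips NaN within each group, which is what makes the pairwise merge work.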
Trying to remove the numbers and erroneous letters using replace in pandas. When I do this replace it pulls only that column from the dataframe I was working on. Is there any way to modify the one column but still have the other columns from the dataframe stay together?
The document I am working on has over 100,000 rows and not all of the values have the numbers I need to remove. It also has about 30 columns. I am sure there is a much better way but I am a new Python user and still learning.
I tried this:
df.UserName.replace(to_replace=r"\d+", value='', regex=True)
And I get back just the UserName column as a Series, with the rest of the dataframe gone.
So I just want to modify the one column to remove the numbers from the "UserName" column and keep the result of the columns/values as they were.
Thank you in advance for any help!
Assign the cleaned column back to df['UserName'] rather than to df itself:
df['UserName'] = df['UserName'].replace(to_replace=r'\d+', value='', regex=True)
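A minimal runnable sketch (the data here is hypothetical) showing that assigning back to the one column leaves the other columns intact:

```python
import pandas as pd

# Hypothetical data: usernames with stray digits, plus another column to keep
df = pd.DataFrame({'UserName': ['alice123', 'bob7'], 'Dept': ['HR', 'IT']})

# Replace the digits in place on the column; the rest of df is untouched
df['UserName'] = df['UserName'].replace(to_replace=r'\d+', value='', regex=True)
print(df)
```

`df['UserName'].str.replace(r'\d+', '', regex=True)` would work equally well here; both return a Series that must be assigned back to the column.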
Why doesn't this filter the df as shown in the screenshot? The str.contains("|") call didn't work.
You need to set the regex parameter of contains to False. In your code now, the string "|" is parsed as a regular expression, but you want to match the literal pipe character.
movies_with_more_dir = movies_df[movies_df['director'].str.contains("|", regex=False)]
Write Python code to find the total number of null values in the Excel file without using the isnull function (you should use a loop statement).
Using for loops over DataFrames is not good practice. If you're doing it only as an exercise, you can exploit the fact that NaN is the only value that is not equal to itself, so no isnull call is needed:
data = pd.read_excel('data.xlsx')
amount_nan = 0
for column in data:
    for value in data[column]:
        if value != value:  # only NaN fails this comparison
            amount_nan += 1
print(amount_nan)
Filling the missing values with False and counting falsy entries would also count genuine zeros and empty strings, so the self-comparison is safer.
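Outside the constraints of the exercise, the idiomatic count is fully vectorized. A sketch with a small hypothetical frame standing in for the Excel data:

```python
import pandas as pd
import numpy as np

# Hypothetical frame standing in for pd.read_excel('data.xlsx')
data = pd.DataFrame({'a': [1, np.nan], 'b': [np.nan, np.nan]})

# Vectorized count of all missing values, no Python-level loops
total_nan = int(data.isna().sum().sum())
print(total_nan)  # 3
```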
I would like to overwrite a matrix of dimension n with a sub-matrix of dimension m (n > m). Intuitive code like this does not work:
sigmaSmall = sigmaSmall.loc[indices, indices]
How can I do it in 1 line?
The second dimension of .loc takes column labels, not positional indices. So instead do:
sigmaSmall = sigmaSmall.loc[indices, sigmaSmall.columns[indices]]
Not knowing what your indices are makes it hard to tell, but it should look something like this:
df = pd.DataFrame([[1,2,3],[1,2,3],[1,2,3]], columns=['a','b','c'])
df.loc[0:1, ['a','b']]
Where the second argument is the column names that you want to select
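If `indices` are integer positions for both rows and columns (an assumption; the question doesn't say), `.iloc` does it in one line, since it is purely positional in both dimensions:

```python
import pandas as pd

# Hypothetical square matrix; indices are integer positions
sigma = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['a', 'b', 'c'])
indices = [0, 2]

# .iloc selects by position in both dimensions at once
sigmaSmall = sigma.iloc[indices, indices]
print(sigmaSmall)
```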
When working with pandas, I often use name based column indexing. E.g:
df = pd.DataFrame({"abc":[1,2], "bde":[3,4], "mde":[3,4]})
df[["mde","bde"]]
As I have longer column names, it becomes easy for me to make a typo, since the names are strings and there is no code completion. It'd be great if I could do something like:
df.SelectColumnsByObjectAttributeNotString([df.mde, df.bde])
IIUC, you can use name attribute.
df = pd.DataFrame({"a":[1,2], "b":[3,4]})
columns = [df.a.name, df.b.name]
columns
['a', 'b']
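Putting it together with the question's own frame, the `name` attributes can feed the normal column selection, so a typo fails fast at attribute lookup instead of producing a silent KeyError later:

```python
import pandas as pd

df = pd.DataFrame({"abc": [1, 2], "bde": [3, 4], "mde": [3, 4]})

# Attribute access raises immediately on a typo; .name recovers the label string
selected = df[[df.mde.name, df.bde.name]]
print(selected.columns.tolist())  # ['mde', 'bde']
```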
I think you may be looking for:
df.columns.values.tolist()