Map integers to specific strings in a Pandas column [duplicate] - python

This question already has answers here:
Remap values in pandas column with a dict, preserve NaNs
(11 answers)
Closed 4 years ago.
I have a pandas data frame with two columns, "text" and "Condition". The "Condition" column contains a numerical value for each row of "text": 1, -1, or 0. I would like to convert these integer values to text labels, for instance 1 for positive, -1 for negative, and 0 for neutral. How can I achieve that?

If I understood your question correctly, here are two ways you can change your values.
Using Series.map:
df['Condition'] = df['Condition'].map({1: 'positive', -1: 'negative', 0: 'neutral'})
Using Series.replace:
df['Condition'] = df['Condition'].replace({1: 'positive', -1: 'negative', 0: 'neutral'})
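A runnable sketch of the first approach (sample rows invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({"text": ["great", "awful", "okay"],
                   "Condition": [1, -1, 0]})
labels = {1: "positive", -1: "negative", 0: "neutral"}

# map looks every value up in the dict; values missing from it become NaN
df["Condition"] = df["Condition"].map(labels)
```

Series.replace with the same dict gives the same result here; the practical difference is that replace leaves values absent from the dict untouched, while map turns them into NaN.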


Convert specific column values to column names in pandas [duplicate]

This question already has answers here:
How can I pivot a dataframe?
(5 answers)
How to pivot a dataframe in Pandas? [duplicate]
(2 answers)
Closed 1 year ago.
I have a dataframe like this:
index,col1,value
1,A,1
1,B,2
2,A,3
2,D,4
2,C,5
2,B,6
And I would like to convert this dataframe to this:
index,col1_A,col1_B,col1_C,col1_D
1,1,2,NaN,NaN
2,3,6,5,4
The conversion is based on the index column: for each unique index value, the values from col1 become column names, and each new column holds the corresponding value from the value column.
Currently my solution loops: I create a subset of df as a temporary data frame for each index and then loop over that. I am wondering if pandas already has a built-in solution for this. Please feel free to suggest one.
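The built-in route is DataFrame.pivot. A minimal sketch using the sample data above (note that pivot matches each value to its col1 label, so for index 2 the result is A=3, B=6, C=5, D=4, and missing combinations become NaN):

```python
import pandas as pd

df = pd.DataFrame({"index": [1, 1, 2, 2, 2, 2],
                   "col1": ["A", "B", "A", "D", "C", "B"],
                   "value": [1, 2, 3, 4, 5, 6]})

# pivot turns col1's unique values into columns, keyed by "index"
wide = df.pivot(index="index", columns="col1", values="value")
wide.columns = ["col1_" + c for c in wide.columns]  # col1_A, col1_B, ...
wide = wide.reset_index()
```

No explicit looping is needed; pivot handles the reshaping in one call.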

Pandas: How to replace values in a dataframe based on a conditional [duplicate]

This question already has answers here:
Pandas fill missing values in dataframe from another dataframe
(6 answers)
Closed 1 year ago.
I am trying to replace the values of my second dataframe ('area') with those of my first dataframe ('test').
Image of my inputs:
The catch is that I only want to replace the values that are not NaN, so, for example, area.iloc[0,1] will be "6643.68" rather than "3321.84" but area.iloc[-2,-1] will be "19.66" rather than "NaN". I would have thought I could do something like:
area.loc[test.notnull()] = test
or
area.replace(area.loc[test.notnull()], test.notnull())
But this gives me the error "Cannot index with multidimensional key". Any ideas? This should be simple.
Use DataFrame.fillna, with test as the caller so that its non-NaN values take priority and area only fills the gaps (matching the desired output above):
test.fillna(area)
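A self-contained sketch with hypothetical stand-ins for the two frames; note the frame whose non-NaN values should win must be the one calling fillna (here test, per the desired output described above):

```python
import pandas as pd
import numpy as np

# Hypothetical stand-ins for the frames in the question
test = pd.DataFrame({"x": [6643.68, np.nan], "y": [np.nan, 19.66]})
area = pd.DataFrame({"x": [3321.84, 2.0], "y": [5.0, np.nan]})

# The caller's non-NaN values win; the argument only fills the gaps
result = test.fillna(area)
```

test.combine_first(area) behaves the same here and additionally aligns on row/column labels that exist in only one of the two frames.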

How do I convert strings in a column into numbers which I can use later, in a dataframe? [duplicate]

This question already has an answer here:
Factorize a column of strings in pandas
(1 answer)
Closed 2 years ago.
I have a dataset consisting of 382 rows and 4 columns. I need to convert all the names into numbers. The names do repeat here and there, so I can't just randomly assign numbers.
So, I made a dictionary of the names and the corresponding values. But now, I am not able to change the values in the column.
This is how I tried to add the values to the column:
test_df.replace(to_replace = d_loc,value = None, regex = True, inplace = True)
print(test_df)
but test_df just gives me the same dataframe, without any modifications.
What should I use? I have over 100 unique names, so I cannot manually rename them.
df.applymap() works on each item in a dataframe (using the asker's d_loc dictionary; .get keeps any value that is not a key in the dictionary, instead of raising a KeyError):
test_df = test_df.applymap(lambda x: d_loc.get(x, x))
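Since only one column holds the names, Series.map is a simpler sketch (column name and dictionary contents are hypothetical here):

```python
import pandas as pd

test_df = pd.DataFrame({"name": ["delhi", "mumbai", "delhi", "pune"]})
d_loc = {"delhi": 0, "mumbai": 1, "pune": 2}  # hypothetical name-to-number mapping

# Replace each name with its number from the dictionary
test_df["name"] = test_df["name"].map(d_loc)
```

If you don't need to control the numbering yourself, pd.factorize(test_df["name"]) builds the integer codes automatically, as the linked duplicate shows.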

How to view column names from a str.contains boolean array? [duplicate]

This question already has answers here:
select columns based on columns names containing a specific string in pandas
(4 answers)
Closed 4 years ago.
Due to the large number of columns, I was trying to search for column names that contain 'el'. This is the code I tried:
combined.columns.str.contains("el")
But the result I'm getting is a Boolean array. How can I view the column names that returned True?
Use boolean indexing with loc, with : to select all rows:
combined.loc[:, combined.columns.str.contains("el")]
Or use DataFrame.filter:
combined.filter(like='el')
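A small sketch showing both forms on a hypothetical set of column names:

```python
import pandas as pd

combined = pd.DataFrame(columns=["elevation", "delta", "name", "level"])

# Both forms keep only the columns whose name contains "el"
by_loc = combined.loc[:, combined.columns.str.contains("el")]
by_filter = combined.filter(like="el")
```

If you only want the names themselves rather than a sub-frame, index the column list directly: combined.columns[combined.columns.str.contains("el")].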

Selecting all numerical values in a dataframe and converting them to int in pandas [duplicate]

This question already has answers here:
error using astype when NaN exists in a dataframe
(2 answers)
Closed 4 years ago.
I have data like:
I want to convert all numerical values to int; there should be no decimals.
I was using code like this:
df=pd.read_csv("file.csv")
df = df.astype('int64')
But it did not work; it raised:
ValueError: Cannot convert NA to integer
That is because I have some NaN values inside, and there are also strings in column one, rows 2 and 3.
I think the solution could be to select all numerical values from the data frame and convert them to int. Can you suggest anything?
There are some null values in your data frame. First, you need to convert them to an integer (e.g. 0 here); use df.fillna for that. Then select the numerical columns and cast them:
cols = ["num_col1", "num_col2"]
df[cols] = df[cols].astype("int64")
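Put together, a sketch with a hypothetical frame mixing a text column and numeric columns holding NaN (note astype("int64") truncates any remaining decimals):

```python
import pandas as pd
import numpy as np

# Hypothetical data: one text column, two numeric columns containing NaN
df = pd.DataFrame({"name": ["a", "b", "c"],
                   "num_col1": [1.0, np.nan, 3.5],
                   "num_col2": [np.nan, 4.0, 5.0]})

cols = ["num_col1", "num_col2"]
df[cols] = df[cols].fillna(0).astype("int64")
```

To pick the numeric columns automatically instead of listing them, use cols = df.select_dtypes("number").columns. Alternatively, the nullable astype("Int64") (capital I) keeps NaN as missing without the fillna step.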
