How to get a value from a DataFrame? - python

I want to get a value from a DataFrame so I can insert it into MySQL. This is my code:
l_id = df['ID'].str.replace('PDF-', '').item()
print(type(l_id))
It shows an error like this:
ValueError: can only convert an array of size 1 to a Python scalar
If I don't use .item(), the value can't be inserted into MySQL. How do I get the value from the DataFrame?

Try getting NaN instead of '' for rows that don't carry the prefix, then drop the NaNs and get the actual item. str.replace can't take NaN as the replacement (and pd.np is deprecated), but str.extract returns NaN for non-matching rows:
l_id = df['ID'].str.extract(r'PDF-(.*)', expand=False).dropna().item()
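As a quick check, on a hypothetical frame where exactly one ID carries the PDF- prefix, this yields a plain Python string:
import pandas as pd

df = pd.DataFrame({'ID': ['PDF-0A1', 'X-02B']})  # made-up sample; only one row matches

l_id = df['ID'].str.extract(r'PDF-(.*)', expand=False).dropna().item()
print(l_id)        # 0A1
print(type(l_id))  # <class 'str'>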

There is no .item() attribute on a DataFrame itself (Series.item() only works for a single-element Series), but you can loop over the column instead:
import pandas as pd

df = pd.DataFrame(['PDF-0A1', 'PDF-02B', 'PDF-03C'], columns=['ID'])  # small DataFrame to test
for ids in df.ID:
    l_id = ids.replace('PDF-', '')
    print(l_id)
# 0A1
# 02B
# 03C
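For the original goal of writing the extracted value into MySQL, here is a minimal sketch with a parameterized query. The single-row frame, the connection details, and the pdf_records table and l_id column are all made up for illustration; this assumes the mysql-connector-python package:
import mysql.connector
import pandas as pd

df = pd.DataFrame({'ID': ['PDF-0A1']})  # hypothetical single-row frame
l_id = df['ID'].str.replace('PDF-', '', regex=False).item()

conn = mysql.connector.connect(host='localhost', user='user',
                               password='secret', database='mydb')
cur = conn.cursor()
cur.execute("INSERT INTO pdf_records (l_id) VALUES (%s)", (l_id,))  # parameterized insert
conn.commit()
cur.close()
conn.close()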

Related

Python Pandas: How to find element in a column with a matching string object of other column

I have this string object:
code = '1002'
And I also have the following Pandas DataFrame:
pd.DataFrame({'Code': ['1001', '1002', '1003', '1004'],
              'Place': ['Chile', 'Peru', 'Colombia', 'Argentina']})
What I need is to match the code I have as a string against the column 'Code' and get the element in the same row from the column 'Place'.
Try:
df = pd.DataFrame({'Code': ['1001', '1002', '1003', '1004'],
                   'Place': ['Chile', 'Peru', 'Colombia', 'Argentina']})
code = '1002'
df.loc[df['Code'] == code, 'Place'].iloc[0]
# 'Peru'
Or
df[df['Code'] == code]['Place']  # note: this returns a Series, not the bare value
You can use set_index first, if you want to get the actual value:
>>> df.set_index('Code').loc['1002', 'Place']
'Peru'
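If the code might not be present in the column, the boolean-mask lookup can be guarded before taking the first element; a small sketch on the same sample data (the None fallback is just an illustration):
import pandas as pd

df = pd.DataFrame({'Code': ['1001', '1002', '1003', '1004'],
                   'Place': ['Chile', 'Peru', 'Colombia', 'Argentina']})

match = df.loc[df['Code'] == '9999', 'Place']
place = match.iloc[0] if not match.empty else None  # avoids IndexError when nothing matches
print(place)  # None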
Basically, to solve this you have to get the index of the code first, after converting the DataFrame column into a list. I think the following code might help you.
import pandas as pd

code = '1001'
df = pd.DataFrame({'Code': ['1001', '1002', '1003', '1004'],
                   'Place': ['Chile', 'Peru', 'Colombia', 'Argentina']})
code_index = list(df['Code']).index(code)
print(code_index)  # 0
area = df['Place'][code_index]
print(area)  # Chile

Creating new column in Pandas based on values from another column

The task is the following:
Add a new column to df called income10. It should contain the same
values as income with all 0 values replaced with 1.
I have tried the following code:
df['income10'] = np.where(df['income']==0, df['income10'],1)
but I keep getting an error.
You can apply a function to each value in your column:
df["a"] = df.a.apply(lambda x: 1 if x == 0 else x)
You are trying to reference a column which does not exist yet.
df['income10'] = np.where(df['income']==0, df['income10'], 1)  # <-- df['income10'] does not exist yet
In your np.where, you need to reference the column where the values originate. Try this instead
df['income10'] = np.where(df['income']==0, 1, df['income'])
Edit: corrected order of arguments
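A quick runnable check of the corrected argument order (toy income values, made up for illustration):
import numpy as np
import pandas as pd

df = pd.DataFrame({'income': [0, 1200, 3000, 0]})  # toy data
# where income is 0 use 1, otherwise keep the original income value
df['income10'] = np.where(df['income'] == 0, 1, df['income'])
print(df['income10'].tolist())  # [1, 1200, 3000, 1]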

Store Value From df to Variable

I am trying to extract a value out of a dataframe and put it into a variable. Then later I will record that value into an Excel workbook.
First I run a SQL query and store into a df:
df = pd.read_sql(strSQL, conn)
I am looping through another list of items and looking them up in the df. They are connected by MMString in the df and MMConcat from the list of items I'm looping through.
dftemp = df.loc[df['MMString'] == MMConcat]
Category = dftemp['CategoryName'].item()
I get the following error at the last line of code above. ValueError: can only convert an array of size 1 to a Python scalar
In the debug console, when I run that last line of code without storing it in a variable, I get what looks like a string value. For example, 'Pickup Truck'.
How can I simply store the value that I'm looking up in the df to a variable?
Index by row and column with loc to return a series, then extract the first value via iat:
Category = df.loc[df['MMString'] == MMConcat, 'CategoryName'].iat[0]
Alternatively, get the first value from the NumPy array representation:
Category = df.loc[df['MMString'] == MMConcat, 'CategoryName'].values[0]
The docs aren't helpful, but pd.Series.item just calls np.ndarray.item and only works for a series with one value:
pd.Series([1]).item() # 1
pd.Series([1, 2]).item() # ValueError: can only convert an array of size 1
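A small self-contained illustration of the loc/iat pattern; the column names match the question, but the data is made up:
import pandas as pd

df = pd.DataFrame({'MMString': ['A-1', 'B-2', 'C-3'],
                   'CategoryName': ['Pickup Truck', 'Sedan', 'SUV']})
MMConcat = 'A-1'

Category = df.loc[df['MMString'] == MMConcat, 'CategoryName'].iat[0]
print(Category)  # Pickup Truck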

ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series

I'm using pandas 0.20.3 with Python 3.x. I want to add a column to one pandas DataFrame from another DataFrame. Both DataFrames contain 51 rows, so I used the following code:
class_df['phone']=group['phone'].values
I got following error message:
ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series
class_df.dtypes gives me:
Group_ID object
YEAR object
Terget object
phone object
age object
and type(group['phone']) returns pandas.core.series.Series
Can you suggest what changes I need to make to remove this error?
The first 5 rows of group['phone'] are given below:
0 [735015372, 72151508105, 7217511580, 721150431...
1 []
2 [735152771, 7351515043, 7115380870, 7115427...
3 [7111332015, 73140214, 737443075, 7110815115...
4 [718218718, 718221342, 73551401, 71811507...
Name: phoen, dtype: object
In most cases, this error comes up when you return an empty DataFrame. The best approach that worked for me was to check whether the DataFrame is empty before using apply():
if len(df) != 0:
    df['indicator'] = df.apply(assign_indicator, axis=1)
You have a column of ragged lists. Your only option is to assign a list of lists, not an array of lists (which is what .values gives).
class_df['phone'] = group['phone'].tolist()
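A minimal sketch of the fix; the frames here are small stand-ins for class_df and group, using the first phone values from the question:
import pandas as pd

group = pd.DataFrame({'phone': [[735015372, 72151508105], [], [735152771]]})
class_df = pd.DataFrame({'Group_ID': ['g1', 'g2', 'g3']})

# a plain list of lists keeps each ragged list as a single cell value
class_df['phone'] = group['phone'].tolist()
print(class_df)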
The error from the question title, "ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series", can also occur if, for whatever reason, the table has no rows. Instead of using an if statement, you can set the result_type argument of apply() to 'reduce':
df['new_column'] = df.apply(func, axis=1, result_type='reduce')
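A sketch to illustrate this on an empty frame (func here is a trivial stand-in):
import pandas as pd

df = pd.DataFrame(columns=['a', 'b'])  # no rows

def func(row):
    return row['a'] + row['b']

# on an empty frame, apply() would otherwise return an empty DataFrame,
# and assigning that to a column raises the ValueError above;
# result_type='reduce' makes it return an (empty) Series instead
df['new_column'] = df.apply(func, axis=1, result_type='reduce')
print(df)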
The data assigned to a DataFrame column must be a one-dimensional array. For example, consider a num_arr to be added to a DataFrame:
num_arr.shape
# (1, 126)
For this num_arr to be added as a DataFrame column, it should be reshaped:
num_arr = num_arr.reshape(-1,)
num_arr.shape
# (126,)
Now the array can be set as a DataFrame column:
df = pd.DataFrame()
df['numbers'] = num_arr

How to check if Pandas value is null or zero using Python

I have a data frame created with Pandas that contains numbers. I need to check if the values that I extract from this data frame are nulls or zeros. So I am trying the following:
a = df.ix[[0], ['Column Title']].values
if a != 0 or not math.isnan(float(a)):
    print "It is neither a zero nor null"
While it does appear to work, sometimes I get the following error:
TypeError: don't know how to convert scalar number to float
What am I doing wrong?
Your code to extract a single value from a Series returns a list-of-lists with a single value, for example [[1]].
So try changing your code
a = df.ix[[0], ['Column Title']].values
to
a = df.ix[0, 'Column Title']
then try
math.isnan(float(a))
This will work!
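As a side note, .ix has since been removed from pandas. The same scalar lookup with .loc, together with pandas' own null check, might look like this; the data is made up, and the condition uses and rather than the question's or, since "neither a zero nor null" requires both:
import pandas as pd

df = pd.DataFrame({'Column Title': [3.5, 0, None]})  # toy data

a = df.loc[0, 'Column Title']   # scalar value from row 0
if a != 0 and not pd.isna(a):   # both conditions must hold
    print("It is neither a zero nor null")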
