I can't figure out how to extract the invoice number as it appears, rather than in exponential notation, from an Excel sheet in Python.
Expected Value : 8000030910
Result: 8.00002e+09
d1.append(sheet.cell_value(j,0))
This code shows the value in exponential format.
I read the Excel file into Python using xlrd.
Try wrapping the value in the int() constructor, e.g. int(sheet.cell_value(j, 0)). xlrd returns numeric cells as floats, which is where the exponential notation comes from. In Python 3, integers have no fixed size limit, so the result is a plain integer with no exponent.
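As a minimal sketch of this fix, the snippet below uses a hard-coded float as a stand-in for what sheet.cell_value(j, 0) returns for a numeric cell. The variable names are illustrative and not part of the original code.

```python
# Stand-in for sheet.cell_value(j, 0): xlrd returns numeric cells as floats.
raw_value = 8000030910.0

# int() drops the (empty) fractional part and yields a plain integer.
invoice_number = int(raw_value)

# If the invoice number is an identifier rather than a quantity,
# storing it as text avoids any later numeric formatting.
invoice_text = str(invoice_number)

print(invoice_number)
print(invoice_text)
```

If the invoice numbers are only ever used as labels, appending invoice_text to d1 instead of the raw cell value is probably the more robust choice.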
I am trying to read a value from a cell which is located in a subcolumn of a big column. I want to read it by using the string labels and not the integers.
How can I access the value '%' of Product1 in 'Step1' using df.loc?
I am trying
df.loc['Step1', 'Product1','%']
but it doesn't work. Thank you for your help.
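One possible approach, sketched under the assumption that 'Step1' is a row label and that the columns form a two-level MultiIndex with the product on the top level and the sub-column (such as '%') below it. The table contents are made up for illustration. With that layout, df.loc takes the row label first and the column levels as a single tuple:

```python
import pandas as pd

# Hypothetical layout: products on the top column level, sub-columns below.
columns = pd.MultiIndex.from_product([["Product1", "Product2"], ["%", "Amount"]])
df = pd.DataFrame(
    [[0.25, 10, 0.75, 30], [0.40, 20, 0.60, 30]],
    index=["Step1", "Step2"],
    columns=columns,
)

# Row label first, then the column key as one tuple covering both levels.
value = df.loc["Step1", ("Product1", "%")]
print(value)
```

Passing three separate arguments fails because df.loc accepts at most a row indexer and a column indexer. If 'Step1' is instead a column level, the tuple would need to include it.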
This is the code I'm using. I also tried converting my columns from object to float, but I got this error:
df = pd.read_csv('DDOSping.csv')
pearsoncorr = df.corr(method='pearson')
ValueError: could not convert string to float: '172.27.224.251-172.27.224.250-56003-502-6'
Somewhere in your CSV this string value exists '172.27.224.251-172.27.224.250-56003-502-6'. Do you know why it's there? What does it represent? It looks to me like it shouldn't be in the data you include in your correlation matrix calculation.
The df.corr method is trying to convert the string value to a float, but it's obviously not possible to do because it's a big complicated string with various characters, not a regular number.
You should clean your CSV of unnecessary data (or make a copy and clean that so you don't lose anything important). Remove anything, like metadata, that isn't the exact data that df.corr needs, including the string in the error message.
If only a few values need cleaning, open the file in Excel or a text editor and fix them by hand. If there are many, and all the irrelevant data sits in specific rows and/or columns, you could remove those from your DataFrame before calling df.corr instead of cleaning the file itself.
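Dropping the irrelevant columns from the DataFrame before calling df.corr might look like the sketch below. The column names and numeric values are invented for illustration (only the long string comes from the error message), and select_dtypes is one way to keep the numeric columns, not the only one.

```python
import pandas as pd

# Made-up data: one identifier-like string column and two numeric columns.
df = pd.DataFrame(
    {
        "Flow ID": ["172.27.224.251-172.27.224.250-56003-502-6"] * 4,
        "Duration": [1.0, 2.0, 3.0, 4.0],
        "Packets": [2.0, 4.0, 6.0, 8.0],
    }
)

# Keep only numeric columns so corr() never sees the string values.
numeric_df = df.select_dtypes(include="number")
pearsoncorr = numeric_df.corr(method="pearson")
print(pearsoncorr)
```

If a column that should be numeric shows up as object, it is worth checking which values fail to parse before discarding it.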
I am using openpyxl and pandas to read values from an Excel sheet into my Python script. However, since some cells contain a formula, it reads the formula instead of the value.
How do I get the value of a cell instead of the formula of that cell?
For example, a cell has the formula =CONCATENATE(A23," ","gold") and its value is 'Ghana Gold'. I want it to return 'Ghana Gold', but at the moment I am getting '=CONCATENATE(A23," ","gold")'.
P.S. for reference - I referred to https://medium.com/analytics-vidhya/how-to-extract-information-from-your-excel-sheet-using-python-5f4f518aec49 for the data extraction part.
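One commonly used approach, offered here as a sketch and not as a confirmed fix for this workbook, is the data_only=True option of openpyxl's load_workbook. It returns the result that was cached when a spreadsheet application last saved the file, instead of the formula text. The caveat is that openpyxl never evaluates formulas itself, so a file that no spreadsheet application has calculated and saved has no cached result and the cell reads as None. The file name and cell contents below are made up to demonstrate both modes:

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.xlsx")  # hypothetical file

    # Build a small workbook containing a formula cell.
    wb = Workbook()
    ws = wb.active
    ws["A23"] = "Ghana"
    ws["B23"] = '=CONCATENATE(A23," ","gold")'
    wb.save(path)

    # Default mode: the cell value is the formula text.
    formula = load_workbook(path).active["B23"].value

    # data_only=True: the cell value is the cached result. It is None here
    # because no spreadsheet application has calculated and saved this file.
    cached = load_workbook(path, data_only=True).active["B23"].value

print(formula)
print(cached)
```

For a workbook that was last saved from Excel, the second read would be expected to return the computed text instead of None.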
I'm using to_csv to save a DataFrame that looks like this:
PredictionIdx CustomerInterest
0 fe789a06f3 0.654059
1 6238f6b829 0.654269
2 b0e1883ce5 0.666289
3 85e07cdd04 0.664172
The first column also contains the value '0e15826235'. I'm writing this DataFrame to CSV using pandas to_csv(). But when I open the CSV in Google Sheets or LibreOffice, the value shows as 0E in Google Sheets and 0 in LibreOffice. This is causing a problem with my Kaggle submission. One point to note is that when I read the same CSV with pandas read_csv, the value appears correctly in the DataFrame.
As noted in the first comment, the problem comes from the program you use to open the file. Many spreadsheet programs treat an e in certain positions (here, the second character) as the exponent marker of scientific notation. Excel, for instance, reads XeY as X times 10 to the power Y, where X is the number before the e and Y is the number after it, so 0e15826235 evaluates to 0. This is a brief description of Excel's scientific notation.
This does not happen with the other entries because they contain characters that cannot be part of a number. Excel, LibreOffice, and possibly Google Sheets try to interpret what the entry is, rather than taking it literally.
In your question you write '0e15826235' with single quotes, indicating that it might be a string, but this might be something to make sure of when writing out the values to a file -- Excel and the rest might not know this is meant to be a string literal.
In general, check the format of the value and consider what the program that eventually opens the file might "think" it is. For Excel specifically, a single quote character at the start of the string will force Excel to read it as a string.
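To make the guessing behaviour concrete, the sketch below uses Python's float() as a rough stand-in for a spreadsheet's type detection. That is an assumption for illustration only, since real applications apply their own rules. It also round-trips the IDs through the csv module to show that the file itself holds the text intact.

```python
import csv
import io

ids = ["fe789a06f3", "6238f6b829", "b0e1883ce5", "85e07cdd04", "0e15826235"]


def looks_numeric(text):
    """Return True if text parses as a float, mimicking a viewer's guess."""
    try:
        float(text)
        return True
    except ValueError:
        return False


# Only the ID made of digits around a single 'e' can be read as a number.
numeric_looking = [value for value in ids if looks_numeric(value)]
print(numeric_looking)

# Read as scientific notation, it is 0 times a power of ten.
as_number = float("0e15826235")
print(as_number)

# The CSV text itself is unchanged: writing and reading returns the same IDs.
buffer = io.StringIO()
csv.writer(buffer).writerows([[value] for value in ids])
buffer.seek(0)
round_tripped = [row[0] for row in csv.reader(buffer)]
print(round_tripped == ids)
```

This matches the observation that pandas read_csv shows the value correctly: the data in the file is fine, and only the display in the spreadsheet changes.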
For me, the code below works correctly with Google Sheets:
import pandas as pd

df = pd.DataFrame({
    'PredictionIdx': ['fe789a06f3', '6238f6b829', 'b0e1883ce5', '85e07cdd04'],
    'CustomerInterest': [0.654059, 0.654269, 0.666289, 0.664172],
})
# index=False keeps the row index out of the CSV
df.to_csv('./test.csv', index=False)
Also, CSV is a very simple text format; it doesn't hold any information about data types.
So you could use df.to_excel() as Nihal suggested, or adjust the column type settings in your favourite spreadsheet viewer.
I am facing an issue here. I have a DataFrame column whose values I need to write with a % sign appended, e.g. 10%, 15%, etc.
I can write the values to the Excel sheet as strings, but when I plot the graph the values are treated as strings, so the chart is not generated.
I need to write the values with the % symbol in that column and also plot the graph while writing to the Excel sheet.
Is there a solution for this?
Thanks in advance.
To write the value to Excel, you can use
str(value) + '%'
When plotting the graph, slice off the last character (%) and convert the rest to a number with float(), which is safer than eval because it only parses numbers:
float(value[:-1])
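The round trip described above can be sketched as follows, with made-up sample values. float() is used for the conversion here because it parses only numeric text.

```python
# Made-up sample values standing in for the DataFrame column.
values = [10, 15, 22.5]

# Text form for display in the sheet: '10%', '15%', '22.5%'.
labels = [str(value) + "%" for value in values]

# Numeric form for plotting: strip the trailing '%' and parse the rest.
numbers = [float(label.rstrip("%")) for label in labels]

print(labels)
print(numbers)
```

Keeping the numeric values in one column for the chart and the labelled strings in another avoids converting back and forth.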