I exported a string of numbers from Python into a CSV file:
When I open it in Notepad, the data looks like this, which is the real data:
However, if I open it in Excel, the data looks like this, which is wrong:
Can somebody please tell me how to get the following values to show up in the CSV file:
Cell A1: 15
Cell A2: 15.0
Cell A3: 15.00
Cell A4: 15.000
This is not done by the CSV file itself; it happens because you are opening it in Excel. Excel treats the values as numbers and simply drops the trailing zeros when displaying them. If you read the file as text with another program, or with Python, the .0, .00 and .000 are still there.
You can look at this article to see how to change that behaviour. If you are having a hard time saving the CSV file, you may look here
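A minimal sketch of that claim, using only the standard `csv` module. The sample values are the ones from the question; the in-memory `io.StringIO` buffer stands in for a real file and is an assumption made to keep the example self-contained.

```python
import csv
import io

# The four values from the question, kept as text on purpose.
values = ["15", "15.0", "15.00", "15.000"]

# Write one value per row into an in-memory "file".
buffer = io.StringIO()
writer = csv.writer(buffer)
for value in values:
    writer.writerow([value])

# Read the same data back as text: nothing has been reformatted.
buffer.seek(0)
read_back = [row[0] for row in csv.reader(buffer)]
print(read_back)
```

The CSV text keeps every trailing zero; only a program that converts the fields to numbers for display (as Excel does when opening the file) loses them.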
Since the first step is exporting the data from Python to CSV, any leading and trailing zeros can be trimmed at that stage.
One such reference that can be used is shown here: Removing Trailing Zeros in Python
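As a rough illustration of handling the zeros on the Python side before writing the file, the sketch below formats a number with a fixed number of decimals and then trims trailing zeros from the resulting text. The helper name `strip_trailing_zeros` is hypothetical and not taken from the linked reference.

```python
def strip_trailing_zeros(text):
    """Drop trailing zeros after the decimal point, and a dangling point."""
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text


value = 15.0

# Fixed number of decimals: the text always has three digits after the point.
fixed = format(value, ".3f")

# Trimmed form: trailing zeros and the bare decimal point are removed.
trimmed = strip_trailing_zeros(fixed)

print(fixed, trimmed)
```

Either string can then be written to the CSV as-is; which form is wanted depends on what the consumer of the file expects.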
Related
From Python I want to export a dataframe to CSV format.
The dataframe contains two columns like this:
So when I write this:
df['NAME'] = df['NAME'].astype(str) # or .astype('string')
df.to_csv('output.csv',index=False,sep=';')
The CSV output opened in Excel shows this:
and it reads the value "MAY8218" as a date, "may-18", while I want it to be read as "MAY8218".
I've tried many ways but none of them works. I don't want a workaround like putting quotation marks to the left and right of the value.
Thanks.
If you want to export the dataframe for use in Excel, just export it as xlsx. It works for me and keeps the value as a string in its original format.
df.to_excel('output.xlsx',index=False)
The CSV format is a text format. The file contains no hint about the type of each field. The problem is that Excel has the worst possible support for CSV files: it assumes that CSV files always use its own conventions when you try to read one. In short, one Excel implementation can only read correctly what it has written...
That means that you cannot prevent Excel from interpreting the CSV data the way it wants, at least when you open a CSV file. Fortunately you have other options:
Import the CSV file instead of opening it. This time you have options to configure the way the file should be processed.
Use LibreOffice Calc for processing CSV files. LibreOffice is a little behind Microsoft Office on most points, except for CSV file handling, where it has excellent support.
So I have a CSV file with a column called reference_id. The values in reference_id are 15 characters long, so something like '162473985649957'. When I open the CSV file, Excel has changed the datatype to General and the numbers show as something like '1.62474E+14'. To fix this in Excel, I change the column type to Number and remove the decimals, and it displays the correct value. I should add that it only does this with the CSV file; if I output to xlsx, it works fine. The problem is, the file has to be CSV.
Is there a way to fix this using Python? I'm trying to automate a process. I have tried using the following to convert the column to a string. It works in the sense that it converts the column to a string, but the value still shows up incorrectly in the CSV file.
df['reference_id'] = df['reference_id'].astype(str)
df.to_csv(r'Prev Day Branch Transaction Mems.csv')
Thanks
When I open the CSV file, excel has changed the data
This is an Excel problem. You can't fix how Excel decides to interpret your CSV. (You can work around some issues by using the text import feature, but that's cumbersome.)
Either use XLS/XLSX files when working with Excel, or use e.g. Gnumeric or something else that doesn't wantonly mangle your data.
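To see why `astype(str)` makes no difference here, one can compare the CSV text pandas produces for an integer column and for the same column converted to strings. This is only a sketch with made-up sample values; `to_csv()` called without a path returns the CSV text instead of writing a file.

```python
import pandas as pd

# Sample 15-digit identifiers, invented for illustration.
df = pd.DataFrame({"reference_id": [162473985649957, 162473985649958]})

# CSV text with the column left as integers.
as_int = df.to_csv(index=False)

# CSV text with the column converted to strings first.
as_str = df.assign(reference_id=df["reference_id"].astype(str)).to_csv(index=False)

print(as_int == as_str)
print(as_str)
```

Both outputs contain the full digits and are identical, so the file is already correct; the scientific notation is introduced when Excel opens it.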
I have a CSV file that when I open in notepad, displays as follows:
A,B
C,
D,E,F,G,H
I see that it shows up as Unix (LF) and UTF-8 at the bottom right of the status bar. When I open the file in excel, save it (but without making any changes), and close it, it will convert it to Windows (CRLF) as expected and displays as follows in notepad:
A,B,,,
C,,,,
D,E,F,G,H
The header row is the third row (D,E,F,G,H), and my understanding is that prior to saving, Excel reads the entire CSV file, figures out that the longest row has 4 commas, and uses that format throughout the entire file. The problem I'm running into is reading the original LF CSV file with Pandas .read_csv. I think I've narrowed down the solution to 2 possible options (but please correct me if I'm wrong):
Option 1: In my main Python script, I start with a function that iterates through every CSV file in a folder, opening, saving, and closing each one so that it is rewritten in CRLF format before I work with the CSV files in Pandas.
Option 2: Format the csv file upon reading it into Pandas. I feel like this is the better option, especially knowing the number of columns I have and using .read_csv(header = 3) but when I open the output file and run excel formulas, calculation times are insane, even for relatively small files. I have a feeling it's a datatype issue but I'm still new to all of this. Any clarification or resources are greatly appreciated!
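A possible sketch of Option 2, assuming the layout shown above (two short preamble rows, then the real header). It uses `skiprows=2` to drop the preamble rather than relying on `header`, which counts rows from zero. The data rows under the header are invented for illustration.

```python
import io

import pandas as pd

# Same shape as the file in the question, plus two made-up data rows.
lf_text = "A,B\nC,\nD,E,F,G,H\n1,2,3,4,5\n6,7,8,9,10\n"
crlf_text = lf_text.replace("\n", "\r\n")

# Skip the two preamble rows; the next row (D,E,F,G,H) becomes the header.
df_lf = pd.read_csv(io.StringIO(lf_text), skiprows=2)
df_crlf = pd.read_csv(io.StringIO(crlf_text), skiprows=2)

print(list(df_lf.columns))
print(df_lf.equals(df_crlf))
```

pandas parses LF and CRLF line endings the same way, so converting the files through Excel first should not be necessary for reading them.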
I'm trying to get python to import a CSV file and then export it as a Fixed Width Text File. I can't add the csv file as an attachment. But that's pretty much what I need Python to do. I looked at previous questions about importing and exporting and I'm getting the same issues that they did. Python is importing the CSV and then outputting what looks like a tab delimited file.
Load the CSV values, calculate maximum characters needed for each column, open a new file, and for each row of the original CSV, print out formatted CSV data to the new file. You can pad strings with your favourite characters (such as ' ' or '!' or anything else), integers and fixed point decimals should be padded with your preferred amount of leading zeroes.
https://docs.python.org/3/library/csv.html#csv.reader shows you how to use the csv library to load the CSV file. From that point on, everything should be clear.
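The steps described above could look roughly like this. It is a sketch that assumes every row has the same number of fields, pads each column with spaces to its widest value, and separates columns with one extra space; the sample data is invented.

```python
import csv
import io

# Stand-in for the input file; replace with open("input.csv", newline="").
csv_text = "name,qty,price\nwidget,3,4.50\ngizmo,12,19.99\n"

# Load all rows as lists of strings.
rows = list(csv.reader(io.StringIO(csv_text)))

# Widest value in each column decides that column's width.
widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]

# Left-justify every cell to its column width.
lines = [
    " ".join(cell.ljust(width) for cell, width in zip(row, widths))
    for row in rows
]

for line in lines:
    print(line)
```

To right-align numeric columns or pad with zeros instead, `str.rjust` or `str.zfill` can be swapped in for `ljust` on the relevant columns.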
Let's say my CSV file looks something like this:
acc_num,pincode
023213821,23120
002312727,03131
231238782,29389
008712372,00127
023827812,23371
When I open this file in Excel, it removes the leading zeros, but I need to keep them. This is how it looks when I open it in Excel:
acc_num,pincode
23213821,23120
2312727,3131
231238782,29389
8712372,127
23827812,23371
I tried converting the values to strings, but they still show without the leading 0 (although they are strings now).
I tried using the astype() function from pandas, but it made no difference.
Any help would be appreciated.
Did you format the cells in the Excel document as 'Text'? That way Excel displays the leading zeros when you open it. Then, when you bring the data into Python, make sure you read and store it as a string, since Python 3 does not allow leading zeros in integer literals.
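On the Python side, one way to keep the values as strings is to pass `dtype=str` to `pandas.read_csv`, so nothing is parsed as a number in the first place. A minimal sketch using a few rows from the question:

```python
import io

import pandas as pd

# A few rows from the question, inlined to keep the example self-contained.
csv_text = (
    "acc_num,pincode\n"
    "023213821,23120\n"
    "002312727,03131\n"
    "008712372,00127\n"
)

# dtype=str stops pandas from converting the columns to integers.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)
print(df["acc_num"].tolist())
print(df["pincode"].tolist())

# Writing back out keeps the zeros, because the values are plain text.
round_trip = df.to_csv(index=False)
print(round_trip)
```

This only controls what Python reads and writes. Excel will still strip the zeros when it opens the resulting CSV directly, unless the columns are imported as text.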