How do you keep leading zeros in a CSV file using Python? - python

Let's say my CSV file looks something like this:
acc_num,pincode
023213821,23120
002312727,03131
231238782,29389
008712372,00127
023827812,23371
When I open this file in Excel, it removes the leading zeros, but I need to keep them. This is how it looks when I open it in Excel:
acc_num,pincode
23213821,23120
2312727,3131
231238782,29389
8712372,127
23827812,23371
I tried converting the values to strings, but they still show up without the leading 0 (although they are strings now).
I also tried the astype() function from pandas, but it made no difference.
Any help would be appreciated.

Did you format the cells in the Excel document as 'Text'? That makes Excel display the leading zeros when you open the file. Then, when you bring the data into Python, make sure you read it in and store it as strings, because Python 3 does not allow leading zeros in ints.
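As a minimal sketch of the "store it as a string" advice, the standard-library csv module can be used to show that the zeros are only lost when a value is parsed as a number. The sample data is copied from the question; io.StringIO stands in for a real file, and the variable names are illustrative only.

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for a real file.
raw = """acc_num,pincode
023213821,23120
002312727,03131
231238782,29389
008712372,00127
023827812,23371
"""

# csv.DictReader never converts types, so every field stays a string
# and the leading zeros survive.
rows = list(csv.DictReader(io.StringIO(raw)))
print(rows[1]["acc_num"], rows[1]["pincode"])

# Converting to int is the step that drops the zeros.
print(int(rows[1]["acc_num"]))

# Writing the strings back out reproduces the original text exactly.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["acc_num", "pincode"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
assert out.getvalue() == raw
```

With pandas, passing dtype=str to pd.read_csv has the same effect. Calling astype(str) after the file has already been parsed as integers is too late, because the zeros are gone by then.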

Related

Differentiate numbering format in csv file

I exported strings of numbers from Python into a CSV file.
When I open it in Notepad, the data looks like this, which is the real data:
However, if I open it in Excel, the data looks like this, which is wrong:
Can somebody please tell me how to get the following strings to show up in the CSV file:
Cell A1: 15
Cell A2: 15.0
Cell A3: 15.00
Cell A4: 15.000
That is not done by the CSV file; it happens because you are opening it in Excel, which simply drops the trailing zeros (.000) when it displays the cell. If you read the file with another program, or with Python, you will see the values exactly as they were written.
You can look at this article to see how to change that behaviour. If you are having a hard time saving the CSV file, you may look here.
Since the first step is exporting the data from Python into a CSV file, any leading or trailing zeros can be truncated at that stage.
One reference that can be used: Removing Trailing Zeros in Python
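To check that the trailing zeros really are in the file and that only the spreadsheet display drops them, here is a small sketch using the standard library. It assumes the values are produced by formatting the number 15 with 0 to 3 decimals; the names values, file_text and read_back are illustrative.

```python
import csv
import io

# Format the same number with 0 to 3 decimals, as in the question.
values = [f"{15:.{decimals}f}" for decimals in range(4)]
assert values == ["15", "15.0", "15.00", "15.000"]

# Write one value per row; io.StringIO stands in for a real file.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
for value in values:
    writer.writerow([value])
file_text = buf.getvalue()
print(file_text)

# Reading the text back gives exactly the strings that were written.
read_back = [row[0] for row in csv.reader(io.StringIO(file_text))]
assert read_back == values

# Only a numeric conversion collapses them to the same value.
print([float(value) for value in read_back])
```

The file content is what a plain text editor shows. Excel applies the numeric conversion on its own when it opens the file.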

Output to CSV changing datatype

I have a CSV file with a column called reference_id. The values in reference_id are 15 characters long, something like '162473985649957'. When I open the CSV file, Excel has changed the data type to General and the numbers look something like '1.62474E+14'. To fix this in Excel, I change the column type to Number and remove the decimals, and then it displays the correct value. I should add that this only happens with the CSV file; if I output to xlsx, it works fine. The problem is that the file has to be CSV.
Is there a way to fix this using Python? I'm trying to automate a process. I have tried the following to convert the column to a string. It works in the sense that it converts the column to a string, but the values still show up incorrectly when the CSV file is opened.
df['reference_id'] = df['reference_id'].astype(str)
df.to_csv(r'Prev Day Branch Transaction Mems.csv')
Thanks
When I open the CSV file, excel has changed the data
This is an Excel problem. You can't fix how Excel decides to interpret your CSV. (You can work around some issues by using the text import format, but that's cumbersome.)
Either use XLS/XLSX files when working with Excel, or use e.g. Gnumeric or something else that doesn't wantonly mangle your data.
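A short sketch of why astype(str) cannot help here: CSV is plain text with no type information, so an integer and its string form produce the same bytes in the file. The helper to_csv_text is hypothetical and uses the standard-library csv module rather than pandas.

```python
import csv
import io


def to_csv_text(value):
    """Write a one-column CSV with a header and a single value; return its text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows([["reference_id"], [value]])
    return buf.getvalue()


as_int = to_csv_text(162473985649957)
as_str = to_csv_text("162473985649957")

# Both versions are identical on disk, so Excel sees the same text either way.
assert as_int == as_str
print(as_str)
```

The full 15 digits are present in the file in both cases. The scientific notation appears only when Excel interprets the text as a number.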

How to convert a CSV into fixed width text file?

I'm trying to get Python to import a CSV file and then export it as a fixed-width text file. I can't add the CSV file as an attachment, but that's pretty much what I need Python to do. I looked at previous questions about importing and exporting, and I'm getting the same issues they did: Python imports the CSV and then outputs what looks like a tab-delimited file.
Load the CSV values, calculate the maximum number of characters needed for each column, open a new file, and for each row of the original CSV write the formatted data to the new file. You can pad strings with your favourite character (such as ' ' or '!' or anything else); integers and fixed-point decimals should be padded with your preferred number of leading zeroes.
https://docs.python.org/3/library/csv.html#csv.reader shows you how to use the csv library to load the CSV file. From that point on, everything should be clear.
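The steps described above can be sketched roughly as follows. This is one possible reading, not a definitive implementation: the function name csv_to_fixed_width, the left alignment, and the single-space column gap are assumptions, and the sample data is made up.

```python
import csv
import io


def csv_to_fixed_width(csv_text, pad=" "):
    """Return csv_text re-laid out with every column padded to a fixed width."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    n_cols = max(len(row) for row in rows)
    # Give short rows empty trailing fields so every row has n_cols fields.
    rows = [row + [""] * (n_cols - len(row)) for row in rows]
    # Maximum number of characters needed for each column.
    widths = [max(len(row[i]) for row in rows) for i in range(n_cols)]
    lines = []
    for row in rows:
        cells = [cell.ljust(width, pad) for cell, width in zip(row, widths)]
        lines.append(" ".join(cells))
    return "\n".join(lines) + "\n"


sample = "name,qty\napple,3\nkiwi,12\n"  # made-up data
print(csv_to_fixed_width(sample))
```

For numeric columns, str.zfill or str.rjust can replace str.ljust to pad with leading zeroes or to right-align. To work with real files, read the input with open(path, newline="") and write the returned text to the output file.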

Can't convert string column into float

I'm trying to train a linear regression model in a Jupyter notebook and loaded a CSV file created via Google Sheets. All the data was saved as numbers in the sheet, but when I loaded the CSV into Jupyter it was turned into strings and I can't convert it back. It gives the following error: could not convert string to float: '10.801.68'
I already replaced the commas with dots and tried the following code:
df.columns=['Data', 'Price', 'Volume']
df['Price'] = df['Price'].str.replace(',','.')
df['Price'] = df['Price'].astype(float)
You need to remove the . thousands separators from values like '10.801.68', as Yatu said. Do that before turning the decimal comma into a dot: df['Price'] = df['Price'].str.replace('.', '', regex=False). Alternatively, set the thousands separator (and the decimal mark) while reading your CSV file, for example, with path_to_csv standing for your file path:
pd.read_csv(path_to_csv, thousands='.', decimal=',')
The number format in your CSV file (saved on Windows, I would assume) differs from what Python expects, so I would adjust it directly while reading the file.
You can do as PV8 suggested within Python, or clean the file in Excel by replacing (Ctrl+H) '.' with ''.
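A minimal sketch of the conversion in plain Python, assuming the raw values use '.' as the thousands separator and ',' as the decimal mark (for example '10.801,68'). That assumption is inferred from the error message, and the helper name parse_localized_number is hypothetical.

```python
def parse_localized_number(text, thousands=".", decimal=","):
    """Convert a string such as '10.801,68' to a float."""
    # Drop the thousands separators first, then normalise the decimal mark.
    cleaned = text.strip().replace(thousands, "").replace(decimal, ".")
    return float(cleaned)


price = parse_localized_number("10.801,68")
print(price)  # 10801.68

# Replacing only the comma leaves two dots, which float() rejects.
try:
    float("10.801,68".replace(",", "."))
except ValueError as exc:
    print("conversion failed:", exc)
```

The order matters: if the decimal comma is turned into a dot first, the thousands separator and the decimal point can no longer be told apart.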

How to write comma-separated list items to CSV in a single column in Python

I have a list (fulllist) of 292 items, which I converted to a data frame and then tried to write to a CSV file in Python.
import pandas as pd
my_df = pd.DataFrame(fulllist)
my_df.to_csv('Desktop/pgm/111.csv', index=False,sep=',')
But the comma-separated values spill across several columns of the CSV. I am trying to keep each item in a single column.
A portion of the output is shown below.
I have also tried writerows, but it doesn't work.
import csv
with open('Desktop/pgm/111.csv', "wb") as f:
    writer = csv.writer(fulllist)
    writer.writerows(fulllist)
I also tried "".join each time the length of a list item is greater than 1, but that did not give the result either. How do I produce a proper CSV so that each item stays in one column?
My expected output CSV is
Please keep in mind that .csv files are plain text files, and how a given program interprets them depends on its implementation. For example, some programs allow a newline character inside a field when it is enclosed in double quotes, while others treat every newline as the start of the next row.
Do you have to use the .csv format? If not, consider other possibilities:
DSV (https://en.wikipedia.org/wiki/Delimiter-separated_values) is similar to CSV, but you can use, for example, ; instead of , as the delimiter, which should help if your data does not contain ;
openpyxl allows writing and reading .xlsx files.
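If the file does have to be CSV, one possible approach is to join each list item into a single string and let csv.writer quote it, so that the commas stay inside one field. This is a sketch under assumptions: the shape of fulllist (a mix of plain strings and lists of strings) is a guess, and the helper write_single_column is hypothetical.

```python
import csv
import io


def write_single_column(items, fileobj):
    """Write each item as one row with exactly one field."""
    # csv.writer takes the file object, not the data.
    writer = csv.writer(fileobj)
    for item in items:
        if isinstance(item, (list, tuple)):
            cell = ", ".join(str(part) for part in item)
        else:
            cell = str(item)
        # A one-element row; fields that contain commas are quoted automatically.
        writer.writerow([cell])


# Hypothetical stand-in for the real 292-item list.
fulllist = [["a", "b", "c"], "single", ["x", "y"]]

buf = io.StringIO()  # stands in for open(path, "w", newline="")
write_single_column(fulllist, buf)
print(buf.getvalue())
```

In Python 3 the file should be opened in text mode with newline="", not "wb". Programs that follow the usual quoting rules will then read each quoted field as a single cell.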
