Parse a string containing a large integer in Python - python

I am having trouble parsing a data set from a .txt file into an Excel file (.csv) in Python.
The source code looks like:
fin = open(filename,'r')
reader = csv.reader(fin)
for line in reader:
    list3 = str(line).split()
    print list3
    print str(list3[1])
My data sample looks like:
10134.5 -123 9.9527
And the Python screen output looks like this:
["['10134.5", '-123', '9.9527,"']"
-131.7000
So I'm assuming list3[1] is a float or a number at this point, which causes some overflow because 100,000 is larger than it can hold...
Do you know how to make Python treat it as a string, not an integer?

You do not need to split, or to cast to string... numbers inside the list are strings.
fin = open(filename,'r')
reader = csv.reader(fin)
for line in reader:
    print(line)
Output:
['10134.5', '-123', '9.9527']
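The sample line in the question looks space-separated rather than comma-separated, in which case the default comma delimiter would leave the whole line in a single field. A minimal sketch, assuming the values are separated by exactly one space (the `io.StringIO` objects are only stand-ins for the real input and output files):

```python
import csv
import io

# Stand-in for the opened .txt file; assumes single-space separation.
source = io.StringIO("10134.5 -123 9.9527\n")

# Tell the reader that fields are split on a space instead of a comma.
rows = list(csv.reader(source, delimiter=" "))
for row in rows:
    print(row)      # every field is a str; nothing is converted to a number
    print(row[1])

# Writing the same rows back out with the default dialect gives a comma-separated file.
target = io.StringIO()
csv.writer(target).writerows(rows)
print(target.getvalue())
```

If the real file uses runs of spaces or tabs, calling `line.split()` on each raw line may be simpler than `csv.reader`.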

Related

problem in writing number in csv file using python

I have this array of strings:
my_array = ['5.0', '6.066', '7.5', '7.83', '9.75']
and I want to write first 3 items on my csv file.
I am using this code
n=0
with open("file.csv",'w',newline="",)as e:
    while n<3:
        writer=csv.writer(e)
        writer.writerow(my_array[n])
        n=n+1
The output is:
5,.,0
6,.,0,6,6
7,.,5
but I don't want the characters of each number separated by commas.
For example, the output should be:
5.0
6.066
7.5
What should I do?
writerow() wants an iterable, and each element of that iterable is written as one column of the csv. Because a string is an iterable, each element (character) of that string is written to a separate column. To fix this, pass writerow() a list that contains only your single string.
You can do something like this:
import csv
my_array=['5.0', '6.066', '7.5', '7.83', '9.75']
# newline='' lets the csv module control line endings itself
with open('example.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for elm in my_array[:3]:
        writer.writerow([elm])
Output of example.csv:
5.0
6.066
7.5
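To see the difference side by side, here is a small sketch that writes the same value once as a bare string and once wrapped in a list. It writes to an in-memory `io.StringIO` buffer (an assumption made only so the result can be inspected without touching a file):

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

writer.writerow("5.0")     # bare string: iterated character by character
writer.writerow(["5.0"])   # one-element list: written as a single column

lines = buf.getvalue().splitlines()
print(lines)  # ['5,.,0', '5.0']
```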
Apparently you don't actually want a CSV (comma separated values) file, so don't use the csv module.
Simply write each string from the list to a separate line in the file.
Some options are:
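# Option 1: print each value on its own line
with open("file.txt", "w") as f:
    for value in my_array[:3]:
        print(value, file=f)
# Option 2: join the values with newlines and write them in one call
with open("file.txt", "w") as f:
    f.write("\n".join(my_array[:3]))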

Read CSV with comma as linebreak

I have a file saved as .csv
"400":0.1,"401":0.2,"402":0.3
Ultimately I want to save the data in a proper format in a csv file for further processing. The problem is that there are no line breaks in the file.
pathname = r"C:\pathtofile\file.csv"
with open(pathname, newline='') as file:
    reader = file.read().replace(',', '\n')
    print(reader)
with open(r"C:\pathtofile\filenew.csv", 'w') as new_file:
    csv_writer = csv.writer(new_file)
    csv_writer.writerow(reader)
The print reader output looks exactly how I want (or at least it's a format I can further process).
"400":0.1
"401":0.2
"402":0.3
And now I want to save that to a new csv file. However the output looks like
"""",4,0,0,"""",:,0,.,1,"
","""",4,0,1,"""",:,0,.,2,"
","""",4,0,2,"""",:,0,.,3
I'm sure it would be intelligent to convert the format to
400,0.1
401,0.2
402,0.3
at this stage instead of doing later with another script.
The main problem is that my current code
with open(pathname, newline='') as file:
    reader = file.read().replace(',', '\n')
    reader = csv.reader(reader,delimiter=':')
    x = []
    y = []
    print(reader)
    for row in reader:
        x.append( float(row[0]) )
        y.append( float(row[1]) )
print(x)
print(y)
works fine for the type of csv files I currently have, but doesn't work for these mentioned above:
y.append( float(row[1]) )
IndexError: list index out of range
So I'm trying to find a way to work with them too. I think I'm missing something obvious as I imagine that it can't be too hard to properly define the linebreak character and delimiter of a file.
with open(pathname, newline=',') as file:
yields
ValueError: illegal newline value: ,
The right way with csv module, without replacing and casting to float:
import csv
with open('file.csv', 'r') as f, open('filenew.csv', 'w', newline='') as out:
    reader = csv.reader(f)
    writer = csv.writer(out, quotechar=None)
    for r in reader:
        for i in r:
            writer.writerow(i.split(':'))
The resulting filenew.csv contents (according to your "intelligent" condition):
400,0.1
401,0.2
402,0.3
Nuances:
csv.reader and csv.writer objects treat the comma as the default delimiter (no need for file.read().replace(',', '\n'))
quotechar=None is specified for csv.writer object to eliminate double quotes around the values being saved
You need to split the values to form a list to represent a row. Presently the code is splitting the string into individual characters to represent the row.
import csv
pathname = r"C:\pathtofile\file.csv"
with open(pathname) as old_file:
    with open(r"C:\pathtofile\filenew.csv", 'w', newline='') as new_file:
        csv_writer = csv.writer(new_file, delimiter=',')
        text_rows = old_file.read().strip().split(",")
        for row in text_rows:
            items = row.split(":")
            # strip the surrounding double quotes before converting the key to int
            csv_writer.writerow([int(items[0].strip('"')), items[1]])
If you look at the documentation for writerow(), it says:
Write the row parameter to the writer’s file object, formatted according to the current dialect.
But, you are writing an entire string in your code
csv_writer.writerow(reader)
because reader is a string at this point.
Now, the format you want to use in your CSV file is not clearly mentioned in the question. But as you said, if you can do some preprocessing to create a list of lists and pass each sublist to writerow(), you should be able to produce the required file format.
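A minimal sketch of that preprocessing, assuming the whole file is one line of `"key":value` pairs separated by commas as in the question's sample. The `raw` string and the `io.StringIO` buffer are stand-ins for the real input and output files:

```python
import csv
import io

raw = '"400":0.1,"401":0.2,"402":0.3'   # sample line from the question

# Build a list of lists: one [key, value] sublist per comma-separated pair.
rows = []
for pair in raw.strip().split(","):
    key, value = pair.split(":")
    rows.append([key.strip('"'), value])   # drop the quotes around the key

# The same rows also give the x/y float lists the question asks for.
x = [float(row[0]) for row in rows]
y = [float(row[1]) for row in rows]

# writerows() writes each sublist as one CSV row.
out = io.StringIO()
csv.writer(out).writerows(rows)
print(out.getvalue())
```

With a real file, `out` would be replaced by a file opened with `newline=''`.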

Python CSV library returning 1 item instead of a list of items

I'm trying to use the CSV library to do some Excel processing, but when I use the code posted below, row returns the entirety of the data as 1 item, so row[0] returns the entire line and row[1] raises index out of range. Is there a way to make each row a list with each cell being an item, making the final product a list of lists? I was thinking of using split every time there was a close bracket ']'. If needed I can post the Excel file.
Here's a sample of what some of the output looks like. Each line is a single item in its list:
['3600035671,"$13,668",8/11/2008,8/11/2013,,,2,4A,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,']
['3910435005,"$34,872",4/1/2010,10/8/2016,,,2,4A,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,']
['5720636344,"$1,726",8/30/2010,9/5/2011,,,3,6C,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,']
['15260473510,"-$1,026,580",7/22/2005,3/5/2008,,,6,1C2A,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,']
import csv
csvfile = open('Invictus.csv', 'rU')
data = csv.reader(csvfile, dialect=csv.excel_tab)
for char in data:
    char = filter(None, char)
    print char
Assuming you are giving examples of your data above the line import csv, it looks like your data is comma delimited but you are setting up your CSV reader to expect tab delimited data (dialect=csv.excel_tab).
What happens if you change that line to:
data = csv.reader(csvfile, dialect=csv.excel)
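The effect of the dialect can be checked on a single line. This is only a sketch in Python 3 syntax: the `line` value is a shortened version of the first row in the question (the long run of trailing empty cells is dropped), and the list comprehension stands in for `filter(None, ...)`:

```python
import csv
import io

# Shortened row in the style of the question's data.
line = '3600035671,"$13,668",8/11/2008,8/11/2013,,,2,4A\n'

tab_row = next(csv.reader(io.StringIO(line), dialect=csv.excel_tab))
comma_row = next(csv.reader(io.StringIO(line), dialect=csv.excel))

print(len(tab_row))   # 1: no tabs found, so the whole line is one field
print(comma_row)      # the quoted "$13,668" stays together as one cell

# Drop the empty cells, as the original loop does.
non_empty = [cell for cell in comma_row if cell]
print(non_empty)
```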

Python csv reader returns formula instead of value

I have a txt file which contains some 'Excel formulas', and I have converted it to a csv file using the Python csv reader/writer. Now I want to read the values from the csv file and do some calculations, but when I try to access a particular column of the .csv file, it still returns the 'Excel formula' instead of the actual value, although when I open the csv file in Excel the formulas are shown as values.
Any ideas?
Here is the code
Code to convert txt to csv
def parseFile(filepath):
    file = open(filepath,'r')
    content = file.read()
    file.close()
    lines = content.split('\n')
    csv_filepath = filepath[:(len(filepath)-4)]+'_Results.csv'
    csv_out = csv.writer(open(csv_filepath, 'a'), delimiter=',' , lineterminator='\n')
    for line in lines:
        data = line.split('\t')
        csv_out.writerow(data)
    return csv_filepath
Code to do some calculation in csv file
def csv_cal (csv_filepath):
    r = csv.reader(open(csv_filepath))
    lines = [l for l in r]
    counter =[0]*(len(lines[4])+6)
    if lines[4][4] == 'Last Test Pass?' :
        print ' i am here'
        for i in range(0,3):
            print lines[6] [4] ### RETURNS FORMULA ??
    return 0
I am new to python, any help would be appreciated!
Thanks,
You can use Paste Special in Excel with the "Values only" option selected. You could select everything, paste it into another sheet, and save that. This would save you from having to implement some kind of parser in Python. Or, you could evaluate some simple arithmetic with eval.
Edit:
I've heard of xlrd, which can be downloaded from PyPI. It loads .xls files.
It sounded like you just wanted the final data, which Paste Special can give you.
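For context on why the formula comes back unchanged: a CSV file is plain text, so the csv module returns exactly the characters that were written, and it is the spreadsheet application that evaluates a leading `=` when it opens the file. A small sketch, where the cell contents are made-up examples rather than data from the question:

```python
import csv
import io

# Write one row containing a formula-like cell (hypothetical contents).
buf = io.StringIO()
csv.writer(buf).writerow(["Last Test Pass?", "=A1+B1"])

# Read it back: the csv module returns the text unchanged, with no evaluation.
buf.seek(0)
row = next(csv.reader(buf))
print(row)

# Cells that a spreadsheet would treat as formulas can be spotted by the leading '='.
formula_cells = [cell for cell in row if cell.startswith("=")]
print(formula_cells)
```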

Python CSV read-> write; remove and replace PLUS: end of line is JSON format

I am having problems getting my Python script to do what I want. It does not appear to be modifying my file.
I want to:
Read in a *.csv file that has the following format
PropertyName::PropertyValue,…,PropertyName::PropertyValue,{ExtPropertyName::ExtPropertyValue},…,{ExtPropertyName:: ExtPropertyValue}
I want to remove PropertyName:: and leave behind just a column of the PropertyValue
I want to add a header line
I was trying to step through replacing the :: values with a comma, but can't seem to get this to work:
fin = csv.reader(open('infile', 'rb'), delimiter=',')
fout = open('outfile', 'w')
for row in fin:
    fout.write(','.join(','.join(item.split()) for item in row) + '::')
fout.close()
Any advice, whether on my first step problem, or to a bigger picture resolution is always appreciated. Thanks.
UPDATE/EDIT asked for by a person nice enough to review for me!
Here is the first line of the *.csv file (INPUT)
InnerDiameterOrWidth::0.1,InnerHeight::0.1,Length2dCenterToCenter::44.6743867864386,Length3dCenterToCenter::44.6768028159989,Length2dToInsideEdge::44.2678260053526,Length3dToInsideEdge::44.2717800813466,Length2dToOutsideEdge::44.6743867864386,Length3dToOutsideEdge::44.6768028159989,MinimumCover::0,MaximumCover::0,StartConnection::ImmxGisUtilityNetworkCommon.Connection,
In a perfect world here is what I would like my text file to look like (OUTPUT)
InnerDiameterOrWidth, InnerHeight, Length2dCenterToCenter,,,,,,,,,,,
0.1,0.1,44.6743867864386
so one header line and the values in column
UPDATED JSON Info
The end of each line has JSON formatted text:
{StartPoint::7858.35924983374[%2C]1703.69341358077[%2C]-3.075},{EndPoint::7822.85045874375[%2C]1730.80294308742[%2C]-3.53962362760298}
Which I need to split into X Y Z and X Y Z with headers
Maybe something like this (assuming that each line has the same keys, and in the same order):
import csv
with open("diam.csv", "rb") as fin, open("diam_out.csv", "wb") as fout:
    reader = csv.reader(fin)
    writer = csv.writer(fout)
    for i, line in enumerate(reader):
        split = [item.split("::") for item in line if item.strip()]
        if not split: # blank line
            continue
        keys, vals = zip(*split)
        if i == 0:
            # first line: write header
            writer.writerow(keys)
        writer.writerow(vals)
which produces
localhost-2:coding $ cat diam_out.csv
InnerDiameterOrWidth,InnerHeight,Length2dCenterToCenter,Length3dCenterToCenter,Length2dToInsideEdge,Length3dToInsideEdge,Length2dToOutsideEdge,Length3dToOutsideEdge,MinimumCover,MaximumCover,StartConnection
0.1,0.1,44.6743867864386,44.6768028159989,44.2678260053526,44.2717800813466,44.6743867864386,44.6768028159989,0,0,ImmxGisUtilityNetworkCommon.Connection
I think most of that code should make sense, except maybe the zip(*split) trick: that basically transposes a sequence, i.e.
>>> s = [['a','1'],['b','2']]
>>> zip(*s)
[('a', 'b'), ('1', '2')]
so that the elements are now grouped together by their index (the first ones are all together, the second, etc.)
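The brace-wrapped StartPoint/EndPoint items from the question's update could be handled along the same lines. The following is only a sketch in Python 3 syntax, under the assumptions that `[%2C]` always stands for a comma between three coordinates and that every line has the same keys in the same order. The helper names `flatten_line` and `convert`, the `X`/`Y`/`Z` header suffixes, and the shortened `sample` line are made up for illustration:

```python
import csv
import io

SEP = "[%2C]"  # encoded comma seen in the question's sample


def flatten_line(fields):
    """Split 'Name::value' items into parallel key and value lists."""
    keys, vals = [], []
    for item in fields:
        item = item.strip().strip("{}")   # drop braces around extended items
        if not item:
            continue                      # a trailing comma leaves an empty field
        key, _, value = item.partition("::")
        value = value.strip()
        if SEP in value:
            # Coordinate triple: one column per axis, e.g. StartPointX.
            for axis, part in zip("XYZ", value.split(SEP)):
                keys.append(key + axis)
                vals.append(part)
        else:
            keys.append(key)
            vals.append(value)
    return keys, vals


def convert(src, dst):
    """Copy rows from src to dst, writing the header once before the first row."""
    writer = csv.writer(dst)
    wrote_header = False
    for fields in csv.reader(src):
        keys, vals = flatten_line(fields)
        if not keys:
            continue                      # blank line
        if not wrote_header:
            writer.writerow(keys)
            wrote_header = True
        writer.writerow(vals)


sample = ("InnerHeight::0.1,MinimumCover::0,"
          "{StartPoint::7858.35924983374[%2C]1703.69341358077[%2C]-3.075},"
          "{EndPoint::7822.85045874375[%2C]1730.80294308742[%2C]-3.53962362760298}\n")
out = io.StringIO()
convert(io.StringIO(sample), out)
result = out.getvalue().splitlines()
print(result)
```

With real files, `src` and `dst` would be opened with `newline=''`, which replaces the `"rb"`/`"wb"` modes used for the csv module in Python 2.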
