Splitting a line into a list (with some difficulty) - python

I've been working on an assignment, and after hours of searching, asking classmates and looking on here, I can't work this out.
I have a text file containing over 100,000 lines of data in the format:
"Words" \t "float" \t "float"
and I'm trying to write a function which lets me search a line and pull out one piece of information. It works fine when I write it normally, but I cannot seem to put it into a function
FileList = Mammal.readlines()
Name,Latitude,Longitude = FileList[int].split("\t")
print (Name)
def LineToList(int):
    FileList = Text.readlines()
    A,B,C = FileList[int].split("\t")
LineToList(0)
print (A)
I receive this error.
IndexError: list index out of range
I've tried swapping out int for a letter, a ratio and adding lines to return values for A, B and C then printing them out but each time it fails.

Your code reads the entire contents of an open file with readlines(), then processes only one line. If your function looks exactly as you show, it will succeed the first time you use it (as long as it's used on a freshly opened file); but on the second call there will be nothing more for readlines() to read, and you'll get back an empty list.
Here's a simpler way to convert an entire file:
lines = mammal.readlines()
values = [ row.split("\t") for row in lines ]
You then have all your values in one list of triples.
Also, note the capitalization. Python style uses names that start with a capital for user-defined types, not for ordinary variables.
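To make the point about readlines() concrete, here is a minimal sketch that reads the data once and then looks rows up by index. The helper names load_rows and line_to_list and the sample data are made up for illustration, and io.StringIO stands in for the real file. The lookup function returns its values, because names assigned inside a function are not visible outside it.

```python
import io

def load_rows(stream):
    """Read every line once and split it into (name, latitude, longitude)."""
    rows = []
    for line in stream:
        name, latitude, longitude = line.rstrip("\n").split("\t")
        rows.append((name, float(latitude), float(longitude)))
    return rows

def line_to_list(rows, index):
    """Return the parsed triple for one line; no file access happens here."""
    return rows[index]

# Hypothetical sample data standing in for the real tab-separated file.
sample = io.StringIO("Lion\t-1.5\t35.0\nTiger\t21.9\t89.2\n")
rows = load_rows(sample)

name, latitude, longitude = line_to_list(rows, 0)
print(name)
```

Because the file is consumed only once, line_to_list can be called as many times as needed without running into an empty list.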

Related

python: create variable from loop that reads text file and extracts single column

I'm new to python and was hoping for some assistance please.
I have a small script that reads a text file and prints only the first column using a for loop:
list = open("/etc/jbstorelist")
for column in list:
    print(column.split()[0])
But I would like to take all the lines printed in the for loop and create one single variable for it.
In other words, the text file /etc/jbstorelist has 3 columns and basically I want a list with only the first column, in the form of a single variable.
Any guidance would be appreciated. Thank you.
Since you're new to Python, you may want to come back and reference this answer later.
# Don't override Python builtins (i.e. don't use `list` as a variable name).
list_ = []
# Use the with statement when opening a file; it will automatically close it
# for you when you exit the block.
with open("/etc/jbstorelist") as filestream:
    # When you loop over a file you're not looping over the columns, you're
    # looping over the rows or lines.
    for line in filestream:
        # There is a side effect here you may not be aware of: calling `.split()`
        # with no arguments will split on any amount of whitespace. If you only
        # want to split on a single space character you can pass `.split()`
        # a <space> character like so: `.split(' ')`
        list_.append(line.split()[0])
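The difference between the two forms of split() mentioned in the comments is easy to see on a small example. The sample line below is invented for illustration and is not taken from /etc/jbstorelist.

```python
# A made-up line with three spaces, then a tab, between the columns.
line = "store01   10.0.0.5\tactive\n"

# No argument: split on any run of whitespace and drop the trailing newline.
any_whitespace = line.split()
print(any_whitespace)   # ['store01', '10.0.0.5', 'active']

# A single space as separator: every space counts, so empty strings appear
# and the tab and newline stay inside the last piece.
single_space = line.split(" ")
print(single_space)     # ['store01', '', '', '10.0.0.5\tactive\n']

# The same first-column extraction written as a list comprehension.
rows = ["store01 a b", "store02 c d"]
first_column = [row.split()[0] for row in rows]
print(first_column)     # ['store01', 'store02']
```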

When I write in csv how do I separate columns in Python

My code is
import pymysql
conn=pymysql.connect(host=.................)
curs=conn.cursor()
import csv
f=open('./kospilist.csv','r')
data=f.readlines()
data_kp=[]
for i in data:
    data_kp.append(i[:-1])
c = csv.writer(open("./test_b.csv","wb"))
def exportFunc():
    result=[]
    for i in range(0,len(data_kp)):
        xp="select date from " + data_kp[i] + " where price is null"
        curs.execute(xp)
        result= curs.fetchall()
        for row in result:
            c.writerow(data_kp[i])
            c.writerow(row)
        c.writerow('\n')
exportFunc()
data_kp holds the table names.
The table names are strings (for example: a000010).
I collect the table names from the CSV file,
then execute the query for each one and get the result.
The actual output of my code is ..
My expectation is
(not 3 columns.. there are 2000 tables)
I thought my code is near the answer... but it's not working..
My work is almost done, but I couldn't finish this part.
I had googled for almost 10 hours..
I don't know how.. please help
I think something is wrong with this part:
for row in result:
    c.writerow(data_kp[i])
    c.writerow(row)
The csvwriter.writerow method writes one row to your output csv file. This means that once you have called writerow, the line is written and you can't come back to it. When you write the code:
for row in result:
    c.writerow(data_kp[i])
    c.writerow(row)
You are saying:
"For each result, write a line containing data_kp[i] then write a
line containing row."
This way, everything will be written vertically, alternating between data_kp[i] and row.
What is surprising is that this is not what we see in your actual output. I think that you've changed something, to something like this:
c.writerow(data_kp[i])
for row in result:
    c.writerow(row)
But this has not entirely solved your issue, obviously: The names of the tables are not correctly displayed (one character on each column) and they are not side-by-side. So you have 2 problems here:
1. Get the table name in one cell and not split
First, let's take a look at the documentation about the csvwriter:
A row must be an iterable of strings or numbers for Writer objects
But your data_kp[i] is a string, not an "iterable of strings". This can't work! But you don't get any error either. Why? Because a string, in Python, is itself an iterable of one-character strings. Try it yourself:
for char in "abcde":
    print(char)
And now you have probably understood what to do in order to make things work:
# Give an iterable containing only data_kp[i]
c.writerow([data_kp[i]])
You now have your table name displayed in only 1 cell! But we still have another problem...
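The difference between passing a bare string and a one-element list can be checked without touching the database. This is a small sketch that writes to an in-memory buffer (io.StringIO) instead of test_b.csv, using the example table name a000010 from the question.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")

# A bare string is iterated character by character: one cell per character.
writer.writerow("a000010")

# A one-element list produces a single cell holding the whole name.
writer.writerow(["a000010"])

print(buffer.getvalue())
```

The first output line is a,0,0,0,0,1,0 and the second is a000010.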
2. Get the table names displayed side by side
Here, the problem is in the logic of your code. You are looping over your table names, writing lines containing them, and expecting them to be written side by side with columns of dates below!
Your code needs a little rethinking because csvwriter is made for writing rows, not columns. We'll use the zip_longest function of the itertools module. You may ask why I don't use Python's built-in zip function: the columns are not guaranteed to be of equal size, and zip stops as soon as it reaches the end of the shortest list!
import itertools
# zip_longest is the Python 3 name; in Python 3 the csv module needs a file
# opened in text mode with newline=""
c = csv.writer(open("./test_b.csv", "w", newline=""))
# each entry of this list will contain a column for your csv file
data_columns = []
def exportFunc():
    for i in range(0, len(data_kp)):
        xp = "select date from " + data_kp[i] + " where price is null"
        curs.execute(xp)
        result = curs.fetchall()
        # each column starts with the name of the table;
        # row[0] is the date in each one-column row returned by fetchall()
        data_columns.append([data_kp[i]] + [row[0] for row in result])
    # the * operator unpacks the list into arguments for zip_longest
    zipped_columns = itertools.zip_longest(*data_columns, fillvalue=" ")
    c.writerows(zipped_columns)
exportFunc()
Note:
The code provided here has not been tested and may contain bugs. Nevertheless, you should be able (by using the documentation I provided) to fix it in order to make it work! Good luck :)
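The column-to-row transposition can be tried in isolation. The sketch below uses invented table names and dates in place of the query results and writes to an in-memory buffer, so it runs without a database. It assumes Python 3, where the function is called itertools.zip_longest.

```python
import csv
import io
import itertools

# Made-up columns of unequal length: table name first, then its dates.
data_columns = [
    ["a000010", "2015-01-02", "2015-01-05"],
    ["a000020", "2015-01-02"],
    ["a000030", "2015-01-02", "2015-01-05", "2015-01-06"],
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")

# zip_longest turns the columns into rows and pads the short ones.
writer.writerows(itertools.zip_longest(*data_columns, fillvalue=""))

output_lines = buffer.getvalue().splitlines()
for line in output_lines:
    print(line)
```

With the built-in zip instead, the output would stop after the second row, because the shortest column has only two entries.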

Replacing text in a file with csv file

I want to replace text in a text file with names that are in a csv file. For example, I have a text file that says:
Dear [name here],
Hello . . .. etc etc
And a csv file that has a 2 columns with the first name in the first and the last name in the second:
Joe Smith
Rachel Cool
How would I be able to read in the CSV file, and replace each name with the text [name here] in the file? I have this so far after opening the text and csv file and putting the first and last names in the variable names:
for row in f.readlines():
    names = row.split(",")[0], row.split(",")[1]
And after that I tried doing something like this, but it isn't working:
for row in textfile.readlines():
    print row.replace("[name here]", names)
After running it, I get a "TypeError: expected a string or other character buffer" error. I assume that is because the variable names isn't defined in the second for loop.
But how would I be able to read both files and replace just the [name here] in the text file?
Answering your question
From this line,
names = row.split(",")[0], row.split(",")[1]
names is a tuple of two strings. But then, in this line
row.replace("[name here]", names)
you're trying to use it as a single string.
If you want to write both first and family names, you can join them as suggested in another answer, but why split them in the first place? You may just read them in the first loop without splitting (which also handles cases where the person has three names, like Herbert George Wells):
names = row
in the first loop, then
row.replace("[name here]", names)
in the second loop.
If you split the names because you want to use them separately, then it is unclear to me what you want to achieve.
Please include tracebacks in questions
The hint here is in the traceback:
TypeError: expected a string or other character buffer error
replace expects a string or something that acts like a string. You're feeding it a tuple. Generally, when asking a question here, it is good practice to copy the full traceback so that we know where the error occurs exactly.
Next issue
Now, you're going to face another issue: in the loop, you overwrite names at each iteration.
You should use a list to store all names, then produce one copy of the text per name:
names = []
for row in f.readlines():
    names.append(row.strip())
text = textfile.read()
for name in names:
    print(text.replace("[name here]", name))
(Or maybe you did this correctly in your code but you stripped it away when generating a simplified example for your question.)
The replace() method expects a string for the second parameter and you've passed it a tuple, hence the TypeError.
You can take the two names in the names tuple and join them together with a space in the middle using the join method. Try this:
print(row.replace("[name here]", ' '.join(names)))
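Putting the pieces from both answers together, a minimal sketch might look like the following. It assumes the csv has one person per row with first and last name in separate columns, as described in the question, and uses in-memory strings in place of the two files; the variable names are made up for illustration.

```python
import csv
import io

# Stand-ins for the real files described in the question.
names_csv = io.StringIO("Joe,Smith\nRachel,Cool\n")
template = "Dear [name here],\nHello . . .. etc etc\n"

letters = []
for row in csv.reader(names_csv):
    # row is a list such as ['Joe', 'Smith']; join it into one string
    # so that replace() receives a str rather than a tuple or list.
    full_name = " ".join(part.strip() for part in row)
    letters.append(template.replace("[name here]", full_name))

for letter in letters:
    print(letter)
```

Joining the whole row rather than indexing columns 0 and 1 also copes with rows that have more than two name parts.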

How to 'flatten' lines from text file if they meet certain criteria using Python?

To start, I am a complete newcomer to Python and to programming in anything other than web languages.
So, I have developed a script using Python as an interface between a piece of software called Spendmap and an online app called Freeagent. This script works perfectly. It imports and parses the text file and pushes it through the API to the web app.
What I am struggling with is that Spendmap exports multiple lines per order, whereas Freeagent wants one line per order. So I need to add up the cost values from any orders spread across multiple lines and then 'flatten' the lines into one so it can be sent through the API. The 'key' field is the 'PO' field. So if the script sees any matching PO numbers, I want it to flatten them as described above.
This is a 'dummy' example of the text file produced by Spendmap:
5090071648,2013-06-05,2013-09-05,P000001,1133997,223.010,20,2013-09-10,104,xxxxxx,AP
COMMENT,002091
301067,2013-09-06,2013-09-11,P000002,1133919,42.000,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
301067,2013-09-06,2013-09-11,P000002,1133919,359.400,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
301067,2013-09-06,2013-09-11,P000003,1133910,23.690,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
The above has been formatted for easier reading; normally it is just one line after the next with no text formatting.
The 'key' or PO field is the fourth field (e.g. P000001) and the cost to be totalled is the sixth field (e.g. 223.010). So if this example were passed through the script, I'd expect the first row to be left alone, the second and third row costs to be added as they're both from the same PO number, and the fourth row to be left alone.
Expected result:
5090071648,2013-06-05,2013-09-05,P000001,1133997,223.010,20,2013-09-10,104,xxxxxx,AP
COMMENT,002091
301067,2013-09-06,2013-09-11,P000002,1133919,401.400,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
301067,2013-09-06,2013-09-11,P000003,1133910,23.690,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
Any help with this would be greatly appreciated and if you need any further details just say.
Thanks in advance for looking!
I won't give you the solution. But you should:
Write and test a regular expression that breaks the line down into its parts, or use the CSV library.
Parse the numbers out so they're decimal numbers rather than strings
Collect the lines up by ID. Perhaps you could use a dict that maps IDs to lists of orders?
When all the input is finished, iterate over that dict and add up all orders stored in that list.
Make a string format function that outputs the line in the expected format.
Maybe feed the output back into the input to test that you get the same result. Second time round there should be no changes, if I understood the problem.
Good luck!
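For reference, the steps above can be sketched as follows. This is one possible reading, not a definitive solution: it assumes every order line is followed by exactly one COMMENT line (as in the dummy data), keeps the other fields and the comment from the first line seen for each PO, and uses the csv module and Decimal rather than a regular expression.

```python
import csv
from decimal import Decimal

raw = """5090071648,2013-06-05,2013-09-05,P000001,1133997,223.010,20,2013-09-10,104,xxxxxx,AP
COMMENT,002091
301067,2013-09-06,2013-09-11,P000002,1133919,42.000,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
301067,2013-09-06,2013-09-11,P000002,1133919,359.400,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143
301067,2013-09-06,2013-09-11,P000003,1133910,23.690,20,2013-10-31,103,xxxxxx,AP
COMMENT,002143"""

lines = raw.splitlines()
orders = {}  # PO number -> accumulated order (dicts keep insertion order)

# Pair each order line with the COMMENT line that follows it.
for order_line, comment_line in zip(lines[0::2], lines[1::2]):
    fields = next(csv.reader([order_line]))
    po, cost = fields[3], Decimal(fields[5])
    if po in orders:
        orders[po]["total"] += cost
    else:
        orders[po] = {"fields": fields, "total": cost, "comment": comment_line}

# Rebuild one order line (plus its comment) per PO number.
output = []
for entry in orders.values():
    fields = entry["fields"][:5] + [f"{entry['total']:.3f}"] + entry["fields"][6:]
    output.append(",".join(fields))
    output.append(entry["comment"])

print("\n".join(output))
```

Decimal avoids the rounding surprises that float addition can introduce when summing money amounts.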
I would use a dictionary to compile the lines, using get(key,0.0) to sum values if they exist already, or start with zero if not:
InputData = """5090071648,2013-06-05,2013-09-05,P000001,1133997,223.010,20,2013-09-10,104,xxxxxx,AP COMMENT,002091
301067,2013-09-06,2013-09-11,P000002,1133919,42.000,20,2013-10-31,103,xxxxxx,AP COMMENT,002143
301067,2013-09-06,2013-09-11,P000002,1133919,359.400,20,2013-10-31,103,xxxxxx,AP COMMENT,002143
301067,2013-09-06,2013-09-11,P000003,1133910,23.690,20,2013-10-31,103,xxxxxx,AP COMMENT,002143"""
OutD = {}
ValueD = {}
for Line in InputData.split('\n'):
    # commas in comments won't matter because we are joining after anyway
    Fields = Line.split(',')
    PO = Fields[3]
    Value = float(Fields[5])
    # set up the output string with a placeholder for .format()
    OutD[PO] = ",".join(Fields[:5] + ["{0:.3f}"] + Fields[6:])
    # add the value to the old value or to zero if it is not found
    ValueD[PO] = ValueD.get(PO,0.0) + Value
# the output is unsorted by default, but you could sort or preserve original order
for POKey in ValueD:
    print(OutD[POKey].format(ValueD[POKey]))
P.S. Yes, I know Capitals are for Classes, but this makes it easier to tell what variables I have defined...

python. write to file, cannot understand behavior

I don't understand why I cannot write to a file in my Python program. I have a list of strings, measurements. I just want to write them to a file. Instead of all the strings, it writes only one string. I cannot understand why.
This is my piece of code:
fmeasur = open(fmeasur_name, 'w')
line1st = 'rev number, alg time\n'
fmeasur.write(line1st)
for i in xrange(len(measurements)):
    fmeasur.write(measurements[i])
    print measurements[i]
fmeasur.close()
I can see all of these strings printed, but in the file there is only one. What could be the problem?
The only plausible explanation that I have is that you execute the above code multiple times, each time with a single entry in measurements (or at least the last time you execute the code, len(measurements) is 1).
Since you're overwriting the file instead of appending to it, only the last set of measurements would be present in the file, but all of them would appear on the screen.
Edit: Or do you mean that the data is there, but there are no newlines between the measurements? The easiest way to fix that is by using print >>fmeasur, measurements[i] instead of fmeasur.write(...).
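The overwrite explanation can be reproduced with a short sketch. It is written for Python 3 (the question's code is Python 2), the measurement strings are invented, and a temporary directory stands in for fmeasur_name.

```python
import os
import tempfile

# Made-up measurements; each string already ends with a newline.
measurements = ["1, 0.25\n", "2, 0.31\n", "3, 0.28\n"]
path = os.path.join(tempfile.mkdtemp(), "measurements.csv")

# Reopening with mode 'w' truncates the file every time,
# so only the last entry survives.
for entry in measurements:
    with open(path, "w") as handle:
        handle.write(entry)
with open(path) as handle:
    overwritten = handle.read()

# Opening once and writing inside the loop keeps every entry.
with open(path, "w") as handle:
    handle.write("rev number, alg time\n")
    for entry in measurements:
        handle.write(entry)
with open(path) as handle:
    complete = handle.read()

print(repr(overwritten))
print(complete)
```

If the file really has to be reopened between writes, mode 'a' appends instead of truncating.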
