I have a big parent list containing many lists of tuples, like this small example:
[
[('id', 'name', 'Trans'), ('ENS001', 'EGSB', 'TTP')],
[('id', 'name', 'Trans'), ('EN02', 'EHGT', 'GFT')]
]
My goal is to make a text file with several columns. The columns come from the second tuple of each list in the parent list. The first tuple is the same in all nested lists and provides the column names.
I used this code (z is the list above)
rows= [i[1] for i in z]
to get
[('ENS001', 'EGSB', 'TTP'), ('EN02', 'EHGT', 'GFT')]
And this one (which I call “A”)
with open('out.txt','w') as f:
    f.write(' '.join(z[0][0]))
    for i in rows:
        f.write(' '.join(i))
to get the file. But in the file the columns are not separated and aligned like this:
id     name Trans
ENS001 EGSB TTP
EN02   EHGT GFT
You are writing it all on one line; you need to add a newline:
rows = (sub[1] for sub in z)

with open('out.txt','w') as f:
    f.write("{}\n".format(' '.join(z[0][0])))  # add a newline
    for i in rows:
        f.write("{}\n".format(' '.join(i)))  # newline again
If you always have three elements in your rows and you want them aligned:
rows = [sub[1] for sub in z]

mx_len = 0
for tup in rows:
    mx = len(max(tup[:-1], key=len))
    if mx > mx_len:
        mx_len = mx

with open('out.txt', 'w') as f:
    a, b, c = z[0][0]
    f.write("{:<{mx_len}} {:<{mx_len}} {}\n".format(a, b, c, mx_len=mx_len))
    for a, b, c in rows:
        f.write("{:<{mx_len}} {:<{mx_len}} {}\n".format(a, b, c, mx_len=mx_len))
Output:
id     name   Trans
ENS001 EGSB   TTP
EN02   EHGT   GFT
If the length varies:
with open('out.txt', 'w') as f:
    f.write(("{:<{mx_len}}" * len(z[0][0])).format(*z[0][0], mx_len=mx_len) + "\n")
    for row in rows:
        f.write(("{:<{mx_len}}" * len(row)).format(*row, mx_len=mx_len) + "\n")
If you want to align the columns with spaces, first you have to determine what each column's width will be -- presumably the length of the longest header or content of each column, e.g.:
wids = [len(h) for h in z[0][0]]
for i in rows:
    wids = [max(len(r), w) for w, r in zip(wids, i)]
Then on this basis you can prepare a format string, such as
fmt = ' '.join('%%%ds' % w for w in wids) + '\n'
and finally, you can write things out:
with open('out.txt','w') as f:
    f.write(fmt % z[0][0])
    for i in rows:
        f.write(fmt % i)
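With the sample z above, wids works out to [6, 4, 5], so fmt becomes '%6s %4s %5s\n' and the file gets right-aligned columns:
    id name Trans
ENS001 EGSB   TTP
  EN02 EHGT   GFT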
If you want the output to be separated by tabs instead, you can join on '\t' rather than the ' ' you are using. The last line of your code would then look like f.write('\t'.join(i)).
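For instance, a minimal sketch of the tab-separated version, reusing rows from your code:

rows = [i[1] for i in z]

with open('out.txt', 'w') as f:
    f.write('\t'.join(z[0][0]) + '\n')  # header row, tab-separated
    for i in rows:
        f.write('\t'.join(i) + '\n')    # one data row per line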
My code attempts to split several data tables into year-long chunks, then correlate each year against all other years and return the correlation values to a matrix. I am attempting to write these outputs to a CSV file. While it works fine for the matrices themselves, when I try to write the name of the column and table, they are split into their individual characters.
def split_into_yearly_chunks(egauge, column, offset):
    split_into_chunks_stmnt = " SELECT " + column + \
                              " FROM " + egauge + \
                              " OFFSET " + offset + " LIMIT 525600 "
    year_long_chunk = pd.read_sql_query(split_into_chunks_stmnt, engine)
    return year_long_chunk

for x in prof_list:
    for y in col_list:
        list_of_year_long_chunks = []
        for z in off_list:
            year_long_chunk = split_into_yearly_chunks(x, y, z)
            if len(year_long_chunk) == 525600:
                list_of_year_long_chunks.append(year_long_chunk)
        corr_matrix = []
        for corr_year in list_of_year_long_chunks:
            corr_row = []
            for corr_partner in list_of_year_long_chunks:
                corr_value, p_coef = spearmanr(corr_year, corr_partner)
                corr_row.append(corr_value)
            corr_matrix.append(corr_row)
        print(x, '; ', y, '; ')
        with open('correlation_data_58_profiles.csv', 'a') as f:
            thewriter = csv.writer(f)
            thewriter.writerow(x)
            thewriter.writerow(y)
        for row in corr_matrix:
            print(row)
            with open('correlation_data_58_profiles.csv', 'a', newline='') as f:
                thewriter = csv.writer(f)
                thewriter.writerow(row)
(Really only the last 10 or so lines in my code are the problem, but I figured I'd give the whole thing for background.) My prof_list, col_list, and off_list are all lists of strings.
The way that this stores in my csv file looks like this:
e,g,a,u,g,e,1,3,8,3,0
g,r,i,d
1.0,0.7811790818745755,0.7678768782119194,0.7217461539833535
0.7811790818745755,0.9999999999999998,0.7614854144434556,0.714875063672517
0.7678768782119194,0.7614854144434556,0.9999999999999998,0.7215907332829061
0.7217461539833535,0.7148750636725169,0.7215907332829061,0.9999999999999998
I'd like egauge13830 and grid not to be separated by commas, and other answers I've seen wouldn't work for the for loop that I have. How can I do this?
csv.writer(...).writerow expects a list of values, representing the values of a single row.
At some places in the code you are giving it single strings:
thewriter.writerow(x)  # x is a string out of prof_list
thewriter.writerow(y)  # y is a string out of col_list
Since it expects lists of strings, it treats each of these strings as a list of individual characters; that's why you get each character as its own "column", separated by commas.
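A quick demonstration of the difference, writing to sys.stdout instead of a file:

import csv
import sys

w = csv.writer(sys.stdout)
w.writerow('egauge13830')    # string -> e,g,a,u,g,e,1,3,8,3,0
w.writerow(['egauge13830'])  # one-element list -> egauge13830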
If you want each of these single strings to appear in its own row as a single column value, then you'll need to make each of them into a one-element-list, indicating to the CSV writer that you want a row consisting of a single value:
thewriter.writerow([x])  # `[x]` means "a list comprised only of x"
thewriter.writerow([y])
Also bear in mind that a CSV containing two rows of a single value each, followed by N rows of K values each would be kind of hard to further process; so you should consider if that's really what you want.
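For example, here is a minimal sketch that writes the two labels as one two-column row instead, followed by the matrix rows (x, y, and corr_matrix as in your loop):

with open('correlation_data_58_profiles.csv', 'a', newline='') as f:
    thewriter = csv.writer(f)
    thewriter.writerow([x, y])   # one header row: egauge13830,grid
    for row in corr_matrix:
        thewriter.writerow(row)  # matrix rows as before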
I have been facing an issue parsing a horrible txt file. I have managed to extract the information I need into lists:
['OS-EXT-SRV-ATTR:host', 'compute-0-4.domain.tld']
['OS-EXT-SRV-ATTR:hostname', 'commvault-vsa-vm']
['OS-EXT-SRV-ATTR:hypervisor_hostname', 'compute-0-4.domain.tld']
['OS-EXT-SRV-ATTR:instance_name', 'instance-00000008']
['OS-EXT-SRV-ATTR:root_device_name', '/dev/vda']
['hostId', '985035a85d3c98137796f5799341fb65df21e8893fd988ac91a03124']
['key_name', '-']
['name', 'Commvault_VSA_VM']
['OS-EXT-SRV-ATTR:host', 'compute-0-28.domain.tld']
['OS-EXT-SRV-ATTR:hostname', 'dummy-vm']
['OS-EXT-SRV-ATTR:hypervisor_hostname', 'compute-0-28.domain.tld']
['OS-EXT-SRV-ATTR:instance_name', 'instance-0000226e']
['OS-EXT-SRV-ATTR:root_device_name', '/dev/hda']
['hostId', '7bd08d963a7c598f274ce8af2fa4f7beb4a66b98689cc7cdc5a6ef22']
['key_name', '-']
['name', 'Dummy_VM']
['OS-EXT-SRV-ATTR:host', 'compute-0-20.domain.tld']
['OS-EXT-SRV-ATTR:hostname', 'mavtel-sif-vsifarvl11']
['OS-EXT-SRV-ATTR:hypervisor_hostname', 'compute-0-20.domain.tld']
['OS-EXT-SRV-ATTR:instance_name', 'instance-00001da6']
['OS-EXT-SRV-ATTR:root_device_name', '/dev/vda']
['hostId', 'dd82c20a014e05fcfb3d4bcf653c30fa539a8fd4e946760ee1cc6f07']
['key_name', 'mav_tel_key']
['name', 'MAVTEL-SIF-vsifarvl11']
I would like to have element 0 as the headers and element 1 as the rows, for example:
OS-EXT-SRV-ATTR:host, OS-EXT-SRV-ATTR:hostname,...., name
compute-0-4.domain.tld, commvault-vsa-vm,....., Commvault_VSA_VM
compute-0-28.domain.tld, dummy-vm,...., Dummy_VM
Here is my code so far:
import re

with open('metadata.txt', 'r') as infile:
    lines = infile.readlines()

for line in lines:
    if re.search('hostId|properties|OS-EXT-SRV-ATTR:host|OS-EXT-SRV-ATTR:hypervisor_hostname|name', line):
        re.sub("[\t]+", " ", line)
        find = line.strip()
        format = ''.join(line.split()).replace('|', ',')
        list = format.split(',')
        new_list = list[1:-1]
I am very new at Python, so sometimes I run out of ideas on how to make things work.
Looking at your input file, I see that it contains what appears to be output from the openstack nova show command, mixed with other stuff. There are basically two types of lines: valid ones, and invalid ones (duh).
The valid ones have this structure:
'| key | value |'
and the invalid ones have anything else.
So we could define that every valid line can be split at the | into exactly four parts, of which the first and the last must be empty, and the two middle parts must be filled.
Python can do this (it's called unpacking assignment):
a, b, c, d = [1, 2, 3, 4]
a, b, c, d = some_string.split('|')
which will succeed when the right-hand side has exactly four parts; otherwise it will fail with a ValueError. When we now make sure that a and d are empty, and b and c are not empty, we have a valid line.
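A quick illustration of both cases, using one valid and one invalid line from such a file:

line = '| name | Dummy_VM |'
a, b, c, d = line.split('|')  # -> '', ' name ', ' Dummy_VM ', ''

try:
    a, b, c, d = '+----------+'.split('|')  # only one part, too few values
except ValueError as e:
    print('invalid line:', e)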
Furthermore we can say, if b equals 'Property' and c equals 'Value', we have hit a header row and what follows must describe a "new record".
This function does exactly that:
def parse_metadata_file(path):
    """ parses a data file generated by `nova show` into records """
    with open(path, 'r', encoding='utf8') as file:
        record = {}
        for line in file:
            try:
                # unpack line into 4 fields: "| key | val |"
                a, key, val, z = map(str.strip, line.split('|'))
                if a != '' or z != '' or key == '' or val == '':
                    continue
            except ValueError:
                # skip invalid lines
                continue
            if key == 'Property' and val == 'Value' and record:
                # output current record and start a new one
                yield record
                record = {}
            else:
                # write property to current record
                record[key] = val
        # output last record
        if record:
            yield record
It spits out a new dict for each record it finds and disregards all lines that do not pass the sanity check. Effectively this function generates a stream of dicts.
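For a quick sanity check, you could consume the generator directly before involving csv (assuming metadata.txt is the file shown above):

for record in parse_metadata_file('metadata.txt'):
    print(record.get('name'), '->', record.get('OS-EXT-SRV-ATTR:host'))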
Now we can use the csv module to write this stream of dicts to a CSV file:
import csv
# list of fields we are interested in
fields = ['hostId', 'properties', 'OS-EXT-SRV-ATTR:host', 'OS-EXT-SRV-ATTR:hypervisor_hostname', 'name']
with open('output.csv', 'w', encoding='utf8', newline='') as outfile:
    writer = csv.DictWriter(outfile, fieldnames=fields, extrasaction='ignore')
    writer.writeheader()
    writer.writerows(parse_metadata_file('metadata.txt'))
The csv module has a DictWriter which is designed to accept dicts as input and write them, according to the given key names, to a CSV row.
With extrasaction='ignore' it does not matter if the current record has more fields than required.
With the fields list it becomes extremely easy to extract a different set of fields.
Configure the writer to suit your needs.
This:
writer.writerows(parse_metadata_file('metadata.txt'))
is a convenient shorthand for
for record in parse_metadata_file('metadata.txt'):
    writer.writerow(record)
You can take a step by step approach to build a 2D array by keeping track of your headers and each entry in the text file.
headers = list(set([entry[0] for entry in data]))  # obtain unique headers

num_rows = 1
for entry in data:  # figuring out how many rows we are going to need
    if 'name' in entry:  # name is unique per row so using that
        num_rows += 1
num_cols = len(headers)

mat = [[0 for _ in range(num_cols)] for _ in range(num_rows)]
mat[0] = headers  # add headers as first row

header_lookup = {header: i for i, header in enumerate(headers)}
row = 1
for entry in data:
    header, val = entry[0], entry[1]
    col = header_lookup[header]
    mat[row][col] = val  # add entries to each subsequent row
    if header == 'name':
        row += 1

print(mat)
output:
[['hostId', 'OS-EXT-SRV-ATTR:host', 'name', 'OS-EXT-SRV-ATTR:hostname', 'OS-EXT-SRV-ATTR:instance_name', 'OS-EXT-SRV-ATTR:root_device_name', 'OS-EXT-SRV-ATTR:hypervisor_hostname', 'key_name'], ['985035a85d3c98137796f5799341fb65df21e8893fd988ac91a03124', 'compute-0-4.domain.tld', 'Commvault_VSA_VM', 'commvault-vsa-vm', 'instance-00000008', '/dev/vda', 'compute-0-4.domain.tld', '-'], ['7bd08d963a7c598f274ce8af2fa4f7beb4a66b98689cc7cdc5a6ef22', 'compute-0-28.domain.tld', 'Dummy_VM', 'dummy-vm', 'instance-0000226e', '/dev/hda', 'compute-0-28.domain.tld', '-'], ['dd82c20a014e05fcfb3d4bcf653c30fa539a8fd4e946760ee1cc6f07', 'compute-0-20.domain.tld', 'MAVTEL-SIF-vsifarvl11', 'mavtel-sif-vsifarvl11', 'instance-00001da6', '/dev/vda', 'compute-0-20.domain.tld', 'mav_tel_key']]
If you need to write the new 2D array to a file so it's not as "horrible" :)
with open('output.txt', 'w') as f:
    for lines in mat:
        lines_out = '\t'.join(lines)
        f.write(lines_out)
        f.write('\n')
Looks like a job for pandas:
import pandas as pd
list_to_export = [['OS-EXT-SRV-ATTR:host', 'compute-0-4.domain.tld'],
['OS-EXT-SRV-ATTR:hostname', 'commvault-vsa-vm'],
['OS-EXT-SRV-ATTR:hypervisor_hostname', 'compute-0-4.domain.tld'],
['OS-EXT-SRV-ATTR:instance_name', 'instance-00000008'],
['OS-EXT-SRV-ATTR:root_device_name', '/dev/vda'],
['hostId', '985035a85d3c98137796f5799341fb65df21e8893fd988ac91a03124'],
['key_name', '-'],
['name', 'Commvault_VSA_VM'],
['OS-EXT-SRV-ATTR:host', 'compute-0-28.domain.tld'],
['OS-EXT-SRV-ATTR:hostname', 'dummy-vm'],
['OS-EXT-SRV-ATTR:hypervisor_hostname', 'compute-0-28.domain.tld'],
['OS-EXT-SRV-ATTR:instance_name', 'instance-0000226e'],
['OS-EXT-SRV-ATTR:root_device_name', '/dev/hda'],
['hostId', '7bd08d963a7c598f274ce8af2fa4f7beb4a66b98689cc7cdc5a6ef22'],
['key_name', '-'],
['name', 'Dummy_VM'],
['OS-EXT-SRV-ATTR:host', 'compute-0-20.domain.tld'],
['OS-EXT-SRV-ATTR:hostname', 'mavtel-sif-vsifarvl11'],
['OS-EXT-SRV-ATTR:hypervisor_hostname', 'compute-0-20.domain.tld'],
['OS-EXT-SRV-ATTR:instance_name', 'instance-00001da6'],
['OS-EXT-SRV-ATTR:root_device_name', '/dev/vda'],
['hostId', 'dd82c20a014e05fcfb3d4bcf653c30fa539a8fd4e946760ee1cc6f07'],
['key_name', 'mav_tel_key'],
['name', 'MAVTEL-SIF-vsifarvl11']]
data_dict = {}
for i in list_to_export:
    if i[0] not in data_dict:
        data_dict[i[0]] = [i[1]]
    else:
        data_dict[i[0]].append(i[1])

pd.DataFrame.from_dict(data_dict, orient='index').T.to_csv('filename.csv')
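As a side note, the dict-building loop can be written a bit more compactly with collections.defaultdict; this sketch is equivalent to the loop above:

from collections import defaultdict

data_dict = defaultdict(list)
for key, value in list_to_export:
    data_dict[key].append(value)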
I have a record as below:
29 16
A 1.2595034 0.82587254 0.7375044 1.1270138 -0.35065323 0.55985355
0.7200067 -0.889543 0.2300735 0.56767654 0.2789483 0.32296127 -0.6423197 0.26456305 -0.07363393 -1.0788593
B 1.2467299 0.78651106 0.4702038 1.204216 -0.5282698 0.13987103
0.5911153 -0.6729466 0.377103 0.34090135 0.3052503 0.028784657 -0.39129165 0.079238065 -0.29310825 -0.99383247
I want to split the data into key-value pairs, neglecting the first row, i.e. 29 16.
The output should be something like this:
x = A , B
y = 1.2595034 0.82587254 0.7375044 1.1270138 -0.35065323 0.55985355 0.7200067 -0.889543 0.2300735 0.56767654 0.2789483 0.32296127 -0.6423197 0.26456305 -0.07363393 -1.0788593
1.2467299 0.78651106 0.4702038 1.204216 -0.5282698 0.13987103 0.5911153 -0.6729466 0.377103 0.34090135 0.3052503 0.028784657 -0.39129165 0.079238065 -0.29310825 -0.99383247
I am able to neglect the first line using the below code:
f = open(fileName, 'r')
lines = f.readlines()[1:]
Now how do I separate the rest of the record in Python?
So here's my take :D I expect you'd want to have the numbers parsed as well?
def generate_kv(fileName):
    with open(fileName, 'r') as file:
        # ignore first line
        file.readline()
        for line in file:
            if '' == line.strip():
                # empty line
                continue
            values = line.split(' ')
            try:
                yield values[0], [float(x) for x in values[1:]]
            except ValueError:
                print(f'one of the elements was not a float: {line}')

if __name__ == '__main__':
    x = []
    y = []
    for key, value in generate_kv('sample.txt'):
        x.append(key)
        y.append(value)
    print(x)
    print(y)
This assumes that the values in sample.txt look like this:
% cat sample.txt
29 16
A 1.2595034 0.82587254 0.7375044 1.1270138 -0.35065323 0.55985355 0.7200067 -0.889543 0.2300735 0.56767654 0.2789483 0.32296127 -0.6423197 0.26456305 -0.07363393 -1.0788593
B 1.2467299 0.78651106 0.4702038 1.204216 -0.5282698 0.13987103 0.5911153 -0.6729466 0.377103 0.34090135 0.3052503 0.028784657 -0.39129165 0.079238065 -0.29310825 -0.99383247
and the output:
% python sample.py
['A', 'B']
[[1.2595034, 0.82587254, 0.7375044, 1.1270138, -0.35065323, 0.55985355, 0.7200067, -0.889543, 0.2300735, 0.56767654, 0.2789483, 0.32296127, -0.6423197, 0.26456305, -0.07363393, -1.0788593], [1.2467299, 0.78651106, 0.4702038, 1.204216, -0.5282698, 0.13987103, 0.5911153, -0.6729466, 0.377103, 0.34090135, 0.3052503, 0.028784657, -0.39129165, 0.079238065, -0.29310825, -0.99383247]]
Alternatively, if you'd wanted to have a dictionary, do:
if __name__ == '__main__':
    print(dict(generate_kv('sample.txt')))
That will convert the list into a dictionary and output:
{'A': [1.2595034, 0.82587254, 0.7375044, 1.1270138, -0.35065323, 0.55985355, 0.7200067, -0.889543, 0.2300735, 0.56767654, 0.2789483, 0.32296127, -0.6423197, 0.26456305, -0.07363393, -1.0788593], 'B': [1.2467299, 0.78651106, 0.4702038, 1.204216, -0.5282698, 0.13987103, 0.5911153, -0.6729466, 0.377103, 0.34090135, 0.3052503, 0.028784657, -0.39129165, 0.079238065, -0.29310825, -0.99383247]}
You can use this script if your file is a text file:

filename = 'file.text'
with open(filename) as f:
    data = f.readlines()[1:]  # skip the first line

x = [data[0][0], data[1][0]]    # first character of each line: 'A', 'B'
y = [data[0][1:], data[1][1:]]  # the rest of each line as a string
If you're happy to store the data in a dictionary, here is what you can do:

records = dict()

with open(filename, 'r') as f:
    f.readline()  # skip the first line
    for line in f:
        key, value = line.split(maxsplit=1)
        records[key] = value.split()
The structure of records would be:
{
    'A': ['1.2595034', '0.82587254', '0.7375044', ... ],
    'B': ['1.2467299', '0.78651106', '0.4702038', ... ]
}
What's happening
With with ... as f we're opening the file within a context manager. This allows us to automatically close the file when the block finishes.
Because the open file keeps track of where it is in the file, we can use f.readline() to move the pointer down a line.
line.split() allows you to turn a string into a list of strings. With the maxsplit=1 argument it will only split on the first space.
e.g. after x, y = 'foo bar baz'.split(maxsplit=1), x == 'foo' and y == 'bar baz'
If I understood correctly, you want the numbers to be collected in a list. One way of doing this is:
import string
text = '''
29 16
A 1.2595034 0.82587254 0.7375044 1.1270138 -0.35065323 0.55985355 0.7200067 -0.889543 0.2300735 0.56767654 0.2789483 0.32296127 -0.6423197 0.26456305 -0.07363393 -1.0788593
B 1.2467299 0.78651106 0.4702038 1.204216 -0.5282698 0.13987103 0.5911153 -0.6729466 0.377103 0.34090135 0.3052503 0.028784657 -0.39129165 0.079238065 -0.29310825 -0.99383247
'''
lines = text.split('\n')
x = [
    line[1:].strip().split()
    for i, line in enumerate(lines)
    if line and line[0].lower() in string.ascii_letters]
This will produce a list of lists: the outer list has one entry per letter line (A, B, etc.) and the inner lists contain the numbers associated to A, B, etc.
This code assumes that you are interested in lines starting with any single letter (case-insensitive).
For more elaborated conditions you may want to look into regular expressions.
Obviously, if your text is in a file, you could substitute lines = ... with:
with open(filepath, 'r') as lines:
    x = ...
Also, if the items in x should not be separated, but rather kept in a single string, you may want to replace line[1:].strip().split() with line[1:].strip().
Instead, if you want the numbers as float and not string, you should replace line[1:].strip().split() with [float(value) for value in line[1:].strip().split()].
EDIT:
Alternatively to line[1:].strip().split() you may want to do:
line.split(maxsplit=1)[1].split()
as suggested in some other answer. This would generalize better if the first token is not a single character.
I have two CSV files that I'm trying to compare. I've read them using csv.DictReader, so now I have dictionaries (one per row) from the two CSV files. I want to compare them: when two elements (those with headers h1 and h2) are the same, compare those dictionaries and print out the differences with respect to the second dictionary. Here are sample csv files.
csv1:
h1,h2,h3
aaa,g0,74
bjg,73,kg9
CSV_new:
h1,h2,h3,h4
aaa,g0,7,
bjg,73,kg9,ahf
I want the output to be something like this (though not exactly as shown below); I want it to be able to print out the modifications, additions and deletions in each dictionary with respect to CSV_new:
{h1:'aaa', h2:'g0' {h3:'74', h4:''}}
{h1:'bjg', h2:'73' {h4:''}}
My code, which is not well-developed enough yet:
import csv

f1 = "csv1.csv"
reader1 = csv.DictReader(open(f1), delimiter=",")
for row1 in reader1:
    row1['h1']
    #['%s:%s' % (f, row[f]) for f in reader.fieldnames]

f2 = "CSV_new.csv"
reader2 = csv.DictReader(open(f2), delimiter=",")
for row2 in reader2:
    row2['h1']
    if row1['h1'] == row2['h1']:
        print row1, row2
If you just want to find the difference, you can use difflib.
As an example:
import difflib

fo1 = open('csv1.csv')
fo2 = open('CSV_new.csv')

diff = difflib.ndiff(fo1.readlines(), fo2.readlines())
Then you can write the difference out however you want.
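For instance, a minimal sketch that writes only the differing lines to a file (ndiff prefixes lines unique to either file with '- ' or '+ '):

with open('diff.txt', 'w') as out:
    for line in diff:
        if line.startswith(('- ', '+ ')):
            out.write(line)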
This could be what you are looking for, but as mentioned above there is some ambiguity in your description.
import csv
import sys

with open(A) as fd1, open(B) as fd2:  # A, B are the two csv file names
    a, b = csv.reader(fd1), csv.reader(fd2)
    ha, hb = next(a), next(b)
    if not set(ha).issubset(set(hb)):
        sys.exit(1)
    lookup = {label: (key, hb.index(label)) for key, label in enumerate(ha)}
    for rowa, rowb in zip(a, b):
        for key in lookup:
            index_a, index_b = lookup[key]
            if rowa[index_a] != rowb[index_b]:
                print(rowb)
                break
I have some values that I want to write in a text file with the constraint that each value has to go to a particular column of each line.
For example, let's say that I have values = [a, b, c, d] and I want to write them in a line so that a is written in the 10th column of the line, b in the 25th, c in the 34th, and d in the 48th.
How would I do this in python?
Does python have something like column.insert(10, a)? It would make my life way easier.
I appreciate your help.
In this case, I'd think you'd just use the padding functions with python's string formatting syntax.
Something like "%10d%15d%9d%14d" % tuple(values) will place the right-most digit of a, b, c, d on the columns you listed.
If you want to have the left-most digits placed there, then you could use: "%<15d%<9d%<14d%d"%values, and prepend 10 spaces.
EDIT: For some reason I'm having trouble with the above syntax... so I used the newstyle formatting syntax like so:
" "*9 + "{:<14}{:<9}{:<14}{}".format(*values)
This should print, for values=[20,30,403,50]:
......... <-- from " "*9
20............ <-- {:<14}
30....... <-- {:<9}
403........... <-- {:<14}
50 <-- {}
----=----1----=----2----=----3----=----4----=----5 <-- guide
20 30 403 50 <-- Actual output, all together
class ColumnWriter(object):
    def __init__(self, columns):
        columns = (-1, ) + tuple(columns)
        widths = (c2 - c1 for c1, c2 in zip(columns, columns[1:]))
        format_codes = ("{" + str(i) + ":>" + str(width) + "}"
                        for i, width in enumerate(widths))
        self.format_string = ''.join(format_codes)

    def get_row(self, values):
        return self.format_string.format(*values)

cw = ColumnWriter((1, 20, 21))
print cw.get_row((1, 2, 3))
print cw.get_row((1, 'a', 'a'))
If you need the columns to vary from row to row, then you can do one-liners:
import itertools

for columns in itertools.combinations(range(10), 3):
    print ColumnWriter(columns).get_row(('.', '.', '.'))
It slacks on the error checking. It needs to check that columns is sorted and that len(values) == len(columns).
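A sketch of what those checks might look like, as a hypothetical subclass (the names here are mine, not part of the original class):

class CheckedColumnWriter(ColumnWriter):
    def __init__(self, columns):
        if list(columns) != sorted(columns):
            raise ValueError('columns must be in ascending order')
        super(CheckedColumnWriter, self).__init__(columns)
        self._num_columns = len(columns)

    def get_row(self, values):
        if len(values) != self._num_columns:
            raise ValueError('expected %d values' % self._num_columns)
        return super(CheckedColumnWriter, self).get_row(values)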
It has problems with the value being longer than the area being allocated to hold it but I'm not sure what to do about this. Currently if that occurs, it overwrites the previous column. example:
print ColumnWriter((1, 2, 3)).get_row((1, 1, 'aa'))
If you had an iterable of rows that you wanted to write to a file, you could do something like this:
rows = [(1, 3, 4), ('a', 'b', 4), ['foo', 'ten', 'mongoose']]
format = ColumnWriter((20, 30, 50)).get_row
with open(filename, 'w') as fout:
    fout.write("\n".join(format(row) for row in rows))
You can use the mmap module to memory-map a file.
http://docs.python.org/library/mmap.html
With mmap you can do something like this:
import mmap

fh = file('your_file', 'wb')
map = mmap.mmap(fh.fileno(), <length of the file you want to create>)
map[10] = a
map[25] = b
Not sure if that is what you're looking for, but it might work :)
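In case it helps, here is a runnable sketch of that idea in current Python (the 60-byte file size is my own assumption; the file must be pre-sized to fit all columns, and the values must be bytes):

import mmap

# pre-size the file with spaces so the map has room
with open('your_file', 'wb') as fh:
    fh.write(b' ' * 60)

with open('your_file', 'r+b') as fh:
    mm = mmap.mmap(fh.fileno(), 0)  # map the whole file
    mm[10:11] = b'a'  # byte at column 10
    mm[25:26] = b'b'  # byte at column 25
    mm.close()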
It seems I might have misunderstood the question. The old answer is below.
Perhaps you're looking for the csv module?
http://docs.python.org/library/csv.html
import csv
fh = open('eggs.csv', 'wb')
spamWriter = csv.writer(fh, delimiter=' ')
spamWriter.writerow(['Spam'] * 5 + ['Baked Beans'])
spamWriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])