python file I/O with binary data - python

I'm extracting jpeg type bits from mp3 data actually it will be album arts.
I thought about using library called mutagen, but I'd like to try with bits for some practice purpose.
import os
import sys
import re
f = open(sys.argv[1], "rb")
#sys.argv[1] gets mp3 file name ex) test1.mp3
saver = ""
for value in f:
for i in value:
hexval = hex(ord(i))[2:]
if (ord(i) == 0):
saver += "00" #to match with hex form
else:
saver += hexval
header = "ffd8"
tail = "ffd9"
this part of code is to get mp3 as bit form, and then transform it into hex
and find jpeg trailers which starts as "ffd8" and ends with "ffd9"
frontmatch = re.search(header,saver)
endmatch = re.search(tail, saver)
startIndex = frontmatch.start()
endIndex = endmatch.end()
jpgcontents = saver[startIndex:endIndex]
scale = 16 # equals to hexadecimal
numbits = len(jpgcontents) * 4 #log2(scale)
bitcontents = bin(int(jpgcontents, scale))[2:].zfill(numbits)
and here, I get the bits between the header and tail and transform it into
binary form. Which supposed to be the jpg part of the mp3 files.
txtfile = open(sys.argv[1] + "_tr.jpg", "w")
txtfile.write(bitcontents)
and I wrote the bin to the new file with writing type as jpg.
sorry for my wrong naming as txtfile.
But these codes gave the error which is
Error interpreting JPEG image file
(Not a JPEG file: starts with 0x31 0x31)
I'm not sure whether the bits I extracted are wrong or writing to the file
step is wrong. Or there might be other problem in code.
I'm working in linux version with python 2.6. Is there anything wrong with
just writing str type of bin data as JPG?

You are creating a string of ASCII zeroes and ones, i.e. \x30 and \x31, but the JPEG file needs to be proper binary data. So where your file should have a single byte of (for example) \xd8 you instead have these eight bytes: 11011000, or \x31\x31\x30\x31\x31\x30\x30\x30.
You don't need to do all that messy conversion stuff. You can just search directly for the desired byte patterns, writing them using \x hex escape sequences. And you don't even need regex: the simple string .index or .find methods can do this easily and quickly.
with open(fname, 'rb') as f:
data = f.read()
header = "\xff\xd8"
tail = "\xff\xd9"
try:
start = data.index(header)
end = data.index(tail, start) + 2
except ValueError:
print "Can't find JPEG data!"
exit()
print 'Start: %d End: %d Size: %d' % (start, end, end - start)
with open(fname + "_tr.jpg", 'wb') as f:
f.write(data[start:end])
(Tested on Python 2.6.6)
However, extracting embedded JPEG data like this isn't foolproof, since it's possible that those header and tail byte sequences exist in the MP3 sound data.
FWIW, a simpler way to translate binary data to hex strings and back is to use hexlify and unhexlify from the binascii module.
Here are some examples of doing these transformations, both with and without the binascii functions.
from binascii import hexlify, unhexlify
#Create a string of all possible byte values
allbytes = ''.join([chr(i) for i in xrange(256)])
print 'allbytes'
print repr(allbytes)
print '\nhex list'
print [hex(ord(v))[2:].zfill(2) for v in allbytes]
hexstr = hexlify(allbytes)
print '\nhex string'
print hexstr
newbytes = ''.join([chr(int(hexstr[i:i+2], 16)) for i in xrange(0, len(hexstr), 2)])
print '\nNew bytes'
print repr(newbytes)
print '\nUsing unhexlify'
print repr(unhexlify(hexstr))
output
allbytes
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?#ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
hex list
['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '0a', '0b', '0c', '0d', '0e', '0f', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '1a', '1b', '1c', '1d', '1e', '1f', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '2a', '2b', '2c', '2d', '2e', '2f', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '3a', '3b', '3c', '3d', '3e', '3f', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '4a', '4b', '4c', '4d', '4e', '4f', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '5a', '5b', '5c', '5d', '5e', '5f', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '6a', '6b', '6c', '6d', '6e', '6f', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '7a', '7b', '7c', '7d', '7e', '7f', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '8a', '8b', '8c', '8d', '8e', '8f', '90', '91', '92', '93', '94', '95', '96', '97', '98', '99', '9a', '9b', '9c', '9d', '9e', '9f', 'a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7', 'a8', 'a9', 'aa', 'ab', 'ac', 'ad', 'ae', 'af', 'b0', 'b1', 'b2', 'b3', 'b4', 'b5', 'b6', 'b7', 'b8', 'b9', 'ba', 'bb', 'bc', 'bd', 'be', 'bf', 'c0', 'c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7', 'c8', 'c9', 'ca', 'cb', 'cc', 'cd', 'ce', 'cf', 'd0', 'd1', 'd2', 'd3', 'd4', 'd5', 'd6', 'd7', 'd8', 'd9', 'da', 'db', 'dc', 'dd', 'de', 'df', 'e0', 'e1', 'e2', 'e3', 'e4', 'e5', 'e6', 'e7', 'e8', 'e9', 'ea', 'eb', 'ec', 'ed', 'ee', 'ef', 'f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9', 'fa', 'fb', 'fc', 'fd', 'fe', 'ff']
hex string
000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff
New bytes
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?#ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
Using unhexlify
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?#ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
Note that this code needs some modifications to run on Python 3 (apart from converting the print statements to print function calls) because plain Python 3 strings are Unicode strings, not byte strings.

You need to write out as binary
Try:
txtfile = open(sys.argv[1] + "_tr.jpg", "wb")

Oups, you are not doing what you expect. the bin generates a string containing the value in binary form. Let's look at what you have, if the content on the input file was :
saver is a string of hexadecimal characters in textual form something like "313233414243" for an initial string of "132ABC"
jpgcontents has same format and starts with "ffd8" and ends with "ffd9"
you then apply the magic formula bin(int(jpgcontents, scale))[2:].zfill(numbits) that
convert the hexa string to a long integer
convert the long integer to a binary representation string - this part would convert hexa "ff" in integer 255 and end in the string "0b11111111"
remove first characters "0b" and fill the end of buffer if needed
bitcontents is then a string starting with "11111111....". Just rename your file with a .txt extension and open it with a text editor, you will see that it is a large file containing only ASCII characters 0 and 1.
As the header is "ffd8" the file will start with 10 "1". So the error that it starts with 0x31 0x31 because 0x31 is the ascii code of "1".
What you need is convert the hexa string jpgcontents in a binary byte array.
fileimage = ''.join([ jpgcontent[i:i+2] for i in range(0, len(jpgcontent), 2]
You can then safely copy the fileimage buffer to a binary file:
file = open(sys.argv[1] + "_tr.jpg", "wb")
file.write(fileimage)

The easiest method is using the binascii module: https://docs.python.org/2/library/binascii.html.
import binascii
# code in ascii format contained in a list
code = ['00', '01', '02', '03', '04', '05', '06', '07', '08', '09']
bfile = open('bfile.bin', 'w')
for c in code:
# convert the ascii to binary and write it to the file
bfile.write(binascii.unhexlify(c))
bfile.close()

Related

How to automate single line code, with changes in input?

I have created a variable named 'j' which has some values and I want my code to pick one value at a time and execute.
I tried writing code but it does not work. I'm sharing my code please see when it can be improved.
j = ['0', '1', '3', '4', '6', '7', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67']
for i in j:
labels('i') = mne.read_labels_from_annot('sub-CC721377_T1w', parc='aparc', subjects_dir=subjects_dir)['i']
done
According to the MNE documentation, the function read_labels_from_annot returns a list of labels.
Thus, instead of indexing the result with ...)[0] at the end, you should just capture the entire list:
labels = mne.read_labels_from_annot(...)
This would capture a list of labels, rather than a single label, which would have the effect of 'indexing at the end "[0]" from 0 - 67'.
You asked about adding all the results together into a label_all variable. You didn't specify (and I don't know anything about the MNE package), so it's not clear: do the labels ever repeat? Is is possible that "lab123" will occur in every input file? If so, should the label_all store multiple copies of the same value, or just the unique label names?
I think something like this is what you're after:
import mne
def get_labels_for_subject(sub, *, hemi='both', parc='aparc', **kwargs):
"""Get MNE labels for a given subject. **kwargs allows passing named
parameters like subjects_dir, regexp, and others that default to None."""
labels = mne.read_labels_from_annot(sub, hemi=hemi, parc=parc, **kwargs)
return labels
# List of all the subjects
subjects = [
'sub-CC721377_T1w',
'sub-next???',
]
label_all = []
for s in subjects:
label_all.extend(get_labels_for_subject(s, subjects_dir='.'))
print("Got labels:", label_all)

Python set remove Non Numeric values

I have set value like below:
set(['Virtual', '120', 'P', '130', '90', '250', '100', '10', 'Mar', 'indicates', '18', '50', '40', '1', '|'])
How do i remove all Non Numeric value?
Output expected:
set(['120', '130', '90', '250', '100', '10','18', '50', '40', '1'])
You can create a new set:
number_set = set()
for object in old_set:
try:
number_set.add(int(object))
except ValueError:
print("Not a number")
print(number_set)
You can also try removing all non-numeric objects from the set:
for object in old_set:
try:
x = int(object)
execpt ValueError:
old_set.remove(object)
You can use a filter to clean your set:
s = set(['Virtual', '120', 'P', '130', '90', '250', '100', '10', 'Mar', 'indicates', '18', '50', '40', '1', '|'])
def isInt(text):
"""Returns True for a text that is convertable to int() else False."""
try:
_ = int(text)
return True
except ValueError:
return False
# apply filter:
filteredSet = set( filter(lambda x:isInt(x), s))
print(filteredSet)
Output:
{'18', '90', '130', '120', '40', '50', '10', '1', '100', '250'}
This output differs from your want's - but thats how python prints a set with print.

Assembly x86 "PSHUFB 128bit" implementation in another language

I was reversing some application and i faced this opcode:
PSHUFB XMM2, XMMWORD_ADDRESS
and i tried implementing the algorithm of this function in python with no success!
The reference of how this opcode should work is here:
http://www.felixcloutier.com/x86/PSHUFB.html
Here is a code snippet:
PSHUFB (with 128 bit operands)
for i = 0 to 15 {
if (SRC[(i * 8)+7] = 1 ) then
DEST[(i*8)+7..(i*8)+0] ← 0;
else
index[3..0] ← SRC[(i*8)+3 .. (i*8)+0]; DEST[(i*8)+7..(i*8)+0] ← DEST[(index*8+7)..(index*8+0)];
endif
}
DEST[VLMAX-1:128] ← 0
im trying to implement the 128 version of this opcode with no success.
Here are the values before and after the function
Before
WINDBG>r xmm2
xmm2= 0 3.78351e-044 6.09194e+027 6.09194e+027
After
WINDBG>r xmm2
xmm2=9.68577e-042 0 4.92279e-029 4.92279e-029
in python you can use 'struct' to change those from float numbers to Hex:
hex(struct.unpack('<I', struct.pack('<f', f))[0])
So i can sort of say those are the hex values of XMM2 before and after the PSHUFB opcode:
Before
xmm2 = 0 0x0000001b 0x6d9d7914 0x6d9d7914
After
xmm2 = 00001b00 00000000 10799d78 10799d78
And most importantly, i almost forgot.. the value of XMMWORD_ADDRESS is:
03 02 01 00 07 06 05 04 0D 0C 0B 0A 09 08 80 80
xmmword 808008090A0B0C0D0405060700010203h
Implementation in Python could be highly appreciated.
Implementation in C could work as well
or maybe some explanation of how the hell it works!
Because i couldnt understand the intel reference
This is the code algorithm i have so far
x = ['00', '00', '00', '00', '00', '00', '00', '1b', '6d', '9d', '79', '14', '6d', '9d', '79', '14']
s = ['03', '02', '01', '00', '07', '06', '05', '04', '0D', '0C', '0B', '0A', '09', '08', '80', '80']
new = []
for i in range(16):
if 0x80 == int(s[i], 16) & 0x80:
print "MSB", s[i]
new.append(0)
else:
print "NOT MSB", s[i]
new.append( x[int(s[i], 16) & 15] )
print x
print new
Where x is the xmm0, and s is the SRC.
the output i get is:
['00', '00', '00', '00', '00', '00', '00', '1b', '6d', '9d', '79',
'14', '6d', '9d', '79', '14']
['00', '00', '00', '00', '1b', '00', '00', '00', '9d', '6d', '14',
'79', '9d', '6d', '00', '00']
where i should get
['00', '00', '1b', '00', '00', '00', '00', '00', '10', '79', '9d',
'78', '10', '79', '9d', '78']
Something else i have noticed right now, in the 'output' i get the hexadecimal number 0x78
Where could it come from?
It works like 16 parallel table lookups, with special handling for indexes that have their top bit set. So for example, it could look like this: (not tested, not Python)
for (int i = 0; i < 16; i++)
new_dest[i] = (src[i] & 0x80) ? 0 : dest[src[i] & 15];
dest = new_dest;
The new_dest there is significant, because it's really 16 parallel assignments, ie read-before-write, the second lookup is not affected by what happened in to the first byte and so on. Intel's code snippet leaves that implicit (or is wrong, depending on how you look at it).

Random data generator mathing a regex in python

In python, I am looking for python code which I can use to create random data matching any regex. For example, if the regex is
\d{1,100}
I want to have a list of random numbers with a random length between 1 and 100 (equally distributed)
There are some 'regex inverters' available (see here) which compute ALL possible matches, which is not what I want, and which is extremely impracticable. The example above, for example, has more then 10^100 possible matches, which never can be stored in a list. I just need a function to return a match by random.
Maybe there is a package already available which can be used to accomplish this? I need a function that creates a matching string for ANY regex, not just the given one or some other, but maybe 100 different regex. I just cannot code them myself, I want the function extract the pattern to return me a matching string.
If the expressions you match do not have any "advanced" features, like look-ahead or look-behind, then you can parse it yourself and build a proper generator
Treat each part of the regex as a function returning something (e.g., between 1 and 100 digits) and glue them together at the top:
import random
from string import digits, uppercase, letters
def joiner(*items):
# actually should return lambda as the other functions
return ''.join(item() for item in items)
def roll(item, n1, n2=None):
n2 = n2 or n1
return lambda: ''.join(item() for _ in xrange(random.randint(n1, n2)))
def rand(collection):
return lambda: random.choice(collection)
# this is a generator for /\d{1,10}:[A-Z]{5}/
print joiner(roll(rand(digits), 1, 10),
rand(':'),
roll(rand(uppercase), 5))
# [A-C]{2}\d{2,20}#\w{10,1000}
print joiner(roll(rand('ABC'), 2),
roll(rand(digits), 2, 20),
rand('#'),
roll(rand(letters), 10, 1000))
Parsing the regex would be another question. So this solution is not universal, but maybe it's sufficient
Two Python libraries can do this: sre-yield and Hypothesis.
sre-yield
sre-yeld will generate all values matching a given regular expression. It uses SRE, Python's default regular expression engine.
For example,
import sre_yield
list(sre_yield.AllStrings('[a-z]oo$'))
['aoo', 'boo', 'coo', 'doo', 'eoo', 'foo', 'goo', 'hoo', 'ioo', 'joo', 'koo', 'loo', 'moo', 'noo', 'ooo', 'poo', 'qoo', 'roo', 'soo', 'too', 'uoo', 'voo', 'woo', 'xoo', 'yoo', 'zoo']
For decimal numbers,
list(sre_yield.AllStrings('\d{1,2}'))
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44', '45', '46', '47', '48', '49', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '80', '81', '82', '83', '84', '85', '86', '87', '88', '89', '90', '91', '92', '93', '94', '95', '96', '97', '98', '99']
Hypothesis
The unit test library Hypothesis will generate random matching examples. It is also built using SRE.
import hypothesis
g=hypothesis.strategies.from_regex(r'^[A-Z][a-z]$')
g.example()
with output such as:
'Gssov', 'Lmsud', 'Ixnoy'
For decimal numbers
d=hypothesis.strategies.from_regex(r'^[0-9]{1,2}$')
will output one or two digit decimal numbers: 65, 7, 67 although not evenly distributed. Using \d yielded unprintable strings.
Note: use begin and end anchors to prevent extraneous characters.
From this answer
You could try using python to call this perl module:
https://metacpan.org/module/String::Random

How to parse/format a string of tuples?

I jus wrote a grade calculating program and am down to the last few lines ... I have this string of tuples :
"('Jetson Elroy', '45', '88', '88', '70', 0.7253846153846155, 'C', 'Not Passible', 0.9385714285714284, 0.6528571428571427, 0.367142857142857), ('Joe Kunzler', '99', '88', '77', '55', 0.7699999999999999, 'C', 'Not Passible', 0.8557142857142858, 0.5700000000000001, 0.28428571428571436), ('Harris Jones', '77', '99', '47', '77', 0.7115384615384616, 'C', 'Not Passible', 0.9642857142857143, 0.6785714285714286, 0.39285714285714285)"
I'd like to make it so each student and their scores are printed into my text file line by line, like :
'Jetson', '45', '88', '88', '70', 0.7253846153846155, 'C', 'Not Passible', 0.9385714285714284, 0.6528571428571427, 0.367142857142857
'Joe', '99', '88', '77', '55', 0.7699999999999999, 'C', 'Not Passible', 0.8557142857142858, 0.5700000000000001, 0.28428571428571436
'Harris Jones', '77', '99', '47', '77', 0.7115384615384616, 'C', 'Not Passible', 0.9642857142857143, 0.6785714285714286, 0.39285714285714285
I tried to do :
m=open('GradeAnalysis.txt', 'r+')
m.write (print i for i in ourlist)
m.close()
But I realize you have to have a string if you want to use m.write(). Then I transferred my list into the string of tuples like I have above. Any pythonic suggestions are appreciated.
join() can be a simple solution:
m.write('\n'.join(i for i in ourstring))
Edit: Since ourstring is not a list of strings but a string, you can use a regular expression to get the strings that represent the tuples
import re
ourstring = "('Jetson Elroy', '45', '88', '88', '70', 0.7253846153846155, 'C', 'Not Passible', 0.9385714285714284, 0.6528571428571427, 0.367142857142857), ('Joe Kunzler', '99', '88', '77', '55', 0.7699999999999999, 'C', 'Not Passible', 0.8557142857142858, 0.5700000000000001, 0.28428571428571436), ('Harris Jones', '77', '99', '47', '77', 0.7115384615384616, 'C', 'Not Passible', 0.9642857142857143, 0.6785714285714286, 0.39285714285714285)"
tuples = re.findall(r'\((.+?)\)', ourstring) # (.+?) to perform a non-greedy match - important!
m.write ('\n'.join(t for t in tuples))
Use either str.join() or the csv module to write your data.
Using ','.join(str(i) for i in ourlist) creates a comma-separated string to write to the file, but since this is really CSV data, the csv module is perhaps better suited.
import csv
with open('GradeAnalysis.txt', 'rb') as out:
writer = csv.writer(out)
for row in list_of_sequences:
out.writerow(row)
where list_of_sequences is then a list containing other lists or tuples; for example:
list_of_sequences = [(1, 2, 3), ('foo', 'bar', 'baz')]

Categories

Resources