Output python into python-readable format - python

We're using a python based application which reads a configuration file containing a couple of arrays:
Example layout of config file:
array1 = [
    'bob',
    'sue',
    'jayne'
]
Currently, changes to the configuration are made by hand, but I've written a little interface to streamline the process (mainly to avoid errors).
It reads in the existing configuration using a simple "import". What I'm not sure how to do is get my script to write its output as valid Python, so that the main application can read it again.
How can I dump the array back into the file as valid Python?
Cheers!

I'd suggest JSON or YAML (Less verbose than JSON) for configuration files. That way, the configuration file becomes more readable for the less pythonate ;) It's also easier to throw adequate errors, e.g. if the configuration is incomplete.
To save python objects you can always use pickle.
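A minimal sketch of the JSON route, assuming the configuration holds only lists, strings, and numbers. The file name config.json and the key array2 are made up for illustration.

```python
import json
import os
import tempfile

config = {"array1": ["bob", "sue", "jayne"]}
config_path = os.path.join(tempfile.mkdtemp(), "config.json")

# Write the configuration in an indented, hand-editable form
with open(config_path, "w") as f:
    json.dump(config, f, indent=4)

# Read it back as ordinary Python lists and dicts
with open(config_path) as f:
    loaded = json.load(f)

# An incomplete configuration can be reported by name
missing = [key for key in ("array1", "array2") if key not in loaded]
```

Here missing lists the expected keys that the file does not define, which is the kind of error reporting the answer has in mind.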

Generally, repr() will create a string that can be re-evaluated, but pprint produces slightly nicer output.
from pprint import pprint
outf.write("array1 = "); pprint(array1, outf)

Writing repr(array1) into the file would be a very simple solution, but it should work here.
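The round trip could be sketched as follows, assuming the configuration contains only plain literals. The helper names write_config and read_config are hypothetical, and runpy.run_path stands in for the application's own import so the example can run against a temporary file.

```python
import os
import runpy
import tempfile
from pprint import pformat


def write_config(path, **arrays):
    """Write each keyword argument as a `name = literal` assignment."""
    with open(path, "w") as outf:
        for name, value in arrays.items():
            # pformat returns the same text that pprint would print
            outf.write("%s = %s\n" % (name, pformat(value)))


def read_config(path):
    """Execute the config file and return its top-level names as a dict."""
    namespace = runpy.run_path(path)
    return {k: v for k, v in namespace.items() if not k.startswith("__")}


# Round trip through a temporary file
config_path = os.path.join(tempfile.mkdtemp(), "config.py")
write_config(config_path, array1=["bob", "sue", "jayne"])
loaded = read_config(config_path)
```

Because the file is ordinary Python, the main application can keep importing it unchanged.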

Related

How to read and change BAM files from a Python script?

I'm planning on using a Python script to change different BAM (Binary Alignment Map) file headers. Right now I am just testing the output of one bam file but every time I want to check my output, the stdout is not human readable. How can I see the output of my script? Should I use samtools view bam.file on my script? This is my code.
#!/usr/bin/env python
import os
import subprocess
if __name__=='__main__':
    for file in os.listdir(os.getcwd()):
        if file == "SRR4209928.bam":
            with open("SRR4209928.bam", "r") as input:
                content = input.readlines()
                for line in content:
                    print(line)
Since BAM is a binary type of SAM, you will need to write something that knows how to deal with the compressed data before you can extract something meaningful from it. Unfortunately, you can't just open() and readlines() from that type of file.
If you are going to write a module yourself, you will need to read the Sequence Alignment/Map Format Specification.
Fortunately someone already did that and created a Python module: You can go ahead and check pysam out. It will surely make your life easier.
I hope it helps.
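For a quick look at the header without extra packages, here is a standard-library sketch. It assumes the layout given in the SAM/BAM specification: BGZF compression (readable as gzip), the magic bytes BAM\1, a 32-bit little-endian length, then the header text. The function name read_bam_header_text is made up for illustration, and this is not a substitute for pysam when it comes to editing records.

```python
import gzip
import struct


def read_bam_header_text(path):
    """Return the plain-text SAM header stored at the start of a BAM file."""
    # BGZF blocks are valid gzip members, so gzip can decompress them
    with gzip.open(path, "rb") as bam:
        magic = bam.read(4)
        if magic != b"BAM\x01":
            raise ValueError("not a BAM file: bad magic %r" % magic)
        # l_text: little-endian 32-bit length of the header text
        (l_text,) = struct.unpack("<I", bam.read(4))
        text = bam.read(l_text)
    # The text may be padded with NUL bytes
    return text.rstrip(b"\x00").decode("utf-8")
```

Calling print(read_bam_header_text("SRR4209928.bam")) would then show the @HD, @SQ and similar header lines in readable form.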

Can I read and write file in one line with Python?

With Ruby I can do this:
File.open('yyy.mp4', 'w') { |f| f.write(File.read('xxx.mp4')) }
Can I do this using Python ?
Sure you can:
with open('yyy.mp4', 'wb') as f:
    f.write(open('xxx.mp4', 'rb').read())
Note the binary mode flag (b): since you are copying mp4 contents, you don't want Python to reinterpret newlines for you.
That'll take a lot of memory if xxx.mp4 is large. Take a look at the shutil.copyfile function for a more memory-efficient option:
import shutil
shutil.copyfile('xxx.mp4', 'yyy.mp4')
Python is not about writing ugly one-liner code.
Check the documentation of the shutil module - in particular the copyfile() method.
http://docs.python.org/library/shutil.html
You want to copy a file, so do not manually read and then write bytes. Use the file copy functions, which are generally better and more efficient in this simple case.
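To illustrate the memory point, here is a sketch of a chunked copy built on shutil.copyfileobj. The helper name copy_in_chunks and the 1 MiB chunk size are arbitrary choices for the example.

```python
import shutil


def copy_in_chunks(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path, reading about chunk_size bytes at a time."""
    # Binary mode on both sides so the bytes are copied untouched
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
```

For a plain file-to-file copy, shutil.copyfile('xxx.mp4', 'yyy.mp4') remains the shorter call.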
If you want a true one-liner, you can replace line breaks with semicolons:
import shutil; shutil.copyfile("xxx.mp4","yyy.mp4")
Avoid this! I did it once to work around an extremely specific problem that had nothing to do with Python itself: it was caused by line breaks in my python -c "Put 🐍️ code here" command line and the way Meson handles them.

How to write input files with comments in Python

I need to write some input data files for a python program, and I need the full thing:
comments, spacing, variable = value, etc.
Is there any library for this (like argparse for command-line arguments), or should I write my own?
Thanks!
Take a look at the ConfigParser module (renamed to configparser in Python 3).
Alternatively, you could simply write the input data using Python syntax, and import the result into your main program.
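A small sketch of the configparser route on Python 3. The section and option names (run, steps, dt, output, name) are invented for the example.

```python
import configparser

SAMPLE = """\
# Simulation input (lines starting with # or ; are comments)
[run]
steps = 100
dt = 0.01

[output]
name = result.dat
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# Typed getters convert the stored strings
steps = parser.getint("run", "steps")
dt = parser.getfloat("run", "dt")
name = parser.get("output", "name")
```

Comments are skipped when reading. Note that they are not kept if the parser later writes the file back out, so this suits input files that are edited by hand.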
Does the syntax have to look like that? Could you just use a character delimited file (like csv or tab-delimited) with each predefined field in a separate column? Python has well defined modules to handle csv data.
If you specifically want input files that present blocks of code, then aix's suggestion of importing would also work.

create a tar file in a string using python

I need to generate a tar file, but as a string in memory rather than as an actual file. What I have as input is a single filename and a string containing the associated contents. I'm looking for a Python lib I can use to avoid having to roll my own.
A little more work turned up these functions, but using a memory stream object seems a little... inelegant. And making it accept input from strings looks even more... inelegant. OTOH, it works. I assume so, anyway, as most of it is new to me. Does anyone see any bugs in it?
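Use tarfile in conjunction with cStringIO (Python 2; in Python 3, io.BytesIO plays the same role):
import cStringIO
import tarfile
c = cStringIO.StringIO()
t = tarfile.open(mode='w', fileobj=c)
# here: do your work on t, then...:
t.close()  # finish the archive before reading the buffer
s = c.getvalue() # extract the bytestring you need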
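On Python 3 the same idea could be sketched with io.BytesIO, building the single archive member directly from a string. The function name tar_bytes and the UTF-8 encoding are assumptions made for the example.

```python
import io
import tarfile


def tar_bytes(filename, contents):
    """Return an uncompressed tar archive holding one file, as bytes."""
    data = contents.encode("utf-8")
    buffer = io.BytesIO()
    with tarfile.open(mode="w", fileobj=buffer) as archive:
        info = tarfile.TarInfo(name=filename)
        info.size = len(data)  # addfile reads exactly this many bytes
        archive.addfile(info, io.BytesIO(data))
    # Leaving the with block closes the archive and writes its end marker
    return buffer.getvalue()
```

tarfile.TarInfo plus addfile avoids touching the filesystem, which is what makes the string input workable.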

Embed pickle (or arbitrary) data in python script

In Perl, the interpreter kind of stops when it encounters a line with
__END__
in it. This is often used to embed arbitrary data at the end of a perl script. In this way the perl script can fetch and store data that it stores 'in itself', which allows for quite nice opportunities.
In my case I have a pickled object that I want to store somewhere. While I can use a file.pickle file just fine, I was looking for a more compact approach (to distribute the script more easily).
Is there a mechanism that allows for embedding arbitrary data inside a python script somehow?
With pickle you can also work directly on strings.
s = pickle.dumps(obj)
pickle.loads(s)
If you combine that with """ (triple-quoted strings) you can easily store any pickled data in your file.
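If the data is not particularly large (no more than a few kilobytes), I would just .encode('base64') it and include that in a triple-quoted string, with .decode('base64') to get back the binary data, and a pickle.loads() call around it. The 'base64' string codec is a Python 2 idiom; in Python 3, use base64.b64encode() and base64.b64decode() instead.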
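A Python 3 sketch of this approach, using the base64 module. The helper names to_embeddable and from_embedded are hypothetical, and the EMBEDDED string is generated here only so the example runs on its own.

```python
import base64
import pickle
import textwrap


def to_embeddable(obj):
    """Pickle obj and return base64 text wrapped at 76 columns."""
    encoded = base64.b64encode(pickle.dumps(obj)).decode("ascii")
    return "\n".join(textwrap.wrap(encoded, 76))


def from_embedded(text):
    """Inverse of to_embeddable; whitespace and newlines are ignored."""
    return pickle.loads(base64.b64decode("".join(text.split())))


# In a real script this string would be pasted in as a literal
EMBEDDED = """
%s
""" % to_embeddable({"names": ["bob", "sue", "jayne"], "count": 3})

restored = from_embedded(EMBEDDED)
```

Wrapping the base64 text keeps the embedded block readable in an editor, and stripping whitespace on the way back makes the wrapping harmless.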
In Python, you can use """ (triple-quoted strings) to embed long runs of text data in your program.
In your case, however, don't waste time on this.
If you have an object you've pickled, you'd be much, much happier dumping that object as Python source and simply including the source.
The repr function, applied to most objects, will emit a Python source-code version of the object. If you implement __repr__ for all of your custom classes, you can trivially dump your structure as Python source.
If, on the other hand, your pickled structure started out as Python code, just leave it as Python code.
I wrote this code. You run something like python comp.py foofile.tar.gz, and it creates decomp.py with foofile.tar.gz's contents embedded in it. I don't think it is really portable to Windows, though, because of the Popen calls.
import base64
import sys
import subprocess
inf = open(sys.argv[1],"r+b").read()
outs = base64.b64encode(inf)
decomppy = '''#!/usr/bin/python
import base64
def decomp(data):
    fname = "%s"
    outf = open(fname,"w+b")
    outf.write(base64.b64decode(data))
    outf.close()
    # You can put the rest of your code here.
    #Like this, to unzip an archive
    #import subprocess
    #subprocess.Popen("tar xzf " + fname, shell=True)
    #subprocess.Popen("rm " + fname, shell=True)
''' %(sys.argv[1])
taildata = '''uudata = """%s"""
decomp(uudata)
''' %(outs)
outpy = open("decomp.py","w+b")
outpy.write(decomppy)
outpy.write(taildata)
outpy.close()
subprocess.Popen("chmod +x decomp.py",shell=True)
