I need to write some input data files for a python program, and I need the full thing:
comments, spacing, variable = value, etc.
Is there a library for this in Python (like argparse for command-line arguments), or should I write my own?
Thanks!
Take a look at the ConfigParser module (renamed to configparser in Python 3).
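As a rough illustration of that suggestion, here is a minimal Python 3 sketch. The section name `run` and the keys are made up for the example. Note that configparser expects at least one `[section]` header and treats lines starting with `#` or `;` as comments.

```python
import configparser

# Hypothetical input file contents; section and key names are illustrative only.
SAMPLE = """\
# Simulation settings (full-line comments start with # or ;)
[run]
steps = 1000
dt = 0.01
label = test run
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)  # use parser.read("input.cfg") for a file on disk

# Values are stored as strings; the typed getters convert them.
steps = parser.getint("run", "steps")
dt = parser.getfloat("run", "dt")
label = parser.get("run", "label")

print(steps, dt, label)
```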
Alternatively, you could simply write the input data using Python syntax, and import the result into your main program.
Does the syntax have to look like that? Could you just use a character delimited file (like csv or tab-delimited) with each predefined field in a separate column? Python has well defined modules to handle csv data.
If you specifically want input files that present blocks of code, then aix's suggestion of importing would also work.
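If a delimited file is acceptable, a sketch along these lines may be enough. The column names `name` and `value` are assumptions for the example, not something from the question.

```python
import csv
import io

# Hypothetical tab-delimited input: one header row, then one setting per row.
SAMPLE = "name\tvalue\nsteps\t1000\ndt\t0.01\n"

# io.StringIO stands in for an open file; with a real file,
# use open(path, newline="") instead.
reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
settings = {row["name"]: row["value"] for row in reader}

print(settings)
```

All values come back as strings, so any numeric conversion is up to the caller.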
Someone created a C module for Python with Nuitka. (The original Python code is not available; the module is already compiled, so it is a machine binary file.) I would like to use the code within another tool, which only accepts Python files. So I would like to include the C code in Python.
To get more specific: So far I have the files thatmodule.pyi and a thatmodule.so. I can include them into my current Python code simply by running import thatmodule inside mymodule.py. Now I only want one single Python file mymodule.py.
My current idea is to copy the code from thatmodule.pyi to the beginning of mymodule.py and to convert thatmodule.so to a binary string with
with open('thatmodule.so', mode='rb') as file:
    fileContent = file.read()
... missing ... how to convert fileContent to b'string'...
and put this binary string into mymodule.py. And then I have to execute this binary string from within my python module mymodule.py. How can I do this?
You'll have to write it out to a file (and presumably the .pyi too) and then use python's importlib to dynamically import it.
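A minimal sketch of that approach, assuming the compiled module's bytes are already embedded in the script (for example as a bytes literal produced by `repr(fileContent)`). The function name `import_from_bytes` is hypothetical. The loader is chosen from the file suffix, so the same code can be tried out with a plain `.py` payload.

```python
import importlib.util
import os
import sys
import tempfile

def import_from_bytes(name, data, suffix=".so"):
    """Write embedded module bytes to a temp file and import them.

    `name` must match the name the module was compiled under,
    e.g. "thatmodule". (Hypothetical helper, not a stdlib API.)
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, name + suffix)
    with open(path, "wb") as f:
        f.write(data)
    # spec_from_file_location selects a loader based on the file suffix.
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module

# Usage (THATMODULE_BYTES would be the embedded binary string):
# thatmodule = import_from_bytes("thatmodule", THATMODULE_BYTES)
```

The temporary file is left on disk in this sketch, since a loaded shared library generally cannot be deleted safely while it is in use.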
If you have documentation that describes the functions in thatmodule.so, you can use the following:
import ctypes
mylib = ctypes.CDLL("thatmodule.so")
See the ctypes documentation.
I'm planning on using a Python script to change different BAM (Binary Alignment Map) file headers. Right now I am just testing the output of one bam file but every time I want to check my output, the stdout is not human readable. How can I see the output of my script? Should I use samtools view bam.file on my script? This is my code.
#!/usr/bin/env python
import os
import subprocess
if __name__=='__main__':
    for file in os.listdir(os.getcwd()):
        if file == "SRR4209928.bam":
            with open("SRR4209928.bam", "r") as input:
                content = input.readlines()
                for line in content:
                    print(line)
Since BAM is a binary type of SAM, you will need to write something that knows how to deal with the compressed data before you can extract something meaningful from it. Unfortunately, you can't just open() and readlines() from that type of file.
If you are going to write a module yourself, you will need to read the Sequence Alignment/Map Format Specification.
Fortunately someone already did that and created a Python module: You can go ahead and check pysam out. It will surely make your life easier.
I hope it helps.
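To show why plain open() and readlines() give unreadable output, here is a standard-library sketch that reads only the header text. It assumes the layout given in the SAM/BAM specification: the file is BGZF-compressed (readable with gzip), and the decompressed stream starts with the magic bytes `BAM\1`, then a little-endian 32-bit header length, then the header text. The function name is hypothetical. For real work pysam remains the more robust choice.

```python
import gzip
import struct

def read_bam_header_text(fileobj):
    """Return the SAM-style header text from a binary BAM file object."""
    # BGZF is a series of gzip blocks, so the gzip module can decompress it.
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as bam:
        magic = bam.read(4)
        if magic != b"BAM\x01":
            raise ValueError("not a BAM file (bad magic bytes)")
        (l_text,) = struct.unpack("<I", bam.read(4))  # header text length
        text = bam.read(l_text)
    # The text may be NUL-padded, so strip trailing NUL bytes.
    return text.rstrip(b"\x00").decode("utf-8", errors="replace")

# Usage:
# with open("SRR4209928.bam", "rb") as f:
#     print(read_bam_header_text(f))
```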
I was actually using code from a course at Udacity.com on Data Wrangling. The code file is very short so I was able to copy what they did and I still get an error. They use python 2.7.x. The course is about a year old, so maybe something about the functions or modules in the 2.7 branch has changed. I mean the code used by the instructors works.
I know that using the csv module or function would solve the issue but they want to demonstrate the use of a custom parse function. In addition, they are using the enumerate function. Here is the link to the gist.
This should be very simple and basic and that is why it is frustrating me. I know they are reading the file, which is a csv file, as binary, with the "rb" parameter to the line
with open("file.csv", "rb") as f:
You don't have matching characters in your csv file and the dictionaries in your test function. In particular, in your csv file you are using an em dash (U+2014) and in your firstline and tenthline dictionaries you are using a hyphen-minus (U+002D).
hex(ord(d[0]['US Chart Position'].decode('utf-8')))
'0x2014' # output: code point for the em dash character in csv file
hex(ord(firstline['US Chart Position']))
'0x2d' # output: code point for hyphen-minus
To fix it, just copy and paste the — character from the csv in your gist into the dictionaries in your source code to replace the - characters.
Make sure to include this comment at the top of your file:
# -*- coding: utf-8 -*-
This will ensure that Python knows to expect non-ascii characters in the source code.
Alternatively, you could replace all the — (em dash) characters in the csv file with hyphens:
sed 's/—/-/g' beatles-diskography.csv > beatles-diskography2.csv
Then, remember to use the new file name in your source code.
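The same check and replacement can be sketched in Python 3, where strings are already Unicode and no decode step is needed. The helper name `normalize_dashes` is made up for this example.

```python
EM_DASH = "\u2014"  # the character used in the csv file
HYPHEN = "-"        # hyphen-minus, U+002D, used in the test dictionaries

def normalize_dashes(text):
    """Replace every em dash with a plain hyphen-minus."""
    return text.replace(EM_DASH, HYPHEN)

print(hex(ord(EM_DASH)))  # 0x2014
print(hex(ord(HYPHEN)))   # 0x2d
print(normalize_dashes("1" + EM_DASH + "2"))  # 1-2
```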
I'm trying to run an external program from a Python script.
After searching and reading multiple posts here, I came to what seemed to be the solution.
First, I used subprocess.call function.
If I build the command this way:
hmmer1=subprocess.call("D:\Python_Scripts\HMMer3\hmmsearch.exe --tblout hmmTestTab.out SDHA.hmm Test.fasta")
The external program D:\Python_Scripts\HMMer3\hmmsearch.exe is run taking hmmTestTab.out as file name for the output and SDHA.hmm and Test.fasta as input files.
Nevertheless, if I try to replace the file names with the variables outfile, hmmprofile and fastafile (I intend to receive those variables as arguments for the Python script and use them to build the external program call),
hmmer2=subprocess.call("D:\Python_Scripts\HMMer3\hmmsearch.exe --tblout outfile hmmprofile fastafile")
Python prints an error about being unable to open the input files.
I also used "Popen" function with analogous results:
This call works
hmmer3=Popen(['D:\Python_Scripts\HMMer3\hmmsearch.exe', '--tblout','hmmTestTab.out', 'SDHA.hmm','Test.fasta'])
and this one doesn't
hmmer4=Popen(['D:\Python_Scripts\HMMer3\hmmsearch.exe', '--tblout','outfile', 'hmmprofile','fastafile'])
As a result, I presume I need to understand how to interpolate the variables into the call, because that seems to be where the problem is.
Would any of you help me with this issue?
Thanks in advance
You have:
hmmer4=Popen(['D:\Python_Scripts\HMMer3\hmmsearch.exe', '--tblout','outfile', 'hmmprofile','fastafile'])
But that's not passing the variable outfile. It's passing a string, 'outfile'.
You want:
hmmer4=Popen(['D:\Python_Scripts\HMMer3\hmmsearch.exe', '--tblout', outfile, hmmprofile, fastafile])
And the other answer is correct, though it addresses a different problem; you should double the backslashes, or use r'' raw strings.
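Putting both points together, a small sketch: the variables are passed as list elements rather than quoted names, and the Windows path is a raw string. The helper `build_command` is hypothetical, and the actual call is left commented out because it needs hmmsearch.exe to be present.

```python
import subprocess

def build_command(exe, outfile, hmmprofile, fastafile):
    """Build the argument list; each variable becomes its own list element."""
    return [exe, "--tblout", outfile, hmmprofile, fastafile]

# In the real script these values would come from sys.argv.
cmd = build_command(
    r"D:\Python_Scripts\HMMer3\hmmsearch.exe",  # raw string keeps backslashes literal
    "hmmTestTab.out",
    "SDHA.hmm",
    "Test.fasta",
)
print(cmd)

# returncode = subprocess.call(cmd)  # run it once the executable is available
```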
Try to change this:
hmmer1=subprocess.call("D:\Python_Scripts\HMMer3\hmmsearch.exe"
to
hmmer1=subprocess.call('D:\\Python_Scripts\\HMMer3\\hmmsearch.exe'
Edit
argv = ['--tblout', outfile, hmmprofile, fastafile]  # your arguments, passed as variables
program = [r'D:\Python_Scripts\HMMer3\hmmsearch.exe'] + argv  # raw string: backslashes stay literal
subprocess.call(program)
I need to generate a tar file, but as a string in memory rather than as an actual file. What I have as input is a single filename and a string containing the associated contents. I'm looking for a Python library I can use so that I don't have to roll my own.
A little more work turned up these functions, but using a memory stream object seems a little inelegant, and making it accept input from strings looks even more so. On the other hand, it works, as far as I can tell, since most of this is new to me. Does anyone see any bugs in it?
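Use tarfile in conjunction with cStringIO (io.BytesIO in Python 3):
import cStringIO
import tarfile
c = cStringIO.StringIO()
t = tarfile.open(mode='w', fileobj=c)
# here: do your work on t, then close it so the end-of-archive blocks are written:
t.close()
s = c.getvalue() # extract the bytestring you need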
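For the specific case in the question (one filename plus a string of contents), a Python 3 sketch could look like this. The function name `tar_bytes_from_string` is made up, and UTF-8 encoding of the contents is an assumption.

```python
import io
import tarfile

def tar_bytes_from_string(filename, contents):
    """Return an uncompressed tar archive, as bytes, holding one file."""
    data = contents.encode("utf-8")  # assumed encoding
    buf = io.BytesIO()
    # Leaving the with-block closes the archive and writes its trailer.
    with tarfile.open(mode="w", fileobj=buf) as tar:
        info = tarfile.TarInfo(name=filename)
        info.size = len(data)  # size must be set before addfile()
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

archive = tar_bytes_from_string("hello.txt", "hello, world\n")
print(len(archive))
```

Reading the archive back with `tarfile.open(fileobj=io.BytesIO(archive))` is a quick way to confirm the member name and contents.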