Perl or Python SVN Crawler

Is there an SVN crawler that can walk through an SVN repo and spit out all existing branches or tags?
Preferably in Perl or Python ...

SVN tags and branches are just directories, usually following a particular naming convention. You can easily get them in Perl like this:
my @branches = `svn ls YourRepoBaseURL/branches`;
chomp @branches; # remove newlines
chop @branches; # remove trailing /
my @tags = `svn ls YourRepoBaseURL/tags`;
chomp @tags;
chop @tags;
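
For those who prefer Python, the same idea can be sketched without any SVN bindings by shelling out to the svn command-line client. This is only a rough sketch: it assumes svn is on the PATH, and the function names parse_svn_ls and list_svn_dirs are made up for this example.

```python
import subprocess


def parse_svn_ls(output):
    """Return directory names from `svn ls` output, without the trailing '/'."""
    return [line.rstrip("/") for line in output.splitlines() if line.endswith("/")]


def list_svn_dirs(url):
    """Run `svn ls` on url and return the directories found there."""
    result = subprocess.run(
        ["svn", "ls", url], check=True, capture_output=True, text=True
    )
    return parse_svn_ls(result.stdout)


# Example (needs a reachable repository):
# branches = list_svn_dirs("YourRepoBaseURL/branches")
# tags = list_svn_dirs("YourRepoBaseURL/tags")
```

Only entries ending in / are kept, since svn ls marks directories that way and branches and tags are directories.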

Here is a little snippet that prints information about files in an SVN repository in Python:
# svncrawler.py
import sys
import pysvn
svn_client = pysvn.Client()
for file_status in svn_client.status(sys.argv[1]):
    print('SVN File %s %s' % (file_status.path, file_status.text_status))
Call it like this:
python svncrawler.py my_repository
It should be easy to modify it to print just the tags and branches.

Thanks for all the help. Here is what I came up with in Python:
# -*- coding: utf-8 -*-
import sys
import pysvn
svnclient = pysvn.Client()
projects = svnclient.list(sys.argv[1])
for project_path, project_info in projects:
    try:
        project_branches = svnclient.list(project_path.path + '/branches/')
        if len(project_branches) > 2:
            for branch, info in project_branches:
                print(branch.path)
    except pysvn.ClientError:
        # skip projects that have no branches directory
        pass

Related

import python module when launching from VBA

This is my first post here, as I really couldn't find a solution to my issue.
I want to let my users launch a Python program from Excel.
So I have this VBA code at some point:
lg_ErrorCode = wsh.Run(str_PythonPath & " " & str_PythonArg, 1, bl_StopVBwhilePython)
If lg_ErrorCode <> 0 Then
    MsgBox "Couldn't run python script! " _
        + vbCrLf + CStr(lg_ErrorCode)
    Run_Python = False
End If
str_PythonPath = "C:\Python34\python.exe C:\Users\XXXX\Documents\4_Python\Scan_FTP\test.py"
str_PythonArg = "arg1 arg2"
After multiple tests, I found that the failing line in Python is the one that imports another module (to be precise, the VBA code works when the line below is removed from the Python script):
import fct_Ftp as ftp
The layout of the files is as follows:
4_Python
  - folder: Scan_FTP
    - file: test.py (the one launched from VBA)
  - file: fct_Ftp.py
(For information, I changed the file layout and tried copying the file to other locations just to test, without success.)
The import works without problems when I launch test.py directly with:
import sys, os
sys.path.append('../')
But from VBA, this import does not work.
So I came up with this more generic solution, which does not work from Excel/VBA either:
import sys, os
def m_importingFunction():
    str_absPath = os.path.abspath('')
    str_absPathDad = os.path.dirname(str_absPath)
    l_absPathSons = [os.path.abspath(x[0]) for x in os.walk('.')]
    l_Dir = l_absPathSons + [str_absPathDad]
    l_DirPy = [Dir for Dir in l_Dir if 'Python' in Dir]
    for Dir in l_DirPy:
        sys.path.append(Dir)
        print(Dir)
m_importingFunction()
write = ''  # name of the marker file to create
try:
    import fct_Ftp as ftp
    # ftp = __import__ ("fct_Ftp")
    write += 'YAAAA' # write a file YAAAA from Python
except ImportError:
    write += 'NOOOOOO' # write a file NOOOOO from VBA
f = open(write + ".txt", "w+")
f.close()
Can you please help me? It is a very tricky question.
Many thanks to you guys.

Are you able to start your program from the command line?
Why not have Excel create a batch file, which you then start in a shell?
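
One thing worth checking: both sys.path.append('../') and os.path.abspath('') depend on the current working directory, and a script started from Excel usually does not run with its own folder as the working directory. A minimal sketch that builds the path from the script's location instead, assuming fct_Ftp.py sits one level above the folder containing test.py as in the layout described (the helper name add_parent_dir_to_path is made up for this example):

```python
import os
import sys


def add_parent_dir_to_path(script_file):
    """Put the parent of the script's folder on sys.path and return it.

    The path is derived from the script's own location, so it does not
    depend on the current working directory of the calling process.
    """
    script_dir = os.path.dirname(os.path.abspath(script_file))
    parent_dir = os.path.dirname(script_dir)
    if parent_dir not in sys.path:
        sys.path.insert(0, parent_dir)
    return parent_dir


# In test.py this would be called before the import:
# add_parent_dir_to_path(__file__)
# import fct_Ftp as ftp
```

If the import still fails after that, printing sys.path and os.getcwd() to a file from the script should show what the process started by VBA actually sees.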

Can't create user site-packages directory for usercustomize.py file

I need to add the win_unicode_console module to my usercustomize.py file, as described by the documentation.
I've discovered my user site packages directory with:
>>> import site
>>> site.getusersitepackages()
'C:\\Users\\my name\\AppData\\Roaming\\Python\\Python35\\site-packages'
I haven't been able to get to this directory using any method. I've tried using pushd instead of cd to emulate a network drive, and I've also tried getting there using Run. No matter what I do, in Python or in the cmd terminal, I get the response "The network path was not found."
Here is an example of one I've tried in cmd:
C:\>pushd \\Users\\my name\\AppData\\Roaming\\Python\\Python35\\site-packages
The network path was not found.
What am I doing wrong, or what could be wrong with the path?

DOS-style backslashes don't need to be escaped within the Windows console (else they may have used forward slashes way back when!). A path that starts with \\ is read as a network (UNC) path, which is why you get "The network path was not found."
Follow these steps to manually create usercustomize.py:
Start->Run: cmd
Make sure you're on the C: drive:
c:
Create the directory. mkdir creates the missing parents. Obviously, change "my name" as appropriate; the quotes are needed because the path contains a space.
mkdir "C:\Users\my name\AppData\Roaming\Python\Python35\site-packages"
Create usercustomize.py:
notepad "C:\Users\my name\AppData\Roaming\Python\Python35\site-packages\usercustomize.py"
Click "Yes" to create your file.
Edit as appropriate.
Or use the following script to have Python do it for you:
import site
import os
import os.path
import io
user_site_dir = site.getusersitepackages()
user_customize_filename = os.path.join(user_site_dir, 'usercustomize.py')
win_unicode_console_text = u"""
# win_unicode_console
import win_unicode_console
win_unicode_console.enable()
"""
if os.path.exists(user_site_dir):
    print("User site dir already exists")
else:
    print("Creating site dir")
    os.makedirs(user_site_dir)
if not os.path.exists(user_customize_filename):
    print("Creating {filename}".format(filename=user_customize_filename))
    file_mode = 'w+t'
else:
    print("{filename} already exists".format(filename=user_customize_filename))
    file_mode = 'r+t'
with io.open(user_customize_filename, file_mode) as user_customize_file:
    existing_text = user_customize_file.read()
    if win_unicode_console_text not in existing_text:
        # file pointer should already be at the end of the file after read()
        user_customize_file.write(win_unicode_console_text)
        print("win_unicode_console added to {filename}".format(filename=user_customize_filename))
    else:
        print("win_unicode_console already enabled")

Python: How to ignore # so that the line is not a comment?

I'm having trouble using a config file, because the option starts with #, thus python treats it as a comment (like it should).
The part of the config file that is not working:
[channels]
#channel
As you can see, it's an IRC channel, which is why it needs the #. Now I could use some ugly method of adding the # every time I need it, but I'd prefer to keep it clean.
So is there any way to ignore this, so that when I print the option, it starts with #?

If you're setting that in a Python file, put it in a string literal; a # inside quotes is not treated as a comment.
Otherwise, I think it should go in a config file with a different syntax that doesn't treat # as the start of a comment line.

You are probably using ConfigParser (which you should mention, by the way). In that case you have to pre- or post-process the config file before feeding it to the parser, because ConfigParser ignores comment lines.
I can think of two ways; both use readfp instead of the read method of the ConfigParser class:
1) Subclass StreamWriter and StreamReader from the codecs module and use them to wrap the file opening in a transparent recoding.
2) Use StringIO from the io module, like this:
from io import StringIO
...
s = configfile.read()
s = s.replace("#", "_")  # str.replace returns a new string
f = StringIO(unicode(s))
configparser.readfp(f)
And if you don't have to use "ini"-file syntax, take a look at the json module. I use it more often than ini files for configuration, especially if the config files shouldn't be edited manually by ordinary users.
my_config = {
    "channels": ["#mychannel", "#yourchannel"],
    "user": "bob",
    "buddy-list": ["alice", "eve"],
}
import json
with open(configfile, 'w') as cfg:
    cfg.write(json.dumps(my_config))

ConfigParser has no way to not ignore lines beginning with '#'.
ConfigParser.py, line 476:
# comment or blank line?
if line.strip() == '' or line[0] in '#;':
    continue
No way to turn it off.

In your defense, ConfigParser lets you make this mistake when writing a file:
import sys
import ConfigParser
config = ConfigParser.RawConfigParser()
config.add_section('channels')
config.set('channels', '#channel', 'true')
config.write(sys.stdout)
Produces this output:
[channels]
#channel = true
However, you can use section names that start with a #, like so:
import sys
import ConfigParser
config = ConfigParser.RawConfigParser()
config.add_section('#channels')
config.set('#channels', 'channel', 'true')
config.write(sys.stdout)
with open('q15123871.cfg', 'wb') as configfile:
    config.write(configfile)
config = ConfigParser.RawConfigParser()
config.read('q15123871.cfg')
print config.get('#channels', 'channel')
Which produces the output:
[#channels]
channel = true
true
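
For readers on Python 3: the configparser module there accepts a comment_prefixes argument, so # can be removed from the set of comment markers. A minimal sketch, assuming the channel names are stored as keys without values (hence allow_no_value=True); the CONFIG_TEXT sample is made up for this example.

```python
import configparser

# Hypothetical config content: channel names as value-less keys.
CONFIG_TEXT = """\
[channels]
#channel
#another
"""

# Only ';' starts a comment now, so lines beginning with '#' are parsed as keys.
parser = configparser.ConfigParser(comment_prefixes=(";",), allow_no_value=True)
parser.read_string(CONFIG_TEXT)

channels = list(parser["channels"])
print(channels)
```

Keys are lower-cased by default; setting parser.optionxform = str before reading keeps the original case.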

How to extract chains from a PDB file?

I would like to extract chains from PDB files. I have a file named pdb.txt which contains PDB IDs as shown below. The first four characters are the PDB ID and the last character is the chain ID.
1B68A
1BZ4B
4FUTA
I would like to:
1) read the file line by line,
2) download the atomic coordinates of each chain from the corresponding PDB files,
3) save the output to a folder.
I used the following script to extract chains, but this code only extracts the A chains from the PDB files:
for i in 1B68 1BZ4 4FUT
do
    wget -c "http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId="$i -O $i.pdb
    grep ATOM $i.pdb | grep 'A' > $i\_A.pdb
done

The following BioPython code should suit your needs well.
It uses PDB.Select to only select the desired chains (in your case, one chain) and PDBIO() to create a structure containing just the chain.
import os
from Bio import PDB

class ChainSplitter:
    def __init__(self, out_dir=None):
        """ Create parsing and writing objects, specify output directory. """
        self.parser = PDB.PDBParser()
        self.writer = PDB.PDBIO()
        if out_dir is None:
            out_dir = os.path.join(os.getcwd(), "chain_PDBs")
        self.out_dir = out_dir

    def make_pdb(self, pdb_path, chain_letters, overwrite=False, struct=None):
        """ Create a new PDB file containing only the specified chains.
        Returns the path to the created file.
        :param pdb_path: full path to the crystal structure
        :param chain_letters: iterable of chain characters (case insensitive)
        :param overwrite: write over the output file if it exists
        """
        chain_letters = [chain.upper() for chain in chain_letters]
        # Input/output files
        (pdb_dir, pdb_fn) = os.path.split(pdb_path)
        pdb_id = pdb_fn[3:7]  # assumes file names like pdbXXXX.ent
        out_name = "pdb%s_%s.ent" % (pdb_id, "".join(chain_letters))
        out_path = os.path.join(self.out_dir, out_name)
        print("OUT PATH:", out_path)
        plural = "s" if (len(chain_letters) > 1) else ""  # for printing
        # Skip PDB generation if the file already exists
        if (not overwrite) and (os.path.isfile(out_path)):
            print("Chain%s %s of '%s' already extracted to '%s'." %
                  (plural, ", ".join(chain_letters), pdb_id, out_name))
            return out_path
        print("Extracting chain%s %s from %s..." % (plural,
              ", ".join(chain_letters), pdb_fn))
        # Get structure, write new file with only given chains
        if struct is None:
            struct = self.parser.get_structure(pdb_id, pdb_path)
        self.writer.set_structure(struct)
        self.writer.save(out_path, select=SelectChains(chain_letters))
        return out_path

class SelectChains(PDB.Select):
    """ Only accept the specified chains when saving. """
    def __init__(self, chain_letters):
        self.chain_letters = chain_letters

    def accept_chain(self, chain):
        return (chain.get_id() in self.chain_letters)

if __name__ == "__main__":
    """ Parses PDB id's desired chains, and creates new PDB structures. """
    import sys
    if not len(sys.argv) == 2:
        print("Usage: $ python %s 'pdb.txt'" % __file__)
        sys.exit()
    pdb_textfn = sys.argv[1]
    pdbList = PDB.PDBList()
    splitter = ChainSplitter("/home/steve/chain_pdbs")  # Change me.
    with open(pdb_textfn) as pdb_textfile:
        for line in pdb_textfile:
            pdb_id = line[:4].lower()
            chain = line[4]
            pdb_fn = pdbList.retrieve_pdb_file(pdb_id)
            splitter.make_pdb(pdb_fn, chain)
One final note: don't write your own parser for PDB files. The format specification is ugly (really ugly), and the number of faulty PDB files out there is staggering. Use a tool like BioPython that will handle parsing for you!
Furthermore, instead of using wget, you should use tools that interact with the PDB database for you. They take FTP connection limitations into account, the changing nature of the PDB database, and more. I should know - I updated Bio.PDBList to account for changes in the database. =)

It is probably a little late to answer this question, but I will give my opinion.
Biopython has some really handy features that would help you achieve such a thing easily. You could write a custom selection class and then use it, inside a for loop over the original PDB file, for each of the chains you want to select.
from Bio.PDB import Select, PDBIO
from Bio.PDB.PDBParser import PDBParser

class ChainSelect(Select):
    def __init__(self, chain):
        self.chain = chain

    def accept_chain(self, chain):
        if chain.get_id() == self.chain:
            return 1
        else:
            return 0

pdb_file = 'structure.pdb'  # placeholder: path to your PDB file
chains = ['A', 'B', 'C']
p = PDBParser(PERMISSIVE=1)
structure = p.get_structure(pdb_file, pdb_file)
for chain in chains:
    pdb_chain_file = 'pdb_file_chain_{}.pdb'.format(chain)
    io_w_no_h = PDBIO()
    io_w_no_h.set_structure(structure)
    io_w_no_h.save('{}'.format(pdb_chain_file), ChainSelect(chain))

Let's say you have the following file, pdb_structures:
1B68A
1BZ4B
4FUTA
Then put your code in load_pdb.sh:
while read name
do
    chain=${name:4:1}
    name=${name:0:4}
    wget -c "http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=$name" -O "$name.pdb"
    # the chain ID is in column 22 of an ATOM record
    awk -v chain="$chain" '$0~/^ATOM/ && substr($0,22,1)==chain {print}' "$name.pdb" > "${name}_${chain}.pdb"
    # rm "$name.pdb"
done
Uncomment the last line if you don't need the original PDB files.
Execute:
cat pdb_structures | ./load_pdb.sh
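
The same fixed-column filter can be sketched in plain Python. It relies on the PDB format putting the chain identifier in column 22 of an ATOM record; the function names split_entry and chain_atom_lines are made up for this example, and for anything beyond a quick filter the BioPython approach above is likely the more robust choice.

```python
def split_entry(entry):
    """Split an entry such as '1B68A' into the PDB ID and the chain ID."""
    entry = entry.strip()
    return entry[:4], entry[4]


def chain_atom_lines(pdb_lines, chain_id):
    """Keep ATOM records whose chain identifier (column 22) equals chain_id."""
    return [
        line
        for line in pdb_lines
        if line.startswith("ATOM") and len(line) > 21 and line[21] == chain_id
    ]


# Example (assumes 1B68.pdb was already downloaded):
# pdb_id, chain_id = split_entry("1B68A")
# with open(pdb_id + ".pdb") as handle:
#     selected = chain_atom_lines(handle, chain_id)
# with open("%s_%s.pdb" % (pdb_id, chain_id), "w") as out:
#     out.writelines(selected)
```

Checking a fixed column avoids the problem in the question's grep 'A', which matches any line containing the letter A anywhere.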

Python gettext - not translating

Sample python program: [CGI script, so it needs to select its own language rather than using whatever the host OS is set to]
import gettext
gettext.install('test', "./locale")
_ = gettext.gettext
t = gettext.translation('test', "./locale", languages=['fr'])
t.install()
print _("Hello world")
./locale/fr/LC_messages/test.mo contains the translation (as binary file, generated by running msgfmt on a .po file).
Program prints "Hello world" instead of the translated version. What could be the problem?

Maybe this answer is WAY too late, but I just found this and I think it can help you.
import gettext
t = gettext.translation('test', "./locale", languages=['fr'])
_ = t.gettext
print _("Hello world")
In my own program, I did it this way:
import gettext
import locale
DIR = "lang"
APP = "ToolName"
gettext.textdomain(APP)
gettext.bindtextdomain(APP, DIR)
#gettext.bind_textdomain_codeset("default", 'UTF-8') # Not necessary
locale.setlocale(locale.LC_ALL, "")
LANG = "FR_fr"
lang = gettext.translation(APP, DIR, languages=[LANG], fallback = True)
_ = lang.gettext
NOTE:
My program has a lang directory in it.
For every language, a directory is created in lang: xx_XX (e.g. en_US).
Inside the en_US directory there is LC_MESSAGES, and inside that there is TOOLNAME.mo.
But that's my way of supporting multiple languages.
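
As for why the code in the question prints the untranslated text: t.install() puts _ into builtins, but the module-level assignment _ = gettext.gettext shadows it, and gettext.gettext looks up the global default domain rather than the catalog held by t. A small Python 3 sketch of that shadowing; it uses a made-up ShoutingTranslations class in place of a real .mo file and assumes no catalog for the default domain is installed on the system.

```python
import builtins
import gettext


class ShoutingTranslations(gettext.NullTranslations):
    """Stand-in for a real catalog: 'translates' by upper-casing."""

    def gettext(self, message):
        return message.upper()


t = ShoutingTranslations()
t.install()  # binds builtins._ to t.gettext

# The installed function uses the catalog object.
installed = builtins._("Hello world")

# A module-level _ takes precedence over the builtin one and does not
# know about t at all; it consults the global default domain instead.
_ = gettext.gettext
shadowed = _("Hello world")

print(installed)
print(shadowed)
```

Dropping the _ = gettext.gettext line, or binding _ = t.gettext as in the answer above, makes both calls go through the catalog.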
