Learning Python, scripts for another project - python

Question: I am having some issues with my old scripts that do not work on Python 3.x.
Off-topic: how flexible is Python when trying to access binary and text files for mass renaming and renumbering within Collision and IMG archives?
I no longer have the best understanding of this, as I have gone in the direction of level design using 3ds Max.
Anyway..
Error:
Traceback (most recent call last):
File "C:\SOL_REM.py", line 26, in <module>
process_ide(sys.argv[1], int(sys.argv[2]),
File "C:\SOL_REM.py", line 18, in process_ide
ide_line = reduce(lambda x,y: str(x)+","+st
NameError: global name 'reduce' is not defined
Code:
import sys

if len(sys.argv) < 4:
    sys.exit('Usage: %s | Source ide | ID number | Dest ide filename.' % sys.argv[0])

def process_ide(ide_source, num, ide_destination):
    src = open(ide_source, 'r')
    dst = open(ide_destination, 'w')
    for line in src:
        ide_line = line
        if not (line == "" or line[0] == "#" or len(line.split(",")) < 2):
            ide_line = line.split(",")
            ide_line[-1] = ide_line[-1][:-2]
            ide_line[0] = num
            num += 1
            ide_line = reduce(lambda x, y: str(x) + "," + str(y), ide_line) + "\n"
        dst.write(ide_line)
    src.close()
    dst.close()

process_ide(sys.argv[1], int(sys.argv[2]), sys.argv[3])
Starting out simple:
What I am trying to do is parse an ide text file, renumbering the IDs in ascending order.
The syntax would be: SOL_rem.py game.ide 1845 game2.ide
Example file:
ID Modelname TexName Rendering flags.
objs
1700, ap_booth2_03, ap_airstuff1, 1, 190, 0
1701, ap_seaplaland1, ap_seasplane, 1, 299, 0
1702, ap_seaplanehanger1, ap_seasplane, 1, 299, 0
1703, ap_termwindows1, ap_termwindows, 1, 299, 4
1704, ap_blastdef_01, ap_newprops1opac, 1, 299, 4
1705, ap_blastdef_03, ap_newprops1opac, 1, 299, 4
1706, air_brway_030, airgrndb, 1, 299, 0
end
The IDs would be re-adjusted, starting from 1845, in ascending order.

reduce is no longer in the builtin namespace in Python 3.
Instead of using reduce, why not just use a join?
ide_line = ','.join(ide_line) + '\n'

In Python 3, you can do
from functools import reduce
Even in Python 2.6+ the above works, though it is not required there.
Yes, Python is totally flexible for whatever you want to do, like overriding builtins too.
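Putting the answer together, here is a minimal Python 3 sketch of the renumbering function using ','.join instead of reduce. The whitespace handling and the rule for skipping comments and the objs/end markers are assumptions based on the question's code and example file:

```python
import sys

def process_ide(ide_source, num, ide_destination):
    # Rewrite the first comma-separated field of each data line with an
    # incrementing ID; comments, blank lines, and non-CSV lines such as
    # "objs" and "end" are copied through untouched.
    with open(ide_source) as src, open(ide_destination, 'w') as dst:
        for line in src:
            fields = line.strip().split(",")
            if line.strip() and not line.startswith("#") and len(fields) >= 2:
                fields[0] = str(num)
                num += 1
                line = ",".join(fields) + "\n"
            dst.write(line)

if __name__ == '__main__':
    if len(sys.argv) < 4:
        sys.exit('Usage: %s source.ide start_id dest.ide' % sys.argv[0])
    process_ide(sys.argv[1], int(sys.argv[2]), sys.argv[3])
```

Running it as SOL_rem.py game.ide 1845 game2.ide would renumber the data lines from 1845 upward while leaving the header lines alone.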

Related

how to export WBlock by using pyautocad

Hi all, I have this problem when I try to export objects through WBlock. What is wrong here? I'm trying to do simple Python work (note: I'm a beginner :D). Any help?
from pyautocad import Autocad, APoint, utils
import win32com.client

AutoCAD = win32com.client.dynamic.Dispatch("AutoCAD.Application")
acad = Autocad(create_if_not_exists=False)
acad.Visible = True
doc = AutoCAD.ActiveDocument
layersList = doc.Layers
for l in layersList:
    objects = acad.iter_objects()
    if l.name == "0":
        pass
    else:
        for o in objects:
            if o.ObjectName == "AcDbText":
                SelectionSet = doc.SelectionSets.Item(o.ObjectName).Name
                directoryN = "C:\\Temp\\{}_{}.dwg".format(l.name, o.TextString)
                doc.WBlock(directoryN, SelectionSet)
Here is what I get:
AcDbText
Traceback (most recent call last):
File "C:/Temp/Exporter.py", line 23, in <module>
doc.WBlock(directoryN,SelectionSet)
File "<COMObject <unknown>>", line 2, in WBlock
pywintypes.com_error: (-2147352571, 'Type mismatch.', None, 2)
Thanks.
I'm trying to export every text object as a WBlock.

Multiprocess, various process reading the same file

I am trying to simulate some DNA-sequencing reads and, in order to speed up the code, I need to run it in parallel.
Basically, what I am trying to do is the following: I am sampling reads from the human genome, and I think that when two of the processes from the multiprocessing module try to get data from the same file (the human genome), the processes get corrupted and are unable to get the desired DNA sequence. I have tried different things, but I am very new to parallel programming and I cannot solve my problem.
When I run the script with one core it works fine.
This is the way I am calling the function
if __name__ == '__main__':
    jobs = []
    # init the processes
    for i in range(number_of_cores):
        length = 100
        lock = mp.Manager().Lock()
        p = mp.Process(target=simulations.sim_reads,
                       args=(lock, FastaFile,
                             "/home/inigo/msc_thesis/genome_data/hg38.fa",
                             length, paired, results_dir,
                             spawn_reads[i], temp_file_names[i]))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()
And this is the function I am using to get the reads, where each process writes the data to a different file.
def sim_single_end(lc, fastafile, chr, chr_pos_start, chr_pos_end, read_length, unique_id):
    lc.acquire()
    left_split_read = fastafile.fetch(chr, chr_pos_end - (read_length / 2), chr_pos_end)
    right_split_read = fastafile.fetch(chr, chr_pos_start, chr_pos_start + (read_length / 2))
    reversed_left_split_read = left_split_read[::-1]
    total_read = reversed_left_split_read + right_split_read
    seq_id = "id:%s-%s|left_pos:%s-%s|right:%s-%s " % (unique_id, chr,
        int(chr_pos_end - (read_length / 2)), int(chr_pos_end),
        int(chr_pos_start), int(chr_pos_start + (read_length / 2)))
    quality = "I" * read_length
    fastq_string = "#%s\n%s\n+\n%s\n" % (seq_id, total_read, quality)
    lc.release()
    new_record = SeqIO.read(StringIO(fastq_string), "fastq")
    return new_record
Here is the traceback:
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/inigo/Dropbox/PycharmProjects/circ_dna/simulations.py", line 107, in sim_ecc_reads
new_read = sim_single_end(lc,fastafile, chr, chr_pos_start, chr_pos_end, read_length, read_id)
File "/home/inigo/Dropbox/PycharmProjects/circ_dna/simulations.py", line 132, in sim_single_end
new_record = SeqIO.read(StringIO(fastq_string), "fastq")
File "/usr/local/lib/python3.5/dist-packages/Bio/SeqIO/__init__.py", line 664, in read
first = next(iterator)
File "/usr/local/lib/python3.5/dist-packages/Bio/SeqIO/__init__.py", line 600, in parse
for r in i:
File "/usr/local/lib/python3.5/dist-packages/Bio/SeqIO/QualityIO.py", line 1031, in FastqPhredIterator
for title_line, seq_string, quality_string in FastqGeneralIterator(handle):
File "/usr/local/lib/python3.5/dist-packages/Bio/SeqIO/QualityIO.py", line 951, in FastqGeneralIterator
% (title_line, seq_len, len(quality_string)))
ValueError: Lengths of sequence and quality values differs for id:6-chr1_KI270707v1_random|left_pos:50511537-50511587|right:50511214-50511264 (0 and 100).
I am the OP of this question from almost a year ago. The problem was that the package I was using for reading the human genome file (pysam) was failing. The issue was a typo when calling multiprocessing.
From the author's response, this should work:
p = mp.Process(target=get_fasta, args=(genome_fa,))
Note the ',' to ensure you pass a tuple.
See https://github.com/pysam-developers/pysam/issues/409 for more details
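The trailing comma matters because Process expects args to be a tuple. A minimal sketch of the distinction, with a stand-in worker (the real one reads the genome with pysam):

```python
import multiprocessing as mp

def get_fasta(path):
    # Stand-in for the real worker that opens the genome with pysam.
    print("opening", path)

if __name__ == '__main__':
    genome_fa = "hg38.fa"
    # (genome_fa,) is a one-element tuple; (genome_fa) is just a
    # parenthesized string, so each character would be treated as a
    # separate positional argument and the call would fail.
    p = mp.Process(target=get_fasta, args=(genome_fa,))
    p.start()
    p.join()
```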

json.dumps() works on python 2.7 but not on python 3

I have the following code:
import json

src_vol1 = {'provider_id': 'src1'}
src_vol2 = {'provider_id': 'src2'}
get_snapshot_params = lambda src_volume, trg_volume: {
    'volumeId': src_volume['provider_id'],
    'snapshotName': trg_volume['id']}
trg_vol1 = {'id': 'trg1'}
trg_vol2 = {'id': 'trg2'}
src_vols = [src_vol1, src_vol2]
trg_vols = [trg_vol1, trg_vol2]
snapshotDefs = map(get_snapshot_params, src_vols, trg_vols)
params = {'snapshotDefs': snapshotDefs}
json.dumps(params)
I need it work on both Python3 and Python2.7, but on Python3 I get
Traceback (most recent call last):
File "./prog.py", line 16, in <module>
File "/usr/lib/python3.4/json/__init__.py", line 230, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python3.4/json/encoder.py", line 192, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python3.4/json/encoder.py", line 250, in iterencode
return _iterencode(o, 0)
File "/usr/lib/python3.4/json/encoder.py", line 173, in default
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <map object at 0xb72a1a0c> is not JSON serializable
I tried to put dict() around the params but it didn't work.
What is the difference? I didn't find anything in the documentation.
map behaves differently between Python 2 and 3: in Python 2 it returns a list, while in Python 3 it returns a lazy map object, which json cannot serialize.
To reproduce the Python 2 behavior, replace map(...) with list(map(...)).
This still works in Python 2, but there it makes a pointless extra copy of the list returned by map, which can consume more memory and run slower.
To avoid it, you can try something like:
try:
    from itertools import imap as map  # py2
except ImportError:
    pass  # py3, map is already defined appropriately
Or you can check the interpreter version and define a map_ helper accordingly:

import sys

if sys.version_info[0] < 3:  # Python 2
    map_ = map  # map already returns a list
else:  # Python 3
    map_ = lambda f, *seqs: list(map(f, *seqs))

snapshotDefs = map_(get_snapshot_params, src_vols, trg_vols)
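For completeness, a minimal sketch of the list(map(...)) fix applied to the question's data, which runs unchanged on both Python 2.7 and 3:

```python
import json

src_vols = [{'provider_id': 'src1'}, {'provider_id': 'src2'}]
trg_vols = [{'id': 'trg1'}, {'id': 'trg2'}]

get_snapshot_params = lambda s, t: {'volumeId': s['provider_id'],
                                    'snapshotName': t['id']}

# list() materializes the lazy map object, giving json a plain list it
# knows how to encode.
snapshot_defs = list(map(get_snapshot_params, src_vols, trg_vols))
print(json.dumps({'snapshotDefs': snapshot_defs}))
```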

Using Linked lists and patterns in python

Trying to write a function that will iterate over the linked list, sum up all of the odd numbers, and then display the sum. Here is what I have so far:
from List import *

def main():
    array = eval(input("Give me an array of numbers: "))
    ArrayToList(array)
    print(array[0])
    print(array[1])
    print(array[2])
    print(array[3])
    print(sumOdds(array))

def isOdd(x):
    return x % 2 != 0

def sumOdds(array):
    if array == None:
        return 0
    elif isOdd(head(array)):
        return head(array) + sumOdds(tail(array))
    else:
        return sumOdds(tail(array))

main()
I can't get it to actually print the sum though. Can anybody help me out with that?
Here is the output of the program when I run it:
$ python3 1.py
Give me an array of numbers: [11, 5, 3, 51]
Traceback (most recent call last):
File "1.py", line 22, in <module>
main()
File "1.py", line 10, in main
print(sumOdds(array))
File "1.py", line 19, in sumOdds
return head(array) + sumOdds(tail(array))
File "1.py", line 18, in sumOdds
elif (isOdd(head(array))):
File "/Users/~/cs150/practice3/friday/List.py", line 34, in head
return NodeValue(items)
File "/Users/~/cs150/practice3/friday/List.py", line 12, in NodeValue
def NodeValue(n): return n[0]
TypeError: 'int' object is not subscriptable
In Python you iterate through a list like this:
list_of_numbers = [1, 4, 3, 7, 5, 8, 3, 7, 24, 23, 76]
sum_of_odds = 0
for number in list_of_numbers:
    # now you check for odds
    if isOdd(number):
        sum_of_odds = sum_of_odds + number
print(sum_of_odds)
List is also a module that exists only on your computer, and I do not know what is inside it, so I cannot help you beyond ArrayToList(array).
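A self-contained version of the loop above (is_odd and sum_odds are hypothetical names standing in for the helpers in the asker's List module):

```python
def is_odd(x):
    return x % 2 != 0

def sum_odds(numbers):
    # Sum every odd number in the sequence.
    total = 0
    for n in numbers:
        if is_odd(n):
            total += n
    return total

# The example input from the question: all four numbers are odd.
print(sum_odds([11, 5, 3, 51]))  # -> 70
```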

GHMM - Attempted m_free on NULL pointer

I'm trying to use the ghmm python module on Mac OS X with Python 2.7. I've managed to get everything installed and I can import ghmm in the Python environment, but there are errors when I run the following, taken from the ghmm 'tutorial' (UnfairCasino can be found at http://ghmm.sourceforge.net/UnfairCasino.py):
from ghmm import *
from UnfairCasino import test_seq
sigma = IntegerRange(1,7)
A = [[0.9, 0.1], [0.3, 0.7]]
efair = [1.0 / 6] * 6
eloaded = [3.0 / 13, 3.0 / 13, 2.0 / 13, 2.0 / 13, 2.0 / 13, 1.0 / 13]
B = [efair, eloaded]
pi = [0.5] * 2
m = HMMFromMatrices(sigma, DiscreteDistribution(sigma), A, B, pi)
v = m.viterbi(test_seq)
Specifically I get this error:
GHMM ghmm.py:148 - sequence.c:ghmm_dseq_free(1199): Attempted m_free on NULL pointer. Bad program, BAD! No cookie for you.
python(52313,0x7fff70940cc0) malloc: *** error for object 0x74706d6574744120: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort trap
and when I set the ghmm.py logger to "DEBUG", the log prints out the following just before:
GHMM ghmm.py:2333 - HMM.viterbi() -- begin
GHMM ghmm.py:849 - EmissionSequence.asSequenceSet() -- begin >
GHMM ghmm.py:862 - EmissionSequence.asSequenceSet() -- end >
Traceback (most recent call last):
File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 842, in emit
msg = self.format(record)
File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 719, in format
return fmt.format(record)
File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 464, in format
record.message = record.getMessage()
File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 328, in getMessage
msg = msg % self.args
TypeError: not all arguments converted during string formatting
Logged from file ghmm.py, line 1159
Traceback (most recent call last):
File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 842, in emit
msg = self.format(record)
File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 719, in format
return fmt.format(record)
File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 464, in format
record.message = record.getMessage()
File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 328, in getMessage
msg = msg % self.args
TypeError: not all arguments converted during string formatting
Logged from file ghmm.py, line 949
GHMM ghmm.py:2354 - HMM.viterbi() -- end
GHMM ghmm.py:1167 - del SequenceSubSet >
So I suspect it has something to do with the way sequences are deleted once the Viterbi function has completed, but I'm not sure whether this means I need to modify the Python code, the C code, or compile ghmm and the wrappers differently. Any help/suggestions would be appreciated greatly, as I have been trying to get this library to work for the last 4 days.
Given the age of this question, you've probably moved on to something else, but this seemed to be the only related result I found. The issue is that a double free is happening, due to some weirdness in how the Python method EmissionSequence.asSequenceSet is executed. Look at how it is implemented in ghmm.py (~lines 845-863):
def asSequenceSet(self):
    """
    #returns this EmissionSequence as a one element SequenceSet
    """
    log.debug("EmissionSequence.asSequenceSet() -- begin " + repr(self.cseq))
    seq = self.sequenceAllocationFunction(1)

    # checking for state labels in the source C sequence struct
    if self.emissionDomain.CDataType == "int" and self.cseq.state_labels is not None:
        log.debug("EmissionSequence.asSequenceSet() -- found labels !")
        seq.calloc_state_labels()
        self.cseq.copyStateLabel(0, seq, 0)

    seq.setLength(0, self.cseq.getLength(0))
    seq.setSequence(0, self.cseq.getSequence(0))
    seq.setWeight(0, self.cseq.getWeight(0))
    log.debug("EmissionSequence.asSequenceSet() -- end " + repr(seq))
    return SequenceSetSubset(self.emissionDomain, seq, self)
This should probably raise some red flags, since it seems to reach into the C layer a bit much (not that I know for sure; I haven't looked too far into it).
Anyway, if you look a little above this function, there is another method called sequenceSet:
def sequenceSet(self):
    """
    #return a one-element SequenceSet with this sequence.
    """
    # in order to copy the sequence in 'self', we first create an empty
    # SequenceSet and then add 'self'
    seqSet = SequenceSet(self.emissionDomain, [])
    seqSet.cseq.add(self.cseq)
    return seqSet
It seems to have the same purpose but is implemented differently. Anyway, if you replace the body of EmissionSequence.asSequenceSet in ghmm.py with just:
def asSequenceSet(self):
    """
    #returns this EmissionSequence as a one element SequenceSet
    """
    return self.sequenceSet()
And then rebuild/reinstall the ghmm module, the code will work without crashing, and you should be able to go on your merry way. I'm not sure whether this can be submitted as a fix, since the ghmm project looks a little dead, but hopefully it is simple enough to help anyone in dire straits using this library.
