I have a script that combines around 1000 files by using an external program. It does so by copying the first file to combined.sym and then combining this file with all the other files, one by one. The code is as follows:
from os import listdir
from os.path import isfile, join, splitext
from shutil import copyfile
from subprocess import call

# gets all .sym files
symfiles = []
onlyfiles = [f for f in listdir(myPath) if isfile(join(myPath, f))]
for f in onlyfiles:
    if '.sym' == splitext(f)[1]:
        symfiles.append(f)
# copies one initial combined file (point to start from)
combinedFileName = "combined.sym"
copyfile(myPath + symfiles[0], myPath + combinedFileName)
for sym in symfiles:
    print("processing " + str(sym))
    cmdString = combinedFileName + " "
    cmdString += sym
    cmdString += " -a " + combinedFileName
    finalCmdString = r'cmd.exe /C fuseFiles.bat "' + myPath + '" "' + cmdString + '"'
    call(finalCmdString)
print("done")
The batch file I'm calling looks like this, but that is not the issue:
#set PATH=\path\to\executable\;%PATH%
#cd %1
#FusingProgram "%2"
Now, when I run this script, it runs for about 15 minutes before it just stops and returns to the command prompt:
processing onefile.sym
processing anotherfile.sym
processing evenmorefiles.sym
processing yougetit.sym
C:\Users\test\>
As you can see, it misses the "done" print, and it didn't process all files. The combined.sym is created and expanded by the other .sym files.
When checking the Windows events, I can see that it logs python as crashed and lists ntdll.dll as the faulty module (log translated from German):
Faulting application name: python.exe, version: 0.0.0.0, time stamp: 0x53787196
Faulting module name: ntdll.dll, version: ****, time stamp: 0x5b6db230
Exception code: 0xc0000005
Fault offset: 0x0002e2eb
Faulting process id: 0x6224
Faulting application start time: 0x01d454e0ff20377c
Faulting application path: C:\pat\to\python\python.exe
Faulting module path: C:\WINDOWS\SysWOW64\ntdll.dll
Report Id: 8b4254f0-c0d5-11e8-9a6f-901b0e5a6206
I've done some research and it seems like I'd have to resolve this by updating the ntdll.dll file, but my system is all up to date, so there is nothing I can do about that (except for downloading another version from some untrustworthy site, which I'm not gonna do).
All this being said, have you ever encountered a similar issue or is this just a stupid mistake I made?
I am attempting to run a PowerShell script from Python to convert .xls files to .xlsb by looping through a list of file names. I am encountering the PowerShell error "You cannot call a method on a null-valued expression" for command 3 (i.e. cmd3), and I am unsure why (this is my first time with Python and with running PowerShell scripts in general). The error is encountered when trying to open the workbook, but when the command is run in PowerShell directly, it seems to work fine.
Code:
import logging, os, shutil, itertools, time, pyxlsb, subprocess
# convert .xls to .xlsb and / transfer new terminology files
for i in itertools.islice(FileList, 0, 6, None):
    # define extension
    ext = '.xls'
    # define file path
    psPath = f'{downdir}\\{i}'
    # define ps scripts
    def run(cmd):
        completed = subprocess.run(["powershell", "-Command", cmd], capture_output=True)
        return completed
    # ps script: open workbook
    cmd1 = "$xlExcel12 = 50"
    cmd2 = "$Excel = New-Object -Com Excel.Application"
    cmd3 = f"$WorkBook = $Excel.Workbooks.Open('{psPath}{ext}')"
    cmd4 = f"$WorkBook.SaveAs('{psPath}{ext}',$xlExcel12,[Type]::Missing,[Type]::Missing,$false,$false,2)"
    cmd5 = "$Excel.Quit()"
    # ps script: delete .xls files
    cmd6 = f"Remove-Item '{psPath}{ext}'"
    run(cmd1)
    run(cmd2)
    run(cmd3)
    # change extension
    ext = '.xlsb'
    run(cmd4)
    run(cmd5)
    run(cmd6)
    # copy .xlsb files to terminology folder
    shutil.copy(i + ext, termdir)
Error:
Out[79]: CompletedProcess(args=['powershell', '-Command', "$WorkBook = $Excel.Workbooks.Open('C:\Users\Username\Downloads\SEND Terminology.xls')"], returncode=1, stdout=b'', stderr=b"You cannot call a method on a null-valued expression.\r\nAt line:1 char:1\r\n+ $WorkBook = $Excel.Workbooks.Open('C:\Username\User\Downloads\SEND Ter ...\r\n+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\n + CategoryInfo : InvalidOperation: (:) [], RuntimeException\r\n + FullyQualifiedErrorId : InvokeMethodOnNull\r\n \r\n")
Any input would be helpful.
Thank you!
Problem
As commenter vonPryz correctly stated, the PowerShell commands run in separate processes. The memory spaces of processes are isolated from each other and are cleared when a process ends.
When you run the commands in separate PowerShell processes, cmd3, cmd4 and cmd5 won't have the variable $Excel available. PowerShell defaults to a $null value for non-existing variables, hence the error message "You cannot call a method on a null-valued expression". The same happens for the variable $xlExcel12. These variables only exist as long as the process that created them is running, and they are only visible within that process, even if you managed to run two processes in parallel.
Solution
Commands cmd1..5 need to be run in the same PowerShell process, so that each command is able to "see" the variables created by the previous commands:
run( cmd1 + ';' + cmd2 + ';' + cmd3 + ';' + cmd4 + ';' + cmd5 )
You will need to change cmd4 to use another variable for the extension that is used for saving, e.g. extSave, because cmd4 is built while ext still holds '.xls':
cmd4 = f"$WorkBook.SaveAs('{psPath}{extSave}',$xlExcel12,[Type]::Missing,[Type]::Missing,$false,$false,2)"
cmd6 is completely independent, because it does not depend on PowerShell variables. It only depends on Python variables, which are resolved before the process starts, so it could still be run in a separate process.
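As a rough sketch of the single-process approach described above, the statements can be joined into one script string before they are handed to PowerShell. The helper name build_convert_script and the parameters ext_open and ext_save are made up for this illustration; the PowerShell statements themselves are the ones from the question. The sketch only builds and prints the string, since actually running it assumes a Windows machine with Excel installed.

```python
import subprocess


def build_convert_script(path_base, ext_open=".xls", ext_save=".xlsb"):
    """Join the PowerShell statements so they share one session and its variables.

    build_convert_script, ext_open and ext_save are hypothetical names
    introduced for this sketch only.
    """
    statements = [
        "$xlExcel12 = 50",
        "$Excel = New-Object -Com Excel.Application",
        f"$WorkBook = $Excel.Workbooks.Open('{path_base}{ext_open}')",
        f"$WorkBook.SaveAs('{path_base}{ext_save}',$xlExcel12,[Type]::Missing,"
        "[Type]::Missing,$false,$false,2)",
        "$Excel.Quit()",
    ]
    return "; ".join(statements)


def run(cmd):
    """Run one PowerShell command string in a single process, as in the question."""
    return subprocess.run(["powershell", "-Command", cmd], capture_output=True)


script = build_convert_script(r"C:\Users\Username\Downloads\SEND Terminology")
print(script)
# On Windows with Excel installed (assumption): completed = run(script)
```

Keeping the open and save extensions as separate parameters avoids relying on a variable that changes between building the command and running it.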
I am a little confused about the way Popen works and I am hoping this is something silly. I am never getting a completion, and poll seems to be returning something odd (log attached)
This is backing up a triplet of schemas (Tablespace) using a utility (CSSBACKUP) supplied to do this.
for i in range(len(schematype)):
    schema_base = schemaname + '_' + schematype[i] # we need this without the trailing space.
    outputstring = base_folder + '\\' + schemaname + '\\' + schema_base + '_' + timestr + '_cssbackup '
    rc = os.unlink(outputstring) # wont run if there is a backup already
    logstring = base_folder + '\\' + schemaname + '\\' + schema_base + '_' + timestr + '.log'
    exString = "cssbackup " + phys_string + '-schema '+ schema_base + ' ' + '-file ' + outputstring + '-log '+ logstring
    logging.debug(exString)
    processlist.append(subprocess.Popen(exString)) # start a separate thread for each one, but we don't want to proceed until processlist[].poll == None (thread is complete)
    procdone[i] = False
Now that I have all the processes spawned, I need to sync them up:
while finishit < len(schematype):
    time.sleep(CSTU_CFG.get(hostname).get("logintv")) # need a delay to keep this program from thrashing
    for i in range(len(schematype)): # check each of the procs
        if procdone[i] is not True: # if it completed, skip it
            if processlist[i].poll is not None: # if it returns something other than "none" it's still running
                logging.debug(' Running '+ schematype[i] + ' ' + str(processlist[i])+ ' '+ str(time.time() - start_time))
                procdone[i] = False
            else:
                procdone[i] = True # None was returned so it's finished
                logging.debug(' Ended '+ schematype[i]) # log it
                finishit = finishit + 1 # update the count
                processlist[i].kill # kill the process that was running ( Saves memory )
logging.debug('Dump functions complete')
When I run this, I don't get what I am expecting. I was expecting a pid in the return value, but I don't see it, so what I get back isn't useful for the .poll command.
As a result, the program runs forever, even after the shells that it spawned are gone.
I'm missing something basic.
Thanks
11:26:26,133 root, DEBUG Running local 30.014784812927246
11:26:26,133 root, DEBUG Running central 30.014784812927246
11:26:26,133 root, DEBUG Running mngt 30.014784812927246
11:26:56,148 root, DEBUG Running local 60.02956962585449
11:26:56,148 root, DEBUG Running central 60.02956962585449
11:26:56,148 root, DEBUG Running mngt 60.02956962585449
11:27:26,162 root, DEBUG Running local 90.04435467720032
11:27:26,162 root, DEBUG Running central 90.04435467720032
11:27:26,162 root, DEBUG Running mngt 90.04435467720032
11:27:56,177 root, DEBUG Running local 120.05913925170898
11:27:56,177 root, DEBUG Running central 120.05913925170898
11:27:56,177 root, DEBUG Running mngt 120.05913925170898
11:28:26,192 root, DEBUG Running local 150.07392406463623
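You should call poll(). The test if processlist[i].poll is not None will always evaluate to True, because processlist[i].poll is the method object, not the result of processlist[i].poll(). Note also that poll() returns None while the process is still running and the exit code once it has finished, so the two branches of your test are the wrong way around.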
Edit:
This looks like quite a complicated way to do something like
p = multiprocessing.Pool(n)
p.map_async(subprocess.call, commands)
As a suggestion, you may want to check the multiprocessing module.
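The poll() fix described above can be sketched as follows. The function name wait_for_all and the stand-in commands are made up for this illustration (they simply sleep briefly); the real cssbackup command lines from the question would take their place.

```python
import subprocess
import sys
import time


def wait_for_all(commands, interval=0.1):
    """Start every command, then poll until each child process has exited.

    wait_for_all is a hypothetical helper written for this sketch.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    pending = set(range(len(procs)))
    while pending:
        for i in sorted(pending):
            # poll() returns None while the child is still running
            # and its exit code once it has finished.
            if procs[i].poll() is not None:
                pending.discard(i)
        if pending:
            time.sleep(interval)  # avoid busy-waiting between checks
    return [p.returncode for p in procs]


# Stand-in commands (assumption): three short-lived Python child processes.
commands = [[sys.executable, "-c", "import time; time.sleep(0.2)"] for _ in range(3)]
exit_codes = wait_for_all(commands)
print(exit_codes)
```

Once poll() has returned an exit code the child has already terminated, so there is no need to call kill() on it afterwards.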
I have the code below to run velvet. I can run velveth with no problems, but velvetg does not recognise its parameters. I have checked the documentation, and I cannot see anything different from what I have. When the programme reaches velvetg, I get the following message: [0.000000] unknown option: -ins_length 500.
import glob, sys, os, subprocess
def velvet_project():
    print('starting_velvet')
    # directory where I copied the two test files; H* matches the subdirectories,
    # and *.fastq.gz is used below to process all the files with that extension
    folders = glob.glob('/home/my_name/fastqs_test/H*')
    #print folders
    for folder in folders:
        print(folder)
        #looking for fastqs in each folder
        fastqs=glob.glob(folder + '/*.fastq.gz')
        #print fastqs
        strain_id = os.path.basename(folder)
        output= '/home/my_name/velvet_results/' + strain_id + '_velvet'
        if os.path.exists(output):
            print('velvet folder already exists')
        else:
            os.makedirs(output)
        #cmd is a command line within the programme#
        cmd=['velveth', output, '59', '-fastq.gz','shortPaired',fastqs[0],fastqs[1]]
        #print cmd
        my_file=subprocess.Popen(cmd)#I got this from the documentation.
        my_file.wait()
        print('velveth has finished')
        cmd_2=['velvetg', output, '-ins_length 500', '-exp_cov auto', '-scaffoding no']
        print(cmd_2)
        my_file_2=subprocess.Popen(cmd_2)
        my_file_2.wait()
        print("velvet has finished :)")
print('start')
velvet_project()
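One likely explanation, offered as a sketch rather than a confirmed diagnosis: when Popen is given a list, each element reaches the program as exactly one argument, so '-ins_length 500' arrives as a single token containing a space, which matches the "unknown option: -ins_length 500" message. Splitting each flag from its value gives the program separate arguments. The helper split_options and the output path below are made up for this illustration; the option strings are copied unchanged from the question.

```python
import shlex

# Option strings exactly as written in the question.
options = ['-ins_length 500', '-exp_cov auto', '-scaffoding no']


def split_options(opts):
    """Split each 'flag value' string into separate argument tokens.

    split_options is a hypothetical helper written for this sketch.
    """
    args = []
    for opt in opts:
        args.extend(shlex.split(opt))
    return args


output = '/home/my_name/velvet_results/example_velvet'  # hypothetical path
cmd_2 = ['velvetg', output] + split_options(options)
print(cmd_2)
# The resulting list can then be passed to subprocess.Popen(cmd_2) as before.
```

The option names are passed through as the question spells them; it may be worth checking the spelling of the last one against the velvetg help output.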
I have an issue putting files whose names contain hyphens ("-") onto a server, and I think it may be because of how Linux treats such files, but I am in no way sure. The script scans a folder for pictures/items, puts them in a list and then transfers all items to the server.
This is a part of the script:
def _transferContent(locale):
    ## Transferring images to server
    now = datetime.datetime.now()
    localImages = '/home/bcns/Pictures/upload/'
    localList = os.listdir(localImages)
    print("Found local items: ")
    print(localList)
    fname = "/tmp/backup_images_file_list"
    f = open(fname, 'r')
    remoteList = f.read()
    remoteImageLocation = "/var/www/bcns-site/pics/photos/backup_" + locale + "-" + `now.year` + `now.month` + `now.day` + "/"
    print("Server image location: " + remoteImageLocation)
    ## Checking local list against remote list (from the server)
    for localItem in localList:
        localItem_fullpath = localImages + localItem
        if os.path.exists(localItem_fullpath):
            if localItem in remoteList:
                print("Already exists: " + localItem)
            else:
                put(localItem_fullpath, remoteImageLocation)
        else:
            print("File not found: " + localItem)
And this is the output:
Directory created successfully
/tmp/bcns_deploy/backup_images_file_list
[<server>] download: /tmp/backup_images_file_list <- /tmp/bcns_deploy/backup_images_file_list
Warning: Local file /tmp/backup_images_file_list already exists and is being overwritten.
Found local items:
['darth-vader-mug.jpg', 'gun-dog-leash.jpg', 'think-safety-first-sign.jpg', 'hzmp.jpg', 'cy-happ-short-arms.gif', 'Hogwarts-Crest-Pumpkin.jpg']
Server image location: /var/www/bcns-site/pics/photos/backup_fujitsu-20131031/
[<server>] put: /home/bcns/Pictures/upload/darth-vader-mug.jpg -> /var/www/bcns-site/pics/photos/backup_fujitsu-20131031/
Fatal error: put() encountered an exception while uploading '/home/bcns/Pictures/upload/darth-vader-mug.jpg'
Underlying exception:
Failure
I have tried removing the hyphens, and then the transfer works just fine.
The server runs Ubuntu 12.04 and the client runs Debian 7.1, both on ext3 disks.
It's an irritating error; does anyone here have a clue what might cause it?
Dashes in command line options in Linux matter, but dashes in the middle of filenames are fine.
Check the file permissions: it's possible that when you transfer one file manually, the permissions are set differently than when Fabric transfers it.
I suggest using put() to transfer a directory at a time. This will help to make sure all the files (and permissions) are what they should be.
Example (untested):
def _transferContent(locale):
    ## Transferring images to server
    now = datetime.datetime.now()
    localImageDir = '/home/bcns/Pictures/upload/'
    remoteImageDir = "/var/www/bcns-site/pics/photos/backup_" + locale + "-" + `now.year` + `now.month` + `now.day` + "/"
    print("Server image location: " + remoteImageDir)
    put(localImageDir, remoteImageDir)
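For the permissions check suggested above, a small standard-library sketch like the following could be used to print the mode of each local file before uploading. The helper describe_mode and the throwaway demo file are made up for this illustration, and the sketch assumes Python 3 on a POSIX system; it does not use Fabric at all.

```python
import os
import stat
import tempfile


def describe_mode(path):
    """Return the permission bits of path as a string such as '-rw-r--r--'.

    describe_mode is a hypothetical helper written for this sketch.
    """
    return stat.filemode(os.stat(path).st_mode)


# Demonstration on a throwaway file with hyphens in its name (assumption);
# the real image paths from the upload folder would be used instead.
with tempfile.NamedTemporaryFile(suffix="-with-hyphens.jpg", delete=False) as tmp:
    demo_path = tmp.name
os.chmod(demo_path, 0o644)
demo_mode = describe_mode(demo_path)
print(demo_path, demo_mode)
os.remove(demo_path)
```

Comparing this output for a file that uploads successfully and one that fails would show whether permissions differ between them.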
I'm trying to append a file to a .jar file using Python, but I keep hitting an error which states: could not find main class: jar.
def add_to_jar():
    jarfile = "jarfile.jar"
    skin_image = Skin_Name.text() # this stores the full path to the file to be appended
    cmd = 'java jar uf ' + jarfile + " " + skin_image
    proc = subprocess.Popen(cmd, shell=True)
Any help appreciated.
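That sounds like an issue with how the java interpreter is being invoked: java treats its first non-option argument as the name of the main class, so java jar uf ... asks it to look up a class called jar on the class path, which is typically defined as an environment variable. Do you need to run jar through java? Often jar is installed as a binary command by itself. Does it work if you just change it to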
cmd = 'jar uf ' + jarfile + " " + skin_image
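A minimal sketch of that suggestion, assuming the JDK's jar tool is on the PATH: passing the command as a list avoids shell=True and keeps paths containing spaces intact. The helper build_jar_update_cmd and the example file names are made up for this illustration; the sketch only builds and prints the command, leaving the actual call commented out.

```python
import subprocess


def build_jar_update_cmd(jarfile, new_file):
    """Build the argument list for 'jar uf <jarfile> <new_file>'.

    build_jar_update_cmd is a hypothetical helper written for this sketch.
    """
    return ["jar", "uf", jarfile, new_file]


# Example values (assumption); the question takes the path from Skin_Name.text().
cmd = build_jar_update_cmd("jarfile.jar", "skins/my skin.png")
print(cmd)
# To actually run it (requires the jar tool to be installed):
# subprocess.check_call(cmd)
```

check_call raises an exception if jar exits with a non-zero status, which makes failures visible instead of silently ignored.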