Unable to package a jar file using the jar cvf command - python

I'm running a Jenkins job that has to compile and package a bunch of java files into a jar file. The Jenkins job calls a Python script that contains the steps to be automated. I've iteratively verified that all previous steps work successfully. But when my python code executes
jar cvf jarFile.jar .
which is the last step, the code breaks with the following error:
Error occurred while deleting java files....
Usage: jar {ctxui}[vfmn0PMe] [jar-file] [manifest-file] [entry-point] [-C dir] files ...
Options:
-c create new archive
-t list table of contents for archive
-x extract named (or all) files from archive
-u update existing archive
-v generate verbose output on standard output
-f specify archive file name
-m include manifest information from specified manifest file
-n perform Pack200 normalization after creating a new archive
-e specify application entry point for stand-alone application
bundled into an executable jar file
-0 store only; use no ZIP compression
-P preserve leading '/' (absolute path) and ".." (parent directory) components from file names
-M do not create a manifest file for the entries
-i generate index information for the specified jar files
-C change to the specified directory and include the following file
If any file is a directory then it is processed recursively.
The manifest file name, the archive file name and the entry point name are
specified in the same order as the 'm', 'f' and 'e' flags.
Example 1: to archive two class files into an archive called classes.jar:
jar cvf classes.jar Foo.class Bar.class
Example 2: use an existing manifest file 'mymanifest' and archive all the
files in the foo/ directory into 'classes.jar':
jar cvfm classes.jar mymanifest -C foo/ .
My best guess is that, because the underlying OS is Linux, it's facing some sort of permission issue, or that a previous process is still holding a file that the java subprocess is trying to delete.
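One way to narrow this down (a sketch, not the original script; build_dir is a hypothetical placeholder for wherever the compiled classes live) is to run the jar step from Python with the working directory set explicitly and both output streams captured, so whatever made jar print its usage text shows up in the Jenkins log:
import subprocess

build_dir = "/path/to/compiled/classes"  # hypothetical: directory containing the files to package

# Run jar with an explicit working directory and capture stdout/stderr so the
# exact failure (exit code plus the usage message) appears in the Jenkins log.
result = subprocess.run(
    ["jar", "cvf", "jarFile.jar", "."],
    cwd=build_dir,
    capture_output=True,
    text=True,
)
print(result.stdout)
print(result.stderr)
result.check_returncode()  # raises CalledProcessError if jar exited non-zero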

Related

sh script to find files (only xml) in a directory, execute a python script (.py) on every file found, and source the changed files into a target directory

I have a python script (as a .py file) that modifies xml file headers. I want to find all the files with extension .xml in a directory, copy them to a different directory, and run the python script on all the *.xml files that are copied. I would like to source all the changed files through the python script into a different directory.
Currently I am at this step:
find . -type f -name "*.xml" -exec cp {} tempdir \;
I am new to shell and I do not know how to execute the python program on each file that's sourced to tempdir and output to a new targetdir.
My python command to execute looks something like this: python "xmlchange.py" -i "tempdir" -t "targetdir"
I am thinking of a foreach or a for loop, whichever is appropriate in this context.
Any inputs appreciated. TIA!
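A minimal sketch of one way to wire this together in Python instead of a shell loop (it assumes xmlchange.py accepts the -i input directory and -t target directory exactly as shown above, which is the only thing known about it):
import shutil
import subprocess
from pathlib import Path

source_dir = Path(".")       # where the original *.xml files live
tempdir = Path("tempdir")    # staging copy, as in the question
targetdir = Path("targetdir")
tempdir.mkdir(exist_ok=True)
targetdir.mkdir(exist_ok=True)

# Copy every *.xml found under source_dir into tempdir
for xml_file in source_dir.rglob("*.xml"):
    shutil.copy(xml_file, tempdir / xml_file.name)

# Run the header-modifying script once over the staged directory
subprocess.run(
    ["python", "xmlchange.py", "-i", str(tempdir), "-t", str(targetdir)],
    check=True,
)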

Wikipedia Extractor as a parser for Wikipedia Data Dump File

I've tried to convert a bz2 file to text with Wikipedia Extractor (http://medialab.di.unipi.it/wiki/Wikipedia_Extractor). I downloaded the wikipedia dump with the bz2 extension, then on the command line used this line of code:
WikiExtractor.py -cb 250K -o extracted itwiki-latest-pages-articles.xml.bz2
This gave me a result that can be seen in the link:
However, further on it is stated:
In order to combine the whole extracted text into a single file one can issue:
> find extracted -name '*bz2' -exec bzip2 -c {} \; > text.xml
> rm -rf extracted
I get the following error:
File not found - '*bz2'
What can I do?
Please go through this; it should help:
Error using the 'find' command to generate a collection file on opencv
The commands mentioned on the WikiExtractor page are for Unix/Linux systems and won't work on Windows.
The find command you ran on Windows works differently from the one on Unix/Linux.
The extraction part works fine on both Windows and Linux as long as you run it with the python prefix:
python WikiExtractor.py -cb 250K -o extracted your_bz2_file
You would see an extracted folder created in the same directory as your script.
After that, the find command is supposed to work like this, but only on Linux:
find extracted -name '*bz2' -exec bzip2 -c {} \; > text.xml
find everything in the extracted folder that matches bz2, then execute the bzip2 command on those files, and put the result in the text.xml file.
Also, if you run the bzip2 -help command, which is supposed to run with the find command above, you would see that it won't work on Windows; on Linux you get the following output.
gaurishankarbadola@ubuntu:~$ bzip2 -help
bzip2, a block-sorting file compressor. Version 1.0.6, 6-Sept-2010.
usage: bzip2 [flags and input files in any order]
-h --help print this message
-d --decompress force decompression
-z --compress force compression
-k --keep keep (don't delete) input files
-f --force overwrite existing output files
-t --test test compressed file integrity
-c --stdout output to standard out
-q --quiet suppress noncritical error messages
-v --verbose be verbose (a 2nd -v gives more)
-L --license display software version & license
-V --version display software version & license
-s --small use less memory (at most 2500k)
-1 .. -9 set block size to 100k .. 900k
--fast alias for -1
--best alias for -9
If invoked as `bzip2', default action is to compress.
as `bunzip2', default action is to decompress.
as `bzcat', default action is to decompress to stdout.
If no file names are given, bzip2 compresses or decompresses
from standard input to standard output. You can combine
short flags, so `-v -4' means the same as -v4 or -4v, &c.
As mentioned above, bzip2's default action is to compress, so use bzcat for decompression.
The modified command, which would work only on Linux, looks like this:
find extracted -name '*bz2' -exec bzcat -c {} \; > text.xml
It works on my Ubuntu system.
EDIT :
For Windows :
BEFORE YOU TRY ANYTHING, PLEASE GO THROUGH THE INSTRUCTIONS FIRST
1. Create a separate folder and put the files in it. Files --> WikiExtractor.py and itwiki-latest-pages-articles1.xml-p1p277091.bz2 (in my case, since it is a small file I could find).
2. Open a command prompt in the current directory and run the following command to extract all the files.
python WikiExtractor.py -cb 250K -o extracted itwiki-latest-pages-articles1.xml-p1p277091.bz2
It will take time based on the file size, but the directory would now look like this.
CAUTION: If you already have the extracted folder, move it to the current directory so that it matches the image above and you don't have to do the extraction again.
Copy and paste the code below and save it in a file named bz2_Extractor.py.
import argparse
import bz2
import logging
from datetime import datetime
from os import listdir
from os.path import isfile, join, isdir

FORMAT = '%(levelname)s: %(message)s'
logging.basicConfig(format=FORMAT)
logger = logging.getLogger()
logger.setLevel(logging.INFO)


def get_all_files_recursively(root):
    # Collect every file under root, descending into subdirectories.
    files = [join(root, f) for f in listdir(root) if isfile(join(root, f))]
    dirs = [d for d in listdir(root) if isdir(join(root, d))]
    for d in dirs:
        files_in_d = get_all_files_recursively(join(root, d))
        if files_in_d:
            for f in files_in_d:
                files.append(join(f))
    return files


def bzip_decompress(list_of_files, output_file):
    # Decompress each bz2 file as text and append it to a single output file.
    start_time = datetime.now()
    with open(f'{output_file}', 'w+', encoding="utf8") as output_file:
        for file in list_of_files:
            with bz2.open(file, 'rt', encoding="utf8") as bz2_file:
                logger.info(f"Reading/Writing file ---> {file}")
                output_file.writelines(bz2_file.read())
                output_file.write('\n')
    stop_time = datetime.now()
    print(f"Total time taken to write out {len(list_of_files)} files = {(stop_time - start_time).total_seconds()}")


def main():
    parser = argparse.ArgumentParser(description="Input fields")
    parser.add_argument("-r", required=True)
    parser.add_argument("-n", required=False)
    parser.add_argument("-o", required=True)
    args = parser.parse_args()
    all_files = get_all_files_recursively(args.r)
    # -n is optional; when it is not given, write out every file found.
    count = int(args.n) if args.n else len(all_files)
    bzip_decompress(all_files[:count], args.o)


if __name__ == "__main__":
    main()
Now the current directory would look like this.
Now open a cmd in the current directory and run the following command.
Please read what each option in the command does.
python bz2_Extractor.py -r extracted -o output.txt -n 10
-r : The root directory you have bz2 files in.
-o : Output file name
-n : Number of files to write out. [If not provided, it writes out all the files inside root directory]
CAUTION: I can see that your file is in gigabytes and has more than half a million articles. If you try to put all of that into a single file using the above command, I'm not sure what would happen or whether your system could survive it; even if it did, the output file would be so large (it is extracted from a 2.8GB file) that I don't think the Windows OS would be able to open it directly.
So my suggestion would be to process 10000 files at a time.
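A sketch of that batching idea (not part of the original answer; it assumes the bz2_Extractor.py above is saved in the same directory so its two helpers can be imported):
# Hypothetical batching wrapper around the helpers defined in bz2_Extractor.py.
from bz2_Extractor import get_all_files_recursively, bzip_decompress

all_files = get_all_files_recursively("extracted")
chunk_size = 10000
for i in range(0, len(all_files), chunk_size):
    # Each chunk of 10000 files goes to its own output file: output_0.txt, output_1.txt, ...
    bzip_decompress(all_files[i:i + chunk_size], f"output_{i // chunk_size}.txt")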
Let me know if this works for you.
PS: For the above command, the output looks like this.

Bash script: No such file or directory

So, I need to run this run.sh file and I could not with the default Windows CMD.
So I installed the Cygwin64 Terminal and it actually reads the file, but at the end it throws an error:
$ /cygdrive/c/Python27/Scripts/./run.sh
Starting scraper
Scrape complete, checking movies with imdb
C:\python27\python.exe: can't open file 'check_imdb.py': [Errno 2] No such file or directory
Inside run.sh:
#!/bin/bash
echo "Starting scraper"
scrapy runspider cinema_scraper.py -t json --nolog -o - > "movies.json"
echo "Scrape complete, checking movies with imdb"
python check_imdb.py movies.json
check_imdb.py is in the same folder as run.sh.
The file is referenced inside the script as a relative path.
python check_imdb.py movies.json
Relative means that it does not specify the whole path (starting with /), and is interpreted relative to the current directory, which you can find out with:
pwd
A path starting with / is said to be absolute.
The important thing is to remember a script interprets paths relative to the current directory, not the directory where the script is located.
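The same rule applies inside Python itself; a quick illustration (purely for demonstration, using the file name from the error above):
import os

print(os.getcwd())                       # the current working directory, same as pwd
print(os.path.abspath("check_imdb.py"))  # a relative name is resolved against the current
                                         # directory, not the directory containing the script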
You could change to the directory of the script before running it, with:
cd /cygdrive/c/Python27/Scripts
But if you do that, then you will need to provide an absolute path on the command line to your movies.json file.
Better yet, modify the script to have an absolute path:
python /cygdrive/c/Python27/Scripts/check_imdb.py movies.json

Jenkins doesn't include referenced files when building conda package

I am building a small conda package with Jenkins (linux) that should just:
Download a .zip from an external reference holding font files
Extract the .zip
Copy the font files to a specific folder
Build the package
The build runs successfully, but the package does not include the font files and is basically empty. My build.sh has:
mkdir $PREFIX\root\share\fonts
cp *.* $PREFIX\root\share\fonts
My meta.yaml source has:
source:
  url: <ftp server url>/next-fonts.zip
  fn: next-fonts.zip
In Jenkins I do:
mkdir build
conda build fonts
The console output is strange though at this part:
+ mkdir /var/lib/jenkins/conda-bld/fonts_1478708638575/_b_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_prootsharefonts
+ cp Lato-Black.ttf Lato-BlackItalic.ttf Lato-Bold.ttf Lato-BoldItalic.ttf Lato-Hairline.ttf Lato-HairlineItalic.ttf Lato-Italic.ttf Lato-Light.ttf Lato-LightItalic.ttf Lato-Regular.ttf MyriadPro-Black.otf MyriadPro-Bold.otf MyriadPro-Light.otf MyriadPro-Regular.otf MyriadPro-Semibold.otf conda_build.sh /var/lib/jenkins/conda-bld/fonts_1478708638575/_b_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_prootsharefonts
BUILD START: fonts-1-1
Source cache directory is: /var/lib/jenkins/conda-bld/src_cache
Found source in cache: next-fonts.zip
Extracting download
Package: fonts-1-1
source tree in: /var/lib/jenkins/conda-bld/fonts_1478708638575/work/Fonts
number of files: 0
To me it seems the cp either doesn't complete or copies to the wrong directory. Unfortunately, with the placeholder stuff I really can't decipher where exactly the fonts land when they are copied; all I know is that in /work/Fonts there are no files, and thus nothing is included in the package. While typing, I also noted that /work/Fonts actually has Fonts starting with a capital F, while nowhere in the configuration or the scripts is there any instance of fonts starting with a capital F.
Any insight on what might go wrong?
mkdir $PREFIX\root\share\fonts
cp *.* $PREFIX\root\share\fonts
should be replaced with
mkdir $PREFIX/root/share/fonts
cp * $PREFIX/root/share/fonts
The build script was taken from another package that was built on Windows, and in changing it I forgot to change the folder separators.
Additionally, creating a nested subfolder structure with a single mkdir call (without -p) isn't possible on Linux, while it is on Windows. So this
mkdir $PREFIX/root/
mkdir $PREFIX/root/share/
mkdir $PREFIX/root/share/fonts/
cp * $PREFIX/root/share/fonts/
was the ultimate solution to the problem.

Re-write write-protected file

Every 4 hours files are updated with new information if needed - i.e. if any new information has been processed for that particular file (files correspond to people).
I'm running this command to convert my .stp files (those being updated every 4 hours) to .xml files.
rule convert_waveform_stp:
    input: '/data01/stpfiles/{file}.Stp'
    output: '/data01/workspace/bm_data/xmlfiles/{file}.xml'
    shell:
        '''
        mono /data01/workspace/bm_software/convert.exe {input} -o {output}
        '''
My script is in Snakemake (Python-based), but I'm running convert.exe through a shell command.
I'm getting an error on the ones already processed using convert.exe. They are saved by convert.exe as write-protected and there is no option to bypass this within the executable itself.
Error Message:
ProtectedOutputException in line 14 of /home/Snakefile:
Write-protected output files for rule convert_waveform_stp:
/data01/workspace/bm_data/xmlfiles/PID_1234567.xml
I'd still like them to be write-protected but would also like to be able to update them as needed.
Is there something I can add to my shell command to write over the write protected files?
Take a look at the os standard library package:
https://docs.python.org/3.5/library/os.html?highlight=chmod#os.chmod
It allows for chmod with the following caveat:
Although Windows supports chmod(), you can only set the file’s read-only flag with it (via the stat.S_IWRITE and stat.S_IREAD constants or a corresponding integer value). All other bits are ignored.
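A minimal sketch of how os.chmod could be used here (not from the original answer; it assumes the goal is to make the existing output .xml writable before convert.exe regenerates it and to mark it read-only again afterwards; the path is the one from the error message):
import os
import stat

# Example output file taken from the error message above.
xml_path = "/data01/workspace/bm_data/xmlfiles/PID_1234567.xml"

# Add the owner's write bit before regenerating the file...
if os.path.exists(xml_path):
    os.chmod(xml_path, os.stat(xml_path).st_mode | stat.S_IWUSR)

# ... run the conversion here (e.g. the mono convert.exe shell command) ...

# ... and make the file read-only again afterwards.
os.chmod(xml_path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)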
@VickiT05, I thought you wanted it in python. Try this:
Check the original file permissions with:
ls -l [your file name]
stat -c %a [your file name]
Change the protection with:
chmod 777 [your file name]
Change back to the original file mode or whatever mode you want:
chmod [original file protection mode] [your file name]
