Facing an issue while listing file contents in Cygwin - Python

Context: I want to install an ".msi" file on a remote Windows machine via a Python script.
I have installed Cygwin on the remote Windows machine and the SSH service is running. I execute the command on the remote Windows machine over SSH from a Linux host using a Python script. To install the MSI file I have used the command below:
msiexec /package "msi file name" /quiet /norestart /log "log file name (say instlog.log)"
Now, to verify that the installation succeeded, I list the contents of the log file (instlog.log) and check for the string "Installation success or error status: 0".
Problem:
"type" command does not work in cygwin. So i tried "cd {0}; cat {1} | tail -5".format(FileLocation, FileName) to list file contents but i am getting output in different format and python script is unable to match above mentioned string in output. This is want i want to display on console:
MSI (s) (64:74) [18:03:51:360]: Windows Installer installed the product. Product Name: pkg-name. Product Version: 0.2.24-10891. Product Language: 1033. Manufacturer: XYZ Company. Installation success or error status: 0.
And what I actually get is:
M S I ( s ) ( 6 4 : 7 4 ) [ 1 8 : 0 3 : 5 1 : 3 6 0 ] : W i n d o w s I n s t a l l e r i n s t a l l e d t h e p r o d u c t . P r o d u c t N a m e : p k g - n a m e . P r o d u c t V e r s i o n : 0 . 2 . 2 4 - 1 0 8 9 1 . P r o d u c t L a n g u a g e : 1 0 3 3 . M a n u f a c t u r e r : X Y Z C o m p a n y . I n s t a l l a t i o n s u c c e s s o r e r r o r s t a t u s : 0 .
So somehow an extra space is introduced after each character in the output. How can I get the output in a normal form rather than this space-separated format? Thank you.

The problem is that msiexec saved its log file in Unicode (UTF-16) format. On Windows this means that every character you see is stored as 2 bytes; for plain ASCII text like this log, one byte of each pair is the character itself and the other is 0 (or \0 or \x00 or NULL). Some popular editors are smart enough to figure the encoding out and display only the characters, leaving the interleaved NULL bytes aside. Now there are some ways to get through this.
Upgrade Cygwin. On my computer (I also have Cygwin installed) I don't experience this problem (my Cygwin uses GNU coreutils 8.15; this can be checked, for example, by typing tail --version). Here are some outputs (I included the hexdump at the end to show you that the file is in Unicode format):
cat unicode.txt
yields: unicode chars
tail unicode.txt
yields: unicode chars
hexdump unicode.txt
yields:
0000000 0075 006e 0069 0063 006f 0064 0065 0020
0000010 0063 0068 0061 0072 0073 000d 000a
000001e
Convert the msiexec logs to ASCII format. I am not aware of any native tool that does this, but you can search for a Unicode-to-ASCII converter and download such a tool; or, as I mentioned earlier, there are editors that understand Unicode (one that I've already tried and that is able to convert files from Unicode to ASCII is TextPad); or you can write the tool yourself (a rough sketch is given after the last option below).
If you're reading the MSI log file from Python, you could handle the Unicode file from within the script. I assume that you have some code that reads the file contents like this (!!! I didn't include any exception handling !!!):
f = open("some_msi_log_file.log", "rb")
text = f.read()
f.close()
and you're doing the processing on text. If you modify the code above to:
f = open("some_msi_log_file.log", "rb")
unicode_text = f.read()
f.close()
text = "".join([char for char in unicode_text if char != '\x00'])
text won't contain the \x00s anymore (and will also work with regular ASCII files).
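And if you decide to write the converter yourself (the second option above), a minimal sketch might look like the following. It assumes the log really is UTF-16, which is what msiexec writes, and it simply drops anything that does not fit into ASCII; the file names are just examples:
# Rough sketch: convert a UTF-16 msiexec log into a plain ASCII copy.
f = open("instlog.log", "rb")
data = f.read()
f.close()
text = data.decode("utf-16")               # handles the BOM if present
out = open("instlog_ascii.log", "w")
out.write(text.encode("ascii", "ignore"))  # drop anything non-ASCII
out.close()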

The log file should be converted to an 8-bit-wide format such as UTF-8. This can be achieved with the iconv command. You should install it with the Cygwin installer, and after that use the following command:
iconv -f ucs2 -t utf8 instlog.log > instlog2.log
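After the conversion, the check from the question works on the new file with an ordinary read, for example (a sketch using the converted file name from above):
with open('instlog2.log') as f:
    installed_ok = 'Installation success or error status: 0' in f.read()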

Related

How can I paste the contents of 2 files, or a single file multiple times?

I mostly use one-liners in shell scripting.
If I have a file with the contents below:
1
2
3
and want it to be pasted like:
1 1
2 2
3 3
how can I do it in shell scripting using a Python one-liner?
PS: I tried the following:
python -c "file = open('array.bin','r' ) ; cont=file.read ( ) ; print cont*3;file.close()"
but it printed the contents like:
1
2
3
1
2
3
file = open('array.bin', 'r')
cont = file.readlines()
for line in cont:
    print line.strip(), line.strip()
file.close()
You could replace your print cont*3 with the following:
print '\n'.join(' '.join(ch * n) for ch in cont.strip().split())
Here n is the number of columns. (Note that this relies on each token being a single character, as in the sample input, because ' '.join of a string puts a space between every character of the repeated token.)
You need to break up the lines and then reassemble them:
One-liner:
python -c "file=open('array.bin','r'); cont=file.readlines(); print '\n'.join([' '.join([c.strip()]*2) for c in cont]); file.close()"
Long form:
file=open('array.bin', 'r')
cont=file.readlines()
print '\n'.join([' '.join([c.strip()]*2) for c in cont])
file.close()
With array.bin having:
1
2
3
Gives:
1 1
2 2
3 3
Unfortunately, you can't use a simple for statement for a one-liner solution (as suggested in a previous answer). As this answer explains, "as soon as you add a construct that introduces an indented block (like if), you need the line break."
Here's one possible solution that avoids this problem:
Open file and read lines into a list
Modify the list (using a list comprehension). For each item:
Remove the trailing new line character
Repeat it n times, joined with spaces
Join the modified list using the new line character as separator
Print the joined list and close the file
Detailed/long form (n = number of columns):
f = open('array.bin', 'r')
n = 5
original = list(f)
modified = [' '.join([line.strip()] * n) for line in original]
print('\n'.join(modified))
f.close()
One-liner:
python -c "f = open('array.bin', 'r'); n = 5; print('\n'.join([line.strip()*n for line in list(f)])); f.close()"
REPEAT_COUNT=3 && cat contents.txt | python -c "print('\n'.join(' '.join([w.strip()] * ${REPEAT_COUNT}) for w in open('/dev/stdin').readlines()))"
First, test from the command prompt:
paste -d" " array.bin array.bin
EDIT:
The OP wants to use a variable n that gives how many columns are needed.
There are different ways to repeat a command 10 times, such as
for i in {1..10}; do echo array.bin; done
seq 10 | xargs -I -- echo "array.bin"
source <(yes echo "array.bin" | head -n10)
yes "array.bin" | head -n10
Other ways are given by https://superuser.com/a/86353 and I will use a variation of
printf -v spaces '%*s' 10 ''; printf '%s\n' ${spaces// /ten}
My solution is
paste -d" " $(printf "%*s" $n " " | sed 's/ /array.bin /g')

Finding fonts by examining a hex dump of a PSD file

I want to be able to find out what fonts a PSD file uses, with Python. I was able to read the PSD file as a binary file and convert the contents into hex.
>>> import binascii
>>> with open(test_file,'rb') as f:
...     content = f.read()
...     hex_content = binascii.hexlify(content)
Then I decoded the hex contents into a text file.
>>> with open('./decoded1.txt', 'w') as f:
...     f.write(hex_content.decode("hex"))
Near the bottom of the decoded file, I found some sort of header named /FontSet, which I think is what I am looking for.
/FontSet [
<<
/Name (þÿ A d o b e I n v i s F o n t)
/Script 0
/FontType 0
/Synthetic 0
>>
<<
/Name (þÿ M y r i a d P r o - R e g u l a r)
/Script 0
/FontType 0
/Synthetic 0
>>
]
Am I on the right track? I recognize MyriadPro-Regular as the font used in my test file. What is AdobeInvisFont? Is this the Adobe Blank font?
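For what it's worth, the same /FontSet scan can be done directly on the binary contents without writing out a decoded file. A rough sketch, assuming (as the dump above suggests) that the names are stored as UTF-16BE strings introduced by the þÿ byte-order mark and that no font name contains a ')':
import re

with open(test_file, 'rb') as f:
    content = f.read()

# Illustrative pattern, not part of any PSD specification:
# grab the payload of each /Name ( <UTF-16BE text> ) entry.
for raw_name in re.findall(r'/Name \(\xfe\xff(.*?)\)', content, re.DOTALL):
    print raw_name.decode('utf-16-be')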

How do you simulate awk in Python, for multiline output?

I am used to using awk to retrieve a column from a file.
I need to do something similar now in Python. At the moment I use a subprocess and save the result in a variable.
Is it possible to run something similar to awk in Python, without writing a lot of code? I was looking at split, but I don't get how you parse through multiple lines.
The input that I have is similar to a simple ls -la or netstat -r. I would like to get the 3rd column, so I can do what I would do with
awk '{print $3}'
Example of the source:
a b c d e
1 2 4 5 2
X Y Z S R
The shortest approach that I can think of is a loop that splits each line into strings and prints string[2]. But I am not sure how to write this in the simplest and shortest way; as short as running the awk command in a subprocess.
In bash, using pythonpy
rtb@bartek-laptop ~ $ cat tmp
a b c d e
1 2 4 5 2
X Y Z S R
rtb@bartek-laptop ~ $ cat tmp | py -x "x.split()[2]"
c
4
Z
Or in a script:
with open('tmp') as f:
    result = [line.split()[2] for line in f]
# now result contains the list ['c', '4', 'Z']
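If you'd rather not install pythonpy, roughly the same one-liner can be written with plain Python reading standard input (a sketch; the length check skips lines with fewer than three columns, such as the "total" line printed by ls -la):
ls -la | python -c "import sys; print('\n'.join(line.split()[2] for line in sys.stdin if len(line.split()) > 2))"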

Different output format than expected

I have written code to read the following text file:
Generated by trjconv : a bunch of waters t= 0.00000
3000
1SOL OW 1 -1.5040 2.7580 0.6820
1SOL HW1 2 1.4788 2.7853 0.7702
1SOL HW2 3 1.4640 2.8230 0.6243
2SOL OW 4 1.5210 0.9510 2.2050
2SOL HW1 5 -1.5960 0.9780 2.1520
2SOL HW2 6 1.4460 0.9940 2.1640
1000SOL OW 2998 1.5310 1.7952 2.1981
1000SOL HW1 2999 1.4560 1.7375 -2.1836
1000SOL HW2 3000 1.6006 1.7369 2.2286
3.12736 3.12736 3.12736
Generated by trjconv : a bunch of waters t= 9000.00000
3000
1SOL OW 1 1.1579 0.4255 2.1329
1SOL HW1 2 1.0743 0.3793 2.1385
Written Code:
F = open('Data.gro', 'r')
A = open('TTT.txt', 'w')
XO = []
I = range(1, 10)
with open('Data.gro') as F:
    for line in F:
        if line.split()[0] == '3000':
            A.write('Frame' + '\n')
            for R in I:
                line = next(F)
                P = line.split()
                x = float(P[3])
                XO.append(x)
                if line.split()[2] == '3000':
                   print('Oxygen atoms XYZ coordinates:')
                   A.write('Oxygen atoms XYZ coordinates:' + '\n')
                   A.write("%s\n" % (XO))
                   XO
                   XO[0] - XO[1]
                   XO = []
                else:
                    pass
        else:
            pass
A.close()
First problem:
The output text file looks as follows; everything is printed as one single line:
FrameOxygen atoms XYZ coordinates:[-1.504, 1.4788, 1.464, 1.521, -1.596, 1.446, 1.531, 1.456, 1.6006]FrameOxygen atoms XYZ coordinates:[1.1579, 1.0743, 1.1514, 2.2976, 2.2161, 2.3118, 2.5927, -2.5927, 2.5365]
The output should be like below:
Frame
Oxygen atoms XYZ coordinates:
[-1.504, 1.4788, 1.464, 1.521, -1.596, 1.446, 1.531, 1.456, 1.6006]
Frame
Oxygen atoms XYZ coordinates:
[1.1579, 1.0743, 1.1514, 2.2976, 2.2161, 2.3118, 2.5927, -2.5927, 2.5365]
But when I read it back, the '\n' shows up at the end of each separated point.
Does anyone have an idea?
Second problem
The next problem is that the output is only generated when I copy and paste the code into a Python shell. If I double-click my 'code.py' file, it does not generate the output file. There is no error when I copy and paste the code into the Python shell.
1) Which platform and editor are you using?
'\n' should work as expected.
I suspect you are running the code on Windows and used Notepad to inspect the file. Try using WordPad or another, more capable editor to open TTT.txt. The result should be as expected.
2) If you're double-clicking the script in MS Windows, you are very likely to have missed some exceptions printed by Python. Try running it in a command prompt:
python code.py
Anthoney is correct.
Windows has this issue; use WordPad to open the file.
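If the file must also display correctly in Notepad, one workaround (a sketch, not from the answers above) is to open the output file in binary mode and write Windows-style line endings explicitly:
A = open('TTT.txt', 'wb')
A.write('Frame' + '\r\n')
A.write('Oxygen atoms XYZ coordinates:' + '\r\n')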
To answer your first question:
'\n', an escaped n, is the newline character.
To answer your second question:
A frequent problem when pasting into a shell is that the pasting occurs faster than the shell processes it, meaning that the lines could be ignored by the shell.
Another issue you might have, particularly if you're pasting the above code into a shell, is the inconsistent indentation.
Your if and else are not lined up, probably because you only indented 3 spaces from the preceding line.
if line.split()[2] == '3000':
   print('Oxygen atoms XYZ coordinates:')
   A.write('Oxygen atoms XYZ coordinates:' + '\n')
   A.write("%s\n" % (XO))
   XO
   XO[0] - XO[1]
   XO = []
else:
    pass
Also, you could nest your openings of files. In particular, this line is redundant, and could be removed:
F = open('Data.gro', 'r')
And you can do this:
...
with open('Data.gro') as F:
    with open('TTT.txt', 'w') as A:
        ...
That way, if you have an error while writing your file, it will still at least be closed. (Which also means you can remove the A.close() at the end.)
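As a side note, since Python 2.7 both files can be opened in a single with statement, which keeps the nesting flat (a sketch; the loop body is whatever processing you already have):
with open('Data.gro') as F, open('TTT.txt', 'w') as A:
    for line in F:
        # ... your existing processing of the line goes here ...
        pass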

Reading path names from a file in Python under Windows

I have a Python script that reads a list of path names from a file and opens them using the gzip module. It works well under Linux. But when I run it under Windows, I get an error when calling the gzip.open function. The error message is as follows:
File "C:\dev_tools\Python27\lib\gzip.py", line 34, in open
return GzipFile(filename, mode, compresslevel)
File "C:\dev_tools\Python27\lib\gzip.py", line 89, in __init__
fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
TypeError: file() argument 1 must be encoded string without NULL bytes, not str
The filename should be something like
'G:\ext_pt1\cfx33_50instr4_testset\cfx33_50instr4_0-99\cfx33_50instr4_cov\cfx33_50instr4_id0_cov\cfx33_50instr4_id0.detail.rpt.gz'
But when I printed the filename, it printed out something like
' ■G : \ e x t _ p t 1 \ c f x 3 3 _ 5 0 i n s t r 4 _ t e s t s e t \
c f x 3 3 _ 5 0 i n s t r 4 _ 0 - 9 9 \ c f x 3 3 _ 5 0 i n s t r 4 _
c o v \ c f x 3 3 _ 5 0 i n s t r 4 _ i d 0 _ c o v \ c f x 3 3 _ 5 0
i n s t r 4 _ i d 0 . d e t a i l . r p t . g z'
And when I printed repr(filename), it printed out something like
'\xff\xfeG\x00:\x00\\x00e\x00x\x00t\x00_\x00p\x00t\x001\x00\\x00c\x00f\x00x\x003\x003\x00_\x005\x000\x00i\x00n\x00s\x00t\x00r\x004\x00_\x00t\x00e\x00s\x00t\x00s\x00e\x00t\x00\\x00c\x00f\x00x\x003\x003\x00_\x005\x000\x00i\x00n\x00\x00t\x
00r\x004\x00_\x000\x00-\x009\x009\x00\\x00c\x00f\x00x\x003\x003\x00_\x005\x000\x00i\x00n\x00\x00t\x00r\x004\x00_\x00c\x00o\x00v\x00\\x00c\x00f\x00x\x003\x003\x00_\x005\x000\x00i\x00n\x00s\x00t\x00r\x004\x00_\x00i\x00d\x000\x00_\x00c\x00o\x00v\x00\\x00c\x00f\x00x\x003\x003\x00_\x005\x000\x00i\x00n\x00s\x00t\x00r\x004\x00_\x00i\x00d\x000\x00.\x00d\x00e\x00t\x00a\x00i\x00l\x00.\x00r\x00p\x00t\x00.\x00g\x00z\x00'
I don't know why Python added those spaces (possibly the NULL bytes?) when it read the file. Does anyone have any clue?
Python has not added anything; it has merely read what is in the file. You have a little-endian UTF-16 string there, as you can plainly tell by the byte-order mark in the first two bytes. If you are not expecting this, you could convert it to ASCII (assuming it doesn't have any non-ASCII characters).
# convert mystring from little-endian UTF-16 with optional BOM to ASCII
mystring = unicode(mystring, encoding="utf-16le").encode("ascii", "ignore")
Or just convert it to proper Unicode and use it that way, if Windows will tolerate it:
mystring = unicode(mystring, encoding="utf-16le").lstrip(u"\ufeff")
Above, I have manually specified the byte order and then stripped off the BOM, rather than specifying "utf-16" as the encoding and letting Python figure out the byte order. This is because the BOM is going to be found once at the beginning of the file, not at the beginning of each line, so if you are converting the lines to Unicode one at a time, you won't have a BOM most of the time.
However, it might make more sense to go back to the source of that file and figure out why it's being saved in little-endian UTF-16 if you expected ASCII. Is the file generated the same way on Linux and Windows, for instance? Has it been touched by a text editor that defaults to saving as Unicode? Etc.
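If the whole path list is being read line by line, another option is to let the codecs module decode the file in one go, so each line comes back as ordinary text with the BOM already stripped. A sketch, assuming Python 2.7 and a path-list file called paths.txt (the name is just an example):
import codecs
import gzip

# Decode the whole listing as UTF-16; codecs picks up the byte order from the BOM.
with codecs.open('paths.txt', 'r', encoding='utf-16') as listing:
    for line in listing:
        path = line.strip().encode('ascii', 'ignore')  # back to a byte string for gzip.open
        if path:
            with gzip.open(path) as g:
                data = g.read()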
It seems that there is a problem with the encoding of your file. The printed file name pasted in your question does not consist of normal characters. Have you saved your path-list file in Unicode format?
I had the same problem. I replaced \ with / and it was OK. Just wanted to remind you of this possibility before going into more advanced remedies.
