Invalid characters for python output file

I have this little script:
from numpy import *
import numpy as np
import scipy.spatial as spt
X= np.loadtxt('edm')
myfile = open('edm.txt','w')
V= spt.distance.pdist(X.T,'sqeuclidean')
P = spt.distance.squareform(V)
print P
myfile.write(P)
And this matrix:
0 199.778354301
201.857546133 0
If I run my program, I get this in the terminal (from the print statement):
[[ 0. 80657.85977805]
[ 80657.85977805 0. ]]
But in my output file, I get invalid characters like the following:
��������z°¶¡±Û#z°¶¡±Û#��������
Do you know why?
Thanks

You can use NumPy's savetxt function to save arrays as text without worrying about encoding.
As in the docs:
>>> np.savetxt('edm.txt', x)  # x is an array

The file contains the raw binary representation of the numbers in the matrix (each value is an 8-byte float), not text, which is why it looks like invalid characters.
$ od -t x1z edm.txt
0000000 00 00 00 00 00 00 00 00 79 a1 a6 c1 1d b1 f3 40 ........y......@
0000020 79 a1 a6 c1 1d b1 f3 40 00 00 00 00 00 00 00 00 y......@........
0000040
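As a quick check, the bytes in the dump above can be unpacked as little-endian 64-bit floats. This is a minimal sketch in Python 3 using only the standard struct module; the names dump_hex, raw and values are illustrative and do not come from the original post.

```python
import struct

# The 32 bytes shown by od, copied from the dump above.
dump_hex = (
    "00 00 00 00 00 00 00 00 79 a1 a6 c1 1d b1 f3 40 "
    "79 a1 a6 c1 1d b1 f3 40 00 00 00 00 00 00 00 00"
)
raw = bytes.fromhex(dump_hex)

# Interpret the bytes as four little-endian IEEE 754 doubles ("<4d").
values = struct.unpack("<4d", raw)

# The two off-diagonal entries come out at approximately 80657.85977805,
# the same number that print showed in the terminal.
print(values)
```

So the "invalid characters" are simply the matrix values viewed as text instead of as floats.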

Related

How to write n bytes to a binary file in python 2.7

I am trying to use f.write(struct.pack()) to write n bytes to a binary file but not quite sure how to do that? Any example or sample would be helpful.

You don't really explain your exact problem, what you tried, or which error messages you encountered.
The solution should look something like:
import struct
with open("filename", "wb") as fout:
    fout.write(struct.pack(format, data, ...))
If you explain exactly what data you want to dump, I can elaborate on the solution.
If your data is just a hex string, then you do not need struct; just use decode.
Please refer to the SO question "hexadecimal string to byte array in python".
Example for Python 2.7:
hex_str = "414243444500ff"
bytestring = hex_str.decode("hex")  # Python 2 only; in Python 3 use bytes.fromhex(hex_str)
with open("filename", "wb") as fout:
    fout.write(bytestring)

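The following worked for me (f is a file object already opened in binary mode; the "48s" format pads the string with null bytes up to 48 bytes):
reserved = "Reserved_48_Bytes"
f.write(struct.pack("48s", reserved))
Output:
$ hexdump -C output.bin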
00000030 52 65 73 65 72 76 65 64 5f 34 38 5f 42 79 74 65 |Reserved_48_Byte|
00000040 73 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |s...............|
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
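The zero padding visible in the hexdump above comes from the fixed-width "48s" format. A minimal sketch in Python 3, where struct.pack needs a bytes object rather than a str; the variable names are illustrative:

```python
import struct

reserved = b"Reserved_48_Bytes"        # 17 bytes of payload (bytes, not str, in Python 3)
field = struct.pack("48s", reserved)   # "48s" is a fixed 48-byte string field

# The payload comes first, then null bytes fill the rest of the field.
padding = field[len(reserved):]
print(len(field), len(padding))
```

The same format string can be used to write exactly n bytes for any n by changing the count in front of "s".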

Internet checksum -- Adding hex numbers together for checksum

I came across the following example of creating an Internet Checksum:
Take the example IP header 45 00 00 54 41 e0 40 00 40 01 00 00 0a 00 00 04 0a 00 00 05:
Adding the fields together yields the two’s complement sum 01 1b 3e.
Then, to convert it to one’s complement, the carry-over bits are added to the first 16-bits: 1b 3e + 01 = 1b 3f.
Finally, the one’s complement of the sum is taken, resulting in the checksum value e4c0.
I was wondering how the IP header is added together to get 01 1b 3e?

Split your IP header into 16-bit parts.
45 00
00 54
41 e0
40 00
40 01
00 00
0a 00
00 04
0a 00
00 05
Adding these ten values gives 01 1b 3e. For details on how packet header checksums are calculated, see https://en.m.wikipedia.org/wiki/IPv4_header_checksum.

The IP header is summed, with carry, as a sequence of 4-digit hexadecimal (16-bit) numbers,
i.e. the sum starts 0x4500 + 0x0054 + 0x41e0 + ...
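The whole calculation from the question can be reproduced in a few lines. This is a sketch in Python 3, assuming a header with an even number of bytes; the function name internet_checksum and its return shape are illustrative choices, not a library API.

```python
def internet_checksum(header_hex):
    """Return (raw sum, folded 16-bit sum, checksum) for a hex-encoded header."""
    data = bytes.fromhex(header_hex)

    # Step 1: add the header up as big-endian 16-bit words.
    raw_sum = 0
    for i in range(0, len(data), 2):
        raw_sum += (data[i] << 8) | data[i + 1]

    # Step 2: fold any carry above 16 bits back into the low 16 bits.
    folded = raw_sum
    while folded > 0xFFFF:
        folded = (folded & 0xFFFF) + (folded >> 16)

    # Step 3: take the one's complement, kept to 16 bits.
    checksum = ~folded & 0xFFFF
    return raw_sum, folded, checksum


header = "45 00 00 54 41 e0 40 00 40 01 00 00 0a 00 00 04 0a 00 00 05"
raw_sum, folded, checksum = internet_checksum(header)
print(hex(raw_sum), hex(folded), hex(checksum))  # 0x11b3e 0x1b3f 0xe4c0
```

The three printed values correspond to the 01 1b 3e, 1b 3f and e4c0 in the example.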

pyside2-uic null bytes in output

I'm trying to convert Qt .ui files made using Qt Designer with pyside2-uic but the output starts with 2 garbage bytes then every other byte is a null.
Here's the start of the output:
FF FE 23 00 20 00 2D 00 2A 00 2D 00 20 00 63 00 6F 00 64 00 69 00 6E 00 67 00 3A 00 20 00 75 00 74 00 66 00 2D 00 38 00 20 00 2D 00 2A 00 2D 00 0D 00 0A 00 0D 00 0A 00 23 00 20 00 46 00 6F 00
If I remove the first 2 bytes and all the nulls, then it works as expected.
I'm using Python 3.7 and the newest version of pyside2. Is there any way to get pyside2-uic to output a valid file without having to run it through another script to strip out all the garbage?

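FYI, the issue seems to be the encoding: UTF-8 (when using -o) vs. UTF-16 LE (when redirecting output in PowerShell).
This matches the output above: the leading FF FE is the UTF-16 LE byte order mark, and every character is followed by a 00 byte (16 bits per character instead of 8).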
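To confirm the diagnosis, the bytes from the question can be decoded as UTF-16. A minimal sketch in Python 3; the hex string is the beginning of the output quoted in the question, and the variable names are illustrative:

```python
# First line of the redirected output, copied from the question.
raw = bytes.fromhex(
    "FF FE 23 00 20 00 2D 00 2A 00 2D 00 20 00 63 00 6F 00 64 00 "
    "69 00 6E 00 67 00 3A 00 20 00 75 00 74 00 66 00 2D 00 38 00 "
    "20 00 2D 00 2A 00 2D 00 0D 00 0A 00"
)

# The "utf-16" codec reads the FF FE byte order mark, picks little-endian,
# and drops the mark from the decoded text.
text = raw.decode("utf-16")
print(repr(text))

# Re-encoding as UTF-8 gives one byte per ASCII character and no nulls.
utf8 = text.encode("utf-8")
print(len(raw), len(utf8))
```

The decoded text is the usual coding comment that pyside2-uic writes at the top of its output, so nothing is actually corrupted; the file is just in a different encoding.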

This bug(?) only occurs when pyside2-uic is run in PowerShell and the output is redirected to a file.
If you are using PowerShell, use the -o option to specify an output file. Both methods work fine from a normal command prompt.

Use -o instead of >, for example:
pyside2-uic mainwindow.ui -o MainWindow.py

Assigning strings to a variable in python

Consider the below string which will be given as the input to a function.
01 02 01 0D A1 D6 72 02 00 01 00 00 00 00 53 73 F2
The highlighted part is the address I need.
If the preceding byte is 1, then I have to take only 6 octets and assign them to a variable.
If it is more than 1, then I should read 6 * Num (the preceding value) octets and assign 6 octets to each variable.
Currently I am assigning it statically.
def main(line_input):
    Device = ' '.join(line_input[9:3:-1])
    Length = line_input[2]
    var1 = line_input[3]
main("01 02 02 0D A1 D6 72 02 00 01 00 00 00 00 53 73 F2")
Can this be done?

I think this does it; let me know if anything needs changing:
def address_extract(line_input):
    # Split the input into a list of two-character hex octets
    line_input = line_input.split(' ')
    # The third octet says how many 6-octet addresses follow
    length = 6 * int(line_input[2])
    device_list = []
    for x in range(3, 3 + length, 6):
        if x + 6 > len(line_input):
            print("Length multiplier too long for input string")
        else:
            device_list.append(' '.join(line_input[x:x + 6]))
    return device_list
print(address_extract("01 02 02 0D A1 D6 72 02 00 01 00 00 00 00 53 73 F2"))
# output = ['0D A1 D6 72 02 00', '01 00 00 00 00 53']

Here is some code that I hope will help you. I tried to add many comments to explain what is happening.
import binascii
import struct
# Note: Python 3 behaves differently and won't work with this code (personally I find it easier for converting strings to bytes)
def main(line_input):
    formated_line = line_input.split(" ")  # start by splitting the input on each space character
    print formated_line  # the output is a list; each element is composed of 2 chars
    formated_line = [binascii.unhexlify(xx) for xx in formated_line]  # list of the unhexlified byte for each element of the original list
    print formated_line  # the output is a list of byte chars
    # This can be done in one step, but I try to be clearer as you are new to Python (moreover, this is easier in Python 3.x)
    formated_line = map(ord, formated_line)  # convert to a list of ints (this is not needed in Python 3)
    print formated_line
    length = formated_line[2]  # this is an int
    unformated_var1 = formated_line[3:3 + (6 * length)]  # keep only the interesting data
    # now you can format your address
main("01 02 02 0D A1 D6 72 02 00 01 00 00 00 00 53 73 F2")
# If the input comes from a machine and not a human, they could exchange 17 bytes instead of (17x3) characters:
# main("\x01\x02\x02\x0D\xA1\xD6\x72\x02\x00\x01\x00\x00\x00\x00\x53\x73\xF2")
# Then the parsing could be done with struct.unpack
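For comparison, here is roughly how the same extraction might look in Python 3, where bytes.fromhex does the hex parsing in one step. This is a sketch under the assumptions stated in the question (the third octet is the address count and each address is 6 octets); the function name extract_addresses and the octets_per_address parameter are illustrative.

```python
def extract_addresses(line_input, octets_per_address=6):
    """Return each address as a string of space-separated hex octets."""
    data = bytes.fromhex(line_input)   # spaces between octets are ignored
    count = data[2]                    # third octet: number of addresses
    start = 3
    end = start + count * octets_per_address
    if end > len(data):
        raise ValueError("address count too large for input")
    return [
        " ".join(format(b, "02X") for b in data[i:i + octets_per_address])
        for i in range(start, end, octets_per_address)
    ]


result = extract_addresses("01 02 02 0D A1 D6 72 02 00 01 00 00 00 00 53 73 F2")
print(result)  # ['0D A1 D6 72 02 00', '01 00 00 00 00 53']
```

Raising an error instead of printing a message is a design choice here; it keeps a malformed input from silently producing a short list.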

Writing numpy array to file- byte-order issue?

I'm trying to write a numpy array to file, but the file format is such that every value must contain only the 8 bytes required to represent a 64-bit float.
As best I can tell, ndarray.tofile(array), with array.dtype = 'float64' is not accomplishing this, so how can I do this quickly?

tofile already creates the binary file that you describe. Are you sure you are calling it correctly? If you're opening the file yourself in your code, did you remember to open it in binary mode? Here is an example of tofile working as expected:
>>> import numpy as np
>>> a = np.array([1, 2, 3], dtype='float64')
>>> a
array([ 1., 2., 3.])
>>> a.tofile('foo')
Inspecting the file reveals it to be 24 bytes long, and with the contents corresponding to little-endian 64-bit IEEE 754 floats:
$ hexdump -C foo
00000000 00 00 00 00 00 00 f0 3f 00 00 00 00 00 00 00 40 |.......?.......@|
00000010 00 00 00 00 00 00 08 40 |.......@|
00000018
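On the byte-order part of the question: the dump above is little-endian because tofile writes the array's bytes in its own byte order, which is little-endian by default on common hardware. The sketch below uses the standard struct module as a stand-in for NumPy (an assumption made so the example has no dependencies) to show both layouts for the same three values.

```python
import struct

values = [1.0, 2.0, 3.0]

# "<" forces little-endian, ">" forces big-endian; "3d" is three 64-bit floats.
little = struct.pack("<3d", *values)
big = struct.pack(">3d", *values)

print(little.hex())  # matches the hexdump above: ...f03f, ...0040, ...0840
print(big.hex())     # same values with the bytes of each float reversed
```

If the target file format needs big-endian values, the array would have to be converted to a big-endian dtype (for example '>f8') before writing.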
