I am currently working on a program that was handed down to me by a previous coworker, and I am working through a strange bug. When reading data output from two separate serial sources byte by byte, Python writes doubled data into a single cell of the .csv file, and the same doubled output appears on the console.
import serial
from datetime import datetime
import os

pressure_passed = False
arduino_passed = False
file_passed = False
BAUD_RATE = 115200
GARBAGE_CYCLES = 3 # how many cycles to ignore before logging data
garbage_cycle = 0

# Save data to log file
def LogData(startTime, pressureData, arduinoData, file):
    global garbage_cycle
    if garbage_cycle < GARBAGE_CYCLES:
        garbage_cycle += 1
    else:
        delta = datetime.now() - startTime
        ms = delta.total_seconds() * 1000
        dataString = "{:0.2f}, {}, {}\n".format(ms, pressureData, arduinoData)
        file.write(dataString)
        file.flush()
        print(dataString, end = "")
# Get the COM port for the Mark-10 Series 5
while not pressure_passed:
    try:
        pressure_com = input("Enter Mark-10 Series 5 COM Port #: ")
        pressure_ser = serial.Serial("COM" + str(pressure_com), BAUD_RATE)
        pressure_passed = True
    except:
        print("Invalid COM Port, please enter a valid port.\n-----")

# Get the COM port for the Arduino
while not arduino_passed:
    try:
        arduino_com = input("Enter Arduino COM Port #: ")
        arduino_ser = serial.Serial("COM" + str(arduino_com), BAUD_RATE)
        arduino_passed = True
    except:
        print("Invalid COM Port, please enter a valid port.\n-----")
# Get the name for the log file
while not file_passed:
    try:
        file_name = input("Enter log file name: ")
        # Add extension if not already given
        if "." not in file_name:
            file_name += ".csv"
        log_file = open(file_name, "a")
        # Add header row to log file
        if os.stat(log_file.name).st_size == 0:
            log_file.write("time (ms), pressure, rate (deg/ms)")
        file_passed = True
    except:
        print("Invalid file, or could not open the file specified.\n-----")
start = datetime.now()

# Variables to read serial input
pressure_data = ""
last_pressure = ""
arduino_data = ""
last_arduino = ""

# Main program loop
# Serial is read from byte by byte to better sync the two devices
while True:
    try:
        x_changed = False
        y_changed = False
        # Read from Mark-10 serial if available
        # x is a byte read from the serial line, converted to ascii
        if pressure_ser.in_waiting > 0:
            x = pressure_ser.read().decode('ascii')
            x_changed = True
        # Read from Arduino serial if available
        # y is a byte read from the serial line, converted to ascii
        if arduino_ser.in_waiting > 0:
            y = arduino_ser.read().decode('ascii')
            y_changed = True
        # If new data received, check if we should log it
        if x_changed:
            if x == '\n': # New line detected, log the accumulated data
                if last_pressure != pressure_data:
                    LogData(start, last_pressure, last_arduino, log_file)
                    last_pressure = pressure_data
                    pressure_data = ""
            elif x != '\r': # Otherwise, add the read character to the string
                pressure_data += x
        if y_changed:
            if y == '\n': # New line detected, log the accumulated data
                if last_arduino != arduino_data:
                    LogData(start, last_pressure, last_arduino, log_file)
                    last_arduino = arduino_data
                    arduino_data = ""
            elif y != '\r': # Otherwise, add the read character to the string
                arduino_data += y
    except Exception as e:
        print(e)
        if arduino_ser.isOpen():
            arduino_ser.close()
        if pressure_ser.isOpen():
            pressure_ser.close()
        log_file.close()
        break
Here is what the file is spitting out, i.e. the double printing to a single cell. Sample of the data:
Any advice is much appreciated, thank you all!
It looks like when a new pressure is read in but the value has not changed from last time, the string that accumulates the characters is never reset, so it doubles up. Then on the next pass, when the REAL pressure hasn't changed, it compares the doubled string to the non-doubled one and writes again, and vice versa.
Try unindenting the line that resets the string to remove it from the if clause:
# If new data received, check if we should log it
if x_changed:
    if x == '\n': # New line detected, log the accumulated data
        if last_pressure != pressure_data:
            LogData(start, last_pressure, last_arduino, log_file)
            last_pressure = pressure_data
        pressure_data = ""
    elif x != '\r': # Otherwise, add the read character to the string
        pressure_data += x
Then the same thing for the arduino value block.
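For completeness, the same unindent applied to the arduino block would look like this (a sketch of the identical change on the y side):

if y_changed:
    if y == '\n': # New line detected, log the accumulated data
        if last_arduino != arduino_data:
            LogData(start, last_pressure, last_arduino, log_file)
            last_arduino = arduino_data
        arduino_data = ""
    elif y != '\r': # Otherwise, add the read character to the string
        arduino_data += y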
Your logs will probably be much shorter now.
I like your username! My guess is that it is reading from the serial port too quickly and going through the loop twice before the Arduino has time to change the value of in_waiting.
At the top of your code add:
import time
And in the LogData function add:
time.sleep(0.1)
Give that a shot and let me know if it helps. 0.1s may be too long for your application but it is a good test to see if this is the issue. If it is, you can play around with the amount of time it sleeps.
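In other words, something like this; putting the sleep at the top of LogData is my assumption, and the delay can be tuned or removed once you've tested:

import time

# Save data to log file
def LogData(startTime, pressureData, arduinoData, file):
    global garbage_cycle
    time.sleep(0.1)  # crude throttle so in_waiting has time to update; tune as needed
    # ... rest of the function unchanged ...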
Based on the sample output provided, I don't think it's writing the same line twice; rather, the following specific condition is occasionally met, which calls LogData() from two places with identical arguments.
Only when Condition 1 AND Condition 2 are both met in the same pass of the loop is the data written "twice". Note that the LogData() call is the same in both conditions, so when a pressure newline and an arduino newline arrive in the same iteration, the same (last_pressure, last_arduino) pair is logged once by each branch.
Condition 1:
# If new data received, check if we should log it
if x_changed:
    if x == '\n': # New line detected, log the accumulated data
        if last_pressure != pressure_data:
            LogData(start, last_pressure, last_arduino, log_file)
Condition 2:
if y_changed:
    if y == '\n': # New line detected, log the accumulated data
        if last_arduino != arduino_data:
            LogData(start, last_pressure, last_arduino, log_file)
I am receiving a packet from a client, consisting of many fields. I read all the fields successfully, but when it comes to the last field, tag_end, Python gives me an error:
unpack_from requires a buffer of at least 4 bytes
This is the code:
def set_bin(self, buf):
    """Reads a vector of bytes (probably received from network or
    read from file) and tries to construct the packet structure
    from it, by reading each packet member from the buffer. This
    is somehow like deserializing the packet.
    """
    assert isinstance(buf, bytearray), 'buffer type is not valid'
    offset = 0
    print("$$$$$$$$$$$$$$$$ set bin $$$$$$$$$$$$$$$$$")
    try:
        (self._tag_start, self._version, self._checksum, self._connection_id,
         self._packet_seq) = Packet.PACKER_1.unpack_from(str(buf), offset)
    except struct.error as e:
        print(e)
        raise DeserializeError(e)
    except ValueError as e:
        print(e)
        raise DeserializeError(e)
    #I=4 H=2 B=1
    offset = Packet.OFFSET_GUID #14 correct
    self._guid = buf[offset:offset+Packet.UUID_SIZE] #14-16 correct
    offset = Packet.OFFSET_GUID + Packet.UUID_SIZE
    print("$$$$$$$$$$$$$$$$ GUID read successfully $$$$$$$$$$$$$$$$$")
    try:
        (self._timestamp_sec, self._timestamp_microsec, self._command,
         self._command_seq, self._subcommand, self._data_seq,
         self._data_length) = Packet.PACKER_3.unpack_from(str(buf), offset)
    except struct.error as e:
        print(e)
        raise DeserializeError(e)
    except ValueError as e:
        print(e)
        raise DeserializeError(e)
    print("$$$$$$$$$$$$$$$$ timestamps read successfully $$$$$$$$$$$$$$$$$")
    offset = Packet.OFFSET_AUTHENTICATE
    self._username = buf[offset:offset + self.USERNAME_SIZE] #Saman
    offset += self.USERNAME_SIZE
    print("$$$$$$$$$$$$$$$$ username read successfully $$$$$$$$$$$$$$$$$")
    self._password = buf[offset:offset+self.USERNAME_SIZE]
    offset += self.PASSWORD_SIZE
    print("$$$$$$$$$$$$$$$$ password read successfully $$$$$$$$$$$$$$$$$")
    self._data = buf[offset:offset+self._data_length]
    offset = offset + self._data_length
    print("$$$$$$$$$$$$$$$$ data read successfully $$$$$$$$$$$$$$$$$")
    try:
        (self._tag_end,) = Packet.PACKER_4.unpack_from(str(buf), offset)
    except struct.error as e:
        print(e)
        raise DeserializeError(e)
    except ValueError as e:
        print(e)
        raise DeserializeError(e)
    print("$$$$$$$$$$$$$$$$ tag end read successfully $$$$$$$$$$$$$$$$$")
    if len(buf) != Packet.PACKER.size + self._data_length:
        print('failed to deserialize binary data correctly and construct the packet due to extra data')
    else:
        print('############### Deserialized Successfully')
and these are some constants used in the code:
STRUCT_FORMAT_STR = r'=IHIHH 16B IIHHHHH I 6c 9c' #Saman
STRUCT_FORMAT_STR_1 = r'=IHIHH'
STRUCT_FORMAT_STR_2 = r'=16B'
STRUCT_FORMAT_STR_3 = r'=IIHHHHH'
STRUCT_FORMAT_STR_4 = r'=I'
STRUCT_FORMAT_STR_5 = r'=6c'
STRUCT_FORMAT_STR_6 = r'=9c'
UUID_SIZE = 16
OFFSET_GUID = 14
#OFFSET_DATA = 48 #shifting offset data by 15 char
OFFSET_AUTHENTICATE = 48
PACKER = struct.Struct(str(STRUCT_FORMAT_STR)) #Saman
PACKER_1 = struct.Struct(str(STRUCT_FORMAT_STR_1))
PACKER_2 = struct.Struct(str(STRUCT_FORMAT_STR_2))
PACKER_3 = struct.Struct(str(STRUCT_FORMAT_STR_3))
PACKER_4 = struct.Struct(str(STRUCT_FORMAT_STR_4))
PACKER_5 = struct.Struct(str(STRUCT_FORMAT_STR_5))
PACKER_6 = struct.Struct(str(STRUCT_FORMAT_STR_6))
BYTES_TAG_START = PACKER_4.pack(TAG_START)
BYTES_TAG_END = PACKER_4.pack(TAG_END)
and this is the initialization of the packet object, where the fields are set:
def __init__(self, **kwargs):
    if 'buf' in kwargs:
        self.set_bin(kwargs['buf'])
    else:
        assert kwargs['command'] in Packet.RTCINET_COMMANDS.values() and kwargs['subcommand'] in Packet.RTCINET_COMMANDS.values(), 'Undefined protocol command'
        assert isinstance(kwargs['data'], bytearray), 'invalid type for data field'
        for field in ('command', 'subcommand', 'data'):
            setattr(self, '_' + field, kwargs[field])
        self._tag_start = Packet.TAG_START
        self._version = Packet.VERSION_CURRENT % (Packet.USHRT_MAX + 1)
        self._checksum = Packet.CRC_INIT
        self._connection_id = kwargs.get('connection_id', 0) % (Packet.USHRT_MAX + 1)
        self._packet_seq = Packet.PACKET_SEQ
        Packet.PACKET_SEQ = (Packet.PACKET_SEQ + 1) % (Packet.USHRT_MAX + 1)
        self._guid = uuid.uuid4().bytes
        dt = datetime.datetime.now()
        self._timestamp_sec = int(time.mktime(dt.timetuple()))
        self._timestamp_microsec = dt.microsecond
        # self._command = kwargs['command']
        self._command_seq = kwargs.get('command_seq', 0)
        # self._subcommand = kwargs['subcommand']
        self._data_seq = kwargs.get('data_seq', 0)
        self._data_length = len(kwargs['data'])
        self._username = Packet.USERNAME #Saman
        self._password = Packet.PASSWORD
I have made sure that I read all the fields in the same order they were written into the packet by the client program, but I still couldn't solve this problem.
Do you have any idea how it could be solved?
The problem seems to be that you're converting things to str all over the place for no good reason.
In some places, like PACKER_1 = struct.Struct(str(STRUCT_FORMAT_STR_1)), it makes your code harder to read, but doesn't affect the actual output: STRUCT_FORMAT_STR_1 is already a str, so str(STRUCT_FORMAT_STR_1) is the same str.
But in other places, it's far worse than that. In particular, look at all the lines like Packet.PACKER_1.unpack_from(str(buf), offset). There, buf is a bytearray. (It has to be, because you assert it.) Calling str on a bytearray gives you the string representation of that bytearray. For example:
>>> b = bytearray(b'abc')
>>> len(b)
3
>>> s = str(b)
>>> s
"bytearray(b'abc')"
>>> len(s)
17
That string representation is obviously not generally going to have the same length as the actual buffer you're representing. So it's no wonder that you get errors about the length being wrong. (And if you got really unlucky and didn't have any such errors, you'd be reading garbage values instead.)
So, what should you do to convert the bytearray into something the struct module can handle? Nothing! As the docs say:
Several struct functions (and methods of Struct) take a buffer argument. This refers to objects that implement the Buffer Protocol and provide either a readable or read-writable buffer. The most common types used for that purpose are bytes and bytearray…
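In other words, just pass the bytearray itself. Taking the first unpack in your set_bin as the example:

# pass the bytearray directly; unpack_from accepts any object
# that supports the buffer protocol
(self._tag_start, self._version, self._checksum, self._connection_id,
 self._packet_seq) = Packet.PACKER_1.unpack_from(buf, offset)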
I am not sure what I am doing wrong here. I am trying to open a file, trace1.flow, read the header information, and then put the source and destination IPs into dictionaries. This is running in Python on a Fedora VM. I am getting the following error:
(secs, nsecs, booted, exporter, mySourceIP, myDestinationIP) = struct.unpack('IIIIII',myBuf)
struct.error: unpack requires a string argument of length 24
Here is my code:
import struct
import socket

#Dictionaries
uniqSource = {}
uniqDestination = {}

def int2quad(i):
    z = struct.pack('!I', i)
    return socket.inet_ntoa(z)

myFile = open('trace1.flow')
myBuf = myFile.read(8)
(magic, endian, version, headerLen) = struct.unpack('HBBI', myBuf)
print "Magic: ", hex(magic), "Endian: ", endian, "Version: ", version, "Header Length: ", headerLen
myFile.read(headerLen - 8)

try:
    while(True):
        myBuf = myFile.read(24)
        (secs, nsecs, booted, exporter, mySourceIP, myDestinationIP) = struct.unpack('IIIIII',myBuf)
        mySourceIP = int2quad(mySourceIP)
        myDestinationIP = int2quad(myDestinationIP)
        if mySourceIP not in uniqSource:
            uniqSource[mySourceIP] = 1
        else:
            uniqSource[mySourceIP] += 1
        if myDestinationIP not in uniqDestination:
            uniqDestination[myDestinationIP] = 1
        else:
            uniqDestination[myDestinationIP] += 1
        myFile.read(40)
except EOFError:
    print "END OF FILE"
You seem to assume that file.read will raise EOFError at end of file, but this error is only raised by input() and raw_input(). file.read simply returns a string that's shorter than requested (possibly empty).
So you need to check the length after reading:
myBuf = myFile.read(24)
if len(myBuf) < 24:
    break
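Putting that check in place, the read loop becomes something like this (a sketch; the dictionary counting stays exactly as you have it, and the try/except is no longer needed):

while True:
    myBuf = myFile.read(24)
    if len(myBuf) < 24:
        break  # end of file: read() returned fewer bytes than requested
    (secs, nsecs, booted, exporter,
     mySourceIP, myDestinationIP) = struct.unpack('IIIIII', myBuf)
    # ... count mySourceIP / myDestinationIP as before ...
    myFile.read(40)  # skip the rest of the record
print "END OF FILE"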
Perhaps you have reached end-of-file. Check the length of myBuf:
len(myBuf)
It's probably less than 24 bytes long. Also, you don't need those extra parentheses, and you can specify repeated types with a count, 'nI', like this:
secs, nsecs, booted, exporter, mySourceIP, myDestinationIP = struct.unpack('6I',myBuf)
The code below is part of a program that captures data from a Bloomberg terminal and dumps it into an SQLite database. It worked fine on my 32-bit Windows XP machine, but it keeps giving me
"get_history.histfetch error: [Errno 9] Bad file descriptor" on 64-bit Windows 7, even though there shouldn't be a problem running 32-bit Python under a 64-bit OS. Sometimes the problem goes away if I simply exit the program and open it again, but sometimes it won't. Right now I'm really confused about what causes this. I can tell the error is generated while calling "histfetch", but I have no idea which part of the code is failing. Can anyone help me out here? I really appreciate it. Thanks in advance.
def run(self):
    try: pythoncom.CoInitializeEx(pythoncom.COINIT_APARTMENTTHREADED)
    except: pass
    while 1:
        if self.trigger:
            try: self.histfetch()
            except Exception,e:
                logging.error('get_history.histfetch error: %s %s' % (str(type(e)),str(e)))
                if self.errornotify != None:
                    self.errornotify('get_history error','%s %s' % ( str(type(e)), str(e) ) )
            self.trigger = 0
        if self.telomere: break
        time.sleep(0.5)
def histfetch(self):
    blpcon = win32com.client.gencache.EnsureDispatch('blpapicom.Session')
    blpcon.Start()
    dbcon = sqlite3.connect(self.dbfile)
    c = dbcon.cursor()
    fieldcodes = {}
    symcodes = {}
    trysleep(c,'select fid,field from fields')
    for fid,field in c.fetchall():
        # these are different types so this will be ok
        fieldcodes[fid] = field
        fieldcodes[field] = fid
    trysleep(c,'select sid,symbol from symbols')
    for sid,symbol in c.fetchall():
        symcodes[sid] = symbol
        symcodes[symbol] = sid
    for instr in self.instructions:
        if instr[1] != 'minute': continue
        sym,rollspec = instr[0],instr[2]
        print 'MINUTE',sym
        limits = []
        sid = getsid(sym,symcodes,dbcon,c)
        trysleep(c,'select min(epoch),max(epoch) from minute where sid=?',(sid,))
        try: mine,maxe = c.fetchone()
        except: mine,maxe = None,None
        print sym,'minute data limits',mine,maxe
        rr = getreqrange(mine,maxe)
        if rr == None: continue
        start,end = rr
        dstart = start.strftime('%Y%m%d')
        dend = end.strftime('%Y%m%d')
        try: # if rollspec is 'noroll', then this will fail and goto except-block
            ndaysbefore = int(rollspec)
            print 'hist fetch for %s, %i days' % (sym,ndaysbefore)
            rolldb.update_roll_db(blpcon,(sym,))
            names = rolldb.get_contract_range(sym,ndaysbefore)
        except: names = {sym:None}
        # sort alphabetically here so oldest always gets done first
        # (at least within the decade)
        sorted_contracts = names.keys()
        sorted_contracts.sort()
        for contract in sorted_contracts:
            print 'partial fetch',contract,names[contract]
            if names[contract] == None:
                _start,_end = start,end
            else:
                da,db = names[contract]
                dc,dd = start,end
                try: _start,_end = get_overlap(da,db,dc,dd)
                except: continue # because get_overlap returning None cannot assign to tuple
            # localstart and end are for printing and logging
            localstart = _start.strftime('%Y/%m/%d %H:%M')
            localend = _end.strftime('%Y/%m/%d %H:%M')
            _start = datetime.utcfromtimestamp(time.mktime(_start.timetuple())).strftime(self.blpfmt)
            _end = datetime.utcfromtimestamp(time.mktime(_end.timetuple())).strftime(self.blpfmt)
            logging.debug('requesting intraday bars for %s (%s): %s to %s' % (sym,contract,localstart,localend))
            print 'start,end:',localstart,localend
            result = get_minute(blpcon,contract,_start,_end)
            if len(result) == 0:
                logging.error('warning: 0-length minute data fetch for %s,%s,%s' % (contract,_start,_end))
                continue
            event_count = len(result.values()[0])
            print event_count,'events returned'
            lap = time.clock()
            # todo: split up writes: no more than 5000 before commit (so other threads get a chance)
            # 100,000 rows is 13 seconds on my machine. 5000 should be 0.5 seconds.
            try:
                for i in range(event_count):
                    epoch = calendar.timegm(datetime.strptime(str(result['time'][i]),'%m/%d/%y %H:%M:%S').timetuple())
                    # this uses sid (from sym), NOT contract
                    row = (sid,epoch,result['open'][i],result['high'][i],result['low'][i],result['close'][i],result['volume'][i],result['numEvents'][i])
                    trysleep(c,'insert or ignore into minute (sid,epoch,open,high,low,close,volume,nevents) values (?,?,?,?,?,?,?,?)',row)
                dbcon.commit()
            except Exception,e:
                print 'ERROR',e,'iterating result object'
                logging.error(datetime.now().strftime() + ' error in get_history.histfetch writing DB')
                # todo: tray notify the error and log it
            lap = time.clock() - lap
            print 'database write of %i rows in %.2f seconds' % (event_count,lap)
            logging.debug(' -- minute bars %i rows (%.2f s)' % (event_count,lap))
    for instr in self.instructions:
        oldestdaily = datetime.now().replace(hour=0,minute=0,second=0,microsecond=0) - timedelta(self.dailyback)
        sym = instr[0]
        if instr[1] != 'daily': continue
        print 'DAILY',sym
        fields = instr[2]
        rollspec = instr[3]
        sid = getsid(sym,symcodes,dbcon,c)
        unionrange = None,None
        for f in fields:
            try: fid = fieldcodes[f]
            except:
                trysleep(c,'insert into fields (field) values (?)',(f,))
                trysleep(c,'select fid from fields where field=?',(f,))
                fid, = c.fetchone()
                dbcon.commit()
                fieldcodes[fid] = f
                fieldcodes[f] = fid
            trysleep(c,'select min(epoch),max(epoch) from daily where sid=? and fid=?',(sid,fid))
            mine,maxe = c.fetchone()
            if mine == None or maxe == None:
                unionrange = None
                break
            if unionrange == (None,None):
                unionrange = mine,maxe
            else:
                unionrange = max(mine,unionrange[0]),min(maxe,unionrange[1])
        print sym,'daily unionrange',unionrange
        yesterday = datetime.now().replace(hour=0,minute=0,second=0,microsecond=0) - timedelta(days=1)
        if unionrange == None:
            reqrange = oldestdaily,yesterday
        else:
            mine = datetime.fromordinal(unionrange[0])
            maxe = datetime.fromordinal(unionrange[1])
            print 'comparing',mine,maxe,oldestdaily,yesterday
            if oldestdaily < datetime.fromordinal(unionrange[0]): a = oldestdaily
            else: a = maxe
            reqrange = a,yesterday
        if reqrange[0] >= reqrange[1]:
            print 'skipping daily',sym,'because we\'re up to date'
            continue
        print 'daily request range',sym,reqrange,reqrange[0] > reqrange[1]
        try:
            ndaysbefore = int(rollspec) # exception if it's 'noroll'
            print 'hist fetch for %s, %i days' % (sym,ndaysbefore)
            rolldb.update_roll_db(blpcon,(sym,))
            names = rolldb.get_contract_range(sym,ndaysbefore,daily=True)
        except: names = {sym:None}
        # sort alphabetically here so oldest always gets done first
        # (at least within the year)
        sorted_contracts = names.keys()
        sorted_contracts.sort()
        start,end = reqrange
        for contract in sorted_contracts:
            print 'partial fetch',contract,names[contract]
            if names[contract] == None:
                _start,_end = start,end
            else:
                da,db = names[contract]
                dc,dd = start,end
                try: _start,_end = get_overlap(da,db,dc,dd)
                except: continue # because get_overlap returning None cannot assign to tuple
            _start = _start.strftime('%Y%m%d')
            _end = _end.strftime('%Y%m%d')
            logging.info('daily bars for %s (%s), %s - %s' % (sym,contract,_start,_end))
            result = get_daily(blpcon,(contract,),fields,_start,_end)
            try: result = result[contract]
            except:
                print 'result doesn\'t contain requested symbol'
                logging.error("ERROR: symbol '%s' not in daily request result" % contract)
                # todo: log and alert error
                continue
            if not 'date' in result:
                print 'result has no date field'
                logging.error('ERROR: daily result has no date field')
                # todo: log and alert error
                continue
            keys = result.keys()
            keys.remove('date')
            logging.info(' -- %i days returned' % len(result['date']))
            for i in range(len(result['date'])):
                ordinal = datetime.fromtimestamp(int(result['date'][i])).toordinal()
                for k in keys:
                    trysleep(c,'insert or ignore into daily (sid,fid,epoch,value) values (?,?,?,?)',(sid,fieldcodes[k],ordinal,result[k][i]))
                dbcon.commit()
Print the full traceback instead of just the exception message. The traceback will show you where the exception was raised and hence what the problem is:
import traceback

...

try: self.histfetch()
except Exception,e:
    logging.error('get_history.histfetch error: %s %s' % (str(type(e)),str(e)))
    logging.error(traceback.format_exc())
    if self.errornotify != None:
        self.errornotify('get_history error','%s %s' % ( str(type(e)), str(e) ) )
Update:
With the above (or similar, the idea being to look at the full traceback), you say:
it said it's with the "print" functions. The program works well after I disable all the "print" functions.
The print calls you have in your post use syntax that is valid in Python 2.x only. If that is what you are using, perhaps the application that runs your script leaves print undefined (or without a usable console) and you're supposed to use a log function instead; otherwise I can't see anything wrong with the calls. (If you mean only one of the prints was the issue, I would need to see the exact error to identify it -- post it if you want to figure this out.) If you are using Python 3.x, then you must use print(a, b, c, ...); see the 3.x docs.
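If the host application really does lack a usable console, one workaround (my suggestion, not something from your code) is to route the output through logging instead of print, which keeps working with no console attached; the log file name here is hypothetical:

import logging

# write diagnostics to a file instead of stdout
logging.basicConfig(filename='get_history.log', level=logging.DEBUG)

# instead of:  print 'MINUTE', sym
logging.debug('MINUTE %s', sym)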