Python bidirectional mapping - python

I'm not sure what to call what I'm looking for, so I apologize if I failed to find this question elsewhere. In short, I am writing Python code that will interface directly with the Linux kernel. It's easy to get the required values from the include header files and write them into my source:
IFA_UNSPEC = 0
IFA_ADDRESS = 1
IFA_LOCAL = 2
IFA_LABEL = 3
IFA_BROADCAST = 4
IFA_ANYCAST = 5
IFA_CACHEINFO = 6
IFA_MULTICAST = 7
It's easy to use these values when constructing structs to send to the kernel. However, they are of almost no help in resolving the values in the responses from the kernel.
If I put the values into a dict, I presume I would have to scan all the values in the dict to look up the key for each item in each struct from the kernel. There must be a simpler, more efficient way.
How would you do it? (Feel free to retitle the question if it's way off.)

If you want to use two dicts, you can try this to create the inverted dict:
b = {v: k for k, v in a.items()}  # on Python 2, a.iteritems() avoids building a temporary list

Your solution leaves a lot of repeated work to the person creating the file. That is a source of errors (you actually have to write each name three times). If you have a file that needs updating from time to time (for example, when new kernel releases come out), you are destined to introduce an error sooner or later. Actually, that was just a long way of saying that your solution violates DRY.
I would change your solution to something like this:
IFA_UNSPEC = 0
IFA_ADDRESS = 1
IFA_LOCAL = 2
IFA_LABEL = 3
IFA_BROADCAST = 4
IFA_ANYCAST = 5
IFA_CACHEINFO = 6
IFA_MULTICAST = 7
__IFA_MAX = 8
values = {globals()[x]:x for x in dir() if x.startswith('IFA_') or x.startswith('__IFA_')}
This way the values dict is generated automatically. You might want to (or have to) change the condition in the if statement, depending on whatever else is in that file. Maybe something like the following: this version removes the need to list prefixes in the if statement, but it would fail if you had other stuff in the file.
values = {globals()[x]:x for x in dir() if not x.endswith('__')}
You could of course do something more sophisticated there, e.g. check for accidentally repeated values.
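One way the duplicate check mentioned above might look, as a rough sketch rather than a definitive implementation. The helper name `build_reverse_map` and the `constants` dict (standing in for the module's `globals()`) are made up for illustration; the prefixes are the ones used in the answer.

```python
def build_reverse_map(namespace, prefixes=("IFA_", "__IFA_")):
    """Return {value: name} for the matching constants, rejecting duplicates."""
    reverse = {}
    for name, value in namespace.items():
        # str.startswith accepts a tuple of prefixes.
        if not name.startswith(prefixes):
            continue
        if value in reverse:
            raise ValueError(
                "%s and %s share the value %r" % (reverse[value], name, value)
            )
        reverse[value] = name
    return reverse


# Sample constants from the question; in the real module you would pass globals().
constants = {"IFA_UNSPEC": 0, "IFA_ADDRESS": 1, "IFA_LOCAL": 2, "__IFA_MAX": 8}
values = build_reverse_map(constants)
```

With this, a value that was accidentally repeated raises an error at import time instead of silently overwriting an earlier name.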

What I ended up doing is leaving the constant values in the module and creating a dict. The module is if_addr.py (the values are from linux/if_addr.h), so when constructing structs to send to the kernel I can use if_addr.IFA_LABEL, and I can resolve responses with if_addr.values[2]. I'm hoping this is the most straightforward approach, so that when I have to look at this again in a year or more it's easy to understand :p
IFA_UNSPEC = 0
IFA_ADDRESS = 1
IFA_LOCAL = 2
IFA_LABEL = 3
IFA_BROADCAST = 4
IFA_ANYCAST = 5
IFA_CACHEINFO = 6
IFA_MULTICAST = 7
__IFA_MAX = 8
values = {
    IFA_UNSPEC: 'IFA_UNSPEC',
    IFA_ADDRESS: 'IFA_ADDRESS',
    IFA_LOCAL: 'IFA_LOCAL',
    IFA_LABEL: 'IFA_LABEL',
    IFA_BROADCAST: 'IFA_BROADCAST',
    IFA_ANYCAST: 'IFA_ANYCAST',
    IFA_CACHEINFO: 'IFA_CACHEINFO',
    IFA_MULTICAST: 'IFA_MULTICAST',
    __IFA_MAX: '__IFA_MAX',
}
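For comparison, here is a sketch of a different technique from the one used in this thread: the standard library's `enum.IntEnum` (Python 3.4+) gives both directions of the lookup without a hand-written dict. The class name `IFA` is an assumption for illustration, and the member values are the ones from the question.

```python
from enum import IntEnum


class IFA(IntEnum):
    """Attribute types copied from linux/if_addr.h (values from the question)."""
    UNSPEC = 0
    ADDRESS = 1
    LOCAL = 2
    LABEL = 3
    BROADCAST = 4
    ANYCAST = 5
    CACHEINFO = 6
    MULTICAST = 7


label_value = int(IFA.LABEL)   # name -> value, for building structs
resolved = IFA(2)              # value -> member, for decoding responses
resolved_name = resolved.name  # the member's name as a string
```

Because `IntEnum` members are also integers, they can be passed directly to code that expects plain ints, such as `struct.pack`.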

Related

Pythonic way of checking bit values

I have this set of constants declarations
self.POUTPUT = 1
self.PPWM = 2
self.PINPUT = 4
self.PUNUSED = 8
self.PSONAR = 16
self.PULTRA = 32
self.PSERVOD = 64
self.PSTEPPER = 128
self.PCOUNT = 256
self.PINPUTDOWN = 512
self.PINPUTNONE = 1024
What is the pythonic way of checking whether a value matches any of the input states (4,512 and 1024) please?
Extra info: I'd normally just use bit pattern checking in simpler languages but I wondered if there was a better way in Python :)
Basically, each pin can have one of the I/O states above, but if a pin is in any of the input states then one action occurs,
e.g. if pin == 4 or 512 or 1024 -> do something
if SOME_DATA & self.PINPUTNONE:  # do something
is how people usually do it.
To check a lot of them at once:
ANY = self.POUTPUT | self.PPWM | self.PINPUT | self.PUNUSED | self.PSONAR ...
if some_data & ANY:  # do something
Testing for set membership (which is what you seem to be doing) is best done using a set.
self.input_states = {self.PINPUT, self.PINPUTDOWN, self.PINPUTNONE}
# later
if value in self.input_states:
    do_something()
Of course you could handle this in a variety of essentially-identical ways, but one way or another you have to encode some knowledge as to which of these magic numbers are "input states".
Now if, as has been suggested, what you actually want is to do some bit-masking or tests based on whether a particular bit is set, then you want to take a look at the Python bitwise operators. In particular, to see whether a value has a given bit set, you'd use the bitwise and, which is denoted by & in Python. Joran's answer covers this use well, but the basic idea is:
if value & self.PINPUT:  # value has a 1 in the PINPUT bit (value 4, the third bit)
    do_something()
Or if all of those input values trigger the same action:
if any(value & p for p in (self.PINPUT, self.PINPUTDOWN, self.PINPUTNONE)):
    do_something()
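To make the difference between the two tests concrete, here is a small sketch using the constants from the question as module-level names. The helper names `is_input_state` and `has_input_bit` are made up for illustration.

```python
POUTPUT = 1
PINPUT = 4
PINPUTDOWN = 512
PINPUTNONE = 1024

# Set-membership test: the value is exactly one of the input states.
INPUT_STATES = frozenset((PINPUT, PINPUTDOWN, PINPUTNONE))

# Bitmask test: at least one of the input bits is set in the value.
INPUT_MASK = PINPUT | PINPUTDOWN | PINPUTNONE


def is_input_state(value):
    return value in INPUT_STATES


def has_input_bit(value):
    return bool(value & INPUT_MASK)
```

The two agree when a pin holds exactly one state. They differ for combined values: `PINPUT | POUTPUT` passes the bitmask test but is not a member of the set.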

Re-writing a python program into VB, how to sort CSV?

About a year back, I wrote a little program in Python that basically automates a part of my job (with quite a bit of assistance from you guys!). However, I ran into a problem. As I kept making the program better and better, I realized that Python did not want to play nice with Excel (without boring you with the details, suffice it to say that xlutils will not copy formulas), and I NEED to have more access to Excel for my intentions.
So I am starting back at square one with VB (2010 Express, if it helps). The only programming course I ever took in my life was on it, and it was pretty straightforward, so I decided I'd go back to it for this. Unfortunately, I've forgotten much of what I had learned, and we never really got this far down the rabbit hole in the first place. So, long story short, I am trying to:
1) Read data from a .csv structured as so:
41,332.568825,22.221759,-0.489714,eow
42,347.142926,-2.488763,-0.19358,eow
46,414.9969,19.932693,1.306851,r
47,450.626074,21.878299,1.841957,r
48,468.909171,21.362568,1.741944,r
49,506.227269,15.441723,1.40972,r
50,566.199838,17.656284,1.719818,r
51,359.069935,-11.773073,2.443772,l
52,396.321911,-8.711589,1.83507,l
53,423.766684,-4.238343,1.85591,l
2) Sort that data alphabetically by column 5
3) Then selecting only the ones with an "l" in column 5, sort THOSE numerically by column 2 (ascending order) AND copy them to a new file called coil.csv
4) Then selecting only the ones that have an "r" in column 5, sort those numerically by column 2 (descending order) and copy them to the SAME file coil.csv (appended after the others obviously)
After all of that hoopla I wish to get out:
51,359.069935,-11.773073,2.443772,l
52,396.321911,-8.711589,1.83507,l
53,423.766684,-4.238343,1.85591,l
50,566.199838,17.656284,1.719818,r
49,506.227269,15.441723,1.40972,r
48,468.909171,21.362568,1.741944,r
47,450.626074,21.878299,1.841957,r
46,414.9969,19.932693,1.306851,r
I realize that this may be a pretty involved question, and I certainly understand if no one wants to deal with all this bs, lol. Anyway, some full on code, snippets, ideas or even relevant links would be GREATLY appreciated. I've been, and still am googling, but it's harder than expected to find good reliable information pertaining to this.
P.S. Here is the piece of Python code that did what I am talking about (although it created two separate files for the lefts and rights, which I don't really need), if it helps you at all.
import csv
from easygui import msgbox, fileopenbox, boolbox  # assumption: the dialog helpers come from easygui

msgbox(msg="Please locate your survey file in the next window.")
mainfile = fileopenbox(title="Open survey file")
toponame = boolbox(msg="What is the name of the shots I should use for topography? Note: TOPO is used automatically",choices=("Left","Right"))
fieldnames = ["A","B","C","D","E"]
surveyfile = open(mainfile, "r")
left_file = open("left.csv",'wb')
right_file = open("right.csv",'wb')
coil_file = open("coil1.csv","wb")
reader = csv.DictReader(surveyfile, fieldnames=fieldnames, delimiter=",")
left_writer = csv.DictWriter(left_file, fieldnames + ["F"], delimiter=",")
# Sort every row by column B, ascending; the "l" rows are picked out below.
sortedlefts = sorted(reader,key=lambda x:float(x["B"]))
surveyfile.seek(0,0)  # rewind so the reader can be consumed a second time
right_writer = csv.DictWriter(right_file, fieldnames + ["F"], delimiter=",")
# Sort every row by column B, descending; the "r" rows are picked out below.
sortedrights = sorted(reader,key=lambda x:float(x["B"]), reverse=True)
coil_writer = csv.DictWriter(coil_file, fieldnames, delimiter=",",extrasaction='ignore')
for row in sortedlefts:
    if row["E"] == "l" or row["E"] == "cl+l":
        row['F'] = '%s,%s' % (row['B'], row['D'])
        left_writer.writerow(row)
        coil_writer.writerow(row)
for row in sortedrights:
    if row["E"] == "r":
        row['F'] = '%s,%s' % (row['B'], row['D'])
        right_writer.writerow(row)
        coil_writer.writerow(row)
One option you have is to start with a class to hold the fields. This allows you to override the ToString method to facilitate the output. Then, it's a fairly simple matter of reading each line and assigning the values to a list of the class. In your case you'll want the extra step of making 2 lists sorting one descending and combining them:
Class Fields
    Property A As Double = 0
    Property B As Double = 0
    Property C As Double = 0
    Property D As Double = 0
    Property E As String = ""
    Public Overrides Function ToString() As String
        Return Join({A.ToString, B.ToString, C.ToString, D.ToString, E}, ",")
    End Function
End Class
Function SortedFields(filename As String) As List(Of Fields)
    SortedFields = New List(Of Fields)
    Dim test As New List(Of Fields)
    Using sr As New IO.StreamReader(filename)
        Do Until sr.EndOfStream
            Dim fieldarray() As String = sr.ReadLine.Split(","c)
            ' Skip malformed lines and any row whose fifth column starts with "e" (e.g. "eow").
            If fieldarray.Length = 5 AndAlso Not fieldarray(4)(0) = "e"c Then
                If fieldarray(4) = "r" Then
                    test.Add(New Fields With {.A = Double.Parse(fieldarray(0)), .B = Double.Parse(fieldarray(1)), .C = Double.Parse(fieldarray(2)), .D = Double.Parse(fieldarray(3)), .E = fieldarray(4)})
                Else
                    SortedFields.Add(New Fields With {.A = Double.Parse(fieldarray(0)), .B = Double.Parse(fieldarray(1)), .C = Double.Parse(fieldarray(2)), .D = Double.Parse(fieldarray(3)), .E = fieldarray(4)})
                End If
            End If
        Loop
    End Using
    ' Lefts ascending by B, followed by rights descending by B.
    SortedFields = SortedFields.OrderBy(Function(x) x.B).Concat(test.OrderByDescending(Function(x) x.B)).ToList
End Function
One simple way of writing the data to a csv file is to use the IO.File.WriteAllLines method and the ConvertAll method of the List:
IO.File.WriteAllLines("coil.csv", SortedFields("textfile1.txt").ConvertAll(New Converter(Of Fields, String)(Function(x As Fields) x.ToString)))
You'll notice how the ToString method facilitates this quite easily.
If the class will only be used for this you do have the option to make all the fields string.
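For readers following along in Python, the same filter-and-sort logic (lefts ascending by column 2, then rights descending by column 2) can be sketched as below. This is an illustrative equivalent and not part of the VB answer; the function name `coil_rows` is made up, and the sample data is the input from the question.

```python
import csv
import io

SAMPLE = """\
41,332.568825,22.221759,-0.489714,eow
42,347.142926,-2.488763,-0.19358,eow
46,414.9969,19.932693,1.306851,r
47,450.626074,21.878299,1.841957,r
48,468.909171,21.362568,1.741944,r
49,506.227269,15.441723,1.40972,r
50,566.199838,17.656284,1.719818,r
51,359.069935,-11.773073,2.443772,l
52,396.321911,-8.711589,1.83507,l
53,423.766684,-4.238343,1.85591,l
"""


def coil_rows(lines):
    """Return the 'l' rows ascending by column 2, then the 'r' rows descending."""
    rows = [row for row in csv.reader(lines) if len(row) == 5]
    lefts = sorted((r for r in rows if r[4] == "l"), key=lambda r: float(r[1]))
    rights = sorted(
        (r for r in rows if r[4] == "r"), key=lambda r: float(r[1]), reverse=True
    )
    return lefts + rights


result = [",".join(row) for row in coil_rows(io.StringIO(SAMPLE))]
```

Here `result` holds the eight lines in the order the question asks for. Writing them to coil.csv would be a matter of joining them with newlines.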

Python nest list performance choice

I am trying to understand if there is an advantage in space/time/programming to storing data from a signal processing system as a nested list in either of these layouts:
1) data[channel][sample]
2) data[sample][channel]
I can code processing for both, though I personally find 1) easier to write and index than 2).
However, 2) is the more common way my local group programs and stores the data (either in Excel/CSV or from the data gathering systems). While it is easy to transpose
dataA = map(list, zip(*dataB))
I was wondering if there are any storage or performance - or even - module compatibility issues with 1 over 2?
With 1) I can loop like this:
for R in dataA:
    for C in R:
        process_channel(C)
matplotlib.loglog(dataA[0], dataA[i])
where dataA[0] is time or frequency and i is some other channel to plot.
With 2):
for R in dataB:
    for C in R:
        process_sample(C)
matplotlib.loglog([j[0] for j in dataB],[k[i] for k in dataB])
This looks worse in programming style. Maybe I am missing a list method that makes this easier? I have also developed code that uses dicts, but this really breaks with general use, so I am less inclined to continue using dicts. The dict storage looks like
dataC = [{'f': 0.1, 'chnl1': 100.0}, {'f': 0.2, 'chnl1': 110.0}]
or some such. It seems that option 2) is the better-integrated choice. However, I am trying to understand how best to code with option 2) when you wish to process over channels and then samples. Should I just transpose the matrix first, do the work in option 1) space, and transpose the results back?
dataA = smoothing(dataA, smooth_factor)
def smoothing(d, step):
    td = map(list, zip(*d))  # transpose so each row is one channel
    nd = []
    for row in td:
        col = []
        for i in xrange(0, len(row) - step, step):
            col.append(sum(row[i:i+step]) / step)  # average one block of samples
        nd.append(col)
    nd = numpy.transpose(nd)  # transpose back to the original layout
    return nd
While this construction works, transposing back and forth all the time looks, um, inefficient.
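One possible way to avoid transposing the whole dataset is to block-average directly over the data[sample][channel] layout, as sketched below. The function name `smooth_samples` is made up for illustration. Note that this sketch keeps the final full block, which the loop bounds in the code above would skip whenever the number of samples is an exact multiple of the step.

```python
def smooth_samples(data, step):
    """Block-average data[sample][channel] over `step` consecutive samples."""
    smoothed = []
    for start in range(0, len(data) - step + 1, step):
        block = data[start:start + step]
        # zip(*block) yields one tuple per channel, for this block only.
        smoothed.append([sum(channel) / float(step) for channel in zip(*block)])
    return smoothed


# Two channels (frequency, level) sampled four times.
dataB = [[0.1, 100.0], [0.2, 110.0], [0.3, 120.0], [0.4, 130.0]]
result = smooth_samples(dataB, 2)
```

The result stays in the same [sample][channel] layout as the input, so nothing needs to be transposed back afterwards.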

For data error checking: Is there a way to avoid using a dictionary for a list

I have data that looks like this:
Observation 1
Type : 1
Color: 2
Observation 2
Color: 2
Resolution: 3
Originally what I had done was to attempt to create a csv that looked like:
1,2
2,3 # Only problem here is that the data should look like this 1,2,\n ,2,3 #
I performed the following operation:
while linecache.getline(filename, curline):
    for i in range(2):
        data_manipulated = linecache.getline(filename, curline).rstrip()
        datamanipulated2 = data_manipulated.split(":")
        datamanipulated2.pop(0)
        lines.append(':'.join(datamanipulated2))
This is quite a large dataset and I tried to find ways to verify that the above problem doesn't happen so that I can compile the data appropriately and with checks. I came across dictionaries, however, performance is a big issue for me and I would prefer lists if that's possible (at least, my understanding is that dictionaries can be significantly slower?). I was just wondering if anyone had any suggestions on the quickest and most robust way to do this?
How about something like:
import re
input_file = open('/path/to/input.file')
results = []
for row in input_file:
    m = re.match(r'Observation (\d+)', row)
    if m:
        observation = m.group(1)
        continue
    m = re.match(r'Color: (\d+)', row)
    if m:
        results.append((observation, m.group(1),))
        print("{0},{1}".format(*results[-1]))
You can speed this up by using precompiled regular expressions.
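A sketch of what the precompiled version might look like, extended so that a missing field leaves an explicit gap instead of shifting the columns. It assumes the three field names shown in the question (Type, Color, Resolution); the names `parse_observations`, `OBSERVATION_RE`, `FIELD_RE` and `FIELDS` are made up for illustration. Each observation is stored as a plain list, not a dict.

```python
import re

# Compiled once, reused for every line.
OBSERVATION_RE = re.compile(r"Observation (\d+)")
FIELD_RE = re.compile(r"(\w+)\s*:\s*(\d+)")
FIELDS = ("Type", "Color", "Resolution")  # assumed column order


def parse_observations(lines):
    """Return one list per observation, with None where a field is missing."""
    rows = []
    current = None
    for line in lines:
        if OBSERVATION_RE.match(line):
            current = [None] * len(FIELDS)
            rows.append(current)
            continue
        m = FIELD_RE.match(line)
        if m and current is not None and m.group(1) in FIELDS:
            current[FIELDS.index(m.group(1))] = m.group(2)
    return rows


sample = [
    "Observation 1",
    "Type : 1",
    "Color: 2",
    "Observation 2",
    "Color: 2",
    "Resolution: 3",
]
table = parse_observations(sample)
```

Every row has the same length, so a simple check such as `None in row` flags incomplete observations before they are written out.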

Translate ruby to python

I'm rewriting some code from Ruby to Python. The code is for a Perceptron, listed in section 8.2.6 of Clever Algorithms: Nature-Inspired Programming Recipes. I've never used Ruby before and I don't understand this part:
def test_weights(weights, domain, num_inputs)
  correct = 0
  domain.each do |pattern|
    input_vector = Array.new(num_inputs) {|k| pattern[k].to_f}
    output = get_output(weights, input_vector)
    correct += 1 if output.round == pattern.last
  end
  return correct
end
Some explanation: num_inputs is an integer (2 in my case), and domain is a list of arrays: [[1,0,1], [0,0,0], etc.]
I don't understand this line:
input_vector = Array.new(num_inputs) {|k| pattern[k].to_f}
It creates an array with 2 values, where the value at each index k is pattern[k].to_f, but what is pattern[k].to_f?
Try this:
input_vector = [float(pattern[i]) for i in range(num_inputs)]
pattern[k].to_f
converts pattern[k] to a float.
I'm not a Ruby expert, but I think it would be something like this in Python:
def test_weights(weights, domain, num_inputs):
    correct = 0
    for pattern in domain:
        output = get_output(weights, pattern[:num_inputs])
        if round(output) == pattern[-1]:
            correct += 1
    return correct
There is plenty of scope for optimising this: if num_inputs is always one less than the length of the lists in domain, then you may not need that parameter at all.
Be careful about doing line by line translations from one language to another: that tends not to give good results no matter what languages are involved.
Edit: since you said you don't think you need to convert to float you can just slice the required number of elements from the domain value. I've updated my code accordingly.
