I am writing a Python 3 program that works with several devices, and I have to store constants for each device. Some constants are shared by all devices and never change, while others differ from version to version depending on the device's firmware. I have to store the constants for all versions, not only the latest one. What is the Pythonic way to define constants for different devices and for multiple versions of each device?
My current solution looks like this:
general = {
    'GENERAL_CONST_1': 1,
    'GENERAL_CONST_2': 2,
    ...
    'GENERAL_CONST_N': N
}

device_1 = dict()
device_1[FIRMWARE_VERSION_1] = {
    'DEVICE_1_CONST_1': 1,
    'DEVICE_1_CONST_2': 2,
    ...
    'DEVICE_1_CONST_N': N
}
device_1[FIRMWARE_VERSION_1].update(general)

device_1[FIRMWARE_VERSION_2] = {
    'DEVICE_1_CONST_1': 1,
    'DEVICE_1_CONST_2': 2,
    ...
    'DEVICE_1_CONST_N': N
}
device_1[FIRMWARE_VERSION_2].update(general)

device_2 = dict()
device_2[FIRMWARE_VERSION_1] = {
    'DEVICE_2_CONST_1': 1,
    'DEVICE_2_CONST_2': 2,
    ...
    'DEVICE_2_CONST_N': N
}
device_2[FIRMWARE_VERSION_1].update(general)

device_2[FIRMWARE_VERSION_2] = {
    'DEVICE_2_CONST_1': 1,
    'DEVICE_2_CONST_2': 2,
    ...
    'DEVICE_2_CONST_N': N
}
device_2[FIRMWARE_VERSION_2].update(general)
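For illustration, the repetition above could be reduced with a small helper that merges the general constants into each per-version dict (just a sketch of the same idea; make_version is a name I made up and the FIRMWARE_VERSION_* keys are placeholders):

def make_version(general, specific):
    # start from the general constants and overlay the version-specific ones
    merged = dict(general)
    merged.update(specific)
    return merged

device_1 = {
    FIRMWARE_VERSION_1: make_version(general, {'DEVICE_1_CONST_1': 1, 'DEVICE_1_CONST_2': 2}),
    FIRMWARE_VERSION_2: make_version(general, {'DEVICE_1_CONST_1': 1, 'DEVICE_1_CONST_2': 3}),
}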
Thank you in advance! Or, if you could point me toward something I can read about the above, I would be grateful for that too.
UPD1:
Thanks to #languitar I decided to use one of the INI/JSON/YAML/TSON... formats, for example one of the formats supported by the library python-anyconfig. The INI format (proposed by #languitar via configparser) looks good for my purposes (TSON also seemed interesting), but unfortunately neither of them supports hex values. I was very surprised, because all my constants should be in hex format. So I decided to try the YAML format. Now the file with constants looks like this:
# General consts for all devices and all versions
general: &general
    GENERAL_CONST_1: 1
    GENERAL_CONST_2: 2
    ...
    GENERAL_CONST_N: N

# Particular consts for device_1 for different firmware versions
device_1: &device_1
    <<: *general
    # General consts for device_1 and all firmware versions
    DEVICE_1_CONST_1: 1

device_1:
    FIRMWARE_VERSION_1:
        <<: *device_1
        DEVICE_1_CONST_2: 2
        ...
        DEVICE_1_CONST_N: N
    FIRMWARE_VERSION_2:
        <<: *device_1
        DEVICE_1_CONST_2: 2
        ...
        DEVICE_1_CONST_N: N

# Particular consts for device_2 for different firmware versions
device_2: &device_2
    <<: *general
    # General consts for device_2 and all firmware versions
    DEVICE_2_CONST_1: 1

device_2:
    FIRMWARE_VERSION_1:
        <<: *device_2
        DEVICE_2_CONST_2: 2
        ...
        DEVICE_2_CONST_N: N
    FIRMWARE_VERSION_2:
        <<: *device_2
        DEVICE_2_CONST_2: 2
        ...
        DEVICE_2_CONST_N: N
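For reference, a minimal sketch of how I read such a file back in Python, assuming PyYAML is installed and the file is called constants.yaml (both assumptions on my side):

import yaml  # PyYAML

with open('constants.yaml') as f:
    config = yaml.safe_load(f)  # anchors and <<: merge keys are resolved during loading

# look up constants for a specific device and firmware version
device_1_v1 = config['device_1']['FIRMWARE_VERSION_1']
print(device_1_v1['GENERAL_CONST_1'])   # inherited from the 'general' block
print(device_1_v1['DEVICE_1_CONST_2'])  # version-specific constant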
But I am not sure whether this is the right way to store constants for devices and all their firmware versions.
Just change your names to all capital letters, like GENERAL, DEVICE_1, etc.
Related
This question is a bit long; please bear with me.
I have a data structure with elements like this {x1, x2, x3, x4, x5}:
{0 0 0 0 0, 0 0 0 1 0, 1 1 1 1 0,.....}
They represent all the TRUEs in the truth table. Of course, the 5-bit string elements not present in this set correspond to FALSEs in the truth table. But I don't have the boolean function corresponding to the said set data structure.
I have seen this question, but there all the answers assume the Boolean function is given, which is not the case here.
I need to build a ROBDD and then a ZDD from the given set data structure, preferably with available Python packages like these.
Any advice from experts? I am sure a lot has been done along the line of this.
With the Python package dd, which can be installed using the package manager pip with pip install dd, it is possible to convert the set of variable assignments where the Boolean function is TRUE to a binary decision diagram.
The following example in Python assumes that the assignments where the function is TRUE are given as a set of strings.
from dd import autoref as _bdd

# assignments where the Boolean function is TRUE
data = {'0 0 0 0 0', '0 0 0 1 0', '1 1 1 1 0'}
# variable names
vrs = [f'x{i}' for i in range(1, 6)]
# convert the assignments to dictionaries
assignments = list()
for e in data:
    tpl = e.split()
    assignment = {k: bool(int(v)) for k, v in zip(vrs, tpl)}
    assignments.append(assignment)
# initialize a BDD manager
bdd = _bdd.BDD()
# declare variables
bdd.declare(*vrs)
# create binary decision diagram
u = bdd.false
for assignment in assignments:
    u |= bdd.cube(assignment)
# to confirm
satisfying_assignments = list(bdd.pick_iter(u))
print(satisfying_assignments)
For a faster implementation of BDDs, and for an implementation of ZDDs using the C library CUDD, the Cython module extensions dd.cudd and dd.cudd_zdd can be installed as follows:
pip download dd --no-deps
tar xzf dd-*.tar.gz
cd dd-*
python setup.py install --fetch --cudd --cudd_zdd
For this (small) example there is no practical speed difference between the pure Python module dd.autoref and the Cython module dd.cudd.
The above binary decision diagram (BDD) can be copied to a zero-suppressed binary decision diagram (ZDD) with the following code:
from dd import _copy
from dd import cudd_zdd
# initialize a ZDD manager
zdd = cudd_zdd.ZDD()
# declare variables
zdd.declare(*vrs)
# copy the BDD to a ZDD
u_zdd = _copy.copy_bdd(u, zdd)
# confirm
satisfying_assignments = list(zdd.pick_iter(u_zdd))
print(satisfying_assignments)
The module dd.cudd_zdd was added in dd == 0.5.6, so the above installation requires downloading the distribution of dd >= 0.5.6, either from PyPI, or from the GitHub repository.
I have a very big file with millions of paths to various executables on Windows systems. A simple example would be the following:
C:\windows\ccmcache\1d\Deploy-Application.exe
C:\WINDOWS\ccmcache\7\Deploy-Application.exe
C:\windows\ccmcache\2o\Deploy-Application.exe
C:\WINDOWS\ccmcache\6\Deploy-Application.exe
C:\WINDOWS\ccmcache\15\Deploy-Application.exe
C:\WINDOWS\ccmcache\m\Deploy-Application.exe
C:\WINDOWS\ccmcache\1g\Deploy-Application.exe
C:\windows\ccmcache\2r\Deploy-Application.exe
C:\windows\ccmcache\1l\Deploy-Application.exe
C:\windows\ccmcache\2s\Deploy-Application.exe
or
C:\Users\user23452345\temp\test\1\Another1-Application.exe
C:\Users\user1324asdf\temp\Another-Applicatiooon.exe
C:\Users\user23452---5\temp\lili\Another-Application.exe
C:\Users\user23hkjhf_5\temp\An0ther-Application.exe
As a human, I can recognize that these strings are similar and match them fairly easily with some regex in code. My issue, however, is finding these patterns in the first place, as there are far too many of them, they are completely unknown to me, and they change frequently.
My goal is to write a python script that finds these similar strings with a degree of certainty and groups them for me.
Which methods, libraries, keywords etc. should I look into to solve this problem?
One possible way to approach this is by calculating the distance between strings. For that, you could use the textdistance lib.
Hope this helps!
Edit:
Two starting points to get more familiar with the subject:
https://en.wikipedia.org/wiki/Edit_distance
https://en.wikipedia.org/wiki/Levenshtein_distance
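For illustration, a small sketch of that idea using textdistance (the 0.8 threshold and the group_by_similarity helper are arbitrary choices for this example, not part of the library):

import textdistance

paths = [
    r'C:\windows\ccmcache\1d\Deploy-Application.exe',
    r'C:\WINDOWS\ccmcache\7\Deploy-Application.exe',
    r'C:\Users\user1324asdf\temp\Another-Applicatiooon.exe',
]

def group_by_similarity(strings, threshold=0.8):
    # greedily assign each string to the first group whose representative is similar enough
    groups = []  # each group is a list; its first element acts as the representative
    for s in strings:
        for group in groups:
            score = textdistance.levenshtein.normalized_similarity(s.lower(), group[0].lower())
            if score >= threshold:
                group.append(s)
                break
        else:
            groups.append([s])
    return groups

for group in group_by_similarity(paths):
    print(group)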
Try fuzzywuzzy, a soft string matcher. It makes a difference if you keep the strings as they are or lower case them first:
from fuzzywuzzy import fuzz
import itertools

lines = [
    r'C:\windows\ccmcache\1d\Deploy-Application.exe',
    r'C:\WINDOWS\ccmcache\m\Deploy-Application.exe',
    r'user5323\A-different-Application.bat',
]

for line1, line2 in itertools.combinations(lines, r=2):
    case_match = fuzz.ratio(line1, line2)
    insensitive_case_match = fuzz.ratio(line1.lower(), line2.lower())
    print(line1[:10], '...', line1[:-10])
    print(line2[:10], '...', line2[:-10])
    print(case_match, insensitive_case_match)
    print()
C:\windows ... C:\windows\ccmcached\Deploy-Appli
C:\WINDOWS ... C:\WINDOWS\ccmcache\m\Deploy-Appli
80 95
C:\windows ... C:\windows\ccmcached\Deploy-Appli
user5323\A ... user5323\A-different-Appli
42 45
C:\WINDOWS ... C:\WINDOWS\ccmcache\m\Deploy-Appli
user5323\A ... user5323\A-different-Appli
40 45
One fairly straightforward and simple way would be to check "how much" a pair of strings differ, like so:
import difflib
from collections import defaultdict

grouping_requirement = 0.75  # (0;1), the closer to 1, the stronger the equality needs to be to be grouped

s = r'''C:\windows\ccmcache\1d\Deploy-Application.exe
C:\WINDOWS\ccmcache\7\Deploy-Application.exe
C:\windows\ccmcache\2o\Deploy-Application.exe
C:\WINDOWS\ccmcache\6\Deploy-Application.exe
C:\WINDOWS\ccmcache\15\Deploy-Application.exe
C:\WINDOWS\ccmcache\m\Deploy-Application.exe
C:\WINDOWS\ccmcache\1g\Deploy-Application.exe
C:\windows\ccmcache\2r\Deploy-Application.exe
C:\windows\ccmcache\1l\Deploy-Application.exe
C:\windows\ccmcache\2s\Deploy-Application.exe
C:\Users\user23452345\temp\test\1\Another1-Application.exe
C:\Users\user1324asdf\temp\Another-Applicatiooon.exe
C:\Users\user23452---5\temp\lili\Another-Application.exe
C:\Users\user23hkjhf_5\temp\An0ther-Application.exe'''

groups = defaultdict(list)

def match_ratio(s1, s2):
    return difflib.SequenceMatcher(None, s1, s2).ratio()

for line in set(s.splitlines()):
    for group in groups:
        if match_ratio(group, line) > grouping_requirement:
            groups[group].append(line)
            break
    else:
        groups[line].append(line)

for group in groups.values():
    print(', '.join(group))
    print()
The output of this little application is:
C:\WINDOWS\ccmcache\1g\Deploy-Application.exe, C:\WINDOWS\ccmcache\m\Deploy-Application.exe, C:\windows\ccmcache\1l\Deploy-Application.exe, C:\WINDOWS\ccmcache\15\Deploy-Application.exe, C:\WINDOWS\ccmcache\7\Deploy-Application.exe, C:\WINDOWS\ccmcache\6\Deploy-Application.exe, C:\windows\ccmcache\2s\Deploy-Application.exe, C:\windows\ccmcache\1d\Deploy-Application.exe, C:\windows\ccmcache\2o\Deploy-Application.exe, C:\windows\ccmcache\2r\Deploy-Application.exe
C:\Users\user23452345\temp\test\1\Another1-Application.exe, C:\Users\user23hkjhf_5\temp\An0ther-Application.exe, C:\Users\user1324asdf\temp\Another-Applicatiooon.exe, C:\Users\user23452---5\temp\lili\Another-Application.exe
As you can see at the top of the code snippet, there is a constant, grouping_requirement, which I arbitrarily set to 0.75. If you reduce that value closer to 0.0, more paths will be grouped together; if you raise it closer to 1.0, fewer paths will be grouped. Good luck!
I'm starting to learn Go after other languages. Go has a very elegant way of creating constants with numeric values like:
const (
    _ = iota // 0 and is skipped
    Sunday   // 1
    Monday   // 2
    ...
)
This is very easy to write, but is it really easy to maintain? For example, if you insert a new value between existing ones, all subsequent constants will change their values. That will be hard to find; only scrupulous diff reading can reveal it, or errors in other parts of the program. How can I extract these values together with their names and use them in other parts of a program, or in a database?
For example for PostgreSQL I can define:
CREATE TYPE color AS ENUM ('', 'Sunday', 'Monday');
Just to illustrate the idea: for example, Python has an Enum type:
from enum import Enum

class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3
Then you may use it like Color.RED. Next, I can take all the values:
list(Color)
[<Color.RED: 1>, <Color.GREEN: 2>, <Color.BLUE: 3>]
This allows me to "introspect" the module and create easily readable enums in databases. For example, for PostgreSQL I can define:
CREATE TYPE color AS ENUM ('RED', 'GREEN', 'BLUE');
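In Python that introspection is only a couple of lines; a rough sketch (the color type name is just the example from above):

names = [c.name for c in Color]                        # ['RED', 'GREEN', 'BLUE']
enum_values = ', '.join(f"'{name}'" for name in names)
print(f"CREATE TYPE color AS ENUM ({enum_values});")   # CREATE TYPE color AS ENUM ('RED', 'GREEN', 'BLUE');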
How can I:
Reflect Go constant names?
Make error-proof constants whose values cannot drift and can only be changed manually?
Maybe there's an idiomatic way to do this better?
Thanks.
1) You can use stringer to generate the names: https://godoc.org/golang.org/x/tools/cmd/stringer
2) Not sure what you mean? Most languages will allow values to drift. You should always add to the end of the list if you want the numbers to stay constant, or, as in Python, you could explicitly set each value to a number instead of using iota (see the sketch below).
3) Not really, enums just aren't great in Go.
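To illustrate point 2 in Python (since that is the comparison drawn in the question), explicitly assigned values do not shift when a new member is inserted; a small sketch:

from enum import IntEnum

class Weekday(IntEnum):
    SUNDAY = 1
    MONDAY = 2
    # a member inserted here later does not renumber the ones below,
    # because every value is written out explicitly
    SATURDAY = 7

print(list(Weekday))  # [<Weekday.SUNDAY: 1>, <Weekday.MONDAY: 2>, <Weekday.SATURDAY: 7>]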
Just a suggestion, but something that might help in your case: I find that constants are less likely to be changed/broken later on if it's clear that the values are bit masks, which you can do in Go like so:
const (
    Red = 1 << iota
    Green
    Blue
) // values = 1, 2, 4
And, even though it's not the prettiest of declarations, you can include the mask constants, too
const (
    Red, RedMask = 1 << iota, 1<<iota - 1 // Red = 1, RedMask = 0
    Green, GreenMask                      // Green = 2, mask = 1
    Blue, BlueMask                        // 4, 3
    RGB, RGBMask                          // 8, 7
)
This, coupled with a designated type for these constants, might be useful:
type ColourConst int

const (
    Red, RMask ColourConst = 1 << iota, 1<<iota - 1
    // ...
    _, All
)

// something like this (untested, might not be correct)
func (c ColourConst) validate() error {
    mask := int(c) & (-1 * int(c))
    if mask != int(c) {
        return errors.New("Colour is not a single bit value")
    }
    if s := c & All; s != c {
        return errors.New("Colour is not in range")
    }
    return nil
}
I know that the days of the week are unlikely to be used as bitmasks, but it makes it less likely for people to break the code. At the very least, it communicates that the order of the constants matter, that's what iota does IMO.
Solution.
There are excellent modules for this: Enumer and Enumelinter.
I'm not sure what to call what I'm looking for, so if I failed to find this question elsewhere, I apologize. In short, I am writing Python code that will interface directly with the Linux kernel. It's easy to get the required values from the include header files and write them into my source:
IFA_UNSPEC = 0
IFA_ADDRESS = 1
IFA_LOCAL = 2
IFA_LABEL = 3
IFA_BROADCAST = 4
IFA_ANYCAST = 5
IFA_CACHEINFO = 6
IFA_MULTICAST = 7
It's easy to use these values when constructing structs to send to the kernel. However, they are of almost no help when resolving the values in the responses from the kernel.
If I put the values into a dict, I presume I would have to scan all the values in the dict to look up the key for each item in each struct from the kernel. There must be a simpler, more efficient way.
How would you do it? (Feel free to retitle the question if it's way off.)
If you want to use two dicts, you can try this to create the inverted dict:
b = {v: k for k, v in a.items()}
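For example (a small self-contained sketch; the dict a here just stands in for the constants above):

a = {'IFA_UNSPEC': 0, 'IFA_ADDRESS': 1, 'IFA_LOCAL': 2}
b = {v: k for k, v in a.items()}

print(a['IFA_LOCAL'])  # 2 -- useful when building structs to send to the kernel
print(b[2])            # 'IFA_LOCAL' -- useful when decoding kernel responses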
Your solution leaves a lot of repetitive work to the person creating the file, and that is a source of errors (you actually have to write each name three times). If you have a file where you need to update those values from time to time (like when new kernel releases come out), you are destined to include an error sooner or later. That was just a long way of saying: your solution violates DRY.
I would change your solution to something like this:
IFA_UNSPEC = 0
IFA_ADDRESS = 1
IFA_LOCAL = 2
IFA_LABEL = 3
IFA_BROADCAST = 4
IFA_ANYCAST = 5
IFA_CACHEINFO = 6
IFA_MULTICAST = 7
__IFA_MAX = 8
values = {globals()[x]:x for x in dir() if x.startswith('IFA_') or x.startswith('__IFA_')}
This way the values dict is generated automatically. You might want to (or have to) change the condition in the if statement, according to whatever else is in that file. Maybe something like the following; that version takes away the need to list prefixes in the if statement, but it would fail if you had other stuff in the file.
values = {globals()[x]:x for x in dir() if not x.endswith('__')}
You could of course do something more sophisticated there, e.g. check for accidentally repeated values.
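For instance, something along these lines (an untested sketch): build the reverse mapping explicitly and fail loudly if two names share a value.

values = {}
for name in dir():
    if not name.startswith(('IFA_', '__IFA_')):
        continue
    value = globals()[name]
    if value in values:
        raise ValueError(f'{name} and {values[value]} both map to {value}')
    values[value] = name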
What I ended up doing is leaving the constant values in the module and creating a dict. The module is ip_addr.py (the values are from linux/if_addr.h), so when constructing structs to send to the kernel I can use if_addr.IFA_LABEL and resolve responses with if_addr.values[2]. I'm hoping this is the most straightforward approach, so that when I have to look at this again in a year+ it's easy to understand :p
IFA_UNSPEC = 0
IFA_ADDRESS = 1
IFA_LOCAL = 2
IFA_LABEL = 3
IFA_BROADCAST = 4
IFA_ANYCAST = 5
IFA_CACHEINFO = 6
IFA_MULTICAST = 7
__IFA_MAX = 8
values = {
    IFA_UNSPEC    : 'IFA_UNSPEC',
    IFA_ADDRESS   : 'IFA_ADDRESS',
    IFA_LOCAL     : 'IFA_LOCAL',
    IFA_LABEL     : 'IFA_LABEL',
    IFA_BROADCAST : 'IFA_BROADCAST',
    IFA_ANYCAST   : 'IFA_ANYCAST',
    IFA_CACHEINFO : 'IFA_CACHEINFO',
    IFA_MULTICAST : 'IFA_MULTICAST',
    __IFA_MAX     : '__IFA_MAX'
}
Hi, I'm working on converting Perl to Python for something to do.
I've been looking at some code on hash tables in Perl, and I've come across a line of code where I really don't know how to do what it does in Python. I know that it shifts the bit strings of page by 1.
%page_table = ();        # page table is a hash of hashes
%page_table_entry = (    # page table entry structure
    "dirty",      0,     # 0/1 boolean
    "referenced", 0,     # 0/1 boolean
    "valid",      0,     # 0/1 boolean
    "frame_no",  -1,     # -1 indicates an "x", i.e. the page isn't in ram
    "page",       0      # used for aging algorithm. 8 bit string.
);
@ram = ((-1) x $num_frames);
Could someone please give me an idea of how this would be represented in Python? I've got the definitions of the hash tables done; they're just there as a reference for what I'm doing. Thanks for any help that you can give me.
for($i=0; $i<@ram; $i++){
    $page_table{$ram[$i]}->{page} = $page_table{$ram[$i]}->{page} >> 1;
}
The only confusing thing is that the page table is a hash of hashes. $page_table{$v} contains a hashref to a hash that contains a key 'page' whose value is an integer. The loop bit-shifts that integer, but it is not very clear Perl code. Simpler would be:
foreach my $v (@ram) {
    $page_table{$v}->{page} >>= 1;
}
Now the translation to Python should be obvious:
for v in ram:
    page_table[v]['page'] >>= 1
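To make the mapping concrete, here is a small self-contained sketch of the same structure in Python (the sample values are made up; only the shape mirrors the Perl above):

num_frames = 4

# the page table entry structure from the Perl code, as a plain dict
page_table_entry = {
    'dirty': 0,        # 0/1 boolean
    'referenced': 0,   # 0/1 boolean
    'valid': 0,        # 0/1 boolean
    'frame_no': -1,    # -1 indicates the page isn't in RAM
    'page': 0,         # used for the aging algorithm, an 8-bit value
}

ram = [-1] * num_frames
# dict of dicts, just like the Perl hash of hashes
page_table = {v: dict(page_table_entry) for v in ram}

# the aging step: shift each tracked page's counter right by one bit
for v in ram:
    page_table[v]['page'] >>= 1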
Here is what my Pythonizer generates for that code:
#!/usr/bin/env python3
# Generated by "pythonizer -mV q8114826.pl" v0.974 run by snoopyjc on Thu Apr 21 23:35:38 2022
import perllib, builtins
_str = lambda s: "" if s is None else str(s)
perllib.init_package("main")
num_frames = 0
builtins.__PACKAGE__ = "main"
page_table = {} # page table is a hash of hashes
page_table_entry = {"dirty": 0, "referenced": 0, "valid": 0, "frame_no": -1, "page": 0}
# page table entry structure
# 0/1 boolean
# 0/1 boolean
# 0/1 boolean
# -1 indicates an "x", i.e. the page isn't in ram
# used for aging algorithm. 8 bit string.
ram = [(-1) for _ in range(num_frames)]
for i in range(0, len(ram)):
    page_table[_str(ram[i])]["page"] = perllib.num(page_table.get(_str(ram[i])).get("page")) >> 1
Woof! No wonder you want to try Python!
Yes, Python can do this, because Python dictionaries (what you'd call hashes in Perl) can contain other lists or dictionaries without needing references to them.
However, I highly suggest that you look into moving into object oriented programming. After looking at that assignment statement of yours, I had to lie down for a bit. I can't imagine trying to maintain and write an entire program like that.
Whenever you have to do a hash that contains an array, or an array of arrays, or a hash of hashes, you should be looking into using object oriented code. Object oriented code can prevent you from making all the sorts of errors that happen when you do that type of stuff. And, it can make your code much more readable -- even Perl code.
Take a look at the Python Tutorial and the Perl Object Oriented Tutorial, and learn a bit about object-oriented programming.
This is especially true in Python which was written from the ground up to be object oriented.