Python: What is the BEST data structure to use in this scenario? - python

I am trying to write a DNS server and client in Python. The server will store data like:
qtsdatacenter.aws.com 128.64.3.2 A
www.ibm.com 64.42.3.4 A
www.google.com 8.6.4.2 A
localhost - NS
Each record is basically hostname, IP address, type.
What would be the best data structure to use so that searching for a query and outputting the matching record is easy?
For example: the client sends the string www.google.com, the server searches its stored table for a matching hostname and returns the record in the format
www.google.com 8.6.4.2 A

Keep it simple and use a dictionary. Your keys (the hostnames) are hashable, and dictionary lookups are O(1) on average. For example:
dct = {"www.google.com": "www.google.com 8.6.4.2 A",
       "www.ibm.com": "www.ibm.com 64.42.3.4 A"}

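As a slightly more structured variant of the dictionary idea, here is a minimal sketch that stores each record as a (IP address, type) tuple and formats the reply on lookup. The names RECORDS and lookup are made up for illustration, and returning None for an unknown hostname is an assumption, not something the question specifies.

```python
# Hypothetical record table: hostname -> (ip_address, record_type).
RECORDS = {
    "qtsdatacenter.aws.com": ("128.64.3.2", "A"),
    "www.ibm.com": ("64.42.3.4", "A"),
    "www.google.com": ("8.6.4.2", "A"),
    "localhost": ("-", "NS"),
}


def lookup(hostname, records=RECORDS):
    """Return 'hostname ip type' for a known host, or None if it is missing."""
    entry = records.get(hostname)  # average O(1) hash lookup
    if entry is None:
        return None
    ip_address, record_type = entry
    return "%s %s %s" % (hostname, ip_address, record_type)


reply = lookup("www.google.com")
```

Keeping the fields separate means the server can also answer with just the IP address or just the type without parsing a string.
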
Related

Turning x.x.x.x string into list address (???)

This is a difficult problem to explain. I have a string that looks like "system.cpu.total.pct" that I'm pulling from a json configuration file. This particular format is required elsewhere in my program so I cannot change it.
This "system.cpu.total.pct" specifies what field I'm interested in snagging out of metricbeat (in Elasticsearch).
I need to convert this into a list address (? is that what to call it ?) so that I can snag stuff out of an array of database results I'm calling 'rawData'. Right now I'm doing this:
if sourceSet == "system.cpu.total.pct":
    dataArray.append(rawData['hits']['hits'][thisRecord]["_source"]['system']['cpu']['total']['pct'])
But that's no good, obviously, because the result is hard-coded.
How can I instead write something like
dataArray.append(rawData['hits']['hits'][thisRecord]["_source"]["system.cpu.total.pct"])
that will work for any arbitrary string?
Any suggestions? Thank you!
You can split the string on the dots and walk down the nested dictionaries one key at a time, which works for any sourceSet:
d = rawData['hits']['hits'][thisRecord]["_source"]
for t in sourceSet.split('.'):
    d = d[t]
dataArray.append(d)
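The same walk can be wrapped in a small reusable helper. This is only a sketch: get_by_path is a hypothetical name, and the sample record below is made up to mimic the shape of one hit's "_source" field.

```python
from functools import reduce
import operator


def get_by_path(record, dotted_path):
    """Follow a dotted path such as 'system.cpu.total.pct' through nested dicts."""
    # reduce applies record[key] for each key in turn, left to right.
    return reduce(operator.getitem, dotted_path.split("."), record)


# Made-up record shaped like one hit's "_source" field.
source = {"system": {"cpu": {"total": {"pct": 0.42}}}}
value = get_by_path(source, "system.cpu.total.pct")
```

A path that does not exist raises KeyError, so callers that expect missing fields may want to catch it.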

Representation of python dictionaries with unicode in database queries

I have a problem that I would like to know how to efficiently tackle.
I have data that is JSON-formatted (used with dumps / loads) and contains unicode.
This is part of a protocol implemented with JSON to send messages. So messages will be sent as strings and then loaded into python dictionaries. This means that the representation, as a python dictionary, afterwards will look something like:
{u"mykey": u"myVal"}
The system itself has no problem handling such structures; the problem arises when I build a database query to store the structure.
I'm using pyOrient with OrientDB. The command ends up looking something like:
"CREATE VERTEX TestVertex SET data = {u'mykey': u'myVal'}"
Which will end up in the data field getting the following values in OrientDB:
{'_NOT_PARSED_': '_NOT_PARSED_'}
I'm assuming this problem relates to other cases as well when you wish to make a query or somehow represent a data object containing unicode.
How could I efficiently get a representation of this data, of arbitrary depth, to be able to use it in a query?
To clarify even more, this is the string the db expects:
"CREATE VERTEX TestVertex SET data = {'mykey': 'myVal'}"
If I'm simply stating the wrong problem/question and should handle it some other way, I'm very much open to suggestions. But what I want to achieve is to have an efficient way to use python2.7 to build a db-query towards orientdb (using pyorient) that specifies an arbitrary data structure. The data property being set is of the OrientDB type EMBEDDEDMAP.
Any help greatly appreciated.
EDIT1:
More explicitly stating that the first code block shows the object as a dict AFTER being dumped / loaded with json to avoid confusion.
Dargolith:
OK, based on your last response, it seems you are simply looking for code that dumps a Python expression in a way that lets you control how unicode and other data types print. Here is a very simple function that provides this control. There are ways to make it more efficient (for example, by using a string buffer rather than the recursive string concatenation done here), but as it stands its execution time is probably still dominated by your DB lookup.
As you can see in each of the 'if' statements, you have full control of how each data type prints.
# Python 2 only (iteritems, basestring, print statement)
def expr_to_str(thing):
    # dict-like objects
    if hasattr(thing, 'keys'):
        pairs = ['%s:%s' % (expr_to_str(k), expr_to_str(v)) for k, v in thing.iteritems()]
        return '{%s}' % ', '.join(pairs)
    # list-like objects
    if hasattr(thing, '__setslice__'):
        parts = [expr_to_str(ele) for ele in thing]
        return '[%s]' % (', '.join(parts),)
    # str and unicode; note that str() raises UnicodeEncodeError on non-ASCII unicode
    if isinstance(thing, basestring):
        return "'%s'" % (str(thing),)
    return str(thing)
print "dumped: %s" % expr_to_str({'one': 33, 'two': [u'unicode', 'just a str', 44.44, {'hash': 'here'}]})
outputs:
dumped: {'two':['unicode', 'just a str', 44.44, {'hash':'here'}], 'one':33}
I went on to use json.dumps() as sobolevn suggested in the comment. I didn't think of that one at first since I wasn't really using json in the driver. It turned out however that json.dumps() provided exactly the formats I needed on all the data types I use. Some examples:
>>> json.dumps('test')
'"test"'
>>> json.dumps(['test1', 'test2'])
'["test1", "test2"]'
>>> json.dumps([u'test1', u'test2'])
'["test1", "test2"]'
>>> json.dumps({u'key1': u'val1', u'key2': [u'val21', 'val22', 1]})
'{"key2": ["val21", "val22", 1], "key1": "val1"}'
If you need to take more control of the format, quotes or other things regarding this conversion, see the reply by Dan Oblinger.
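To make the json.dumps() approach concrete, here is a minimal sketch of building the command string. build_create_vertex is a hypothetical helper name, and the sketch assumes (as reported above) that OrientDB accepts the double-quoted JSON form for the EMBEDDEDMAP property; it only builds the string and does not call pyorient.

```python
import json


def build_create_vertex(class_name, data):
    """Build a CREATE VERTEX command with `data` serialized by json.dumps."""
    # sort_keys only makes the output deterministic; it is not required.
    serialized = json.dumps(data, sort_keys=True)
    return "CREATE VERTEX %s SET data = %s" % (class_name, serialized)


command = build_create_vertex("TestVertex", {u"mykey": u"myVal"})
```

Because json.dumps() recurses into nested lists and dictionaries, this handles structures of arbitrary depth. The class name is interpolated as-is, so it should come from trusted code rather than user input.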

Store and lookup IP Packet header fields in Python

I want to create a simple table (using python) in which I can store/search IP packet header fields i.e.,
source IP, Destination IP, Source Port, Destination port, count
I want to achieve the following when I get new packet header fields:
Lookup in the table to see if a packet with these fields is already added, if true then update the count.
If the packet is not already present in the table create a new entry and so on.
Through my search so far I have two options:
Create a list of dictionaries, with each dictionary having the five fields mentioned above.
(Python list of dictionaries search)
Use SQLite.
What is the best approach for creating a packet/flow lookup table? The expected size of the table is 100-500 entries.
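You could use defaultdict(list) from collections to store your data. I assume you want to search by source IP, so you would use the source IP as the key.
from collections import defaultdict
testDictionary = defaultdict(list)
# assumed value layout: [destination IP, source port, destination port, count]
testDictionary["192.168.0.1"] = ["10.10.10.1", 22, 8080, 0]
sourceIP = "192.168.0.1"  # source IP of the incoming packet
# test membership with "in": indexing a defaultdict would create an empty entry
if sourceIP in testDictionary:
    testDictionary[sourceIP][-1] += 1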
Since you are saying that you only have a table with 100-500 entries, you could search for destination IPs also using
for sourceIP, otherHeader in testDictionary.items():
    if otherHeader[0] == destinationIP:
        testDictionary[sourceIP][-1] += 1
I do not know whether the source IP and the destination IP would both be unique in all cases, so you can decide which one to use as the key. The advantage of defaultdict(list) is that you can also append entries without overwriting the previous values.
for sourceIP, otherHeader in testDictionary.items():
    if otherHeader[0] != destinationIP:
        testDictionary[sourceIP].append(["10.10.10.2", 22, 8080, 1])
    else:
        testDictionary[sourceIP][-1] += 1
I am not sure this is exactly what you are looking for but I have tried to understand your data type according to description.
Hope that helps.
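If a flow is identified by all four header fields rather than by the source IP alone, one alternative is to use the whole tuple as the dictionary key. This is a different technique from the defaultdict(list) answer above, sketched here with collections.Counter; flow_counts and record_packet are hypothetical names.

```python
from collections import Counter

# Key: (source IP, destination IP, source port, destination port); value: count.
flow_counts = Counter()


def record_packet(src_ip, dst_ip, src_port, dst_port, table=flow_counts):
    """Count one packet for its flow, creating the entry on first sight."""
    key = (src_ip, dst_ip, src_port, dst_port)
    table[key] += 1  # a missing key starts from 0
    return table[key]


record_packet("192.168.0.1", "10.10.10.1", 22, 8080)
record_packet("192.168.0.1", "10.10.10.1", 22, 8080)
record_packet("192.168.0.1", "10.10.10.2", 22, 8080)
```

Lookup and update are a single hash operation, so no loop over the table is needed, and at 100-500 entries this stays comfortably in memory without SQLite.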

How can I pass a string to pexpect spawn?

I want to ssh to another node on my network as part of a larger Python script. I am using pexpect, which works when I do something like this:
session = spawn('ssh root@172.16.210.254')
I want to replace the address with a variable so I can cycle through the addresses in a list. However, when I try:
address = "172.16.210.253"
session = spawn('ssh root@'address)
it doesn't work, because using address this way is invalid syntax. What is the correct syntax for this?
Use + to concatenate the strings:
session = spawn('ssh root@' + address)
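For cycling through a list of addresses, string formatting is an alternative to concatenation. The sketch below only builds the command strings; ssh_command and the address list are made up for illustration, and the call to pexpect's spawn is left as a comment so the example runs without pexpect installed.

```python
def ssh_command(address, user="root"):
    """Return the command line to hand to pexpect's spawn()."""
    return "ssh {}@{}".format(user, address)


# Example address list (made up for illustration).
addresses = ["172.16.210.253", "172.16.210.254"]
commands = [ssh_command(address) for address in addresses]
# Each entry can then be passed on, e.g. session = spawn(commands[0]).
```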

Storing Data from both POST variables and GET parameters

I want my python script to simultaneously accept POST variables and query string variables from the web address.
The script has code :
form = cgi.FieldStorage()
print form
However, this only captures the post variables and no query variables from the web address. Is there a way to do this?
Thanks,
Ali
cgi.parse_qsl (in any Python 2.*; urlparse.parse_qsl in 2.6 or later) takes a query string and returns a list of (name, value) pairs. Use os.environ['QUERY_STRING'] to get the query string part of the URL your CGI script was reached at (everything after the ? in the URL, if any).
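A minimal sketch of reading the query string this way follows. query_params is a hypothetical helper name; the import fallback is there because the function lives in urllib.parse on Python 3 and in urlparse on Python 2.6+.

```python
import os

try:
    from urllib.parse import parse_qsl  # Python 3
except ImportError:
    from urlparse import parse_qsl  # Python 2.6+


def query_params(environ=os.environ):
    """Return the URL's query string as a list of (name, value) pairs."""
    # QUERY_STRING is set by the web server for CGI scripts; default to empty.
    return parse_qsl(environ.get("QUERY_STRING", ""))
```

In a CGI script, the pairs from query_params() can be combined with the POST fields read through cgi.FieldStorage().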
