Building a set of records incrementally as app progresses

I have a sysadmin-type CLI app that reads in info from a config file. I cannot change the format of the config file below.
TYPE_A = "value1,value2,value2"
TYPE_A = "value3,value4,value5"
TYPE_B = "valuex,valuey,valuez"
Based on the TYPE, I'll need to do some initial processing with each one. After I'm done with that step for all, I need to do some additional processing and depending on the options chosen either print the state and intended action(s) or execute those action(s).
I'd like to do the initial parsing of the config into a dict of lists of dicts and update every instance of TYPE_A, TYPE_B, TYPE_C, etc. with all the pertinent info about it. Then either print the full state or execute the actions (or fail if the state of something is incorrect).
My thought is it would look something like:
dict
    TYPE_A_list
        dict_A[0] key:value,key:value,key:value
        dict_A[1] key:value,key:value,key:value
    TYPE_B_list
        dict_B[0] key:value,key:value,key:value
        dict_B[1] key:value,key:value,key:value
I think I'd want to read the config into that and then add keys and values or update values as the app progresses and reprocesses each TYPE.
Finally my questions.
I'm not sure how to iterate over each list of dicts, or how to add list elements and add or update key:value pairs.
Is what I describe above the best way to go about this?
I'm fairly new to Python, so I'm open to any advice. FWIW, this will be python 2.6.
A little clarification on the config file lines
CAR_TYPE = "Ford,Mustang,Blue,2005"
CAR_TYPE = "Honda,Accord,Green,2009"
BIKE_TYPE = "Honda,VTX,Black,2006"
BIKE_TYPE = "Harley,Sportster,Red,2010"
TIRE_TYPE = "170R15,whitewall"
Each type will have the same order and number of values.

No need to "remember" there are two different TYPE_A assignments - you can combine them.
TYPE_A = "value1,value2,value2"
TYPE_A = "value3,value4,value5"
would be parsed as only one of them or as both, depending on the implementation of your sysadmin CLI app.
Then the data model should be:
dict
    TYPE_A: list(value1, value2, value3)
    TYPE_B: list(valuex, valuey, valuez)
That way, you can iterate through the dictionary's items() pretty easily:
for _type, values in config.items():  # config is the dict described above
    for value in values:
        print("%s: %s" % (_type, value))
        # or whatever you wish to do
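For the clarified format in the question (CAR_TYPE = "Ford,Mustang,Blue,2005"), the dict of lists of dicts could be built roughly as below. This is only a sketch: the field names in FIELDS and the helper parse_config are assumptions made up for illustration, since the question only says that each type has a fixed order and number of values. It sticks to constructs that also exist in Python 2.6.

```python
import re

# Hypothetical field names per type (not given in the question).
FIELDS = {
    "CAR_TYPE": ["make", "model", "color", "year"],
    "BIKE_TYPE": ["make", "model", "color", "year"],
    "TIRE_TYPE": ["size", "style"],
}

# Matches lines of the form: NAME = "v1,v2,v3"
LINE_RE = re.compile(r'^\s*(\w+)\s*=\s*"([^"]*)"\s*$')


def parse_config(lines):
    """Return {type_name: [record_dict, ...]} for every matching line."""
    records = {}
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything that is not NAME = "..."
        type_name, raw = match.groups()
        values = raw.split(",")
        names = FIELDS.get(type_name)
        if names is None:
            # Unknown type: fall back to positional keys.
            names = ["field%d" % i for i in range(len(values))]
        # setdefault creates the list the first time a type is seen.
        records.setdefault(type_name, []).append(dict(zip(names, values)))
    return records


sample = [
    'CAR_TYPE = "Ford,Mustang,Blue,2005"',
    'CAR_TYPE = "Honda,Accord,Green,2009"',
    'TIRE_TYPE = "170R15,whitewall"',
]
records = parse_config(sample)

# A later pass can add or update keys on each record in place.
for type_name, items in records.items():
    for item in items:
        item["state"] = "pending"
```

Because each record is an ordinary dict stored in a list, later processing steps can keep adding keys (such as the made-up "state" key above) without rebuilding the structure.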

Related

Make directory system from user id in python list

Please help!
Well, first of all, I will explain what this should do. I'm trying to store users and servers from discord in a list (users that use the bot and servers in which the bot is in) with the id.
for example:
class User(object):
    name = ""
    uid = 0
Now, all Discord IDs are very long and I want to store lots of users and servers in my lists (one list for each), but suppose that I get 10,000 users in my list and I want to get the last one (without knowing it's the last one): this would take a lot of time. Instead, I thought that I could make a directory system for storing users in the list and finding them quickly. This is how it works:
I can get the id easily so imagine my id is 12345.
Now I convert it into a string using Python's str(id) function and store it in a variable, strId.
For each digit of the string, I use it as an index into the users list, like this:
The User() is where the user is stored
users_list = [[[], [[], [], [[], [], [], [User()]]]]]
actual_dir = users_list
for digit in strId:
    actual_dir = actual_dir[int(digit)]  # descend one level per digit
user = actual_dir[0]
And that's how I reach the user (or something like that)
Now, here is where my problem is. I know I can get the user easily by id, but when I want to save the changes, I would have to do something like users_list[1][2][3][4][5] = changed_user_variable, and as far as I know I cannot do something like list[1] += [2].
Is there any way to reach the user and save the changes?
Thanks in advance

You can use a python dictionary with the user id as the key and the user object as the value. I ran a test on my own computer and found that finding 100 000 random users in a dictionary with 10 million users only took 0.3s. This method is much simpler and I would guess it's just as fast, if not faster.
You can create a dictionary and add users with:
users = {}
users[userID] = some_user
(many other ways of doing this)
By using a dictionary you can easily change a user's field with:
users[userID].some_field = "Some value"
or overwrite the whole user the same way you added it in the first place.
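As a slightly fuller sketch of the same idea, the snippet below stores a few users by id, looks one up, and changes it. The ids, names, and the constructor added to User are made up for illustration.

```python
class User(object):
    def __init__(self, uid, name=""):
        self.uid = uid
        self.name = name


users = {}

# Add users keyed by their id (hypothetical ids and names).
for uid, name in [(12345, "alice"), (67890, "bob")]:
    users[uid] = User(uid, name)

# Look a user up and modify it. The dictionary holds a reference to the
# same object, so there is nothing to "save back" afterwards.
user = users[12345]
user.name = "alice_renamed"

# .get() returns None instead of raising KeyError for unknown ids.
missing = users.get(99999)
```

The lookup cost of a dictionary does not grow with the number of digits in the id or, on average, with the number of stored users, which is why the nested-list directory is not needed.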

How to make N Lists to fill with data?

I have a huge (2 GB) mixed log file which I want to split/group by the CMS that made the log entry.
Right now I run over the whole file, filter for the different CMS tags, and export all logs grouped by CMS tag.
As I know how many CMSs I have, I can easily create the correct number of lists to fill, e.g.:
all_wordpress_logs = []
all_cms2_logs = []
...
and fill that with all_wordpress_logs.append(x)
So far so good.
Now I want to group/filter by the class that threw whatever got logged.
But as I don't know how many lists I need, I can't prepare them like above.
So my question is: how can I create lists "on demand" with the correct names to fill with data?
E.g:
wordpress_class1 = []
wordpress_class1.append(x)
wordpress_class2 = []
wordpress_class2.append(x)
...
wordpress_classN = []
wordpress_classN.append(x)
Any help would be appreciated.

You can use a dictionary, where the keys are the classes you want to group by, and when there is a new list, you just create and fill it then add it to the suitable key.

A dictionary is a good storage tool here, where you don't know what classes you might have. It can expand easily and you can have lists inside it mapped to keys. Below is just an example of this. You can then access your lists by the class name.
data="""class1 some log info
class1 more log info
class2 other log info
class3 different log into
class1 last log info
"""
dynamic_lists = {}
for line in data.splitlines():
    line_data = line.split()
    class_name = line_data[0]
    if class_name in dynamic_lists:
        dynamic_lists[class_name].append(" ".join(line_data[1:]))
    else:
        dynamic_lists[class_name] = [" ".join(line_data[1:])]
print(f'lines logged for class1 are: {dynamic_lists["class1"]}')
OUTPUT
lines logged for class1 are: ['some log info', 'more log info', 'last log info']
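The if/else that creates a list on first use can also be left to collections.defaultdict from the standard library. The following is a minimal variant of the example above; the name logs_by_class and the shortened sample data are chosen here for illustration.

```python
from collections import defaultdict

data = """class1 some log info
class2 other log info
class1 last log info
"""

# defaultdict(list) creates an empty list the first time a key is accessed.
logs_by_class = defaultdict(list)
for line in data.splitlines():
    # Split off the first word only; the rest of the line is the message.
    class_name, _, message = line.partition(" ")
    logs_by_class[class_name].append(message)
```

For a 2 GB file the same loop can iterate over the open file object line by line instead of a string, so the whole file never has to be read into memory at once.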

How to select XML child tags and fill a list of dicts with custom keys in a pythonic way

I'm trying to do some parkour here. Got an exported xml file out of an AccessDB table. I'd like to select just specific tags from each child and save them in a dict, create a list of those dicts and then fill a SQLite custom model database with that list. "Extract and convert"
I managed to parse the XML, get the children, and find the tags and their text. The thing is that it's getting ugly, since some children don't have the tag I'm looking for and so the key is missing, but I'd like to default that to "na". My code is a bit muddy, with a lot of if statements under the for loop. I managed to save records in the database with peewee too.
Basically I want to extract data from an AccessDB table, get some field data and save them in a sqlite db with a different field name. I'm running on Linux and I can't get to the AccessDB machine to work with, thus the exported file. If this gets too troublesome I'll try to get the script running there and connect both databases and parse data
xml_parsing_code()
for child in root:
    for tags in child:
        if tags.tag == 'PM':
            d['maker'] = tags.text
        if ...
    list.append(d)
db.create_code()
I'm not a beginner but still loads to learn and I'm sure I'm missing something, a more "pythonic" simple to write way or there is a simpler easier approach I'm too obtuse to see. I mean, my code works..."sort of", but it's really ugly and patchy and checking for issues in a 6k item list is a bit of a pain.
Thanks a lot!
UPDATE 2: (I had made a mistake in which I was overwriting both the missing values and the already filled ones)
import xml.etree.ElementTree as ET
tags_dict = {
    "PartNo": "maker_ref",
    "PM": "maker",
    # etc..
}
tree = ET.parse("exported_table.xml")
root = tree.getroot()
dict_list = list()
#d = dict()
for node in root:
    d = dict() #instead of d.clear()
    for child in node:
        for k, v in tags_dict.items():
            if k in child.tag:
                d[v] = child.text
            if v not in d:
                d[v] = "na"
    dict_list.append(d)
This is the final working code for this specific part, and it seems to do the trick. I added the "na" for the missing children to fit my new database model structure.
What I can't figure out is why, if I declare a global dict() and .clear() it on each pass of the node loop instead of what I posted, my list ends up filled with the last node's data repeated for the whole node count. Can anyone shed some light?

Declare a dict that maps each tag to the desired key in your data:
desired_tags = {"PM": "maker", etc...}
for child in root:
    for tags in child:
        for k, v in desired_tags.items():
            if k in tags.tag:
                d[v] = tags.text
Not tested, since no structured sample data was posted.
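One possible way to get the "na" default without the extra if statements is Element.findtext(), which takes a default for missing tags. The sketch below runs on made-up sample XML (the real export was not posted) and assumes the tag names match exactly, unlike the substring test `k in child.tag` used above.

```python
import xml.etree.ElementTree as ET

# Made-up sample data standing in for the exported table.
xml_text = """
<dataroot>
  <Part><PartNo>A-1</PartNo><PM>Acme</PM></Part>
  <Part><PartNo>B-2</PartNo></Part>
</dataroot>
"""

tags_dict = {"PartNo": "maker_ref", "PM": "maker"}

root = ET.fromstring(xml_text)
dict_list = []
for node in root:
    # A new dict per node; findtext() returns the default when the tag is absent.
    d = {}
    for tag, key in tags_dict.items():
        d[key] = node.findtext(tag, default="na")
    dict_list.append(d)
```

Creating a new dict for each node also explains the repeated-last-node effect from the question: list.append() stores a reference, so appending one shared dict and calling .clear() on it leaves every list entry pointing at the same object, which ends up holding the last node's data.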

Ignore non existing attributes in config file parser python

I have a function returning 1 list at a time, like below
['7:49', 'Section1', '181', '1578', '634', '4055']
['7:49', 'Section2', '181', '1578', '634', '4055']
These values are time,section,count,avg,min,max (I know this will always be the sequence)
My aim is to alert if any of the values breaches limits defined in a config file.
So I create a config like below
[Section1]
Count:10
Min:20
Max:100
Avg:50
[Section2]
Count:10
Min:20
Max:100
Avg:50
My function to check max and min limits
def checklimit(line):
    print "Inside CheckLimit", line[1],line[4],line[5]
    if line[4] < ConfigSectionMap(line[1])['min'] or line[5] > ConfigSectionMap(line[1])['max']:
        sendAlert(line)
This works fine but this could be improved and has some corner cases.
Suppose someone leaves config as below
[Section1]
Count:10
Min:
Max:
Avg:50
[Section2]
Count:10
Avg:50
This means the user only wants to check Count and Avg. How should these cases be handled in my code so that only the parameters actually given in the config file are checked? I have used Config Parser from here
Suggestions for question title improvement are welcome. It was hard to put one. Thanks

There are many ways to approach this. For key lookups in dictionaries, you can use the dict.get() method and provide a fallback value.
So instead of
ConfigSectionMap(line[1])['min']
You can use something like this, which will return 0 if the key does not exist.
ConfigSectionMap(line[1]).get('min', 0)
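A fallback of 0 still triggers the comparison, so another option is to skip a check entirely when its limit is missing or empty. The sketch below assumes the section is available as a plain dict of strings (roughly what ConfigSectionMap appears to return); the function name breaches_limits is made up. It also converts the values to int, because the list items are strings and comparing them directly would compare text rather than numbers.

```python
def breaches_limits(line, limits):
    """Return True if the min or max value in `line` breaches a configured limit.

    `limits` is assumed to be a dict such as {'count': '10', 'min': '', 'avg': '50'}.
    A missing or empty 'min'/'max' entry means that check is skipped.
    """
    low = (limits.get("min") or "").strip()
    high = (limits.get("max") or "").strip()
    observed_min, observed_max = int(line[4]), int(line[5])
    if low and observed_min < int(low):
        return True
    if high and observed_max > int(high):
        return True
    return False


line = ['7:49', 'Section1', '181', '1578', '634', '4055']
full = {"count": "10", "min": "20", "max": "100", "avg": "50"}
blank = {"count": "10", "min": "", "max": "", "avg": "50"}
absent = {"count": "10", "avg": "50"}
```

With the sample line, breaches_limits(line, full) is true because 4055 exceeds the max of 100, while the blank and absent configurations skip both checks.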

Structuring Firebase Database

I'm following this tutorial to structure Firebase data. Near the end, it says the following:
With this kind of structure, you should keep in mind to update the data at 2 locations under the user and group too. Also, I would like to notify you that everywhere on the Internet, the object keys are written like "user1","group1","group2" etc. where as in practical scenarios it is better to use firebase generated keys which look like '-JglJnGDXcqLq6m844pZ'. We should use these as it will facilitate ordering and sorting.
So based on that, I'm assuming that the final result should be the following:
I'm using this python wrapper to post the data.
How can I achieve this?

When you write data to a Firebase array (for example in Javascript) using a line like this
var newPostKey = firebase.database().ref().child('users').push().key;
var updates = {item1: value1, item2: value2};
return firebase.database().ref().update(updates);
As described here, you will get a generated key for the "pushed" data. In the example above, newPostKey will contain this generated key.
UPDATE
To answer the updated question with the Python wrapper:
Look for the section "Saving Data" in the page you linked to.
The code would look something like this:
data = {"Title": "The Animal Book"}
book = db.child("AllBooks").push(data)
data = {"Title": "Animals"}
category = db.child("Categories").push(data)
data = {category['name']: True}
db.child("AllBooks").child(book['name']).child("categories").push(data)
