Trying to implement a cache on load of this file - python

The cache has a maximum size of 20 elements; once it reaches that limit, adding a new element should evict the least recently accessed one. On shutdown it should write the cached data back to the file. The data should be stored in the cache according to a caching strategy, and the cache should offer CRUD operations. Test data set: student records.
import json
from collections import OrderedDict
import time
import os

if os.path.exists("qwerty.json"):
    record = json.load(open("qwerty.json", "r"), object_pairs_hook=OrderedDict)
else:
    record = OrderedDict({})

fo = open("foo.txt", "wb")
x = list(record.items())[:20]
x2 = sorted(x, key=lambda k: k[1]['time'], reverse=True)
print(x2)

command = ""
while command != 'exit':
    command = input('Enter a command(options: create,read,save): ')
    if command == "create":
        name = input('Enter name of the Student:')
        p = input('Student ID: ')
        a = input('Class: ')
        n = input('Marks: ')
        time = time.time()
        record[name] = {'Student ID:': p, 'Class:': a, 'Marks': n, 'time': time}
    elif command == 'read':
        z = json.load(open("qwerty.json", "r"), object_pairs_hook=OrderedDict)
        print(z)
    elif command == 'save':
        json.dump(record, open('qwerty.json', "w"))
fo.close()

You can actually maintain order with a single file, using a combination of json and collections.OrderedDict.
Your initial setup is like so:
from collections import OrderedDict
phone_book = OrderedDict({})
When creating, add elements into an ordered dict and then dump it as JSON; the order of the keys is preserved. After you declare phone_book as above, the rest of your create code stays the same. Note that when you write to the file you never close it, so you can't read the contents back later. This should be replaced with something like:
import os

if os.path.exists("qwerty.json"):
    phone_book = json.load(open("qwerty.json", "r"), object_pairs_hook=OrderedDict)
else:
    phone_book = OrderedDict({})

command = ""
while command != 'exit':
    command = input('Enter a command(options: create,read,save): ')
    if command == "create":
        ...
    elif command == 'read':
        ...
    elif command == 'save':
        # use a with-block so the file is actually closed after writing
        with open('qwerty.json', 'w') as f:
            json.dump(phone_book, f)
For reading, you'll have to make some changes:
elif command == 'read':
    z = json.load(open("qwerty.json", "r"), object_pairs_hook=OrderedDict)
    ...
This loads the dict in the order the keys were stored. You can now call list(z.items())[-20:] to get only the last 20 items. Also, when reading a particular key, you update its "last-read-time" by deleting and recreating it:
import copy
key = ...
temp = copy.copy(z[key])
del z[key]
z[key] = temp
This will update the position of key in the dict. This should be enough for you to implement the rest yourself.
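Equivalently, OrderedDict already has helpers for this kind of bookkeeping. Below is a minimal sketch of the eviction behaviour the question asks for, using move_to_end and popitem (Python 3); the 20-item limit and the record fields come from the question, while the touch/put helper names are just illustrative:
from collections import OrderedDict

CACHE_LIMIT = 20  # maximum number of cached records, per the question

def touch(cache, key):
    # Mark an existing key as most recently used, instead of deleting and re-inserting it.
    cache.move_to_end(key)

def put(cache, key, value):
    # Insert or update a record, evicting the least recently used one when over the limit.
    if key in cache:
        cache.move_to_end(key)
    cache[key] = value
    if len(cache) > CACHE_LIMIT:
        cache.popitem(last=False)  # the first item is the least recently used

cache = OrderedDict()
put(cache, 'Alice', {'Student ID': '1', 'Class': 'A', 'Marks': '90'})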

how to update a JSON file?

I have some code that stores data in a dictionary and then writes the dictionary to a JSON file:
def store_data(user_inp):
    list_of_letters = list(user_inp)
    list_of_colons = []
    nested_dict = {}
    for letter in list_of_letters:
        if letter == ':':
            list_of_colons.append(letter)
    jf = json.dumps(storage)
    with open('myStorage.json', 'w') as f:
        f.write(jf)
    if len(list_of_colons) == 2:
        str1 = ''.join(list_of_letters)
        list2 = str1.split(':')
        main_key = list2[0]
        nested_key = list2[1]
        value = list2[2]
        if main_key not in storage:
            storage[main_key] = nested_dict
            nested_dict[nested_key] = value
            print(storage, '\n', 'successfully saved!')
            jf = json.dumps(storage)
            with open('myStorage.json', 'w') as f:
                f.write(jf)
        elif main_key in storage:
            if nested_key in storage[main_key]:
                print('this item is already saved: \n', storage)
            else:
                storage[main_key][nested_key] = value
                print(storage, '\n', 'successfully saved!')
                jf = json.dumps(storage)
                with open('myStorage.json', 'w') as f:
                    f.write(jf)
The problem is that every time I rerun the program and enter new data, the data already in the JSON file is replaced by the data entered in the new run. For example, say I want to store this string: gmail:pass:1234. What my function does is this:
creates a dictionary with the user input and stores it in the JSON file:
{'gmail': {'pass': 1234}}
As long as I don't close the program, the data I enter keeps adding to the JSON object. But if I close the program, run it again, and enter new data, the data I stored before is replaced by the data I entered last.
So what I want is that every time I add a new piece of data to the dictionary, it is added to the object stored in the JSON file. So if I run the program again and enter this input, gmail:pass2:2343, this is how it should be stored:
{'gmail': {'pass': '1234', 'pass2': '2343'}}
And if I enter this, zoom:id:1234567, I want it to add this to the object inside the JSON file, like so:
{'gmail': {'pass': '1234', 'pass2': '2343'} 'zoom': {'id': '1234567'}}
I really don't know how to fix this; I've already done some research but can't find a solution for my specific case.
Hope you understand what I mean. Thank you in advance for your help.
I think this is what you are trying to do:
def update_with_item(old, new_item):
    changed = True
    top_key, nested_key, value = new_item
    if top_key in old:
        if nested_key in old[top_key]:
            changed = False
            print("This item is already saved: \n", old)
        else:
            old[top_key][nested_key] = value
    else:
        old[top_key] = {nested_key: value}
    return old, changed

def main():
    stored = json.load(open('myStorage.json'))
    old, changed = update_with_item(stored, list2)
    if changed:
        jf = json.dumps(old)
        with open('myStorage.json', 'w') as f:
            f.write(jf)
        print(old, '\n', 'successfully saved!')
I'm also not sure how you're looping over this code in main, or where the list2 variable comes from. The main function here will need to be adapted to however you loop over and create the new values.
The update_with_item function should resolve the issue you are having with updating the dictionary though.
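As a rough usage sketch (the myStorage.json file name and the colon-separated input format come from the question; the load_storage helper and its missing-file fallback are assumptions), the key change is loading the existing file at startup so new entries are merged into the old data instead of replacing it:
import json, os

def load_storage(path='myStorage.json'):
    # Start from the previously saved data instead of an empty dict.
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}

storage = load_storage()
new_item = 'gmail:pass2:2343'.split(':')   # -> ['gmail', 'pass2', '2343']
storage, changed = update_with_item(storage, new_item)
if changed:
    with open('myStorage.json', 'w') as f:
        json.dump(storage, f)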

Python coding issue for csv read

I have some simple code with the following goals:
- open a csv file as a list and print it --> worked
- open a csv file as a dictionary:
  - print it --> working
  - modify it --> is the code correct for this?
  - print it again --> not working
I'm using PyCharm to debug and can't identify the issue. Any help will be highly appreciated.
import sys
import csv

def print_csv_list(list_in):
    """
    function takes a list of lists and prints # of lines instructed by counter parameter
    :param list_in: list of lists
    :return: no return
    """
    counter = 0
    for line in list_in:
        if counter < 2:
            for item in line:
                sys.stdout.write(item.strip(",") + "\t")
                sys.stdout.flush()
            print("\n")
        counter += 1

def print_csv_file(file_dict):
    for dict_item in file_dict:
        print dict_item

def modify_dict(file_dict):
    print_csv_file(file_dict)
    for dict_item in file_dict:
        for k, v in dict_item.iteritems():
            if k == "ral_file":
                dict_item[k] = v.strip("_regs")
    print_csv_file(file_dict)

def parse_ral_file(csvfile):
    with open(csvfile, 'r') as print_file:
        file_read = csv.reader(print_file, delimiter=',')
        print_csv_file(file_read)
    with open(csvfile, 'r') as dict_file:
        file_dict = csv.DictReader(dict_file, delimiter=',')
        modify_dict(file_dict)

if __name__ == "__main__":
    x = sys.argv[1]
    parse_ral_file(x)
When you iterate through a generator (and a file object behaves like one), the read pointer is left at the end, so any subsequent iteration yields nothing. You need to use seek to move the pointer back to the start of the file.
with open(csvfile, 'r') as dict_file:
    modify_dict(dict_file)

def modify_dict(dict_file):
    file_dict = csv.DictReader(dict_file, delimiter=',')
    print_csv_file(file_dict)
    dict_file.seek(0)  # If you remove this line, the second `print_csv_file`
                       # won't print anything
    print_csv_file(file_dict)
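If you'd rather not manage the file pointer at all, another option, assuming the file fits comfortably in memory, is to read the rows into a list once and reuse that list. The sketch below also trims the _regs suffix by slicing, since str.strip removes a set of characters rather than a literal suffix:
def modify_dict(dict_file):
    # Materializing the DictReader lets us loop over the rows repeatedly
    # without worrying about the file pointer.
    rows = list(csv.DictReader(dict_file, delimiter=','))
    print_csv_file(rows)
    for row in rows:
        if "ral_file" in row and row["ral_file"].endswith("_regs"):
            row["ral_file"] = row["ral_file"][:-len("_regs")]  # drop the suffix
    print_csv_file(rows)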

Python - how to optimize iterator in file parsing

I get files that have NTFS audit permissions and I'm using Python to parse them. The raw CSV files list the path and then which groups have which access, such as this type of pattern:
E:\DIR A, CREATOR OWNER FullControl
E:\DIR A, Sales FullControl
E:\DIR A, HR Full Control
E:\DIR A\SUBDIR, Sales FullControl
E:\DIR A\SUBDIR, HR FullControl
My code parses the file to output this:
File Access for: E:\DIR A
CREATOR OWNER,FullControl
Sales,FullControl
HR,FullControl
File Access For: E:\DIR A\SUBDIR
Sales,FullControl
HR,FullControl
I'm new to generators but I'd like to use them to optimize my code. Nothing I've tried seems to work, so here is the original code (I know it's ugly). It works, but it's very slow. The only way I can do this is by parsing out the paths first, putting them in a list, making a set so that they're unique, then iterating over that list, matching each path against the paths in the second list, and listing all of the items it finds. Like I said, it's ugly, but it works.
import os, codecs, sys

reload(sys)
sys.setdefaultencoding('utf8')  # to prevent cp-932 errors on screen

file = "aud.csv"
outfile = "access-2.csv"
filelist = []
accesslist = []

with codecs.open(file, "r", 'utf-8-sig') as infile:
    for line in infile:
        newline = line.split(',')
        folder = newline[0].replace("\"", "")
        user = newline[1].replace("\"", "")
        filelist.append(folder)
        accesslist.append(folder + "," + user)
newfl = sorted(set(filelist))

def makeFile():
    print "Starting, please wait"
    for i in range(1, len(newfl)):
        searchItem = str(newfl[i])
        with codecs.open(outfile, "a", 'utf-8-sig') as output:
            outtext = ("\r\nFile access for: " + searchItem + "\r\n")
            output.write(outtext)
            for item in accesslist:
                searchBreak = item.split(",")
                searchTarg = searchBreak[0]
                if searchItem == searchTarg:
                    searchBreaknew = searchBreak[1].replace("FSA-INC01S\\", "")
                    searchBreaknew = str(searchBreaknew)
                    # print(searchBreaknew)
                    searchBreaknew = searchBreaknew.replace(" ", ",")
                    searchBreaknew = searchBreaknew.replace("CREATOR,OWNER", "CREATOR OWNER")
                    output.write(searchBreaknew)
How should I optimize this?
EDIT:
Here is an edited version. It works MUCH faster, though I'm sure it can still be fixed:
import os, codecs, sys, csv

reload(sys)
sys.setdefaultencoding('utf8')

file = "aud.csv"
outfile = "access-3.csv"
filelist = []
accesslist = []

with codecs.open(file, "r", 'utf-8-sig') as csvinfile:
    auditfile = csv.reader(csvinfile, delimiter=",")
    for line in auditfile:
        folder = line[0]
        user = line[1].replace("FSA-INC01S\\", "")
        filelist.append(folder)
        accesslist.append(folder + "," + user)
newfl = sorted(set(filelist))

def makeFile():
    print "Starting, please wait"
    for i in xrange(1, len(newfl)):
        searchItem = str(newfl[i])
        outtext = ("\r\nFile access for: " + searchItem + "\r\n")
        accessUserlist = ""
        for item in accesslist:
            searchBreak = item.split(",")
            if searchItem == searchBreak[0]:
                searchBreaknew = str(searchBreak[1]).replace(" ", ",")
                searchBreaknew = searchBreaknew.replace("R,O", "R O")
                accessUserlist += searchBreaknew + "\r\n"
        with codecs.open(outfile, "a", 'utf-8-sig') as output:
            output.write(outtext)
            output.write(accessUserlist)
I was misled by the .csv extension of your output file. The expected output you show isn't valid CSV, since a CSV record can't contain embedded newlines.
Here is a proposal using a generator that yields the output record by record:
class Audit(object):
    def __init__(self, fieldnames):
        self.fieldnames = fieldnames
        self.__access = {}

    def append(self, row):
        folder = row[self.fieldnames[0]]
        access = row[self.fieldnames[1]].strip(' ')
        access = access.replace("FSA-INC01S\\", "")
        access = access.split(' ')
        if len(access) == 3:
            if access[0] == 'CREATOR':
                access[0] += ' ' + access[1]
                del access[1]
            elif access[1] == 'Full':
                access[1] += ' ' + access[2]
                del access[2]
        if folder not in self.__access:
            self.__access[folder] = []
        self.__access[folder].append(access)

    # Generator for class Audit
    def __iter__(self):
        record = ''
        for folder in sorted(self.__access):
            record = folder + '\n'
            for access in self.__access[folder]:
                record += '%s\n' % (','.join(access))
            yield record + '\n'
How to use it:
def main():
    import io, csv
    audit = Audit(['Folder', 'Accesslist'])
    with io.open(file, "r", encoding='utf-8') as csc_in:
        for row in csv.DictReader(csc_in, delimiter=","):
            audit.append(row)
    with io.open(outfile, 'w', newline='', encoding='utf-8') as txt_out:
        for record in audit:
            txt_out.write(record)
Tested with Python:3.4.2 - csv:1.0
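For comparison, here is a minimal Python 3 sketch of the same grouping idea using a plain defaultdict. It writes the raw group/permission string rather than reformatting it, skips the 'CREATOR OWNER' / 'Full Control' special-casing that the Audit class handles, and the file names are placeholders:
import csv
import io
from collections import defaultdict

def group_access(in_path='aud.csv', out_path='access.txt'):
    # Map each folder to the list of access entries found for it.
    by_folder = defaultdict(list)
    with io.open(in_path, 'r', encoding='utf-8-sig') as fp:
        for row in csv.reader(fp, delimiter=','):
            folder, user = row[0], row[1].strip()
            by_folder[folder].append(user.replace('FSA-INC01S\\', ''))
    # Write one block per folder, sorted, matching the layout in the question.
    with io.open(out_path, 'w', encoding='utf-8') as out:
        for folder in sorted(by_folder):
            out.write('File access for: %s\n' % folder)
            for user in by_folder[folder]:
                out.write('%s\n' % user)
            out.write('\n')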

Issue when building a nested dictionary in Python 3

I want to build a nested dictionary based on a text file, for example (text.txt):
...
hostA hostA.testing.com 192.168.1.101
hostB hostB.testing.com 192.168.1.102
...
Ideally, I want to get the following nested dictionary
...
{'hostA': {'FQHN': 'hostA.testing.com', 'IP': '192.168.1.101'}, 'hostB': {'FQHN': 'hostB.testing.com', 'IP': '192.168.1.102'}}
...
So I made the following Python code:
myinnerdict = {}
myouterdict = {}

def main():
    my_fh = open('text.txt', 'r')
    for line in my_fh:
        newline = line.strip().split()  # get rid of the '\n' and make it an inner list
        #print(newline)
        myinnerdict['FQHN'] = newline[1]
        myinnerdict['IP'] = newline[2]
        #print(myinnerdict)
        #print(newline[0])
        myouterdict[newline[0]] = myinnerdict
    print(myouterdict)

if __name__ == "__main__":
    main()
...
However, when I ran it I got the following result, which is beyond my understanding:
...
{'hostA': {'FQHN': 'hostB.testing.com', 'IP': '192.168.1.102'}, 'hostB': {'FQHN': 'hostB.testing.com', 'IP': '192.168.1.102'}}
...
which is not what I wanted. I don't know what I missed; please kindly help.
This is happening because you are reusing the same dict object for the inner dict. You need to create a new dict object within your loop:
myouterdict = {}

def main():
    my_fh = open('text.txt', 'r')
    for line in my_fh:
        myinnerdict = {}
        newline = line.strip().split()  # get rid of the '\n' and make it an inner list
        #print(newline)
        myinnerdict['FQHN'] = newline[1]
        myinnerdict['IP'] = newline[2]
        #print(myinnerdict)
        #print(newline[0])
        myouterdict[newline[0]] = myinnerdict
    print(myouterdict)

if __name__ == "__main__":
    main()
The problem is that you are reusing the same variable for the dictionary. myouterdict stores a reference to myinnerdict rather than a copy of its data, so all of the outer keys end up pointing at the same inner dict. For example, try this:
>>> a = {}
>>> b = {"my a variable": a}
>>> b
{'my a variable': {}}
>>> a["asdf"] = 3
>>> b
{'my a variable': {'asdf': 3}}
As you can see, b stores a reference to a, not a snapshot of a's (empty) contents. What you need to do is .copy() it over (note that .copy() makes a shallow copy: a new dict with the same top-level entries, while any nested objects are still shared; see the copy module docs for shallow vs. deep copies):
myinnerdict = {}
myouterdict = {}

def main():
    my_fh = open('text.txt', 'r')
    for line in my_fh:
        newline = line.strip().split()
        myinnerdict['FQHN'] = newline[1]
        myinnerdict['IP'] = newline[2]
        # Note this copy here
        myouterdict[newline[0]] = myinnerdict.copy()
    print(myouterdict)
    # Remember to close the file!
    my_fh.close()

if __name__ == "__main__":
    main()
Alternatively, you could just assign a newly created dict object directly instead of going through an intermediate variable:
mydict = {}

def main():
    my_fh = open('text.txt', 'r')
    for line in my_fh:
        newline = line.strip().split()
        mydict[newline[0]] = {"FQHN": newline[1], "IP": newline[2]}
    print(mydict)
    my_fh.close()

if __name__ == "__main__":
    main()
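A compact variant of the same idea, sketched here with a with-block and a dict comprehension (the text.txt file name comes from the question; skipping blank lines is an added assumption):
def main():
    # Build the whole nested dict in one pass; each line becomes
    # name -> {'FQHN': ..., 'IP': ...}.
    with open('text.txt', 'r') as my_fh:
        mydict = {
            name: {'FQHN': fqhn, 'IP': ip}
            for name, fqhn, ip in (line.split() for line in my_fh if line.strip())
        }
    print(mydict)

if __name__ == "__main__":
    main()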

Append Function Nested Inside IF Statement Body Not Working

I am fairly new to Python (just started learning in the last two weeks) and am trying to write a script to parse a csv file to extract some of the fields into a List:
from string import Template
import csv
import string

site1 = 'D1'
site2 = 'D2'
site3 = 'D5'
site4 = 'K0'
site5 = 'K1'
site6 = 'K2'
site7 = '0'
site8 = '0'
site9 = '0'
lbl = 1
portField = 'y'
sw = 5
swpt = 6
cd = 0
pt = 0
natList = []

with open(name=r'C:\Users\dtruman\Documents\PROJECTS\SCRIPTING - NATAERO DEPLOYER\NATAERO DEPLOYER V1\nataero_deploy.csv') as rcvr:
    for line in rcvr:
        fields = line.split(',')
        Site = fields[0]
        siteList = [site1, site2, site3, site4, site5, site6, site7, site8, site9]
        while Site in siteList == True:
            Label = fields[lbl]
            Switch = fields[sw]
            if portField == 'y':
                Switchport = fields[swpt]
                natList.append([Switch, Switchport, Label])
            else:
                Card = fields[cd]
                Port = fields[pt]
                natList.append([Switch, Card, Port, Label])
print natList
Even if I strip the ELSE statement away and break into my code right after the IF clause, I can verify that "Switchport" (the first statement in the IF clause) is successfully being populated with a str from my csv file, as are "Switch" and "Label". However, "natList" is not being appended with the fields parsed from each line of my csv for some reason. Python returns no errors; it just does not append to "natList" at all.
This is actually going to be a function (once I get the code itself to work), but for now, I am simply setting the function parameters as global variables for the sake of being able to run it in an iPython console without having to call the function.
The "lbl", "sw", "swpt", "cd", and "pt" refer to column#'s in my csv (the finished function will allow user to enter values for these variables).
I assume I am running into some issue with "natList" scope, but I have tried moving the "natList = []" statement to various places in my code to no avail.
I can run the above in a console, and then run natList.append([Switch,Switchport,Label]) separately, and it works for some reason...?
Thanks for any assistance!
The while condition needs an additional set of parentheses: write it as while (Site in siteList) == True:, or, much cleaner as suggested by Padraic, while Site in siteList:. Because of Python's comparison chaining, Site in siteList == True is evaluated as (Site in siteList) and (siteList == True), and the second comparison is never true, so the loop body never runs.
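A quick interactive check of that chaining behaviour, using values borrowed from the question's site list:
>>> Site = 'D1'
>>> siteList = ['D1', 'D2', 'D5']
>>> Site in siteList == True      # chained: (Site in siteList) and (siteList == True)
False
>>> (Site in siteList) == True
True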
Change
while Site in siteList == True:
to
if Site in siteList:
An if is what you want here: Site never changes inside the loop body, so a while whose condition did evaluate to True would never terminate.
You might want to look into the csv module, which attempts to make reading and writing csv files simpler, e.g.:
import csv

with open('<file>') as fp:
    ...
    reader = csv.reader(fp)   # iterate over reader (not fp) so rows come back as parsed lists
    if portfield == 'y':
        natlist = [[row[i] for i in [sw, swpt, lbl]]
                   for row in reader if row[0] in sitelist]
    else:
        natlist = [[row[i] for i in [sw, cd, pt, lbl]]
                   for row in reader if row[0] in sitelist]
    print natlist
Or alternatively using a csv.DictReader which takes the first row as the fieldnames and then returns dictionaries:
import csv

with open('<file>') as fp:
    ...
    reader = csv.DictReader(fp)
    if portfield == 'y':
        fields = ['Switch', 'card/port', 'Label']
    else:
        fields = ['Switch', '??', '??', 'Label']
    natlist = [[row[f] for f in fields]
               for row in reader if row['Building/Site'] in sitelist]
    print natlist
