I would like to change values in a dict in another file. File1.py contains the code to edit the dict; File2.py contains the dict itself. File1.py generates a code that should replace the BTOK value only.
File1.py:
with open('file2.py', 'r') as file:
    filedata = file.read()
print(filedata.str(BTK['btk1']))
for line in filedata:
    line['btk1'] = BTok
with open('file2.py', 'w') as file:
    file.write(line)
File2.py:
c = {
    'id' : 'C80e3ce43c3ea3e8d1511ec',
    'secret' : 'c10c371b4641010a750073925b0857'
}
rk = {
    't1' : 'ZTkwMGE1MGEt',
}
BTK = {
    'BTok' : '11eyJhbGc'
}
If you want to do this reliably, that is, so it works whether your strings are quoted with ', " or """, for whatever values they hold and whatever newlines you put around them, then you may want to use ast to parse the source code and modify it. The only inconvenience is that the ast module cannot, by itself, generate source code, so you would need to install an additional dependency such as astor for what is essentially a rather menial task. In any case, here is how you could do it that way:
import ast
import astor
# To read from file:
# with open('file2.py', 'r') as f: code = f.read()
code = """
c = {
    'id' : 'C80e3ce43c3ea3e8d1511ec',
    'secret' : 'c10c371b4641010a750073925b0857'
}
rk = {
    't1' : 'ZTkwMGE1MGEt',
}
BTK = {
    'BTok' : '11eyJhbGc'
}
"""
# Value to replace
KEY = 'BTok'
NEW_VALUE = 'new_btok'
# Parse code
m = ast.parse(code)
# Go through module statements
for stmt in m.body:
    # Only look at assignments
    if not isinstance(stmt, ast.Assign): continue
    # Take right-hand side of the assignment
    value = stmt.value
    # Only look at dict values
    if not isinstance(value, ast.Dict): continue
    # Look for keys that match what we are looking for
    replace_idx = [i for i, k in enumerate(value.keys)
                   if isinstance(k, ast.Str) and k.s == KEY]
    # Replace corresponding values
    for i in replace_idx:
        value.values[i] = ast.Str(NEW_VALUE)
new_code = astor.to_source(m)
# To write to file:
# with open('file2.py', 'w') as f: f.write(new_code)
print(new_code)
# c = {'id': 'C80e3ce43c3ea3e8d1511ec', 'secret':
# 'c10c371b4641010a750073925b0857'}
# rk = {'t1': 'ZTkwMGE1MGEt'}
# BTK = {'BTok': 'new_btok'}
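As a side note, if you are on Python 3.9 or newer, the standard library can also generate the source code itself via ast.unparse, so the astor dependency becomes unnecessary. Here is a minimal sketch of the same replacement under that assumption (it uses ast.Constant, which supersedes the deprecated ast.Str):

import ast

m = ast.parse(code)
for stmt in m.body:
    # Only rewrite string keys inside dict literals assigned at module level
    if isinstance(stmt, ast.Assign) and isinstance(stmt.value, ast.Dict):
        for i, k in enumerate(stmt.value.keys):
            if isinstance(k, ast.Constant) and k.value == KEY:
                stmt.value.values[i] = ast.Constant(NEW_VALUE)

new_code = ast.unparse(m)  # stdlib only, Python 3.9+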
So I am looking to create a script to make a mod for a game using Python. The script will need to copy all files from a directory to another directory, then alter those files to add a new attribute after a specific line. The issue I am having is that this game uses custom coding based on JSON formatting in a txt file. I know how to do most of this; however, adding the new data is not something I can get to work.
My end goal is to be able to do this to any file, so other mod authors can use it to add the data to their mods without needing to do it manually. I also want to make this script do more advanced things, but that is another goal that can wait until I get this bit working.
Sample data:
The line I need to add is position_priority = ###. The ### will be different based on what the building does (building categories).
Sample code I need to alter:
building_name_number = {
    base_build_time = 60
    base_cap_amount = 1
    category = pop_assembly
    <more code>
}
I need to put the new data just after building_name_number; however, this exact name will be unique. The only thing that will always be the same is that it starts with building. So regex is what I have been trying to use, but I have never dealt with regex before, so I can't get it to work.
My Current code:
if testingenabled:
    workingdir = R"E:/Illusives-Mods/Stellaris/Building Sorting"
    pattern = "^building_"
    Usortingindex = sortingindex["sorting_pop_assembly"]
    print(f"Testing Perameters: Index: {Usortingindex}, Version: {__VERSION__}, Working DIR: {workingdir}")
    # os.chdir(stellaris_buildings_path)
    os.chdir(workingdir)

for file in os.listdir(workingdir):
    if fnmatch.fnmatch(file, "*.txt"):
        print("File found")
        with open(file, "r+", encoding="utf-8") as openfiledata:
            alllines = openfiledata.read()
            for line in alllines:
                if line == re.match(r'(^building_)', line, re.M):
                    print("found match")
                    # print(f"{sorting_attrib}{Usortingindex}")
                    # print("position_priority = 200")
                    openfiledata.write("\n" + sorting_attrib + Usortingindex + "\n")
                    break
I am not getting any errors with this code, but it doesn't work.
I am using Python 3.9.6.
EDIT:
This is the code before the script runs:
allow = {
    hidden_trigger = {
        OR = {
            owner = { is_ai = no }
            NAND = {
                free_district_slots = 0
                free_building_slots <= 1
                free_housing <= 0
                free_jobs <= 0
            }
        }
    }
}
This is after the script runs:
allow = {
    hidden_trigger = {
        OR = {
            owner = {
                is_ai = false
            }
            NAND = {
                free_district_slots = 0
                free_building_slots = {
                    value = 1
                    operand = <=
                }
                free_housing = {
                    value = 0
                    operand = <=
                }
                free_jobs = {
                    value = 0
                    operand = <=
                }
            }
        }
    }
}
The output must be the same as the input, at least in terms of the operators
If you keep it as JSON then you can read it all into Python (to get it as a dictionary), search for and add items in the dictionary, and write the new dictionary back to JSON.
text = '''{
    "building_name_number": {
        "base_build_time": 60,
        "base_cap_amount": 1,
        "category": "pop_assembly"
    },
    "building_other": {}
}'''

import json

data = json.loads(text)

for key in data.keys():
    if key.startswith('building_'):
        data[key]["position_priority"] = 'some_value'

print(json.dumps(data, indent=4))
Result:
{
    "building_name_number": {
        "base_build_time": 60,
        "base_cap_amount": 1,
        "category": "pop_assembly",
        "position_priority": "some_value"
    },
    "building_other": {
        "position_priority": "some_value"
    }
}
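To persist the change, the modified dictionary can then be written back out; a minimal sketch, assuming a hypothetical output file name buildings.json:

import json

# Write the updated dictionary back to disk as JSON
with open('buildings.json', 'w') as f:
    json.dump(data, f, indent=4)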
I found the module paradox-reader, which can convert this file format to JSON.
Using code from the file paradoxReader.py, I created an example which converts the string to a Python dictionary, adds some value, and converts it back to something similar to the original file. But this may need more code added in encode().
import json
import re

def decode(data):  #, no_json):
    data = re.sub(r'#.*', '', data) # Remove comments
    data = re.sub(r'(?<=^[^\"\n])*(?<=[0-9\.\-a-zA-Z])+(\s)(?=[0-9\.\-a-zA-Z])+(?=[^\"\n]*$)', '\n', data, flags=re.MULTILINE) # Seperate one line lists
    data = re.sub(r'[\t ]', '', data) # Remove tabs and spaces
    definitions = re.findall(r'(#\w+)=(.+)', data) # replace #variables with value
    if definitions:
        for definition in definitions:
            data = re.sub(r'^#.+', '', data, flags=re.MULTILINE)
            data = re.sub(definition[0], definition[1], data)
    data = re.sub(r'\n{2,}', '\n', data) # Remove excessive new lines
    data = re.sub(r'\n', '', data, count=1) # Remove the first new line
    data = re.sub(r'{(?=\w)', '{\n', data) # reformat one-liners
    data = re.sub(r'(?<=\w)}', '\n}', data) # reformat one-liners
    data = re.sub(r'^[\w-]+(?=[\=\n><])', r'"\g<0>"', data, flags=re.MULTILINE) # Add quotes around keys
    data = re.sub(r'([^><])=', r'\1:', data) # Replace = with : but not >= or <=
    data = re.sub(r'(?<=:)(?!-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?)(?!\".*\")[^{\n]+', r'"\g<0>"', data) # Add quotes around string values
    data = re.sub(r':"yes"', ':true', data) # Replace yes with true
    data = re.sub(r':"no"', ':false', data) # Replace no with false
    data = re.sub(r'([<>]=?)(.+)', r':{"value":\g<2>,"operand":"\g<1>"}', data) # Handle < > >= <=
    data = re.sub(r'(?<![:{])\n(?!}|$)', ',', data) # Add commas
    data = re.sub(r'\s', '', data) # remove all white space
    data = re.sub(r'{(("[a-zA-Z_]+")+)}', r'[\g<1>]', data) # make lists
    data = re.sub(r'""', r'","', data) # Add commas to lists
    data = re.sub(r'{("\w+"(,"\w+")*)}', r'[\g<1>]', data)
    data = re.sub(r'((\"hsv\")({\d\.\d{1,3}(,\d\.\d{1,3}){2}})),', r'{\g<2>:\g<3>},', data) # fix hsv objects
    data = re.sub(r':{([^}{:]*)}', r':[\1]', data) # if there's no : between list elements need to replace {} with []
    data = re.sub(r'\[(\w+)\]', r'"\g<1>"', data)
    data = re.sub(r'\",:{', '":{', data) # Fix user_empire_designs
    data = '{' + data + '}'
    return json.loads(data)

def encode(data):
    text = json.dumps(data, indent=4)
    text = text[2:-2]
    text = text.replace('"', '').replace(':', ' =').replace(',', '')
    return text
# ----------
text = '''building_name_number = {
    base_build_time = 60
    base_cap_amount = 1
    category = pop_assembly
}'''

data = decode(text)
data['building_name_number']['new_item'] = 123

text = encode(data)
print(text)
Result:
building_name_number = {
    base_build_time = 60
    base_cap_amount = 1
    category = pop_assembly
    new_item = 123
}
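Since the question requires that operators such as <= survive the round trip, encode() would also need to fold the {"value": ..., "operand": ...} objects that decode() creates back into their original form. Below is a rough, untested sketch of that extra step; it is an assumption about how one might extend the answer's encode(), not part of paradox-reader:

import json
import re

def encode_with_operands(data):
    text = json.dumps(data, indent=4)
    # Fold  "key": {"value": 1, "operand": "<="}  back into  "key" <= 1
    # (the remaining quotes are stripped by the replace() calls below)
    text = re.sub(
        r'"([\w-]+)":\s*{\s*"value":\s*([^,\s]+),\s*"operand":\s*"([^"]+)"\s*}',
        r'"\1" \3 \2',
        text)
    text = text[2:-2]
    text = text.replace('"', '').replace(':', ' =').replace(',', '')
    return text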
I have a JSON file containing three fields: two are strings and the third is a field containing a list of values.
{
    "STREAM": "stream",
    "BASIS_STREAM": "basis",
    "PATHS": "[/opt/path1,/opt/path2]"
}
Now I load that JSON
with open('/pathToJsonFile.json', 'r') as f:
    data = json.load(f)
Now I want to get those values.
stream=str(data["STREAM"])
basis=str(data["BASIS_STREAM"])
paths=data["BASE_PATHS"]
The issue is that paths is also treated as a String, although I have to use it as a list. I am converting the other fields with the str function because of Unicode. The code must be in Python 2.
Thanks a lot!
Say you have a file called data.json with the following contents:
{
    "STREAM": "stream",
    "BASIS_STREAM": "basis",
    "PATHS": "[/opt/path1,/opt/path2]"
}
Maybe you could use str.split after calling json.load:
with open('data.json', 'r') as f:
    data = json.load(f)

print 'data = %s' % data

stream = str(data['STREAM'])
basis = str(data['BASIS_STREAM'])
paths = [str(u_s) for u_s in data['PATHS'][1:-1].split(',')]

print 'stream = %s' % stream
print 'basis = %s' % basis
print 'paths = %s' % paths
Output:
data = {u'PATHS': u'[/opt/path1,/opt/path2]', u'BASIS_STREAM': u'basis', u'STREAM': u'stream'}
stream = stream
basis = basis
paths = ['/opt/path1', '/opt/path2']
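As a side note, if you control the file format, storing PATHS as a real JSON array avoids the string surgery entirely, e.g.:

{
    "STREAM": "stream",
    "BASIS_STREAM": "basis",
    "PATHS": ["/opt/path1", "/opt/path2"]
}

With that layout, data['PATHS'] already comes back from json.load as a list.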
Your /opt/path1 and /opt/path2 would need to be in quotation marks to be parsed as a list. If your PATHS always follows a similar template, such as "[/XXX,/YYY,/ZZZ,/TTT,/KKK]", the following code should also help. It converts your data to "['/XXX','/YYY','/ZZZ','/TTT','/KKK']" so that it can easily be converted to a list using the ast library. Please see the following code:
import json
import ast

with open("text_text.json") as f:
    data = json.load(f)

print(data["PATHS"])  # Your data

for i in data["PATHS"]:
    if i == "[":
        data["PATHS"] = data["PATHS"].replace("[", "['")
    elif i == ",":
        data["PATHS"] = data["PATHS"].replace(",/", "','/")
    elif i == "]":
        data["PATHS"] = data["PATHS"].replace("]", "']")

#print(data["PATHS"])
print(type(data["PATHS"]))
print(data["PATHS"])  # converted to a data which can be converted to a list.

data_paths = ast.literal_eval(data["PATHS"])  # ast is used to convert str to list.
print(data_paths)  # 'list' data
print(type(data_paths))
It should also work if your PATHS contains more entries.
I am iterating over a dict created from a JSON file, which works fine, but as soon as I remove some of the entries in the else clause the results change (normally it prints 35 nuts_ids, but with the remove in the else only 32 are printed). So it seems that the remove influences the iteration, but why? The key should be safe? How can I do this appropriately without losing data?
import json

with open("test.json") as json_file:
    json_data = json.load(json_file)

for g in json_data["features"]:
    poly = g["geometry"]
    cntr_code = g["properties"]["CNTR_CODE"]
    nuts_id = g["properties"]["NUTS_ID"]
    name = g["properties"]["NUTS_NAME"]
    if cntr_code == "AT":
        print(nuts_id)
        # do plotting etc
    else:  # delete it if it is not part a specific country
        json_data["features"].remove(g)  # line in question

# do something else with the json_data
It is not good practice to delete items while iterating over the object. Instead you can build a filtered list containing only the elements you need.
Ex:
import json

with open("test.json") as json_file:
    json_data = json.load(json_file)

# Filter other country codes.
json_data_features = [g for g in json_data["features"] if g["properties"]["CNTR_CODE"] == "AT"]
json_data["features"] = json_data_features

for g in json_data["features"]:
    poly = g["geometry"]
    cntr_code = g["properties"]["CNTR_CODE"]
    nuts_id = g["properties"]["NUTS_ID"]
    name = g["properties"]["NUTS_NAME"]
    # do plotting etc

# do something else with the json_data
Always remember the cardinal rule: never modify objects you are iterating over. You can take a copy of the list and iterate over that instead, using copy.copy:
import json
import copy

with open("test.json") as json_file:
    json_data = json.load(json_file)

# Take a copy of json_data['features']
json_data_copy = copy.copy(json_data['features'])

# Iterate over the copy
for g in json_data_copy:
    poly = g["geometry"]
    cntr_code = g["properties"]["CNTR_CODE"]
    nuts_id = g["properties"]["NUTS_ID"]
    name = g["properties"]["NUTS_NAME"]
    if cntr_code == "AT":
        print(nuts_id)
        # do plotting etc
    else:  # delete it if it is not part of a specific country
        json_data["features"].remove(g)  # line in question
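The copy matters because removing an element from the list you are looping over shifts every later element down one position while the iterator's index keeps advancing, so the element right after each removed one is silently skipped; that is why only 32 instead of 35 nuts_ids were printed. A minimal demonstration with made-up data:

features = ["AT1", "DE1", "DE2", "AT2"]
for f in features:
    if not f.startswith("AT"):
        features.remove(f)  # shrinks the list mid-iteration
print(features)  # ['AT1', 'DE2', 'AT2'] -- 'DE2' was never visited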
I am very new to Python/JSON, so please bear with me on this. I could do this in R, but we need to use Python in order to move this to Python/Spark/MongoDB. Also, I am just posting a minimal subset; I have a couple more file types, so if anyone can help me with this, I can build upon that to integrate more files and file types:
Getting back to my problem:
I have two tsv input files that I need to merge and convert to JSON. Both the files have gene and sample columns plus some additional columns. However, the gene and sample may or may not overlap like I have shown - f2.tsv has all genes in f1.tsv but also has an additional gene g3. Similarly, both files have overlapping as well as non-overlapping values in sample column.
# f1.tsv – has gene, sample and additional column other1
$ cat f1.tsv
gene sample other1
g1 s1 a1
g1 s2 b1
g1 s3a c1
g2 s4 d1
# f2.tsv – has gene, sample and additional columns other21, other22
$ cat f2.tsv
gene sample other21 other22
g1 s1 a21 a22
g1 s2 b21 b22
g1 s3b c21 c22
g2 s4 d21 d22
g3 s5 f21 f22
The gene forms the top level, each gene has multiple samples which form the second level and the additional columns form the extras which is the third level. The extras are divided into two because one file has other1 and the second file has other21 and other22. The other files that I will include later will have other fields like other31 and other32 and so on but they will still have the gene and sample columns.
# expected output – JSON by combining both tsv files.
$ cat output.json
[{
    "gene":"g1",
    "samples":[
        {
            "sample":"s2",
            "extras":[
                {
                    "other1":"b1"
                },
                {
                    "other21":"b21",
                    "other22":"b22"
                }
            ]
        },
        {
            "sample":"s1",
            "extras":[
                {
                    "other1":"a1"
                },
                {
                    "other21":"a21",
                    "other22":"a22"
                }
            ]
        },
        {
            "sample":"s3b",
            "extras":[
                {
                    "other21":"c21",
                    "other22":"c22"
                }
            ]
        },
        {
            "sample":"s3a",
            "extras":[
                {
                    "other1":"c1"
                }
            ]
        }
    ]
},{
    "gene":"g2",
    "samples":[
        {
            "sample":"s4",
            "extras":[
                {
                    "other1":"d1"
                },
                {
                    "other21":"d21",
                    "other22":"d22"
                }
            ]
        }
    ]
},{
    "gene":"g3",
    "samples":[
        {
            "sample":"s5",
            "extras":[
                {
                    "other21":"f21",
                    "other22":"f22"
                }
            ]
        }
    ]
}]
How do I convert two tsv files to a single multi-level JSON based on two common columns?
I would really appreciate any help that I can get on this.
Thanks!
Here's another option. I tried to make it easy to manage when you start adding more files. You can run it on the command line and provide arguments, one for each file you want to add in. Gene/sample names are stored in dictionaries to improve efficiency. The formatting of your desired JSON object is done in each class's format() method. Hope this helps.
import csv, json, sys

class Sample(object):
    def __init__(self, name, extras):
        self.name = name
        self.extras = [extras]

    def format(self):
        map = {}
        map['sample'] = self.name
        map['extras'] = self.extras
        return map

    def add_extras(self, extras):
        # edit 8/20
        # always just add the new extras to the list
        for extra in extras:
            self.extras.append(extra)

class Gene(object):
    def __init__(self, name, samples):
        self.name = name
        self.samples = samples

    def format(self):
        map = {}
        map['gene'] = self.name
        map['samples'] = sorted([self.samples[sample_key].format() for sample_key in self.samples],
                                key=lambda sample: sample['sample'])
        return map

    def create_or_add_samples(self, new_samples):
        # loop through new samples, seeing if they already exist in the gene object
        for sample_name in new_samples:
            sample = new_samples[sample_name]
            if sample.name in self.samples:
                self.samples[sample.name].add_extras(sample.extras)
            else:
                self.samples[sample.name] = sample

class Genes(object):
    def __init__(self):
        self.genes = {}

    def format(self):
        return sorted([self.genes[gene_name].format() for gene_name in self.genes],
                      key=lambda gene: gene['gene'])

    def create_or_add_gene(self, gene):
        if not gene.name in self.genes:
            self.genes[gene.name] = gene
        else:
            self.genes[gene.name].create_or_add_samples(gene.samples)

def row_to_gene(headers, row):
    gene_name = ""
    sample_name = ""
    extras = {}
    for value in enumerate(row):
        if headers[value[0]] == "gene":
            gene_name = value[1]
        elif headers[value[0]] == "sample":
            sample_name = value[1]
        else:
            extras[headers[value[0]]] = value[1]
    sample_dict = {}
    sample_dict[sample_name] = Sample(sample_name, extras)
    return Gene(gene_name, sample_dict)

if __name__ == '__main__':
    delim = "\t"
    genes = Genes()
    files = sys.argv[1:]
    for file in files:
        print("Reading " + str(file))
        with open(file, 'r') as f1:
            reader = csv.reader(f1, delimiter=delim)
            headers = []
            for row in reader:
                if len(headers) == 0:
                    headers = row
                else:
                    genes.create_or_add_gene(row_to_gene(headers, row))
    result = json.dumps(genes.format(), indent=4)
    print(result)
    with open('json_output.txt', 'w') as output:
        output.write(result)
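For example, assuming the script is saved under the hypothetical name merge_genes.py, it would be run as:

python merge_genes.py f1.tsv f2.tsv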
This looks like a problem for pandas! Unfortunately pandas only takes us so far, and we then have to do some manipulation on our own. This is neither fast nor particularly efficient code, but it will get the job done.
import pandas as pd
import json
from collections import defaultdict

# here we import the tsv files as pandas df
f1 = pd.read_table('f1.tsv', delim_whitespace=True)
f2 = pd.read_table('f2.tsv', delim_whitespace=True)

# we then let pandas merge them
newframe = f1.merge(f2, how='outer', on=['gene', 'sample'])

# have pandas write them out to a json, and then read them back in as a
# python object (a list of dicts)
pythonList = json.loads(newframe.to_json(orient='records'))

newDict = {}
for d in pythonList:
    gene = d['gene']
    sample = d['sample']
    sampleDict = {'sample': sample,
                  'extras': []}
    extrasdict = defaultdict(lambda: dict())
    if gene not in newDict:
        newDict[gene] = {'gene': gene, 'samples': []}
    for key, value in d.iteritems():
        if 'other' not in key or value is None:
            continue
        else:
            id = key.split('other')[-1]
            if len(id) == 1:
                extrasdict['1'][key] = value
            else:
                extrasdict['{}'.format(id[0])][key] = value
    for value in extrasdict.values():
        sampleDict['extras'].append(value)
    newDict[gene]['samples'].append(sampleDict)

newList = [v for k, v in newDict.iteritems()]
print json.dumps(newList)
If this looks like a solution that will work for you, I am happy to spend some time cleaning it up to make it a bit more readable and efficient. Note that this is Python 2 code (iteritems and the print statement); under Python 3 you would use .items() and print().
PS: If you like R, then pandas is the way to go (it was written to give an R-like interface to data in Python).
Do it in steps:
1. Read the incoming tsv files and aggregate the information from the different genes into a dictionary.
2. Process said dictionary to match your desired format.
3. Write the result to a JSON file.
Here is the code:
import csv
import json
from collections import defaultdict

input_files = ['f1.tsv', 'f2.tsv']
output_file = 'genes.json'

# Step 1
gene_dict = defaultdict(lambda: defaultdict(list))
for file in input_files:
    with open(file, 'r') as f:
        reader = csv.DictReader(f, delimiter='\t')
        for line in reader:
            gene = line.pop('gene')
            sample = line.pop('sample')
            gene_dict[gene][sample].append(line)

# Step 2
out = [{'gene': gene,
        'samples': [{'sample': sample, 'extras': extras}
                    for sample, extras in samples.items()]}
       for gene, samples in gene_dict.items()]

# Step 3
with open(output_file, 'w') as f:
    json.dump(out, f)
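For reference, with the sample f1.tsv and f2.tsv from the question, the intermediate gene_dict built in Step 1 is shaped like this (illustrative only; the script does not print it):

{
    'g1': {
        's1': [{'other1': 'a1'}, {'other21': 'a21', 'other22': 'a22'}],
        's2': [{'other1': 'b1'}, {'other21': 'b21', 'other22': 'b22'}],
        's3a': [{'other1': 'c1'}],
        's3b': [{'other21': 'c21', 'other22': 'c22'}]
    },
    'g2': {'s4': [{'other1': 'd1'}, {'other21': 'd21', 'other22': 'd22'}]},
    'g3': {'s5': [{'other21': 'f21', 'other22': 'f22'}]}
}

Step 2 then only reshapes this nesting into the list-of-dicts layout the question asks for.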
So my whole problem is that I have two files, one with the following format (for Python 2.6):
#comments
config = {
    #comments
    'name': 'hello',
    'see?': 'world':'ABC',CLASS=3
}
This file has a number of sections like this. The second file has the format:
[23]
[config]
'name'='abc'
'see?'=
[23]
Now the requirement is that I need to compare both files and generate a file like this:
#comments
config = {
    #comments
    'name': 'abc',
    'see?': 'world':'ABC',CLASS=3
}
So the result file will contain the values from the first file, unless the value for the same attribute is present in the second file, in which case it overwrites the first. Now my problem is how to manipulate these files using Python.
Thanks in advance, and for your previous answers in such short time. I need to use Python 2.6.
I was unable to find a beautiful solution due to the comments. This is tested and works for me, but requires Python 3.1 or higher:
from collections import OrderedDict

indenting = '\t'

def almost_py_read(f):
    sections = OrderedDict()
    contents = None
    active = sections
    for line in f:
        line = line.strip()
        if line.startswith('#'):
            active[line] = None
        elif line.endswith('{'):
            k = line.split('=')[0].strip()
            contents = OrderedDict()
            active = contents
            sections[k] = contents
        elif line.endswith('}'):
            active = sections
        else:
            try:
                k, v = line.split(':')
                k = k.strip()
                v = v.strip()
                active[k] = v
            except:
                pass
    return sections

def almost_ini_read(f):
    sections = OrderedDict()
    contents = None
    for line in f:
        line = line.strip()
        try:
            k, v = line.split('=')
            k = k.strip()
            v = v.strip()
            if v:
                contents[k] = v
        except:
            if line.startswith('[') and line.endswith(']'):
                contents = OrderedDict()
                sections[line[1:-1]] = contents
    print(sections)
    return sections

def compilefiles(pyname, ininame):
    sections = almost_py_read(open(pyname, 'rt'))
    override_sections = almost_ini_read(open(ininame, "rt"))
    for section_key, section_value in override_sections.items():
        if not sections.get(section_key):
            sections[section_key] = OrderedDict()
        for k, v in section_value.items():
            sections[section_key][k] = v
    return sections

def output(d, indent=''):
    for k, v in d.items():
        if v == None:
            print(indent+k)
        elif v:
            if type(v) == str:
                print(indent+k+': '+v+',')
            else:
                print(indent+k+' = {')
                output(v, indent+indenting)
                print(indent+'}')

d = compilefiles('a.txt', 'b.ini')
output(d)
Output:
#comments
config = {
    #comments
    'name': 'abc',
    'see?': 'world',
}
I had a really long and hard time managing to write the following code. I had difficulties dealing with the commas: I wanted the updated file to keep, after the update, the same format as the file before the update: lines ending with a comma, except for the last one.
This code is crafted for the particular problem as posed by the questioner and can't be used as-is for another type of problem, I know. That is the downside of code based on regexes rather than on a parser, and I'm fully aware of it. But I think it is a canvas that can be adapted relatively easily to other cases by changing the regexes, which is a fairly straightforward process thanks to the malleability of regexes.
def file_updating(updating_filename, updating_data_extractor, filename_to_update):
    # The function whose name is held by the updating_data_extractor parameter
    # is a function that
    # extracts data from the file whose name is held by the updating_filename parameter
    # and must return a tuple:
    # ( updating dictionary , compiled regex )
    updating_dico, pat = updating_data_extractor(updating_filename)

    with open(filename_to_update, 'r+') as f:
        lines = f.readlines()

        def jiji(line, dico=updating_dico):
            mat = pat.search(line.rstrip())
            if mat and mat.group(3) in dico:
                return '%s: %s,' % (mat.group(1), dico.pop(mat.group(3)))
            else:
                return line.rstrip(',') + ','

        li = [jiji(line) for line in lines[0:-1]]  # [0:-1] because last line is '}'
        front = (mit.group(2) for mit in (pat.search(line) for line in lines) if mit).next()
        li.extend(front + '%s: %s,' % item for item in updating_dico.iteritems())
        li[-1] = li[-1].rstrip(',')
        li.append('}')

        f.seek(0, 0)
        f.writelines('\n'.join(li))
        f.truncate()
Exemplifying code:
import re

bef1 = '''#comments
config =
{
    #comments
    'name': 'hello',
    'arctic':01011101,
    'summu': 456,
    'see?': 'world',
    'armorique': 'bretagne'
}'''

bef2 = '''#comments
config =
{
    #comments
    'name': 'abc',
    'see?': { 'world':'india':'jagdev'},
}'''

def one_extractor(data_containing_filename):
    with open(data_containing_filename) as g:
        contg = re.search('\[(\d+)\].+\[config\](.*?)\[(\\1)\]', g.read(), re.DOTALL)
    if contg:
        updtgen = ( re.match("([^=]+)=[ \f\t\v]*([^ \f\t\v].*|)", line.strip())
                    for line in contg.group(2).splitlines() )
        updating_data = dict( mi.groups() for mi in updtgen if mi and mi.group(2) )
    else:
        from sys import exit
        exit(data_containing_filename + " isn't a valid file for updating")
    pat = re.compile("(([ \t]*)([^:]+)):\s*(.+),?")
    return (updating_data, pat)

for bef in (bef1, bef2):
    # file to update: rudu.txt
    with open('rudu.txt', 'w') as jecr:
        jecr.write(bef)

    # updating data: renew_rudu.txt
    with open('renew_rudu.txt', 'w') as jecr:
        jecr.write('''[23]
[config]
'nuclear'= 'apocalypse'
'name'='abc'
'armorique'= 'BRETAGNE'
'arctic'=
'boloni'=7600
'see?'=
'summu'='tumulus'
[23]''')

    print 'BEFORE ---------------------------------'
    with open('rudu.txt') as lir:
        print lir.read()

    print '\nUPDATING DATA --------------------------'
    with open('renew_rudu.txt') as lir:
        print lir.read()

    file_updating('renew_rudu.txt', one_extractor, 'rudu.txt')

    print '\nAFTER ================================='
    with open('rudu.txt', 'r') as f:
        print f.read()

    print '\n\nX#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#\n'
Result:
>>>
BEFORE ---------------------------------
#comments
config =
{
    #comments
    'name': 'hello',
    'arctic':01011101,
    'summu': 456,
    'see?': 'world',
    'armorique': 'bretagne'
}

UPDATING DATA --------------------------
[23]
[config]
'nuclear'= 'apocalypse'
'name'='abc'
'armorique'= 'BRETAGNE'
'arctic'=
'boloni'=7600
'see?'=
'summu'='tumulus'
[23]

AFTER =================================
#comments,
config =,
{,
    #comments,
    'name': 'abc',
    'arctic':01011101,
    'summu': 'tumulus',
    'see?': 'world',
    'armorique': 'BRETAGNE',
    'boloni': 7600,
    'nuclear': 'apocalypse'
}

X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#

BEFORE ---------------------------------
#comments
config =
{
    #comments
    'name': 'abc',
    'see?': { 'world':'india':'jagdev'},
}

UPDATING DATA --------------------------
[23]
[config]
'nuclear'= 'apocalypse'
'name'='abc'
'armorique'= 'BRETAGNE'
'arctic'=
'boloni'=7600
'see?'=
'summu'='tumulus'
[23]

AFTER =================================
#comments,
config =,
{,
    #comments,
    'name': 'abc',
    'see?': { 'world':'india':'jagdev'},
    'armorique': 'BRETAGNE',
    'boloni': 7600,
    'summu': 'tumulus',
    'nuclear': 'apocalypse'
}

X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#X#
>>>
EDIT:
I have improved the code because I was still unsatisfied. Now the "variable" front catches the blank characters (' ' or '\t') at the beginning of the data-containing lines in the file to be updated.
I had also forgotten the instruction f.truncate(), which is very important so as not to leave a tail of undesired characters behind.
I am satisfied to see that my code works well even with the following file, in which a value is a dictionary, as presented by Jagdev:
#comments
config =
{
    #comments
    'name': 'abc',
    'see?': { 'world':'india':'jagdev'},
}
That confirms my choice to process the file line by line rather than trying to run through the entire file with a single regex.
EDIT 2:
I changed the code again. The updating is now performed by a function that takes as arguments:
- the name of the updating file (the file containing the data used to update another file), and
- the function suited to extracting the data from this particular updating file.
Hence, it is possible to update a given file with data from various updating files, which makes the code more generic; an example extractor is sketched below.
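For instance, a second extractor for a hypothetical updating file of bare key=value lines (no [23] section markers) only has to honour the same contract and return the (updating dictionary, compiled regex) pair:

import re

def plain_kv_extractor(updating_filename):
    # Hypothetical sketch: extract data from a bare key=value file,
    # returning the same (dict, compiled regex) tuple that
    # file_updating() expects from its extractor argument.
    updating_data = {}
    with open(updating_filename) as g:
        for line in g:
            if '=' in line:
                k, v = line.split('=', 1)
                if v.strip():
                    updating_data[k.strip()] = v.strip()
    pat = re.compile("(([ \t]*)([^:]+)):\s*(.+),?")
    return (updating_data, pat)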
Very roughly (i.e. this hasn't been tested at all, and there are numerous improvements that could be made, such as the use of regex and/or pretty-printing):
dicts = []
with open('file1') as file1:
    try:
        file1content = file1.read()
        eval(file1content)
        file1content.strip(' ')
        file1content.strip('\t')
        for line in file1content.splitlines():
            if '={' in line:
                dicts.append(line.split('={').strip())
    except:
        print 'file1 not valid'

with open('file2') as file2:
    filelines = file2.readlines()
    while filelines:
        while filelines and '[23]' not in filelines[0]:
            filelines.pop(0)
        if filelines:
            filelines.pop(0)
            dictname = filelines.pop(0).split('[')[1].split(']')[0]
            if dictname not in dicts:
                dicts.append(dictname)
                exec(dictname + ' = {}')
            while filelines and '[23]' not in filelines[0]:
                line = filelines.pop(0)
                [k, v] = line.split('=')
                k.strip()
                v.strip()
                if v:
                    exec(dictname + '[k] = v')

with open('file3', 'w') as file3:
    file3content = '\n'.join([`eval(dictname)` for dictname in dicts])
    file3.write(file3content)