I am very new to Python and I have encountered a problem: I have to convert an Azure Function to a normal Python script. I have not worked with Azure before, so I am rather clueless. Here is the code below;
it analyzes a document and returns key-value pairs, but I am not sure how to convert it into a regular Python script and run it locally.
import logging
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
from azure.storage.blob import BlockBlobService, PublicAccess
import json
import re
import uuid
logger = logging.getLogger(__name__)
import azure.functions as func
def upload_blob(account_name, container_name, account_key, blob_name):
# Create the BlockBlobService that is used to call the Blob service for the storage account
blob_service_client = BlockBlobService(
account_name=account_name,
account_key=account_key)
# Set the permission so the blobs are public.
blob_service_client.set_container_acl(container_name, public_access=PublicAccess.Container)
#blob_name = doc_path.split('/')[-1][:-4] + str(uuid.uuid4()) + ".pdf"
# Upload the created file, use blob_name for the blob name
#blob_service_client.create_blob_from_path(container_name, blob_name, doc_path)
blob_url = blob_service_client.make_blob_url(container_name, blob_name)
return blob_url
def delete_blob(account_name, container_name, account_key, blob_name):
blob_service_client = BlockBlobService(
account_name=account_name,
account_key=account_key)
# Delete blob from container
blob_service_client.delete_blob(container_name, blob_name)
def search_value(kvs, search_key):
for key, value in kvs.items():
if re.search(search_key, key, re.IGNORECASE):
return value
def analyze_general_documents(endpoint, api_key, doc_url):
document_analysis_client = DocumentAnalysisClient(
endpoint=endpoint, credential=AzureKeyCredential(api_key)
)
poller = document_analysis_client.begin_analyze_document_from_url("prebuilt-document", doc_url)
result = poller.result()
#print("----Key-value pairs found in document----")
kvs = {}
content = result.content.replace("\n", "").replace("\r", "").strip()
for kv_pair in result.key_value_pairs:
if kv_pair.key:
key = kv_pair.key.content
if kv_pair.value:
val = kv_pair.value.content
kvs[key] = val
return content, kvs
def main(req: func.HttpRequest) -> func.HttpResponse:
try:
# Query parameters
endpoint = ""
api_key = ""
account_name = ""
container_name = ""
account_key = ""
if "blob_name" in req.get_json() and "search_keys" in req.get_json():
blob_name = req.get_json()["blob_name"]
search_keys = req.get_json()["search_keys"]
logger.info(" search_keys = "+str(search_keys))
# Upload file to Azure Storage container.
logger.info("Creating blob url")
blob_url = upload_blob(account_name, container_name, account_key, blob_name)
#logger.info("Blob url = "+str(blob_url))
# Analyze the document
content, kvs = analyze_general_documents(endpoint, api_key, blob_url)
#logger.info("content = "+content)
#logger.info("kvs = "+str(kvs))
# Search for specified keys
search_results = {}
for search_key in search_keys:
val = search_value(kvs, search_key)
if val:
search_results[search_key] = val
#logger.info("search_results = "+str(search_results))
# Delete the uploaded file
delete_blob(account_name, container_name, account_key, blob_name)
# Return search results
return func.HttpResponse(json.dumps(search_results))
else:
return func.HttpResponse("Please pass in end_point, api_key, and blob_name", status_code=400)
except Exception as e:
return func.HttpResponse("Error: " + str(e), status_code=500)
First things first: this may not be the full solution to your problem, but it should help you derive the next steps. The script should be re-coded, in my opinion, since it is based on an old library that may no longer be maintained. Still, below are some ideas; this is by no means a production-ready solution and must not be used with production data.
Your imports can remain as-is. Just note that when you install the libraries through pip install {library_name}, you need the legacy storage SDK (azure-storage-blob 2.1.0 or the even older azure-storage package) rather than the current azure-storage-blob 12.x series, since the newer versions no longer provide BlockBlobService.
Also, if you want to run the script from the command line, you might want to pass in the parameters that the function originally receives through the HTTP request. For this purpose you could use argparse. Furthermore, you might not want to hard-code the credentials in your script file but rather export them as environment variables; then you would need the os module as well.
That said, your imports would look like this:
import logging
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
from azure.storage.blob import BlockBlobService, PublicAccess
import json
import re
import uuid
# Importing argparse for being able to pass parameters
import argparse
# Importing os to read environment variables
import os
You would not need import azure.functions as func any more.
Because the script runs locally, you could pass the parameters blob_name and search_keys when executing it. This would require something like this:
parser = argparse.ArgumentParser()
parser.add_argument("-n", "--blobname", type=str)
parser.add_argument("-s", "--searchkeys", type=str)
args = parser.parse_args()
blob_name = args.blobname
search_keys = args.searchkeys
This would allow you to keep the variable names as they are right now.
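As a side note on the nargs="+" added above: with a plain type=str argument, the later loop for search_key in search_keys would iterate over individual characters instead of whole keys. A tiny illustration:
for search_key in "invoice":  # plain string: iterates characters
    print(search_key)  # i, n, v, o, ...
for search_key in ["invoice", "total"]:  # list from nargs="+": iterates whole keys
    print(search_key)  # invoice, total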
As initially mentioned, the functions can remain as-is; however, the credentials should not live inside the script. With os imported, you could add this...
# Query parameters
endpoint = os.getenv('form_recognizer_endpoint')
api_key = os.getenv('form_recognizer_api_key')
account_name = os.getenv('storage_account_name')
container_name = os.getenv('storage_container_name')
account_key = os.getenv('storage_account_key')
...and then use your shell's export functionality to set them as environment variables, e.g.:
export form_recognizer_endpoint="your_endpoint"
export form_recognizer_api_key="your_api_key"
export storage_account_name="your_account_name"
export storage_container_name="your_container"
export storage_account_key="your_key"
Next, you could remove the surrounding def main, the try/except block, and the if statement, so that your main block would be along these lines:
logger.info(" search_keys = "+str(search_keys))
# Upload file to Azure Storage container.
logger.info("Creating blob url")
blob_url = upload_blob(account_name, container_name, account_key, blob_name)
#logger.info("Blob url = "+str(blob_url))
# Analyze the document
content, kvs = analyze_general_documents(endpoint, api_key, blob_url)
#logger.info("content = "+content)
#logger.info("kvs = "+str(kvs))
# Search for specified keys
search_results = {}
for search_key in search_keys:
val = search_value(kvs, search_key)
if val:
search_results[search_key] = val
#logger.info("search_results = "+str(search_results))
# Delete the uploaded file
delete_blob(account_name, container_name, account_key, blob_name)
Finally, you could change the return line to print the result, e.g.:
# Return search results
print(json.dumps(search_results))
It could be executed like this: python script.py --blobname testfile.pdf --searchkeys "text"
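Putting the pieces together, the entry point of the stand-alone script could look roughly like this; it is only a sketch and assumes the upload_blob, analyze_general_documents, search_value and delete_blob functions are kept exactly as above:
if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    parser = argparse.ArgumentParser()
    parser.add_argument("-n", "--blobname", type=str, required=True)
    parser.add_argument("-s", "--searchkeys", type=str, nargs="+", required=True)
    args = parser.parse_args()
    # Credentials and endpoints come from the environment variables exported above
    endpoint = os.getenv('form_recognizer_endpoint')
    api_key = os.getenv('form_recognizer_api_key')
    account_name = os.getenv('storage_account_name')
    container_name = os.getenv('storage_container_name')
    account_key = os.getenv('storage_account_key')
    # Build the blob URL, analyze the document and search for the requested keys
    blob_url = upload_blob(account_name, container_name, account_key, args.blobname)
    content, kvs = analyze_general_documents(endpoint, api_key, blob_url)
    search_results = {}
    for search_key in args.searchkeys:
        val = search_value(kvs, search_key)
        if val:
            search_results[search_key] = val
    delete_blob(account_name, container_name, account_key, args.blobname)
    print(json.dumps(search_results))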
I feel like this should be possible, but I looked through the wandb SDK code and I can't find an easy/logical way to do it. It might be possible to hack it by modifying the manifest entries at some point later (but maybe before the artifact is logged to wandb as then the manifest and the entries might be locked)? I saw things like this in the SDK code:
version = manifest_entry.extra.get("versionID")
etag = manifest_entry.extra.get("etag")
So, I figure we can probably edit those?
UPDATE
So, I tried to hack it together with something like this and it works but it feels wrong:
import os
import wandb
import boto3
from wandb.util import md5_file
ENTITY = os.environ.get("WANDB_ENTITY")
PROJECT = os.environ.get("WANDB_PROJECT")
API_KEY = os.environ.get("WANDB_API_KEY")
api = wandb.Api(overrides={"entity": ENTITY, "project": PROJECT})
run = wandb.init(entity=ENTITY, project=PROJECT, job_type="test upload")
file = "admin2Codes.txt" # "admin1CodesASCII.txt" # (both already on s3 with a couple versions)
artifact = wandb.Artifact("test_data", type="dataset")
# modify one of the local files so it has a new md5hash etc.
with open(file, "a") as f:
f.write("new_line_1\n")
# upload local file to s3
local_file_path = file
s3_url = f"s3://bucket/prefix/{file}"
s3_url_arr = s3_url.replace("s3://", "").split("/")
s3_bucket = s3_url_arr[0]
key = "/".join(s3_url_arr[1:])
s3_client = boto3.client("s3")
file_digest = md5_file(local_file_path)
s3_client.upload_file(
local_file_path,
s3_bucket,
key,
# save the md5_digest in metadata,
# can be used later to only upload new files to s3,
# as AWS doesn't digest the file consistently in the E-tag
ExtraArgs={"Metadata": {"md5_digest": file_digest}},
)
head_response = s3_client.head_object(Bucket=s3_bucket, Key=key)
version_id: str = head_response["VersionId"]
print(version_id)
# upload a link/ref to this s3 object in wandb:
s3_dir = "s3://bucket/prefix/"  # referencing the prefix adds every object under it
artifact.add_reference(s3_dir)
# at this point we might be able to modify the artifact._manifest.entries and each entry.extra.get("etag") etc.?
print([(name, entry.extra) for name, entry in artifact._manifest.entries.items()])
# set these to an older version on s3 that we know we want (rather than latest) - do this via wandb public API:
dataset_v2 = api.artifact(f"{ENTITY}/{PROJECT}/test_data:v2", type="dataset")
# artifact._manifest.add_entry(dataset_v2.manifest.entries["admin1CodesASCII.txt"])
artifact._manifest.entries["admin1CodesASCII.txt"] = dataset_v2.manifest.entries[
"admin1CodesASCII.txt"
]
# verify that it did change:
print([(name, entry.extra) for name, entry in artifact._manifest.entries.items()])
run.log_artifact(artifact) # at this point the manifest is locked I believe?
artifact.wait() # wait for upload to finish (blocking - but should be very quick given it is just an s3 link)
print(artifact.name)
run_id = run.id
run.finish()
curr_run = api.run(f"{ENTITY}/{PROJECT}/{run_id}")
used_artifacts = curr_run.used_artifacts()
logged_artifacts = curr_run.logged_artifacts()
Am I on the right track here? I guess the other workaround is to make a copy on S3 (so that the older version becomes the latest again), but I wanted to avoid that: the one file I want to pin to an old version is a large NLP model, while the only files I actually want to change are small config.json files etc., so re-uploading everything seems very wasteful.
I was also wondering whether copying an old version of an object back onto the same key in the bucket creates a real copy or just a pointer to the same underlying object. Neither the boto3 nor the AWS documentation makes that clear, although it seems to be a proper copy.
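For reference, the copy-back workaround mentioned above would look roughly like this in boto3 (bucket, key and version ID are placeholders; note that copy_object only handles objects up to 5 GB, beyond which a multipart copy is needed):
import boto3
s3_client = boto3.client("s3")
# Copy a specific older version of an object back onto the same key,
# which makes that content the latest version again.
s3_client.copy_object(
    Bucket="bucket",
    Key="prefix/model.bin",
    CopySource={
        "Bucket": "bucket",
        "Key": "prefix/model.bin",
        "VersionId": "OLD_VERSION_ID",
    },
)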
I think I found the correct way to do it now:
import os
import re  # used below to strip the :vN suffix from the artifact name
import wandb
import boto3
from wandb.util import md5_file
ENTITY = os.environ.get("WANDB_ENTITY")
PROJECT = os.environ.get("WANDB_PROJECT")
def wandb_update_only_some_files_in_artifact(
existing_artifact_name: str,
new_s3_file_urls: list[str],
entity: str = ENTITY,
project: str = PROJECT,
) -> wandb.Artifact:
"""If you want to just update a config.json file for example,
but the rest of the artifact can remain the same, then you can
use this functions like so:
wandb_update_only_some_files_in_artifact(
"old_artifact:v3",
["s3://bucket/prefix/config.json"],
)
and then all the other files like model.bin will be the same as in v3,
even if there was a v4 or v5 in between (as the v3 VersionIds are used)
Args:
existing_artifact_name (str): name with version like "old_artifact:v3"
new_s3_file_urls (list[str]): files that should be updated
entity (str, optional): wandb entity. Defaults to ENTITY.
project (str, optional): wandb project. Defaults to PROJECT.
Returns:
Artifact: the new artifact object
"""
api = wandb.Api(overrides={"entity": entity, "project": project})
old_artifact = api.artifact(existing_artifact_name)
old_artifact_name = re.sub(r":v\d+$", "", old_artifact.name)
with wandb.init(entity=entity, project=project) as run:
new_artifact = wandb.Artifact(old_artifact_name, type=old_artifact.type)
s3_file_names = [s3_url.split("/")[-1] for s3_url in new_s3_file_urls]
# add the new ones:
for s3_url, filename in zip(new_s3_file_urls, s3_file_names):
new_artifact.add_reference(s3_url, filename)
# add the old ones:
for filename, entry in old_artifact.manifest.entries.items():
if filename in s3_file_names:
continue
new_artifact.add_reference(entry, filename)
# this also works but feels hackier:
# new_artifact._manifest.entries[filename] = entry
run.log_artifact(new_artifact)
new_artifact.wait() # wait for upload to finish (blocking - but should be very quick given it is just an s3 link)
print(new_artifact.name)
print(run.id)
return new_artifact
# usage:
local_file_path = "config.json" # modified file
s3_url = "s3://bucket/prefix/config.json"
s3_url_arr = s3_url.replace("s3://", "").split("/")
s3_bucket = s3_url_arr[0]
key = "/".join(s3_url_arr[1:])
s3_client = boto3.client("s3")
file_digest = md5_file(local_file_path)
s3_client.upload_file(
local_file_path,
s3_bucket,
key,
# save the md5_digest in metadata,
# can be used later to only upload new files to s3,
# as AWS doesn't digest the file consistently in the E-tag
ExtraArgs={"Metadata": {"md5_digest": file_digest}},
)
wandb_update_only_some_files_in_artifact(
"old_artifact:v3",
["s3://bucket/prefix/config.json"],
)
I have a Cloud Function calling SCC's list_assets and converting the paginated output to a list (to fetch all the results). However, since I have quite a lot of assets in the organization tree, fetching takes a long time and the Cloud Function times out (540 seconds maximum timeout).
asset_iterator = security_client.list_assets(org_name)
asset_fetch_all=list(asset_iterator)
I tried to export via the web UI and it works fine (it took about 5 minutes). Is there a way to export the assets from SCC directly to a Cloud Storage bucket using the API?
I developed the same thing in Python, for exporting to BigQuery (searching in BigQuery is easier than in a file); the code is very similar for Cloud Storage. Here is my working code for the BigQuery export:
import os
from google.cloud import asset_v1
from google.cloud.asset_v1.proto import asset_service_pb2
from google.cloud.asset_v1 import enums
def GCF_ASSET_TO_BQ(request):
client = asset_v1.AssetServiceClient()
parent = 'organizations/{}'.format(os.getenv('ORGANIZATION_ID'))
output_config = asset_service_pb2.OutputConfig()
output_config.bigquery_destination.dataset = 'projects/{}/datasets/{}'.format(os.getenv('PROJECT_ID'),os.getenv('DATASET'))
content_type = enums.ContentType.RESOURCE
output_config.bigquery_destination.table = 'asset_export'
output_config.bigquery_destination.force = True
response = client.export_assets(parent, output_config, content_type=content_type)
    # To wait for the export to finish, uncomment:
# response.result()
# Do stuff after export
return "done", 200
if __name__ == "__main__":
GCF_ASSET_TO_BQ('')
As you can see, some values come from environment variables (ORGANIZATION_ID, PROJECT_ID and DATASET). For exporting to Cloud Storage, you have to change the definition of the output_config like this:
output_config = asset_service_pb2.OutputConfig()
output_config.gcs_destination.uri = 'gs://path/to/file'
There are examples in other languages here.
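For completeness, here is a sketch of the same function writing to a Cloud Storage object instead; it assumes the same library version and environment variables as above, and the gs:// path is a placeholder:
import os
from google.cloud import asset_v1
from google.cloud.asset_v1.proto import asset_service_pb2
from google.cloud.asset_v1 import enums
def GCF_ASSET_TO_GCS(request):
    client = asset_v1.AssetServiceClient()
    parent = 'organizations/{}'.format(os.getenv('ORGANIZATION_ID'))
    output_config = asset_service_pb2.OutputConfig()
    # Write the export to a single Cloud Storage object
    output_config.gcs_destination.uri = 'gs://your-bucket/asset_export.json'
    content_type = enums.ContentType.RESOURCE
    response = client.export_assets(parent, output_config, content_type=content_type)
    # response.result()  # uncomment to block until the export finishes
    return "done", 200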
Try something like this; we use it to upload findings into a bucket. Make sure to give the service account (SP) that the function runs as the right permissions on the bucket:
def test_list_medium_findings(source_name):
# [START list_findings_at_a_time]
from google.cloud import securitycenter
from google.cloud import storage
# Create a new client.
client = securitycenter.SecurityCenterClient()
    #Set query parameters
organization_id = "11112222333344444"
org_name = "organizations/{org_id}".format(org_id=organization_id)
all_sources = "{org_name}/sources/-".format(org_name=org_name)
#Query Security Command Center
finding_result_iterator = client.list_findings(all_sources,filter_=YourFilter)
#Set output file settings
bucket="YourBucketName"
storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket)
output_file_name = "YourFileName"
my_file = bucket.blob(output_file_name)
with open('/tmp/data.txt', 'w') as file:
for i, finding_result in enumerate(finding_result_iterator):
file.write(
"{}: name: {} resource: {}".format(
i, finding_result.finding.name, finding_result.finding.resource_name
)
)
#Upload to bucket
my_file.upload_from_filename("/tmp/data.txt")
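A hypothetical invocation; the filter value below is only an assumption and should be replaced with whatever Security Command Center finding filter you actually need:
YourFilter = 'state="ACTIVE"'  # placeholder filter expression
test_list_medium_findings("organizations/11112222333344444/sources/-")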
I have installed the jenkinsapi Python module and created the Python script below.
from jenkinsapi.jenkins import Jenkins
jenkinsSource = 'http://10.52.123.124:8080/'
server = Jenkins(jenkinsSource, username = 'XXXXX', password = 'YYYYY')
myJob=server.get_job("__test")
myConfig=myJob.get_config()
print(myConfig)
Now I need to parse the XML in the variable myConfig, fetch the value given for the trigger (the cron entry), and save it to another variable in Python, so that I can replace the cron entry using the jenkinsapi module.
You can read myConfig with XPath; see the code below, which reads numToKeep from hudson.tasks.LogRotator. It prints the jobs whose build retention is set to more than 10 in the Discard old builds job configuration.
Calling code
jobs = server.get_jobs()
for job in jobs:
full_name = job['fullname']
config = server.get_job_config(full_name)
    num_days = h.get_value_from_string(config, key='.//numToKeep')  # h is the module containing the helper below
if num_days > 10:
print(full_name, num_days)
Helper function which reads a property from the config:
import xml.etree.ElementTree as ET
def get_value_from_string(xml_string, key):
parsed = ET.fromstring(xml_string)
find_result = parsed.findall(key)
value = 0
if len(find_result) > 0:
value = int(find_result[0].text)
return value
To update a job's configuration:
config = server.get_job_config('<the_job>')
old_config_part = '<numToKeep>100</numToKeep>'
new_config_part = '<numToKeep>5</numToKeep>'
new_config = config.replace(old_config_part, new_config_part)
server.reconfig_job('<the_job>', new_config)
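Applied to the original question (the cron entry), the same approach could read and replace the spec of a "Build periodically" trigger. The hudson.triggers.TimerTrigger path below assumes a standard timer trigger in the job's config.xml, and the new cron value is only an example:
import xml.etree.ElementTree as ET
def get_cron_spec(xml_string):
    # Return the cron expression of the timer trigger, or None if there is none
    node = ET.fromstring(xml_string).find('.//hudson.triggers.TimerTrigger/spec')
    return node.text if node is not None else None
config = server.get_job_config('__test')
old_cron = get_cron_spec(config)  # e.g. "H 4 * * *"
new_config = config.replace(
    '<spec>{}</spec>'.format(old_cron),
    '<spec>H 2 * * *</spec>')  # new cron entry (example value)
server.reconfig_job('__test', new_config)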
I am using Python with python-kubernetes against a Minikube running locally, i.e. there are no cloud issues.
I am trying to create a Job and provide it with data to run on. I would like to give it a mount of a directory containing data from my local machine.
I am using this example and trying to add a volume mount.
This is my code after adding the keyword volume_mounts (I tried multiple places and multiple keywords, and nothing works):
from os import path
import yaml
from kubernetes import client, config
JOB_NAME = "pi"
def create_job_object():
# Configureate Pod template container
container = client.V1Container(
name="pi",
image="perl",
volume_mounts=["/home/user/data"],
command=["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"])
# Create and configurate a spec section
template = client.V1PodTemplateSpec(
metadata=client.V1ObjectMeta(labels={
"app": "pi"}),
spec=client.V1PodSpec(restart_policy="Never",
containers=[container]))
# Create the specification of deployment
spec = client.V1JobSpec(
template=template,
backoff_limit=0)
# Instantiate the job object
job = client.V1Job(
api_version="batch/v1",
kind="Job",
metadata=client.V1ObjectMeta(name=JOB_NAME),
spec=spec)
return job
def create_job(api_instance, job):
# Create job
api_response = api_instance.create_namespaced_job(
body=job,
namespace="default")
print("Job created. status='%s'" % str(api_response.status))
def update_job(api_instance, job):
# Update container image
job.spec.template.spec.containers[0].image = "perl"
# Update the job
api_response = api_instance.patch_namespaced_job(
name=JOB_NAME,
namespace="default",
body=job)
print("Job updated. status='%s'" % str(api_response.status))
def delete_job(api_instance):
# Delete job
api_response = api_instance.delete_namespaced_job(
name=JOB_NAME,
namespace="default",
body=client.V1DeleteOptions(
propagation_policy='Foreground',
grace_period_seconds=5))
print("Job deleted. status='%s'" % str(api_response.status))
def main():
# Configs can be set in Configuration class directly or using helper
# utility. If no argument provided, the config will be loaded from
# default location.
config.load_kube_config()
batch_v1 = client.BatchV1Api()
# Create a job object with client-python API. The job we
# created is same as the `pi-job.yaml` in the /examples folder.
job = create_job_object()
create_job(batch_v1, job)
update_job(batch_v1, job)
delete_job(batch_v1)
if __name__ == '__main__':
main()
I get this error
HTTP response body:
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Job
in version \"v1\" cannot be handled as a Job: v1.Job.Spec:
v1.JobSpec.Template: v1.PodTemplateSpec.Spec: v1.PodSpec.Containers:
[]v1.Container: v1.Container.VolumeMounts: []v1.VolumeMount:
readObjectStart: expect { or n, but found \", error found in #10 byte
of ...|ounts\": [\"/home/user|..., bigger context ...| \"image\":
\"perl\", \"name\": \"pi\", \"volumeMounts\": [\"/home/user/data\"]}],
\"restartPolicy\": \"Never\"}}}}|...","reason":"BadRequest","code":400
What am I missing here?
Is there another way to expose data to the job?
Edit: trying to use client.V1VolumeMount
I am trying to add this code, and to pass the mount object into different init functions, e.g.:
mount = client.V1VolumeMount(mount_path="/data", name="shai")
client.V1Container
client.V1PodTemplateSpec
client.V1JobSpec
client.V1Job
under multiple keywords, and it all results in errors. Is this the correct object to use? How shall I use it, if at all?
Edit: trying to pass volume_mounts as a list with the following code suggested in the answers:
def create_job_object():
# Configureate Pod template container
container = client.V1Container(
name="pi",
image="perl",
volume_mounts=["/home/user/data"],
command=["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"])
# Create and configurate a spec section
template = client.V1PodTemplateSpec(
metadata=client.V1ObjectMeta(labels={
"app": "pi"}),
spec=client.V1PodSpec(restart_policy="Never",
containers=[container]))
# Create the specification of deployment
spec = client.V1JobSpec(
template=template,
backoff_limit=0)
# Instantiate the job object
job = client.V1Job(
api_version="batch/v1",
kind="Job",
metadata=client.V1ObjectMeta(name=JOB_NAME),
spec=spec)
return job
And I am still getting a similar error:
kubernetes.client.rest.ApiException: (422) Reason: Unprocessable
Entity HTTP response headers: HTTPHeaderDict({'Content-Type':
'application/json', 'Date': 'Tue, 06 Aug 2019 06:19:13 GMT',
'Content-Length': '401'}) HTTP response body:
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Job.batch
\"pi\" is invalid:
spec.template.spec.containers[0].volumeMounts[0].name: Not found:
\"d\"","reason":"Invalid","details":{"name":"pi","group":"batch","kind":"Job","causes":[{"reason":"FieldValueNotFound","message":"Not
found:
\"d\"","field":"spec.template.spec.containers[0].volumeMounts[0].name"}]},"code":422}
The V1Container call expects a list of V1VolumeMount objects for the volume_mounts parameter, but you passed in a list of strings:
Code:
def create_job_object():
    volume_mount = client.V1VolumeMount(
        mount_path="/home/user/data",
        name="data-volume"  # required; must match a V1Volume declared in the pod spec
        # other optional arguments, see the volume mount doc link below
    )
    # Configure Pod template container
    container = client.V1Container(
        name="pi",
        image="perl",
        volume_mounts=[volume_mount],
        command=["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"])
    # Create and configure a spec section
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": "pi"}),
        spec=client.V1PodSpec(restart_policy="Never",
                              containers=[container]))
    ....
references:
https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1Container.md
https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/V1VolumeMount.md
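For the complete picture (not shown above), the mount also needs a matching V1Volume in the pod spec. Below is a minimal sketch using a hostPath volume for a local Minikube; the names and paths here are illustrative, not prescriptive:
from kubernetes import client
def create_job_object_with_volume():
    volume = client.V1Volume(
        name="data-volume",
        host_path=client.V1HostPathVolumeSource(path="/home/user/data"))
    volume_mount = client.V1VolumeMount(
        name="data-volume",  # must match the volume name above
        mount_path="/data")  # where the data appears inside the container
    container = client.V1Container(
        name="pi",
        image="perl",
        volume_mounts=[volume_mount],
        command=["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"])
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": "pi"}),
        spec=client.V1PodSpec(restart_policy="Never",
                              containers=[container],
                              volumes=[volume]))
    spec = client.V1JobSpec(template=template, backoff_limit=0)
    return client.V1Job(
        api_version="batch/v1",
        kind="Job",
        metadata=client.V1ObjectMeta(name="pi"),
        spec=spec)
Note that with Minikube a hostPath refers to the Minikube node's filesystem, so the local directory may first have to be exposed to the node (for example via minikube mount).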
I have a module that needs to pull new variable values from the web, about once a week. I could place those values in a file and load them on startup. Or, a simpler solution would be to simply auto-update the code.
Is this possible in Python?
Something like this...
def self_updating_module_template():
dynamic_var1 = {'dynamic var1'} # some kind of place holder tag
dynamic_var2 = {'dynamic var2'} # some kind of place holder tag
return
def self_updating_module():
dynamic_var1 = 'old data'
dynamic_var2 = 'old data'
return
def updater():
new_data_from_web = ''
new_dynamic_var1 = new_data_from_web # Makes API call. gets values.
new_dynamic_var2 = new_data_from_web
# loads self_updating_module_template
dynamic_var1 = new_dynamic_var1
dynamic_var2 = new_dynamic_var2
# replace module place holders with new values.
# overwrite self_updating_module.py.
return
I would recommend that you use configparser and a set of default values located in an ini-style file.
The ConfigParser class implements a basic configuration file parser
language which provides a structure similar to what you would find on
Microsoft Windows INI files. You can use this to write Python programs
which can be customized by end users easily.
Whenever the configuration values are updated from the web API endpoint, configparser also lets us write them back out to the configuration file. That said, be careful! The reason most people recommend that configuration files be set at build/deploy time and not at run time is security/stability: you have to lock down the endpoint that allows updates to your running configuration in production, and you need some way to verify any configuration value updates before they are picked up by your application:
import configparser
filename = 'config.ini'
def load_config():
config = configparser.ConfigParser()
config.read(filename)
if 'WEB_DATA' not in config:
config['WEB_DATA'] = {'dynamic_var1': 'dynamic var1', # some kind of place holder tag
'dynamic_var2': 'dynamic var2'} # some kind of place holder tag
return config
def update_config(config):
new_data_from_web = ''
new_dynamic_var1 = new_data_from_web # Makes API call. gets values.
new_dynamic_var2 = new_data_from_web
config['WEB_DATA']['dynamic_var1'] = new_dynamic_var1
config['WEB_DATA']['dynamic_var2'] = new_dynamic_var2
def save_config(config):
with open(filename, 'w') as configfile:
config.write(configfile)
Example usage:
# Load the configuration
config = load_config()
# Get new data from the web
update_config(config)
# Save the newly updated configuration back to the file
save_config(config)
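The module itself would then read its values from the loaded configuration rather than from hard-coded variables, for example:
config = load_config()
dynamic_var1 = config['WEB_DATA']['dynamic_var1']
dynamic_var2 = config['WEB_DATA']['dynamic_var2']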