How to combine YAML files in Python?

I have some Kubernetes YAML files which I need to combine.
For that, I tried using Python.
The second file, sample.yaml, should be merged into the first file, source.yaml.
The source.yaml file has one section, sample:, where the complete sample.yaml should be added.
I tried the code below:
# pip install pyyaml
import yaml

def yaml_loader(filepath):
    # Loads a yaml file
    with open(filepath, 'r') as file_descriptor:
        data = yaml.load(file_descriptor)
        return data

def yaml_dump(filepath, data):
    with open(filepath, "w") as file_descriptor:
        yaml.dump(data, file_descriptor)

if __name__ == "__main__":
    file_path1 = "source.yaml"
    data1 = yaml_loader(file_path1)

    file_path2 = "sample.yaml"
    with open(file_path2, 'r') as file2:
        sample_yaml = file2.read()

    data1['data']['sample'] = sample_yaml
    yaml_dump("temp.yml", data1)
This creates a new file temp.yml, but instead of line breaks it saves \n as literal strings.
How can I fix this?

Your original YAML may have issues. If you use VS Code, reformat your YAML file: click [Spaces] at the bottom of the VS Code window and select "Convert Indentation to Spaces".
You can also check whether the YAML module has an indentation option that can be configured when dumping the file.
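As a side note on the question itself: the literal \n usually appears because sample.yaml is read as one long string and then dumped as a single scalar value. A minimal sketch of an alternative, assuming sample.yaml contains a single YAML document and source.yaml really has a data: section with a sample: key, is to parse the second file first so it gets nested as structured YAML rather than stored as a raw string:

import yaml

# Parse both files into Python dictionaries (safe_load avoids arbitrary object construction).
with open("source.yaml", "r") as f:
    source = yaml.safe_load(f)
with open("sample.yaml", "r") as f:
    sample = yaml.safe_load(f)

# Nest the parsed sample under data -> sample instead of assigning a raw string.
source["data"]["sample"] = sample

# default_flow_style=False keeps the block (indented) style in the output.
with open("temp.yml", "w") as f:
    yaml.dump(source, f, default_flow_style=False)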


creating and using a preferences file in python

Brand new to Stack Overflow and Python; hopefully someone wiser than myself can help. I have searched up and down and can't seem to find an actual answer to this; apologies if there is an exact answer and I've missed it (the few I've found are either old or don't seem to work).
The closest I've found is
Best way to retrieve variable values from a text file?
Alas, imp seems to be deprecated, and I tried figuring out importlib, but it is a little above my current level to adapt it, as errors throw up left and right on me.
That question is very close to what I want and could potentially work if someone can help update it with newer methods, but it still doesn't cover how to overwrite the old variable.
Scenario:
I would like to create a preferences file (let's call it settings.txt or settings.py; it doesn't need to be cross-compatible with other languages, but for some reason I'd prefer txt - any preference/standard that coders can impart would be appreciated).
# settings.txt
water_type = "Fresh"
measurement = "Metric"
colour = "Blue"
location = "Bottom"
...
I am creating a script main_menu.py which will read the variables in settings.txt and write to this file if changes are 'saved'.
For example:
"Select water type:"
Fresh
Salt
If water_type is the same as in settings.txt, do nothing;
if water_type is different, overwrite the variable in the settings.txt file.
Other scripts down the line will also read and write to this settings file.
I've seen:
from settings import *
which seems to work for reading the file if I go the settings.py route, but it still leaves me wondering how to overwrite the values.
I'm also open to any better/standard ideas you can think of.
Appreciate any help on this!
Here are some suggestions that may help you:
Use a json file:
settings.json
{
    "water_type": "Fresh",
    "measurement": "Metric",
    "colour": "Blue",
    "location": "Bottom",
    ...
}
then in python:
import json

# Load data from the json file
with open("settings.json", "r") as f:
    x = json.load(f)  # x is a python dictionary in this case

# Change water_type in x
x["water_type"] = "Salt"

# Save changes
with open("settings.json", "w") as f:
    json.dump(x, f, indent=4)
Use a yaml file: (edit: you will need to install pyyaml)
settings.yaml
water_type: Fresh
measurement: Metric
colour: Blue
location: Bottom
...
then in python:
import yaml

# Load data from the yaml file
with open("settings.yaml", "r") as f:
    x = yaml.load(f, Loader=yaml.FullLoader)  # x is a python dictionary in this case

# Change water_type in x
x["water_type"] = "Salt"

# Save changes
with open("settings.yaml", "w") as f:
    yaml.dump(x, f)
Use an INI file:
settings.ini
[Preferences]
water_type=Fresh
measurement=Metric
colour=Blue
location=Bottom
...
then in python:
import configparser

# Load data from the ini file
config = configparser.ConfigParser()
config.read('settings.ini')

# Change water_type in config
config["Preferences"]["water_type"] = "Salt"

# Save changes
with open("settings.ini", "w") as f:
    config.write(f)
.py config files are usually used for static options or settings.
Ex.
# config.py
STRINGTOWRITE01 = "Hello, "
STRINGTOWRITE02 = "World!"
LINEENDING = "\n"
It would be hard to save changes made to the settings in such a format.
I'd recommend a JSON file.
Ex. settings.json
{
    "MainSettings": {
        "StringToWrite": "Hello, World!"
    }
}
To read the settings from this file into a Python Dictionary, you can use this bit of code.
import json  # Import Python's JSON library

JSON_FILE = open('settings.json', 'r').read()  # Open the file with read permissions, then read it.
JSON_DATA = json.loads(JSON_FILE)  # Load the raw text from the file into a json object (a dictionary).
print(JSON_DATA["MainSettings"]["StringToWrite"])  # Access the 'StringToWrite' variable, just as you would with a dictionary.
To write to the settings.json file you can use this bit of code
import json  # Import Python's JSON library

JSON_FILE = open('settings.json', 'r').read()  # Open the file with read permissions, then read it.
JSON_DATA = json.loads(JSON_FILE)  # Load the data into a json object (a dictionary).
print(JSON_DATA["MainSettings"]["StringToWrite"])  # Print out the StringToWrite "variable"
JSON_DATA["MainSettings"]["StringToWrite"] = "Goodnight!"  # Change the StringToWrite
JSON_DUMP = json.dumps(JSON_DATA)  # Turn the json object (dictionary) back into a regular string
JSON_FILE = open('settings.json', 'w')  # Reopen the file, this time with write permissions
JSON_FILE.write(JSON_DUMP)  # Update our settings file by overwriting the previous settings
JSON_FILE.close()  # Close the file handle
Now, I've written this so that it is as easy as possible to understand what's going on. There are better ways to do this with Python Functions.
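For example, here is a rough sketch of how those steps could be wrapped in functions; the file name and function names are just illustrative, not a fixed API:

import json

SETTINGS_FILE = "settings.json"  # hypothetical path; adjust to wherever your settings live

def load_settings():
    # Read the whole settings file into a dictionary.
    with open(SETTINGS_FILE, "r") as f:
        return json.load(f)

def save_settings(settings):
    # Write the dictionary back, pretty-printed.
    with open(SETTINGS_FILE, "w") as f:
        json.dump(settings, f, indent=4)

def set_preference(key, value):
    # Only rewrite the file when the value actually changed.
    settings = load_settings()
    if settings.get(key) != value:
        settings[key] = value
        save_settings(settings)

# Example usage from main_menu.py:
# set_preference("water_type", "Salt")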
You guys are fast! I'm away from the computer for the weekend but had to log in just to say thanks.
I'll look into these more next week when I'm back at it and have some time to give it the attention needed. A quick glance could be a bit of fun to implement and learn a bit more.
Had to post an answer since I could only add a comment on one of your solutions, and I wanted to give a blanket thanks to all!
Cheers
Here's a Python library if you choose to do it this way. If not, this is also a good resource.
Creating a preferences file example:
Writing preferences to a .json file from a Python script:
import json

# Data to be written
dictionary = {
    "name": "sathiyajith",
    "rollno": 56,
    "cgpa": 8.6,
    "phonenumber": "9976770500"
}

# Serializing json
json_object = json.dumps(dictionary, indent=4)

# Writing to sample.json
with open("sample.json", "w") as outfile:
    outfile.write(json_object)
Reading preferences from .json file in Python
import json

# Open and read file content
with open('sample.json') as json_file:
    data = json.load(json_file)

# Print json file
print(data)

How to make Python read and write the same JSON file on each cycle

I'm writing a script in Python with a while True loop; how can I make my script take the same file abc123.json on each cycle and modify some variables in it?
If I understand your question correctly, you want to read a file named abc123.json somewhere on a local hard drive that is accessible via path and modify a value for a key (or more) for that json file, then re-write it.
I'm pasting an example of some code I used a while ago in hopes it helps
import json
from collections import OrderedDict
from os import path

def change_json(file_location, data):
    with open(file_location, 'r+') as json_file:
        # I use OrderedDict to keep the same order of key/values in the source file
        json_from_file = json.load(json_file, object_pairs_hook=OrderedDict)

        for key in json_from_file:
            # make modifications here
            json_from_file[key] = data[key]

        print(json_from_file)

        # rewind to top of the file
        json_file.seek(0)

        # sort_keys keeps the same order of the dict keys to put back to the file
        json.dump(json_from_file, json_file, indent=4, sort_keys=False)

        # just in case your new data is smaller than the older
        json_file.truncate()

# File name
file_to_change = 'abc123.json'

# File path (if the file is not in the script's current working directory). Note the Windows style of paths
path_to_file = 'C:\\test'

# Here we build the full file path
file_full_path = path.join(path_to_file, file_to_change)

# Simple json that matches what I want to change in the file
json_data = {'key1': 'value 1'}

while 1:
    change_json(file_full_path, json_data)

    # let's check if we changed that value now
    with open(file_full_path, 'r') as f:
        if 'value 1' in json.load(f)['key1']:
            print('yay')
            break
        else:
            print('nay')

    # do some other stuff
Observation: the code above assumes that both your file and the json_data share the same keys. If they don't, your function will need to figure out how to match keys between the data structures.
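One simple way to handle that, sketched here as an assumption rather than part of the original answer, is to update only the keys the two dictionaries have in common:

def merge_common_keys(json_from_file, data):
    # Hypothetical helper: update only the keys that exist in both dictionaries,
    # leaving everything else in the file untouched.
    for key in json_from_file.keys() & data.keys():
        json_from_file[key] = data[key]
    return json_from_file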

Removing personal information from the comments in a word file using python

I want to remove all the personal information from the comments inside a word file.
Removing the author's name works fine; I did that using the following:
document = Document('sampleFile.docx')
core_properties = document.core_properties
core_properties.author = ""
document.save('new-filename.docx')
But this is not what I need; I want to remove the name of anyone who commented inside that Word file.
The way we do it manually is by going to Preferences -> Security -> Remove personal information from this file on save.
If you want to remove personal information from the comments in a .docx file, you'll have to dive deep into the file itself.
A .docx is just a .zip archive with Word-specific files. We need to overwrite some of its internal files, and the easiest way I could find is to copy all the files into memory, change whatever we have to change, and write it all to a new file.
import re
import os
from zipfile import ZipFile

docx_file_name = '/path/to/your/document.docx'
files = dict()

# We read all of the files and store them in the "files" dictionary.
document_as_zip = ZipFile(docx_file_name, 'r')
for internal_file in document_as_zip.infolist():
    file_reader = document_as_zip.open(internal_file.filename, "r")
    files[internal_file.filename] = file_reader.readlines()
    file_reader.close()

# We don't need to read anything more, so we close the file.
document_as_zip.close()

# If there are any comments.
if "word/comments.xml" in files.keys():
    # We will be working on the comments file...
    comments = files["word/comments.xml"]
    comments_new = str()

    # File contents have been read as lists of byte strings.
    for comment in comments:
        if isinstance(comment, bytes):
            # Change every author to "Unknown Author".
            comments_new += re.sub(r'w:author="[^"]*"', "w:author=\"Unknown Author\"", comment.decode())

    files["word/comments.xml"] = comments_new

# Remove the old .docx file.
os.remove(docx_file_name)

# Now we want to save the old files to the new archive.
document_as_zip = ZipFile(docx_file_name, 'w')
for internal_file_name in files.keys():
    # Those are lists of byte strings, so we merge them...
    merged_binary_data = str()
    for binary_data in files[internal_file_name]:
        # If the file was not edited (therefore is not the comments.xml file).
        if not isinstance(binary_data, str):
            binary_data = binary_data.decode()
        # Merge file contents.
        merged_binary_data += binary_data

    # We write the old file contents to the new file in the new .docx.
    document_as_zip.writestr(internal_file_name, merged_binary_data)

# Close file for writing.
document_as_zip.close()
The core properties recognised by the CoreProperties class are listed in the official documentation: http://python-docx.readthedocs.io/en/latest/api/document.html#coreproperties-objects
To overwrite all of them, you can set each to an empty string, like the one you used to overwrite the author metadata:
from docx import Document

document = Document('sampleFile.docx')
core_properties = document.core_properties
meta_fields = ["author", "category", "comments", "content_status", "created", "identifier", "keywords", "language", "revision", "subject", "title", "version"]
for meta_field in meta_fields:
    setattr(core_properties, meta_field, "")
document.save('new-filename.docx')

Trying to access a list in a file and use it as a string

I would like to be able to use a list stored in a file to 'load' data into the program.
Notepad file:
savelist = ["Example"]
namelist = ["Example2"]
Python Code:
with open("E:/battle_log.txt", 'rb') as f:
gamesave = savelist[(name)](f)
name1 = namelist [(name)](f)
print ("Welcome back "+name1+"! I bet you missed this adventure!")
f.close()
print savelist
print namelist
I would like this to be the output:
Example
Example2
It looks like you're trying to serialize a program state, then re-load it later! You should consider using a database instead, or even simply pickle:
import pickle

savelist = ["Example"]
namelist = ["Example2"]
obj_to_pickle = (savelist, namelist)

# save data
with open("path/to/savefile.pkl", 'wb') as p:
    pickle.dump(obj_to_pickle, p)

# load data
with open('path/to/savefile.pkl', 'rb') as p:
    obj_from_pickle = pickle.load(p)
    savelist, namelist = obj_from_pickle
There are several options:
Save your notepad file with the .py extension and import it. As long as it contains valid python code, everything will be accessible
Load the text as a string and execute it (e.g., via eval())
Store the information in an easy-to-read configuration file (e.g., YAML) and parse it when you need it (see the sketch after this list)
Precompute the data and store it in a pickle file
The first two are risky if you don't have control over who will provide the file as someone can insert malicious code into the inputs.
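A rough sketch of the third option, assuming a hypothetical settings.yaml holding the same two lists (PyYAML needs to be installed):

import yaml

# settings.yaml is assumed to contain:
# savelist:
#   - Example
# namelist:
#   - Example2

with open("settings.yaml", "r") as f:
    config = yaml.safe_load(f)  # safe_load will not execute arbitrary code

savelist = config["savelist"]
namelist = config["namelist"]
print(savelist[0])   # Example
print(namelist[0])   # Example2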
You could simply import it, as long as the file is saved as example.py in the same folder your program is in. Kinda like this:
import example
or:
from example import *
Then access it in one of two ways. The first one:
print example.savelist[0]
print example.namelist[0]
The second way:
print savelist[0]
print namelist[0]

CSV new-line character seen in unquoted field error

The following code worked until today, when I imported a file from a Windows machine and got this error:
new-line character seen in unquoted field - do you need to open the file in universal-newline mode?
import csv

class CSV:
    def __init__(self, file=None):
        self.file = file

    def read_file(self):
        data = []
        file_read = csv.reader(self.file)
        for row in file_read:
            data.append(row)
        return data

    def get_row_count(self):
        return len(self.read_file())

    def get_column_count(self):
        new_data = self.read_file()
        return len(new_data[0])

    def get_data(self, rows=1):
        data = self.read_file()
        return data[:rows]
How can I fix this issue?
def upload_configurator(request, id=None):
    """
    A view that allows the user to configure the uploaded CSV.
    """
    upload = Upload.objects.get(id=id)
    csvobject = CSV(upload.filepath)

    upload.num_records = csvobject.get_row_count()
    upload.num_columns = csvobject.get_column_count()
    upload.save()

    form = ConfiguratorForm()

    row_count = csvobject.get_row_count()
    column_count = csvobject.get_column_count()
    first_row = csvobject.get_data(rows=1)
    first_two_rows = csvobject.get_data(rows=5)
It would be good to see the csv file itself, but this might work for you; give it a try. Replace:
file_read = csv.reader(self.file)
with:
file_read = csv.reader(self.file, dialect=csv.excel_tab)
Or, open a file with universal newline mode and pass it to csv.reader, like:
reader = csv.reader(open(self.file, 'rU'), dialect=csv.excel_tab)
Or, use splitlines(), like this:
def read_file(self):
    with open(self.file, 'r') as f:
        data = [row for row in csv.reader(f.read().splitlines())]
    return data
I realize this is an old post, but I ran into the same problem and don't see the correct answer, so I will give it a try.
Python Error:
_csv.Error: new-line character seen in unquoted field
This is caused by trying to read Macintosh (pre-OS X formatted) CSV files. These are text files that use CR for the end of line. If using MS Office, make sure you select either plain CSV format or CSV (MS-DOS). Do not use CSV (Macintosh) as the save-as type.
My preferred EOL version would be LF (Unix/Linux/Apple), but I don't think MS Office provides the option to save in this format.
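If re-saving the file is not an option, one workaround (sketched here as a general approach, not part of the original answer; the file name is a placeholder) is to normalize the line endings in Python before handing the text to the csv module:

import csv

with open("data.csv", "r", newline="") as f:   # newline="" leaves line endings untranslated
    text = f.read()

# Normalize CRLF and bare CR to LF so every row ends the same way.
text = text.replace("\r\n", "\n").replace("\r", "\n")

rows = list(csv.reader(text.splitlines()))
print(rows[:2])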
For Mac OS X, save your CSV file in "Windows Comma Separated (.csv)" format.
If this happens to you on mac (as it did to me):
Save the file as CSV (MS-DOS Comma-Separated)
Run the following script
with open(csv_filename, 'rU') as csvfile:
    csvreader = csv.reader(csvfile)
    for row in csvreader:
        print ', '.join(row)
Try to run dos2unix on your windows imported files first
This is an error that I faced. I had saved the .csv file in Mac OS X.
While saving, I saved it as "Windows Comma Separated Values (.csv)", which resolved the issue.
This worked for me on OSX.
import csv
# allow strings to be opened as files
from io import StringIO
# library to map other strange (accented) characters back into UTF-8
from unidecode import unidecode

# cleanse input file with Windows formatting to plain UTF-8 string
with open(filename, 'rb') as fID:
    uncleansedBytes = fID.read()

# decode the file using the correct encoding scheme
# (probably this old windows one)
uncleansedText = uncleansedBytes.decode('Windows-1252')

# replace carriage-returns with new-lines
cleansedText = uncleansedText.replace('\r', '\n')

# map any other non UTF-8 characters into UTF-8
asciiText = unidecode(cleansedText)

# read each line of the csv file and store as an array of dicts,
# use first line as field names for each dict.
reader = csv.DictReader(StringIO(asciiText))
for line_entry in reader:
    # do something with your read data
    pass
I know this has been answered for quite some time, but it did not solve my problem. I am using DictReader and StringIO for my csv reading due to some other complications. I was able to solve the problem more simply by replacing the line endings explicitly:
import urllib.request

with urllib.request.urlopen(q) as response:
    raw_data = response.read()
    encoding = response.info().get_content_charset('utf8')
    data = raw_data.decode(encoding)
    if '\r\n' not in data:
        # probably a windows delimited thing... try to update it
        data = data.replace('\r', '\r\n')
Might not be reasonable for enormous CSV files, but worked well for my use case.
Alternative and fast solution: I faced the same error. I reopened the "weird" csv file in GNUMERIC on my Lubuntu machine and exported the file as a csv file. This corrected the issue.
