Parsing a Fortigate config file with Python

Parsing a Fortigate config file with Python - python

I am basically trying to continue this unanswered question about parsing a fortigate config file.
Reading a fortigate configuration file with Python
The root problem is that this config contains a number of records like this,
edit 1
set srcintf "port26"
set dstintf "port25"
set srcaddr "all"
set dstaddr "all"
set action accept
set utm-status enable
set schedule "always"
set service "ANY"
set av-profile "default"
set nat enable
set central-nat enable
next
I would like to get the output for each acl on a single line so I can import them into a CSV. The problem is that each record can have a variable number of lines, and the indentation shows subsections of the preceding line. The other post does get some of it correctly, but it doesn't handle the indentation. I have come up with some workarounds that replace white spaces with arbitrary characters, but I didn't know if there was a method to read the number tabs/whitespaces and use that to indicate positioning.
Thanks

So I have managed to read your text and turn it into a dictionary in python. It is pretty simple. You basically have to do something along the lines of:
conFigFile=open('./config.txt')
data=dict()
record=0
for line in conFigFile:
if line.find('edit')>=0:
record=int(line.replace('edit',''))
data[record]={}
if line.find('set')>=0:
line=line.replace('set','')
line=line.strip()
print line
key,val=line.split(' ')
data[record][key]=val
conFigFile.close()
This will produce a dictionary which will then allow you to make calls such as:
>>> data[1]['nat']
'enable'
>>> data[1].keys()
['nat', 'service', 'schedule', 'central-nat', 'srcaddr', 'av-profile', 'dstintf', 'srcintf', 'action', 'dstaddr', 'utm-status']
So now it is possible to generate a csv file:
csvFile=open('./data.csv','w')
records=data.keys()
for record in records:
values=data[record].keys()
valList=['Record',str(record)]
for val in values:
valList.append(val)
valList.append(data[record][val])
csvFile.write(",".join(valList))
csvFile.close()
Which produces the csv file:
Record,1,nat,enable,service,"ANY",schedule,"always",central-nat,enable,srcaddr,"all",av-profile,"default",dstintf,"port25",srcintf,"port26",action,accept,dstaddr,"all",utm-status,enable
If you really want to count the spaces before the line, you can do something like the following:
>>> a=' test: one '
>>> a.count(' ') #count all spaces
11
>>> (len(a) - len(a.lstrip())) #count leading spaces
5

Related

Extract text from a config file [duplicate]

This question already has answers here:
Parse key value pairs in a text file
(7 answers)
Closed 1 year ago.
I'm using a config file to inform my Python script of a few key-values, for use in authenticating the user against a website.
I have three variables: the URL, the user name, and the API token.
I've created a config file with each key on a different line, so:
url:<url string>
auth_user:<user name>
auth_token:<API token>
I want to be able to extract the text after the key words into variables, also stripping any "\n" that exist at the end of the line. Currently I'm doing this, and it works but seems clumsy:
with open(argv[1], mode='r') as config_file:
lines = config_file.readlines()
for line in lines:
url_match = match('jira_url:', line)
if url_match:
jira_url = line[9:].split("\n")[0]
user_match = match('auth_user:', line)
if user_match:
auth_user = line[10:].split("\n")[0]
token_match = match('auth_token', line)
if token_match:
auth_token = line[11:].split("\n")[0]
Can anybody suggest a more elegant solution? Specifically it's the ... = line[10:].split("\n")[0] lines that seem clunky to me.
I'm also slightly confused why I can't reuse my match object within the for loop, and have to create new match objects for each config item.

you could use a .yml file and read values with yaml.load() function:
import yaml
with open('settings.yml') as file:
settings = yaml.load(file, Loader=yaml.FullLoader)
now you can access elements like settings["url"] and so on

If the format is always <tag>:<value> you can easily parse it by splitting the line at the colon and filling up a custom dictionary:
config_file = open(filename,"r")
lines = config_file.readlines()
config_file.close()
settings = dict()
for l in lines:
elements = l[:-1].split(':')
settings[elements[0]] = ':'.join(elements[1:])
So, you get a dictionary that has the tags as keys and the values as values. You can then just refer to these dictionary entries in your pogram.
(e.g.: if you need the auth_token, just call settings["auth_token"]

if you can add 1 line for config file, configparser is good choice
https://docs.python.org/3/library/configparser.html
[1] config file : 1.cfg
[DEFAULT] # configparser's config file need section name
url:<url string>
auth_user:<user name>
auth_token:<API token>
[2] python scripts
import configparser
config = configparser.ConfigParser()
config.read('1.cfg')
print(config.get('DEFAULT','url'))
print(config.get('DEFAULT','auth_user'))
print(config.get('DEFAULT','auth_token'))
[3] output
<url string>
<user name>
<API token>
also configparser's methods is useful
whey you can't guarantee config file is always complete

You have a couple of great answers already, but I wanted to step back and provide some guidance on how you might approach these problems in the future. Getting quick answers sometimes prevents you from understanding how those people knew about the answers in the first place.
When you zoom out, the first thing that strikes me is that your task is to provide config, using a file, to your program. Software has the remarkable property of solve-once, use-anywhere. Config files have been a problem worth solving for at least 40 years, so you can bet your bottom dollar you don't need to solve this yourself. And already-solved means someone has already figured out all the little off-by-one and edge-case dramas like stripping line endings and dealing with expected input. The challenge of course, is knowing what solution already exists. If you haven't spent 40 years peeling back the covers of computers to see how they tick, it's difficult to "just know". So you might have a poke around on Google for "config file format" or something.
That would lead you to one of the most prevalent config file systems on the planet - the INI file. Just as useful now as it was 30 years ago, and as a bonus, looks not too dissimilar to your example config file. Then you might search for "read INI file in Python" or something, and come across configparser and you're basically done.
Or you might see that sometime in the last 30 years, YAML became the more trendy option, and wouldn't you know it, PyYAML will do most of the work for you.
But none of this gets you any better at using Python to extract from text files in general. So zooming in a bit, you want to know how to extract parts of lines in a text file. Again, this problem is an age-old problem, and if you were to learn about this problem (rather than just be handed the solution), you would learn that this is called parsing and often involves tokenisation. If you do some research on, say "parsing a text file in python" for example, you would learn about the general techniques that work regardless of the language, such as looping over lines and splitting each one in turn.
Zooming in one more step closer, you're looking to strip the new line off the end of the string so it doesn't get included in your value. Once again, this ain't a new problem, and with the right keywords you could dig up the well-trodden solutions. This is often called "chomping" or "stripping", and with some careful search terms, you'd find rstrip() and friends, and not have to do awkward things like splitting on the '\n' character.
Your final question is about re-using the match object. This is much harder to research. But again, the "solution" wont necessarily show you where you went wrong. What you need to keep in mind is that the statements in the for loop are sequential. To think them through you should literally execute them in your mind, one after one, and imagine what's happening. Each time you call match, it either returns None or a Match object. You never use the object, except to check for truthiness in the if statement. And next time you call match, you do so with different arguments so you get a new Match object (or None). Therefore, you don't need to keep the object around at all. You can simply do:
if match('jira_url:', line):
jira_url = line[9:].split("\n")[0]
if match('auth_user:', line):
auth_user = line[10:].split("\n")[0]
and so on. Not only that, if the first if triggered then you don't need to bother calling match again - it will certainly not trigger any of other matches for the same line. So you could do:
if match('jira_url:', line):
jira_url = line[9:].rstrip()
elif match('auth_user:', line):
auth_user = line[10:].rstrip()
and so on.
But then you can start to think - why bother doing all these matches on the colon, only to then manually split the string at the colon afterwards? You could just do:
tokens = line.rstrip().split(':')
if token[0] == 'jira_url':
jira_url = token[1]
elif token[0] == 'auth_user':
auth_user = token[1]
If you keep making these improvements (and there's lots more to make!), eventually you'll end up re-writing configparse, but at least you'll have learned why it's often a good idea to use an existing library where practical!

Finding lines by keywords and getting the value of said line in python

i am quite new to Python and i would like to ask the following:
Let's say for example i have the following .txt file:
USERNAME -- example.name
SERVER -- server01
COMPUTERNAME -- computer01
I would like to search through this document for the 3 keywords: USERNAME, SERVER and COMPUTERNAME and when I find these, I would like to extract their values, a.i "example.name", "server01" and "computer01" respectively for each line.
Is this possible? I have already tried for looking by line numbers, but I would much prefer searching by keywords instead.
Question 2: Somewhere in this .txt file exists a line with the keyword Adresses: which has multiple values but listed in different lines, like such:
Adresses:
6001:8000:1080:142::12
8002:2013:2380:110::53
9007:2013:2380:117::80
.
.
Would there be any way to get all of the listed addresses as a result, not just the first one? The number of said addresses is dynamic, so it may change in the future.
To this i have honestly no idea how to begin. I appreciate any kind of hints or pointing me in the right direction.
Thank you very much for your time and attention!

Like this:
with open("filename.txt") as f:
for x in f:
a = x.split(" -- ")
print(a[1])

If line with given value always starts with keyword you can try something like this
with open('file.txt', 'r') as file:
for line in file:
if line.startswith('keyword'):
keyword, value = line.split(' -- ')
and to gather all the addresses i'd initiate list of addresses beforehand, then add line
addresses.append(value)
inside of if statement

Your best friend for this kind of task will be str.split function. You can put your data in a dict which will map keywords to values :
data = {} # Create a new dict
with open('data.txt') as file: # Open the file containing the data
lines = file.read().split('\n') # Split the file in lines
for line in lines: # For each line
keyword, value = line.split(' -- ') # Extract keyword and value
data[keyword] = value # Put in the dict
Then, you can access your values with data['USERNAME'] for example. This method will work on any document containing a key-value association on each line (even if you have more than 3 keywords). However, it will not work if the same text file contains the addresses in the format you mentionned.
If you want to include the addresses in the same file, you'll need to adapt the code. You can for example check if splitted_line contains two elements (= key-value on the same line, like USERNAME) or only one (= key-value on multiple lines, like Addresses:). Then, you can store in a list all the different addresses, and bound this list to the data dict. It's not a problem to have a dict in the form :
{
'USERNAME': 'example.name',
'Addresses': ['6001:8000:1080:142::12', '8002:2013:2380:110::53']
}

Formating csv file to utilize zabbix sender

I have a csv file that i need to format first before i can send the data to zabbix, but i enconter a problem in 1 of the csv that i have to use.
This is a example of the part of the file that i have problems:
Some_String_That_ineed_01[ifcontainsCell01removehere],xxxxxxxxx02
Some_String_That_ineed_01vtcrmp01[ifcontainsCell01removehere]-[1],aass
so this is 2 lines from the file, the other lines i already treated.
i need to check if Cell01 is in the line
if 'Cell01' in h: do something.
i need to remove all the content beetwen the [ ] included this [] that contains the Cell01 word and leave only this:
Some_String_That_ineed_01,xxxxxxxxx02
Some_String_That_ineed_01vtcrmp01-[1],aass
else my script already treats quite easy. There must be a better way then what i think which is use h.split in the first [ then split again on the , then remove the content that i wanna then add what is left sums strings. since i cant use replace because i need this data([1]).
later on with the desired result i will add this to zabbix sender as the item and item key_. I already have the hosts,timestamps,values.

You should use a regexp (re module):
import re
s = "Some_String_That_ineed_01[ifcontainsCell01removehere],xxxxxxxxx02"
replaced = re.sub('\[.*Cell01.*\]', '', s)
print (replaced)
will return:
[root#somwhere test]# python replace.py
Some_String_That_ineed_01,xxxxxxxxx02
You can also experiment here.

How to merge few lines with filtering some text

I have a text file with the following format.
The first line includes "USERID"=12345678 and the other lines include the user groups for each application:
For example:
User with user T-number T12345 has WRITE access to the APP1 and APP2 and READ-ONLY access to APP1.
T-Number is just some other kind of ID.
00001, 00002 and so on are sequence numbers and can be ignored.
T12345;;USERID;00001;12345678;
T12345;APPLICATION;WRITE;00001;APP1
T12345;APPLICATION;WRITE;00002;APP2
T12345;APPLICATION;READ-ONLY;00001;APP1
I need to do some filtering and merge the line containing USERID with all the lines having user groups, matching t-number with userid (T12345 = 12345678)
So the output should look like this.
12345678;APPLICATION;WRITE;APP1
12345678;APPLICATION;WRITE;APP2
12345678;APPLICATION;READ-ONLY;APP1
Should I use csv python module to accomplish this?

I do not see any advantage in using the csv module for reading and parsing the input text file. The number of fields varies: 6 fields in the USERID line, with 2 of them empty, but 5 non-empty fields in the other lines. The fields look very simple, so there is no need for csv's handling of the separator character hidden away in quotes and the like. There is no header line as in a csv file, but rather many headers sprinkled in among the data lines.
A simple routine that reads each line, splits each on the semicolon character, and parses the line, and combines related lines would suffice.
The output file is another matter. The lines have the same format, with the same number of fields. So creating that output may be a good use for csv. However, the format is so simple that the file could also be created without csv.

I am not so sure if you should use the csv module here - it has mixed data, possibly more than just users and user group rights? In the case of a user declaration, you only need to retrieve its group and id, while for the application rights you need to extract the group, app name and right. The more differing data you have, the more issues you will encounter - with manual parsing of the data you are always able to just continue when you met certain criterias.
So far i must say you are better off with a manual, line-by-line parsing of the lines, structure it into something meaningful, then output the data. For instance
from StringIO import StringIO
from pprint import pprint
feed = """T12345;;USERID;00001;12345678;
T12345;;USERID;00001;2345678;
T12345;;USERID;00002;345678;
T12345;;USERID;00002;45678;
T12345;APPLICATION;WRITE;00001;APP1
T12345;APPLICATION;WRITE;00002;APP2
T12345;APPLICATION;READ-ONLY;00001;APP1
T12345;APPLICATION;WRITE;00002;APP1
T12345;APPLICATION;WRITE;00002;APP2"""
buf = StringIO(feed)
groups = {}
# Read all data into a dict of dicts
for line in buf:
values = line.strip().split(";")
if values[3] not in groups:
groups[values[3]] = {"users": [], "apps": {}}
if values[2] == "USERID":
groups[values[3]]['users'].append(values[4])
continue
if values[1] == "APPLICATION":
if values[4] not in groups[values[3]]["apps"]:
groups[values[3]]["apps"][values[4]] = []
groups[values[3]]["apps"][values[4]].append(values[2])
print("Structured data with group as root")
pprint(groups)
print("Output data")
for group_id, group in groups.iteritems():
# Order by user, app
for user in group["users"]:
for app_name, rights in group["apps"].iteritems():
for right in rights:
print(";".join([user, "APPLICATION", right, app_name]))
Online demo here

How to create an array or list in a default.conf file

I've inherited a python script that is pulling some variables from a default.conf file which I believe is a Machine configuration file.
One of the parts of the script is pulling a configuration key from the .conf file and expecting there to be a list of possible options however right now there is just one option and I'm unsure of how to make it so there are multiple options.
[syndication]
name = Test Name
title = Test Title
categories = Category 1
So in the above example the config key is syndication and the variable I'm trying to add multiple options to is category.
Thanks!

If there are too few values that fit on one line, I would choose to separate them by commas as exemplified by other fellows, otherwise per RFC822 standard you can split values by lines started by tabs:
settings.conf:
[syndication]
name = Test Name
title = Test Title
categories =
Category 1
Category 2
Category 3
settings.py:
#!/usr/bin/python
import ConfigParser
config = ConfigParser.ConfigParser()
# Reading
config.readfp(open('settings.conf'))
categories = config.get('syndication', 'categories').strip().split('\n')
# Appending
categories.append('Category 4')
# Changing
config.set('syndication', 'categories', '\n' + '\n'.join(categories))
# Storing
config.write(open('settings.conf', 'w'))
Your new settings.conf:
[syndication]
name = Test Name
title = Test Title
categories =
Category 1
Category 2
Category 3
Category 4
Note: You can put a value in the first line after the : or =, but being a list of values, I think starting from the second line is more "readable" when you've to manually edit the file.

I don't think you have a list type in configuration files. However you can do something like comma delimited values:
[syndication]
name = Test Name
title = Test Title
categories = Category 1, Category 2
Then in your code split the values by ,
values = [value.strip() for value in cfg.get('syndication', 'categories').split(',')]

You haven't mentioned how you're reading the .conf file, so I'll assume you're using ConfigParser.
I tried setting categories to a tuple and using ConfigParser.write, and ended up with a string representation of the tuple in the resulting file. This implies to me that ConfigParser doesn't support multiple options.
You can always break up an option manually:
categories = [option.strip() for option in config.get('syndication', 'categories').split(',')]

It is perhaps working the same way as python's logging configuration file format? In this case categories would be a comma separated list (In case of logging configuration files, every item is referring to a specific section in the config file)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.