Convert find words in a string

Convert find words in a string - python

I have command output that looks like this, for example:
Time = XX ,info =XX , mesg=XXXX
{
"description":"add by the customer"
"group": " group black "
"id" :1,
"name": "group_1"
"Num": "No-648747464598"
}
{
"description":"add by the customer"
"group": "group black "
"id" :2,
"name": "group_2"
"Num": "No-7464674846"
}
{
"description":"add by the customer"
"group": " group black "
"id" :3,
"name": "group_3"
"Num": "No-9950509505"
}
How can I use string functions and split so that, given a name, I get the matching Num?
Example:
Function X(group_3):
    return No-9950509505

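You could iterate over the lines, remember when the line containing the requested name has been seen, and return the value from the line that follows it:
def func(x, group_name):
    # x is the command output as a single string
    found = False
    for line in x.splitlines():
        if found:
            # the line right after the matching "name" line holds "Num"
            return line.split(':', 1)[1].replace('"', '').strip()
        if group_name in line:
            found = True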
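The blocks in the sample output are not valid JSON (the commas between most fields are missing), so json.loads would reject them. As an alternative to scanning line by line, here is a minimal sketch that builds a name-to-Num lookup with regular expressions. The names num_by_name and SAMPLE_OUTPUT are made up for illustration, and the sketch assumes every record sits between a { and the next }.

```python
import re

# Shortened copy of the sample output from the question (assumption: real output has the same shape).
SAMPLE_OUTPUT = '''Time = XX ,info =XX , mesg=XXXX
{
"description":"add by the customer"
"group": " group black "
"id" :1,
"name": "group_1"
"Num": "No-648747464598"
}
{
"description":"add by the customer"
"group": " group black "
"id" :3,
"name": "group_3"
"Num": "No-9950509505"
}
'''

def num_by_name(output):
    """Map each "name" value to the "Num" value found in the same {...} block."""
    mapping = {}
    # Each record is the text between a '{' and the next '}'.
    for block in re.findall(r"\{(.*?)\}", output, flags=re.DOTALL):
        name = re.search(r'"name"\s*:\s*"([^"]*)"', block)
        num = re.search(r'"Num"\s*:\s*"([^"]*)"', block)
        if name and num:
            mapping[name.group(1).strip()] = num.group(1).strip()
    return mapping

lookup = num_by_name(SAMPLE_OUTPUT)
print(lookup.get("group_3"))
```

Building the dictionary once is convenient when several names have to be looked up in the same output.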

Related

how to read nested lists information from a Json file using Python

Here is part of my JSON file. I want to read "information" under "runs" -> "results" -> "properties".
I am trying the following:
with open(inputFile, "r") as readFile:
    data = json.load(readFile)
print(type(data))
print("Run data type is: ",type(data['runs']))
#print("properties data type is: ", type(data['runs']['properties']))
# error: print("results data type is: ", type(data['runs']['properties']))TypeError: list indices must be integers or slices, not str
for info in data['runs']:
    res = info.get('results',{})
    #res = info.get('results', {}).get('properties', None)
    #Error: AttributeError: 'list' object has no attribute 'get'
    #inf = info.get('properties')
    print(res)
None of the commented-out lines work; I have added the error messages next to them.
How can I read "information" in a loop?
{
"$schema" : "https://schemastore.azurewebsites.net/schemas/json/sarif-2.1.0-rtm.4.json",
"version" : "2.1.0",
"runs" : [ {
"tool" : { ...},
"artifacts" : [ ...],
"results" : [ {
"ruleId" : "DECL_MISMATCH",
"ruleIndex" : 0,
"message" : {
"text" : "XXXXX"
},
"level" : "error",
"baselineState" : "unchanged",
"rank" : 100,
"kind" : "fail",
"properties" : {
"tags" : [ "databaseId", "metaFamily", "family", "group", "information", "severity", "status", "comment", "justified", "assignedTo", "ticketKey", "color" ],
"databaseId" : 54496,
"metaFamily" : "Defect",
"family" : "Defect",
"group" : "Programming",
"information" : "Impact: High",
"severity" : "Unset",
"status" : "Unreviewed",
"comment" : "",
"justified" : false,
"color" : "RED"
},
"locations" : [ {
"physicalLocation" : {
"artifactLocation" : {
"index" : 0
}
},
"logicalLocations" : [ {
"fullyQualifiedName" : "File Scope",
"kind" : "function"
} ]
} ]
} ]
} ]
}

The key properties sits inside a list (results), so you have to index into that list first. In the JSON you posted, the index can be 0. The code should probably look like this:
with open(inputFile, "r") as readFile:
    data = json.load(readFile)
print(type(data))
print("Run data type is: ",type(data['runs']))
#print("properties data type is: ", type(data['runs']['properties']))
# error: print("results data type is: ", type(data['runs']['properties']))TypeError: list indices must be integers or slices, not str
for info in data['runs']:
    # res = info.get('results',{})
    res = info.get('results', {})[0].get('properties', None)
    #inf = info.get('properties')
    print(res)

for run in data['runs']:
    for result in run['results']:
        properties = result['properties']
        print("information = {}".format(properties['information']))
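Both answers raise an exception when a run has no "results" or a result has no "properties". A small sketch of a more tolerant loop follows; iter_information is a hypothetical helper name, and the second result (ruleId "OTHER") is invented purely to show a record without properties.

```python
import json

def iter_information(data):
    """Yield every properties["information"] value found under runs -> results."""
    for run in data.get("runs", []):
        for result in run.get("results", []):
            # "properties" or "information" may be absent; skip such results.
            info = result.get("properties", {}).get("information")
            if info is not None:
                yield info

# Trimmed-down stand-in for the SARIF file in the question (assumption).
sample = json.loads('''
{"runs": [{"results": [
    {"ruleId": "DECL_MISMATCH", "properties": {"information": "Impact: High"}},
    {"ruleId": "OTHER"}
]}]}
''')

for information in iter_information(sample):
    print("information = {}".format(information))
```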

Decide JSON data python parsing flat

I have been stuck on this problem for a while and can't find a solution.
Example: I have JSON like this:
{
"SECTION": {
"ID": 1,
"COMMENT" : "foo bar ",
"STRUCTURE" : {
"LIEN" : [
{
"from": "2020-01-01",
"to": "2020-01-03"
},
{
"from": "2020-01-04",
"to": "2999-01-07"
}
]
},
"CONTEXTE":{
"NATURE": {
"text": "lorem smdlk fjq lsjdf mqjsh dflkq hs dfhkq g"
}
}
}
}
I would like to get output like this, for example:
{
"SECTION.ID": 1,
"SECTION.COMMENT": "foo bar ",
"SECTION.STRUCTURE.LIEN.from": "2020-01-01",
"SECTION.STRUCTURE.LIEN.to": "2020-01-03",
"SECTION.CONTEXTE.NATURE.text": "lorem smdlk fjq lsjdf mqjsh dflkq hs dfhkq g"
}
{
"SECTION.ID": 1,
"SECTION.COMMENT": "foo bar ",
"SECTION.STRUCTURE.LIEN.from": "2020-01-04",
"SECTION.STRUCTURE.LIEN.to": "2999-01-07",
"SECTION.CONTEXTE.NATURE.text": "lorem smdlk fjq lsjdf mqjsh dflkq hs dfhkq g"
}
Does anyone have any idea how I can do this in python? Thank you so much

I suggest you use the json Python module to convert the JSON object to a Python object. Then you can use recursion. If you are using Python 3.5 or later, the following code could be a good starting point:
import json

def flatten_helper(prefix, list_of_dict):
    # prepend "prefix." to every key of every dict in the list
    res = []
    for i in list_of_dict:
        res_dict = {}
        for k, v in i.items():
            res_dict['.'.join([prefix, k])] = v
        res.append(res_dict)
    return res

def flatten(x):
    if isinstance(x, list):
        # each list element produces its own flat record(s)
        res = []
        for ele in x:
            res = res + flatten(ele)
        return res
    else:
        res = [{}]
        for k, v in x.items():
            if isinstance(v, (dict, list)):
                # prefix only the keys that come from the nested value,
                # not the keys already collected at this level
                tempo = flatten_helper(k, flatten(v))
                new_res = []
                for r in res:
                    for t in tempo:
                        new_res.append({**r, **t})
                res = new_res
            else:
                for i, val in enumerate(res):
                    res[i][k] = v
        return res

jsonobj = '{"SECTION": {"ID": 1, "COMMENT" : "foo bar ", "STRUCTURE" : { "LIEN" : [{"from": "2020-01-01", "to": "2020-01-03"}, {"from": "2020-01-04", "to": "2999-01-07" }]}, "CONTEXTE":{"NATURE": {"text": "lorem smdlk fjq lsjdf mqjsh dflkq hs dfhkq g"}}}}'
pyobj = json.loads(jsonobj)
res = flatten(pyobj)
print(json.dumps(res, indent=4))

Sounds like a classic case for recursion; aggregate path until you reach a "simple" value then write the pair "aggregated path"."key" : "value"
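One way to read that suggestion, sketched under a few assumptions: the path is passed down as a string, a simple value ends the recursion with a single {path: value} pair, and a list means "one output record per element". The name flatten_paths is made up for this illustration, and f-strings require Python 3.6 or later.

```python
import json
from itertools import product

def flatten_paths(value, path=""):
    """Return a list of flat dicts keyed by dotted paths."""
    if isinstance(value, dict):
        # Flatten every child under its extended path, then combine
        # one alternative from each child into a single record.
        parts = [flatten_paths(v, f"{path}.{k}" if path else k)
                 for k, v in value.items()]
        records = []
        for combo in product(*parts):
            merged = {}
            for piece in combo:
                merged.update(piece)
            records.append(merged)
        return records
    if isinstance(value, list):
        # Each list element is an alternative record under the same path.
        records = []
        for item in value:
            records.extend(flatten_paths(item, path))
        return records
    # Simple value: write the aggregated path and the value.
    return [{path: value}]

doc = json.loads('{"SECTION": {"ID": 1, "COMMENT": "foo bar ", '
                 '"STRUCTURE": {"LIEN": [{"from": "2020-01-01", "to": "2020-01-03"}, '
                 '{"from": "2020-01-04", "to": "2999-01-07"}]}, '
                 '"CONTEXTE": {"NATURE": {"text": "lorem"}}}}')
flat = flatten_paths(doc)
print(json.dumps(flat, indent=4))
```

Note that an empty list anywhere in the input yields no records at all with this approach, which may or may not be what is wanted.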

Convert string into Dict / JSON

I'm trying to convert the following piece of data into a dict so I can have key:value pair and have them nested :
data="Group A\n name : rey\n age : 6\n status : active\n in role : 201\n weight : 25\n interests\n Out Side : 16016\n In Side : 0\n Out : 2804\n\n name : dan\n age : 5\n status : inactive\n in role : 201\n weight : 18\n interests\n Out Side : 16016\n In Side : 0\n Out : 2804\n\n"
Problem is, not all of them have : and some of them are supposed to be grouped together (i.e part of Group A). \n line break definitely helps me in separating them.
Here's the result I'm hoping to get :
[
{
"Group A":
[
{"name": "rey", "age": "6", "status": "active"},
{"name": "dan", "age": "5", "status": "inactive"}
]
}
]
What do I have right now? I've separated some of them into a dict with:
result = {}
for row in data.split('\n'):
    if ':' in row:
        key, value = row.split(':')
        result[key] = value.strip()
This gives me:
{' name ': 'dan', ' weight ': '18', ' In Side ': '0', ' Out ': '2804', ' in role ': '201', ' status ': 'inactive', ' Out Side ': '16016', ' age ': '5'}
But this loses the order in which the data is shown above, and not all of it comes out.
I'm capturing this data from an external program, so I'm limited to Python 2.7. Any ideas would be super helpful!

Look for the pattern in your data:
Records are separated by empty lines.
Inner data lines start with a space.
Using those observations, you can write:
def parse(data):
    aa = {}
    curRecord = None
    curGroup = None
    for row in data.split('\n'):
        if row.startswith(' '):
            # this is a new key in the inner record
            if ':' in row:
                if curRecord is None:
                    curRecord = {}
                    curGroup.append(curRecord)
                key, value = row.split(':')
                curRecord[key.strip()] = value.strip()
        elif row == '':
            # this signals a new inner record
            curRecord = None
        else:
            # otherwise, it is a new group
            curRecord = None
            curGroup = []
            aa[row.strip()] = curGroup
    return aa
>>> import json
>>> print( json.dumps(parse(data)) );
{"Group A": [{"name": "rey", ... }, {"name": "dan", ... }]}

I would use setdefault to create a new list for each group. This also works for multiple groups, in case you have Group B, C, D...
import json
def convert_string(data):
    result = {}
    current = ""
    key_no = -1
    for pair in data.split("\n"):
        if ':' not in pair:
            if "Group" in pair:
                # start a new group and reset the record index
                result.setdefault(pair,[])
                current = pair
                key_no = -1
        elif "name" in pair:
            # a "name" line starts a new record in the current group
            result[current].append({})
            key_no +=1
            key, value = pair.split(':')
            result[current][key_no][key.strip()] = value.strip()
        elif all(s not in pair.lower() for s in ("weight","in role","out","in side")):
            # keep the remaining fields except the unwanted ones
            key, value = pair.split(':')
            result[current][key_no][key.strip()] = value.strip()
    return result
print (json.dumps(convert_string(data),indent=4))
{
"Group A": [
{
"name": "rey",
"age": "6",
"status": "active"
},
{
"name": "dan",
"age": "5",
"status": "inactive"
}
]
}
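Since the question mentions both the lost key order and the Python 2.7 restriction, here is a hedged sketch that uses collections.OrderedDict, which preserves insertion order on 2.7 as well as on Python 3. The function name parse_ordered and its wanted parameter are assumptions made for this example, not part of either answer.

```python
from collections import OrderedDict
import json

def parse_ordered(data, wanted=("name", "age", "status")):
    """Group records under their heading, keeping only the wanted keys in input order."""
    groups = OrderedDict()
    records = None   # list of records for the current group
    record = None    # record currently being filled
    for row in data.split("\n"):
        if not row.strip():
            record = None                      # blank line ends the current record
        elif ":" in row:
            key, value = [part.strip() for part in row.split(":", 1)]
            if key in wanted and records is not None:
                if record is None:
                    record = OrderedDict()
                    records.append(record)
                record[key] = value
        elif not row.startswith(" "):
            # an unindented line without ':' starts a new group
            records = groups.setdefault(row.strip(), [])
            record = None
    return groups

data = ("Group A\n name : rey\n age : 6\n status : active\n in role : 201\n"
        " weight : 25\n interests\n Out Side : 16016\n In Side : 0\n Out : 2804\n\n"
        " name : dan\n age : 5\n status : inactive\n in role : 201\n weight : 18\n"
        " interests\n Out Side : 16016\n In Side : 0\n Out : 2804\n\n")
print(json.dumps(parse_ordered(data), indent=4))
```

Wrapping the result in a list, as in the desired output, is a matter of returning [groups] instead.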

How to search through a nested list with a dictionary in Python?

In Python 3.6, I have a list like the one below and can't figure out how to search through its values properly. Given the search string below, I need to search the values of both title and tags and return the id of the image with the most matches. If several images (ids) have the same number of matches, the one whose title comes first alphabetically should be returned. The search should also be case-insensitive. In the code, search holds my search term; it should return the first id value, but it returns different values instead.
image_info = [
{
"id" : "34694102243_3370955cf9_z",
"title" : "Eastern",
"flickr_user" : "Sean Davis",
"tags" : ["Los Angeles", "California", "building"]
},
{
"id" : "37198655640_b64940bd52_z",
"title" : "Spreetunnel",
"flickr_user" : "Jens-Olaf Walter",
"tags" : ["Berlin", "Germany", "tunnel", "ceiling"]
},
{
"id" : "34944112220_de5c2684e7_z",
"title" : "View from our rental",
"flickr_user" : "Doug Finney",
"tags" : ["Mexico", "ocean", "beach", "palm"]
},
{
"id" : "36140096743_df8ef41874_z",
"title" : "Someday",
"flickr_user" : "Thomas Hawk",
"tags" : ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]
}
]
my_counter = 0
search = "CAT IN BUILding"
search = search.lower().split()
matches = {}
for image in image_info:
    for word in search:
        word = word.lower()
        if word in image["title"].lower().split(" "):
            my_counter += 1
            print(my_counter)
        if word in image["tags"]:
            my_counter +=1
            print(my_counter)
    if my_counter > 0:
        matches[image["id"]] = my_counter
    my_counter = 0

This is a variation of the code in which I pre-index the data before searching. It is a very rudimentary version of how CloudSearch or ElasticSearch would index and search.
import itertools
from collections import Counter
image_info = [
{
"id" : "34694102243_3370955cf9_z",
"title" : "Eastern",
"flickr_user" : "Sean Davis",
"tags" : ["Los Angeles", "California", "building"]
},
{
"id" : "37198655640_b64940bd52_z",
"title" : "Spreetunnel",
"flickr_user" : "Jens-Olaf Walter",
"tags" : ["Berlin", "Germany", "tunnel", "ceiling"]
},
{
"id" : "34944112220_de5c2684e7_z",
"title" : "View from our rental",
"flickr_user" : "Doug Finney",
"tags" : ["Mexico", "ocean", "beach", "palm"]
},
{
"id" : "36140096743_df8ef41874_z",
"title" : "Someday",
"flickr_user" : "Thomas Hawk",
"tags" : ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]
}
]
my_counter = 0
search = "CAT IN BUILding california"
search = set(search.lower().split())
matches = {}
index = {}
# Building a rudimentary search index
for info in image_info:
    bag = info["title"].lower().split(" ")
    tags = [t.lower().split(" ") for t in info["tags"]] # we want to be able to hit "los angeles" as well as "los" and "angeles"
    tags = list(itertools.chain.from_iterable(tags))
    for k in (bag + tags):
        if k in index:
            index[k].append(info["id"])
        else:
            index[k] = [info["id"]]
#print(index)
hits = []
for s in search:
    if s in index:
        hits += index[s]
print(Counter(hits).most_common(1)[0][0])

You are creating a new entry in the dictionary with matches[image["id"]] = my_counter.
If you want to keep only one entry per search term, holding the image id and its count, you can nest the dictionary. I have modified your dict and the condition. Hope it helps.
my_counter = 0
search_term = "CAT IN BUILding"
search = search_term.lower().split()
matches = {}
matches[search_term] = {}
for image in image_info:
    for word in search:
        word = word.lower()
        if word in image["title"].lower().split(" "):
            my_counter += 1
            print(my_counter)
        if word in image["tags"]:
            my_counter +=1
            print(my_counter)
    if my_counter > 0:
        # list() is needed in Python 3, where dict.values() is a view and cannot be indexed
        if not matches[search_term] or my_counter > list(matches[search_term].values())[0]:
            # replace the stored entry so that only the best image is kept
            matches[search_term] = {image["id"]: my_counter}
    my_counter = 0
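None of the snippets above handles the tie-break on title or matches multi-word tags case-insensitively. The following sketch covers the requirements as stated in the question. The function name best_match is made up, and it assumes that a tag such as "Los Angeles" counts one match per word, which the question does not specify.

```python
def best_match(search, images):
    """Return the id with the most word matches; ties go to the alphabetically first title."""
    words = set(search.lower().split())
    scored = []
    for image in images:
        title_words = image["title"].lower().split()
        # Split multi-word tags so "los angeles" can match "los" or "angeles".
        tag_words = [w for tag in image["tags"] for w in tag.lower().split()]
        score = sum(1 for w in title_words + tag_words if w in words)
        if score > 0:
            scored.append((score, image["title"].lower(), image["id"]))
    if not scored:
        return None
    # Highest score first; among equal scores, the smallest title wins.
    return min(scored, key=lambda s: (-s[0], s[1]))[2]

image_info = [
    {"id": "34694102243_3370955cf9_z", "title": "Eastern",
     "tags": ["Los Angeles", "California", "building"]},
    {"id": "37198655640_b64940bd52_z", "title": "Spreetunnel",
     "tags": ["Berlin", "Germany", "tunnel", "ceiling"]},
    {"id": "34944112220_de5c2684e7_z", "title": "View from our rental",
     "tags": ["Mexico", "ocean", "beach", "palm"]},
    {"id": "36140096743_df8ef41874_z", "title": "Someday",
     "tags": ["Los Angeles", "Hollywood", "California", "Volkswagen", "Beatle", "car"]},
]
print(best_match("CAT IN BUILding", image_info))
```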

Stripping auto-generated quotation marks using json.dumps( ) (Python)

I'm calling json.dumps() on an OrderedDict (see the syntax below). It works, but one specific field, "labels", is always wrapped in quotation marks, and I need it not to be.
desiredJson = OrderedDict([('type', ""), ('labels', '' ), ('bgColor', ''), ('borderColor', '')])
for (category_type, updatedLabels, bgColors, borderColors) in zip(type_, labels_, bgColor_, borderColor_):
    print category_type+updatedLabels
    desiredJson["type"] = category_type
    desiredJson["labels"] = '["%s", "%s"]' % (category_type, updatedLabels)
    desiredJson["bgColor"] = bgColors
    desiredJson["borderColor"] = borderColors
    json.dumps(desiredJson, sort_keys = False, indent = 4, separators=(',' , ': '))
Here's what it looks like: (just a sample block, it outputs a lot)
{
"type": "Overall",
"labels": "[\"Overall\", \"Over\"]",
"bgColor": "#ff7f8d",
"borderColor": "darken"
}
I need it to follow this format:
{
"type": "Overall",
"labels": ["Overall", "Over"], // NOTE DIFFERENCE
"bgColor": "#ff7f8d",
"borderColor": "darken"
}
Inserting a list into the dict:
{
"type": "Overall",
"labels": [
"Overall",
"Over"
],
"bgColor": "#ff7f8d",
"borderColor": "darken"
}

This is because you created the element as a string:
desiredJson["labels"] = '["%s", "%s"]' % (category_type, updatedLabels)
If you want it to be an array in the JSON, you should set it to a Python list:
desiredJson["labels"] = [category_type, updatedLabels]
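To see the difference in isolation, here is a small self-contained sketch. The helper build_entry and its parameter names are invented for the example; the values are taken from the sample block in the question.

```python
import json
from collections import OrderedDict

def build_entry(category_type, updated_label, bg_color, border_color):
    """Build one record; "labels" is a real list, so json.dumps emits a JSON array."""
    return OrderedDict([
        ("type", category_type),
        ("labels", [category_type, updated_label]),
        ("bgColor", bg_color),
        ("borderColor", border_color),
    ])

entry = build_entry("Overall", "Over", "#ff7f8d", "darken")
as_json = json.dumps(entry, indent=4, separators=(",", ": "))
print(as_json)

# For comparison: a pre-formatted string is serialized as a JSON string,
# which is where the escaped quotation marks come from.
as_string = json.dumps({"labels": '["%s", "%s"]' % ("Overall", "Over")})
print(as_string)
```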
