Python parsing json data - python

I have a JSON object saved inside test_data and I need to know if the string inside test_data['sign_in_info']['package_type'] contains the string "vacation_package". I assumed that the in operator could help, but I'm not sure how to use it properly or whether it's correct to use it here. This is an example of the JSON object:
"checkout_details": {
"file_name" : "pnc04",
"test_directory" : "test_pnc04_package_today3_signedout_noinsurance_cc",
"scope": "wdw",
"number_of_adults": "2",
"number_of_children": "0",
"sign_in_info": {
"should_login": false,
"package_type": "vacation_package"
},
Here package_type has "vacation_package" in it, but that's not always the case.
For now I'm only saving the data this way:
package_type = test_data['sign_in_info']['package_type']
Now, is it ok to do something like:
p = "vacation_package"
if p in package_type:
    ....
Or do I have to use 're' to cut the string and find it that way?

Your answer depends on what exactly you expect to get from test_data['sign_in_info']['package_type']. Will 'vacation_package' always be by itself? Then in is fine. Could it be part of a larger string? Then in still works, because on strings it is a substring test; re.search is only needed when you have to match a pattern, such as whole words only. It is still a good opportunity to practice regular expressions.

No need to use re. Yes, it's okay to do that, but are you checking whether the package type contains vacation_package, possibly among other things, or whether it is exactly vacation_package? If the latter, this might be closer to what you want, as it checks for an exact match (assuming you load the file with the json package):
import json
with open('file.json') as f:
    data = json.load(f)
if data['sign_in_info'].get('package_type') == "vacation_package":
    pass  # do something
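To make the difference between the two checks concrete, here is a minimal sketch. The test_data dictionary and the value "vacation_package_deluxe" are made up for illustration; they are not from the question.

```python
# Hypothetical sample mirroring the structure shown in the question.
test_data = {
    "sign_in_info": {
        "should_login": False,
        "package_type": "vacation_package_deluxe",
    }
}

# .get() returns the default ("" here) instead of raising KeyError
# when the key is absent.
package_type = test_data["sign_in_info"].get("package_type", "")

contains = "vacation_package" in package_type  # substring test
is_exact = package_type == "vacation_package"  # exact match

print(contains, is_exact)  # True False
```

Which of the two you want depends on whether values such as the made-up one above should count as a vacation package.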

Related

using variable (f)-string stored in json

I have a JSON config file where I store the path to my data.
The data is bucketed in month and days, so without the json I would use an f-string like:
spark.read.parquet(f"home/data/month={MONTH}/day={DAY}")
Now I want to read that path from the JSON file. However, I run into problems with the MONTH and DAY variables. I do not want to split the path in the JSON.
But writing it like this:
{
"path":"home/data/month={MONTH}/day={DAY}"
}
and loading with:
DAY="1"
MONTH="12"
conf_path=pandas.read_json("...")
path=conf_path["path"]
data=spark.read.parquet(f"{path}")
does not really work.
Could you hint at a way to retrieve a path with variable elements and fill them in after reading? How would you store or retrieve the path without splitting it? Thanks
------- EDIT: SOLUTION --------
Thanks to Deepak Tripathi's answer below, the solution is to use str.format,
with code like this:
day="1"
month="12"
conf_path=pandas.read_json("...")
path=conf_path["path"]
data=spark.read.parquet(path.format(MONTH=month, DAY=day))
You should use str.format() instead of f-strings.
If you still want to use f-strings, you can use eval like this, but it's unsafe:
import pandas as pd

DAY = "1"
MONTH = "12"
df = pd.DataFrame(
    [
        {"path": "home/data/month={MONTH}/day={DAY}"},
        {"path": "home/data/month={MONTH}/day={DAY}"},
    ]
)
a = df['path'][0]
print(eval(f"f'{a}'"))  # unsafe: eval runs whatever code the string contains
# home/data/month=12/day=1
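As a safer alternative to eval, the str.format approach can be sketched without Spark or pandas. The config_text string and the resolve_path helper are hypothetical names used only for this illustration; in practice the text would come from the JSON config file.

```python
import json

# Hypothetical config text; in practice this would be read from the JSON file.
config_text = '{"path": "home/data/month={MONTH}/day={DAY}"}'
config = json.loads(config_text)


def resolve_path(template, month, day):
    """Fill the {MONTH} and {DAY} placeholders of a path template."""
    return template.format(MONTH=month, DAY=day)


path = resolve_path(config["path"], month="12", day="1")
print(path)  # home/data/month=12/day=1
```

The resulting string can then be handed to whatever reader you use. Unlike eval, str.format only substitutes the named fields and does not execute code embedded in the config.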

JSON object parsing and how to escape unicode characters

I'm fairly new to javascript and such so I don't know if this will be worded correctly, but I'm trying to parse a JSON object that I read from a database. I send the html page the variable from a python script using Django where the variable looks like this:
{
"data":{
"nodes":[
{
"id":"n0",
"label":"Redditor(user_name='awesomeasianguy')"
},
...
]
}
}
Currently, the response looks like:
"{u'data': {u'nodes': [{u'id': u'n0', u'label': u"Redditor(user_name='awesomeasianguy')"}, ...
I tried to take out the characters like u&#39; with a replaceAll-type statement, as seen below. However, this is not an easy solution, and it seems like there has to be a better way to escape those characters.
var networ_json = JSON.parse("{{ networ_json }}".replace(/u&#39;/g, '"').replace(/&#39;/g, '"').replace(/u&quot;/g, '"').replace(/&quot;/g, '"'));
If there are any suggestions on a method I'm not using or even a tool to use for this, it would be greatly appreciated.
Use the template filter "|safe" to disable escaping, like,
var networ_json = JSON.parse("{{ networ_json|safe }}");
Read up on it here: https://docs.djangoproject.com/en/dev/ref/templates/builtins/#safe
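Note that the safe filter only disables HTML escaping. The u'...' prefixes in the response suggest that the variable is the repr of a Python dict rather than JSON, so one possible fix, sketched here under that assumption, is to serialize the data with json.dumps on the Python side before passing it to the template. The network dictionary below is made-up sample data with the same shape as in the question.

```python
import json

# Hypothetical data with the same shape as in the question.
network = {
    "data": {
        "nodes": [
            {"id": "n0", "label": "Redditor(user_name='awesomeasianguy')"},
        ]
    }
}

as_repr = str(network)         # Python literal syntax, not valid JSON
as_json = json.dumps(network)  # valid JSON that a JSON parser can read

print(as_repr)
print(as_json)
```

Passing as_json instead of the dict itself to the template context should give JSON.parse something it can read; how the string is embedded in the page still needs care with quoting.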

Pythonic way to import multiple dictionaries from text file

So I have a text file,
question_one = {question:"what is 2+2", answer: "4", fake1: "5"}
question_two = {question:"what is the meaning of life?", answer:"pizza", fake:"42"}
How can I then import these dictionaries so that I could use them like this,
print(question_one["question"])
print(question_two["question"])
So the outcome would be
what is 2+2
what is the meaning of life?
I would like this so that I can add questions to the text file from within the program and save them as I add more. If this is possible another way, please let me know!
The simplest way would be to store your questions in a JSON file, like @Thom Wiggers is suggesting.
Here's an example:
[
{
"question": "what is 2+2",
"answer": "4",
"fake1": "5"
},
{
"question": "what is the meaning of life?",
"answer": "pizza",
"fake1": "42"
}
]
import json
with open('questions.json') as f:
    questions = json.load(f)
for question in questions:
    print(question['question'])
You can read more about the JSON module in the official documentation.
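Since the question also asks about adding questions from within the program and saving them, here is a minimal sketch of that round trip. The add_question helper and the temporary file location are assumptions made for this example, not part of the original answer.

```python
import json
import os
import tempfile


def add_question(path, question, answer, fake):
    """Append one question to the JSON list stored at `path` and save it."""
    if os.path.exists(path):
        with open(path) as f:
            questions = json.load(f)
    else:
        questions = []
    questions.append({"question": question, "answer": answer, "fake1": fake})
    with open(path, "w") as f:
        json.dump(questions, f, indent=2)
    return questions


# Demonstration in a temporary directory so no real file is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "questions.json")
    add_question(path, "what is 2+2", "4", "5")
    saved = add_question(path, "what is the meaning of life?", "pizza", "42")
    for q in saved:
        print(q["question"])
```

Rewriting the whole file on each save is simple and fine for a small question list; a larger data set might call for a different storage format.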
If you only want to serialize data, you want to use pickle or json. exec will execute all Python code, and can be a serious security problem.
pickle is faster and is specifically tailored to Python, while json can be read and written by just about any programming language and is still fairly human-readable and human-editable.
Now, to answer the question as you asked it (you probably don't want to do this):
You can use exec()
This function supports dynamic execution of Python code. object must
be either a string or a code object. If it is a string, the string is
parsed as a suite of Python statements which is then executed (unless
a syntax error occurs).
i.e.
exec(open('data.txt', 'r').read())
Another way to do it would be to (ab)use import, assuming your file is named data.py:
import data
data.question_one['question']
This is obviously not what import was intended for... I've 'used' import like this in the past, and regretted it (there are a number of caveats, I'll leave it as an exercise to the reader to think about what they might be).
Warning: both are eval-like mechanisms and should be used with care. Any Python code in data.txt will be executed, which is potentially dangerous. Be very sure you trust the source of whatever you pass to exec(), and don't use it if you only want to serialize data (instead of running Python code as such).

XML to store system paths in Python with lxml

I'm using an XML file to store the configuration for a piece of software.
One of these settings would be a system path like
> set_value = "c:\\test\\3 tests\\test"
I can store it by using:
> setting = etree.SubElement(settings, "setting", name=tmp_set_name,
>     type=set_type, value=set_value)
If I use
doc.write(output_file, method='xml',encoding = 'utf-8', compression=0)
the file would be:
<setting type="str" name="MyPath" value="c:\test\3 tests\test"/>
Now I read it again with the etree.parse method. I obtain an etree child object with a string value, but the string contains the \3 character, and if I try to write it to XML again it will be interpreted! So I cannot use it as a path anymore.
Maybe I'm only missing a simple string operation, but I cannot see it =)
How would you solve it in a smart way?
This is an example, but what do you think is the best way to store paths in XML and parse them with lxml?
Thank you!
> Now I read it again with the etree.parse method
> I obtain an etree child object with a string value, but the string contains the \3 character and if i try to use it to write again to xml it will be interpreted !!!!!
I just tried that, and it doesn't get "interpreted". The element's attributes, as returned after parsing, are:
{'type': 'str', 'name': 'yowza!', 'value': 'c:\\test\\3 tests\\test'}
So as you can see, this works just as you expected it to. If you really have this problem, you are doing something other than what you describe. Show us the real code, or make a small example that demonstrates the problem and use that.
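A small self-contained round trip of the kind asked for might look like the sketch below. To keep it runnable without third-party packages it uses the standard library's xml.etree.ElementTree in place of lxml; the calls used here (SubElement, write, parse, get) have lxml counterparts with the same names, but treat that equivalence as an assumption to verify. The setting name "MyPath" is taken from the output shown in the question.

```python
import io
import xml.etree.ElementTree as ET  # stand-in for lxml.etree in this sketch

set_value = "c:\\test\\3 tests\\test"  # a path containing literal backslashes

# Build the document with the path stored as an attribute value.
settings = ET.Element("settings")
ET.SubElement(settings, "setting", name="MyPath", type="str", value=set_value)

# Write to an in-memory buffer instead of a real file, then parse it back.
buffer = io.BytesIO()
ET.ElementTree(settings).write(buffer, encoding="utf-8")
buffer.seek(0)
parsed_root = ET.parse(buffer).getroot()
read_back = parsed_root.find("setting").get("value")

print(read_back == set_value)  # True
```

Backslashes have no special meaning in XML attribute values, so the string comes back unchanged. The doubled backslashes only appear in the repr of the Python string.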

Python: Matching & Stripping port number from socket data

I have data coming in to a Python server via a socket. Within this data is the string '<port>80</port>', or whichever port is being used.
I wish to extract the port number into a variable. The data coming in is not XML, I just used the tag approach to identifying data for future XML use if needed. I do not wish to use an XML python library, but simply use something like regexp and strings.
What would you recommend is the best way to match and strip this data?
I am currently using this code with no luck:
p = re.compile('<port>\w</port>')
m = p.search(data)
print m
Thank you :)
Regex can't parse XML and shouldn't be used to parse fake XML. You should do one of the following:
Use a serialization method that is nicer to work with to start with, such as JSON or an ini file with the ConfigParser module.
Really use XML and not something that just sort of looks like XML and really parse it with something like lxml.etree.
Just store the number in a file if this is the entirety of your configuration. This solution isn't really easier than just using JSON or something, but it's better than the current one.
Implementing a bad solution now for future needs that you have no way of defining or accurately predicting is always a bad approach. You will be kept busy enough trying to write and maintain software now that there is no good reason to try to satisfy unknown future needs. I have never seen a case where "I'll put this in for later" has led to less headache later on, especially when I put it in by doing something completely wrong. YAGNI!
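As to what's wrong with your snippet, other than using an entirely wrong approach: \w matches exactly one character, so a multi-digit port such as 80 between the tags never matches, and there is no capturing group to pull the number out.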
Though Mike Graham is correct, using regex for xml is not 'recommended', the following will work:
(I have defined searchType as 'd' for numerals)
import re

searchType = 'd'
searchStr = 'port'
if searchType == 'd':
    retPattern = r'(<%s>)(\d+)(</%s>)'
else:
    retPattern = r'(<%s>)(.+?)(</%s>)'
searchPattern = re.compile(retPattern % (searchStr, searchStr))
found = searchPattern.search(data)  # data is the string received from the socket
retVal = found.group(2)
(note the complete lack of error checking, that is left as an exercise for the user)
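For completeness, here is one way the missing error checking could look. This is a sketch only: the extract_port helper and PORT_PATTERN name are made up for this example, and it assumes the port always appears as decimal digits directly between the tags.

```python
import re

# Compiled once; the capturing group holds the digits between the tags.
PORT_PATTERN = re.compile(r"<port>(\d+)</port>")


def extract_port(data):
    """Return the port as an int, or None if no <port>...</port> tag is found."""
    match = PORT_PATTERN.search(data)
    if match is None:
        return None
    return int(match.group(1))


print(extract_port("header <port>80</port> trailer"))  # 80
print(extract_port("no port here"))                    # None
```

Returning None for missing data lets the caller decide how to react, rather than failing with an AttributeError on found.group(2).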
