Getting the XML SOAP response instead of raw data - Python

I am trying to get the SOAP response, read a few tags, and then put the keys and values into a dictionary.
Ideally I would use the generated response directly and perform the required operations on it.
But since I was not able to do that, I tried storing the response in an XML file and then using that file for the operations.
My problem is that the response generated is in a raw form. How do I resolve this?
Example:
<medical:totEeCnt val="2" />
<medical:totMbrCnt val="2" />
<medical:totDepCnt val="0" />
import os
import requests
from requests.auth import HTTPBasicAuth
import xml.etree.ElementTree as ET

def soapTest():
    request = """<soapenv:Envelope......."""
    auth = HTTPBasicAuth('', '')
    headers = {'content-type': 'application/soap+xml', 'SOAPAction': "", 'Host': 'bfx-b2b....com'}
    url = "https://bfx-b2b....com/B2BWEB/services/IProductPort"
    response = requests.post(url, data=request, headers=headers, auth=auth, verify=True)
    # Open local file
    fd = os.open('planRates.xml', os.O_RDWR | os.O_CREAT)
    # Convert response object into string
    response_str = str(response.content)
    # Write response to the file
    os.write(fd, response_str)
    # Close the file
    os.close(fd)
    tree = ET.parse('planRates.xml')
    root = tree.getroot()
    dict = {}
    print root
    for plan in root.findall('.//{http://services.b2b.../types/rates/dental}dentPln'):
        plan_id = plan.get('cd')
        print plan
        print plan_id
        for rtGroup in plan.findall('.//{http://services.b2b....com/types/rates/dental}censRtGrp'):
            # print rtGroup
            for amt in rtGroup.findall('.//{http://services.b2b....com/types/rates/dental}totAnnPrem'):
                # print amt
                print amt.get('val')
                amount = amt.get('val')
                dict[plan_id] = amount
    print dict
Update:
I did a few things; what I am not able to understand is that, using this, the further operations work:
tree = ET.parse('data/planRates.xml')
root = tree.getroot()
dict = {}
print tree
print root
for plan in root.findall(..
Output:
<xml.etree.ElementTree.ElementTree object at 0x100d7b910>
<Element '{http://schemas.xmlsoap.org/soap/envelope/}Envelope' at 0x101500450>
But after using this, it is not working:
tree = ET.fromstring(response.text)
print tree
for plan in tree.findall(..
Output:
<Element '{http://schemas.xmlsoap.org/soap/envelope/}Envelope' at 0x10d624910>
Basically I am using the same object only .
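(For reference, ET.parse() returns an ElementTree object, on which you call .getroot(), while ET.fromstring() returns the root Element directly; a minimal comparison on a toy document, independent of the SOAP service:)

```python
import io
import xml.etree.ElementTree as ET

xml_text = '<root><child val="1"/></root>'

tree = ET.parse(io.StringIO(xml_text))  # ElementTree instance
root = tree.getroot()                   # root Element
direct = ET.fromstring(xml_text)        # root Element directly, no .getroot()

print(type(tree).__name__)    # ElementTree
print(type(root).__name__)    # Element
print(type(direct).__name__)  # Element
```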

Supposing you get a response that you want as a proper XML object:
rt = resp.text.encode('utf-8')
# printed, rt is something like:
# '<soap:Envelope xmlns:soap="http://...">
#    <AirShoppingRS xmlns="http://www.iata.org/IATA/EDIST" Version="16.1">
#      <Document>...</Document>
#    </AirShoppingRS>
#  </soap:Envelope>'
# stripping the SOAP envelope
startTag = '<AirShoppingRS '
endTag = '</AirShoppingRS>'
trimmed = rt[rt.find(startTag): rt.find(endTag) + len(endTag)]
# parsing
from lxml import etree as et
root = et.fromstring(trimmed)
With this root element you can use the find method, XPath, or whatever you prefer.
Obviously you need to change the start and end tags to extract the correct element from the response, but you get the idea, right?
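For instance, once the target element is extracted and parsed, namespaced lookups work as usual; a small self-contained check using the tags from the sample above (the standard library's xml.etree behaves the same way as lxml here):

```python
import xml.etree.ElementTree as ET

trimmed = ('<AirShoppingRS xmlns="http://www.iata.org/IATA/EDIST" Version="16.1">'
           '<Document>doc</Document></AirShoppingRS>')
root = ET.fromstring(trimmed)

# the default namespace needs an explicit prefix for find()/findall()
ns = {'iata': 'http://www.iata.org/IATA/EDIST'}
print(root.find('iata:Document', ns).text)  # doc
```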

Related

Parsing HTML using LXML Python

I'm trying to parse Oxford Dictionary in order to obtain the etymology of a given word.
import lxml.html
from urllib.request import urlopen

class SkipException(Exception):
    def __init__(self, value):
        self.value = value

try:
    doc = lxml.html.parse(urlopen('https://en.oxforddictionaries.com/definition/%s' % "good"))
except SkipException:
    doc = ''

if doc:
    table = []
    trs = doc.xpath("//div[1]/div[2]/div/div/div/div[1]/section[5]/div/p")
I cannot seem to work out how to obtain the string of text I need. I know I am missing some lines of code beyond the ones I have copied, but I don't fully know how HTML or lxml works. I would much appreciate it if someone could show me the correct way to solve this.
You don't want to do web scraping, especially when practically every dictionary has an API interface. In the case of Oxford, create an account at https://developer.oxforddictionaries.com/, get the API credentials from your account, and do something like this:
import requests
import json

api_base = 'https://od-api.oxforddictionaries.com:443/api/v1/entries/{}/{}'
language = 'en'
word = 'parachute'
headers = {
    'app_id': '',
    'app_key': ''
}
url = api_base.format(language, word)
reply = requests.get(url, headers=headers)
if reply.ok:
    reply_dict = json.loads(reply.text)
    results = reply_dict.get('results')
    if results:
        headword = results[0]
        entries = headword.get('lexicalEntries')[0].get('entries')
        if entries:
            entry = entries[0]
            senses = entry.get('senses')
            if senses:
                sense = senses[0]
                print(sense.get('short_definitions'))
Here's a sample to get you started scraping Oxford dictionary pages:
import lxml.html as lh
from urllib.request import urlopen

url = 'https://en.oxforddictionaries.com/definition/parachute'
html = urlopen(url)
root = lh.parse(html)
body = root.find("body")
elements = body.xpath("//span[@class='ind']")
for element in elements:
    print(element.text)
To find the correct search string you need to format the HTML so you can see the structure. I used the HTML formatter at https://www.freeformatter.com/html-formatter.html. Looking at the formatted HTML, I could see the definitions were in the span elements with the 'ind' class attribute.

How to get a specific value in a JSON string in Python 3.6

I am trying to get a certain value in a string of JSON, but I can't figure out exactly how to do it. I don't want to convert it into a string and strip/replace the unwanted pieces, because then I won't be able to get the other values. My current code is:
import json
import requests

username = "Dextication"
url = f"https://minecraft-statistic.net/api/player/info/{username}/"
response = requests.get(url)
json_data = json.loads(response.text)
print(json_data)
Edit:
When I run this, json_data = "{"status":"ok","data":{"online":0,"total_time_play":46990,"last_play":1513960562,"license":1,"name":"Dextication","uuid":"74d57a754855410c90b3d51bc99b8beb"}}"
I would like to print only the value: 46990
Try the code below:
import json
import requests

username = "Dextication"
url = f"https://minecraft-statistic.net/api/player/info/{username}/"
response = requests.get(url)
json_data = json.loads(response.text)
result = json_data['data']['total_time_play']
print(result)
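Using the sample payload from the question, the lookup can be checked without hitting the API:

```python
import json

payload = ('{"status":"ok","data":{"online":0,"total_time_play":46990,'
           '"last_play":1513960562,"license":1,"name":"Dextication",'
           '"uuid":"74d57a754855410c90b3d51bc99b8beb"}}')
json_data = json.loads(payload)
print(json_data['data']['total_time_play'])  # 46990
```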

How to extract one value from an XML URL using xml.etree

I'm trying to print the value of just one field of an XML tree. Here is the XML tree (for example) that I get when I request it:
<puco>
<resultado>OK</resultado>
<coberturaSocial>O.S.P. TIERRA DEL FUEGO(IPAUSS)</coberturaSocial>
<denominacion>DAMIAN GUTIERREZ DEL RIO</denominacion>
<nrodoc>32443324</nrodoc>
<rnos>924001</rnos>
<tipodoc>DNI</tipodoc>
</puco>
Now, I just want to print the "coberturaSocial" value. Here is the request that I have in my views.py:
def get(request):
    r = requests.get('https://sisa.msal.gov.ar/sisa/services/rest/puco/38785898')
    dom = r.content
    asd = etree.fromstring(dom)
If I print "asd" I get this error: "The view didn't return an HttpResponse object. It returned None instead."
and I also get this in the console.
I just want to print coberturaSocial. Please help, I'm new to XML parsing!
You need to extract the contents of the tag and then return it wrapped in a response, like so:
return HttpResponse(asd.find('coberturaSocial').text)
I'm guessing etree is import xml.etree.ElementTree as etree
You can use:
text = r.content
dom = etree.fromstring(text)
el = dom.find('coberturaSocial')
el.text # this is where the string is
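Using the sample document from the question, the lookup can be verified standalone:

```python
import xml.etree.ElementTree as ET

xml_text = """<puco>
  <resultado>OK</resultado>
  <coberturaSocial>O.S.P. TIERRA DEL FUEGO(IPAUSS)</coberturaSocial>
  <nrodoc>32443324</nrodoc>
</puco>"""
dom = ET.fromstring(xml_text)
print(dom.find('coberturaSocial').text)  # O.S.P. TIERRA DEL FUEGO(IPAUSS)
```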

Trying to Parse SOAP Response in Python

I'm struggling to find a way to parse the data that I'm getting back from a SOAP response. I'm only familiar with Python (v3.4), but relatively new to it. I'm using suds-jurko to pull the data from a 3rd party SOAP server. The response comes back in the form of "ArrayOfXmlNode". I've tried using ElementTree in different ways to parse the data, but I either get no information or I get "TypeError: invalid file: (ArrayOfXmlNode)" errors. Googling how to handle the ArrayOfXMLNode type response has gotten me nowhere.
The first part of the SOAP response is:
(ArrayOfXmlNode){
XmlNode[] =
(XmlNode){
Hl =
(Hl){
ID = "22437790"
Name = "Cameron"
SpeciesID = "1"
Sex = "Male"
PrimaryBreed = "German Shepherd"
SecondaryBreed = "Mix"
SN = ""
Age = "35"
OnHold = "No"
Location = "Foster Home"
BehaviorResult = ""
Photo = "http://sms.petpoint.com/sms/photos/615/123.jpg"
}
},
I've tried iterating through the data with code similar to:
from suds.client import Client

url = 'http://qag.petpoint.com/webservices/AdoptableSearch.asmx?WSDL'
client = Client(url)
result = client.service.adoptableSearchExtended('nunya', 0, 'A', 'All', 'N')
tree = result[0]
for node in tree:
    pet_info = []
    pet_info.extend(node)
    print(pet_info)
The code above gives me the entire response in "result[0]". Below that I try to create a list from the data, but I only get the very last node (a node being one set of information from ID to Photo). Attempts to modify this approach give me either everything, nothing, or only the last node.
So then I tried to make use of ElementTree with simple code to test it out, but I only get the "invalid file" errors.
import xml.etree.ElementTree as ET
from suds.client import Client
url = 'http://qag.petpoint.com/webservices/AdoptableSearch.asmx?WSDL'
client = Client(url)
result = client.service.adoptableSearchExtended('nunya', 0, 'A', 'All', 'N')
pet_info = ET.parse(result)
print(pet_info)
The result:
Traceback (most recent call last):
File "D:\Python\Eclipse Workspace\KivyTest\src\root\nested\Parse.py", line 11, in <module>
pet_info = ET.parse(result)
File "D:\Programs\Python34\lib\xml\etree\ElementTree.py", line 1186, in parse
tree.parse(source, parser)
File "D:\Programs\Python34\lib\xml\etree\ElementTree.py", line 587, in parse
source = open(source, "rb")
TypeError: invalid file: (ArrayOfXmlNode){
XmlNode[] =
(XmlNode){
Hl =
(Hl){
ID = "20840097"
Name = "Daisy"
SpeciesID = "1"
Sex = "Female"
PrimaryBreed = "Terrier, Pit Bull"
SecondaryBreed = ""
SN = ""
Age = "42"
OnHold = "No"
Location = "Dog Adoption"
BehaviorResult = ""
Photo = "http://sms.petpoint.com/sms/photos/615/40f428de-c015-4334-9101-89c707383817.jpg"
}
},
Can someone get me pointed in the right direction?
I had a similar problem parsing data from a web service using Python 3.4 and suds-jurko. I was able to solve the issue using the code in this post, https://stackoverflow.com/a/34844428/5874347. I used the fastest_object_to_dict function to convert the web service response into a dictionary. From there you can parse the data ...
Add the fastest_object_to_dict function to the top of your file
Make your web service call
Create a new variable to save the dictionary response to
result = client.service.adoptableSearchExtended('nunya', 0, 'A', 'All', 'N')
ParsedResponse = fastest_object_to_dict(result)
Your data will now be in the form of a dictionary; you can parse the dictionary on the Python side as needed, or send it back to your AJAX call via JSON and parse it with JavaScript.
To send it back as json
import json
import sys

sys.stdout.write("content-type: text/json\r\n\r\n")
sys.stdout.write(json.dumps(ParsedResponse))
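The linked fastest_object_to_dict handles suds-specific types; purely as an illustration of the idea (a simplified sketch, not the linked implementation), a recursive converter for plain attribute objects might look like this:

```python
def object_to_dict(obj):
    """Recursively convert attribute-bearing objects into plain dicts/lists.
    Simplified sketch; real suds objects need their own attribute listing."""
    if isinstance(obj, (list, tuple)):
        return [object_to_dict(item) for item in obj]
    if hasattr(obj, '__dict__'):
        return {k: object_to_dict(v) for k, v in vars(obj).items()}
    return obj  # leaf value: string, number, None, ...

class Hl:  # hypothetical stand-in for one node of the response
    def __init__(self):
        self.ID = "22437790"
        self.Name = "Cameron"

print(object_to_dict([Hl()]))  # [{'ID': '22437790', 'Name': 'Cameron'}]
```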
Please try this:
result[0][0]
which will give you the first element of the array (ArrayOfXmlNode).
Similarly, try this:
result[0][1][2]
which will give you the third element of element result[0][1].
Hopefully, this offers an alternative solution.
If you are using Python, you can produce a JSON result from the XML result.
But your SOAP result needs to be raw XML output; you can use retxml=True with the suds library for that.
I needed the result as JSON output as well, and I ended up solving it this way:
import json
import xmltodict

# Parse the XML result into a dict
data_dict = xmltodict.parse(soap_response)
# Dump the dict into a JSON string
json_data = json.dumps(data_dict)
# Load the JSON string back into Python structures
parsed = json.loads(json_data)

Mozenda Bulk Insertion with Python Script

I'm trying to write a Python script to perform a bulk insertion over the Mozenda API. They give an example in C# in the documentation: https://mozenda.com/api#h9
test.xml - example below.
<ItemList>
<Item>
<First>32</First>
<Second>03</Second>
<Third>403</Third>
<Fourth>015</Fourth>
<Fifth>0000</Fifth>
<PIN>32034030150000</PIN>
</Item>
</ItemList>
My Code:
import urllib2
import urllib
url = 'https://api.mozenda.com/rest?WebServiceKey=[CANNOT-PROVIDE]&Service=Mozenda10&Operation=Collection.AddItem&CollectionID=1037'
fileName = '/Users/me/Desktop/test.xml'
req = urllib2.Request(url, fileName)
moz_response = urllib2.urlopen(req)
response_data = moz_response.read()
print response_data
Output:
<?xml version="1.0" encoding="utf-8"?>
<CollectionAddItemResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Result>Success</Result>
<ItemID>1056</ItemID>
</CollectionAddItemResponse>
Output shows Success, but it's not inserting the data as expected. I suspect the XML needs to be encoded in some way, but I'm not sure...
The C# example is below from Mozenda's Site:
string webServiceKey = C70E1F84-E12B-4e73-B199-2EE6D43AF44E; //Your Account WebServiceKey.
string collectionID = 1001; //The ID of the destination Collection.
string url = string.Format(
"https://api.mozenda.com/rest?WebServiceKey={0}&Service=Mozenda10&Operation=Collection.AddItem&CollectionID={1}",
webServiceKey, collectionID);
string fileName = "C:\\Temp\\NewItems.xml"; //Path to the file containing the items to be uploaded.
WebClient client = new WebClient();
byte[] responseBinary = client.UploadFile(url, fileName);
string response = Encoding.UTF8.GetString(responseBinary);
You're not setting up your request correctly. When you do a multipart POST request, you have to set up the body of the request in the way specified by RFC 1341, section 7.2.1.
Here is an example in Python:
from urllib2 import urlopen, Request
from string import ascii_uppercase, digits
from random import choice

# Open your data file
with open("C:\\datafile.xml", "r") as data_file:
    data_file_string = data_file.read()

# The URL
url = 'https://api.mozenda.com/rest?WebServiceKey=[CANNOT-PROVIDE]&Service=Mozenda10&Operation=Collection.AddItem&CollectionID=1037'

# The boundary delimits where the file to be uploaded starts and where it ends.
boundary = "".join(choice(ascii_uppercase + digits) for x in range(20))

body_list = []
body_list.append("--" + boundary)
body_list.append("Content-Disposition: form-data;"
                 " name='file'; filename=''")
body_list.append("Content-Type: application/octet-stream")
body_list.append("")
body_list.append(data_file_string)
body_list.append("--{0}--".format(boundary))
body_list.append("")
body = "\r\n".join(body_list).encode("utf8")

content_type = ("multipart/form-data; boundary="
                "{0}").format(boundary)
headers = {"Content-Type": content_type,
           "Content-Length": str(len(body))}  # Tells how big the content is

request = Request(url, body, headers)
result = urlopen(url=request).read()
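For comparison, the third-party requests library builds the same multipart/form-data body automatically; a sketch that only prepares the upload without sending it (the field name 'file' and the placeholder XML are assumptions mirroring the example above):

```python
import requests

url = ('https://api.mozenda.com/rest?WebServiceKey=[CANNOT-PROVIDE]'
       '&Service=Mozenda10&Operation=Collection.AddItem&CollectionID=1037')

# Prepare (but do not send) the request so the generated body can be inspected
req = requests.Request(
    'POST', url,
    files={'file': ('test.xml', b'<ItemList>...</ItemList>')},
).prepare()

print(req.headers['Content-Type'])  # multipart/form-data; boundary=...
# To actually send it: response = requests.Session().send(req)
```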
