Accessing Elasticsearch using Python

I'm currently trying to write a script to enrich some data. I've already coded some things that work fine with a demo-data txt file, but now I'd like to directly request the latest data from the server in the script.
The data I'm working with is stored on Elasticsearch. I've received a URL, including the port number. I also have a cluster ID, a username, and a password.
I can access the data directly using Kibana, where I enter the following into the console (under Dev Tools):
GET /*projectname*/appevents/_search?pretty=true&size=10000
I can copy the output into a TXT file (well, it's actually JSON data), which currently gets parsed by my script. I'd prefer to just collect the data directly without this intermediate step. Also, I'm currently limited to 10000 records/events, but I'd like to get all of them.
This works:
res = requests.get('*url*:*port*',
                   auth=HTTPBasicAuth('*username*', '*password*'))
print(res.content)
I'm struggling with the elasticsearch package. How do I mimic the GET request listed above in my script and collect everything as JSON?
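For reference, a minimal sketch of that same search issued with plain requests (same placeholder URL, port, index and credentials as above; it hits the _search endpoint and parses the JSON body, and size is still capped at 10000 by Elasticsearch):
import requests
from requests.auth import HTTPBasicAuth

res = requests.get('*url*:*port*/*projectname*/appevents/_search',
                   params={'size': 10000, 'pretty': 'true'},
                   auth=HTTPBasicAuth('*username*', '*password*'))
data = res.json()               # the whole response parsed as JSON
hits = data['hits']['hits']     # the individual events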

Fixed, got some help from a programmer. Stored into a list, so I can work with it from there. Code below, identifying info is removed.
from elasticsearch import Elasticsearch

es = Elasticsearch(
    hosts=[{'host': '***', 'port': ***}],
    http_auth=('***', '***'),
    use_ssl=True
)
count = es.count(index="***", doc_type="***")
print(count)  # {u'count': 244532, u'_shards': {u'successful': 5, u'failed': 0, u'total': 5}}

# Use scroll to ease strain on the cluster (don't pull in all results at once)
results = es.search(index="***", doc_type="***", size=1000,
                    scroll="30s")
scroll_id = results['_scroll_id']
total_size = results['hits']['total']
print(total_size)

# Save all results in a list
dump = []
dump += results['hits']['hits']  # include the first batch returned by the initial search
ct = 1
while total_size > 0:
    results = es.scroll(scroll_id=scroll_id, scroll='30s')
    dump += results['hits']['hits']
    scroll_id = results['_scroll_id']
    total_size = len(results['hits']['hits'])  # as long as there are results, keep going ...
    print("Chunk #", ct, ": ", total_size, "\tList size: ", len(dump))
    ct += 1
es.clear_scroll(body={'scroll_id': [scroll_id]})  # cleanup (otherwise the scroll id remains in ES memory)
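For what it's worth, the elasticsearch package also ships a scan() helper that wraps the same scroll bookkeeping; a minimal sketch with the same placeholder connection (untested here):
from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan

es = Elasticsearch(hosts=[{'host': '***', 'port': ***}],
                   http_auth=('***', '***'),
                   use_ssl=True)
# scan() yields every hit and handles the scroll / clear_scroll calls internally
dump = [hit for hit in scan(es, index="***", doc_type="***",
                            query={"query": {"match_all": {}}},
                            size=1000, scroll="30s")]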

Related

Downloading Multiple torrent files with Libtorrent in Python

I'm trying to write a torrent application that can take in a list of magnet links and then download them all together. I've been trying to read and understand the Libtorrent documentation, but I haven't been able to tell whether what I try works or not. I've managed to apply a SOCKS5 proxy to a Libtorrent session and download a single magnet link using this code:
import libtorrent as lt
import time
import os
ses = lt.session()
r = lt.proxy_settings()
r.hostname = "proxy_info"
r.username = "proxy_info"
r.password = "proxy_info"
r.port = 1080
r.type = lt.proxy_type_t.socks5_pw
ses.set_peer_proxy(r)
ses.set_web_seed_proxy(r)
ses.set_proxy(r)
t = ses.settings()
t.force_proxy = True
t.proxy_peer_connections = True
t.anonymous_mode = True
ses.set_settings(t)
print(ses.get_settings())
ses.peer_proxy()
ses.web_seed_proxy()
ses.set_settings(t)
magnet_link = "magnet"
params = {
    "save_path": os.getcwd() + r"\torrents",
    "storage_mode": lt.storage_mode_t.storage_mode_sparse,
    "url": magnet_link
}
handle = lt.add_magnet_uri(ses, magnet_link, params)
ses.start_dht()
print('downloading metadata...')
while not handle.has_metadata():
    time.sleep(1)
print('got metadata, starting torrent download...')
while handle.status().state != lt.torrent_status.seeding:
    s = handle.status()
    state_str = ['queued', 'checking', 'downloading metadata', 'downloading', 'finished', 'seeding', 'allocating']
    print('%.2f%% complete (down: %.1f kb/s up: %.1f kB/s peers: %d) %s' % (s.progress * 100, s.download_rate / 1000, s.upload_rate / 1000, s.num_peers, state_str[s.state]))
    time.sleep(5)
This is great and all for running on its own with a single link. What I want to do is something like this:
def torrent_download(magnetic_link_list):
    for mag in range(len(magnetic_link_list)):
        handle = lt.add_magnet_uri(ses, magnetic_link_list[mag], params)
        # Then download all the files
        # Once all files complete, stop the torrents so they don't seed.
    return torrent_name_list
I'm not sure if this is even on the right track or not, but some pointers would be helpful.
UPDATE: This is what I now have and it works fine in my case
def magnet2torrent(magnet_link):
    global LIBTORRENT_SESSION, TORRENT_HANDLES
    if LIBTORRENT_SESSION is None and TORRENT_HANDLES is None:
        TORRENT_HANDLES = []
        settings = lt.default_settings()
        settings['proxy_hostname'] = CONFIG_DATA["PROXY"]["HOST"]
        settings['proxy_username'] = CONFIG_DATA["PROXY"]["USERNAME"]
        settings['proxy_password'] = CONFIG_DATA["PROXY"]["PASSWORD"]
        settings['proxy_port'] = CONFIG_DATA["PROXY"]["PORT"]
        settings['proxy_type'] = CONFIG_DATA["PROXY"]["TYPE"]
        settings['force_proxy'] = True
        settings['anonymous_mode'] = True
        LIBTORRENT_SESSION = lt.session(settings)
    params = {
        "save_path": os.getcwd() + r"/torrents",
        "storage_mode": lt.storage_mode_t.storage_mode_sparse,
        "url": magnet_link
    }
    TORRENT_HANDLES.append(LIBTORRENT_SESSION.add_torrent(params))

def check_torrents():
    global TORRENT_HANDLES
    for torrent in range(len(TORRENT_HANDLES)):
        print(TORRENT_HANDLES[torrent].status().is_seeding)
It's called "magnet links" (not magnetic).
In new versions of libtorrent, the way you add a magnet link is:
params = lt.parse_magnet_uri(uri)
handle = ses.add_torrent(params)
That also gives you an opportunity to tweak the add_torrent_params object, to set the save directory for instance.
If you're adding a lot of magnet links (or regular torrent files for that matter) and want to do it quickly, a faster way is to use:
ses.add_torrent_async(params)
That function will return immediately and the torrent_handle object can be picked up later in the add_torrent_alert.
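A rough sketch of that pattern (untested; magnet_links stands in for your list of links, and the save path is arbitrary):
import time
import libtorrent as lt

ses = lt.session()
for uri in magnet_links:                      # your list of magnet links
    params = lt.parse_magnet_uri(uri)
    params.save_path = './torrents'
    ses.add_torrent_async(params)             # returns immediately

handles = []
while len(handles) < len(magnet_links):       # pick the handles up from the alerts
    for alert in ses.pop_alerts():
        if isinstance(alert, lt.add_torrent_alert):
            handles.append(alert.handle)
    time.sleep(0.1)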
As for downloading multiple magnet links in parallel, your pseudo code for adding them is correct. You just want to make sure you either save off all the torrent_handle objects you get back or query all torrent handles once you're done adding them (using ses.get_torrents()). In your pseudo code you seem to overwrite the last torrent handle every time you add a new one.
The condition you expressed for exiting was that all torrents were complete. The simplest way of doing that is simply to poll them all with handle.status().is_seeding. i.e. loop over your list of torrent handles and ask that. Keep in mind that the call to status() requires a round-trip to the libtorrent network thread, which isn't super fast.
The faster way of doing this is to keep track of all torrents that aren't seeding yet, and "strike them off your list" as you get torrent_finished_alerts for torrents. (you get alerts by calling ses.pop_alerts()).
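A hedged sketch of that alert-driven completion check (untested; it assumes the session's alert_mask includes status notifications, otherwise torrent_finished_alerts are never posted):
import time

# strike torrents off the pending set as their finished alerts arrive
pending = set(h.info_hash() for h in ses.get_torrents())
while pending:
    for alert in ses.pop_alerts():
        if isinstance(alert, lt.torrent_finished_alert):
            pending.discard(alert.handle.info_hash())
    time.sleep(1)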
Another suggestion I would make is to set up your settings_pack object first, then create the session. It's more efficient and a bit cleaner. Especially with regards to opening listen sockets and then immediately closing and re-opening them when you change settings.
i.e.
p = lt.settings_pack()
p['proxy_hostname'] = '...'
p['proxy_username'] = '...'
p['proxy_password'] = '...'
p['proxy_port'] = 1080
p['proxy_type'] = lt.proxy_type_t.socks5_pw
p['proxy_peer_connections'] = True
ses = lt.session(p)

JSON generated manually works, but JSON created through json.dumps does not, even though the output seems to be exactly the same

I am using the Marketo API through the Python Library marketo-rest-python. I can create Leads and also Update them through the following basic code:
leads = [{"email":"joe#example.com","firstName":"Joe"},{"email":"jill#example.com","firstName":"Jill"}]
lead = mc.execute(method='create_update_leads', leads=leads, action='createOnly', lookupField='email',
asyncProcessing='false', partitionName='Default')
When I create this "leads" JSON object programmatically via
leads = []
lead = {}
lead['email'] = "joe#example.com"
lead['firstName'] = "Joe"
leads.append(lead)
lead = {}
lead['email'] = "jill#example.com"
lead['firstName'] = "Jill"
leads.append(lead)
json_leads = json.dumps(leads, separators=(',', ':'))
print(json_leads)
Then the output is exactly the same within Microsoft Azure Databricks, but the Marketo system returns a 609 -> Invalid JSON error.
My output looks like
[{"email":"joe#example.com","firstName":"Joe"},{"email":"jill#example.com","firstName":"Jill"}]
It's exactly the same as in the sample. When I use the sample JSON line it works, but my self-generated JSON does not.
Does anyone have an idea what this could be? I am using Python within Microsoft Azure Databricks.
I believe you don't need to call json.dumps at all: the library serializes the leads list itself, so passing an already-serialized JSON string most likely gets encoded a second time, which is what Marketo rejects. Just do it like this:
leads = []
lead = {}
lead['email'] = "joe#example.com"
lead['firstName'] = "Joe"
leads.append(lead)
lead = {}
lead['email'] = "jill#example.com"
lead['firstName'] = "Jill"
leads.append(lead)
lead = mc.execute(method='create_update_leads', leads=leads, action='createOnly',
lookupField='email', asyncProcessing='false', partitionName='Default')

parsing .log files then sort in access

I'm writing a parsing program that searches through 100+ .log files for some keywords, puts the words into different arrays, and separates the words into columns in Excel. Now I want to sort them in Access automatically so that I can process the different .log file combinations. I can "copy-paste" from my Excel file to Access, but that is inefficient and gives some errors... I would like it to be "automatic". I'm new to Access and don't know how to link Python to Access; I have tried doing it the way I did for Excel, but that didn't work, and I started looking into ODBC but had some problems there too...
import glob                              # includes
import xlwt                              # includes
from os import listdir                   # includes
from os.path import isfile, join         # includes

def logfile(filename, tester, createdate, completeresponse):
    # Arrays for strings
    response = []
    message = []
    test = []
    date = []
    with open(filename) as filesearch:       # open search file
        filesearch = filesearch.readlines()  # read file
    for line in filesearch:
        file = filename[39:]                 # extract filename [file]
        for lines in filesearch:
            if createdate in lines:          # extract "Create Time" {date}
                date.append(lines[15:34])
            if completeresponse in lines:
                response.append(lines[19:])
        print('pending...')
        i = 1                                # set a number on log {i}
        d = {}
        for name in filename:
            if not d.get(name, False):
                d[name] = i
                i += 1
        if tester in line:
            start = '-> '
            end = ':\ '
            # e.g. Tester -> 1631 22 F1 2E :\ BCM_APP_31381140 AJ \ Read Data By Identifier \
            number = line[line.find(start)+3: line.find(end)]
            test.append(number)              # extract tester {test}
            text = line[line.find(end)+3:]   # everything after ':\ '
            message.append(text)
    with open('Excel.txt', 'a') as handler:  # append to the .txt file
        for i in range(len(message)):
            # A = filename, B = create time, C = number in file, D = tester, E = complete response
            handler.write(f"{file}|{date[i]}|{i}|{test[i]}|{response[i]}")

# open with 'w' to "reset" the file.
with open('Excel.txt', 'w') as file_handler:
    pass

# ---------------------------------------------------------------------------------
for filename in glob.glob(r'C:\Users\Desktop\Access\*.log'):
    logfile(filename, 'Sending Request: Tester ->', 'Create Time:', 'Complete Response:')
def if_number(s):  # look if number or float
    try:
        float(s)
        return True
    except ValueError:
        return False
# ----------------------------------------------
my_path = r"C:\Users\Desktop\Access"  # directory
# search directory for .txt files
text_files = [join(my_path, f) for f in listdir(my_path) if isfile(join(my_path, f)) and '.txt' in f]
for text_file in text_files:          # loop and open each .txt document
    with open(text_file, 'r+') as wordlist:
        string = []                   # array of the saved strings
        for word in wordlist:
            string.append(word.split('|'))  # put each word into the string array
    column_list = zip(*string)        # make a list of all columns
    workbook = xlwt.Workbook()
    worksheet = workbook.add_sheet('Tab')
    worksheet.col(0)                  # construct cell
    first_col = worksheet.col(0)
    first_col.width = 256 * 50
    second_col = worksheet.col(1)
    second_col.width = 256 * 25
    third_col = worksheet.col(2)
    third_col.width = 256 * 10
    fourth_col = worksheet.col(3)
    fourth_col.width = 256 * 50
    fifth_col = worksheet.col(4)
    fifth_col.width = 256 * 100
    i = 0                             # choose column 0 = A, 3 = C etc
    for column in column_list:
        for item in range(len(column)):
            value = column[item].strip()
            if if_number(value):
                worksheet.write(item, i, float(value))  # number / float
            else:
                worksheet.write(item, i, value)         # text
        i += 1
    print('File:', text_file, 'Done')
    workbook.save(text_file.replace('.txt', '.xls'))
Is there a way to automate this "copy-paste" step? If so, how would that look and work? And if it can't be done, some advice would help a lot!
EDIT
Thanks, I have done some googling, and thanks for your help! But now I get an error... I still can't send the information to the Access file; I get a syntax error. I know the table exists because I want to update the existing file. Is there a command to update an existing Access file?
error
pyodbc.ProgrammingError: ('42S01', "[42S01] [Microsoft][ODBC Microsoft Access Driver] Table 'tblLogfile' already exists. (-1303) (SQLExecDirectW)")
code
import pyodbc

UDC = r'C:\Users\Documents\Access\UDC.accdb'
# DSN connection
constr = "DSN=MS Access Database;DBQ={0};".format(UDC)
# DRIVER connection
constr = "DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};UID=admin;UserCommitSync=Yes;Threads=3;SafeTransactions=0;PageTimeout=5;MaxScanRows=8;MaxBufferSize=2048;FIL={MS Access};DriverId=25;DefaultDir=C:/USERS/DOCUMENTS/ACCESS;DBQ=C:/USERS/DOCUMENTS/ACCESS/UDC.accdb"
# Connect to database UDC and open cursor
db = pyodbc.connect(constr)
cursor = db.cursor()
sql = "SELECT * INTO [tblLogfile]" + \
      " FROM [Excel 8.0;HDR=YES;Database=C:/Users/Documents/Access/Excel.xls].[Tab$];"
cursor.execute(sql)
db.commit()
cursor.close()
db.close()
First, please note, MS Access, a database management system, is not MS Excel, a spreadsheet application. Access sits on top of a relational engine and maintains strict rules in data and relational integrity whereas in Excel anything can be written across cells or ranges of cells with no rules. Additionally, the Access object library (tabledefs, querydefs, forms, reports, macros, modules) is much different than the Excel object library (workbooks, worksheets, range, cell, etc.), so there is no one-to-one translation in code.
Specifically, for your Python project, consider pyodbc with a make-table query that connects directly to the Excel workbook. MS Access' underlying database is the ACE/JET engine (Windows .dll files, available on Windows machines regardless of an Access install), and one feature of this data store is the ability to connect to workbooks and even text files. So really, MSAccess.exe is just a GUI console to view .mdb/.accdb files.
Below creates a new database table that replicates the specific workbook sheet data, assuming the sheet maintains:
tabular format beginning in A1 cell (no merged cells/repeating labels)
headers in the top row (no leading/trailing spaces or special characters such as !#$%^~<>)
columns of consistent data type format (i.e., data integrity rules)
Python code
import pyodbc
databasename = 'C:\\Path\\To\\Database\\File.accdb'
# DSN Connection
constr = "DSN=MS Access Database;DBQ={0};".format(databasename)
# DRIVER CONNECTION
constr = "DRIVER={{Microsoft Access Driver (*.mdb, *.accdb)}};DBQ={0};".format(databasename)
# CONNECT TO DATABASE AND OPEN CURSOR
db = pyodbc.connect(constr)
cur = db.cursor()
# RUN MAKE-TABLE QUERY FROM EXCEL WORKBOOK SOURCE
# OLDER EXCEL FORMAT
sql = "SELECT * INTO [myNewTable]" + \
" FROM [Excel 8.0;HDR=Yes;Database=C:\Path\To\Workbook.xls].[SheetName$];"
# CURRENT EXCEL FORMAT
sql = "SELECT * INTO [myNewTable]" + \
" FROM [Excel 12.0 Xml;HDR=Yes;Database=C:\Path\To\Workbook.xlsx].[SheetName$];"
cursor.execute(sql)
db.commit()
cur.close()
db.close()
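Regarding the "Table already exists" error in the EDIT above: the make-table query (SELECT * INTO) always tries to create the table, so either drop the old table first or switch to an append query to update the existing one. A sketch of both options (same connection and cursor as above; table and workbook names are the placeholders already used):
# Option 1: drop the previous copy, then re-run the make-table query
cur.execute("DROP TABLE [myNewTable]")
db.commit()

# Option 2: keep the table and append the new rows instead
sql = "INSERT INTO [myNewTable]" + \
      " SELECT * FROM [Excel 12.0 Xml;HDR=Yes;Database=C:\Path\To\Workbook.xlsx].[SheetName$];"
cur.execute(sql)
db.commit()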
Almost certainly the answer from Parfait above is a better way to go, but for fun I'll leave my answer below
If you are willing to put in the time I think you need 3 things to complete the automation of what you want to do:
1) Send a string representation of your data to the Windows clipboard. There is Windows-specific code for this, or you can save yourself some time and use pyperclip.
2) Learn VBA and use VBA to grab the string from the clipboard and process it. Here is some example VBA code that I used in Excel in the past to grab text from the clipboard:
Function GetTextFromClipBoard() As String
    Dim MSForms_DataObject As New MSForms.DataObject
    MSForms_DataObject.GetFromClipboard
    GetTextFromClipBoard = MSForms_DataObject.GetText()
End Function
3) Use pywin32 (I believe it is easily available with Anaconda) to automate the VBA Access calls from Python. This is probably going to be the hardest part, as the specific call trees are (in my opinion) not well documented and take a lot of poking and digging to figure out exactly what you need to do. It's painful to say the least, but use IPython to help you with visual cues of what methods your pywin32 objects have available. A rough sketch of this piece is at the end of this answer.
As I look at the instructions above, I realize it may also be possible to skip the clipboard and just send the information directly from Python to Access via pywin32. If you do the clipboard route, however, you can break the steps up:
send one dataset to the clipboard
grab and process the data using the VBA editor in Access
after you figure out 1 and 2, use pywin32 to bridge the gap
Good luck, and maybe write a blog post about it if you figure it out to share the details.
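For what it's worth, here is a very rough, untested sketch of how those pieces could fit together (the database path and the VBA procedure name ImportFromClipboard are hypothetical; you would write that procedure yourself in Access, e.g. around the clipboard function above):
import pyperclip
import win32com.client

# 1) put the string representation of your data on the Windows clipboard
pyperclip.copy("file|date|number|tester|response\n...")

# 2) + 3) drive Access over COM and run the VBA procedure that processes the clipboard
access = win32com.client.Dispatch("Access.Application")
access.OpenCurrentDatabase(r"C:\Users\Documents\Access\UDC.accdb")
access.Run("ImportFromClipboard")    # hypothetical VBA procedure defined in the database
access.Quit()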

How to save photos using instagram API and python

I'm using the Instagram API to obtain photos taken at a particular location using the python 3 code below:
import urllib.request
wp = urllib.request.urlopen("https://api.instagram.com/v1/media/search?lat=48.858844&lng=2.294351&access_token=ACCESS_TOKEN")
pw = wp.read()
print(pw)
This allows me to retrieve all the photos. I wanted to know how I can save these on my computer.
An additional question I have is, is there any limit to the number of images returned by running the above? Thanks!
Eventually came up with this. In case anybody needs it, here you go:
# This Python script will download 10,000 images from a specified location.
# 10k images takes approx 15-20 minutes, approx 700 MB.
import urllib, json, requests
import time, csv

print "time.time(): %f " % time.time()            # current epoch time (Unix timestamp)
print time.asctime(time.localtime(time.time()))   # current time in human-readable format

#lat='48.858844'  # Latitude of the center search coordinate. If used, lng is required.
#lng='2.294351'   # Longitude of the center search coordinate. If used, lat is required.

# Brooklyn Brewery
lat = '40.721645'
lng = '-73.957258'
distance = '5000'                    # default is 1 km (distance=1000), max distance is 5 km
access_token = '<YOUR TOKEN HERE>'   # access token to use the API

# The default time span is set to 5 days. The time span must not exceed 7 days.
# min_timestamp  # A Unix timestamp. All media returned will be taken later than this timestamp.
# max_timestamp  # A Unix timestamp. All media returned will be taken earlier than this timestamp.

# Settings for the verification dataset of images:
# lat, lng = 40.721645, -73.957258, dist = 5000, default timestamp (5 days)

images = {}                # to keep track of duplicates
total_count = 0
count = 0                  # count for each loop
timestamp_last_image = 0
flag = 0                   # flag checks for the first run of the loop
instaUrlFile = open('instaUrlFile.txt', 'w')   # text file collecting the image URLs (the original snippet omitted this line)

# Images are returned in reverse order, i.e. most recent to least recent.
# A max of 100 images are returned during each request; to get the next set, we use the last
# (least recent) image's timestamp as max_timestamp and continue.
# To avoid duplicates we check whether the image ID has already been recorded (Instagram tends
# to return images based on a %60 timestamp).
# Use a JSON viewer such as http://www.jsoneditoronline.org/ with the commented API response
# link below to comprehend the JSON response.
while total_count < 10000:
    if flag == 0:
        response = urllib.urlopen('https://api.instagram.com/v1/media/search?lat='+lat+'&lng='+lng+'&distance='+distance+'&access_token='+access_token+'&count=100')
        # https://api.instagram.com/v1/media/search?lat=48.858844&lng=2.294351&distance=5000&access_token=2017228644.ab103e5.f6083159690e476b94dff6cbe8b53759
    else:
        response = urllib.urlopen('https://api.instagram.com/v1/media/search?lat='+lat+'&lng='+lng+'&distance='+distance+'&max_timestamp='+timestamp_last_image+'&access_token='+access_token+'&count=100')
    data = json.load(response)
    for img in data["data"]:
        #print img["images"]["standard_resolution"]["url"]
        if img['id'] in images:
            continue
        images[img['id']] = 1
        total_count = total_count + 1
        count = count + 1
        # download the image by retrieving it from its URL
        urllib.urlretrieve(img["images"]["standard_resolution"]["url"], "C://Instagram/"+str(total_count)+".jpg")
        # capture the image URL so it can be passed directly to the Face++ API from instaUrlFile.txt
        instaUrlFile.write(img["images"]["standard_resolution"]["url"]+"\n")
        print "IMAGE WITH name "+str(total_count)+".jpg was just saved with created time "+data["data"][count-1]["created_time"]
    # the loop above downloads all the images from Instagram and saves them in the above path
    timestamp_last_image = data["data"][count-1]["created_time"]
    flag = 1
    count = 0
Here is code which saves all the images.
I can't test it, because I don't have an Instagram token.
import urllib, json

access_token = "ACCESS TOKEN"  # Put your ACCESS TOKEN here
search_results = urllib.urlopen("https://api.instagram.com/v1/media/search?lat=48.858844&lng=2.294351&access_token=%s" % access_token)
instagram_answer = json.load(search_results)   # Load the Instagram media search result
for row in instagram_answer['data']:
    if row['type'] == "image":                 # Filter out non-image files
        filename = row['id']
        url = row['images']['standard_resolution']['url']
        file_obj, headers = urllib.urlretrieve(
            url=url,
            filename=filename + ".jpg"
        )                                      # Save the image
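Since the question mentions Python 3, note that both snippets above are written against Python 2's urllib; in Python 3 the same calls live in urllib.request. A minimal sketch of the download step (access_token assumed to be defined):
import json
from urllib.request import urlopen, urlretrieve

data = json.load(urlopen("https://api.instagram.com/v1/media/search?"
                         "lat=48.858844&lng=2.294351&access_token=" + access_token))
for row in data['data']:
    if row['type'] == "image":
        urlretrieve(row['images']['standard_resolution']['url'], row['id'] + ".jpg")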

How do I manipulate a binary plist retrieved using urllib2.urlopen into a readable xml plist without saving the file locally using Python?

We use a Python script to communicate with our Deploy Studio server to automate the updating of user and computer information using Deploy Studio's web access URLs. Due to a recent change, they are now storing the Plists for computers in the binary plist format as opposed to XML.
Here is what currently works with old version of DS (source: http://macops.ca/interfacing-with-deploystudio-using-http/):
#!/usr/bin/python
import urllib2
import plistlib
from random import randrange

host = 'https://my.ds.repo:60443'
adminuser = 'testdsuser'
adminpass = '12345'

def setupAuth():
    """Install an HTTP Basic Authorization header globally so it's used for
    every request."""
    auth_handler = urllib2.HTTPBasicAuthHandler()
    auth_handler.add_password(realm='DeployStudioServer',
                              uri=host,
                              user=adminuser,
                              passwd=adminpass)
    opener = urllib2.build_opener(auth_handler)
    urllib2.install_opener(opener)

def getHostData(machine_id):
    """Return the full plist for a computer entry"""
    machine_data = urllib2.urlopen(host + '/computers/get/entry?id=%s' % machine_id)
    plist = plistlib.readPlistFromString(machine_data.read())
    # if id isn't found, result will be an empty plist
    return plist

def updateHostProperties(machine_id, properties, key_mac_addr=False, create_new=False):
    """Update the computer at machine_id with properties, a dict of properties and
    values we want to set with new values. Return the full addinfourl object or None
    if we found no computer to update and we aren't creating a new one. Set create_new
    to True in order to enable creating new entries."""
    found_comp = getHostData(machine_id)
    # If we found no computer and we don't want a new record created
    if not found_comp and not create_new:
        return None
    new_data = {}
    if found_comp:
        # Computer data comes back as plist nested like: {'SERIALNO': {'cn': 'my-name'}}
        # DeployStudioServer expects a /set/entry POST like: {'cn': 'my-new-name'}
        # so we copy the keys up a level
        update = dict((k, v) for (k, v) in found_comp[machine_id].items())
        new_data = update.copy()
    else:
        # No computer exists for this ID, we need to set up two required keys:
        # 'dstudio-host-primary-key' and one of 'dstudio-host-serial-number'
        # or 'dstudio-mac-addr' is required, otherwise request is ignored
        # - IOW: you can't only rely on status codes
        # - primary key is a server-level config, but we seem to need this per-host
        if key_mac_addr:
            new_data['dstudio-host-primary-key'] = 'dstudio-mac-addr'
        else:
            new_data['dstudio-host-primary-key'] = 'dstudio-host-serial-number'
        new_data[new_data['dstudio-host-primary-key']] = machine_id
    for (k, v) in properties.items():
        new_data[k] = v
    plist_to_post = plistlib.writePlistToString(new_data)
    result = urllib2.urlopen(host + '/computers/set/entry?id=' + machine_id,
                             plist_to_post)
    return result

def main():
    setupAuth()
    # Update HOWSANNIE with a new computer name (assuming this entry already exists)
    random_name = 'random-id-' + str(randrange(100))
    result = updateHostProperties('HOWSANNIE', {'cn': random_name,
                                                'dstudio-hostname': random_name})
    # Update DOUGLASFIRS with a new computer name and custom properties, or create
    # it if it doesn't already exist
    random_name = 'random-id-' + str(randrange(100))
    updateHostProperties('DOUGLASFIRS',
                         {'cn': random_name,
                          'dstudio-hostname': random_name,
                          'dstudio-custom-properties': [{
                              'dstudio-custom-property-key': 'ASSET_TAG',
                              'dstudio-custom-property-label': 'My Great Asset Tag',
                              'dstudio-custom-property-value': 'BL4CKL0DG3'}]
                          },
                         create_new=True)

if __name__ == "__main__":
    main()
We use this in conjunction with a home-grown web interface for our technicians to enter in the information when re-imaging a machine and automatically update the information in our DS database.
I've tried using libraries such as biplist to no avail. I'd prefer not to have to store the file locally on the server and then convert it with the command-line tool plutil. Is there any way I can manipulate the variable the information gets stored in? In this case that would be 'machine_data'.
I've had success using curl with the -o flag to save the response as a .plist file, which works, but as said before, I would like to do this without saving the file locally if possible.
Deploy Studio web services available: .
Just putting together all of @martijn-pieters' comments into an answer:
pip install biplist
import urllib2
import io
import biplist
sock = urllib2.urlopen('https://cfpropertylist.googlecode.com/svn/trunk/examples/sample.binary.plist' )
buf = io.BytesIO(sock.read())
xml = biplist.readPlist(buf)
print xml
Output:
{'Pets Names': [], 'Name': 'John Doe', 'Picture': '<B\x81\xa5\x81\xa5\x99\x81B<', 'Year Of Birth': 1965, 'Date Of Graduation': datetime.datetime(2004, 6, 22, 19, 23, 43), 'City Of Birth': 'Springfield', 'Kids Names': ['John', 'Kyra']}
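As a side note, on Python 3.4+ the standard-library plistlib reads binary plists directly (it auto-detects the format), so the third-party package is only needed on Python 2. A minimal sketch, using the same sample URL as above:
import plistlib
from urllib.request import urlopen

data = urlopen('https://cfpropertylist.googlecode.com/svn/trunk/examples/sample.binary.plist').read()
print(plistlib.loads(data))   # works for both binary and XML plists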
