OSM Overpass missing data in query result - python

I'm gathering all cities, towns and villages of some countries from OSM using an Overpass query in a Python program.
Everything seems to be correct, but I found a town in Luxembourg that is missing from my result set: the town of Kiischpelt.
import requests
import json

Country = 'LU'
overpass_url = "http://overpass-api.de/api/interpreter"
# {} is filled with the ISO3166-1 country code; out center adds a single
# lat/lon to ways and relations so every element gets a point coordinate
overpass_query = """
[out:json];
area["ISO3166-1"="{}"][admin_level=2]->.search;
(
  node["place"="city"](area.search);
  node["place"="town"](area.search);
  node["place"="village"](area.search);
  way["place"="city"](area.search);
  way["place"="town"](area.search);
  way["place"="village"](area.search);
  rel["place"="city"](area.search);
  rel["place"="town"](area.search);
  rel["place"="village"](area.search);
);
out center;
""".format(Country)

response = requests.get(overpass_url, params={'data': overpass_query})
data = response.json()

filename = 'C:/Data/GetGeoData/data/' + Country + 'cities.json'
with open(filename, 'w', encoding="utf-8") as f:
    json.dump(data, f)
When searching on the OSM site for Kiischpelt, I get a result of type relation, but it doesn't appear in my result set.
I also tried changing the query to rel["place"](area.search); which should return places of all kinds (city, town, village, isolated dwelling, ...), but it still doesn't show up.
Any idea what I'm doing wrong?
Many thanks!
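One thing worth checking (an assumption on my part, not verified against current OSM data) is whether Kiischpelt carries a place value other than city/town/village, for example place=municipality, which the query above filters out. A minimal sketch that casts a wider net over the place key:

import requests

overpass_url = "http://overpass-api.de/api/interpreter"
# nwr is Overpass shorthand for node/way/relation; the regex also matches
# place values such as municipality and isolated_dwelling
query = """
[out:json];
area["ISO3166-1"="LU"][admin_level=2]->.search;
nwr["place"~"city|town|village|municipality|isolated_dwelling"](area.search);
out center;
"""
data = requests.get(overpass_url, params={'data': query}).json()
print(len(data['elements']))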

How to handle multiple missing keys in a dict?

I'm using an API to get basic information about shops in my area: name of shop, address, postcode, phone number, etc. The API returns a long list about each shop, but I only want some of the data from each shop.
I created a for loop that takes just the information I want for every shop the API returns. This all works fine.
The problem is that not all shops have a phone number or a website, so I get a KeyError because the key website does not exist in every shop's record. I tried try/except, which works as long as only one key can be missing, but a shop might lack both a phone number and a website, which leads to a second KeyError.
What can I do to check every key in my for loop and, if a key is found missing, just add the value "none"?
My code:
import requests
import geocoder
import pprint

g = geocoder.ip('me')
print(g.latlng)
latitude, longitude = g.latlng

URL = "https://discover.search.hereapi.com/v1/discover"
latitude = xxxx
longitude = xxxx
api_key = 'xxxxx'  # Acquire from developer.here.com
query = 'food'
limit = 12
PARAMS = {
    'apikey': api_key,
    'q': query,
    'limit': limit,
    'at': '{},{}'.format(latitude, longitude)
}

# sending get request and saving the response as response object
r = requests.get(url=URL, params=PARAMS)
data = r.json()
#print(data)

for x in data['items']:
    title = x['title']
    address = x['address']['label']
    street = x['address']['street']
    postalCode = x['address']['postalCode']
    position = x['position']
    access = x['access']
    typeOfBusiness = x['categories'][0]['name']
    contacts = x['contacts'][0]['phone'][0]['value']
    try:
        website = x['contacts'][0]['www'][0]['value']
    except KeyError:
        website = "none"
    resultList = {
        'BUSINESS NAME:': title,
        'ADDRESS:': address,
        'STREET NAME:': street,
        'POSTCODE:': postalCode,
        'POSITION:': position,
        'POSITION2:': access,
        'TYPE:': typeOfBusiness,
        'PHONE:': contacts,
        'WEBSITE:': website
    }
    print("--" * 80)
    pprint.pprint(resultList)
I think a good way to handle it would be to use operator.itemgetter() to create a callable that will attempt to retrieve all the keys at once; if any aren't found, it will raise a KeyError.
A short demonstration of what I mean:
from operator import itemgetter
test_dict = dict(name="The Shop", phone='123-45-6789', zipcode=90210)
keys = itemgetter('name', 'phone', 'zipcode')(test_dict)
print(keys) # -> ('The Shop', '123-45-6789', 90210)
keys = itemgetter('name', 'address', 'phone', 'zipcode')(test_dict)
# -> KeyError: 'address'
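itemgetter raises on the first missing key, though, so to get the "none" default the question asks for, one option is a small helper that walks a path of keys and indexes and falls back on any miss (a sketch; the paths are taken from the question's code):

def get_nested(record, path, default="none"):
    # follow a sequence of dict keys / list indexes, returning default on any miss
    try:
        for step in path:
            record = record[step]
        return record
    except (KeyError, IndexError, TypeError):
        return default

# usage inside the question's loop:
contacts = get_nested(x, ['contacts', 0, 'phone', 0, 'value'])
website = get_nested(x, ['contacts', 0, 'www', 0, 'value'])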

pysnow 0.7.5 - How can I make a complex Query with 2 tables like in SQL?

I'm using pysnow 0.7.5.
I would like to extract a subset of data from 2 ServiceNow tables ("rm_defect" and "u_cmdb_ci_appl_entreprise").
These tables are linked by u_cmdb_ci_appl_entreprise.sys_id = rm_defect.u_application.value.
How can I make a complex query over these 2 tables, like in SQL?
Something like this:
qb = (
    pysnow.QueryBuilder()
    .field('rm_defect.number').contains('ANO01234')
    .AND()
    .field('rm_defect.u_application.value').equals('u_cmdb_ci_appl_entreprise.sys_id')
)
Here is my source:
import pysnow
import os

instance = 'XXXX'
user = 'XXXX'
password = 'XXXX'
c = pysnow.Client(instance=instance, user=user, password=password)

# List of defects
defectResource = c.resource(api_path='/table/rm_defect')
qb = (
    pysnow.QueryBuilder()
    .field('number').contains('ANO01234')
)
defectRecords = defectResource.get(query=qb, stream=True)
print('rm_defect:')
for defectRecord in defectRecords.all():
    number = str(defectRecord['number'])
    state = str(defectRecord['state'])
    if isinstance(defectRecord['u_application'], dict):
        application = defectRecord['u_application']['value']
    else:
        application = ""
    print('number ' + number)
    print('state ' + state)
    print('application ' + application)
    print('------------')

# List of applications
appResource = c.resource(api_path='/table/u_cmdb_ci_appl_entreprise')
qb = (
    pysnow.QueryBuilder()
    .field('name').equals('appli1')
)
appRecords = appResource.get(query=qb)
for appRecord in appRecords.all():
    sys_id = str(appRecord['sys_id'])
    name = str(appRecord['name'])
    print('sys_id ' + sys_id)
    print('name ' + name)
    print('------------')
Here is the result
number ANO01234
state 15
application 551de62ddbe02e80e478751bbf9619b0
------------
sys_id 551de62ddbe02e80e478751bbf9619b0
name appli1
------------
Thank you for your help
You can create a database view that contains these two tables and then run the query against that view instead of the individual tables.
System Definition => Database Views [sys_db_view]
For example, you can have a look at the view covering incident and task_sla, named incident_sla. Please note the Variable prefix column; it is important when querying the view. Such a view can be queried in the following way:
var gr = new GlideRecord('incident_sla');
gr.setLimit(10);
gr.query();
while (gr.next()) {
    gs.print(gr.getValue('inc_number') + ' ---- ' + gr.getDisplayValue('taskslatable_sla'));
}
This script will return the number from the incident table and the sla from the task_sla table.
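Once the view exists, it can be queried from pysnow like any other table. A minimal sketch, assuming a view named u_defect_application whose columns carry the variable prefixes of the two tables (the view name and prefixed column names here are hypothetical; check them in sys_db_view):

viewResource = c.resource(api_path='/table/u_defect_application')
qb = (
    pysnow.QueryBuilder()
    .field('rm_defect_number').contains('ANO01234')  # hypothetical prefixed column name
)
for record in viewResource.get(query=qb).all():
    print(record['rm_defect_number'], record['u_cmdb_ci_appl_entreprise_name'])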

(python) Iterating through a list of Salesforce tables to extract and load into AWS S3

Good Morning All!
I'm trying to have a routine iterate through a table list. The code below works on a single table, 'contact'. I want to iterate through all of the tables listed in my tablelist.csv. I bolded the parts below that would need to be dynamically modified in the code. My brain is pretty fried at this point from working through two nights, and I'm fully prepared for the internet to tell me that this is in chapter two of Intro to Python, but I could use the help just to get over this hurdle.
import pandas as pd
import boto3
from simple_salesforce import Salesforce

li = pd.read_csv('tablelist.csv', header=None)
# sf is an authenticated Salesforce session (setup omitted in the question)
desc = sf.**Contact**.describe()
field_names = [field['name'] for field in desc['fields']]
soql = "SELECT {} FROM **Contact**".format(','.join(field_names))
results = sf.query_all(soql)
sf_df = pd.DataFrame(results['records']).drop(columns='attributes')
sf_df.to_csv('**contact**.csv')
s3 = boto3.client('s3')
s3.upload_file('contact.csv', 'mybucket', 'Ops/20201027/contact.csv')
It would help if you could provide a sample of the tablelist file, but here's a stab at it... you really just need to get a list of tables and loop through it.
# assuming the table names are in the first column of the file (no header)
df_tablelist = pd.read_csv('tablelist.csv', header=None, names=['table'])
s3 = boto3.client('s3')
for Contact in df_tablelist['table'].tolist():
    desc = getattr(sf, Contact).describe()
    field_names = [field['name'] for field in desc['fields']]
    soql = "SELECT {} FROM {}".format(','.join(field_names), Contact)
    results = sf.query_all(soql)
    sf_df = pd.DataFrame(results['records']).drop(columns='attributes')
    sf_df.to_csv(Contact + '.csv')
    s3.upload_file(Contact + '.csv', 'mybucket', 'Ops/20201027/' + Contact + '.csv')
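Since simple_salesforce exposes objects as attributes (sf.Contact, sf.Account), a table name held in a variable has to be resolved with getattr. A minimal illustration, assuming sf is an authenticated Salesforce client and 'Account' stands in for a name read from tablelist.csv:

table = 'Account'  # hypothetical table name from tablelist.csv
desc = getattr(sf, table).describe()  # equivalent to sf.Account.describe()
print([f['name'] for f in desc['fields']][:5])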

KeyError: 'docSentiment' for sentiment analysis using AlchemyAPI in Python

I have a text file with userID and tweet text separated by a "-->". I want to load these into a dictionary and then iterate over the values, computing the sentiment for each tweet using AlchemyAPI.
My input data is similar to this (real file has millions of records):
v2cigs --> New #ecig #regulations in #Texas mean additional shipping charges for residents. https:\/\/t.co\/aN3O5UfGUM #vape #ecigs #vapeon #vaporizer
JessyQuil --> FK SHIPPING I DON'T WANT TO WAIT TO BUY MY VAPE STUFF
thebeeofficial --> #Lancashire welcomes latest #ECIG law READ MORE: https:\/\/t.co\/qv6foghaOL https:\/\/t.co\/vYiTAQ6VED
2br --> #Lancashire welcomes latest #ECIG law READ MORE: https:\/\/t.co\/ghRWTxQy8r https:\/\/t.co\/dKh9TLkNRe
My code is:
import re
from alchemyapi import AlchemyAPI

alchemyapi = AlchemyAPI()
outputFile = open("intermediate.txt", "w")
tid = 1  # counter for keys in dictionary
tdict = {}  # dictionary to store tweet data

with open("testData.txt", "r") as inputfile:
    for lines in inputfile:
        tweets = lines.split("-->")[1].lstrip()
        tweets = re.sub("[^A-Za-z0-9#\s'.#]+", '', tweets)
        tdict[tid] = tweets.strip("\n")
        tid += 1

for k in tdict:
    response = alchemyapi.sentiment("text", str(tdict[k]))
    sentiment = response["docSentiment"]["type"]
    print sentiment
I am getting the error:
sentiment = response["docSentiment"]["type"]
KeyError: 'docSentiment'
I don't understand what I am doing wrong. Can anybody please help?
You need to check if the response was successful before trying to access the key.
for k in tdict:
    response = alchemyapi.sentiment("text", str(tdict[k]))
    status = response.get('status')
    if status == 'ERROR':
        print(response['statusInfo'])
    else:
        print(response['docSentiment']['type'])
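If you also want to persist the results to the intermediate.txt file the question opens, a minimal sketch along the same lines (the tab-separated output format is my assumption):

with open("intermediate.txt", "w") as outputFile:
    for k in tdict:
        response = alchemyapi.sentiment("text", str(tdict[k]))
        if response.get('status') == 'ERROR':
            continue  # skip tweets the API could not score
        outputFile.write("%s\t%s\n" % (k, response['docSentiment']['type']))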

In Python, trying to convert geocoded tsv file into geojson format

I'm trying to convert a geocoded TSV file into GeoJSON format, but I'm having trouble with it. Here's the code:
import geojson
import csv

def create_map(datafile):
    geo_map = {"type": "FeatureCollection"}
    item_list = []
    datablock = list(csv.reader(datafile))
    for i, line in enumerate(datablock):
        data = {}
        data['type'] = 'Feature'
        data['id'] = i
        data['properties'] = {'title': line['Movie Title'],
                              'description': line['Amenities'],
                              'date': line['Date']}
        data['name'] = {line['Location']}
        data['geometry'] = {'type': 'Point',
                            'coordinates': (line['Lat'], line['Lng'])}
        item_list.append(data)
    for point in item_list:
        geo_map.setdefault('features', []).append(point)
    with open("thedamngeojson.geojson", 'w') as f:
        f.write(geojson.dumps(geo_map))

create_map('MovieParksGeocode2.tsv')
I'm getting a TypeError: list indices must be integers, not str on the data['properties'] line, but I don't understand: isn't that how I set values to the GeoJSON fields?
The file I'm reading from has values under these keys: Location, Movie Title, Date, Amenities, Lat, Lng.
The file is viewable here: https://github.com/yongcho822/Movies-in-the-park/blob/master/MovieParksGeocodeTest.tsv
Thanks guys, much appreciated as always.
You have a couple things going on here that need to get fixed.
1. Your TSV contains newlines inside double-quoted fields. I don't think this is intended, and it will cause some problems:
Location Movie Title Date Amenities Formatted_Address Lat Lng
"
Edgebrook Park, Chicago " A League of Their Own 7-Jun "
Family friendly activities and games. Also: crying is allowed." Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA 41.9998876 -87.7627672
"
2. You don't need the geojson module to dump out JSON, which is all GeoJSON is. Just import json instead.
3. You are trying to read a TSV, but you don't include the delimiter='\t' option that is needed for that.
4. You are trying to read keys off the rows, but you aren't using DictReader, which does that for you. Hence the TypeError about indices you mention above.
Check out my revised code block below... you still need to fix your TSV to be a valid TSV.
import csv
import json

def create_map(datafile):
    geo_map = {"type": "FeatureCollection"}
    item_list = []
    with open(datafile, 'r') as tsvfile:
        reader = csv.DictReader(tsvfile, delimiter='\t')
        for i, line in enumerate(reader):
            print(line)
            data = {}
            data['type'] = 'Feature'
            data['id'] = i
            data['properties'] = {'title': line['Movie Title'],
                                  'description': line['Amenities'],
                                  'date': line['Date']}
            data['name'] = line['Location']  # plain string; a set is not JSON serializable
            data['geometry'] = {'type': 'Point',
                                # GeoJSON expects (longitude, latitude), as numbers
                                'coordinates': (float(line['Lng']), float(line['Lat']))}
            item_list.append(data)
    for point in item_list:
        geo_map.setdefault('features', []).append(point)
    with open("thedamngeojson.geojson", 'w') as f:
        f.write(json.dumps(geo_map))

create_map('MovieParksGeocode2.tsv')
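A quick way to sanity-check the written file (a minimal sketch):

import json

with open("thedamngeojson.geojson") as f:
    fc = json.load(f)
print(len(fc['features']), fc['features'][0]['geometry'])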
