I am working in ArcMap using the Field Calculator.
I have an attribute with values like the following:
"addr:city"="Bielefeld","addr:postcode"="33699","addr:street"="Westerkamp"
"addr:city"="Bielefeld","addr:street"="Detmolder Straße"
"addr:city"="Bielefeld","addr:housenumber"="34"
I want to extract them into individual attributes.
So I thought I need code like:
dim city
if sPrefix = "addr:city":
return everything past "addr:city" until a comma appears
Any ideas how to solve that? I don't have much experience, unfortunately.
Thanks,
Uli!
Have a look at Python's csv module.
Edit:
I've never used ArcMap, but I'd imagine you can still import modules in it.
If the strings are pretty regular, you could just parse the data without it, though:
e.g.
#test.py
def func(s, srch):
    parts = dict([item.replace('"','').split('=') for item in s.split(',')])
    return parts.get(srch,'')
if __name__ == '__main__':
    tags = '"addr:city"="Bielefeld","addr:postcode"="33699","addr:street"="Westerkamp"'
    print func(tags, 'addr:city')
>python test.py
>Bielefeld
Something like this; define your own function:
In [40]: def func(x, item):
   ....:     spl = x.split(",")
   ....:     for y in spl:
   ....:         if item in y:
   ....:             return y.split("=")[-1].strip('"')
   ....:
In [53]: strs='"addr:city"="Bielefeld","addr:postcode"="33699","addr:street"="Westerkamp"'
In [54]: func(strs,"addr:city")
Out[54]: 'Bielefeld'
In [55]: func(strs,"addr:street")
Out[55]: 'Westerkamp'
As I read your question, you want to extract a string which looks like '"addr:city"="Bielefeld","addr:housenumber"="34"' into individual (key, value) pairs. The easiest way to do this is probably to use the csv reader (http://docs.python.org/2/library/csv.html). You will need to determine exactly how to use it in your use case, but here is a generic example which is likely to work:
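import csv
# attribute_list is assumed to be an iterable of strings, one per record.
# QUOTE_NONE keeps the double quotes in each field so they can be stripped below.
for row in csv.reader(attribute_list, quoting=csv.QUOTE_NONE):
    for pair in row:
        key, value = [part.strip('"') for part in pair.split('=', 1)]
        print key, value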
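A regular expression is another option, and it has the advantage of tolerating commas inside a value. The following is only a sketch: the names `parse_tags` and `get_tag` are made up for illustration, and it assumes every key and value is wrapped in double quotes and contains no embedded double quote.

```python
import re

# Matches one "key"="value" pair; assumes no embedded double quotes.
PAIR_RE = re.compile(r'"([^"]*)"="([^"]*)"')

def parse_tags(tags):
    """Return a dict of all key/value pairs found in a tag string."""
    return dict(PAIR_RE.findall(tags))

def get_tag(tags, key, default=''):
    """Return the value for one key, or `default` if the key is absent."""
    return parse_tags(tags).get(key, default)

tags = '"addr:city"="Bielefeld","addr:street"="Westerkamp","addr:housenumber"="34"'
print(get_tag(tags, 'addr:city'))      # Bielefeld
print(get_tag(tags, 'addr:postcode'))  # prints an empty line (key is absent)
```

In the Field Calculator with the Python parser, the function definitions would go into the code block and the expression would call something like `get_tag(!tags!, 'addr:city')`, where `tags` is a placeholder for the actual field name.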
import csv

def read_prices(tikrList):
    #read each file and get the price list dictionary
    def getPriceDict():
        priceDict = {}
        TLL = len(tikrList)
        for x in range(0,TLL):
            with open(tikrList[x] + '.csv','r') as csvFile:
                csvReader = csv.reader(csvFile)
                for column in csvReader:
                    priceDict[column[0]] = float(column[1])
        return priceDict
    #populate the final dictionary with the price dictionary from the previous function
    def popDict():
        combDict = {}
        TLL = len(tikrList)
        for x in range(0,TLL):
            for y in tikrList:
                combDict[y] = getPriceDict()
        return combDict
    return(popDict())
print(read_prices(['GOOG','XOM','FB']))
What is wrong with the code is that when I return the final dictionary, the keys GOOG, XOM and FB all hold the values of the FB dictionary only.
As you can see with this output:
{'GOOG': {'2015-12-31': 104.660004, '2015-12-30': 106.220001},
'XOM': {'2015-12-31': 104.660004, '2015-12-30': 106.220001},
'FB': {'2015-12-31': 104.660004, '2015-12-30': 106.220001}}
I have 3 different CSV files but all of them are just reading the CSV file for FB.
I want to apologize ahead of time if my code is not easy to read or doesn't make sense. I think there is an issue with storing the values and returning the priceDict in the getPriceDict function, but I can't seem to figure it out.
Any help is appreciated, thank you!
Since this is classwork I won't provide a solution but I'll point a few things out.
You have defined three functions, two of which are defined inside the third. While structuring functions like that can make sense for some problems, I don't see any benefit in your solution. It seems to make it more complicated.
The two inner functions don't have any parameters, you might want to refactor them so that when they are called you pass them the information they need. One advantage of a function is to encapsulate an idea/process into a self-contained code block that doesn't rely on resources external to itself. This makes it easy to test so you know that the function works and you can concentrate on other parts of the code.
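To illustrate the point about parameters with a deliberately generic sketch (not the assignment's solution): the hypothetical function `rows_to_dict` below receives everything it needs as an argument, so it can be exercised with an in-memory list instead of files on disk. The sample rows are invented.

```python
import csv

def rows_to_dict(lines):
    """Build {first column: float(second column)} from an iterable of CSV lines."""
    result = {}
    for row in csv.reader(lines):
        result[row[0]] = float(row[1])
    return result

# Made-up sample rows, used only to exercise the function.
sample = ['a,1.5', 'b,2.0']
print(rows_to_dict(sample))  # {'a': 1.5, 'b': 2.0}
```

Because the function depends only on its argument, it returns a fresh dictionary on every call and can be checked in isolation before it is wired into the rest of the program.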
This piece of your code doesn't make much sense - it never uses x from the outer loop:
...
for x in range(0,TLL):
    for y in tikrList:
        combDict[y] = getPriceDict()
When you iterate over a list, the iteration yields the items themselves and stops after the last one, so there is no need to iterate over index numbers to access the items. Don't do for i in range(len(thelist)): print(thelist[i]); iterate directly instead:
>>> tikrList = ['GOOG','XOM','FB']
>>> for name in tikrList:
...     print(name)
...
GOOG
XOM
FB
>>>
When you read through a tutorial or the documentation, don't just look at the examples - read and understand the text.
I am looking for a way to write the code below in a more concise manner. I thought about trying df[timemonths] = pd.to_timedelta(df[timemonths])...
but it did not work (arg must be a string, timedelta, list, tuple, 1-d array, or Series).
Appreciate any help. Thanks
timemonths = ['TimeFromPriorRTtoSRS', 'TimetoAcuteG3','TimetoLateG3',
'TimeSRStoLastFUDeath','TimeDiagnosistoLastFUDeath',
'TimetoRecurrence']
monthsec = 2.628e6 # to convert to months
df.TimetoLocalRecurrence = pd.to_timedelta(df.TimetoLocalRecurrence).dt.total_seconds()/monthsec
df.TimeFromPriorRTtoSRS = pd.to_timedelta(df.TimeFromPriorRTtoSRS).dt.total_seconds()/monthsec
df.TimetoAcuteG3 = pd.to_timedelta(df.TimetoAcuteG3).dt.total_seconds()/monthsec
df.TimetoLateG3 = pd.to_timedelta(df.TimetoLateG3).dt.total_seconds()/monthsec
df.TimeSRStoLastFUDeath = pd.to_timedelta(df.TimeSRStoLastFUDeath).dt.total_seconds()/monthsec
df.TimeDiagnosistoLastFUDeath = pd.to_timedelta(df.TimeDiagnosistoLastFUDeath).dt.total_seconds()/monthsec
df.TimetoRecurrence = pd.to_timedelta(df.TimetoRecurrence).dt.total_seconds()/monthsec
You could write your operation as a lambda function and then apply it to the relevant columns:
timemonths = ['TimeFromPriorRTtoSRS', 'TimetoAcuteG3','TimetoLateG3',
'TimeSRStoLastFUDeath','TimeDiagnosistoLastFUDeath',
'TimetoRecurrence']
monthsec = 2.628e6
convert_to_months = lambda x: pd.to_timedelta(x).dt.total_seconds()/monthsec
df[timemonths] = df[timemonths].apply(convert_to_months)
Granted I am kind of guessing here since you haven't provided any example data to work with.
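The constant 2.628e6 used above appears to be the number of seconds in an average month of a 365-day year; that reading is an assumption, since the question does not say where the value comes from. A small standard-library sketch of the same arithmetic (the names `SECONDS_PER_MONTH` and `to_months` are illustrative only):

```python
from datetime import timedelta

# Seconds in an average month of a 365-day year: 365 * 86400 / 12 = 2628000.
SECONDS_PER_MONTH = 365 * 86400 / 12

def to_months(delta):
    """Convert a timedelta to a fractional number of average months."""
    return delta.total_seconds() / SECONDS_PER_MONTH

print(SECONDS_PER_MONTH == 2.628e6)    # True
print(to_months(timedelta(days=365)))  # 12.0
```

The pandas version performs this division column by column on the result of `.dt.total_seconds()`.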
Iterate over vars() of df
Disclaimer: this solution will most likely only work if the df class doesn't have any other variables.
The way this works is by simply moving the repetitive code after the = to a function.
def convert(times):
    monthsec = 2.628e6
    return {
        key: pd.to_timedelta(value).dt.total_seconds()/monthsec
        for key, value in times.items()
    }
Now we have to apply this function to each variable.
Applying it to each variable individually would be tedious, so we could use your list timemonths to apply it based on the keys. However, this requires us to create an array of keys manually, like so:
timemonths = ['TimeFromPriorRTtoSRS', 'TimetoAcuteG3','TimetoLateG3', 'TimeSRStoLastFUDeath','TimeDiagnosistoLastFUDeath', 'TimetoRecurrence']
And this can be annoying, especially if you add more, or take away some because you have to keep updating this array.
So instead, let's dynamically iterate over every variable in df
for key, value in convert(vars(df)).items():
    setattr(df, key, value)
Full Code:
def convert(times):
    monthsec = 2.628e6
    return {
        key: pd.to_timedelta(value).dt.total_seconds()/monthsec
        for key, value in times.items()
    }
for key, value in convert(vars(df)).items():
    setattr(df, key, value)
Sidenote
The reason I am using setattr is that, when examining your code, I came to the conclusion that df was most likely a class instance, and as such, properties (by this I mean variables like self.variable = ...) of a class instance must be modified via setattr and not df['variable'] = ....
The API here: https://api.bitfinex.com/v2/tickers?symbols=ALL
does not have any labels, and I want to extract all of the symbols such as tBTCUSD, tLTCUSD, etc. Basically everything without the numbers. Normally, I would extract this information by its label, with something like:
data['name']
However, this API does not have labels. How can I get this info with Python?
You can do it like this:
import requests
j = requests.get('https://api.bitfinex.com/v2/tickers?symbols=ALL').json()
mydict = {}
for i in j:
    mydict[i[0]] = i[1:]
Or using dictionary comprehension:
mydict = {i[0]: i[1:] for i in j}
Then access it as:
mydict['tZRXETH']
I don't have access to Python right now, but it looks like they're organized in a superarray of several subarrays.
You should be able to extract everything (the superarray) as data, and then do a:
for array in data:
    print array[0]
Not sure if this answers your question. Let me know!
Even if it doesn't have labels (or, more specifically, if it's not a JSON object) it's still a perfectly legal piece of JSON, since it's just some arrays contained within a parent array.
Assuming you can already get the text from the api, you can load it as a Python object using json.loads:
import json
data = json.loads(your_data_as_string)
Then, since the labels you want to extract are always in the first position of the arrays, you can store them in a list using a list comprehension:
labels = [x[0] for x in data]
labels will be:
['tBTCUSD', 'tLTCUSD', 'tLTCBTC', 'tETHUSD', 'tETHBTC', 'tETCBTC', ...]
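As a self-contained sketch of the idea, the string below imitates the array-of-arrays shape of the response. The payload is made up for illustration (the numbers are placeholders, not real market data, and real rows are longer), and `by_symbol` is just an illustrative name.

```python
import json

# Made-up payload in the same array-of-arrays shape; values are placeholders.
raw = '[["tBTCUSD", 1.0, 2.0], ["tLTCUSD", 3.0, 4.0], ["tETHUSD", 5.0, 6.0]]'
data = json.loads(raw)

# The symbol is always the first element of each inner array.
labels = [row[0] for row in data]

# Optionally key the remaining values by symbol for direct lookup.
by_symbol = {row[0]: row[1:] for row in data}

print(labels)                # ['tBTCUSD', 'tLTCUSD', 'tETHUSD']
print(by_symbol['tLTCUSD'])  # [3.0, 4.0]
```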
I'm using a python module https://gis.stackexchange.com/a/5943/16793 to export a list of feature classes within an SDE database.
import os, csv, arcpy, arcplus
>>> fcs = arcplus.listAllFeatureClasses("Database Connections\\Connection to oracle.sde\\BASINS.ACF")
The output is a list:
[u'BASINS.ACF_FL_SUB', u'BASINS.ACF_CHATTAHOOCHEE_BASIN', u'BASINS.ACF_CHIPOLA_BASIN', u'BASINS.ACF_CHIPOLA_AL']
In order to pass this list to another function, I've prepended a string to each element in the list:
mylist = ['Database Connections\\Connection to oracle.sde\\{0}'.format(i) for i in fcs]
which looks like:
print mylist[0]
Database Connections\Connection to oracle.sde\BASINS.ACF_FL_SUB
I'd like to pass this list to another function arcpy.ListFields(dataset) which will return the fields of each feature class:
fn = [f.name for f in arcpy.ListFields(mylist[0])]
>>> print fn
[u'OBJECTID', u'HUC', u'BASIN', u'NAME', u'ACRES', u'SHAPE', u'SHAPE.AREA', u'SHAPE.LEN']
I'm trying to figure out how to pass the list in fcs to the function arcpy.ListFields and write the results to csv file, but the structure of the loop needed is really giving me trouble. I'm a novice at this and the Python documentation is getting me turned around. Any pointers would be helpful.
_________________________ETC__________________________________________________
@Tony Your solution worked great. However, when I try to use listAllFeatureClasses on the larger geodatabase, I don't have sufficient privileges to read some of the attributes, which gives an IOError: Database Connections\Connection to oracle.sde\LAND.LANDS\LAND_POINTS does not exist. I'm working on how to handle this and continue to the next feature class in the list. Maybe try/except with continue?
To call arcpy.ListFields for every item in a list:
fns = [[f.name for f in arcpy.ListFields( list_entry )] for list_entry in mylist]
This will give you a list of lists, where fns[0] holds the field names for entry mylist[0].
What might be easier to work with is a dictionary:
fns_dict = dict( [ (list_entry, [f.name for f in arcpy.ListFields( list_entry ) ] )
                   for list_entry in mylist ] )
Using the data in your example:
fns_dict[mylist[0]] (the key is the full path, since the dictionary is keyed by the entries of mylist) should be
[u'OBJECTID', u'HUC', u'BASIN', u'NAME', u'ACRES', u'SHAPE', u'SHAPE.AREA', u'SHAPE.LEN']
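For the CSV output and the follow-up about unreadable feature classes, a loop with try/except is one way to go. The sketch below is written for Python 3 and uses a stand-in function so that it runs without arcpy; `write_field_names` and `fake_list_fields` are hypothetical names, and the dataset names are shortened placeholders. With arcpy, the `list_fields` argument would presumably be something like `lambda ds: [f.name for f in arcpy.ListFields(ds)]`.

```python
import csv
import io

def write_field_names(datasets, list_fields, out_file):
    """Write one CSV row per dataset: the dataset path, then its field names.

    Datasets for which list_fields raises IOError are skipped and returned.
    """
    writer = csv.writer(out_file)
    skipped = []
    for dataset in datasets:
        try:
            names = list_fields(dataset)
        except IOError:
            skipped.append(dataset)  # unreadable; move on to the next one
            continue
        writer.writerow([dataset] + list(names))
    return skipped

# Stand-in for arcpy so the sketch runs anywhere (hypothetical data).
def fake_list_fields(dataset):
    if dataset == 'LAND_POINTS':
        raise IOError(dataset + ' does not exist')
    return ['OBJECTID', 'NAME']

buf = io.StringIO()
skipped = write_field_names(['ACF_FL_SUB', 'LAND_POINTS'], fake_list_fields, buf)
print(buf.getvalue())
print(skipped)
```

Passing the field-listing function in as a parameter keeps the loop testable outside ArcGIS. Under ArcMap's Python 2, the output would instead be a real file opened in 'wb' mode, as the Python 2 csv documentation recommends.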
My data.json is
{"a":[{"b":{"c":{ "foo1":1, "foo2":2, "foo3":3, "foo4":4}}}],"d":[{"e":{"bar1":1, "bar2":2, "bar3":3, "bar4":4}}]}
I am able to list the key/value pairs. My code is:
#! /usr/bin/python
import json
from pprint import pprint
with open('data2.json') as data_file:
    data = json.load(data_file)
pprint(data["d"][0]["e"])
Which gives me:
{u'bar1': 1, u'bar2': 2, u'bar3': 3, u'bar4': 4}
But I want to display only the keys without any quotes and u like this:
bar1, bar2, bar3, bar4
Can anybody suggest anything? It need not be only in python, can be in shell script also.
The keys of this object are instances of the unicode string class. Given this, the default printing behavior of the dict instance for which they are the keys will print them as you show in your post.
This is because the dict implementation of representing its contents as a string (__repr__ and/or __str__) seeks to show you what objects reside in the dict, not what the string representation of those objects looks like. This is an important distinction, for example:
In [86]: print u'hi'
hi
In [87]: x = u'hi'
In [88]: x
Out[88]: u'hi'
In [89]: print x
hi
This should work for you, assuming that printing the keys together as a comma-separated unicode is fine:
print ", ".join(data["d"][0]["e"])
You can achieve this using the keys member function from dict too, but it's not strictly necessary.
print ', '.join((data["d"][0]["e"].keys()))
data["d"][0]["e"] returns a dict. In Python 2, you could get the keys of that dict with something like this:
k = data["d"][0]["e"].keys()
print(", ".join(k))
In Python 3, wrap k in a list like this:
k = list(data["d"][0]["e"].keys())
print(", ".join(k))
Even simpler, join will iterate over the keys of the dict.
print(", ".join(data["d"][0]["e"]))
Thanks to @thefourtheye for pointing this out.
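Put together as a self-contained sketch, with the JSON from the question inlined as a string instead of read from a file. Wrapping the keys in sorted() is an addition here: it keeps the output order deterministic on Python versions where dicts do not preserve insertion order.

```python
import json

# The JSON document from the question, inlined so the sketch runs on its own.
raw = ('{"a":[{"b":{"c":{ "foo1":1, "foo2":2, "foo3":3, "foo4":4}}}],'
       '"d":[{"e":{"bar1":1, "bar2":2, "bar3":3, "bar4":4}}]}')
data = json.loads(raw)

# Iterating over a dict yields its keys; sorted() fixes their order.
keys_line = ", ".join(sorted(data["d"][0]["e"]))
print(keys_line)  # bar1, bar2, bar3, bar4
```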