Update single row formatting for entire sheet - python

I want to apply formatting from a JSON entry. The first thing I did was set up the format I want on the second row of every column in my spreadsheet. I then retrieved those cells with a .get request (from A2 to AO3).
row_1 = google_api.service.spreadsheets().get(
    spreadsheetId=ss_id,
    ranges="Tab1!A2:AO3",
    includeGridData=True).execute()
The next thing I did was collect each of the formats for each column and record them in a dictionary.
my_dictionary_of_formats = {}
row_values = row_1['sheets'][0]['data'][0]['rowData'][0]['values']
for column in range(0, len(row_values)):
    my_dictionary_of_formats[column] = row_values[column]['effectiveFormat']
Now I have a dictionary of all the effective formats for all my columns. I'm having trouble applying each format to all rows in its column. I tried a batchUpdate request:
cell_data = {
"effectiveFormat": my_dictionary_of_formats[0]}
row_data = {
"values": [
cell_data
]
}
update_cell = {
"rows": [
row_data
],
"fields": "*",
"range":
{
"sheetId": input_master.tab_id,
"startRowIndex": 2,
"startColumnIndex": 0,
"endColumnIndex": 1
}
}
request_body = {
"requests": [
{"updateCells": update_cell}],
"includeSpreadsheetInResponse": True,
"responseIncludeGridData": True}
service.spreadsheets().batchUpdate(spreadsheetId=my_id, body=request_body).execute()
This wiped out everything and I'm not sure why. I don't think I understand the fields='*' attribute.
TL;DR
I want to apply a format to all rows in a single column. Much like if I used the "Paint Format" tool on the second row, first column and dragged it all the way down to the last row.
-----Update
Hi, thanks to the comments this was my solution:
### Part 1: collect all formats from the second row
import json
row_2 = google_api.service.spreadsheets().get(
    spreadsheetId=spreadsheet_id,
    ranges="tab1!A2:AO2",
    includeGridData=True).execute()
my_dictionary = {}
row_values = row_2['sheets'][0]['data'][0]['rowData'][0]['values']
for column in range(0, len(row_values)):
    # keep only the format of each cell, as in the first attempt above
    my_dictionary[column] = row_values[column]['effectiveFormat']
# json.dump (not json.dumps) writes to an open file
with open('config/format.json', 'w') as f:
    json.dump(my_dictionary, f)
### Part 2: apply formats
requests = []
with open('config/format.json') as f:
    my_dict = json.load(f)  # JSON turns the integer column keys into strings
for column in my_dict:
    requests.append(
        {
            "repeatCell": {
                "range": {
                    "sheetId": tab_id,
                    "startRowIndex": 1,
                    "startColumnIndex": int(column),
                    "endColumnIndex": int(column) + 1
                },
                "cell": {
                    "userEnteredFormat": my_dict[column]
                },
                "fields": "userEnteredFormat({})".format(",".join(my_dict[column].keys()))
            }
        })
body = {"requests": requests}
google_api.service.spreadsheets().batchUpdate(spreadsheetId=spreadsheet_id, body=body).execute()

When you include fields as a part of the request, you indicate to the API endpoint that it should overwrite the specified fields in the targeted range with the information found in your uploaded resource. fields="*" correspondingly is interpreted as "This request specifies the entire data and metadata of the given range. Remove any previous data and metadata from the range and use what is supplied instead."
Thus, anything not specified in your updateCells requests will be removed from the range supplied in the request (e.g. values, formulas, data validation, etc.).
You can learn more in the guide to batchUpdate.
For an updateCells request, the fields parameter is described as follows:
The fields of CellData that should be updated. At least one field must be specified. The root is the CellData; 'row.values.' should not be specified. A single "*" can be used as short-hand for listing every field.
If you then view the resource description of CellData, you observe the following fields:
"userEnteredValue"
"effectiveValue"
"formattedValue"
"userEnteredFormat"
"effectiveFormat"
"hyperlink"
"note"
"textFormatRuns"
"dataValidation"
"pivotTable"
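Thus, the proper fields specification for your request is likely to be fields="effectiveFormat", since this is the only field you supply in your row_data property. Note, however, that effectiveFormat is read-only in the API, so the format itself should be sent as userEnteredFormat, with fields="userEnteredFormat".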
Consider also using the repeatCell request if you are just specifying a single format.
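As a rough illustration of the repeatCell suggestion, here is a minimal sketch that only builds the request dictionary and does not call the API. The helper name build_repeat_cell_request and the sample format are hypothetical; the field mask is restricted to the keys actually supplied, so values, formulas and validation in the column are left alone.

```python
def build_repeat_cell_request(sheet_id, column_index, cell_format, start_row=1):
    """Build a repeatCell request that applies cell_format to one column.

    start_row is zero-based; leaving out endRowIndex makes the range
    run to the last row of the sheet.
    """
    # Only the format keys we supply are listed in the mask, so nothing
    # else in the cells is overwritten.
    field_mask = "userEnteredFormat({})".format(",".join(cell_format.keys()))
    return {
        "repeatCell": {
            "range": {
                "sheetId": sheet_id,
                "startRowIndex": start_row,
                "startColumnIndex": column_index,
                "endColumnIndex": column_index + 1,
            },
            "cell": {"userEnteredFormat": cell_format},
            "fields": field_mask,
        }
    }


# Hypothetical format for illustration only.
example_format = {"backgroundColor": {"red": 1.0}, "textFormat": {"bold": True}}
example_request = build_repeat_cell_request(0, 2, example_format)
```

The resulting dictionary would go into the "requests" list of a batchUpdate body, one entry per column.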

Related

Convert Embedded JSON Dict To Pandas DataFrame Where Column Headers Are Separate From Values

I'm trying to create a python pandas DataFrame out of a JSON dictionary. The embedding is tripping me up.
The column headers are in a different section of the JSON file to the values.
The json looks similar to below. There is one section of column headers and multiple sections of data.
I need each column filled with the data that relates to it. So value_one in each case will fill the column under header_one and so on.
I have come close, but can't seem to get it to spit out the dataframe as described.
{
"my_data": {
"column_headers": [
"header_one",
"header_two",
"header_three"
],
"values": [
{
"data": [
"value_one",
"value_two",
"value_three"
]
},
{
"data": [
"value_one",
"value_two",
"value_three"
]
}
]
}
}
Assuming your dictionary is my_dict, try:
>>> pd.DataFrame(data=[d["data"] for d in my_dict["my_data"]["values"]],
columns=my_dict["my_data"]["column_headers"])
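To make the shapes involved easier to see, here is a small standard-library sketch (no pandas needed) that pairs each data list with the headers. The variable names are illustrative; rows and headers are exactly what the DataFrame call above receives as data and columns.

```python
my_dict = {
    "my_data": {
        "column_headers": ["header_one", "header_two", "header_three"],
        "values": [
            {"data": ["value_one", "value_two", "value_three"]},
            {"data": ["value_one", "value_two", "value_three"]},
        ],
    }
}

headers = my_dict["my_data"]["column_headers"]
# One inner list per future DataFrame row.
rows = [entry["data"] for entry in my_dict["my_data"]["values"]]
# Pair each value with its header, one dict per row.
records = [dict(zip(headers, row)) for row in rows]
```

Either rows with columns=headers, or records on its own, can be passed to pd.DataFrame.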

Using pandas.json_normalize to "unfold" a dictionary of a list of dictionaries

I am new to Python (and coding in general) so I'll do my best to explain the challenge I'm trying to work through.
I'm working with a large dataset which was exported as a CSV from a database. However, there is one column within this CSV export that contains a nested list of dictionaries (as best as I can tell). I've looked around extensively online for a solution, including on Stackoverflow, but haven't quite gotten a full solution. I think I understand conceptually what I'm trying to accomplish, but not clear as to the best method or data prepping process to use.
Here is an example of the data (pared down to just the two columns I'm interested in):
{
"app_ID": {
"0": "1abe23574",
"1": "4gbn21096"
},
"locations": {
"0": "[{\"loc_id\": \"abc1\", \"lat\": \"12.3456\", \"long\": \"101.9876\"}, {\"loc_id\": \"abc2\", \"lat\": \"45.7890\", \"long\": \"102.6543\"}]",
"1": "[ ]"
}
}
Basically each app_ID can have multiple locations tied to a single ID, or it can be empty as seen above. I have tried following some guides I found online that use pandas' json_normalize() function to "unfold" the list of dictionaries into their own rows in a pandas dataframe.
I'd like to end up with something like this:
loc_id  lat      long      app_ID
abc1    12.3456  101.9876  1abe23574
abc2    45.7890  102.6543  1abe23574
etc...
I am learning about how to use the different functions of json_normalize, like "record_path" and "meta", but haven't been able to get it to work yet.
I have tried loading the json file into a Jupyter Notebook using:
with open('location_json.json', 'r') as f:
    data = json.loads(f.read())
df = pd.json_normalize(data, record_path = ['locations'])
but it only creates a dataframe that is 1 row and multiple columns long, where I'd like to have multiple rows generated from the inner-most dictionary that tie back to the app_ID and loc_ID fields.
Attempt at a solution:
I was able to get close to the dataframe format I wanted using:
with open('location_json.json', 'r') as f:
    data = json.loads(f.read())
df = pd.json_normalize(data['locations']['0'])
but that would then require some kind of iteration through the list in order to create a dataframe, and then I'd lose the connection to the app_ID fields. (As best as I can understand how the json_normalize function works).
Am I on the right track trying to use json_normalize, or should I start over again and try a different route? Any advice or guidance would be greatly appreciated.
I'm not sure that suggesting the convtools library is a good idea since you are a beginner, because this library is almost like another Python on top of Python. It helps to dynamically define data conversions (generating Python code under the hood).
But anyway, here is the code if I understood the input data right:
import json
from convtools import conversion as c
data = {
"app_ID": {"0": "1abe23574", "1": "4gbn21096"},
"locations": {
"0": """[ {"loc_id" : "abc1", "lat" : "12.3456", "long" : "101.9876" },
{"loc_id" : "abc2", "lat" : "45.7890", "long" : "102.6543"} ]""",
"1": "[ ]",
},
}
# define it once and use multiple times
converter = (
c.join(
# converts "app_ID" data to iterable of dicts
(
c.item("app_ID")
.call_method("items")
.iter({"id": c.item(0), "app_id": c.item(1)})
),
# converts "locations" data to iterable of dicts,
# where each id like "0" is zipped to each location.
# the result is iterable of dicts like {"id": "0", "loc": {"loc_id": ... }}
(
c.item("locations")
.call_method("items")
.iter(
c.zip(id=c.repeat(c.item(0)), loc=c.item(1).pipe(json.loads))
)
.flatten()
),
# join on "id"
c.LEFT.item("id") == c.RIGHT.item("id"),
how="full",
)
# process results, where 0 index is LEFT item, 1 index is the RIGHT one
.iter(
{
"loc_id": c.item(1, "loc", "loc_id", default=None),
"lat": c.item(1, "loc", "lat", default=None),
"long": c.item(1, "loc", "long", default=None),
"app_id": c.item(0, "app_id"),
}
)
.as_type(list)
.gen_converter()
)
result = converter(data)
assert result == [
{'loc_id': 'abc1', 'lat': '12.3456', 'long': '101.9876', 'app_id': '1abe23574'},
{'loc_id': 'abc2', 'lat': '45.7890', 'long': '102.6543', 'app_id': '1abe23574'},
{'loc_id': None, 'lat': None, 'long': None, 'app_id': '4gbn21096'}
]
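For comparison, a plain-Python sketch of the same idea using only the standard library: parse each locations string with json.loads and attach the matching app_ID to every location. The function name flatten_locations is made up for this example. Unlike the full join above, IDs with no locations are simply dropped, which matches the desired output in the question.

```python
import json

data = {
    "app_ID": {"0": "1abe23574", "1": "4gbn21096"},
    "locations": {
        "0": """[ {"loc_id" : "abc1", "lat" : "12.3456", "long" : "101.9876" },
        {"loc_id" : "abc2", "lat" : "45.7890", "long" : "102.6543"} ]""",
        "1": "[ ]",
    },
}


def flatten_locations(data):
    """Return one dict per location, tagged with the app_ID it belongs to."""
    rows = []
    for key, app_id in data["app_ID"].items():
        # Each locations entry is a JSON string holding a list of dicts.
        locations = json.loads(data["locations"].get(key, "[]"))
        for loc in locations:
            rows.append({**loc, "app_ID": app_id})
    return rows


rows = flatten_locations(data)
```

Passing rows to pd.DataFrame should then give one row per location with the loc_id, lat, long and app_ID columns.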

how to select data range with google sheets API when number of columns is unknown?

I am converting a script that was originally written in Apps Script to apply formatting to Google Sheets.
This script needs to apply to many sheets, and the number of columns is not known in advance. Before, in Apps Script, I used a basic getDataRange() without parameters, and it would select the correct number of columns and rows. How can I do the same via the API? Is there a way to set the end column index to the end of the data range?
For example, I'm using
{
"setBasicFilter": {
"filter": {
"range": {
"sheetId": SHEET_ID,
"startRowIndex": 0
}
}
}
}
To set top row as a filter. But it applies filters to all the empty cells as well, that are outside the table with data, while I need them to stop at last column.
What is the best way to do this via the API?
Solution:
You can call spreadsheets.values.get to get the values of a range, then get the length of the first element array. Then plug it in the setBasicFilter request.
Sample Code:
# The ID and range of a sample spreadsheet.
SAMPLE_SPREADSHEET_ID = 'enter spreadsheet ID here'
SAMPLE_RANGE_NAME = 'Sheet1!A1:1'
.
.
.
# Call the Sheets API
sheet = service.spreadsheets()
result = sheet.values().get(spreadsheetId=SAMPLE_SPREADSHEET_ID,
range=SAMPLE_RANGE_NAME).execute()
values = result.get('values', [])
length = len(values[0])
.
.
.
# filter parameters
{
"setBasicFilter": {
"filter": {
"range": {
"sheetId": SHEET_ID,
"startRowIndex": 0,
"startColumnIndex": 0,
"endColumnIndex": length
}
}
}
}
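The two steps above can be tied together in a small helper. This is only a sketch that builds the request dictionary from an already fetched header row; the function name build_basic_filter_request and the sample values are assumptions for illustration, and no API call is made here. It also guards against an empty response, where values[0] would raise an IndexError.

```python
def build_basic_filter_request(sheet_id, header_row):
    """Build a setBasicFilter request that stops at the last header column."""
    return {
        "setBasicFilter": {
            "filter": {
                "range": {
                    "sheetId": sheet_id,
                    "startRowIndex": 0,
                    "startColumnIndex": 0,
                    # endColumnIndex is exclusive, so the header length fits directly.
                    "endColumnIndex": len(header_row),
                }
            }
        }
    }


# Stand-in for result.get('values', []) from the values().get call above.
values = [["Name", "Email", "Status"]]
header_row = values[0] if values else []
filter_request = build_basic_filter_request(0, header_row)
```

This relies on values.get omitting trailing empty cells, so the length of the first row reflects the last populated header column.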
References:
Python Quickstart
Grid Range

Map JSON values into another JSON object in Python

Background to this question is I have JSON Responses A and wish to send Response A's values only to JSON format B. JSON format B has different fields.
JSON Response A
{
"res_comment": "work description",
"res_id": "62",
"res_priority": "P2",
"res_qid": "INC0140315"
}
....... etc x n
Looking to map the values in JSON Response A to the below JSON format B.
JSON format B
{
"ServiceIssueCategoryID": "62", **#res_id**
"ServicePriorityCode": "P2", **#res_priority**
"ServiceNowID_KUT": "INC0140315", **#res_qid**
"ServiceRequestTextCollection":[
{
"TypeCode": "10004", **value does not change**
"Text": "work description" **#res_comment**
}
]
}
## update
To be clear:
I have made a single map of Values (between one set of curly brackets) to Format B. Then I used the below to update Format B:
with open("sapFormat.json", "r+") as jsonFile:
    data = json.load(jsonFile)
    tmp1 = data['ServiceIssueCategoryID']
    data["ServiceIssueCategoryID"] = res_id
    .... other fields
This works fine, but I have hundreds of these individual res_id values. I'm not sure how to loop through them and add each one individually. The function that pulls the values takes all of them at once (e.g. 100 x res_id); I need to take one res_id, update B, then move to the next res_id, update, and so on.
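One possible way to handle many responses, sketched under the assumption that they are available as a list of dictionaries shaped like Response A: write the mapping once as a function and apply it to each item. The function name map_a_to_b and the second sample record are invented for illustration.

```python
def map_a_to_b(res):
    """Map one Response A dict onto a new Format B dict."""
    return {
        "ServiceIssueCategoryID": res["res_id"],
        "ServicePriorityCode": res["res_priority"],
        "ServiceNowID_KUT": res["res_qid"],
        "ServiceRequestTextCollection": [
            {
                "TypeCode": "10004",  # constant, per the format above
                "Text": res["res_comment"],
            }
        ],
    }


responses = [
    {
        "res_comment": "work description",
        "res_id": "62",
        "res_priority": "P2",
        "res_qid": "INC0140315",
    },
    # Second record is made up to show the loop.
    {
        "res_comment": "another description",
        "res_id": "63",
        "res_priority": "P3",
        "res_qid": "INC0140316",
    },
]

# One Format B payload per Response A item.
payloads = [map_a_to_b(res) for res in responses]
```

Each entry in payloads is an independent dictionary, so it can be sent or written out one at a time.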

Get values with the key names from json object

I am trying to retrieve the values from specific columns from the python list object. This is the response format from Log analytics API here - https://dev.loganalytics.io/documentation/Using-the-API/ResponseFormat
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
{
"tables": [
{
"name": "PrimaryResult",
"columns": [
{
"name": "Category",
"type": "string"
},
{
"name": "count_",
"type": "long"
}
],
"rows": [
[
"Administrative",
20839
],
[
"Recommendation",
122
],
[
"Alert",
64
],
[
"ServiceHealth",
11
]
]
}
]
}
There are hundreds of columns and I want to read specific columns and row values. To do that, I initially tried to find the index of a column, e.g. "Category", and retrieve all the values from rows. Here is what I have done so far.
result=requests.get(url, params=params, headers=headers, verify=False)
index_category = (result.json()['tables'][0]['columns']).index('Category')
result contains data in the format posted above. I get the error below. What am I missing?
ValueError: 'Category' is not in list
I want to be able to retrieve the Category values from the rows array in a loop. I have also written the loop below and I am able to get what I want, but I want to confirm whether there is a better way to do this. I am retrieving the column index first before reading the row value because I suspect that blindly reading the row values with explicit index values is prone to error, particularly when the sequence of columns changes.
for column in range(0,columns):
    if ((result.json()['tables'][0]['columns'][column]['name']) == 'Category'):
        index_category = column
for row in range(0,rows):
    print(result.json()['tables'][0]['rows'][row][index_category])
json_data = result.json()
for index, columns in enumerate(json_data['tables'][0]['columns']):
    if columns['name'] == 'Category':
        category_index = index
        break
category_list = []
for row in json_data['tables'][0]['rows']:
    category_list.append(row[category_index])
Haven't tested it btw.
You could also refactor the first loop where we find the index for the category with the filter function.
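The ValueError in the question comes from calling .index('Category') on a list of dictionaries, which never contains the bare string. A short sketch of one alternative, using a trimmed copy of the response above as sample data: pull the column names into their own list first, then use .index on that.

```python
json_data = {
    "tables": [
        {
            "name": "PrimaryResult",
            "columns": [
                {"name": "Category", "type": "string"},
                {"name": "count_", "type": "long"},
            ],
            "rows": [
                ["Administrative", 20839],
                ["Recommendation", 122],
                ["Alert", 64],
                ["ServiceHealth", 11],
            ],
        }
    ]
}

table = json_data["tables"][0]
# A list of plain strings, so .index() can find the name.
column_names = [col["name"] for col in table["columns"]]
category_index = column_names.index("Category")
category_list = [row[category_index] for row in table["rows"]]
```

Looking the index up by name keeps the code working if the column order in the response changes.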
