Best approach to completing this table using a docxtpl template in Python (border attributes)

I have a table with a fixed number of columns but an undefined number of rows (until it is populated). I was able to modify some example code to enter the rows dynamically, but I can't figure out how to apply the table styling, specifically the border properties. I would like the thicker grey outline of the table to continue around the left-hand side and bottom.
Can anyone point me in the right direction please?
tpl = DocxTemplate(r'G:\ELECSUPP\Documentation\Results\Kaya Marks\dynamic_table_tpl.docx')
context = {
    'col_labels': ['Description', 'Company Number', 'Calibration \nDue Date'],
    'tbl_contents': [
        {'label': '1', 'cols': ['Desc', 'BXXXXXX', '12/12/21']},
        {'label': '2', 'cols': ['Desc', 'BXXXXXX', '10/01/22']},
        {'label': '3', 'cols': ['Desc', 'BXXXXXX', '10/10/10']},
    ],
}
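docxtpl renders on top of python-docx, so one common approach (an assumption here, not something stated in the question) is to post-process the rendered document and append a `w:tblBorders` element to the table's `tblPr`. Below is a minimal sketch of building that element with only the standard library; the `color` and `size` values are hypothetical stand-ins for the grey outline and should be matched to the template:

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx files
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)

def make_table_borders(color="A6A6A6", size=12):
    """Build a <w:tblBorders> element giving the table a uniform outline.

    Hypothetical values: sz is in eighths of a point (12 == 1.5 pt) and
    color is a hex RGB string; adjust both to match the template's grey.
    """
    borders = ET.Element('{%s}tblBorders' % W_NS)
    for edge in ('top', 'left', 'bottom', 'right'):
        el = ET.SubElement(borders, '{%s}%s' % (W_NS, edge))
        el.set('{%s}val' % W_NS, 'single')
        el.set('{%s}sz' % W_NS, str(size))
        el.set('{%s}color' % W_NS, color)
    return borders

xml = ET.tostring(make_table_borders(), encoding='unicode')
print(xml)
```

With python-docx you would then attach this XML to the rendered table's `tblPr` (e.g. via `table._tbl.tblPr`); the exact hook is an assumption and worth checking against the python-docx documentation.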


Scrape data from multiple categories

I am scraping a product at softsurroundings.com
This is the product link: https://www.softsurroundings.com/p/estelle-dress/
This product has three size categories: ["Misses", "Petites", "Women"]. Each size category has further sizes, i.e.
for "Misses" we have ["XS", "S", "M", "L", "XL"]
for "Petites" we have ["PXS", "PS", "PM", "PL", "PXL"]
for "Women" we have ["1X", "2X", "3X"]
I am confused about the CSS selector needed to get the sizes of all three categories.
I only get the sizes of the Misses category, because when the website loads only the Misses category is shown.
The current code I have is:
raw_skus = []
for sku_sel in response.css('.dtlFormBulk.flexItem .size[class="box size"]'):
    sku = {
        'sku_id': sku_sel.css('.size ::attr(id)').get(),
        'size': sku_sel.css('.size ::text').get()
    }
    raw_skus.append(sku)
return raw_skus
The above code returns:
[
    {'sku_id': 'size_501', 'size': 'XS'},
    {'sku_id': 'size_601', 'size': 'S'},
    {'sku_id': 'size_701', 'size': 'M'},
    {'sku_id': 'size_801', 'size': 'L'},
    {'sku_id': 'size_901', 'size': 'XL'}
]
I am only getting sizes from the Misses category; I need the sizes from the other two categories appended to the list too.
Please help.
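Without the markup for the hidden categories it is hard to give the exact selector, but the general shape of the fix is to loop over every category's size elements (hidden or not) and append them all to one list. This sketch shows only that accumulation pattern; the element ids and the idea of tagging each sku with its category name are assumptions, with plain tuples standing in for selector results:

```python
def collect_skus(category_elements):
    """Flatten per-category size elements into a single sku list.

    category_elements maps category name -> list of (element_id, size_text)
    tuples, standing in for the nodes a CSS selector would return for each
    of the three size groups on the page.
    """
    raw_skus = []
    for category, elements in category_elements.items():
        for element_id, size_text in elements:
            raw_skus.append({'sku_id': element_id,
                             'size': size_text,
                             'category': category})
    return raw_skus

# Hypothetical ids mirroring the pattern seen for the Misses category
skus = collect_skus({
    'Misses': [('size_501', 'XS'), ('size_601', 'S')],
    'Petites': [('size_511', 'PXS')],
    'Women': [('size_521', '1X')],
})
```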

How can one update the secondary Y-axis of a Plotly chart through an update menu

I'm looking for the keyword to update the secondary axis label in a Plotly chart using buttons and an updatemenu. Everything else is working the way I want. The button setup code is below. The goal is to change the left and right y variables using information from the DataFrame. The data, primary axis, and both legend names change as desired, but I can't find the option that should replace my made-up 'secondary_axis' keyword when building the right-hand buttons.
# above here make subplots, add traces, etc.
left_buttons = []
for option in axis_options:
    left_buttons.append({'method': 'update',
                         'label': option,
                         'args': [{'y': [df[option]], 'name': [option]}, {'yaxis': {'title': option}}, [0]]})
right_buttons = []
for option in axis_options:
    right_buttons.append({'method': 'update',
                          'label': option,
                          'args': [{'y': [df[option]], 'name': [option]}, {'secondary_yaxis': {'title': option}}, [1]]})
# figure.update_layout(...
Figured it out eventually. I'm assuming that the 2 suffix corresponds to the trace number, but I couldn't confirm that in any documentation yet. I needed to add the overlaying and side parameters to get it where I wanted it to go; otherwise I got some weird overlaps and missing data.
# above here make subplots, add traces, etc.
left_buttons = []
for option in axis_options:
    left_buttons.append({'method': 'update',
                         'label': option,
                         'args': [{'y': [df[option]],
                                   'name': [option]},
                                  {'yaxis': {'title': option}},
                                  [0]]
                         })
right_buttons = []
for option in axis_options:
    right_buttons.append({'method': 'update',
                          'label': option,
                          'args': [{'y': [df[option]],
                                    'name': [option]},
                                   {'yaxis2': {'title': option, 'overlaying': 'y', 'side': 'right'}},
                                   [1]]
                          })
# figure.update_layout(...
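The two loops differ only in the axis key, trace index, and the extra layout parameters, so they can be factored into one helper. A sketch of that refactor, using a plain dict of lists in place of the DataFrame so it runs standalone (the column names here are made up):

```python
def make_axis_buttons(options, data, axis_key, trace_index, extra=None):
    """Build Plotly updatemenu button dicts for a set of column options.

    data maps column name -> list of y values (stands in for a DataFrame);
    axis_key is 'yaxis' for the primary axis or 'yaxis2' for the secondary;
    extra holds additional layout keys such as overlaying/side for yaxis2.
    """
    buttons = []
    for option in options:
        axis_layout = {'title': option}
        if extra:
            axis_layout.update(extra)
        buttons.append({'method': 'update',
                        'label': option,
                        'args': [{'y': [data[option]], 'name': [option]},
                                 {axis_key: axis_layout},
                                 [trace_index]]})
    return buttons

data = {'temp': [1, 2, 3], 'humidity': [40, 42, 41]}
left_buttons = make_axis_buttons(['temp', 'humidity'], data, 'yaxis', 0)
right_buttons = make_axis_buttons(['temp', 'humidity'], data, 'yaxis2', 1,
                                  extra={'overlaying': 'y', 'side': 'right'})
```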

Write multiple Dataframes to same PDF file using matplotlib

I'm stuck at a point where I have to write multiple pandas DataFrames to a PDF file. The function accepts a dataframe as input.
However, I'm able to write to the PDF the first time, but all subsequent calls override the existing data, leaving only one dataframe in the PDF by the end.
Please find the Python function below:
def fn_print_pdf(df):
    pp = PdfPages('Sample.pdf')
    total_rows, total_cols = df.shape
    rows_per_page = 30  # Number of rows per page
    rows_printed = 0
    page_number = 1
    while total_rows > 0:
        fig = plt.figure(figsize=(8.5, 11))
        plt.gca().axis('off')
        matplotlib_tab = pd.tools.plotting.table(plt.gca(), df.iloc[rows_printed:rows_printed+rows_per_page],
                                                 loc='upper center', colWidths=[0.15]*total_cols)
        # Tabular styling
        table_props = matplotlib_tab.properties()
        table_cells = table_props['child_artists']
        for cell in table_cells:
            cell.set_height(0.024)
            cell.set_fontsize(12)
        # Header, footer and page number
        fig.text(4.25/8.5, 10.5/11., "Sample", ha='center', fontsize=12)
        fig.text(4.25/8.5, 0.5/11., 'P'+str(page_number), ha='center', fontsize=12)
        pp.savefig()
        plt.close()
        # Update variables
        rows_printed += rows_per_page
        total_rows -= rows_per_page
        page_number += 1
    pp.close()
And I'm calling this function as:
raw_data = {
    'subject_id': ['1', '2', '3', '4', '5'],
    'first_name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'last_name': ['Anderson', 'Ackerman', 'Ali', 'Aoni', 'Atiches']}
df_a = pd.DataFrame(raw_data, columns=['subject_id', 'first_name', 'last_name'])
fn_print_pdf(df_a)
raw_data = {
    'subject_id': ['4', '5', '6', '7', '8'],
    'first_name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'last_name': ['Bonder', 'Black', 'Balwner', 'Brice', 'Btisan']}
df_b = pd.DataFrame(raw_data, columns=['subject_id', 'first_name', 'last_name'])
fn_print_pdf(df_b)
The PDF file is available at SamplePDF. As you can see, only the data from the second dataframe is ultimately saved. Is there a way to append to the same Sample.pdf on the second and subsequent passes, while still preserving the former data?
Your PDFs are being overwritten because you're creating a new PDF document every time you call fn_print_pdf(). You can keep your PdfPages instance open between function calls and call pp.close() only after all your plots are written. For reference, see this answer.
Another option is to write the PDFs to different files and use pyPDF to merge them; see this answer.
Edit: here is some working code for the first approach.
Your function is modified to:
def fn_print_pdf(df, pp):
    total_rows, total_cols = df.shape
    rows_per_page = 30  # Number of rows per page
    rows_printed = 0
    page_number = 1
    while total_rows > 0:
        fig = plt.figure(figsize=(8.5, 11))
        plt.gca().axis('off')
        matplotlib_tab = pd.tools.plotting.table(plt.gca(), df.iloc[rows_printed:rows_printed+rows_per_page],
                                                 loc='upper center', colWidths=[0.15]*total_cols)
        # Tabular styling
        table_props = matplotlib_tab.properties()
        table_cells = table_props['child_artists']
        for cell in table_cells:
            cell.set_height(0.024)
            cell.set_fontsize(12)
        # Header, footer and page number
        fig.text(4.25/8.5, 10.5/11., "Sample", ha='center', fontsize=12)
        fig.text(4.25/8.5, 0.5/11., 'P'+str(page_number), ha='center', fontsize=12)
        pp.savefig()
        plt.close()
        # Update variables
        rows_printed += rows_per_page
        total_rows -= rows_per_page
        page_number += 1
Call your function with:
pp = PdfPages('Sample.pdf')
fn_print_pdf(df_a,pp)
fn_print_pdf(df_b,pp)
pp.close()
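As a side note, pd.tools.plotting was removed in later pandas versions (the table helper now lives at pandas.plotting.table), and PdfPages also works as a context manager, which makes the keep-it-open approach harder to get wrong. A condensed, modernized sketch of the same idea:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import pandas as pd
from pandas.plotting import table  # pd.tools.plotting was removed in pandas 0.20

def df_to_pdf_pages(df, pp, title="Sample", rows_per_page=30):
    """Append one or more pages rendering df as a table to an open PdfPages."""
    for page, start in enumerate(range(0, len(df), rows_per_page), start=1):
        fig = plt.figure(figsize=(8.5, 11))
        plt.gca().axis('off')
        table(plt.gca(), df.iloc[start:start + rows_per_page],
              loc='upper center', colWidths=[0.15] * df.shape[1])
        fig.text(0.5, 10.5 / 11., title, ha='center', fontsize=12)
        fig.text(0.5, 0.5 / 11., 'P' + str(page), ha='center', fontsize=12)
        pp.savefig(fig)
        plt.close(fig)

# The PdfPages object stays open across both calls, so pages accumulate
with PdfPages('Sample.pdf') as pp:
    df_to_pdf_pages(pd.DataFrame({'a': [1, 2], 'b': [3, 4]}), pp)
    df_to_pdf_pages(pd.DataFrame({'c': [5], 'd': [6]}), pp)
```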

Elasticsearch sorting fails after update / insert

I'm inserting documents into elasticsearch and trying to sort on a given field that's present in all documents. However, whenever I update a document, indexing seems to break and I do not get a sorted order. I have created an index by doing:
self.conn = ES(server=url)
self.conn.create_index("test.test")
For instance, I would like to sort on a "_ts" field. Given the following dictionaries and code:
def update_or_insert(doc):
    doc_type = "string"
    index = doc['ns']
    doc['_id'] = str(doc['_id'])
    doc_id = doc['_id']
    self.conn.index(doc, index, doc_type, doc_id)

to_insert = [
    {'_id': '4', 'name': 'John', '_ts': 3, 'ns': 'test.test'},
    {'_id': '5', 'name': 'Paul', '_ts': 2, 'ns': 'test.test'},
    {'_id': '6', 'name': 'George', '_ts': 1, 'ns': 'test.test'},
    {'_id': '6', 'name': 'Ringo', '_ts': 4, 'ns': 'test.test'},
]

for x in to_insert:
    update_or_insert(x)

result = self.conn.search(q, sort={'_ts:desc'})
for it in result:
    print it
I would expect to get an ordering of "Ringo, John, Paul" but instead get an ordering of "John, Paul, Ringo". Any reason why this might be the case? I see there's a bug here:
https://github.com/elasticsearch/elasticsearch/issues/3078
But that seems to affect ES 0.90.0 and I'm using 0.90.1.
It should be:
sort={"_ts":"desc"}
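With the corrected sort clause the expected order is Ringo, John, Paul. That ordering can be sanity-checked in plain Python; note that the two documents sharing '_id': '6' mean George is overwritten by Ringo before sorting ever happens:

```python
# The three documents left in the index after the upsert of _id '6'
docs = [
    {'_id': '4', 'name': 'John', '_ts': 3},
    {'_id': '5', 'name': 'Paul', '_ts': 2},
    {'_id': '6', 'name': 'Ringo', '_ts': 4},  # replaced George (_ts 1)
]

# Equivalent of sort={"_ts": "desc"}: highest timestamp first
ordered = sorted(docs, key=lambda d: d['_ts'], reverse=True)
names = [d['name'] for d in ordered]
```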

Python jsonpath Filter Expression

Background:
I have the following example data structure in JSON:
{'sensor': [
    {'assertions_enabled': 'ucr+',
     'deassertions_enabled': 'ucr+',
     'entity_id': '7.0',
     'lower_critical': 'na',
     'lower_non_critical': 'na',
     'lower_non_recoverable': 'na',
     'reading_type': 'analog',
     'sensor_id': 'SR5680 TEMP (0x5d)',
     'sensor_reading': {'confidence_interval': '0.500',
                        'units': 'degrees C',
                        'value': '42'},
     'sensor_type': 'Temperature',
     'status': 'ok',
     'upper_critical': '59.000',
     'upper_non_critical': 'na',
     'upper_non_recoverable': 'na'}
]}
The sensor list will actually contain many of these dicts containing sensor info.
Problem:
I'm trying to query the list using jsonpath to return the subset of sensor dicts that have sensor_type == 'Temperature', but I'm getting 'False' returned (no match). Here's my jsonpath expression:
results = jsonpath.jsonpath(ipmi_node, "$.sensor[?(@.['sensor_type']=='Temperature')]")
When I remove the filter expression and just use "$.sensor.*" I get a list of all sensors, so I'm sure the problem is in the filter expression.
I've scanned multiple sites/posts for examples and I can't seem to find anything specific to Python (Javascript and PHP seem to be more prominent). Could anyone offer some guidance please?
The following expression does what you need (notice how the attribute is specified):
jsonpath.jsonpath(ipmi_node, "$.sensor[?(@.sensor_type=='Temperature')]")
I am using jsonpath-ng, which seems to be active (as of 23.11.20), and I provide a solution based on Pedro's jsonpath expression:
data = {
    'sensor': [
        {'sensor_type': 'Temperature', 'id': '1'},
        {'sensor_type': 'Humidity', 'id': '2'},
        {'sensor_type': 'Temperature', 'id': '3'},
        {'sensor_type': 'Density', 'id': '4'}
    ]}
from jsonpath_ng.ext import parser
for match in parser.parse("$.sensor[?(@.sensor_type=='Temperature')]").find(data):
    print(match.value)
Output:
{'sensor_type': 'Temperature', 'id': '1'}
{'sensor_type': 'Temperature', 'id': '3'}
NOTE: besides basic documentation provided on project's homepage I found additional information in tests.
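For a structure this shallow, the same filter can also be expressed without any jsonpath library as a plain list comprehension, which is a useful cross-check when debugging a path expression:

```python
data = {
    'sensor': [
        {'sensor_type': 'Temperature', 'id': '1'},
        {'sensor_type': 'Humidity', 'id': '2'},
        {'sensor_type': 'Temperature', 'id': '3'},
        {'sensor_type': 'Density', 'id': '4'}
    ]}

# Equivalent of $.sensor[?(@.sensor_type=='Temperature')]
temperature_sensors = [s for s in data['sensor']
                       if s.get('sensor_type') == 'Temperature']
```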
