Unable to select month and year in a DatePicker using Selenium + Python

I am able to select the day and set the value.
I have already tried solutions from other questions:
1. Getting availability from datepicker
2. Python Selenium Date Picker
3. Python & Selenium Cannot select date in datepicker
When I try to select the month and year, I still have no luck getting the result.
Below is my code:
start_date = wait.until(EC.visibility_of_element_located((
    By.CSS_SELECTOR, "#departureDate_i")))
start_date.click()  # Show the datepicker
browser.execute_script("document.getElementsByClassName('next')[0].click()")
current_month = browser.find_element_by_css_selector(".datepicker-months").text
print("current_month:", current_month)
Below is the HTML:
<div class="datepicker datepicker-dropdown dropdown-menu datepicker-orient-left datepicker-orient-top" style="display: none; top: 176.4px; left: 448.667px;">
<div class="datepicker-days" style="display: block;">
<table class=" table-condensed">
<thead>
<tr>
<th class="prev" style="visibility: hidden;"></th>
<th colspan="5" class="datepicker-switch">January 2019</th>
<th class="next" style="visibility: visible;"></th>
</tr>
<tr>
<th class="dow">Su</th>
<th class="dow">Mo</th>
<th class="dow">Tu</th>
<th class="dow">We</th>
<th class="dow">Th</th>
<th class="dow">Fr</th>
<th class="dow">Sa</th>
</tr>
</thead>
<tbody>
<tr>
<td class="day disabled old">30</td>
<td class="day disabled old">31</td>
<td class="day disabled">1</td>
<td class="day disabled">2</td>
<td class="day">3</td>
<td class="day today">4</td>
<td class="day">5</td>
</tr>
<tr>
<td class="day">6</td>
<td class="day">7</td>
<td class="day">8</td>
<td class="day">9</td>
<td class="day">10</td>
<td class="day">11</td>
<td class="day">12</td>
</tr>
<tr>
<td class="day">13</td>
<td class="day">14</td>
<td class="day">15</td>
<td class="day active">16</td>
<td class="day">17</td>
<td class="day">18</td>
<td class="day">19</td>
</tr>
<tr>
<td class="day">20</td>
<td class="day">21</td>
<td class="day">22</td>
<td class="day">23</td>
<td class="day">24</td>
<td class="day">25</td>
<td class="day">26</td>
</tr>
<tr>
<td class="day">27</td>
<td class="day">28</td>
<td class="day">29</td>
<td class="day">30</td>
<td class="day">31</td>
<td class="day new">1</td>
<td class="day new">2</td>
</tr>
<tr>
<td class="day new">3</td>
<td class="day new">4</td>
<td class="day new">5</td>
<td class="day new">6</td>
<td class="day new">7</td>
<td class="day new">8</td>
<td class="day new">9</td>
</tr>
</tbody>
<tfoot>
<tr>
<th colspan="7" class="today" style="display: none;">Today</th>
</tr>
<tr>
<th colspan="7" class="clear" style="display: none;">Clear</th>
</tr>
</tfoot>
</table>
</div>
<div class="datepicker-months" style="display: none;">
<table class="table-condensed">
<thead>
<tr>
<th class="prev" style="visibility: hidden;"></th>
<th colspan="5" class="datepicker-switch">2019</th>
<th class="next" style="visibility: visible;"></th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="7" style=""><span class="month active">Jan</span><span class="month">Feb</span><span class="month">Mar</span><span class="month">Apr</span><span class="month">May</span><span class="month">Jun</span><span class="month">Jul</span><span class="month">Aug</span><span class="month">Sep</span><span class="month">Oct</span><span class="month">Nov</span><span class="month">Dec</span></td>
</tr>
</tbody>
<tfoot>
<tr>
<th colspan="7" class="today" style="display: none;">Today</th>
</tr>
<tr>
<th colspan="7" class="clear" style="display: none;">Clear</th>
</tr>
</tfoot>
</table>
</div>
<div class="datepicker-years" style="display: none;">
<table class="table-condensed">
<thead>
<tr>
<th class="prev" style="visibility: hidden;"></th>
<th colspan="5" class="datepicker-switch">2010-2019</th>
<th class="next" style="visibility: visible;"></th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="7"><span class="year old disabled">2009</span><span class="year disabled">2010</span><span class="year disabled">2011</span><span class="year disabled">2012</span><span class="year disabled">2013</span><span class="year disabled">2014</span><span class="year disabled">2015</span><span class="year disabled">2016</span><span class="year disabled">2017</span><span class="year disabled">2018</span><span class="year active">2019</span><span class="year new">2020</span></td>
</tr>
</tbody>
<tfoot>
<tr>
<th colspan="7" class="today" style="display: none;">Today</th>
</tr>
<tr>
<th colspan="7" class="clear" style="display: none;">Clear</th>
</tr>
</tfoot>
</table>
</div>
</div>
I would much appreciate any suggestions on how to handle this.
Thank you.
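One possible direction, offered only as a sketch: in the HTML above, the `.datepicker-months` div has `display: none`, and Selenium's `.text` returns only visible text, which is likely why `current_month` comes back empty. The visible header is the `.datepicker-switch` cell inside `.datepicker-days` ("January 2019"), so one option is to read that header and click `.next` the required number of times. The helper names `clicks_needed` and `go_to_month` below are hypothetical, and the sketch assumes the header uses English month names in the "Month YYYY" form shown in the markup.

```python
from datetime import datetime


def clicks_needed(switch_text, target_month, target_year):
    """Months between the header text (e.g. 'January 2019') and the target.

    Positive means the target is later (click 'next'), negative means earlier.
    """
    shown = datetime.strptime(switch_text.strip(), "%B %Y")
    return (target_year - shown.year) * 12 + (target_month - shown.month)


def go_to_month(browser, target_month, target_year):
    """Hypothetical helper: step the day view to the target month.

    `browser` is assumed to be a Selenium WebDriver (anything exposing
    execute_script works), with the datepicker already opened.
    """
    header = browser.execute_script(
        "return document.querySelector("
        "'.datepicker-days .datepicker-switch').textContent")
    n = clicks_needed(header, target_month, target_year)
    selector = ".datepicker-days .next" if n > 0 else ".datepicker-days .prev"
    for _ in range(abs(n)):
        # Click through JavaScript, as in the question's own code.
        browser.execute_script(
            "document.querySelector(arguments[0]).click()", selector)
    return n


print(clicks_needed("January 2019", 3, 2019))  # 2
```

Calling `go_to_month(browser, 3, 2019)` right after `start_date.click()` would then click `.next` twice. Note that in the markup above the `.prev` cell is hidden for January 2019, so moving to an earlier month may not be allowed by the picker.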

Related

How to add a second Y-axis to a grouped bar chart using Altair, and sort the bars by the value of a column in the data?

I'm trying to add a third axis (a second Y-axis) to the grouped chart. I'm not sure whether it is possible.
Ideally, I want to:
1) add a line to this chart that represents the percentage of arrests made for a given year and crime type.
2) sort the bars within each group by the value of the "rank" column in the data.
Here is my code and the current visualization. Your feedback is much appreciated. Thank you.
import altair as alt
base = alt.Chart().encode(
    x=alt.X('primary_type', scale=alt.Scale(rangeStep=12), title=None, sort=alt.EncodingSortField(op='sum', field='rank')),
    color=alt.Color('primary_type:N')
)
bar = base.mark_bar().encode(
    alt.Y('sum(Number_of_Incidents):Q', title='Total Number of Incidents')
)
line = base.mark_line(color='red').encode(
    alt.Y('percent_arrest',
          axis=alt.Axis(title=None))
)
combined = alt.layer(bar, line, data=q13a)
combined.facet(
    column=alt.Column('year')
).resolve_scale(x='independent'
).configure_view(
    stroke='transparent'
)
Sample Data -
<table class="table table-bordered table-hover table-condensed">
<thead><tr><th title="Field #1">year</th>
<th title="Field #2">primary_type</th>
<th title="Field #3">Number_of_Incidents</th>
<th title="Field #4">number_of_arrests</th>
<th title="Field #5">percent_arrest</th>
<th title="Field #6">rank</th>
</tr></thead>
<tbody><tr>
<td align="right">2018</td>
<td>THEFT</td>
<td align="right">57330</td>
<td align="right">5503</td>
<td align="right">9.6</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2018</td>
<td>BATTERY</td>
<td align="right">44667</td>
<td align="right">8886</td>
<td align="right">19.89</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2018</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">24889</td>
<td align="right">1498</td>
<td align="right">6.02</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2018</td>
<td>ASSAULT</td>
<td align="right">18229</td>
<td align="right">2931</td>
<td align="right">16.08</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2018</td>
<td>DECEPTIVE PRACTICE</td>
<td align="right">15879</td>
<td align="right">713</td>
<td align="right">4.49</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2017</td>
<td>THEFT</td>
<td align="right">64334</td>
<td align="right">6459</td>
<td align="right">10.04</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2017</td>
<td>BATTERY</td>
<td align="right">49213</td>
<td align="right">10060</td>
<td align="right">20.44</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2017</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">29040</td>
<td align="right">1747</td>
<td align="right">6.02</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2017</td>
<td>ASSAULT</td>
<td align="right">19298</td>
<td align="right">3455</td>
<td align="right">17.9</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2017</td>
<td>DECEPTIVE PRACTICE</td>
<td align="right">18816</td>
<td align="right">805</td>
<td align="right">4.28</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2016</td>
<td>THEFT</td>
<td align="right">61600</td>
<td align="right">6518</td>
<td align="right">10.58</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2016</td>
<td>BATTERY</td>
<td align="right">50292</td>
<td align="right">10328</td>
<td align="right">20.54</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2016</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">31018</td>
<td align="right">1668</td>
<td align="right">5.38</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2016</td>
<td>ASSAULT</td>
<td align="right">18738</td>
<td align="right">3490</td>
<td align="right">18.63</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2016</td>
<td>DECEPTIVE PRACTICE</td>
<td align="right">18733</td>
<td align="right">815</td>
<td align="right">4.35</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2015</td>
<td>THEFT</td>
<td align="right">57335</td>
<td align="right">6771</td>
<td align="right">11.81</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2015</td>
<td>BATTERY</td>
<td align="right">48918</td>
<td align="right">11558</td>
<td align="right">23.63</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2015</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">28675</td>
<td align="right">1835</td>
<td align="right">6.4</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2015</td>
<td>NARCOTICS</td>
<td align="right">23883</td>
<td align="right">23875</td>
<td align="right">99.97</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2015</td>
<td>OTHER OFFENSE</td>
<td align="right">17552</td>
<td align="right">4795</td>
<td align="right">27.32</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2014</td>
<td>THEFT</td>
<td align="right">61561</td>
<td align="right">7415</td>
<td align="right">12.04</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2014</td>
<td>BATTERY</td>
<td align="right">49447</td>
<td align="right">12517</td>
<td align="right">25.31</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2014</td>
<td>NARCOTICS</td>
<td align="right">29116</td>
<td align="right">29000</td>
<td align="right">99.6</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2014</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">27798</td>
<td align="right">2095</td>
<td align="right">7.54</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2014</td>
<td>OTHER OFFENSE</td>
<td align="right">16979</td>
<td align="right">4159</td>
<td align="right">24.49</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2013</td>
<td>THEFT</td>
<td align="right">71530</td>
<td align="right">7727</td>
<td align="right">10.8</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2013</td>
<td>BATTERY</td>
<td align="right">54002</td>
<td align="right">12927</td>
<td align="right">23.94</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2013</td>
<td>NARCOTICS</td>
<td align="right">34127</td>
<td align="right">33819</td>
<td align="right">99.1</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2013</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">30853</td>
<td align="right">2107</td>
<td align="right">6.83</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2013</td>
<td>OTHER OFFENSE</td>
<td align="right">17993</td>
<td align="right">3400</td>
<td align="right">18.9</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2012</td>
<td>THEFT</td>
<td align="right">75460</td>
<td align="right">8249</td>
<td align="right">10.93</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2012</td>
<td>BATTERY</td>
<td align="right">59135</td>
<td align="right">13061</td>
<td align="right">22.09</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2012</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">35854</td>
<td align="right">2462</td>
<td align="right">6.87</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2012</td>
<td>NARCOTICS</td>
<td align="right">35488</td>
<td align="right">35226</td>
<td align="right">99.26</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2012</td>
<td>BURGLARY</td>
<td align="right">22843</td>
<td align="right">1285</td>
<td align="right">5.63</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2011</td>
<td>THEFT</td>
<td align="right">75148</td>
<td align="right">8468</td>
<td align="right">11.27</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2011</td>
<td>BATTERY</td>
<td align="right">60458</td>
<td align="right">14139</td>
<td align="right">23.39</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2011</td>
<td>NARCOTICS</td>
<td align="right">38605</td>
<td align="right">38544</td>
<td align="right">99.84</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2011</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">37332</td>
<td align="right">2583</td>
<td align="right">6.92</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2011</td>
<td>BURGLARY</td>
<td align="right">26619</td>
<td align="right">1272</td>
<td align="right">4.78</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2010</td>
<td>THEFT</td>
<td align="right">76754</td>
<td align="right">7844</td>
<td align="right">10.22</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2010</td>
<td>BATTERY</td>
<td align="right">65403</td>
<td align="right">14277</td>
<td align="right">21.83</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2010</td>
<td>NARCOTICS</td>
<td align="right">43393</td>
<td align="right">43294</td>
<td align="right">99.77</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2010</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">40653</td>
<td align="right">2641</td>
<td align="right">6.5</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2010</td>
<td>BURGLARY</td>
<td align="right">26422</td>
<td align="right">1382</td>
<td align="right">5.23</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2009</td>
<td>THEFT</td>
<td align="right">80973</td>
<td align="right">9900</td>
<td align="right">12.23</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2009</td>
<td>BATTERY</td>
<td align="right">68462</td>
<td align="right">16325</td>
<td align="right">23.85</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2009</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">47724</td>
<td align="right">3270</td>
<td align="right">6.85</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2009</td>
<td>NARCOTICS</td>
<td align="right">43543</td>
<td align="right">43193</td>
<td align="right">99.2</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2009</td>
<td>BURGLARY</td>
<td align="right">26766</td>
<td align="right">1412</td>
<td align="right">5.28</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2008</td>
<td>THEFT</td>
<td align="right">88433</td>
<td align="right">9291</td>
<td align="right">10.51</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2008</td>
<td>BATTERY</td>
<td align="right">75922</td>
<td align="right">15520</td>
<td align="right">20.44</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2008</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">52841</td>
<td align="right">3403</td>
<td align="right">6.44</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2008</td>
<td>NARCOTICS</td>
<td align="right">46507</td>
<td align="right">45459</td>
<td align="right">97.75</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2008</td>
<td>OTHER OFFENSE</td>
<td align="right">26533</td>
<td align="right">3496</td>
<td align="right">13.18</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2007</td>
<td>THEFT</td>
<td align="right">85156</td>
<td align="right">9783</td>
<td align="right">11.49</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2007</td>
<td>BATTERY</td>
<td align="right">79591</td>
<td align="right">19386</td>
<td align="right">24.36</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2007</td>
<td>NARCOTICS</td>
<td align="right">54454</td>
<td align="right">53251</td>
<td align="right">97.79</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2007</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">53749</td>
<td align="right">3994</td>
<td align="right">7.43</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2007</td>
<td>OTHER OFFENSE</td>
<td align="right">26863</td>
<td align="right">4230</td>
<td align="right">15.75</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2006</td>
<td>THEFT</td>
<td align="right">86240</td>
<td align="right">10108</td>
<td align="right">11.72</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2006</td>
<td>BATTERY</td>
<td align="right">80666</td>
<td align="right">18892</td>
<td align="right">23.42</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2006</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">57124</td>
<td align="right">4135</td>
<td align="right">7.24</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2006</td>
<td>NARCOTICS</td>
<td align="right">55813</td>
<td align="right">55236</td>
<td align="right">98.97</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2006</td>
<td>OTHER OFFENSE</td>
<td align="right">27100</td>
<td align="right">4010</td>
<td align="right">14.8</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2005</td>
<td>THEFT</td>
<td align="right">85685</td>
<td align="right">11338</td>
<td align="right">13.23</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2005</td>
<td>BATTERY</td>
<td align="right">83965</td>
<td align="right">19994</td>
<td align="right">23.81</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2005</td>
<td>NARCOTICS</td>
<td align="right">56234</td>
<td align="right">56121</td>
<td align="right">99.8</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2005</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">54548</td>
<td align="right">4083</td>
<td align="right">7.49</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2005</td>
<td>OTHER OFFENSE</td>
<td align="right">28028</td>
<td align="right">4726</td>
<td align="right">16.86</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2004</td>
<td>THEFT</td>
<td align="right">95463</td>
<td align="right">12068</td>
<td align="right">12.64</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2004</td>
<td>BATTERY</td>
<td align="right">87136</td>
<td align="right">20718</td>
<td align="right">23.78</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2004</td>
<td>NARCOTICS</td>
<td align="right">57060</td>
<td align="right">57034</td>
<td align="right">99.95</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2004</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">53164</td>
<td align="right">3965</td>
<td align="right">7.46</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2004</td>
<td>OTHER OFFENSE</td>
<td align="right">29532</td>
<td align="right">5386</td>
<td align="right">18.24</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2003</td>
<td>THEFT</td>
<td align="right">98875</td>
<td align="right">12889</td>
<td align="right">13.04</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2003</td>
<td>BATTERY</td>
<td align="right">88378</td>
<td align="right">20459</td>
<td align="right">23.15</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2003</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">55011</td>
<td align="right">4060</td>
<td align="right">7.38</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2003</td>
<td>NARCOTICS</td>
<td align="right">54288</td>
<td align="right">54283</td>
<td align="right">99.99</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2003</td>
<td>OTHER OFFENSE</td>
<td align="right">31147</td>
<td align="right">5856</td>
<td align="right">18.8</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2002</td>
<td>THEFT</td>
<td align="right">98327</td>
<td align="right">13697</td>
<td align="right">13.93</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2002</td>
<td>BATTERY</td>
<td align="right">94153</td>
<td align="right">21331</td>
<td align="right">22.66</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2002</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">55940</td>
<td align="right">4403</td>
<td align="right">7.87</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2002</td>
<td>NARCOTICS</td>
<td align="right">51789</td>
<td align="right">51781</td>
<td align="right">99.98</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2002</td>
<td>OTHER OFFENSE</td>
<td align="right">32599</td>
<td align="right">5701</td>
<td align="right">17.49</td>
<td align="right">5</td>
</tr>
<tr>
<td align="right">2001</td>
<td>THEFT</td>
<td align="right">99264</td>
<td align="right">15543</td>
<td align="right">15.66</td>
<td align="right">1</td>
</tr>
<tr>
<td align="right">2001</td>
<td>BATTERY</td>
<td align="right">93447</td>
<td align="right">20463</td>
<td align="right">21.9</td>
<td align="right">2</td>
</tr>
<tr>
<td align="right">2001</td>
<td>CRIMINAL DAMAGE</td>
<td align="right">55851</td>
<td align="right">4548</td>
<td align="right">8.14</td>
<td align="right">3</td>
</tr>
<tr>
<td align="right">2001</td>
<td>NARCOTICS</td>
<td align="right">50567</td>
<td align="right">50559</td>
<td align="right">99.98</td>
<td align="right">4</td>
</tr>
<tr>
<td align="right">2001</td>
<td>ASSAULT</td>
<td align="right">31384</td>
<td align="right">7150</td>
<td align="right">22.78</td>
<td align="right">5</td>
</tr>
</tbody></table>
The trouble is that, as far as I know, you cannot draw lines across charts. To create a grouped bar chart, you have to facet across a column of your data. In effect, this produces several charts that are horizontally concatenated, so each chart has only one point per color. If you want a line across years, you have to put the years on the x axis, without faceting, and plot the line separately. I would suggest vertical concatenation, with the lines below the bars.
Note that I have taken the data from your previous question (How to create a nested Grouped Bar Chart using Altair? - Added sample data) because the way you provided it here is not practical and I already had this one.
import altair as alt
import pandas as pd
from io import StringIO
q13a = pd.read_table(StringIO("""year primary_type Number_of_Incidents number_of_arrests percent_arrest rank
2018 THEFT 57330 5503 9.6 1
2018 BATTERY 44667 8886 19.89 2
2018 CRIMINAL DAMAGE 24889 1498 6.02 3
2018 ASSAULT 18229 2931 16.08 4
2018 DECEPTIVE PRACTICE 15879 713 4.49 5
2017 THEFT 64334 6459 10.04 1
2017 BATTERY 49213 10060 20.44 2
2017 CRIMINAL DAMAGE 29040 1747 6.02 3
2017 ASSAULT 19298 3455 17.9 4
2017 DECEPTIVE PRACTICE 18816 805 4.28 5
2016 THEFT 61600 6518 10.58 1
2016 BATTERY 50292 10328 20.54 2
2016 CRIMINAL DAMAGE 31018 1668 5.38 3
2016 ASSAULT 18738 3490 18.63 4
2016 DECEPTIVE PRACTICE 18733 815 4.35 5
2015 THEFT 57335 6771 11.81 1
2015 BATTERY 48918 11558 23.63 2
2015 CRIMINAL DAMAGE 28675 1835 6.4 3
2015 NARCOTICS 23883 23875 99.97 4
2015 OTHER OFFENSE 17552 4795 27.32 5
2014 THEFT 61561 7415 12.04 1
2014 BATTERY 49447 12517 25.31 2
2014 NARCOTICS 29116 29000 99.6 3
2014 CRIMINAL DAMAGE 27798 2095 7.54 4
2014 OTHER OFFENSE 16979 4159 24.49 5
2013 THEFT 71530 7727 10.8 1
2013 BATTERY 54002 12927 23.94 2
2013 NARCOTICS 34127 33819 99.1 3
2013 CRIMINAL DAMAGE 30853 2107 6.83 4
2013 OTHER OFFENSE 17993 3400 18.9 5"""))
bar = alt.Chart(height=200, width=100).mark_bar().encode(
    x=alt.X('primary_type:N',
            axis=None,
            title=None,
            sort=alt.EncodingSortField(op='sum', field='rank')),
    y=alt.Y('sum(Number_of_Incidents):Q',
            title='Total Number of Incidents'),
    color=alt.Color('primary_type:N')
).facet(
    column=alt.Column('year:O')
).resolve_scale(
    x='independent'
)
line = alt.Chart().mark_line(point=True, color='red').encode(
    x=alt.X('year:O', axis=alt.Axis(labelAngle=0)),
    y=alt.Y('percent_arrest:Q'),
    color=alt.Color('primary_type:N', legend=None)
).properties(height=80, width=680)
alt.vconcat(bar, line, data=q13a).configure_view(stroke='transparent')
Created on 2018-11-29 by the reprexpy package

Add an icon to particular rows in a Django template table

I am working on a Python Django template with a table whose columns are id, Factor A, Factor B and Factor C. The values for id, Factor A, Factor B and Factor C are 79, 0.56, 1.1 and 1.3 respectively.
The code for the HTML template looks like this:
<table class="table table-bordered">
<thead>
<tr>
<th class="text-center">id</th>
<th class="text-center">Factor A</th>
<th class="text-center">Factor B</th>
<th class="text-center">Factor C</th>
</tr>
</thead>
<tbody >
<tr ng-class="{'info':aggregateData.Mode, 'closed':!aggregateData.Open}">
<td class="text-center">{{aggregateData.id}}</td>
<td class="text-center">{{aggregateData.factor_a}}</td>
<td class="text-center">{{aggregateData.factor_b}}</td>
<td class="text-center">{{aggregateData.factor_c}}</td>
</tr>
</tbody>
</table>
I want to add a clickable icon to the rows for which aggregateData.Open is true.
Can someone suggest how I can achieve this?
Try this.
<table class="table table-bordered">
<thead>
<tr>
<th class="text-center">id</th>
<th class="text-center">Factor A</th>
<th class="text-center">Factor B</th>
<th class="text-center">Factor C</th>
</tr>
</thead>
<tbody >
<tr ng-class="{'info':aggregateData.Mode, 'closed':!aggregateData.Open}">
<td class="text-center">{{aggregateData.id}}</td>
<td class="text-center">{{aggregateData.factor_a}}</td>
<td class="text-center">{{aggregateData.factor_b}}</td>
<td class="text-center">{{aggregateData.factor_c}}</td>
{% if aggregateData.Open == True %}
<td class="text-center">
<a href="https://www.google.co.in">
<img src="/path_toicon.png">
</a>
</td>
{% endif %}
</tr>
</tbody>
</table>

How to parse an HTML table with rowspans in Python?

The problem
I'm trying to parse an HTML table that contains rowspans; specifically, I'm trying to parse my college schedule.
The problem is that when a cell in the previous row has a rowspan, the next row is missing the TD at the position that the rowspan covers.
I don't know how to account for this, and I would like to be able to parse this schedule.
What I tried
Pretty much everything I can think of.
The result I get
[
    {
        'blok_eind': 4,
        'blok_start': 3,
        'dag': 4,  # Should be 5
        'leraar': 'DOODF000',
        'lokaal': 'ALK C212',
        'vak': 'PROJ-T',
    },
]
As you can see, the output snippet above contains a vak key with the value PROJ-T, but dag is 4 while it is supposed to be 5 (i.e. Friday/Vrijdag).
The result I want
A Python dict() that looks like the one posted above, but with the right values.
Where:
day/dag is an int from 1~5 representing Monday~Friday
block_start/blok_start is an int that represents the block in which the course starts (time block, left side of table)
block_end/blok_eind is an int that represents the block in which the course ends
classroom/lokaal is the code of the classroom the course is in
teacher/leraar is the teacher's ID
course/vak is the ID of the course
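To illustrate the column shift described above, here is a minimal sketch of one common way to account for rowspans: place every cell into a grid and skip the positions that an earlier rowspan already occupies. It assumes the cells have already been extracted from the HTML into `(value, rowspan, colspan)` tuples; the function name `expand_spans` and the simplified rows are hypothetical, with one column per day instead of the real table's `colspan=12` and without the empty `<TR>` rows.

```python
def expand_spans(rows):
    """Place cells into a grid, honouring rowspan and colspan.

    rows: list of rows, each a list of (value, rowspan, colspan) tuples
    in the order the <td> elements appear in the HTML.
    Returns a dict mapping (row_index, col_index) -> value.
    """
    grid = {}
    for r, row in enumerate(rows):
        col = 0
        for value, rowspan, colspan in row:
            # Skip columns already covered by a rowspan from an earlier row.
            while (r, col) in grid:
                col += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, col + dc)] = value
            col += colspan
    return grid


# Simplified schedule: column 0 is the time block, columns 1-5 are Mon-Fri.
rows = [
    [("1", 1, 1), ("", 1, 1), ("", 1, 1), ("", 1, 1), ("", 1, 1), ("WEBD", 2, 1)],
    [("2", 1, 1), ("", 1, 1), ("", 1, 1), ("MENT", 2, 1), ("", 1, 1)],
    [("3", 1, 1), ("", 1, 1), ("", 1, 1), ("", 1, 1), ("PROJ-T", 2, 1)],
]
grid = expand_spans(rows)

# PROJ-T is the 5th cell in its row, but MENT still occupies column 3,
# so it is placed in column 5 (Friday).
day = [c for (r, c), v in grid.items() if v == "PROJ-T"][0]
print(day)  # 5
```

With the real table, the same idea would apply after reading each outer `<TD>`'s `rowspan` and `colspan` attributes; the day could then be derived from the grid column rather than from the cell's position within its row.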
Basic HTML Structure for above data
<center>
<table>
<tr>
<td>
<table>
<tbody>
<tr>
<td>
<font>
TEACHER-ID
</font>
</td>
<td>
<font>
<b>
CLASSROOM ID
</b>
</font>
</td>
</tr>
<tr>
<td>
<font>
COURSE ID
</font>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</table>
</center>
The code
HTML
<CENTER><font size="3" face="Arial" color="#000000">
<BR></font>
<font size="6" face="Arial" color="#0000FF">
16AO4EIO1B
</font> <font size="4" face="Arial">
IO1B
</font>
<BR>
<TABLE border="3" rules="all" cellpadding="1" cellspacing="1">
<TR>
<TD align="center">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial" color="#000000">
Maandag 29-08
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
Dinsdag 30-08
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
Woensdag 31-08
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
Donderdag 01-09
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
Vrijdag 02-09
</font> </TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
<TD rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" rowspan="2" nowrap=1><font size="3" face="Arial">
<B>1</B>
</font> </TD>
<TD align="center" nowrap=1><font size="2" face="Arial">
8:30
</font> </TD>
</TR>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
9:20
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=4 align="center" nowrap="1">
<TABLE>
<TR>
<TD width="50%" nowrap=1><font size="2" face="Arial">
BLEEJ002
</font> </TD>
<TD width="50%" nowrap=1><font size="2" face="Arial">
<B>ALK B021</B>
</font> </TD>
</TR>
<TR>
<TD colspan="2" width="50%" nowrap=1><font size="2" face="Arial">
WEBD
</font> </TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
</TR>
<TR>
<TD rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" rowspan="2" nowrap=1><font size="3" face="Arial">
<B>2</B>
</font> </TD>
<TD align="center" nowrap=1><font size="2" face="Arial">
9:20
</font> </TD>
</TR>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
10:10
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=4 align="center" nowrap="1">
<TABLE>
<TR>
<TD width="50%" nowrap=1><font size="2" face="Arial">
BLEEJ002
</font> </TD>
<TD width="50%" nowrap=1><font size="2" face="Arial">
<B>ALK B021B</B>
</font> </TD>
</TR>
<TR>
<TD colspan="2" width="50%" nowrap=1><font size="2" face="Arial">
WEBD
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
</TR>
<TR>
<TD rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" rowspan="2" nowrap=1><font size="3" face="Arial">
<B>3</B>
</font> </TD>
<TD align="center" nowrap=1><font size="2" face="Arial">
10:25
</font> </TD>
</TR>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
11:15
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=4 align="center" nowrap="1">
<TABLE>
<TR>
<TD width="50%" nowrap=1><font size="2" face="Arial">
DOODF000
</font> </TD>
<TD width="50%" nowrap=1><font size="2" face="Arial">
<B>ALK C212</B>
</font> </TD>
</TR>
<TR>
<TD colspan="2" width="50%" nowrap=1><font size="2" face="Arial">
PROJ-T
</font> </TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
</TR>
<TR>
<TD rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" rowspan="2" nowrap=1><font size="3" face="Arial">
<B>4</B>
</font> </TD>
<TD align="center" nowrap=1><font size="2" face="Arial">
11:15
</font> </TD>
</TR>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
12:05
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=4 align="center" nowrap="1">
<TABLE>
<TR>
<TD width="50%" nowrap=1><font size="2" face="Arial">
BLEEJ002
</font> </TD>
<TD width="50%" nowrap=1><font size="2" face="Arial">
<B>ALK B021B</B>
</font> </TD>
</TR>
<TR>
<TD colspan="2" width="50%" nowrap=1><font size="2" face="Arial">
MENT
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
</TR>
<TR>
<TD rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" rowspan="2" nowrap=1><font size="3" face="Arial">
<B>5</B>
</font> </TD>
<TD align="center" nowrap=1><font size="2" face="Arial">
12:05
</font> </TD>
</TR>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
12:55
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
</TR>
<TR>
<TD rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" rowspan="2" nowrap=1><font size="3" face="Arial">
<B>6</B>
</font> </TD>
<TD align="center" nowrap=1><font size="2" face="Arial">
12:55
</font> </TD>
</TR>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
13:45
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=4 align="center" nowrap="1">
<TABLE>
<TR>
<TD width="50%" nowrap=1><font size="2" face="Arial">
JONGJ003
</font> </TD>
<TD width="50%" nowrap=1><font size="2" face="Arial">
<B>ALK B008</B>
</font> </TD>
</TR>
<TR>
<TD colspan="2" width="50%" nowrap=1><font size="2" face="Arial">
BURG
</font> </TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
</TR>
<TR>
<TD rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" rowspan="2" nowrap=1><font size="3" face="Arial">
<B>7</B>
</font> </TD>
<TD align="center" nowrap=1><font size="2" face="Arial">
13:45
</font> </TD>
</TR>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
14:35
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=4 align="center" nowrap="1">
<TABLE>
<TR>
<TD width="50%" nowrap=1><font size="2" face="Arial">
FLUIP000
</font> </TD>
<TD width="50%" nowrap=1><font size="2" face="Arial">
<B>ALK B004</B>
</font> </TD>
</TR>
<TR>
<TD colspan="2" width="50%" nowrap=1><font size="2" face="Arial">
ICT algemeen Prakti
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
</TR>
<TR>
<TD rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" rowspan="2" nowrap=1><font size="3" face="Arial">
<B>8</B>
</font> </TD>
<TD align="center" nowrap=1><font size="2" face="Arial">
14:50
</font> </TD>
</TR>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
15:40
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=4 align="center" nowrap="1">
<TABLE>
<TR>
<TD width="50%" nowrap=1><font size="2" face="Arial">
KOOLE000
</font> </TD>
<TD width="50%" nowrap=1><font size="2" face="Arial">
<B>ALK B008</B>
</font> </TD>
</TR>
<TR>
<TD colspan="2" width="50%" nowrap=1><font size="2" face="Arial">
NED
</font> </TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
</TR>
<TR>
<TD rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" rowspan="2" nowrap=1><font size="3" face="Arial">
<B>9</B>
</font> </TD>
<TD align="center" nowrap=1><font size="2" face="Arial">
15:40
</font> </TD>
</TR>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
16:30
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
</TR>
<TR>
<TD rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD align="center" rowspan="2" nowrap=1><font size="3" face="Arial">
<B>10</B>
</font> </TD>
<TD align="center" nowrap=1><font size="2" face="Arial">
16:30
</font> </TD>
</TR>
<TR>
<TD align="center" nowrap=1><font size="2" face="Arial">
17:20
</font> </TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
<TD colspan=12 rowspan=2 align="center" nowrap="1">
<TABLE>
<TR>
<TD></TD>
</TR>
</TABLE>
</TD>
</TR>
<TR>
</TR>
</TABLE>
<TABLE cellspacing="1" cellpadding="1">
<TR>
<TD valign=bottom> <font size="4" face="Arial" color="#0000FF"></TR></TABLE><font size="3" face="Arial">
Periode1 29-08-2016 (35) - 04-09-2016 (35) G r u b e r & P e t t e r s S o f t w a r e
</font></CENTER>
Python
from pprint import pprint
from bs4 import BeautifulSoup
import requests
r = requests.get("http://rooster.horizoncollege.nl/rstr/ECO/AMR/400-ECO/Roosters/36"
"/c/c00025.htm")
daytable = {
1: "Maandag",
2: "Dinsdag",
3: "Woensdag",
4: "Donderdag",
5: "Vrijdag"
}
timetable = {
1: ("8:30", "9:20"),
2: ("9:20", "10:10"),
3: ("10:25", "11:15"),
4: ("11:15", "12:05"),
5: ("12:05", "12:55"),
6: ("12:55", "13:45"),
7: ("13:45", "14:35"),
8: ("14:50", "15:40"),
9: ("15:40", "16:30"),
10: ("16:30", "17:20"),
}
page = BeautifulSoup(r.content, "lxml")
roster = []
big_rows = 2
last_row_big = False
# There are 10 blocks, each made up out of 2 TR's, run through them
for block_count in range(2, 22, 2):
# There are 5 days, first column is not data we want
for day in range(2, 7):
dayroster = {
"dag": 0,
"blok_start": 0,
"blok_eind": 0,
"lokaal": "",
"leraar": "",
"vak": ""
}
# This selector provides the classroom
table_bold = page.select(
"html > body > center > table > tr:nth-of-type(" + str(block_count) + ") > td:nth-of-type(" + str(
day) + ") > table > tr > td > font > b")
# This selector provides the teacher's code and the course ID
table = page.select(
"html > body > center > table > tr:nth-of-type(" + str(block_count) + ") > td:nth-of-type(" + str(
day) + ") > table > tr > td > font")
# This gets the rowspan on the current row and column
rowspan = page.select(
"html > body > center > table > tr:nth-of-type(" + str(block_count) + ") > td:nth-of-type(" + str(
day) + ")")
try:
if table or table_bold and rowspan[0].attrs.get("rowspan") == "4":
last_row_big = True
# Setting end of class
dayroster["blok_eind"] = (block_count // 2) + 1
else:
last_row_big = False
# Setting end of class
dayroster["blok_eind"] = (block_count // 2)
except IndexError:
pass
if table_bold:
x = table_bold[0]
# Classroom ID
dayroster["lokaal"] = x.contents[0]
if table:
iter = 0
for x in table:
content = x.contents[0].lstrip("\r\n").rstrip("\r\n")
# Cell has data
if content != "":
# Set start of class
dayroster["blok_start"] = block_count // 2
# Set day of class
dayroster["dag"] = day - 1
if iter == 0:
# Teacher ID
dayroster["leraar"] = content
elif iter == 1:
# Course ID
dayroster["vak"] = content
iter += 1
if table or table_bold:
# Store the data
roster.append(dayroster)
# Remove duplicates
seen = set()
new_l = []
for d in roster:
t = tuple(d.items())
if t not in seen:
seen.add(t)
new_l.append(d)
pprint(new_l)
You'll have to track the rowspans on previous rows, one per column.
You could do this simply by copying the integer value of a rowspan into a dictionary keyed by column, and having each subsequent row decrement that value until it drops to 1 (or store the integer value minus 1 and count down to 0, for ease of coding). Then you can adjust the column positions of cells in later rows based on the rowspans that precede them.
Your table complicates this a little by using a default span of size 2, incrementing in steps of two, but that can easily be brought back to manageable numbers by dividing by 2.
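The bookkeeping described above can be sketched on a toy grid, without any HTML parsing. This is only an illustration under assumptions: the function name `place_cells`, the `remaining` dictionary and the sample rows are hypothetical, and each cell is modelled as a `(text, span)` pair where `span` is already expressed in blocks (that is, the HTML rowspan divided by 2).

```python
def place_cells(rows, n_cols):
    """Return (row, col, text, span) for every non-empty cell.

    Each row lists only the cells present in the markup; a cell covered by
    a rowspan from an earlier row is simply missing from the list.
    """
    remaining = {}  # col -> number of following rows still covered from above
    placed = []
    for r, cells in enumerate(rows):
        cells = iter(cells)
        for col in range(n_cols):
            if remaining.get(col, 0):
                # Column is occupied by a spanning cell from an earlier row,
                # so this row has no cell of its own here.
                remaining[col] -= 1
                continue
            cell = next(cells, None)
            if cell is None:
                # Row is shorter than expected; keep looping so later
                # columns still get their counters decremented.
                continue
            text, span = cell
            if span > 1:
                remaining[col] = span - 1
            if text:
                placed.append((r, col, text, span))
    return placed


# Toy data: 5 day columns, three blocks.
rows = [
    [("", 1), ("", 1), ("WEBD", 2), ("", 1), ("", 1)],  # WEBD spans 2 blocks
    [("", 1), ("", 1), ("", 1), ("PROJ-T", 2)],          # column 2 is covered
    [("", 1), ("MENT", 1), ("", 1), ("", 1)],            # column 4 is covered
]
placed = place_cells(rows, 5)
print(placed)
# [(0, 2, 'WEBD', 2), (1, 4, 'PROJ-T', 2), (2, 1, 'MENT', 1)]
```

Note how the fourth cell in the second row lands in column 4 rather than column 3, because column 2 is still taken by the cell above it.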
Rather than use massive CSS selectors, select just the table rows and we'll iterate over those:
roster = []
rowspans = {}  # track rowspanning cells
# every second row in the table
rows = page.select('html > body > center > table > tr')[1:21:2]
for block, row in enumerate(rows, 1):
    # take direct child td cells, but skip the first cell:
    daycells = row.select('> td')[1:]
    rowspan_offset = 0
    for daynum, daycell in enumerate(daycells, 1):
        # rowspan handling; if there is a rowspan here, adjust to find correct position
        daynum += rowspan_offset
        while rowspans.get(daynum, 0):
            rowspan_offset += 1
            rowspans[daynum] -= 1
            daynum += 1
        # now we have a correct day number for this cell, adjusted for
        # rowspanning cells.
        # update the rowspan accounting for this cell
        rowspan = (int(daycell.get('rowspan', 2)) // 2) - 1
        if rowspan:
            rowspans[daynum] = rowspan
        texts = daycell.select("table > tr > td > font")
        if texts:
            # class info found
            teacher, classroom, course = (c.get_text(strip=True) for c in texts)
            roster.append({
                'blok_start': block,
                'blok_eind': block + rowspan,
                'dag': daynum,
                'leraar': teacher,
                'lokaal': classroom,
                'vak': course
            })
    # days that were skipped at the end due to a rowspan
    while daynum < 5:
        daynum += 1
        if rowspans.get(daynum, 0):
            rowspans[daynum] -= 1
This produces correct output:
[{'blok_eind': 2,
'blok_start': 1,
'dag': 5,
'leraar': u'BLEEJ002',
'lokaal': u'ALK B021',
'vak': u'WEBD'},
{'blok_eind': 3,
'blok_start': 2,
'dag': 3,
'leraar': u'BLEEJ002',
'lokaal': u'ALK B021B',
'vak': u'WEBD'},
{'blok_eind': 4,
'blok_start': 3,
'dag': 5,
'leraar': u'DOODF000',
'lokaal': u'ALK C212',
'vak': u'PROJ-T'},
{'blok_eind': 5,
'blok_start': 4,
'dag': 3,
'leraar': u'BLEEJ002',
'lokaal': u'ALK B021B',
'vak': u'MENT'},
{'blok_eind': 7,
'blok_start': 6,
'dag': 5,
'leraar': u'JONGJ003',
'lokaal': u'ALK B008',
'vak': u'BURG'},
{'blok_eind': 8,
'blok_start': 7,
'dag': 3,
'leraar': u'FLUIP000',
'lokaal': u'ALK B004',
'vak': u'ICT algemeen Prakti'},
{'blok_eind': 9,
'blok_start': 8,
'dag': 5,
'leraar': u'KOOLE000',
'lokaal': u'ALK B008',
'vak': u'NED'}]
Moreover, this code will continue to work even if courses span more than 2 blocks, or just one block; any rowspan size is supported.
It may be better to use a built-in bs4 method such as "findAll" to parse your table.
You could use the following code:
from pprint import pprint
from bs4 import BeautifulSoup
import requests
r = requests.get("http://rooster.horizoncollege.nl/rstr/ECO/AMR/400-ECO/Roosters/36"
"/c/c00025.htm")
content=r.content
page = BeautifulSoup(content, "html")
table=page.find('table')
trs=table.findAll("tr", {},recursive=False)
tr_count=0
trs.pop(0)
final_table={}
for tr in trs:
tds=tr.findAll("td", {},recursive=False)
if tds:
td_count=0
tds.pop(0)
for td in tds:
if td.has_attr('rowspan'):
final_table[str(tr_count)+"-"+str(td_count)]=td.text.strip()
if int(td.attrs['rowspan'])==4:
final_table[str(tr_count+1)+"-"+str(td_count)]=td.text.strip()
if (str(tr_count)+"-"+str(td_count+1)) in final_table:
td_count=td_count+1
td_count=td_count+1
tr_count=tr_count+1
roster=[]
for i in range(0,10): #iterate over time
for j in range(0,5): #iterate over day
item=final_table[str(i)+"-"+str(j)]
if len(item)!=0:
block_eind=i+1
try:
if final_table[str(i+1)+"-"+str(j)]==final_table[str(i)+"-"+str(j)]:
block_eind=i+2
except:
pass
try:
lokaal=item.split('\r\n \n\n')[0]
leraar=item.split('\r\n \n\n')[1].split('\n \n\r\n')[0]
vak=item.split('\n \n\r\n')[1]
except:
lokaal=leraar=vak="---"
dayroster = {
"dag": j+1,
"blok_start": i+1,
"blok_eind": block_eind,
"lokaal": lokaal,
"leraar": leraar,
"vak": vak
}
dayroster_double = {
"dag": j+1,
"blok_start": i,
"blok_eind": block_eind,
"lokaal": lokaal,
"leraar": leraar,
"vak": vak
}
#use to prevent double dict for same event
if dayroster_double not in roster:
roster.append(dayroster)
print (roster)

Why can't I delete duplicates from this list (of BeautifulSoup HTML) in Python?

I currently have a list of BeautifulSoup HTML items that I got with the following method call:
tables = HTML.findAll("table", {"class": "datadisplaytable"})
This simply returns all the tables in the HTML document that match the query. This is all well and good, but it returns duplicate tables (as seen in my output below).
I've tried doing this to delete the duplicates:
tables = list(set(HTML.findAll("table", {"class": "datadisplaytable"})))
That deletes the duplicates, but it doesn't preserve the order, which I need.
So I tried this:
holder = []
for item in tables:
if item not in holder:
holder.append(item)
However, the duplicates still exist. Is the above method not capable of handling BeautifulSoup HTML? If not, how do you delete BeautifulSoup HTML duplicates while preserving the order?
EDIT:
tables = OrderedDict.fromkeys(HTML.findAll("table", {"class": "datadisplaytable"})).keys()
Then when printing, it was duplicate free:
for item in tables:
print "\n\n\n"
print item
But then, when I try to print doing the following, the duplicates are back. Am I going crazy?
i = 0
while (i < len(tables)-1):
print "\n\nitem[i]: \n", tables[i]
print "\n\nitem[i+1]: \n", tables[i+1]
i += 1
Any ideas?
item[i]:
<table class="datadisplaytable" summary="This table lists the scheduled meeting times and assigned instructors for this class.."><caption class="captiontext">Scheduled Meeting Times</caption>
<tbody><tr>
<th class="ddheader" scope="col">Type</th>
<th class="ddheader" scope="col">Time</th>
<th class="ddheader" scope="col">Days</th>
<th class="ddheader" scope="col">Where</th>
<th class="ddheader" scope="col">Date Range</th>
<th class="ddheader" scope="col">Schedule Type</th>
<th class="ddheader" scope="col">Instructors</th>
</tr>
<tr>
<td class="dddefault">Class</td>
<td class="dddefault">2:00 pm - 3:15 pm</td>
<td class="dddefault">MWF</td>
<td class="dddefault">Manchester Hall 241</td>
<td class="dddefault">Jan 13, 2015 - May 07, 2015</td>
<td class="dddefault">Lecture</td>
<td class="dddefault">William H. Turkett (<abbr title="Primary">P</abbr>)<img align="middle" alt="E-mail" border="0" class="headerImg" height="28" hspace="0" name="web_email" src="/wtlgifs/web_email.gif" title="E-mail" vspace="0" width="28"/></td>
</tr>
</tbody></table>
item[i+1]:
<table class="datadisplaytable" summary="This layout table is used to present the schedule course detail"><caption class="captiontext">Linear Algebra I - MTH 121 - C</caption>
<tbody><tr>
<th class="ddlabel" colspan="2" scope="row">Associated Term:</th>
<td class="dddefault">Spring 2015</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row"><acronym title="Course Reference Number">CRN</acronym>:</th>
<td class="dddefault">19765</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Status:</th>
<td class="dddefault">**Web Registered** on Nov 05, 2014</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Assigned Instructor:</th>
<td class="dddefault">
Jason D. Gaddis<img align="middle" alt="E-mail" border="0" class="headerImg" height="28" hspace="0" name="web_email" src="/wtlgifs/web_email.gif" title="E-mail" vspace="0" width="28"/>
</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Grade Mode:</th>
<td class="dddefault">Standard Letter</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Credits:</th>
<td class="dddefault"> 4.000</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Level:</th>
<td class="dddefault">Undergraduate</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Campus:</th>
<td class="dddefault">Reynolda Campus (UG)</td>
</tr>
</tbody></table>
item[i]:
<table class="datadisplaytable" summary="This layout table is used to present the schedule course detail"><caption class="captiontext">Linear Algebra I - MTH 121 - C</caption>
<tbody><tr>
<th class="ddlabel" colspan="2" scope="row">Associated Term:</th>
<td class="dddefault">Spring 2015</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row"><acronym title="Course Reference Number">CRN</acronym>:</th>
<td class="dddefault">19765</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Status:</th>
<td class="dddefault">**Web Registered** on Nov 05, 2014</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Assigned Instructor:</th>
<td class="dddefault">
Jason D. Gaddis<img align="middle" alt="E-mail" border="0" class="headerImg" height="28" hspace="0" name="web_email" src="/wtlgifs/web_email.gif" title="E-mail" vspace="0" width="28"/>
</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Grade Mode:</th>
<td class="dddefault">Standard Letter</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Credits:</th>
<td class="dddefault"> 4.000</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Level:</th>
<td class="dddefault">Undergraduate</td>
</tr>
<tr>
<th class="ddlabel" colspan="2" scope="row">Campus:</th>
<td class="dddefault">Reynolda Campus (UG)</td>
</tr>
</tbody></table>
item[i+1]:
<table class="datadisplaytable" summary="This table lists the scheduled meeting times and assigned instructors for this class.."><caption class="captiontext">Scheduled Meeting Times</caption>
<tbody><tr>
<th class="ddheader" scope="col">Type</th>
<th class="ddheader" scope="col">Time</th>
<th class="ddheader" scope="col">Days</th>
<th class="ddheader" scope="col">Where</th>
<th class="ddheader" scope="col">Date Range</th>
<th class="ddheader" scope="col">Schedule Type</th>
<th class="ddheader" scope="col">Instructors</th>
</tr>
<tr>
<td class="dddefault">Class</td>
<td class="dddefault">12:30 pm - 1:45 pm</td>
<td class="dddefault">MWF</td>
<td class="dddefault">Carswell Hall 101</td>
<td class="dddefault">Jan 13, 2015 - May 07, 2015</td>
<td class="dddefault">Lecture</td>
<td class="dddefault">Jason Dale Gaddis (<abbr title="Primary">P</abbr>)<img align="middle" alt="E-mail" border="0" class="headerImg" height="28" hspace="0" name="web_email" src="/wtlgifs/web_email.gif" title="E-mail" vspace="0" width="28"/></td>
</tr>
</tbody></table>
Instead, I'd rely on something unique about the tables, for example the summary attribute:
summaries = set()
tables = []
for table in soup.find_all("table", {"class": "datadisplaytable"}):
summary = table['summary']
if summary not in summaries:
summaries.add(summary)
tables.append(table)
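The same idea can be written as a small reusable helper. This is a minimal sketch, not part of the answer above: the name `dedupe_by_key` is hypothetical, and plain dictionaries stand in for the parsed tables so that the example runs without any HTML.

```python
def dedupe_by_key(items, key):
    """Keep the first item for each distinct key, preserving input order."""
    seen = set()
    unique = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            unique.append(item)
    return unique


# Stand-in data: dictionaries instead of BeautifulSoup tags.
tables = [
    {"summary": "course detail", "caption": "Linear Algebra I"},
    {"summary": "meeting times", "caption": "Scheduled Meeting Times"},
    {"summary": "course detail", "caption": "Linear Algebra I"},
]
unique = dedupe_by_key(tables, key=lambda t: t["summary"])
print([t["caption"] for t in unique])
# ['Linear Algebra I', 'Scheduled Meeting Times']
```

With real BeautifulSoup tags, the key could be `lambda t: t.get("summary")` as in the loop above, or `str` to compare the full markup of each table when two different tables may share a summary.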

Python BeautifulSoup: how to get the index of a row in an HTML table

<TABLE WIDTH="100%"> <TR> <TH scope="row" VALIGN="TOP" ALIGN="LEFT" WIDTH="10%">Inventors:</TH> <TD ALIGN="LEFT" WIDTH="90%">
<B>Shimada; Masahiro</B> (Shiga, <B>JP</B>) </TD> </TR>
<TR><TH scope="row" VALIGN="TOP" ALIGN="LEFT" WIDTH="10%">Applicant: </TH><TD ALIGN="LEFT" WIDTH="90%"> <TABLE> <TR> <TH scope="column" ALIGN="center">Name</TH> <TH scope="column" ALIGN="center">City</TH> <TH scope="column" ALIGN="center">State</TH> <TH
scope="column" ALIGN="center">Country</TH> <TH scope="column" ALIGN="center">Type</TH> </TR> <TR> <TD> <B><br>Shimada; Masahiro</B> </TD><TD> <br>Shiga </TD><TD ALIGN="center"> <br>N/A </TD><TD ALIGN="center"> <br>JP </TD> </TD><TD ALIGN="left"> </TD>
</TR> </TABLE> </TD></TR>
<TR> <TH scope="row" VALIGN="TOP" ALIGN="LEFT" WIDTH="10%">Assignee:</TH>
<TD ALIGN="LEFT" WIDTH="90%">
<B>Ishida Co., Ltd.</B>
(Kyoto,
<B>JP</B>)
<BR>
</TD>
</TR>
<TR><TH scope="row" VALIGN="TOP" ALIGN="LEFT" WIDTH="10%" NOWRAP>Appl. No.:
</TH><TD ALIGN="LEFT" WIDTH="90%">
<B>12/791,478</B></TD></TR>
<TR><TH scope="row" VALIGN="TOP" ALIGN="LEFT" WIDTH="10%">Filed:
</TH><TD ALIGN="LEFT" WIDTH="90%">
<B>June 1, 2010</B></TD></TR>
</TABLE>
Above is the HTML table I need to extract data from; it is taken from this US Patent Office URL.
But when I use:
trtemp=souptemp.findAll('tr')
PattentInventors=trtemp[7].text.strip()
PattentCompany=trtemp[11].text.strip()
PattentFiledtime=trtemp[13].text.strip()
the tr indexes 7, 11 and 13 are not constant across pages.
So I switched to the re module, like this:
souptemp.findAll(text=re.compile("Assi"))[0]
This finds the text for Assignee: Ishida Co., Ltd. (Kyoto, JP),
but I could not get its index in the tr list.
How can I get the right index for Assignee: Ishida Co., Ltd. (Kyoto, JP)?
Thank you!
In [78]: anchor = soup.findAll(text=re.compile("Assi"))[0]
In [77]: ' '.join(anchor.find_next('td').stripped_strings)
Out[77]: u'Ishida Co., Ltd. (Kyoto, JP )'
import bs4 as bs
import urllib2
import re
url = 'http://patft.uspto.gov//netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=2&f=G&l=50&co1=AND&d=PTXT&s1=%22X+ray%22.ABTX.&s2=detect.ABTX.&OS=ABST/%22X+ray%22+AND+ABST/detect&RS=ABST/%22X+ray%22+AND+ABST/detect'
soup = bs.BeautifulSoup(urllib2.urlopen(url).read())
anchor = soup.findAll(text=re.compile("Assi"))[0]
assignee = ' '.join(anchor.find_next('td').stripped_strings)
print(assignee)
yields
Ishida Co., Ltd. (Kyoto, JP )
