Turning a list of strings into the names of tables

I'm trying to generate tables within this "for" loop. I'm stuck on what I need to do to allow the string in tablenames to be the name of the table.
tablenames = ['t_10', 't_20', 't_30', 't_40', 't_50', 't_60', 't_70', 't_80', 't_90', 't_100']
firstrow = ['10%', '20%', '30%', '40%', '50%', '60%', '70%', '80%', '90%', '100%']
for t, r in zip(tablenames, firstrow):
    t = [[r, '']]
Here's what I ended up doing - I know, it's not pretty (I know there's a way to generate the number of spaces I wanted...it was just not cooperating...also, the cond1 and cond0 lists are odd....)
t_10 = [['10%', '', '', '', '', '', '', '', '', '','','','']]
t_20 = [['20%', '', '', '', '', '', '', '', '', '','','','']]
t_30 = [['30%', '', '', '', '', '', '', '', '', '','','','']]
t_40 = [['40%', '', '', '', '', '', '', '', '', '','','','']]
t_50 = [['50%', '', '', '', '', '', '', '', '', '','','','']]
t_60 = [['60%', '', '', '', '', '', '', '', '', '','','','']]
t_70 = [['70%', '', '', '', '', '', '', '', '', '','','','']]
t_80 = [['80%', '', '', '', '', '', '', '', '', '','','','']]
t_90 = [['90%', '', '', '', '', '', '', '', '', '','','','']]
t_100 = [['100%', '', '', '', '', '', '', '', '', '','','','']]
tnames = [t_10, t_20, t_30, t_40, t_50, t_60, t_70, t_80, t_90, t_100]
cond1 = [conditions_list[0], conditions_list[2], conditions_list[4], conditions_list[6], conditions_list[8]]
cond0 = [conditions_list[1], conditions_list[3], conditions_list[5], conditions_list[7], conditions_list[9]]
for t, c1, c0 in zip(tnames, cond1, cond0):
    c1_results = process_exhaust(c1, 1)
    c0_results = process_exhaust(c0, 0)
    t.append(c1_results[0])
    t.append(c1_results[1])
    t.append(c1_results[2])
    t.append(c0_results[0])
    t.append(c1_results[3])
    t.append(c0_results[1])

Trying to programmatically create variable names is usually a bad idea. You can do it with exec, but a better approach would be to use a dictionary.
D = {t: [r,] for t, r in zip(tablenames, firstrow)}
Now you can do D["t_10"] to get the output you desire.
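As a rough sketch of how the dictionary approach could replace the ten hand-written tables, the snippet below builds each first row with a computed number of blank cells. The name `tables` and the constant `n_columns` are assumptions made for illustration (13 matches the width of the hand-written rows), and only three tables are shown to keep it short.

```python
tablenames = ['t_10', 't_20', 't_30']
firstrow = ['10%', '20%', '30%']
n_columns = 13  # assumed row width: one label plus twelve blank cells

# One dictionary entry per table; each table starts with a single header row.
tables = {
    name: [[label] + [''] * (n_columns - 1)]
    for name, label in zip(tablenames, firstrow)
}

# Rows are appended by looking the table up by its string name.
tables['t_20'].append(['example', 'row'])

print(tables['t_10'])
```

Because `[''] * (n_columns - 1)` is evaluated once per table inside the comprehension, each table gets its own row list rather than sharing one.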

Related

Is there a way to iterate through a list of lists without getting an index error? [closed]

I have a list of lists and I am trying to pull out every nth term from each list within the list.
Here is my input:
[['', '', '', '', '1', '', '', '', '', '', '', '', '1TD1131D17025-2035', '', '', '',
'', '', '', '', '', '', '', '', '', '', '09/16/2022', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '2',
'', '', '', '', 'EA', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '353.60', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '707.20', '\n'], ['', '', '', '', '2', '',
'', '', '', '', '', '', '1TD1131D17025-2036', '', '', '', '', '', '', '', '', '', '',
'', '', '', '09/16/2022', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '2', '', '', '', '', 'EA', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'353.60', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '707.20', '\n'], ['', '', '', '', '3', '', '', '', '', '', '', '',
'1TD1131D17025-2037', '', '', '', '', '', '', '', '', '', '', '', '', '', '09/16/2022',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '2', '', '', '', '', 'EA', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '353.60', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '707.20',
'\n']]
Right now I am trying to pull out the first integer from each list.
Here is my sample code.
def find(n,e):
    for line in range(len(line_nu)):
        item = line_nu[n][e]
        n += 1
    return item_nu.append(item)
I'm getting an 'Index out of range' error.
I can call ' line_nu[0][4] ' outside of this loop, but using same numbers in def find() I get an error. I have also tried this as a while loop where I replace n with i and start count at 0. Same error.
The end goal is to get each non-empty value (anything other than '') into a list of its own.
Anyone know what I'm doing wrong?

Assuming your list of lists is named data, we can get the first integer of each sublist like this:
data = [[...],[...],...]
for sublist in data:
    for item in sublist:
        if item.isdigit():
            print(item)
            break

I interpreted your question as looking for the nth non-empty element in each sublist, that is, elements != ''. This function does that and returns a list containing the nth non-empty element of each sublist.
exl = [['', '', '', '', '1', '', '', '', '', '', '', '', '1TD1131D17025-2035', '', '', '',
'', '', '', '', '', '', '', '', '', '', '09/16/2022', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '2',
'', '', '', '', 'EA', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '353.60', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '707.20', '\n'], ['', '', '', '', '2', '',
'', '', '', '', '', '', '1TD1131D17025-2036', '', '', '', '', '', '', '', '', '', '',
'', '', '', '09/16/2022', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '2', '', '', '', '', 'EA', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'353.60', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '707.20', '\n'], ['', '', '', '', '3', '', '', '', '', '', '', '',
'1TD1131D17025-2037', '', '', '', '', '', '', '', '', '', '', '', '', '', '09/16/2022',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '2', '', '', '', '', 'EA', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '353.60', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '707.20',
'\n']]
def find(n, listOfLists):
    result = []
    for sublist in listOfLists:
        discoveredItems = 0
        for item in sublist:
            if item != '':
                discoveredItems += 1
                if discoveredItems == n:
                    result.append(item)
                    break
    return result
So to get the first non-empty item in each list (the integers you mentioned above)...
find(1,exl)
# ['1', '2', '3']
And the dates (aka 3rd non-empty element):
find(3,exl)
# ['09/16/2022', '09/16/2022', '09/16/2022']
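Since the stated end goal is a list of the non-empty values for each row, a list comprehension may be enough on its own. This is only a sketch: the `rows` sample is a shortened, made-up stand-in for the real data, and `non_empty` is a hypothetical helper name.

```python
# Shortened stand-in for the real data (assumed shape: a list of lists of strings).
rows = [
    ['', '', '1', '', 'ABC-2035', '', '09/16/2022', '', '353.60', '\n'],
    ['', '', '2', '', 'ABC-2036', '', '09/16/2022', '', '353.60', '\n'],
]

def non_empty(rows):
    """Return each row with empty and whitespace-only entries removed."""
    return [[item for item in row if item.strip()] for row in rows]

cleaned = non_empty(rows)

# With the blanks gone, positions are stable, so plain indexing works.
first_items = [row[0] for row in cleaned]
print(cleaned)
print(first_items)
```

Using `item.strip()` as the test also drops the trailing `'\n'` entry, which a plain `item != ''` check would keep.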

You'll need two for loops. For example:
test = [['', '', '', '', '1', '', '', '', '', '', '', '', '1TD1131D17025-2035', '', '', '',
'', '', '', '', '', '', '', '', '', '', '09/16/2022', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '2',
'', '', '', '', 'EA', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '353.60', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '707.20', '\n'], ['', '', '', '', '2', '',
'', '', '', '', '', '', '1TD1131D17025-2036', '', '', '', '', '', '', '', '', '', '',
'', '', '', '09/16/2022', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '2', '', '', '', '', 'EA', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'353.60', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '707.20', '\n'], ['', '', '', '', '3', '', '', '', '', '', '', '',
'1TD1131D17025-2037', '', '', '', '', '', '', '', '', '', '', '', '', '', '09/16/2022',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '2', '', '', '', '', 'EA', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '353.60', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '707.20',
'\n']]
for row in range(len(test)):
    for col in range(len(test[row])):
        print(test[row][col])

If the numbers are always at the same index for all sublists, you can get a list of the numbers (as strings) by looping once:
def find(lists, n):
    out = []
    for lst in lists:
        out.append(lst[n])
    return out
# call it with n=4
numbers = find(mylists, 4)
A list comprehension is arguably more appropriate here:
def find(lists, n):
    return [lst[n] for lst in lists]

L = [['', '', '', '', '1', '', '', '', '', '', '', '', '1TD1131D17025-2035', '', '', '',
'', '', '', '', '', '', '', '', '', '', '09/16/2022', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '2',
'', '', '', '', 'EA', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '353.60', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '707.20', '\n'], ['', '', '', '', '2', '',
'', '', '', '', '', '', '1TD1131D17025-2036', '', '', '', '', '', '', '', '', '', '',
'', '', '', '09/16/2022', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '2', '', '', '', '', 'EA', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'353.60', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '707.20', '\n'], ['', '', '', '', '3', '', '', '', '', '', '', '',
'1TD1131D17025-2037', '', '', '', '', '', '', '', '', '', '', '', '', '', '09/16/2022',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '2', '', '', '', '', 'EA', '', '', '', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '353.60', '', '', '', '', '',
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '707.20',
'\n']]
for _ in range(len(L)):
    print(set(L[_]))
OUTPUT:
{'', '707.20', '353.60', '1TD1131D17025-2035', '2', 'EA', '09/16/2022', '1', '\n'}
{'', '707.20', '353.60', '2', 'EA', '09/16/2022', '1TD1131D17025-2036', '\n'}
{'', '707.20', '353.60', '2', 'EA', '09/16/2022', '1TD1131D17025-2037', '\n', '3'}

Selenium Python not able to extract text within all span tags

I am creating a small python program that automates 10fastfingers. In order to do that, I have to first extract all the words that I have to type. All these words are stored within span tags like this:
When I run my code, it just extracts the first 20-30 words rather than extracting all the words. Why is this so? Here is my code:
from selenium import webdriver
import time
url = "https://10fastfingers.com/typing-test/english"
browser = webdriver.Chrome("D:\\Python_Files\\Programs\\chromedriver.exe")
browser.get(url)
time.sleep(10)
count = 1
wordlst = []
while True:
    try:
        word = browser.find_element_by_xpath(f'//*[@id="row1"]/span[{count}]')
        wordlst.append(word.text)
        count += 1
    except:
        break
print(wordlst)
Output:
['them', 'how', 'said', 'light', 'show', 'seem', 'not', 'two', 'under', 'hear', 'them', 'there', 'about', 'face', 'us', 'change', 'year', 'only', 'leave', 'number', 'found', 'father', 'people', 'house', 'really', 'my', 'spell', 'when', 'look', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
How to solve this problem? Any help would be appreciated. Thanks!

You can do that with BeautifulSoup
from selenium import webdriver
import time
from bs4 import BeautifulSoup
url = "https://10fastfingers.com/typing-test/english"
browser = webdriver.Chrome("D:\\Python_Files\\Programs\\chromedriver.exe")
browser.get(url)
time.sleep(3)
html_soup = BeautifulSoup(browser.page_source, 'html.parser')
div = html_soup.find_all('div', id = 'row1')
wordlst=div[0].get_text().split()
browser.quit()
print(wordlst)
OR
to continue your approach,
from selenium import webdriver
import time
url = "https://10fastfingers.com/typing-test/english"
browser = webdriver.Chrome("D:\\Python_Files\\Programs\\chromedriver.exe")
browser.get(url)
time.sleep(6)
wordlst=browser.find_elements_by_xpath('//div[@id="row1"]/span')
wordlst=[x.get_attribute("innerHTML") for x in wordlst]
browser.quit()
print(wordlst)
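Both versions above read the markup rather than the rendered text. Selenium's `text` property returns only text that is rendered as visible, which would explain the empty strings for words that are not yet on screen. The idea of pulling every span's contents out of the page source can be sketched with the standard library alone; here `html.parser` is swapped in for BeautifulSoup, and `sample_html` is a made-up stand-in for `browser.page_source`.

```python
from html.parser import HTMLParser

class SpanTextCollector(HTMLParser):
    """Collect the text of every <span> inside the <div> with a given id."""

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.div_depth = 0   # > 0 while we are inside the target div
        self.in_span = False
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            if self.div_depth > 0:
                self.div_depth += 1          # nested div inside the target
            elif dict(attrs).get("id") == self.container_id:
                self.div_depth = 1           # entered the target div
        elif tag == "span" and self.div_depth > 0:
            self.in_span = True

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1
        elif tag == "span":
            self.in_span = False

    def handle_data(self, data):
        if self.in_span and data.strip():
            self.words.append(data.strip())

# Hypothetical markup; the hidden span is still present in the source.
sample_html = (
    '<div id="row1"><span class="highlight">them</span><span>how</span>'
    '<span style="display:none">said</span></div>'
    '<div id="row2"><span>ignored</span></div>'
)

parser = SpanTextCollector("row1")
parser.feed(sample_html)
words = parser.words
print(words)
```

The hidden span is collected along with the visible ones because the parser never consults styling, only the markup.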

Is it possible to return the correct values from a list of dictionaries based on user inputted subject ID?

I apologize for any formatting issues or unclear parts, I'm VERY new to Python and programming in general. I want to make a script that pulls a list of research participant records (the code provided here is sample data) and the list contains separate dictionary-like items that have all of the screening questions (including the record id, or subject ID). I want to pull particular items (self-harm reports and suicidal thoughts questions) from this depending on what the script's user inputs as the record id
I want the script to be able to pull from a growing list of dictionaries, so it has to look records up by index. So far I have tried to return a tuple based on the user input, but regardless of what I input for subj, it returns the same three values ('1', '2', '1'), the values of ONLY the first dictionary.
from redcap import Project, RedcapError
URL = 'https://redcap.lib.umd.edu/api/'
#API KEY for sample data
API_KEY = 'B2E685118B86FA89F57C49A1C9A38BDC'
project = Project(URL, API_KEY)
all_data = project.export_records()
def find(subj, data):
    index = 0
    j = 0
    for i in data:
        for k,v in i.items():
            if k == 'record_id' and v == subj:
                index = j
                j+=1
            else:
                j+=1
        return data[index]['record_id'],data[index]['selfharm_18yr'],data[index]['talksaboutkillingself_18yr']
AN EXAMPLE OF DATA RECORD
[{'record_id': '1', 'child_gender': '', 'c_age': '', 'c_dob': '', 't_date': '', 'school_yn': '', 'school_grade': '', 'father_job': '', 'mother_work': '', 'parentgender': '', 'relation_to_child': '', 'other': '', 'no_sports': '', 'sport_a': '', 'average_time_a': '', 'average_skill_a': '', 'sport_b_yes': '', 'sport_b': '', 'average_time_b': '', 'average_skill_b': '', 'sport_c_yes': '', 'sport_c': '', 'average_time_c': '', 'average_skill_c': '', 'hobby_a_yes': '', 'hobby_a': '', 'hobby_a_time': '', 'hobby_a_skill': '', 'hobby_b_yes': '', 'hobby_b': '', 'hobby_b_time': '', 'hobby_b_skill': '', 'hobby_c_yes': '', 'hobby_c': '', 'hobby_c_time': '', 'hobby_c_skill': '', 'clubs': '', 'club1': '', 'activeclub1': '', 'clubs_2': '', 'club2': '', 'activeclub2': '', 'clubs_3': '', 'club3': '', 'activeclub3': '', 'chore_a_yes': '', 'chore_a': '', 'chore_a_skill': '', 'chore_b_yes': '', 'chore_b': '', 'chore_b_skill': '', 'chore_c_yes': '', 'chore_c': '', 'chore_c_skill': '', 'close_friends': '', 'friends': '', 'get_along_siblings': '', 'along_withkids': '', 'behave': '', 'play_work': '', 'attend_school': '', 'school_reason': '', 'performance1': '', 'performance2': '', 'performance3': '', 'performance4': '', 'othersubjects': '', 'other_subjects': '', 'performanceother': '', 'other2': '', 'other_subjects_2': '', 'performanceother_2': '', 'other3': '', 'other_subjects_3': '', 'performanceother_3': '', 'specialeducation': '', 'sp_ed': '', 'repeat_grades': '', 'repeat2': '', 'academic_problems': '', 'describe_problems': '', 'problems_date': '', 'problems_yn': '', 'end_problems': '', 'disabilities': '', 'disability2': '', 'concerns': '', 'best_things': '', 'too_young': '', 'alcohol': '', 'describe_alc18yr': '', 'argues': '', 'fails_finishing_things': '', 'enjoyment': '', 'bm': '', 'bragging': '', 'concentration': '', 'obsessions': '', 'describe_obesessions': '', 'restlessness': '', 'dependence': '', 'lonely': '', 'confusion': '', 'crying': '', 'cruelty_animals': '', 'cruelty': '', 
'daydreams': '', 'selfharm_18yr': '2', 'attention': '', 'destruction': '', 'destruction2': '', 'disobedience': '', 'school_disobedience': '', 'eating_well': '', 'getting_along': '', 'guilt_misbehaving': '', 'jealousy': '', 'rule_breaking': '', 'fearful': '', 'describe_fears': '', 'fears_school': '', 'fears_thoughts': '', 'perfection': '', 'loveless': '', 'others_outtoget': '', 'worthlessness': '', 'accident_prone': '', 'fights': '', 'teasing': '', 'trouble_makers': '', 'voices': '', 'describe_voices': '', 'impulsive_acts': '', 'solitary': '', 'lying_cheating': '', 'fingernails': '', 'tense': '', 'movements': '', 'describe_movements': '', 'nightmares': '', 'likeability': '', 'constipation': '', 'fear_anxiety': '', 'dizziness': '', 'guilt': '', 'overeating': '', 'overtired': '', 'overweight': '', 'aches_pains': '', 'headaches': '', 'nausea': '', 'eye_problems': '', 'describe_eyes': '', 'skin_problems': '', 'stomach_aches': '', 'vomiting': '', 'other_conditions': '', 'describe_other': '', 'physical_violence': '', 'picks_skin': '', 'describe_skin': '', 'public': '', 'public2': '', 'school_work': '', 'coordination': '', 'older_kids': '', 'younger_kids': '', 'talking_refusal': '', 'compulsions': '', 'describe_compulsions': '', 'runs_away': '', 'screams': '', 'secretive': '', 'seeing_things': '', 'describe_seeingthings': '', 'self_conscious': '', 'sets_fires': '', 'sexual_problems': '', 'describe_sexualproblems': '', 'clowning': '', 'shy_timid': '', 'sleeps_less': '', 'sleeps_more': '', 'describe_sleeping': '', 'inattentive': '', 'speech_problems': '', 'describe_speechproblems': '', 'stares_blankly': '', 'steals_home': '', 'steals_outside': '', 'stores': '', 'describe_hoarding': '', 'strange_behavior': '', 'describe_strangebehavior': '', 'strange_ideas': '', 'describe_ideas': '', 'stubborn_sullen': '', 'mood_changes': '', 'sulking': '', 'suspicious': '', 'swearing_obscenities': '', 'talksaboutkillingself_18yr': '1', 'sleeptalking_walking': '', 'describe_sleeptalking': '', 
'talks_toomuch': '', 'frequent_teasing': '', 'temper_tantrums': '', 'thinks_sex': '', 'threatens_people': '', 'thumb_sucking': '', 'smoking': '', 'sleeping_troubles': '', 'describe_sleepingtroubles': '', 'truancy': '', 'low_energy': '', 'depression': '', 'loud': '', 'uses_drugs': '', 'describe_drugusage': '', 'vandalism': '', 'wets_self': '', 'wets_bed': '', 'whining': '', 'opposite_sex': '', 'withdrawn': '', 'frequent_worries': '', 'additional_problems': '', 'problem_a': '', 'prob_a_true': '', 'problem_b_yes': '', 'problem_b': '', 'prob_b_true': '', 'problem_c_yes': '', 'problem_c': '', 'prob_c_true': ''}, {'record_id': '2', 'child_gender': '', 'c_age': '', 'c_dob': '', 't_date': '', 'school_yn': '', 'school_grade': '', 'father_job': '', 'mother_work': '', 'parentgender': '', 'relation_to_child': '', 'other': '', 'no_sports': '', 'sport_a': '', 'average_time_a': '', 'average_skill_a': '', 'sport_b_yes': '', 'sport_b': '', 'average_time_b': '', 'average_skill_b': '', 'sport_c_yes': '', 'sport_c': '', 'average_time_c': '', 'average_skill_c': '', 'hobby_a_yes': '', 'hobby_a': '', 'hobby_a_time': '', 'hobby_a_skill': '', 'hobby_b_yes': '', 'hobby_b': '', 'hobby_b_time': '', 'hobby_b_skill': '', 'hobby_c_yes': '', 'hobby_c': '', 'hobby_c_time': '', 'hobby_c_skill': '', 'clubs': '', 'club1': '', 'activeclub1': '', 'clubs_2': '', 'club2': '', 'activeclub2': '', 'clubs_3': '', 'club3': '', 'activeclub3': '', 'chore_a_yes': '', 'chore_a': '', 'chore_a_skill': '', 'chore_b_yes': '', 'chore_b': '', 'chore_b_skill': '', 'chore_c_yes': '', 'chore_c': '', 'chore_c_skill': '', 'close_friends': '', 'friends': '', 'get_along_siblings': '', 'along_withkids': '', 'behave': '', 'play_work': '', 'attend_school': '', 'school_reason': '', 'performance1': '', 'performance2': '', 'performance3': '', 'performance4': '', 'othersubjects': '', 'other_subjects': '', 'performanceother': '', 'other2': '', 'other_subjects_2': '', 'performanceother_2': '', 'other3': '', 'other_subjects_3': '', 
'performanceother_3': '', 'specialeducation': '', 'sp_ed': '', 'repeat_grades': '', 'repeat2': '', 'academic_problems': '', 'describe_problems': '', 'problems_date': '', 'problems_yn': '', 'end_problems': '', 'disabilities': '', 'disability2': '', 'concerns': '', 'best_things': '', 'too_young': '', 'alcohol': '', 'describe_alc18yr': '', 'argues': '', 'fails_finishing_things': '', 'enjoyment': '', 'bm': '', 'bragging': '', 'concentration': '', 'obsessions': '', 'describe_obesessions': '', 'restlessness': '', 'dependence': '', 'lonely': '', 'confusion': '', 'crying': '', 'cruelty_animals': '', 'cruelty': '', 'daydreams': '', 'selfharm_18yr': '3', 'attention': '', 'destruction': '', 'destruction2': '', 'disobedience': '', 'school_disobedience': '', 'eating_well': '', 'getting_along': '', 'guilt_misbehaving': '', 'jealousy': '', 'rule_breaking': '', 'fearful': '', 'describe_fears': '', 'fears_school': '', 'fears_thoughts': '', 'perfection': '', 'loveless': '', 'others_outtoget': '', 'worthlessness': '', 'accident_prone': '', 'fights': '', 'teasing': '', 'trouble_makers': '', 'voices': '', 'describe_voices': '', 'impulsive_acts': '', 'solitary': '', 'lying_cheating': '', 'fingernails': '', 'tense': '', 'movements': '', 'describe_movements': '', 'nightmares': '', 'likeability': '', 'constipation': '', 'fear_anxiety': '', 'dizziness': '', 'guilt': '', 'overeating': '', 'overtired': '', 'overweight': '', 'aches_pains': '', 'headaches': '', 'nausea': '', 'eye_problems': '', 'describe_eyes': '', 'skin_problems': '', 'stomach_aches': '', 'vomiting': '', 'other_conditions': '', 'describe_other': '', 'physical_violence': '', 'picks_skin': '', 'describe_skin': '', 'public': '', 'public2': '', 'school_work': '', 'coordination': '', 'older_kids': '', 'younger_kids': '', 'talking_refusal': '', 'compulsions': '', 'describe_compulsions': '', 'runs_away': '', 'screams': '', 'secretive': '', 'seeing_things': '', 'describe_seeingthings': '', 'self_conscious': '', 'sets_fires': '', 
'sexual_problems': '', 'describe_sexualproblems': '', 'clowning': '', 'shy_timid': '', 'sleeps_less': '', 'sleeps_more': '', 'describe_sleeping': '', 'inattentive': '', 'speech_problems': '', 'describe_speechproblems': '', 'stares_blankly': '', 'steals_home': '', 'steals_outside': '', 'stores': '', 'describe_hoarding': '', 'strange_behavior': '', 'describe_strangebehavior': '', 'strange_ideas': '', 'describe_ideas': '', 'stubborn_sullen': '', 'mood_changes': '', 'sulking': '', 'suspicious': '', 'swearing_obscenities': '', 'talksaboutkillingself_18yr': '2', 'sleeptalking_walking': '', 'describe_sleeptalking': '', 'talks_toomuch': '', 'frequent_teasing': '', 'temper_tantrums': '', 'thinks_sex': '', 'threatens_people': '', 'thumb_sucking': '', 'smoking': '', 'sleeping_troubles': '', 'describe_sleepingtroubles': '', 'truancy': '', 'low_energy': '', 'depression': '', 'loud': '', 'uses_drugs': '', 'describe_drugusage': '', 'vandalism': '', 'wets_self': '', 'wets_bed': '', 'whining': '', 'opposite_sex': '', 'withdrawn': '', 'frequent_worries': '', 'additional_problems': '', 'problem_a': '', 'prob_a_true': '', 'problem_b_yes': '', 'problem_b': '', 'prob_b_true': '', 'problem_c_yes': '', 'problem_c': '', 'prob_c_true': ''}]
I expect it to output a tuple of the three values, depending on the record id of the corresponding dictionary, but it instead outputs the same thing regardless of the subject ID.
AN EXAMPLE OF THE OUTPUT
find('1', all_data)
('1', '2', '1')
find('2', all_data)
('1', '2', '1')
In the future I also want to be able to send those to an Excel spreadsheet.

So in this case, you're doing a ton of unnecessary iteration. The beauty of python dictionaries is that they're hashed and optimized for lookup operations.
Rather than iterating through keys and values, all you need to do is supply the key as the index and return early if the record exists. (Note, I changed a few names around for clarity, and to ensure that things like find() don't shadow built-in methods from other classes)
def find_item(subj, data):
    for subdict in data:
        if subdict['record_id'] == subj:
            return subdict['record_id'], subdict['selfharm_18yr'], subdict['talksaboutkillingself_18yr']
    return "No Records Found"
find_item('1', data)
# ('1', '2', '1')
find_item('2', data)
# ('2', '3', '2')
find_item('zyzzyx', data)
# 'No Records Found'
And, regarding your function, here's where I believe the problem lies:
if k == 'record_id' and v == subj:
    index = j
    j+=1
else:
    j+=1
In the case of the provided list of 2 records, this means index is still 0 when the function returns, so even if the record is found at data[1], you still return the values from data[0].
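If the same list is queried many times, one option is to index the records by `record_id` once and then look subjects up directly. The sketch below assumes each record is a plain dict with a unique `record_id`; `build_index`, `lookup` and `FIELDS` are names invented for illustration, and `records` is a trimmed stand-in for the exported data.

```python
# Trimmed stand-in for the exported records (only the fields of interest).
records = [
    {'record_id': '1', 'selfharm_18yr': '2', 'talksaboutkillingself_18yr': '1'},
    {'record_id': '2', 'selfharm_18yr': '3', 'talksaboutkillingself_18yr': '2'},
]

FIELDS = ('record_id', 'selfharm_18yr', 'talksaboutkillingself_18yr')

def build_index(data):
    """Map each record_id to its record so lookups avoid scanning the list."""
    return {rec['record_id']: rec for rec in data}

def lookup(index, subj):
    """Return the fields of interest as a tuple, or None if subj is unknown."""
    rec = index.get(subj)
    if rec is None:
        return None
    return tuple(rec[field] for field in FIELDS)

index = build_index(records)
print(lookup(index, '1'))
print(lookup(index, '2'))
print(lookup(index, 'zyzzyx'))
```

Returning `None` for a missing subject, rather than a message string, keeps the return type predictable for whatever later writes the results to a spreadsheet.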

How to correctly parse HTML to Unicode strings with pandas?

I'm running a Python program that fetches a UTF-8-encoded web page, extracts some text from an HTML table using pandas (read_html), and writes the result to a CSV file.
However, when I write this text to a file, all spaces in it get written in an unexpected encoding (for example \xd0\xb9\xd1\x82\xd0\xb8).
To solve the problem I added the line i = i.split(" ").
After that, the runs of spaces show up in the CSV file as empty-string entries, as in the example below:
['0', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1', '', '', '', '', '', '', '', '', '', '', '', '', '', '2', '', '', '3\n0', '', '', '', '', '', '', '', 'number', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'last name', '', 'number', 'plan', 'NaN\n1', '', '', '', '', '', '', '', '', '', 'NaN', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'NaN', '', '', 'not', 'NaN\n2', '', '', '', '', '53494580', '', '', '', '', '', '', '', '', '', '+', '(53)494580', '', '', '', '', '', '', '', '', 'NP_551', 'NaN\n3', '', '', '', '', '53494581', '', '', '', '', '', '', '', '', '', '+', '(53)494581', '', '', '', '', '', '', '', '', 'NP_551', 'NaN\n4', '', '', '', '']
I would like to get rid of the empty-string entries ( '', ). Is there a way to fix this?
Any pointers would be much appreciated.
code python:
import csv
import requests
import pandas as pd
import html5lib
filename = "1.csv"
file = open(filename, "w", encoding='UTF-8', newline='\n')
output = csv.writer(file, dialect='excel', delimiter=' ')
r = requests.get('http://10.45.87.12/og?sh=1&CallerName=&Sys=.79.83.86.51&')
pd.set_option('display.max_rows', 10000)
df = pd.read_html(r.content)
for i in df:
    i = str(i)
    i = i.strip()
    i = i.encode('UTF-8').decode('UTF-8')
    i = i.split(" ")
    output.writerow(i)
file.close()

You can use the built-in filter function to remove the empty values. Add the snippet below after 'i = i.split(" ")':
A = ['0', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1', '', '', '', '', '', '', '', '', '', '', '', '', '', '2', '', '', '3\n0', '', '', '', '', '', '', '', 'number', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'last name', '', 'number', 'plan', 'NaN\n1', '', '', '', '', '', '', '', '', '', 'NaN', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'NaN', '', '', 'not', 'NaN\n2', '', '', '', '', '53494580', '', '', '', '', '', '', '', '', '', '+', '(53)494580', '', '', '', '', '', '', '', '', 'NP_551', 'NaN\n3', '', '', '', '', '53494581', '', '', '', '', '', '', '', '', '', '+', '(53)494581', '', '', '', '', '', '', '', '', 'NP_551', 'NaN\n4', '', '', '', '']
print(list(filter(None, A)))
Output:
['0', '1', '2', '3\n0', 'number', 'last name', 'number', 'plan', 'NaN\n1', 'NaN', 'NaN', 'not', 'NaN\n2', '53494580', '+', '(53)494580', 'NP_551', 'NaN\n3', '53494581', '+', '(53)494581', 'NP_551', 'NaN\n4']
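The empty strings come from splitting on a single space: every extra space in a run produces one `''`. A small sketch of the difference, using a made-up `text` value in place of the stringified DataFrame:

```python
# Made-up stand-in for str(df): columns padded with runs of spaces.
text = "0     1   2\n0  number   plan"

# Splitting on one space yields an empty string for every extra space.
by_space = text.split(" ")

# Splitting with no argument treats any run of whitespace as one separator.
by_whitespace = text.split()

# Splitting line by line first keeps each table row separate.
per_line = [line.split() for line in text.splitlines()]

print(by_space)
print(by_whitespace)
print(per_line)
```

Note that whitespace splitting also breaks a multi-word cell such as 'last name' into two items. Writing the table with pandas' own `DataFrame.to_csv` instead of stringifying it may be a cleaner route, though that depends on the layout wanted in the file.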

map list values when the position is not known

I have the following list of lists:
(['investmentseminar', '300', '', '', 'CNAME', '', 'domain.com.'], 7)
(['#', '300', '', '', '', '', '', '', '', 'CNAME', '', 'domain.com.'], 12)
(['#', '300', '', '', '', '', '', '', '', '', '', '', '', '', '', 'MX', '', '1', '', 'eu-smtp-inbound-1.com.'], 20)
(['#', '3600', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'TXT', '', 'MS=ms87183849'], 19)
(['#', '3600', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'TXT', '', 'MS=ms91398333'], 19)
It is from a parsed file with BIND data. I am trying to extract the record type and TTL, where the positions of the items in the list are not fixed.
This is the code I have so far:
lines = [['#', '', '', 'MX', '', '10', '', 'relay1.netnames.net.'],['#', '', '', 'MX', '', '20', '', 'relay2.netnames.net.'], ['#', '3600', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'TXT', '', 'MS=ms91398333'], ['#', '300', '', '', '', '', '', '', '', '', '', '', '', '', '', 'MX', '', '1', '', 'eu-smtp-inbound-1.com.'], ['domain.tld.', '3600', '', '', '', '', '', '', '', '', '', '', '', 'TXT', '', 'v=spf1 redirect=spf.domain.tld'],['a.ns.slf', '', '', '', '', '', '', '', '', '', 'A', '', '192.123.54.133'],['adfs', '', '', '', '', '', '', '', '', '', '', '', '', '', 'A', '', '192.123.67.20']]
record_set_list = []
def record_set(record):
    resource = {
        'Name': record[0],
        'TTL': record[1],
        'Type': record[4],
        'Value': record[-1]
    }
    record_set_list.append({'RecordSets': resource})
types = ['A', 'AAAA', 'CAA', 'CNAME', 'MX', 'NAPTR', 'PTR', 'SPF', 'SRV', 'TXT', 'ZONE']
for record in csv.reader(lines, delimiter=" "):
    any_in = any(i in record for i in types)
    if any_in is True:
        record_set(record)
How do I match the TTL, the Type and, in the case of an MX record, the preference?
Any advice is much appreciated.

Use the builtin function filter to remove the empty strings, zip the remaining values with the corresponding keys, and make a dict.
def record_set(record):
    keys = ['Name', 'TTL', 'Type', 'Value']
    values = filter(None, record)
    resource = dict(zip(keys, values))
    record_set_list.append({'RecordSets': resource})
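Zipping against four fixed keys works when every record carries all four fields, but several of the sample lines have no TTL and the MX lines carry an extra preference. One possible way to handle that is to locate the record type first and interpret the fields around it. This is a sketch under assumptions: `parse_record` and the 'Preference' key are invented here, the first non-empty field is taken to be the name, and class fields such as `IN` are not handled.

```python
TYPES = {'A', 'AAAA', 'CAA', 'CNAME', 'MX', 'NAPTR', 'PTR', 'SPF', 'SRV', 'TXT'}

def parse_record(record):
    """Build a resource dict from one split line, or return None if no type is found."""
    fields = [f for f in record if f != '']
    # Skip position 0 so a host that happens to be named 'A' is not read as a type.
    type_pos = next((i for i, f in enumerate(fields) if i > 0 and f in TYPES), None)
    if type_pos is None:
        return None
    before = fields[1:type_pos]      # optional TTL sits between name and type
    after = fields[type_pos + 1:]    # preference (MX only) and value
    resource = {
        'Name': fields[0],
        'TTL': before[0] if before and before[0].isdigit() else None,
        'Type': fields[type_pos],
    }
    if resource['Type'] == 'MX' and len(after) >= 2:
        resource['Preference'] = after[0]
        after = after[1:]
    resource['Value'] = ' '.join(after)
    return resource

mx = parse_record(['#', '', '', 'MX', '', '10', '', 'relay1.netnames.net.'])
txt = parse_record(['#', '3600', '', '', '', 'TXT', '', 'MS=ms91398333'])
a = parse_record(['adfs', '', '', '', 'A', '', '192.123.67.20'])
print(mx)
print(txt)
print(a)
```

A missing TTL is reported as `None` here so the caller can decide whether to substitute a zone default.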
