I have a nested list that looks like this:
[['Student', 'Exam 1', 'Exam 2', 'Exam 3'],
['Thorny', '100', '90', '80'],
['Mac', '88', '99', '111'],
['Farva', '45', '56', '67'],
['Rabbit', '59', '61', '67'],
['Ursula', '73', '79', '83'],
['Foster', '89', '97', '101']]
I was able to pull out student names using the following:
`def get_students(grades):
    student_names = []
    for i in range(1, len(grades)):
        student_names.append(grades[i][0])
    return student_names`
The output was ['Thorny', 'Mac', 'Farva', 'Rabbit', 'Ursula', 'Foster']
Now I need to pull out the "Exam" headers. I used similar code, but it gives very different results.
`def get_assignments(grades):
    exam_list = []
    for i in range(6, len(grades)):
        exam_list.append(grades[0][1 : ])
    return exam_list`
This returns the output:
`[['Exam 1', 'Exam 2', 'Exam 3']]`
I have been looking here and at GeeksForGeeks to try and figure out what I am doing wrong. It seemed like it should be a simple tweaking of the first code to access a different part of the nested list but I can't seem to get it. Any tips would be appreciated.
EDIT: Apologies, my problem may not have been clear. The desired output is
['Exam 1', 'Exam 2', 'Exam 3']
Thank you!
This is a short solution using list comprehensions:
lst=[['Student', 'Exam 1', 'Exam 2', 'Exam 3'], ['Thorny', '100', '90', '80'], ['Mac', '88', '99', '111'], ['Farva', '45', '56', '67'], ['Rabbit', '59', '61', '67'], ['Ursula', '73', '79', '83'], ['Foster', '89', '97', '101']]
lstStudents=[nlst[0] for nlst in lst]
lstExams=[nlst[1:] for nlst in lst[1:]]
print(lstStudents)
print(lstExams)
The output will be:
['Student', 'Thorny', 'Mac', 'Farva', 'Rabbit', 'Ursula', 'Foster']
[['100', '90', '80'], ['88', '99', '111'], ['45', '56', '67'], ['59', '61', '67'], ['73', '79', '83'], ['89', '97', '101']]
Maybe consider turning your data into a list of dictionaries; then you can use Python's list comprehensions to retrieve whatever you need.
grades = [['Student', 'Exam 1', 'Exam 2', 'Exam 3'], ['Thorny', '100', '90', '80'], ['Mac', '88', '99', '111'], ['Farva', '45', '56', '67'], ['Rabbit', '59', '61', '67'], ['Ursula', '73', '79', '83'], ['Foster', '89', '97', '101']]
headers = grades[0]
grades_list = [dict(zip(headers, student)) for student in grades[1:]]
print(*(grade for grade in grades_list if grade['Student']=='Rabbit'))
# Output {'Student': 'Rabbit', 'Exam 1': '59', 'Exam 2': '61', 'Exam 3': '67'}
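For the output the question actually asks for (['Exam 1', 'Exam 2', 'Exam 3']), no loop is needed. The header row is already a list, so appending its slice to another list is what produced the extra level of nesting. A minimal sketch, using a shortened copy of the question's data:

```python
grades = [['Student', 'Exam 1', 'Exam 2', 'Exam 3'],
          ['Thorny', '100', '90', '80'],
          ['Mac', '88', '99', '111']]

def get_assignments(grades):
    # The header row is grades[0]; slicing off its first cell ('Student')
    # leaves a flat list of exam names.
    return grades[0][1:]

def get_students(grades):
    # Skip the header row, then take the first cell of every remaining row.
    return [row[0] for row in grades[1:]]

print(get_assignments(grades))
print(get_students(grades))
```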
I'm struggling to find a way to loop through pages and scrape data from a table. I've managed to get the data from the first page, but I don't know how to go through each subsequent page and collect its data. I've tried various bits of code but can't get anything to work. The site I'm trying to scrape appends &pageno=2 to the end of the URL and uses next buttons (rather than numbered buttons). Any help would be great.
Current code, which scrapes the first page successfully, is as follows:
import pprint
import requests
from bs4 import BeautifulSoup
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'}
url = 'https://www.revcomps.com/past-entry-lists/?draw_chosen=2693823'
r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.text, 'html.parser')
table = soup.find('table', {'class':'ticket_results'})
data = [td.text for td in table.find_all('td')]
for table in soup.find_all('table', {'class':'ticket_results'}):
    data = [td.text for td in table.find_all('td')]
    pprint.pprint(data)
You can just put your requests in a loop over the page number. A Python f-string can be used to insert the page variable into the URL:
import requests
from bs4 import BeautifulSoup
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'}
for page in range(1, 3):
    print(f"Page {page}")
    url = f'https://www.revcomps.com/past-entry-lists/?draw_chosen=2693823&pageno={page}'
    r = requests.get(url, headers=headers)
    soup = BeautifulSoup(r.text, 'html.parser')
    for table in soup.find_all('table', {'class':'ticket_results'}):
        data = [td.text for td in table.find_all('td')]
        print(data)
This gives output starting:
Page 1
['Order', 'WINNING TICKET', 'Ticket', '2694340', 'Andrew Reynolds', '694', '2699224', 'Martin Lilge', '315', '2703986', 'Ricky Parton', '975']
['Order', 'Customer', 'Ticket', '2704184', 'Philip Stoyles', '001', '2700874', 'Timothy Powell', '002', '2696801', 'Steven Hill', '003', '2696301', 'Trevor Larken', '004', '2696387', 'george malone', '005', '2701735', 'Williams jonathan', '006', '2704193', 'Michael Worthington', '007', '2695573', 'Mike Bates', '008', '2695170', 'Debbie Gent', '009', '2699892', 'Edward Buffett', '010', '2694080', 'David Miller', '011', '2701554', 'Liz Coates', '012', '2694944', 'Amanda Demellweek', '013', '2695128', 'John Crowe', '014', '2698092', 'Jamie Houston', '015', '2703986', 'Ricky Parton', '016', '2700944', 'Tom Chant', '017', '2698687', 'Gary Young', '018', '2696026', 'Tritean Emanuel', '019', '2704117', 'Stephen Melekeowei', '020', '2700379', 'Darren Pearson', '021', '2696357', 'Kane Nicholas', '022', '2704062', 'Jessie Nellany', '023', '2700621', 'Nick Hart', '024', '2704879', 'Chris Maynard', '025', '2703091', 'Nils Omell', '026', '2702854', 'mr stephen elsley', '027', '2698997', 'Mark Skedgel-hill', '028', '2701558', 'Bradley King', '029', '2698372', 'simon miles', '030', '2694701', 'Gillian Chisnall', '031', '2701365', 'Sarah Tingle-kitchen', '032', '2694591', 'Robert Townsend', '033', '2695077', 'Glen Davies', '034', '2695177', 'Wayne Cummings', '035', '2701899', 'Ross Hay', '036', '2703464', 'Shaun Raynsford', '037', '2704149', 'K Oszlanczi', '038', '2703566', 'Daniel Fuller', '039', '2699263', 'Adam Torok', '040', '2700621', 'Nick Hart', '041', '2703279', 'Omar Shawesh', '042', '2699452', 'Mark Widger', '043', '2695848', 'Zoey Longley', '044', '2703704', 'Daniel Hyndman', '045', '2696997', 'Paul Daniel', '046', '2694506', 'Mark Thompson', '047', '2699460', 'Martin Buckingham', '048', '2695186', 'Matt Beavis', '049', '2701503', 'Craig Driscoll', '050', '2699318', 'Alan Parker', '051', '2699729', 'Stephen Minnikin', '052', '2695573', 'Mike Bates', '053', '2698438', 'Andrew Kosinski', '054', '2698679', 'Carly Mason', '055', '2702121', 'Mark Adams', '056', '2698613', 
'Neil Gunn', '057', '2704149', 'K Oszlanczi', '058', '2699109', 'Steve Bowen', '059', '2702108', 'Thomas Martin', '060', '2696482', 'mr stephen elsley', '061', '2696813', 'Nigel Scott', '062', '2701394', 'Chris Brown', '063', '2698459', 'Gordon Bickerton', '064', '2700546', 'jon tribbeck', '065', '2702492', 'Mark Bentley', '066', '2704155', 'Ryan Stephens', '067', '2694831', 'David Godfrey', '068', '2695671', 'Lee Smith', '069', '2695066', 'Kristian Howells', '070', '2694225', 'Simon Costello', '071', '2695186', 'Matt Beavis', '072', '2699947', 'Anthony Abbey', '073', '2701845', 'Paul Quarterman', '074', '2695573', 'Mike Bates', '075', '2701618', 'Adam Kimber', '076', '2704433', 'John Hayes', '077', '2699484', 'Jamie Brookes', '078', '2695587', 'Richard Hurst', '079', '2696301', 'Trevor Larken', '080', '2698200', 'Ewa Krzyszkowska', '081', '2698023', 'Jason Reed', '082', '2702455', 'Simon Harrington', '083', '2694869', 'Mike Bates', '084', '2703644', 'Jason Gow', '085', '2700989', 'A J Freeman', '086', '2696784', 'Adam Timberlake', '087', '2701447', 'Lewis Middleton', '088', '2701236', 'Scot Beall', '089', '2695477', 'Simon Farrow', '090', '2697197', 'Marcus du Preez', '091', '2697115', 'Roderick Evans', '092', '2700621', 'Nick Hart', '093', '2701231', 'Norma Matheson', '094', '2695587', 'Richard Hurst', '095', '2702017', 'Michael Richardson', '096', '2703702', 'Dean Brain', '097', '2699907', 'Lee Murray', '098', '2694583', 'Kasia Krzyzak', '099', '2700048', 'terri palmer', '100', '2699499', 'Simon Hack', '101', '2694206', 'Graeme Allister', '102', '2700158', 'Melissa Dedman', '103', '2699262', 'Romans Zolovs', '104', '2694125', 'Jonathan Byrne', '105', '2702812', 'Nicola McLaughlin', '106', '2704152', 'Howard Pearson', '107', '2696432', 'Zac Sirrell', '108', '2696474', "Luke Davies-O'Grady", '109', '2699367', 'Charles Mulinder', '110', '2701365', 'Sarah Tingle-kitchen', '111', '2703659', 'Andrew Fenton', '112', '2695167', 'Roy heer', '113', '2698200', 'Ewa 
Krzyszkowska', '114', '2697494', 'Steve Nightingale', '115', '2698916', 'Dale Hodges', '116', '2695502', 'G G hearn', '117', '2699776', 'Antiny Swift', '118', '2704778', 'MARK SHAKESBY', '119', '2698200', 'Ewa Krzyszkowska', '120', '2694027', 'Paul Otway', '121', '2700621', 'Nick Hart', '122', '2695847', 'Gavin Holmes', '123', '2699915', 'Torquil Stupart', '124', '2703807', 'Andrew Telfer', '125', '2699931', 'Lloyd Reed', '126', '2700991', 'Clare Brown', '127', '2699914', 'Luke Twivey', '128', '2699308', 'MIKE SPENCER', '129', '2698885', 'dave wills', '130', '2695933', 'Regan Thacker', '131', '2696301', 'Trevor Larken', '132', '2698960', 'Adam Hamada', '133', '2699566', 'Action Fighter', '134', '2703704', 'Daniel Hyndman', '135', '2702652', 'Sarah Brooke', '136', '2694305', 'Scott Knowles', '137', '2700635', 'Jasen Swann', '138', '2696301', 'Trevor Larken', '139', '2694831', 'David Godfrey', '140', '2694174', 'Silviu Dan', '141', '2704446', 'Alan Ball', '142', '2699026', 'Adam Gillett', '143', '2699916', 'Dillon Graham', '144', '2698613', 'Neil Gunn', '145', '2697494', 'Steve Nightingale', '146', '2696380', 'Danny James Pearson', '147', '2700010', 'Peter Ede-Morley', '148', '2704731', 'Simon Wise', '149', '2694056', 'Joel Binns', '150']
Page 2
['Order', 'WINNING TICKET', 'Ticket', '2694340', 'Andrew Reynolds', '694', '2699224', 'Martin Lilge', '315', '2703986', 'Ricky Parton', '975']
['Order', 'Customer', 'Ticket', '2694305', 'Scott Knowles', '151', '2694171', 'Mariusz Karczewski', '152', '2704983', 'Jonathan Hill', '153', '2696473', 'claudia stefanoaia', '154', '2694111', 'David Robinson', '155', '2696301', 'Trevor Larken', '156', '2696270', 'Stuart Bowater', '157', '2699819', 'Ben Funnell', '158', '2703237', 'Mark Lund', '159', '2702804', 'Iain Wallace', '160', '2694206', 'Graeme Allister', '161', '2703060', 'Mark Maskell', '162', '2699308', 'MIKE SPENCER', '163', '2700589', 'Aidan McGilligan', '164', '2698428', 'Benjamin Melsome', '165', '2701686', 'Mariusz Karczewski', '166', '2694121', 'Joseph Woodard', '167', '2700989', 'A J Freeman', '168', '2699109', 'Steve Bowen', '169', '2704382', 'Keith Groundwater', '170', '2700144', 'Carl Marshall', '171', '2698017', 'Geoff Hall', '172', '2704941', 'Graham Riley', '173', '2697494', 'Steve Nightingale', '174', '2697796', 'Gary Leech', '175', '2699229', 'Karl Anson', '176', '2702100', 'Gary Plaskett', '177', '2694826', 'Rayminther Singh', '178', '2702394', 'Rebecca Smith', '179', '2694149', 'Martin Yates', '180', '2700860', 'Katie West', '181', '2695412', 'Daniel Payne', '182', '2695412', 'Daniel Payne', '183', '2699052', 'Ryan Stephens', '184', '2699136', 'Kevin Oliver', '185', '2696124', 'Lee Beesley', '186', '2695997', 'Matthew Prowse', '187', '2704493', 'Mrs P E Cranwell-Hayes', '188', '2701735', 'Williams jonathan', '189', '2699013', 'Charley Isaacs', '190', '2696452', 'Caroline Calver', '191', '2703014', 'Ryan Stephens', '192', '2699776', 'Antiny Swift', '193', '2694206', 'Graeme Allister', '194', '2702649', 'Jason Mcknight', '195', '2701415', 'Daniella Murphy', '196', '2694225', 'Simon Costello', '197', '2702685', 'Chris Firth', '198', '2701445', 'Ashlyn Adams', '199', '2694305', 'Scott Knowles', '200', '2694305', 'Scott Knowles', '201', '2695587', 'Richard Hurst', '202', '2694992', 'Dave Tomley', '203', '2694296', 'Rob Thornton', '204', '2699275', 'barry venn', '205', '2701234', 'Ben 
Cassidy', '206', '2699460', 'Martin Buckingham', '207', '2697494', 'Steve Nightingale', '208', '2694206', 'Graeme Allister', '209', '2697361', 'Nathan Bambury', '210', '2703464', 'Shaun Raynsford', '211', '2694471', 'lewis ballantyne', '212', '2694831', 'David Godfrey', '213', '2699627', 'Ross Fulton', '214', '2700449', 'Josh Hill', '215', '2695609', 'Will Badman', '216', '2698885', 'dave wills', '217', '2700989', 'A J Freeman', '218', '2694953', 'Mark Thomas', '219', '2700184', 'steven bennetts', '220', '2699109', 'Steve Bowen', '221', '2694305', 'Scott Knowles', '222', '2701572', 'Ethne Gambrill-Jarman', '223', '2694944', 'Amanda Demellweek', '224', '2698549', 'Mr R Bennett', '225', '2704463', 'Chris Beckett', '226', '2694608', 'Ryan Stephens', '227', '2700637', 'Andrew Mckimm', '228', '2694346', 'Will Stanyard', '229', '2699109', 'Steve Bowen', '230', '2701735', 'Williams jonathan', '231', '2701554', 'Liz Coates', '232', '2694818', 'Matt Dawe', '233', '2694372', 'Richard Lindsay', '234', '2699148', 'Grant Sivewright', '235', '2704556', 'Dale Warren', '236', '2694080', 'David Miller', '237', '2701266', 'Russell Miller', '238', '2694171', 'Mariusz Karczewski', '239', '2701647', 'Peter Renshaw', '240', '2699252', 'Nicola Haigh', '241', '2695609', 'Will Badman', '242', '2702654', 'I Petkuns', '243', '2698634', 'Gay Pieters', '244', '2701286', 'timothy cozens', '245', '2697830', 'Kevin Teasdale', '246', '2695046', 'Dan Christian Buentipo Palos', '247', '2694304', 'Gary Faulkner', '248', '2702737', 'Michael Welch', '249', '2704123', 'Paul Mcdermott', '250', '2696161', 'Jono Carter', '251', '2695871', 'Cameron Davidson', '252', '2704384', 'Lauren Redhead', '253', '2694414', 'Elaine Hills', '254', '2700798', 'Mathew Pierce', '255', '2704839', 'Danny Irvine', '256', '2704790', 'Gary Perry', '257', '2694056', 'Joel Binns', '258', '2694346', 'Will Stanyard', '259', '2700243', 'Scott Gourlay', '260', '2694206', 'Graeme Allister', '261', '2699263', 'Adam Torok', '262', 
'2695077', 'Glen Davies', '263', '2699109', 'Steve Bowen', '264', '2695149', 'Martin Wheeler', '265', '2697877', 'Rob Poundall', '266', '2697906', 'Mike Finn', '267', '2698068', 'Miguel Pacheco', '268', '2701176', 'alex mitchell', '269', '2700998', 'Antonio Domingo', '270', '2697049', 'James Skinner', '271', '2701415', 'Daniella Murphy', '272', '2698886', 'Julie Neill', '273', '2696260', 'John Doody', '274', '2696301', 'Trevor Larken', '275', '2694831', 'David Godfrey', '276', '2703702', 'Dean Brain', '277', '2702017', 'Michael Richardson', '278', '2697361', 'Nathan Bambury', '279', '2699938', 'Charlotte Jukes', '280', '2695350', 'Paul Fieldhouse', '281', '2702350', 'Barry Little', '282', '2694849', 'Matthew Riddell', '283', '2695592', 'Robert Harvey', '284', '2703363', 'Mason BURKINSHAW', '285', '2698579', 'Louise Davies', '286', '2696694', 'Stewart Smith', '287', '2704522', 'Adam Gillett', '288', '2701236', 'Scot Beall', '289', '2696784', 'Adam Timberlake', '290', '2704628', 'Lee Heginbotham', '291', '2699389', 'Lucy Donovan', '292', '2702673', 'James Jackson', '293', '2700232', 'raysean wharton', '294', '2699109', 'Steve Bowen', '295', '2699451', 'Winai Mays', '296', '2702364', 'Graham Williams', '297', '2695368', 'Daniel Moore', '298', '2703678', 'Ian Smith', '299', '2694027', 'Paul Otway', '300']
You should look into what happens when the end of the table is reached and test for that.
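One way to test for the end is sketched below. How the site behaves past the last page is not known here, so the sketch assumes it either returns no rows or repeats the previous page. The names `iter_pages`, `fetch_rows` and `fake_fetch` are hypothetical; in real use `fetch_rows(page)` would wrap the `requests.get` and BeautifulSoup calls above and return the list of cell texts for that page.

```python
from typing import Callable, Iterator, List

def iter_pages(fetch_rows: Callable[[int], List[str]],
               max_pages: int = 1000) -> Iterator[List[str]]:
    """Yield each page's rows until a page is empty or repeats the previous one."""
    previous = None
    for page in range(1, max_pages + 1):  # max_pages is a safety cap (assumption)
        rows = fetch_rows(page)
        if not rows or rows == previous:
            break  # assumed end-of-data signals; check what the real site does
        yield rows
        previous = rows

# Stand-in for the real site so the sketch runs without network access.
fake_site = {1: ['a', 'b'], 2: ['c', 'd'], 3: ['e']}

def fake_fetch(page: int) -> List[str]:
    return fake_site.get(page, [])

all_rows = [cell for rows in iter_pages(fake_fetch) for cell in rows]
print(all_rows)
```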
a=['0', '0.05', '0.1', '0.15', '0.2', '0.25', '0.3', '0.35', '0.4', '0.45', '0.5', '0.55', '0.6', '0.65', '0.7', '0.75', '0.8', '0.85', '0.9', '0.95', '1', '1.05', '1.1', '1.15', '1.2', '1.25', '1.3', '1.35', '1.4', '1.45', '1.5', '1.55', '1.6', '1.65', '1.7', '1.75', '1.8', '1.85', '1.9', '1.95', '10', '10.05', '10.1', '10.15', '10.2', '10.25', '10.3', '10.35', '10.4', '10.45', '10.5', '10.55', '10.6', '10.65', '10.7', '10.75', '10.8', '10.85', '10.9', '10.95', '11', '11.05', '11.1', '11.15', '11.2', '11.25', '11.3', '11.35', '11.4', '11.45', '11.5', '11.55', '11.6', '11.65', '11.7', '11.75', '11.8', '11.85', '11.9', '11.95', '12', '12.05', '12.1', '12.15', '12.2', '12.25', '12.3', '12.35', '12.4', '12.45', '12.5', '12.55', '12.6', '12.65', '12.7', '12.75', '12.8', '12.85', '12.9', '12.95', '13', '13.05', '13.1', '13.15', '13.2', '13.25', '13.3', '13.35', '13.4', '13.45', '13.5', '13.55', '13.6', '13.65', '13.7', '13.75', '13.8', '13.85', '13.9', '13.95', '14', '14.05', '14.1', '14.15', '14.2', '14.25', '14.3', '14.35', '14.4', '14.45', '14.5', '14.55', '14.6', '14.65', '14.7', '14.75', '14.8', '14.85', '14.9', '14.95', '15', '15.05', '15.1', '15.15', '15.2', '15.25', '15.3', '15.35', '15.4', '15.45', '15.5', '15.55', '15.6', '15.65', '15.7', '15.75', '15.8', '15.85', '15.9', '15.95', '16', '16.05', '16.1', '16.15', '16.2', '16.25', '16.3', '16.35', '16.4', '16.45', '16.5', '16.55', '16.6', '16.65', '16.7', '16.75', '16.8', '16.85', '16.9', '16.95', '17', '17.05', '17.1', '17.15', '17.2', '17.25', '17.3', '17.35', '17.4', '17.45', '17.5', '17.55', '17.6', '17.65', '17.7', '17.75', '17.8', '17.85', '17.9', '17.95', '18', '18.05', '18.1', '18.15', '18.2', '18.25', '18.3', '18.35', '18.4', '18.45', '18.5', '18.55', '18.6', '18.65', '18.7', '18.75', '18.8', '18.85', '18.9', '18.95', '19', '19.05', '19.1', '19.15', '19.2', '19.25', '19.3', '19.35', '19.4', '19.45', '19.5', '19.55', '19.6', '19.65', '19.7', '19.75', '19.8', '19.85', '19.9', '19.95', '2', '2.05', '2.1', 
'2.15', '2.2', '2.25', '2.3', '2.35', '2.4', '2.45', '2.5', '2.55', '2.6', '2.65', '2.7', '2.75', '2.8', '2.85', '2.9', '2.95', '20', '20.05', '20.1', '20.15', '20.2', '20.25', '20.3', '20.35', '20.4', '20.45', '20.5', '20.55', '20.6', '20.65', '20.7', '20.75', '20.8', '20.85', '20.9', '20.95', '21', '21.05', '21.1', '21.15', '21.2', '21.25', '21.3', '21.35', '21.4', '21.45', '21.5', '21.55', '21.6', '21.65', '21.7', '21.75', '21.8', '21.85', '21.9', '21.95', '22', '22.05', '22.1', '22.15', '22.2', '22.25', '22.3', '22.35', '22.4', '22.45', '22.5', '22.55', '22.6', '22.65', '22.7', '22.75', '22.8', '22.85', '22.9', '22.95', '23', '23.05', '23.1', '23.15', '23.2', '23.25', '23.3', '23.35', '23.4', '23.45', '23.5', '23.55', '23.6', '23.65', '23.7', '23.75', '23.8', '23.85', '23.9', '23.95', '24', '24.05', '24.1', '24.15', '24.2', '24.25', '24.3', '24.35', '24.4', '24.45', '24.5', '24.55', '24.6', '24.65', '24.7', '24.75', '24.8', '24.85', '24.9', '24.95', '25', '25.05', '25.1', '25.15', '25.2', '25.25', '25.3', '25.35', '25.4', '25.45', '25.5']
a.sort()
print (a)
You need to convert your list to a list of floats rather than strings:
a = [*map(float, a)] # [*map(float, a)] is equivalent to list(map(float, a))
a.sort()
print(a)
That should work.
If you want to convert it back to str, you can do:
a = [*map(str,a)]
Or, if you don't want trailing zeros:
a = [*map(lambda c: str(round(c,2)).rstrip('0').rstrip('.'), a)]
As pointed out by ShadowRanger, if you want to keep the elements as str without ever converting them to float, you can do
a.sort(key = float)
Or
a = sorted(a, key=float)
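To see why the original sort looked wrong, here is a small sketch on a hypothetical five-element sample (not the full list from the question). Strings are compared character by character, so '10' sorts before '2'; passing key=float compares the numeric values while leaving the elements as strings.

```python
a = ['10', '2', '1.5', '0.05', '25.5']

lexicographic = sorted(a)         # character-by-character string comparison
numeric = sorted(a, key=float)    # compare by numeric value, keep str elements

print(lexicographic)
print(numeric)
```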
I am new to Python and currently learning. I have a text file with a variable number of spaces between the words on each line:
I am trying to read it as follows:
import re
results = []
with open ("../../103.Immune_gene_families/Immune_genes/Human/human_immunegene.hits") as file:
    for line in file:
        if not line.startswith("#"):
            line = re.sub("\s\s+" , " ", line)
            #print(line)
            ens_id = line.split(" ")[1]
            print(ens_id)
But I got the following error:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-3-469f5598d359> in <module>
6 line = re.sub("\s\s+" , " ", line)
7 #print(line)
----> 8 ens_id = line.split(" ")[1]
9 print(ens_id)
10
IndexError: list index out of range
Example lines I get with
print(line)
['ENSG00000128016', '115', '138', '107', '147', 'TF106503', '9', '32', '5.9', '8.3', '0', 'No_clan', '']
['ENSG00000128016', '135', '169', '130', '172', 'TF317698', '454', '488', '18.0', '0.00073', '0', 'No_clan', '']
['ENSG00000128016', '137', '175', '134', '196', 'TF318914', '95', '132', '21.9', '8e-05', '0', 'No_clan', '']
['ENSG00000128016', '137', '167', '130', '173', 'TF326635', '1096', '1127', '5.7', '3.3', '0', 'No_clan', '']
['ENSG00000128016', '138', '170', '133', '173', 'TF329017', '881', '912', '5.3', '4.3', '0', 'No_clan', '']
['ENSG00000128016', '139', '166', '129', '173', 'TF105541', '764', '791', '9.3', '0.38', '0', 'No_clan', '']
['ENSG00000128016', '139', '166', '132', '172', 'TF105970', '278', '305', '8.4', '0.6', '0', 'No_clan', '']
['ENSG00000128016', '140', '170', '131', '174', 'TF314946', '110', '140', '4.5', '6.3', '0', 'No_clan', '']
['ENSG00000128016', '142', '167', '134', '184', 'TF329287', '9', '33', '6.8', '2.3', '0', 'No_clan', '']
Any help in this regard would be much appreciated.
Thank you,
AK
Welcome to SO!
If you run
string = 'abc'
print(string.split(' '))
you will see that the result is
['abc']
If you tried string.split(' ')[1], you would get an IndexError.
So what is happening is that, somewhere, you likely don't have the character that you are splitting on.
You get the IndexError because there are fewer than two elements in line.split(" "),
which means the line contains no space at all (a blank line, for example). Try line.split(" ")[0] instead:
import re
results = []
with open("../../103.Immune_gene_families/Immune_genes/Human/human_immunegene.hits") as file:
    for line in file:
        if not line.startswith("#"):
            # Raw string so that \s is passed to the regex engine unchanged
            line = re.sub(r"\s\s+", " ", line)
            #print(line)
            ens_id = line.split(" ")[0]
            print(ens_id)
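An alternative that avoids the regex altogether is str.split() with no argument, which splits on any run of whitespace and never produces empty strings. The sketch below assumes the ID is the first field, as in the example lines, and uses a hypothetical in-memory sample in place of the real .hits file:

```python
import io

# Hypothetical stand-in for the .hits file: a comment line, a blank line,
# and two data rows separated by runs of spaces of varying length.
sample = io.StringIO(
    "# header comment\n"
    "\n"
    "ENSG00000128016   115  138    107 147 TF106503\n"
    "ENSG00000128016 135    169  130   172 TF317698\n"
)

ens_ids = []
for line in sample:
    fields = line.split()  # splits on any whitespace run, drops the newline
    if not fields or fields[0].startswith("#"):
        continue  # skip blank lines and comment lines
    ens_ids.append(fields[0])

print(ens_ids)
```

Checking `if not fields` before indexing is what guards against the IndexError on blank lines.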
#!/usr/bin/env python3.7
import subprocess
import re
import os
def main():
    output=subprocess.check_output(["ps","aux"])
    output=output.decode()
    print(output)
if __name__=="__main__":
    main()
I am trying to extract all PID values and put them in a separate list, but I am unable to extract them.
To extract only the PID numbers, change the ps command to use a user-defined output format
(-o format) that limits the output to the pid field; --no-headers drops the header line.
import subprocess
import os
def main():
    output = subprocess.check_output(["ps", "ax", "-o", "pid", "--no-headers"])
    pids = output.decode().split()
    print(pids)
if __name__=="__main__":
    main()
Sample output:
['1', '2', '3', '4', '6', '8', '9', '10', '11', '12', '13', '14', '16', '17',
'18', '19', '20', '21', '23', '24', '25', '26', '27', '28', '30', '31', '32',
'33', '34', '35', '37', '38', '39', '40', '41', '42', '44', '45', '46', '47',
'48', '49', '51', '52', '53', '54', '55', '56', '58', '59', '60', '61', '62',
'63', '65', '66', '67', '68', '69', '70', '72', '73', '74', '75', '76', '77',
'79', '80', '81', '82', '83', '84', '86', '87', '88', '89', '90', '91', '93',
'94', '95', '96', '97', '100', '101', '102', '103', '104', '105', '193', '194',
'195', '199', '200', '202', '205', '206', '209', '210', '211', '212', '213',
'214', '220', '231', '248', '287', '288', '289', '290', '291', '296', '297',
'300', '307', '314', '315', '321', '324', '326', '328', '341', '344', '347',
'348', '357', '361', '362', '363', '366', '432', '483', '488', '494', '516',
'517', '518', '519', '520', '521', '522', '523', '524', '525', '526', '527',
'528', '529', '604', '620', '621', '624', '625', '627', '636', '637', '650',
'651', '743', '744', '752', '753', '770', '771', '785', '786', '791', '792',
'793', '794', '795', '796', '797', '798', '829', '838', '848', '853', '854',
'855', '856', '857', '858', '859', '860', '865', '896', '900', '901', '911',
'912', '921', '936', '937', '940', '944', '960', '964', '968', '970', '975',
'984', '989', '991', '995', '999', '1001', '1016', '1025', '1030', '1033',
'1034', '1036', '1038', '1050', '1059', '1067', '1071', '1078', '1095', '1098',
'1104', '1110', '1112', '1117', '1122', '1131', '1132', '1152', '1157', '1163',
'1169', '1175', '1181', '1191', '1201', '1204', '1210', '1218', '1225', '1250',
'1258', '1261', '1288', '1289', '1290', '1291', '1292', '1293', '1294', '1295',
'1296', '1297', '1298', '1300', '1327', '1334', '1339', '1346', '1395', '1436',
'1444', '1469', '1682', '1687', '1689', '1701', '1715', '1727', '1751', '1771',
'1797', '1837', '1900', '1902', '1992', '2025', '2075', '2307', '2492', '2801',
'2842', '2911', '3404', '3870', '3871', '3874', '4086', '4195', '5217', '5249',
'5745', '5762', '5773', '5803', '5808', '5809', '5812', '5813', '5816', '5836',
'5841', '6008', '6073', '6087', '6104', '6605', '7934', '8127', '8663',
'10274', '10862', '12317', '12428', '12605', '12622', '12650', '12676',
'12677', '12756', '12904', '13242', '13609', '14722', '14812', '15367',
'15409', '15522', '15536', '15839', '15859', '16087', '16152', '16303',
'16386', '16387']
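If the PIDs are needed as integers, or the script has to run where ps lacks --no-headers (a GNU procps option), a variation is sketched below. It relies on the POSIX convention that an empty column header (pid=) suppresses the header line. The helper names `parse_pids` and `list_pids` are hypothetical, and the parsing is demonstrated on a made-up sample string rather than live ps output.

```python
import subprocess
from typing import List

def parse_pids(ps_output: str) -> List[int]:
    # One PID per line, possibly padded with spaces; split() handles both.
    return [int(token) for token in ps_output.split()]

def list_pids() -> List[int]:
    # -A selects all processes; "pid=" prints the pid column with no header.
    output = subprocess.check_output(["ps", "-A", "-o", "pid="])
    return parse_pids(output.decode())

# Made-up sample in the shape ps prints, to show the parsing step on its own.
sample = "    1\n    2\n  193\n16387\n"
print(parse_pids(sample))
```

Calling `list_pids()` on a machine with ps installed returns the live process IDs as a list of ints.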