Making a list of numbers from a textfile

I'm trying to make a program which bubblesorts a list of numbers from a text file. The file has one integer per line. I tried opening the file like so:
data = open(file).readlines()
but if I do this, the line breaks \n are included in the list and my bubblesort orders the numbers as strings, character by character (i.e. 6 comes after 19). Here's an example of what happens when I run my program. I first print out the unsorted list, then print the sorted list.
['13\n', '6\n', '87\n', '19\n', '8\n', '23\n', '8\n', '65']
['13\n', '19\n', '23\n', '6\n', '65', '8\n', '8\n', '87\n']

You need to convert the elements of data into ints, as files are read in as strings. Before you do the conversion, it's probably also wise to remove the \n characters, which you can do with str.strip.
Using a list comprehension:
with open(file, 'r') as f:
    data = [int(line.strip()) for line in f]
I added the with context manager. It's usually good practice to use it when opening files, as it ensures that the file is closed afterwards. Also note that readlines isn't actually needed - iterating over a file provides each line as a string by default.
Actually, strip isn't even needed, as int ignores leading and trailing whitespace. I might keep it just in case though.
int(' 13') # 13
int('13\t') # 13
int('13 \n') # 13
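To see why the conversion matters, here is a minimal sketch that sorts the sample values from the question both as strings and as integers. The io.StringIO object is only a stand-in for the opened file (an assumption, so the example runs without a file on disk), and the built-in sorted is used in place of the asker's bubblesort.

```python
import io

# Stand-in for the text file from the question (assumption: one integer per line).
fake_file = io.StringIO("13\n6\n87\n19\n8\n23\n8\n65")

raw_lines = fake_file.readlines()

# Strings compare character by character, so '19\n' sorts before '6\n'.
as_strings = sorted(raw_lines)

# Integers compare by value.
as_ints = sorted(int(line) for line in raw_lines)

print(as_strings)
print(as_ints)
```

The string result matches the "sorted" list shown in the question, which suggests the bubblesort itself is fine and only the element type needs to change.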

You want a list of integers:
int_data = [int(dat) for dat in data]
Of course, it'd be even better to do it one integer at a time instead of reading the whole file and then converting it to integers:
with open('datafile') as fin:
    int_data = [int(line) for line in fin]

I'd recommend stripping the newline character and int converting. You can do this in one succinct line with a list comprehension, but a for loop would also suffice if the list comprehension syntax is confusing.
data = open(file).readlines()
out = [int(x.strip('\n')) for x in data]
out.sort()

with open(filename) as f:
    data = f.read().splitlines() # give list without endline chars
numbers = map(int, data)
# but be careful, this can throw ValueError on non-number strings
If you expect that not all rows can be converted to integers, write a helper generator:
def safe_ints(iterable):
    for item in iterable:
        try:
            yield int(item)
        except ValueError:
            # skip rows that are not valid integers
            continue
and then use:
numbers = list(safe_ints(data))
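If silently dropping bad rows is too quiet, a variant can report which lines were rejected. This is only a sketch: split_valid_ints is a hypothetical helper name, not something from the answer above, and the sample list is made up.

```python
def split_valid_ints(lines):
    """Return (numbers, rejected); rejected holds (line_number, text) pairs."""
    numbers, rejected = [], []
    for line_number, text in enumerate(lines, start=1):
        try:
            numbers.append(int(text))
        except ValueError:
            # Keep track of what was skipped instead of discarding it silently.
            rejected.append((line_number, text))
    return numbers, rejected

# Made-up sample input, including a non-number and an empty line.
numbers, rejected = split_valid_ints(["13", "6", "abc", "", "19"])
print(numbers)
print(rejected)
```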

Related

What's the fastest way to convert a list into a python list?

Say I have a list like such:
1
2
3
abc
What's the fastest way to convert this into python list syntax?
[1,2,3,"abc"]
The way I currently do it is to use regex in a text editor. I was looking for a way where I can just throw in my list and have it convert immediately.

Read the file, split into lines, convert each line if numeric.
# Read the file
with open("filename.txt") as f:
    text = f.read()
# Split into lines
lines = text.splitlines()
# Convert if numeric
def parse(line):
    try:
        return int(line)
    except ValueError:
        return line
lines = [parse(line) for line in lines]

If it's in a text file like you mentioned in the comments then you can simply read it and split by a new line. For example, your code would be:
with open("FileName.txt", "r") as f:
    L = f.read().split("\n")
print(L)
Where L is the list.
Now, if you're looking for the variables to be of the correct type instead of all being strings, then you can loop through each in the list to check. For example, you could do the following:
for i in range(len(L)):
    if L[i].isnumeric():
        L[i] = int(L[i])
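One caveat with the isnumeric check: it returns False for strings such as "-2" or "3.5", because the minus sign and the decimal point are not numeric characters, so negative integers would be left as strings. A minimal sketch of a more forgiving conversion follows; to_int_if_possible is a hypothetical helper name and the sample values are made up.

```python
values = ["1", "-2", "3.5", "abc"]

# isnumeric() only accepts strings made up entirely of numeric characters.
flags = [v.isnumeric() for v in values]
print(flags)

def to_int_if_possible(text):
    """Hypothetical helper: return int(text) when it parses, else the original string."""
    try:
        return int(text)
    except ValueError:
        return text

converted = [to_int_if_possible(v) for v in values]
print(converted)
```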

Reading in a list of values into a Python variable (or variables)

If you look at the site Code Abbey you'll see a list of simple tasks to be solved in programming. Typically they ask for something simple like finding the minimum of three numbers, and you're given a list of twenty-five to thirty sets of numbers to process.
8258665 -1509184 -1150960
6426035 -8744356 -3699930
-5253083 -3480272 -195609
-9613917 -4137099 3192037
I'm looking for a way to take these three columns of numbers and insert them into three lists that I can then process. I'm a beginner at all this and have to manually paste the numbers in and do lots of deletions and adding of commas to manually make three lists. But there has to be a better way.
Some of the programs I've seen use input().strip or input.slice() (I don't remember exactly) to process the numbers, but I can't see their complete solutions so can't see their complete logic.
You can't save these to an external file and read them in. My thought was to make a text file and read them in that way. No dice -- the way the answers are processed has you run your code in their page, and your answer is automatically put in the answer slot. But that's too much detail.
I want to know if there is a way in Python to have a list of numbers like I have above just pasted into a Python program and then read them into a python data structure. The input() methods don't make sense to me as I thought input() was for keyboard input. And in none of my reading have I seen a way to have data just splatted into a script for processing. Since these are on the very simplest problems, the Code Abbey people must think it something really easy to do and I just don't see it.

This is not a direct answer to your question, but you can append list_i to another list to get exactly what you asked for.
number_of_list = int(input()) # take the number of lists from the first line of input.
for i in range(number_of_list):
    list_i = list(map(int, input().split())) # this will map all values to int and put them into a list.
    print(min(list_i))

Given a string separated with spaces (which is what you showed) we can do this to turn it into a list:
myString = input() #This is where you input your numbers, simply separated by spaces
myArray = myString.split() #This separates your string into a list of strings
myArray = [int(s) for s in myArray] #This turns the list of strings into a list of integers
From there, you'll have a list of integers to work with.

If you want to be able to read in multiline input, you can read each line individually and then convert each line to an array of ints:
def create_list():
    lines = []
    print('Paste the input below or enter it line by line:')
    while True:
        line = input()
        if line:
            lines.append(line)
        else:
            break
    return [list(map(int, line.split(' '))) for line in lines]
Using this you can copy and paste your input and get the output you want:
>>> nums = create_list()
Paste the input below or enter it line by line:
8258665 -1509184 -1150960
6426035 -8744356 -3699930
-5253083 -3480272 -195609
-9613917 -4137099 3192037
>>> nums
[[8258665, -1509184, -1150960], [6426035, -8744356, -3699930], [-5253083, -3480272, -195609], [-9613917, -4137099, 3192037]]
What this line [list(map(int, line.split(' '))) for line in lines] is doing is the following:
It iterates through each line in the lines list (for line in lines)
It splits each line into a list using a space as a delimiter (line.split(' '))
It applies a function to each element of the passed list and yields the results; in this case it converts each element of line.split(' ') from a str to an int by calling int() on it (map(int, line.split(' ')))
Then it converts this returned map iterator to a list, producing the value you want
That line would be roughly equivalent to the following:
for idx, line in enumerate(lines):
    split_line = line.split(' ')
    for idx2, el in enumerate(split_line):
        split_line[idx2] = int(el)
    lines[idx] = split_line
return lines
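Since the question asks whether the numbers can simply be pasted into the script, another option (a sketch, not taken from the answers above) is a triple-quoted string. The variable names raw, rows, first, second and third are illustrative only.

```python
# Data pasted directly into the script (the four sample rows from the question).
raw = """
8258665 -1509184 -1150960
6426035 -8744356 -3699930
-5253083 -3480272 -195609
-9613917 -4137099 3192037
"""

# One list of ints per non-empty line; split() with no argument handles any whitespace.
rows = [[int(tok) for tok in line.split()] for line in raw.splitlines() if line.strip()]

# zip(*rows) regroups the values by column, giving three lists.
first, second, third = (list(col) for col in zip(*rows))

minimums = [min(row) for row in rows]
print(minimums)
print(first)
```

Whether to keep the data by row or by column depends on the task; for "minimum of three numbers" the row form is enough.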

map function in Python

Content of file scores.txt that lists the performance of players at a certain game:
80,55,16,26,37,62,49,13,28,56
43,45,47,63,43,65,10,52,30,18
63,71,69,24,54,29,79,83,38,56
46,42,39,14,47,40,72,43,57,47
61,49,65,31,79,62,9,90,65,44
10,28,16,6,61,72,78,55,54,48
The following program reads the file and stores the scores into a list
f = open('scores.txt','r')
L = []
for line in f:
    L = L + map(float,str.split(line[:-1],','))
print(L)
But it leads to error messages. I was given this code in class, so I'm quite confused as I'm very new to Python.
How can I fix the code?

It appears you've adapted Python 2 code for use in Python 3. Note that map does not return a list in Python 3; it returns a lazy iterator (a map object), which you have to convert to a list explicitly.
Furthermore, I'd recommend using list.extend instead of adding the two together. Why? Addition creates a new list object every time you perform it, which is wasteful in terms of time and space.
numbers = []
for line in f:
    numbers.extend(list(map(float, line.rstrip().split(','))))
print(numbers)
An alternative way of doing this would be:
for line in f:
    numbers.extend([float(x) for x in line.rstrip().split(',')])
Which happens to be slightly more readable. You could also choose to get rid of the outer for loop using a nested list comprehension.
numbers = [float(x) for line in f for x in line.rstrip().split(',')]
Also, forgot to mention this (thanks to chris in the comments), but you really should be using a context manager to handle file I/O.
with open('scores.txt', 'r') as f:
    ...
It's cleaner, because it closes your files automatically when you're done with them.
After seeing your ValueError message, it's clear there's issues with your data (invalid characters, etc). Let's try something a little more aggressive.
numbers = []
with open('scores.txt', 'r') as f:
    for line in f:
        for x in line.strip().split(','):
            try:
                numbers.append(float(x.strip()))
            except ValueError:
                pass
If even that doesn't work, perhaps, something even more aggressive with regex might do it:
import re
numbers = []
with open('scores.txt', 'r') as f:
    for line in f:
        # raw string, so \d and \s reach the regex engine unchanged
        line = re.sub(r'[^\d\s,.+-]', '', line)
        ... # the rest remains the same
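For reference, the original failure and the extend fix can be reproduced in a few lines. This is a sketch under assumptions: io.StringIO stands in for scores.txt and only part of the sample data is used.

```python
import io

# Stand-in for scores.txt (assumption: a shortened version of the sample rows).
f = io.StringIO("80,55,16\n43,45,47\n")

numbers = []
try:
    numbers = numbers + map(float, "80,55,16".split(","))
except TypeError as err:
    # In Python 3, map() returns a lazy iterator, which cannot be added to a list.
    print("TypeError:", err)

for line in f:
    # list.extend accepts any iterable, so the map object can also be passed directly.
    numbers.extend(map(float, line.rstrip().split(",")))

print(numbers)
```

Passing the map object straight to extend is a small variation on the list(map(...)) form above; both leave the same values in numbers.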

I have a list of lists, I want to group by 10

I've got a raw text file with a line-separated list of EAN numbers, which I'm adding to a list (as a string) as follows:
import csv
listofEAN = []
with open('Data', newline='\r\n') as inputfile:
    for row in csv.reader(inputfile):
        listofEAN.append(row)
This creates a "list of lists" (I'm not sure why it doesn't create a single list?) in the format:
[['0075678660924'], ['0093624912613'], ['3299039990322'], ['0190295790394'], ['0075678660627'], ['0075678661150'], ...]
I'm trying to transform the list into a list of lists with 10 EANs each. So running listofEAN[0] would return the first 10 EANs, and so forth.
I'm afraid I'm struggling to do this - I presume I need to use a loop of some kind, but I'm having trouble with creating a loop and combining the loop operations with the list syntax.
Any pointers would be greatly appreciated.
EDIT: Similar to the query here: How do you split a list into evenly sized chunks? but the corrections here to the way I'm importing the list is of particular interest. Thank you!

A row in the CSV file is always a list, even for a file with just one column. You don't really need to use a CSV reader when you have just one value per line; just strip the whitespace from each line to create a flat list, and use the standard universal newline support:
with open('Data') as inputfile:
    # create a list of non-empty lines, with whitespace removed
    stripped = (line.strip() for line in inputfile)
    listofEAN = [line for line in stripped if line]
Now it is trivial to make that into groups of a fixed size:
per_ten = [listofEAN[i:i + 10] for i in range(0, len(listofEAN), 10)]
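A quick way to check the grouping, including the shorter final group, is to run it on made-up data. The 23 zero-padded strings below are placeholders, not real EAN codes, and the names list_of_ean and chunk_size are illustrative.

```python
# Hypothetical data: 23 zero-padded 13-digit strings standing in for EAN codes.
list_of_ean = [f"{n:013d}" for n in range(23)]

chunk_size = 10
# Slice the list at every multiple of chunk_size; the last slice may be shorter.
per_ten = [list_of_ean[i:i + chunk_size] for i in range(0, len(list_of_ean), chunk_size)]

sizes = [len(chunk) for chunk in per_ten]
print(sizes)
print(per_ten[2])
```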

Trying to input data from a txt file in a list, then make a list, then assign values to the lines

allt = []
with open('towers1.txt','r') as f:
    towers = [line.strip('\n') for line in f]
for i in towers:
    allt.append(i.split('\t'))
print allt[0]
Now I need help. I'm inputting this text:
mw91 42.927 -72.84 2.8
yu9x 42.615 -72.58 2.3
HB90 42.382 -72.679 2.4
and when I output I'm getting
['mw91 42.927 -72.84 2.8']
Where in my code, and with what functions, can I define the 1st, 2nd, 3rd and 4th values in this list and in all the ones below it? I'm trying
allt[0][2] or
allt[i][2]
but that doesn't give me -72.84; it's an error, and other times it says list has no attribute split.
Update: maybe I need to use enumerate? I also need to make sure the middle two values I'm inputting can be used as numbers and not strings, because I'm subtracting them.

Are you sure those are tabs? You can specify no argument for split and it automatically splits on whitespace (which means you won't have to strip newlines beforehand either). I copied your sample into a file and got it to work like this:
allt = []
with open('towers1.txt','r') as f:
    for line in f:
        allt.append(line.split())
>>> print(allt[0])
['mw91', '42.927', '-72.84', '2.8']
>>> print(allt[0][1])
42.927
Footnote: if you get rid of your first list comprehension, you're only iterating the file once, which is less wasteful.
Just saw that you want help converting the float values as well. Assuming that line.split() splits up the data correctly, something like the following should probably work:
allt = []
with open('towers1.txt','r') as f:
    for line in f:
        first, *_else = line.split() #Python3
        data = [first]
        float_nums = [float(x) for x in _else]
        data.extend(float_nums)
        allt.append(data)
>>> print(allt[0])
['mw91', 42.927, -72.84, 2.8]
For Python2, substitute the first, *_else = line.split() with the following:
first, _else = line.split()[0], line.split()[1:]
Finally (in response to comments below), if you want a list of a certain set of values, you're going to have to iterate again and this is where list comprehensions can be useful. If you want the [2] index value for each element in allt, you'll have to do something like this:
>>> some_items = [item[2] for item in allt]
>>> some_items
[-72.84, -72.58, -72.679]

[] implies a list.
'' implies a string.
allt = ['mw91 42.927 -72.84 2.8']
allt is a list that contains a string:
allt[0] --> 'mw91 42.927 -72.84 2.8'
allt[0][2] --> '9'
allt[0].split() --> ['mw91', '42.927', '-72.84', '2.8']
allt[0].split()[2] --> '-72.84' #This is still a string.
float(allt[0].split()[2]) --> -72.84 #This is now a float.

I think this should also work
with open('towers.txt', 'r') as f:
    allt = list(map(str.split, f))  # list() so the file is read before it closes (Python 3)
And if you need the values after the first one to be floats...
with open('towers.txt', 'r') as f:
    allt = [line[:1] + list(map(float, line[1:])) for line in map(str.split, f)]
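Since the question mentions subtracting the middle values, here is a small end-to-end sketch using the three sample rows. io.StringIO stands in for towers1.txt (an assumption, so it runs without the file), and the choice of which two values to subtract is only an illustration.

```python
import io

# Stand-in for towers1.txt (the three sample rows from the question).
f = io.StringIO(
    "mw91 42.927 -72.84 2.8\n"
    "yu9x 42.615 -72.58 2.3\n"
    "HB90 42.382 -72.679 2.4\n"
)

allt = []
for line in f:
    # First field stays a string, the remaining fields become floats.
    name, *values = line.split()
    allt.append([name] + [float(v) for v in values])

# Difference between the second column of the first two rows.
diff = allt[0][1] - allt[1][1]
print(allt[0])
print(round(diff, 3))
```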
