How do I print certain parts of a list? - python

Here is my code:
def option_A():
    print("Pick a Fixture!")
    fixture_choice = int(input("Enter: "))
    file = open("firesideFixtures.txt", "r")
    fixture_number = file.readlines(fixture_choice)
    fixture = [linecache.getline("firesideFixtures.txt", fixture_choice)]
    print(fixture)
    file.close()
The first line from the file I am using is:
1,02/09/15,18:00,RNGesus,Ingsoc,Y,Ingsoc
The expected result is:
1, 02/09/15, RNGesus, Ingsoc, Y, Ingsoc
The result I get:
['1,02/09/15,18:00,RNGesus,Ingsoc,Y,Ingsoc\n']
How can I do this?

Print the only element of your list by indexing into it:
print(fixture[0])
Output:
1,02/09/15,18:00,RNGesus,Ingsoc,Y,Ingsoc
Or, even better, don't create a list in the first place (note the missing []):
fixture = linecache.getline("firesideFixtures.txt", fixture_choice)
How can I remove the "18:00" part from the output because all I need is "1,02/09/15, RNGesus, Ingsoc, Y, Ingsoc" (from comment)
Now, remove the time:
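fixture = linecache.getline("firesideFixtures.txt", fixture_choice)
# Strip the trailing newline, then split the line into its comma-separated fields
parts = fixture.strip().split(',')
# Keep everything except the third field (the time)
res = ','.join(parts[:2] + parts[3:])
print(res)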
Output:
1,02/09/15,RNGesus,Ingsoc,Y,Ingsoc
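The same idea can be wrapped in a small helper. This is only a sketch: drop_field is a hypothetical name that does not appear in the answer above, and it joins the remaining fields with ", " on the assumption that the spacing in the expected result is wanted.

```python
def drop_field(line, index, sep=","):
    """Return the line with the field at `index` removed (hypothetical helper)."""
    parts = line.strip().split(sep)  # strip() removes the trailing newline
    del parts[index]                 # index 2 is the time in the sample line
    return ", ".join(parts)

line = "1,02/09/15,18:00,RNGesus,Ingsoc,Y,Ingsoc\n"
print(drop_field(line, 2))  # 1, 02/09/15, RNGesus, Ingsoc, Y, Ingsoc
```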

Related

How to separate different input formats from the same text file with Python

I'm new to programming and Python, and I'm looking for a way to distinguish between two input formats in the same input text file. For example, let's say I have an input file like this, where values are comma-separated:
5
Washington,A,10
New York,B,20
Seattle,C,30
Boston,B,20
Atlanta,D,50
2
New York,5
Boston,10
The format is N followed by N lines of Data1, then M followed by M lines of Data2. I tried opening the file, reading it line by line and storing it in a single list, but I'm not sure how to produce two lists for Data1 and Data2, such that I would get:
Data1 = ["Washington,A,10", "New York,B,20", "Seattle,C,30", "Boston,B,20", "Atlanta,D,50"]
Data2 = ["New York,5", "Boston,10"]
My initial idea was to iterate through the list until I found an integer i, remove the integer from the list and continue for the next i iterations all while storing the subsequent values in a separate list, until I found the next integer and then repeat. However, this would destroy my initial list. Is there a better way to separate the two data formats in different lists?
You could use itertools.islice and a list comprehension:
from itertools import islice
string = """
5
Washington,A,10
New York,B,20
Seattle,C,30
Boston,B,20
Atlanta,D,50
2
New York,5
Boston,10
"""
result = [[x for x in islice(parts, idx + 1, idx + 1 + int(line))]
          for parts in [string.split("\n")]
          for idx, line in enumerate(parts)
          if line.isdigit()]
print(result)
This yields
[['Washington,A,10', 'New York,B,20', 'Seattle,C,30', 'Boston,B,20', 'Atlanta,D,50'], ['New York,5', 'Boston,10']]
For a file, you need to change it to:
with open("testfile.txt", "r") as f:
    result = [[x for x in islice(parts, idx + 1, idx + 1 + int(line))]
              for parts in [f.read().split("\n")]
              for idx, line in enumerate(parts)
              if line.isdigit()]
print(result)
You're definitely on the right track.
If you want to preserve the original list here, you don't actually have to remove integer i; you can just go on to the next item.
Code:
originalData = []
formattedData = []
with open("data.txt", "r") as f:
    f = list(f)
    originalData = f
    i = 0
    while i < len(f): # Iterate through every line
        try:
            n = int(f[i]) # See if line can be cast to an integer
            originalData[i] = n # Change string to int in original
            formattedData.append([])
            for j in range(n):
                i += 1
                item = f[i].replace('\n', '')
                originalData[i] = item # Remove newline char in original
                formattedData[-1].append(item)
        except ValueError:
            print("File has incorrect format")
        i += 1
print(originalData)
print(formattedData)
The following code produces a list, results, which is equal to [Data1, Data2].
The code assumes that each count matches the number of entries that actually follow it. That means it will not work for a file like this one, where the count is 2 but three entries follow:
2
New York,5
Boston,10
Seattle,30
The code:
# get the data from the text file
with open('filename.txt', 'r') as file:
    lines = file.read().splitlines()
results = []
index = 0
while index < len(lines):
    # Find the start and end values.
    start = index + 1
    end = start + int(lines[index])
    # Everything from the start up to and excluding the end index gets added
    results.append(lines[start:end])
    # Update the index
    index = end
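If a mismatched count should be reported rather than silently accepted, the same walk can be done with an iterator. This is a sketch only: read_blocks is a hypothetical helper name, and the input is assumed to be a list of lines that starts with a count.

```python
from itertools import islice

def read_blocks(lines):
    """Split count-prefixed lines into blocks (hypothetical helper)."""
    it = iter(lines)
    blocks = []
    for header in it:
        # int() raises ValueError if a data line sits where a count should be
        count = int(header)
        # Take the next `count` lines from the same iterator
        block = [line.rstrip("\n") for line in islice(it, count)]
        if len(block) != count:
            raise ValueError("expected %d entries, found %d" % (count, len(block)))
        blocks.append(block)
    return blocks

sample = ["5", "Washington,A,10", "New York,B,20", "Seattle,C,30",
          "Boston,B,20", "Atlanta,D,50", "2", "New York,5", "Boston,10"]
data1, data2 = read_blocks(sample)
print(data1)
print(data2)
```

With the problem file shown earlier (a count of 2 followed by three entries), the third entry is read as a count and int() raises ValueError, so the mismatch is at least visible.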

Sorting algorithm help in python

I've been playing around with a program that takes in information from two files and then writes it out to a single file in sorted order.
What I did was store each line of the file as an element in a list. I created another function that splits each element into a 2D array, where I can easily access the name variables. From there I want to create a nested for loop that, as it iterates, checks for the highest value in the array, removes that value from the list and appends it to a new list, until there is a sorted list.
I think I am about 90% of the way there, but I am having trouble wrapping my head around the logic of sorting algorithms. The problem seems to keep getting more complex, and I keep wanting to use pointers. If someone could shed some light on the subject I would greatly appreciate it.
import os
from http.cookiejar import DAYS
from macpath import split
# This program reads a given input file and finds its longest line.
class Employee:
    def __init__(self, EmployeeID, name, wage, days):
        self.EmployeeID = EmployeeID
        self.name = name
        self.wage = wage
        self.days = days
def Extraction(file,file2):
    employList = []
    while True:
        line1 = file.readline().strip()
        line2 = file2.readline().strip()
        #print(type(line1))
        employList.append(line1)
        #print(line1)
        employList.append(line2)
        #print(line2)
        if line1 == '' or line2 == '':
            break
    return employList
def Sort(mylist):
    splitlist = []
    sortedlist = []
    print(len(mylist))
    for items in range(len(mylist)):
        #print(mylist[items].split())
        splitlist.append(mylist[items].split())
    print(splitlist)
    #print(splitlist[1][1])
    #print(splitlist[1][2])
    highest = "z"
    print(highest)
    sortingLength = len(splitlist)
    for i in range(10):
        for items in range(len(splitlist)-2):
            if highest > splitlist[items][2]:
                istrue = highest < splitlist[items][2]
                highest = splitlist[items][1]
                print(items)
                print(istrue)
                print('marker')
                print(splitlist[items][2])
            if items == (len(splitlist)-2):
                print("End of list",splitlist[items][2])
    print(highest)
    print(splitlist.index(highest))
    print(splitlist[len(splitlist)-1][2])
    print(sortingLength)
fPath = 'C:/Temp'
fileName = 'payroll1.txt'
fullFileName = os.path.join(fPath,fileName)
fileName2 = 'payroll2.txt'
fullFileName2 = os.path.join(fPath,fileName2)
f = open(fullFileName,'r')
f2 = open(fullFileName2, 'r')
employeeList = Extraction(f,f2)#pulling out each line in the file and placing into a list
Sort(employeeList)
ReportName= "List of Employees:"
marker = '-'* len(ReportName)
print (ReportName + ' \n' + marker)
total = 0
f.close()
I am having trouble with the next step: once I have the highest value, appending it to sortedlist, removing it from splitlist, and re-running the code.
Using the built-in sorted function is much easier, per Joran's suggestion. I've edited your reading function so that it builds two lists of tuples, each holding the line number, the length of the line, and a marker for the source file. sorted returns a new list ordered by the key (line length), in descending order because of reverse=True:
import os
from operator import itemgetter
class Employee:
    def __init__(self, EmployeeID, name, wage, days):
        self.EmployeeID = EmployeeID
        self.name = name
        self.wage = wage
        self.days = days
def Extraction(file,file2):
    employList = []
    mylines = [(i, len(l.strip()), 'file1') for i,l in enumerate(file.readlines())]
    mylines2 = [(i, len(l.strip()), 'file2') for i,l in enumerate(file2.readlines())]
    employList = [*mylines, *mylines2]
    return employList
fPath = 'C:/Temp'
fileName = 'payroll1.txt'
fullFileName = os.path.join(fPath,fileName)
fileName2 = 'payroll2.txt'
fullFileName2 = os.path.join(fPath,fileName2)
f = open(fullFileName,'r')
f2 = open(fullFileName2, 'r')
employeeList = Extraction(f,f2)#pulling out each line in the file and placing the line_number and length into a list
f.close()
f2.close()
# Itemgetter will sort on the second element of the tuple, len(line)
# and reverse will put it in descending order
ReportName = sorted(employeeList, key=itemgetter(1), reverse=True)
EDIT: I've added markers to the tuples so that you can keep track of which lines came from which file. It might be a bit confusing without them.
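If the goal is to order the records by a field such as the name rather than by line length, sorted can take that field as its key. A minimal sketch, assuming whitespace-separated records with the name in the second column; the sample lines are invented for illustration and do not come from the payroll files.

```python
# Invented sample records in the form "ID name wage days" (the real layout is an assumption)
lines_file1 = ["101 Smith 15.50 5", "102 Adams 20.00 4"]
lines_file2 = ["201 Jones 18.25 3"]

# Split every line from both files into its columns
records = [line.split() for line in lines_file1 + lines_file2]

# Sort on the second column (the name); sorted returns a new list
by_name = sorted(records, key=lambda rec: rec[1])
for rec in by_name:
    print(" ".join(rec))
```

Because sorted builds the ordered list in one call, there is no need to repeatedly find the highest value, remove it, and append it by hand.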

Modify function to output and save list

I am trying to return a list of unit numbers from about 1000 CSV file names. I can read them in, then get Python to strip the junk from around each unit number and replace the 5th character to format it the way I need. I would like to return a list of all the unit numbers, like ['6726-0501', '6826-1144']. What I currently get is each unit number printed one by one, without being saved. I have looked through previous questions, but I can't get the approach of creating a list, appending the unit numbers to it, and saving that list to a variable to work. Does anyone know a good way to modify this so that it outputs a list and saves it for later use?
Thanks,
Robin
file_names = ['job_1106_unit_672600501_las_PN23074.LAS.csv', 'job_1108_unit_682601144_las_PN23072.LAS.csv']
def change(file_names):
    for comps in file_names:
        comps_of_comps = list(comps)
        unit_num = comps_of_comps[14:23] #[672600501]
        a = (unit_num[0:4]) #[6726]
        b = (unit_num[5:9]) #[0501]
        unit_num = a + list('-') + b #[6,7,2,6,-,0,5,0,1]
        unit_num = ''.join(unit_num) #6726-0501
        print unit_num
change(file_names)
You can initialize a new list, append each unit number to it, and return that list, like this:
file_names = ['job_1106_unit_672600501_las_PN23074.LAS.csv', 'job_1108_unit_682601144_las_PN23072.LAS.csv']
def change(file_names):
    result = []
    for comps in file_names:
        comps_of_comps = list(comps)
        unit_num = comps_of_comps[14:23] #[672600501]
        a = (unit_num[0:4]) #[6726]
        b = (unit_num[5:9]) #[0501]
        unit_num = a + list('-') + b #[6,7,2,6,-,0,5,0,1]
        unit_num = ''.join(unit_num) #6726-0501
        result.append(unit_num)
    return result
print(change(file_names))
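Or, with a regular expression:
import re
def change(file_names):
    result = []
    for i in file_names:
        s = re.match('.*unit_(.*)_las.*', i).group(1)
        # Integer division (//) keeps the slice indexes as ints on Python 3 as well
        result.append(s[:len(s)//2]+"-"+s[(len(s)//2)+1:])
    return result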
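A variation on the regular-expression approach captures the two halves directly. This is a sketch under the assumption that every unit number is nine digits with a filler digit in the middle, as in the two sample names; unit_number and UNIT_RE are hypothetical names.

```python
import re

# Four digits, one filler digit that is dropped, then four more digits
UNIT_RE = re.compile(r"unit_(\d{4})\d(\d{4})_")

def unit_number(file_name):
    """Extract a unit number such as '6726-0501' from a file name (hypothetical helper)."""
    match = UNIT_RE.search(file_name)
    if match is None:
        raise ValueError("no unit number in %r" % file_name)
    return match.group(1) + "-" + match.group(2)

file_names = ['job_1106_unit_672600501_las_PN23074.LAS.csv',
              'job_1108_unit_682601144_las_PN23072.LAS.csv']
unit_numbers = [unit_number(name) for name in file_names]
print(unit_numbers)  # ['6726-0501', '6826-1144']
```

Raising on a non-matching name is a design choice for the sketch; with about 1000 files it may be preferable to collect the failures instead.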

Parsing string with Python correct way

I have some problems parsing a string the correct way. I want to split the complete string into two separate strings, and then remove the "=" signs from the first string and the "," sign from the second string. From my output I can conclude that I did something wrong, but I can't see where the problem lies. I want to convert the first part to integers, and I've already tried it with map(int, split()).
If anyone has a tip, I would appreciate that.
This is my output:
('5=20=22=10=2=0=0=1=0=1', 'Vincent Appel,Johannes Mondriaan')
This is my program:
mystring = "5=20=22=10=2=0=0=1=0=1;Vincent Appel,Johannes Mondriaan"
def split_string(mystring):
    strings = mystring.split(";")
    x = strings[0]
    y = strings[-1]
    print(x,y)
def split_scores(x):
    scores = x.split("=")
    score = scores[0]
    names = scores[-1]
    stnames(names)
    print score
def stnames(y):
    studentname = y.split(",")
    name = studentname[1]
    print name
split_string(mystring)
split_string(mystring) runs the first function, which prints the tuple of 2 strings. But nothing runs the other functions, which are intended to perform the further splitting.
Try:
x, y = split_string(mystring)
x1 = split_scores(x)
y1 = stnames(y)
(x1, y1)
Oops, your functions print the results; they don't return them. So you also need:
def split_string(mystring):
    # split mystring on ";"
    strings = mystring.split(";")
    return strings[0],strings[1]
def split_string(mystring):
    # this version raises an error if mystring does not have 2 parts
    x, y = mystring.split(";")
    return x,y
def split_scores(x):
    # return a list with all the scores
    return x.split("=")
def stnames(y):
    # return a list with all names
    return y.split(",")
def lastname(y):
    # return the last name (, delimited string)
    return y.split(",")[-1]
If you are going to split the task among functions, it is better to have them return the results rather than print them. That way they can be used in various combinations. print within a function only for debugging purposes.
Or a compact, script version:
x, y = mystring.split(';')
x = x.split('=')
y = y.split(',')[-1]
print y, x
If you want the scores as numbers, add:
x = [int(score) for score in x]
to the processing.
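Putting those pieces together, a single function can return both parts. This is a minimal sketch that assumes the record always has exactly one ";" and that every score is an integer; parse_record is a hypothetical name.

```python
def parse_record(record):
    """Split 'scores;names' into a list of ints and a list of names (hypothetical helper)."""
    scores_part, names_part = record.split(";")  # fails loudly unless there are exactly 2 parts
    scores = [int(s) for s in scores_part.split("=")]
    names = names_part.split(",")
    return scores, names

mystring = "5=20=22=10=2=0=0=1=0=1;Vincent Appel,Johannes Mondriaan"
scores, names = parse_record(mystring)
print(scores)  # [5, 20, 22, 10, 2, 0, 0, 1, 0, 1]
print(names)   # ['Vincent Appel', 'Johannes Mondriaan']
```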
Try this:
def split_string(mystring):
    strings = mystring.split(";")
    x = int(strings[0].replace("=",""))
    y = strings[-1].replace(","," ")
    print x,y
My two cents.
If I understood what you want to achieve, this code could help:
mystring = "5=20=22=10=2=0=0=1=0=1;Vincent Appel,Johannes Mondriaan"
def assignGradesToStudents(grades_and_indexes, students):
    list_length = len(grades_and_indexes)
    if list_length%2 == 0:
        # First half holds the grades, second half the student indexes;
        # integer division (//) keeps the slice bounds as ints on Python 3 too
        grades = grades_and_indexes[:list_length//2]
        indexes = grades_and_indexes[list_length//2:]
        return zip([students[int(x)] for x in indexes], grades)
grades_and_indexes, students = mystring.split(';')
students = students.split(',')
grades_and_indexes = grades_and_indexes.split('=')
results = assignGradesToStudents(grades_and_indexes, students)
for result in results:
    print("[*] {} got a {}".format(result[0], result[1]))
Output:
[*] Vincent Appel got a 5
[*] Vincent Appel got a 20
[*] Johannes Mondriaan got a 22
[*] Vincent Appel got a 10
[*] Johannes Mondriaan got a 2

Split list based on first character - Python

I am new to Python and can't quite figure out a solution to my problem. I would like to split a list into two lists, based on what each list item starts with. My list looks like this, where each line represents an item (this is not proper list notation, but I'll leave it like this for a better overview):
***
**
.param
+foo = bar
+foofoo = barbar
+foofoofoo = barbarbar
.model
+spam = eggs
+spamspam = eggseggs
+spamspamspam = eggseggseggs
So I want a list that contains all lines starting with a '+' between .param and .model and another list that contains all lines starting with a '+' after model until the end.
I have looked at enumerate() and split(), but since I have a list and not a string and am not trying to match whole items in the list, I'm not sure how to implement them.
What I have is this:
paramList = []
for line in newContent:
    while line.startswith('+'):
        paramList.append(line)
        if line.startswith('.'):
            break
This is just my attempt to create the first list. The problem is that the code reads the second block of '+' lines as well, because break only exits the while loop, not the for loop.
I hope you can understand my question and thanks in advance for any pointers!
What you want is really a simple task that can be accomplished using list slices and a list comprehension:
data = ['**','***','.param','+foo = bar','+foofoo = barbar','+foofoofoo = barbarbar',
        '.model','+spam = eggs','+spamspam = eggseggs','+spamspamspam = eggseggseggs']
# First get the interesting positions.
param_tag_pos = data.index('.param')
model_tag_pos = data.index('.model')
# Get all elements between tags.
params = [param for param in data[param_tag_pos + 1: model_tag_pos] if param.startswith('+')]
# Slice to the end of the list; [model_tag_pos + 1: -1] would drop the last item
models = [model for model in data[model_tag_pos + 1:] if model.startswith('+')]
print(params)
print(models)
Output
['+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar']
['+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']
Answer to comment:
Suppose you have a list containing numbers from 0 up to 5.
l = [0, 1, 2, 3, 4, 5]
Then using list slices you can select a subset of l:
another = l[2:5] # another is [2, 3, 4]
That is what we are doing here:
data[param_tag_pos + 1: model_tag_pos]
And for your last question: ...how does python know param are the lines in data it should iterate over and what exactly does the first param in param for param do?
Python doesn't know; you have to tell it.
The first param is just a variable name I'm using here; it could be x, list_items, or whatever you want.
Here is that line of code translated into plain English:
# Pythonian
params = [param for param in data[param_tag_pos + 1: model_tag_pos] if param.startswith('+')]
# English
params is a list of "things", for each "thing" we can see in the list `data`
from position `param_tag_pos + 1` to position `model_tag_pos`, just if that "thing" starts with the character '+'.
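For comparison, the loop from the question can be made to work by remembering which section the current line belongs to, instead of nesting a while loop inside the for loop. A minimal sketch; sections and current are names chosen for illustration.

```python
data = ['***', '**', '.param', '+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar',
        '.model', '+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']

sections = {}    # maps a tag such as '.param' to the '+' lines that follow it
current = None   # tag of the section being read; None before the first tag
for line in data:
    if line.startswith('.'):
        # A new tag starts a new, empty section
        current = line
        sections[current] = []
    elif line.startswith('+') and current is not None:
        sections[current].append(line)

print(sections['.param'])
print(sections['.model'])
```

Unlike the slice-based version, this does not need to know the tag names in advance, so it keeps working if more sections are added.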
data = {}
for line in newContent:
    if line.startswith('.'):
        cur_dict = {}
        data[line[1:]] = cur_dict
    elif line.startswith('+'):
        key, value = line[1:].split(' = ', 1)
        cur_dict[key] = value
This creates a dict of dicts:
{'model': {'spam': 'eggs',
'spamspam': 'eggseggs',
'spamspamspam': 'eggseggseggs'},
'param': {'foo': 'bar',
'foofoo': 'barbar',
'foofoofoo': 'barbarbar'}}
I am new to Python
Whoops. Don't bother with my answer then.
I want a list that contains all lines starting with a '+' between
.param and .model and another list that contains all lines starting
with a '+' after model until the end.
import itertools as it
import pprint
data = [
    '***',
    '**',
    '.param',
    '+foo = bar',
    '+foofoo = barbar',
    '+foofoofoo = barbarbar',
    '.model',
    '+spam = eggs',
    '+spamspam = eggseggs',
    '+spamspamspam = eggseggseggs',
]
results = [
    list(group) for key, group in it.groupby(data, lambda s: s.startswith('+'))
    if key
]
pprint.pprint(results)
print('-' * 20)
print(results[0])
print('-' * 20)
pprint.pprint(results[1])
--output:--
[['+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar'],
['+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']]
--------------------
['+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar']
--------------------
['+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']
This thing here:
it.groupby(data, lambda x: x.startswith('+'))
...tells python to create groups from the strings according to their first character. If the first character is a '+', then the string gets put into a True group. If the first character is not a '+', then the string gets put into a False group. However, there are more than two groups because consecutive False strings will form a group, and consecutive True strings will form a group.
Based on your data, the first three strings:
***
**
.param
will create one False group. Then, the next strings:
+foo = bar
+foofoo = barbar
+foofoofoo = barbarbar
will create one True group. Then the next string:
'.model'
will create another False group. Then the next strings:
'+spam = eggs'
'+spamspam = eggseggs'
'+spamspamspam = eggseggseggs'
will create another True group. The result will be something like:
{
    False: [strs here],
    True: [strs here],
    False: [strs here],
    True: [strs here]
}
Then it's just a matter of picking out each True group: if key, and then converting the corresponding group to a list: list(group).
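To see the groups themselves, each (key, group) pair can be collected into a list. A small sketch with the same data; note that groupby yields a sequence of pairs, so the dict-like picture above is only an analogy (a real dict could not hold the key False twice). The name runs is chosen for illustration.

```python
from itertools import groupby

data = ['***', '**', '.param', '+foo = bar', '+foofoo = barbar', '+foofoofoo = barbarbar',
        '.model', '+spam = eggs', '+spamspam = eggseggs', '+spamspamspam = eggseggseggs']

# Each run of consecutive items with the same key becomes one (key, items) pair
runs = [(key, list(group)) for key, group in groupby(data, lambda s: s.startswith('+'))]
for key, group in runs:
    print(key, group)
```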
Response to comment:
where exactly does python go through data, like how does it know s is
the data it's iterating over?
groupby() works like do_stuff() below:
def do_stuff(items, func):
    for item in items:
        print(func(item))
#Create the arguments for do_stuff():
data = [1, 2, 3]
def my_func(x):
    return x + 100
#Call do_stuff() with the proper argument types:
do_stuff(data, my_func) #Just like when calling groupby(), you provide some data
                        #and a function that you want applied to each item in data
--output:--
101
102
103
Which can also be written like this:
do_stuff(data, lambda x: x + 100)
lambda creates an anonymous function, which is convenient for simple functions which you don't need to refer to by name.
This list comprehension:
[
list(group)
for key, group in it.groupby(data, lambda s: s.startswith('+'))
if key
]
is equivalent to this:
results = []
for key, group in it.groupby(data, lambda s: s.startswith('+') ):
    if key:
        results.append(list(group))
Writing the for loop out explicitly can be clearer; the list comprehension is more compact and usually a little faster. Here is the comprehension annotated piece by piece:
[
    list(group) #The item you want to be in the results list for the current iteration of the loop here:
    for key, group in it.groupby(data, lambda s: s.startswith('+')) #A for loop
    if key #Only include the item for the current loop iteration in the results list if key is True
]
I would suggest doing things step by step.
1) Grab every word from the array separately.
2) Grab the first letter of the word.
3) Look if that is a '+' or '.'
Example code:
import re
class Dark():
    def __init__(self):
        # Array
        x = ['+Hello', '.World', '+Hobbits', '+Dwarves', '.Orcs']
        xPlus = []
        xDot = []
        # Values
        i = 0
        # Look through every word in the array one by one.
        while (i != len(x)):
            # Grab every word (s), and convert to string (y).
            s = x[i:i+1]
            y = '\n'.join(s)
            # Print word
            print(y)
            # Grab the first letter.
            letter = y[:1]
            if (letter == '+'):
                xPlus.append(y)
            elif (letter == '.'):
                xDot.append(y)
            else:
                pass
            # Add +1
            i = i + 1
        # Print lists
        print(xPlus)
        print(xDot)
#Run class
Dark()
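The same three steps can also be written without a class or an index counter. A shorter sketch using str.startswith and the same sample words; the variable names are chosen for illustration.

```python
words = ['+Hello', '.World', '+Hobbits', '+Dwarves', '.Orcs']

# Visit each word once and keep it if its first character matches
plus_words = [w for w in words if w.startswith('+')]
dot_words = [w for w in words if w.startswith('.')]

print(plus_words)  # ['+Hello', '+Hobbits', '+Dwarves']
print(dot_words)   # ['.World', '.Orcs']
```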
