Error saying 'Not enough values to unpack Expected 2 got 1' - python

tambola_callout = {}
for line in open("bingo-call-out.txt"):
    num, callout = line.split(";")
    tambola_callout[num] = callout
I don't know what the problem is. What can I do?

This line:
num, callout = line.split(';')
is expecting that your call to split will return a list of exactly two elements. Python will error out if you try to unpack too few or too many values during an assignment.
For example, this will return a single element list:
'something'.split(';') # == ['something']
Make sure your string is what you expect it to be.

It simply means that one of the lines in your file bingo-call-out.txt has the format <all characters here> instead of <some characters here>;<some characters here>. A blank line, such as an empty line at the end of the file, has the same effect.
In the general case, suppose line = 'abcd;efgh'.
Then line.split(";") returns a list of two elements, ['abcd', 'efgh'], which are assigned to num and callout respectively.
Now if there is a line = 'abcde', line.split(';') returns just ['abcde'], a single-element list, which cannot be unpacked into 2 variables as your syntax intends.
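One way to guard against such lines is to check the result of split before unpacking. The sketch below is only illustrative: the helper name parse_callouts and the sample lines are assumptions, not part of the original code, and it assumes that blank or malformed lines should simply be skipped.

```python
def parse_callouts(lines):
    """Build a {num: callout} dict, skipping lines without a ';' separator."""
    callouts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines, a common cause of this error
        parts = line.split(";", 1)  # split on the first ';' only
        if len(parts) != 2:
            continue  # assumption: silently skip malformed lines
        num, callout = parts
        callouts[num] = callout
    return callouts


# Hypothetical sample data standing in for bingo-call-out.txt
sample = ["1;Kelly's eye\n", "\n", "no separator here\n", "2;One little duck\n"]
result = parse_callouts(sample)
```

If skipping is not acceptable, the two continue statements could instead raise an error that names the offending line.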

Related

python enumerate out of range when looping through a file

I have a file of paths called test.txt
/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg001G_1_Clean.fastq.gz
/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg001G_2_Clean.fastq.gz
/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg001T_1_Clean.fastq.gz
/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg001T_2_Clean.fastq.gz
/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002G_1_Clean.fastq.gz
/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002G_2_Clean.fastq.gz
/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002T_1_Clean.fastq.gz
/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002T_2_Clean.fastq.gz
Note that the number of lines is always even. My final goal is to parse this file and create a new one by looping through these paths two at a time. I am trying the enumerate function, but it does not step through the file two lines at a time. Furthermore, I'm going out of range because the way I'm indexing is wrong. It would also be great if someone could tell me how to index properly with enumerate.
with open('./src/test.txt') as f:
    for index,line in enumerate(f):
        sample = re.search(r'pfg[\dGT]+',line)
        sample_string = sample.group(0)
        #print(sample_string)
        print('{{"name":"{0}","readgroup":"{0}","platform_unit":"{0}","fastq_1":"{1}","fastq_2":"{2}","library":"{0}"}},'.format(sample_string,line,line[index+1]))
The result is something like this:
{"name":"pfg001G","readgroup":"pfg001G","platform_unit":"pfg001G","fastq_1":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg001G_1_Clean.fastq.gz
","fastq_2":"g","library":"pfg001G"},
{"name":"pfg001G","readgroup":"pfg001G","platform_unit":"pfg001G","fastq_1":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg001G_2_Clean.fastq.gz
","fastq_2":"r","library":"pfg001G"},
{"name":"pfg001T","readgroup":"pfg001T","platform_unit":"pfg001T","fastq_1":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg001T_1_Clean.fastq.gz
","fastq_2":"o","library":"pfg001T"},
{"name":"pfg001T","readgroup":"pfg001T","platform_unit":"pfg001T","fastq_1":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg001T_2_Clean.fastq.gz
","fastq_2":"u","library":"pfg001T"},
{"name":"pfg002G","readgroup":"pfg002G","platform_unit":"pfg002G","fastq_1":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002G_1_Clean.fastq.gz
","fastq_2":"p","library":"pfg002G"},
{"name":"pfg002G","readgroup":"pfg002G","platform_unit":"pfg002G","fastq_1":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002G_2_Clean.fastq.gz
","fastq_2":"s","library":"pfg002G"},
{"name":"pfg002T","readgroup":"pfg002T","platform_unit":"pfg002T","fastq_1":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002T_1_Clean.fastq.gz
","fastq_2":"/","library":"pfg002T"},
{"name":"pfg002T","readgroup":"pfg002T","platform_unit":"pfg002T","fastq_1":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002T_2_Clean.fastq.gz","fastq_2":"c","library":"pfg002T"},
Clearly the indexing is wrong, since it is walking through the characters of my path (g, r, etc.) instead of printing the next path. For the first iteration the next path printed should be: "fastq_2":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg001G_2_Clean.fastq.gz".
I believe the problem can be tackled more elegantly with itertools, I just don't know how to do it. It would also be great if someone could tell me whether indexing with enumerate could work.
One problem is that you are trying to access the data from the second line of the pair before you have read it. Additionally you can not access the second line with line[index + 1] because that refers to a character in the current line, not the next line which hasn't yet been read.
So you need to keep track of pairs of lines. You can use the index provided by enumerate() to determine whether the current line is the first (because it is an even number) or the second (because it's odd). Store the name and path for fastq_1 when you read the first line. Only write the output on the second line. Like this:
import re
with open('test.txt') as f:
    for index, line in enumerate(f):
        if index % 2 == 0: # even, so this is the first line of a pair
            name = re.search(r'pfg[\dGT]+',line).group(0)
            fastq_1 = line.rstrip()
        else: # odd, so second line. Emit result
            fastq_2 = line.rstrip()
            print('{{"name":"{0}","readgroup":"{0}","platform_unit":"{0}","fastq_1":"{1}","fastq_2":"{2}","library":"{0}"}},'.format(name, fastq_1, fastq_2))
line.rstrip() is required to remove the trailing new line character at the end of each line.
@mhawke already provided a good solution, but to give another approach, "looping through these ... on a two by two basis" can be done with the more_itertools.chunked function from the more_itertools library or with the grouper() recipe from the itertools documentation.
This also gives options for what should happen when the last line is an odd one; whether that should raise an error or pair it with a default value.
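As a rough illustration of the pairing idea, here is a sketch that uses only the standard library. It is not the grouper() recipe verbatim: the helper name pairs and the shortened sample paths are made up for this example, and it assumes that a missing partner on an odd last line should be filled with None.

```python
import re
from itertools import zip_longest


def pairs(iterable, fillvalue=None):
    """Yield consecutive, non-overlapping pairs: (a, b), (c, d), ..."""
    it = iter(iterable)
    # Passing the same iterator twice makes zip_longest pull two items per tuple.
    return zip_longest(it, it, fillvalue=fillvalue)


# Hypothetical shortened paths standing in for the lines of test.txt
lines = [
    "/data/Clean/pfg001G_1_Clean.fastq.gz\n",
    "/data/Clean/pfg001G_2_Clean.fastq.gz\n",
    "/data/Clean/pfg001T_1_Clean.fastq.gz\n",
    "/data/Clean/pfg001T_2_Clean.fastq.gz\n",
]

records = []
for first, second in pairs(line.rstrip() for line in lines):
    name = re.search(r'pfg[\dGT]+', first).group(0)
    records.append((name, first, second))
```

Each entry of records then holds the sample name plus the two paths that belong together, ready to be formatted into the output string.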
Keep in mind that line[index + 1] indexes into the current string, so you get a single character of that line rather than the next line.
What you can do is read the file into a list and then use the index to reach whichever line you want.
I still don't fully understand the goal: do you want every line paired with the one after it in fastq_1 and fastq_2, or each path matched to its own key?
Code Syntax
import re
# path is the location of test.txt
with open(path) as f:
    lis = list(f)
    for index, line in enumerate(lis):
        try:
            sample = re.search(r'pfg[\dGT]+',line)
            sample_string = sample.group(0)
            print(f'{{"name":"{sample_string}","readgroup":"{sample_string}","platform_unit":"{sample_string}","fastq_1":"{line}","fastq_2":"{lis[index+1]}","library":"{sample_string}"}},')
        except IndexError:
            break
Output
{"name":"pfg001G","readgroup":"pfg001G","platform_unit":"pf
g001G","fastq_1":"/groups/cgsd/javed/validation_set/Leung
SY_Targeted_SS-190528-01a/Clean/pfg001G_1_Clean.fastq.gz
","fastq_2":"/groups/cgsd/javed/validation_set/LeungSY_Ta
rgeted_SS-190528-01a/Clean/pfg001G_2_Clean.fastq.gz
","library":"pfg001G"},
{"name":"pfg001G","readgroup":"pfg001G","platform_unit":"pf
g001G","fastq_1":"/groups/cgsd/javed/validation_set/Leung
SY_Targeted_SS-190528-01a/Clean/pfg001G_2_Clean.fastq.gz
","fastq_2":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg001T_1_Clean.fastq.gz
","library":"pfg001G"},
{"name":"pfg001T","readgroup":"pfg001T","platform_unit":"pf
g001T","fastq_1":"/groups/cgsd/javed/validation_set/Leung
SY_Targeted_SS-190528-01a/Clean/pfg001T_1_Clean.fastq.gz
","fastq_2":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg001T_2_Clean.fastq.gz
","library":"pfg001T"},
{"name":"pfg001T","readgroup":"pfg001T","platform_unit":"pf
g001T","fastq_1":"/groups/cgsd/javed/validation_set/Leung
SY_Targeted_SS-190528-01a/Clean/pfg001T_2_Clean.fastq.gz
","fastq_2":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002G_1_Clean.fastq.gz
","library":"pfg001T"},
{"name":"pfg002G","readgroup":"pfg002G","platform_unit":"pf
g002G","fastq_1":"/groups/cgsd/javed/validation_set/Leung
SY_Targeted_SS-190528-01a/Clean/pfg002G_1_Clean.fastq.gz
","fastq_2":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002G_2_Clean.fastq.gz
","library":"pfg002G"},
{"name":"pfg002G","readgroup":"pfg002G","platform_unit":"pf
g002G","fastq_1":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002G_2_Clean.fastq.gz
","fastq_2":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002T_1_Clean.fastq.gz
","library":"pfg002G"},
{"name":"pfg002T","readgroup":"pfg002T","platform_unit":"pfg002T","fastq_1":"/groups/cgsd/javed/validation_set/Leung
SY_Targeted_SS-190528-01a/Clean/pfg002T_1_Clean.fastq.gz
","fastq_2":"/groups/cgsd/javed/validation_set/LeungSY_Targeted_SS-190528-01a/Clean/pfg002T_2_Clean.fastq.gz
","library":"pfg002T"},
[Program finished]

Why does this error occur when my text files have clearly more than 1 lines?

I'm a beginner in Python. I've checked my text files, and they definitely have more than one line, so I don't understand why I get this error:
---> 11 Coachid.append(split[1].rstrip())
IndexError: list index out of range
The problem is in these lines:
split=line.split(",")
Coachname.append(split[0].rstrip())
Coachid.append(split[1].rstrip())
The first line assumes that line contains at least one comma, so that after split is called the variable split will be a list of length two or more. But if line contains no commas, split will have length 1 and Coachid.append(split[1].rstrip()) will raise the error you are getting. You need to add some conditional tests on the length of split.
Update
Your code should look like (assuming that the correct action is to append an empty string to the Coachid list if it is missing from the input):
split=line.split(",")
split_length = len(split)
Coachname.append(split[0].rstrip())
# append '' if split_length is less than 2:
Coachid.append('' if split_length < 2 else split[1].rstrip())
etc. for the other fields
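The same idea can be applied to every field at once by padding the split result to the expected length. This is only a sketch: the helper name split_fields, the expected count of 2, and the sample lines are assumptions for illustration, and it uses strip() rather than rstrip(), so leading spaces are removed too.

```python
def split_fields(line, expected):
    """Split a comma-separated line into exactly `expected` fields, padding with ''."""
    fields = [field.strip() for field in line.rstrip("\n").split(",")]
    fields += [""] * (expected - len(fields))  # pad short lines with empty strings
    return fields[:expected]  # drop any unexpected extra fields


Coachname, Coachid = [], []
for line in ["Alice,C001\n", "Bob\n", "\n"]:  # hypothetical file contents
    if not line.strip():
        continue  # skip blank lines entirely
    name, coach_id = split_fields(line, 2)
    Coachname.append(name)
    Coachid.append(coach_id)
```

With this shape the unpacking never fails, at the cost of silently accepting incomplete lines.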
To loop over the lines of a file, iterate over the file object directly; f.readlines() also works, but it reads the whole file into memory first:
for line in f:
    ...

String split() function not doing what's expected

I have a problem with the split function. What am I doing wrong here?
I get the error:
not enough values to unpack (expected 2, got 1) at the marked line.
The value in line is: ;Electronics:iPhone,3999;Galaxy,2999;Xiaomi,1999.
After the first split, newCategory contains "Electronics", but products contains only "i".
from collections import defaultdict
import sys
items = open("store.txt" , "r")
categories= dict()
for line in items:
    if line == '\n':
        break
    details= line.split(':')
    products=details[1]
    newCategory=details[0].lstrip
    categories[newCategory]=dict()
    products=products[0].split(';')
    for p in products:
        name,price=p.split(',') # THIS LINE.
        name=name.lstrip
        price=price.lstrip
        categories[newCategory][name]=price
OK, so look at your code: after reading the line in question and splitting it on ":", you have a list with 2 cells:
#0 contains: Electronics
#1 contains: iPhone,3999;Galaxy,2999;Xiaomi,1999
So by reading products[0] you're taking just the first character of the string iPhone,3999;Galaxy,2999;Xiaomi,1999, which is "i". Splitting "i" on ',' yields a single value, hence the unpacking error.
You should replace the line below
products=products[0].split(';')
with
products=products.split(';')
I'm not sure what else you were trying to do. Note, however, that the loop
"for line in items" yields one line of the file at a time, not one character; the character-by-character behaviour comes from indexing the string with products[0].
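Putting the fix together, a corrected version might look like the sketch below. The helper name parse_store_line is hypothetical, and the sample line is assumed to have no leading ";". Note also that lstrip has to be called with parentheses; writing name.lstrip without them stores the method object itself rather than the stripped string.

```python
def parse_store_line(line):
    """Parse 'Category:name,price;name,price' into (category, {name: price})."""
    category, _sep, rest = line.strip().partition(":")
    products = {}
    for item in rest.split(";"):
        if not item:
            continue  # tolerate empty chunks such as a trailing ';'
        name, price = item.split(",")
        products[name.strip()] = price.strip()  # note the call parentheses
    return category.strip(), products


categories = {}
category, products = parse_store_line(
    "Electronics:iPhone,3999;Galaxy,2999;Xiaomi,1999\n"
)
categories[category] = products
```

Prices stay as strings here; converting them with int() would be a separate decision.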

How to read from a file into a dict with string key and tuple value?

For an assignment, I'm creating a program that retrieves from a file information regarding Olympic countries and their medal count.
One of my functions goes through a list in this format:
Country,Games,Gold,Silver,Bronze
AFG,13,0,0,2
ALG,15,5,2,8
ARG,40,18,24,28
ARM,10,1,2,9
ANZ,2,3,4,5
The function needs to go through this list, and store into a dictionary with the country name as a key, and the remaining four entries as a tuple.
Here is what I am working with so far:
def medals(string):
    '''takes a file, and gathers up the country codes and their medal counts
    storing them into a dictionary'''
    #creates an empty dictionary
    medalDict = {}
    #creates an empty tuple
    medalCount = ()
    #These following two lines remove the column headings
    with open(string) as fin:
        next(fin)
        for eachline in fin:
            code, medal_count = eachline.strip().split(',',1)
            medalDict[code] = medal_count
    return medalDict
Now, the intent is for the entries to look something like this
{'AFG': (13, 0, 0, 2)}
Instead, I'm getting
{'AFG': '13,0,0,2'}
It looks like it is being stored as a string, and not a tuple. Is it something to do with the
medalDict[code] = medal_count
line of code? I'm not too sure how to convert that into separate integer values for a tuple neatly.
You are storing the whole string '13,0,0,2' as value, so
medalDict[code] = medal_count
should be replaced by:
medalDict[code] = tuple(medal_count.split(','))
Your original approach is correct apart from this one line. The change splits '13,0,0,2' into the list ['13', '0', '0', '2'] and converts it into a tuple.
You can also do this to convert strings inside into integers:
medalDict[code] = tuple([int(ele) for ele in medal_count.split(',')])
But make sure your medal_count contains only integers.
This line:
code, medal_count = eachline.strip().split(',',1)
... is splitting the whitespace-stripped eachline, 1 time, on ',', then storing the resulting two strings into code and medal_count ... so yes, medal_count contains a string.
You could handle this one of two ways:
Add a line along the lines of:
split_counts = tuple(medal_count.split(','))
... and then use split_counts from there on in the code, or
(in Python 3) Change the line above to
code, *medal_count = eachline.strip().split(',')
... which makes use of Extended iterable unpacking (and will give you a list, so if a tuple is necessary it'll need to be converted).
Your Problem seems to be this:
split(',',1)
# should be
split(',')
because split(..., 1) only makes 1 split and split(...) splits as much as possible.
So you should be able to do this:
for eachline in fin:
    code, *medal_count = eachline.strip().split(',')
    medalDict[code] = medal_count
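Combining the suggestions above (a full split, extended unpacking, and conversion to a tuple of integers) could look like this sketch. The function name read_medals is hypothetical, and it takes any iterable of lines instead of a filename so that the example runs without a file; an open file object could be passed in the same way.

```python
def read_medals(lines):
    """Map each country code to a tuple of its integer counts."""
    medal_dict = {}
    rows = iter(lines)
    next(rows, None)  # skip the column headings, if any
    for row in rows:
        row = row.strip()
        if not row:
            continue  # ignore blank lines
        code, *counts = row.split(",")  # extended iterable unpacking
        medal_dict[code] = tuple(int(count) for count in counts)
    return medal_dict


sample = [
    "Country,Games,Gold,Silver,Bronze\n",
    "AFG,13,0,0,2\n",
    "ALG,15,5,2,8\n",
]
medals = read_medals(sample)
```

The int() conversion raises ValueError on non-numeric data, which matches the earlier caveat that the counts must contain only integers.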

Get a value from a string in python

Program Details:
I am writing a program for python that will need to look through a text file for the line:
Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.
Problem:
Then after the program has found that line, it will then store the line into an array and get the value 19.612545, from f = 19.612545.
Question:
I so far have been able to store the line into an array after I have found it. However I am having trouble as to what to use after I have stored the string to search through the string, and then extract the information from variable f. Does anyone have any suggestions or tips on how to possibly accomplish this?
Depending upon how you want to go at it, CosmicComputer is right to refer you to Regular Expressions. If your syntax is this simple, you could always do something like:
line = 'Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.'
splitByComma=line.split(',')
fValue = splitByComma[1].replace('f= ', '').strip()
print(fValue)
Results in 19.612545 being printed (still a string though).
Split your line by commas, grab the 2nd chunk, and break out the f value. Error checking and conversions left up to you!
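For comparison, the regular-expression route mentioned above might look like the following sketch. The pattern is an assumption about the line format (the text "f=" followed by optional spaces and a decimal number, possibly with an exponent), and the variable names are illustrative.

```python
import re

line = 'Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.'

# Capture the number that follows "f=", allowing an optional sign and exponent.
match = re.search(r'f=\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)', line)
if match is None:
    raise ValueError("no 'f=' value found in line")
f_value = float(match.group(1))  # converted from string to float
```

Checking match for None before calling group() avoids an AttributeError on lines that do not contain the value.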
Using regular expressions here is madness. Just use str.find as follows (where string is the name of the variable that holds your string):
index = string.find('f=')
index = index + 2  # skip over 'f='
string = string[index:]  # cut off everything before the value
string = string.split(',')  # split the remaining string on commas
your_value = string[0].strip()  # first field, with the leading space removed
I know it's ugly, but it's nothing compared with RE.
