How to use a string as a list's indices in Python

for line in f.readlines():
    (addr, vlanid, videoid, reqs, area) = line.split()
    if vlanid not in dict:
        dict[vlanid] = []
    video_dict = dict[vlanid]
    if videoid not in video_dict:
        video_dict[videoid] = []
    video_dict[videoid].append((addr, vlanid, videoid, reqs, area))
Here is my code. I want to use videoid as the index when creating a list. The real videoid values are different strings, like this: FYFSYJDHSJ
I got this error message:
video_dict[videoid] = []
TypeError: list indices must be integers, not str
How can I assign an identifier like 1, 2, 3, 4 to the different strings in this case?

Use a dictionary instead of a list:
if vlanid not in dict:
    dict[vlanid] = {}
P.S. I recommend that you call dict something else so that it doesn't shadow the built-in dict.

Don't use dict as a variable name. Try this (d instead of dict):
d = {}
for line in f.readlines():
    (addr, vlanid, videoid, reqs, area) = line.split()
    video_dict = d.setdefault(vlanid, {})
    video_dict.setdefault(videoid, []).append((addr, vlanid, videoid, reqs, area))
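The same nested grouping can also be sketched with collections.defaultdict, which creates the inner containers on first access. This is only a sketch: the sample lines below are hypothetical stand-ins for the file's contents, and the name records is an assumption, not something from the question.

```python
from collections import defaultdict

# Hypothetical sample lines standing in for the real file:
# addr, vlanid, videoid, reqs, area separated by whitespace.
lines = [
    "10.0.0.1 100 FYFSYJDHSJ 3 north",
    "10.0.0.2 100 FYFSYJDHSJ 5 south",
    "10.0.0.3 200 ABCDEFGHIJ 1 north",
]

# Outer dict: vlanid -> inner dict; inner dict: videoid -> list of records.
records = defaultdict(lambda: defaultdict(list))

for line in lines:
    addr, vlanid, videoid, reqs, area = line.split()
    # String keys are fine here because both levels are dictionaries.
    records[vlanid][videoid].append((addr, vlanid, videoid, reqs, area))

for vlanid, videos in records.items():
    for videoid, rows in videos.items():
        print(vlanid, videoid, len(rows))
```

One thing to keep in mind with defaultdict: merely reading a missing key creates it, so use `in` for membership checks if that matters.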

As suggested above, dictionaries are the most suitable structure here (although you should avoid naming one dict, since that shadows Python's built-in dict type).
Your code may look something like what @aix already posted above:
for line in f.readlines():
    d = dict(zip(("addr", "vlanid", "videoid", "reqs", "area"), tuple(line.split())))
You can then do something with the dictionary d later in your code. Just remember that d is rebound on every iteration, so if you don't use d until after the loop is complete, you'll only get the values from the last line of the file.

Related

How to sort a Python dictionary by a substring contained in the keys, according to the order set in a list?

I'm very new to Python and I'm stuck on a task. First I read a file containing several FASTA sequences into a dictionary keyed by sequence name, then managed to select only those I want, based on substrings in the keys that are defined in the list "flu_genes".
Now I'm trying to reorder the items in this dictionary based on the order of the substrings in the list "flu_genes". I'm completely stuck; I found a way of reordering based on the key order in a list, BUT that is not my case, as the order is defined not by the keys but by a substring within the keys.
I should also add that in this case the substring is at the end, in the format "_GENE"; however, it could be in the middle of the string with the same format, perhaps "GENE", so I'd rather not rely on code that finds the substring at the end of the string.
I hope this is clear enough, and thanks in advance for any help!
"full_genome.fasta"
>A/influenza/1/1_NA
atgcg
>A/influenza/1/1_NP
ctgat
>A/influenza/1/1_FluB
agcta
>A/influenza/1/1_HA
tgcat
>A/influenza/1/1_FluC
agagt
>A/influenza/1/1_M
tatag
consensus = {}
flu_genes = ['_HA', '_NP', '_NA', '_M']
with open("full_genome.fasta", 'r') as myseq:
    for line in myseq:
        line = line.rstrip()
        if line.startswith('>'):
            key = line[1:]
        else:
            if key in consensus:
                consensus[key] += line
            else:
                consensus[key] = line
flu_fas = {key : val for key, val in consensus.items() if any(ele in key for ele in flu_genes)}
print("Dictionary after removal of keys : " + str(flu_fas))
>>>Dictionary after removal of keys : {'>A/influenza/1/1_NA': 'atgcg', '>A/influenza/1/1_NP': 'ctgat', '>A/influenza/1/1_HA': 'tgcat', '>A/influenza/1/1_M': 'tatag'}
#reordering by keys order (not going to work!) as in: https://try2explore.com/questions/12586065
reordered_dict = {k: flu_fas[k] for k in flu_genes}
A dictionary is conceptually unordered, but since Python 3.7 it is guaranteed to preserve insertion order (in CPython 3.6 this was an implementation detail), and you're not going to change anything later, so you can do what you're doing.
The problem is, of course, that you're not working with the actual keys. So let's just set up a list of the keys, and sort that according to your criteria. Then you can do the other thing you did, except using the actual keys.
flu_genes = ['_HA', '_NP', '_NA', '_M']
def get_gene_index(k):
    for index, gene in enumerate(flu_genes):
        if k.endswith(gene):
            return index
    raise ValueError('I thought you removed those already')
reordered_keys = sorted(flu_fas.keys(), key=get_gene_index)
reordered_dict = {k: flu_fas[k] for k in reordered_keys}
for k, v in reordered_dict.items():
    print(k, v)
A/influenza/1/1_HA tgcat
A/influenza/1/1_NP ctgat
A/influenza/1/1_NA atgcg
A/influenza/1/1_M tatag
Normally, I wouldn't do an n-squared sort, but I'm assuming the number of lines in the data file is much larger than the number of flu_genes, making the scan over flu_genes essentially a fixed constant.
This may or may not be the best data structure for your application, but I'll leave that to code review.
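Since the question mentions that the gene tag might not always sit at the end of the key, here is a minimal sketch that ranks keys by the first entry of flu_genes found anywhere in the key. The flu_fas contents are copied from the example, and the helper name gene_rank is made up for illustration. A plain substring test is an assumption that can misfire (for example '_M' would also match a key containing '_MP'), so treat it as a starting point only.

```python
flu_genes = ['_HA', '_NP', '_NA', '_M']

# Filtered dictionary as in the example above.
flu_fas = {
    'A/influenza/1/1_NA': 'atgcg',
    'A/influenza/1/1_NP': 'ctgat',
    'A/influenza/1/1_HA': 'tgcat',
    'A/influenza/1/1_M': 'tatag',
}

def gene_rank(key):
    """Return the position in flu_genes of the first gene tag found in key."""
    for index, gene in enumerate(flu_genes):
        if gene in key:  # substring test, not endswith
            return index
    # Keys with no known tag sort last instead of raising.
    return len(flu_genes)

# sorted() is stable, so keys sharing a rank keep their original order.
reordered_dict = dict(sorted(flu_fas.items(), key=lambda item: gene_rank(item[0])))

for k, v in reordered_dict.items():
    print(k, v)
```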
It's because you are trying to reorder it with non-existent dictionary keys. Your keys are
['>A/influenza/1/1_NA', '>A/influenza/1/1_NP', '>A/influenza/1/1_HA', '>A/influenza/1/1_M']
which don't match the list
['_HA', '_NP', '_NA', '_M']
You first need to transform the keys so that they match. Since we know the pattern (the gene is at the end of the string, after an underscore), we can split at underscores and take the last piece.
consensus = {}
flu_genes = ['_HA', '_NP', '_NA', '_M']
with open("full_genome.fasta", 'r') as myseq:
    for line in myseq:
        line = line.rstrip()
        if line.startswith('>'):
            sequence = line
            gene = line.split('_')[-1]
            key = f"_{gene}"
        else:
            consensus[key] = {
                'sequence': sequence,
                'data': line
            }
flu_fas = {key : val for key, val in consensus.items() if any(ele in key for ele in flu_genes)}
print("Dictionary after removal of keys : " + str(flu_fas))
reordered_dict = {k: flu_fas[k] for k in flu_genes}

How can I store this in a dictionary?

I must write a dictionary. This is my first time doing it and I can't wrap my head around it. The first 5 characters of each line should be the key and the rest the value.
for i in verseny:
    if i not in eredmeny:
        eredmeny[i] = 1
    else:
        eredmeny[i] += 1
YS869 CCCADCADBCBCCB is a line from the homework. YS869 should be the key and CCCADCADBCBCCB should be the value.
The problem is that I can't store them in a dictionary. I'm grinding gears here but getting nowhere.
Assuming that eredmeny is your dictionary name and that verseny is the list of strings containing your keys and values, this should do it:
verseny = ['YS869 CCCADCADBCBCCB', 'CS769 CCCADCADBCBCCB', 'BS869 CCCADCADBCBCCB']
eredmeny = {}
for i in verseny:
    key, value = i.split(' ')[0], i.split(' ')[1]
    if key not in eredmeny.keys():
        # store a list so that a repeated key can append further values
        eredmeny[key] = [value]
    else:
        eredmeny[key].append(value)
I don't fully understand the question, but an easy way to do the task at hand would be to take the line as a string and use split():
line = 'YS869 CCCADCADBCBCCB'
words = line.split()
d = {words[0]: words[1]}
print(d)
This is a basic Python skill, and existing tutorials cover it. For your example, slicing also works:
line = 'YS869 CCCADCADBCBCCB'
dictionary = {}
# the key is the first 5 characters; the value starts after the space at index 5
dictionary[line[:5]] = line[6:]
print(dictionary) # {'YS869': 'CCCADCADBCBCCB'}
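For a whole list of such lines, a minimal sketch could split each line once on whitespace and store the two halves. It reuses the names verseny and eredmeny from the question; the sample data is assumed, and a line without a space would raise a ValueError on unpacking, so real input may need a guard.

```python
# Assumed sample data in the "KEY VALUE" layout from the question.
verseny = [
    'YS869 CCCADCADBCBCCB',
    'CS769 CCCADCADBCBCCB',
    'BS869 CCCADCADBCBCCB',
]

eredmeny = {}
for line in verseny:
    # maxsplit=1 splits only at the first run of whitespace,
    # so the key does not have to be exactly 5 characters long.
    key, value = line.split(maxsplit=1)
    eredmeny[key] = value

print(eredmeny)
```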

Python: dictionary to collection

I have a file with 2 columns:
Anzegem Anzegem
Gijzelbrechtegem Anzegem
Ingooigem Anzegem
Aalst Sint-Truiden
Aalter Aalter
The first column is a town and the second column is the district of that town.
I made a dictionary of that file like this:
def readTowns(text):
    input = open(text, 'r')
    file = input.readlines()
    dict = {}
    verzameling = set()
    for line in file:
        tmp = line.split()
        dict[tmp[0]] = tmp[1]
    return dict
If I set a variable 'writeTowns' equal to readTowns(text) and do writeTowns['Anzegem'], I want to get the collection {'Anzegem', 'Gijzelbrechtegem', 'Ingooigem'}.
Does anybody know how to do this?
I think you can just create another function that builds the data structure you need. Otherwise you will end up writing code that manipulates the dictionary returned by readTowns to produce the data you want, so why not keep the code clean and create a separate function for that? Just build a dictionary that maps each district name to a list of towns and you are all set.
def writeTowns(text):
    input = open(text, 'r')
    file = input.readlines()
    dict = {}
    for line in file:
        tmp = line.split()
        dict[tmp[1]] = dict.get(tmp[1]) or []
        dict.get(tmp[1]).append(tmp[0])
    return dict
writeTown = writeTowns('file.txt')
print(writeTown['Anzegem'])
And if you are concerned about reading the same file twice, you can do something like this as well:
def readTowns(text):
    input = open(text, 'r')
    file = input.readlines()
    dict2town = {}
    town2dict = {}
    for line in file:
        tmp = line.split()
        dict2town[tmp[0]] = tmp[1]
        town2dict[tmp[1]] = town2dict.get(tmp[1]) or []
        town2dict.get(tmp[1]).append(tmp[0])
    return dict2town, town2dict
dict2town, town2dict = readTowns('file.txt')
print(town2dict['Anzegem'])
You could do something like this, although please have a look at @ubadub's answer; there are better ways to organise your data.
[town for town, region in dic.items() if region == 'Anzegem']
It sounds like you want to make a dictionary where the keys are the districts and the values are a list of towns.
A basic way to do this is:
def readTowns(text):
    with open(text, 'r') as f:
        file = f.readlines()
    my_dict = {}
    for line in file:
        tmp = line.split()
        if tmp[1] in my_dict:
            my_dict[tmp[1]].append(tmp[0])
        else:
            my_dict[tmp[1]] = [tmp[0]]
    return my_dict
The if/else blocks can also be replaced with Python's collections.defaultdict, a dict subclass, but I've used the if/else statements here for readability.
Some other points: dict is a built-in Python type (and file was one in Python 2), so it is bad practice to overwrite these names with your own local variables (notice I've changed dict to my_dict in the code above).
If you build your dictionary as {town: district}, so the town is the key and the district is the value, you can't do this easily*, because a dictionary is not meant to be used in that way. Dictionaries allow you to easily find the value associated with a given key. So if you want to find all the towns in a district, you are better off building your dictionary as:
{district: [list_of_towns]}
So for example the district Anzegem would appear as {'Anzegem': ['Anzegem', 'Gijzelbrechtegem', 'Ingooigem']}
And of course the value is your collection.
*you could probably do it by iterating through the entire dict and checking where your matches occur, but this isn't very efficient.
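The {district: towns} layout described above can be sketched with collections.defaultdict and sets, since the question asks for a collection like {'Anzegem', ...}. The function name towns_by_district and the in-memory sample_lines are assumptions made for illustration; with a real file you would pass the open file object instead of the list.

```python
from collections import defaultdict

# Sample rows copied from the question: "town district" per line.
sample_lines = [
    "Anzegem Anzegem",
    "Gijzelbrechtegem Anzegem",
    "Ingooigem Anzegem",
    "Aalst Sint-Truiden",
    "Aalter Aalter",
]

def towns_by_district(lines):
    """Map each district to the set of towns that belong to it."""
    result = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        town, district = parts
        result[district].add(town)
    return result

districts = towns_by_district(sample_lines)
print(districts['Anzegem'])
```

A set ignores duplicates and has no order; if the order of towns in the file matters, a list (defaultdict(list) with append) would be the closer fit.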

After I split a string how can I use the list that is created to then create a dictionary?

I have a file of paired names that are related to each other. I need the first name of each pair to be a key and the second to be a value, but when I run the program, I get the error
ValueError: too many values to unpack
I have researched this, but, I haven't found a way to fix it. Below is the code, and a link to some of the material I found in trying to fix this issue.
http://www.youtube.com/watch?v=p2BwrdjlsW4
dataFile = open("names.dat", 'r')
myDict = {}
for line in dataFile:
    k,v = line.strip( ). split(",")
    myDict[k.strip (":")] = v.strip ( )
    print(k, v)
dataFile.close()
def findFather(myDict, lookUp):
    for key, value in myDict.items ( ):
        for v in value:
            if lookUp in value:
                return key
lookUp = raw_input ("Enter a son's name: ")
print "The father you are looking for is ",findFather(myDict, lookUp)
the file is saved as "names.dat" and is listed all on one line with the values:
john:fred, fred:bill, sam:tony, jim:william, william:mark, krager:holdyn, danny:brett, danny:issak, danny:jack, blasen:zade, david:dieter, adam:seth, seth:enos
The code
line.strip( ). split(",")
returns a list like:
["john:fred", " fred:bill", " sam:tony", ...]
so, when you do
k,v = line.strip( ). split(",")
you're trying to unpack every value of that list into just two names, k and v.
Try this code:
for line in dataFile:
    for pair in line.strip(). split(","):
        k,v = pair. split(":")
        myDict[k.strip (":")] = v.strip()
        print(k, v)
NOTE: The code above only removes the error you're getting. I don't guarantee that it does what you want. Also, I have no idea what you are trying to do with this line:
myDict[k.strip (":")] = v.strip()
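Two things in the sample data are worth handling: each pair after the first starts with a space, and some names on the left (danny) appear more than once, so a plain dict would keep only the last pair. Below is a minimal sketch that assumes the left name is the father and the right name the son, which is a guess based on findFather. The names sons_by_father and find_father are hypothetical, and the line is a shortened copy of the data.

```python
from collections import defaultdict

# Shortened copy of the single line stored in names.dat.
line = "john:fred, fred:bill, sam:tony, danny:brett, danny:issak, danny:jack"

# Assumed meaning: "father:son". A list per father keeps repeated keys.
sons_by_father = defaultdict(list)
for pair in line.strip().split(","):
    # pair.strip() removes the space that follows each comma.
    father, son = pair.strip().split(":")
    sons_by_father[father].append(son)

def find_father(sons_by_father, son):
    """Return the father recorded for son, or None if there is none."""
    for father, sons in sons_by_father.items():
        if son in sons:
            return father
    return None

print(find_father(sons_by_father, "issak"))
```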

Creating a dictionary of lists from a file

I have a list in the following format in a txt file :
Shoes, Nike, Addias, Puma,...other brand names
Pants, Dockers, Levis,...other brand names
Watches, Timex, Tiesto,...other brand names
How can I put these into a dictionary in this format:
dictionary={Shoes: [Nike, Addias, Puma,.....]
Pants: [Dockers, Levis.....]
Watches:[Timex, Tiesto,.....]
}
How can I do this in a for loop rather than by manual input?
I have tried:
clothes=open('clothes.txt').readlines()
clothing=[]
stuff=[]
for line in clothes:
    items=line.replace("\n","").split(',')
    clothing.append(items[0])
    stuff.append(items[1:])
Clothing:{}
for d in clothing:
    Clothing[d]= [f for f in stuff]
Here's a more concise way to do things, though you'll probably want to split it up a bit for readability
wordlines = [line.split(', ') for line in open('clothes.txt').read().split('\n')]
d = {w[0]:w[1:] for w in wordlines}
How about:
file = open('clothes.txt')
clothing = {}
for line in file:
    items = [item.strip() for item in line.split(",")]
    clothing[items[0]] = items[1:]
Try this; stripping each line handles the line breaks, and it is quite simple but effective:
clothes = {}
with open('clothes.txt', 'r') as clothesfile:
    for line in clothesfile:
        # strip the trailing newline, then split on commas and trim each item
        items = [item.strip() for item in line.strip().split(',')]
        clothes[items[0]] = items[1:]
The 'with' statement will make sure the file reader is closed after your code to implement the dictionary is executed. From there you can use the dictionary to your heart's content!
Using list comprehension you could do:
clothes=[line.strip() for line in open('clothes.txt').readlines()]
clothingDict = {}
for line in clothes:
    arr = line.split(",")
    clothingDict[arr[0]] = [arr[i] for i in range(1,len(arr))]
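Another option, sketched here under the assumption that the file is plain comma-separated text, is the standard csv module; skipinitialspace=True drops the space that follows each comma. The sample text below stands in for clothes.txt (with the elided brand names left out), and io.StringIO is used only so the sketch runs without a file.

```python
import csv
import io

# Stand-in for the contents of clothes.txt.
sample = (
    "Shoes, Nike, Addias, Puma\n"
    "Pants, Dockers, Levis\n"
    "Watches, Timex, Tiesto\n"
)

clothing = {}
# With a real file: open('clothes.txt', newline='') instead of StringIO.
for row in csv.reader(io.StringIO(sample), skipinitialspace=True):
    if not row:
        continue  # skip empty lines
    # First field is the category, the rest are the brand names.
    clothing[row[0]] = row[1:]

print(clothing)
```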
