I have a text file with tuples in it that I would like to convert to a list with indices as follows:
2, 60;
3, 67;
4, 67;
5, 60;
6, 60;
7, 67;
8, 67;
Needs to become:
60, 2 5 6
67, 3 4 7 8
And so on with many numbers...
I've made it as far as reading in the file and getting rid of the punctuation and casting it as ints, but I'm not quite sure how to iterate through and add multiple items at a given index of a list. Any help would be much appreciated!
Here is my code so far:
with open('cues.txt') as f:
    lines = f.readlines()
arr = []
for i in lines:
    i = i.replace(', ', ' ')
    i = i.replace(';', '')
    i = i.replace('\n', '')
arr.append(i)
array = []
for line in arr: # read rest of lines
    array.append([int(x) for x in line.split()])
arr = []
#make array of first values 40 to 80
for i in range(40, 81):
    arr.append(i)
print arr
for j in range(0, len(array)):
    for i in array:
        if (i[0] == arr[j]):
            arr[i[0]].extend(i[1])
Do you need it in a list? If not, you can simply collect the indices into a dict:
i = {}
with open('cues.txt') as f:
    # Strip the newline first, then the trailing ';', then split "index, value"
    for (x, y) in (l.strip().rstrip(';').split(', ') for l in f):
        i.setdefault(y, []).append(x)
for k, v in i.items():
    print("{0}, {1}".format(k, " ".join(v)))
You could use the defaultdict class from the collections module.
from collections import defaultdict
with open('file') as f:
    l = []
    for line in f:
        l.append(tuple(line.replace(';','').strip().split(', ')))
m = defaultdict(list)
for i in l:
    m[i[1]].append(i[0])
for j in m:
    print(j + ", " + ' '.join(m[j]))
You can use a dict to store the index:
results = {}
with open("cues.txt") as f:
    for line in f:
        value, index = line.strip()[:-1].split(", ")
        if index not in results:
            results[index] = [value]
        else:
            results[index].append(value)
for index in results:
    print("{0}, {1}".format(index, " ".join(results[index])))
1) This code is wrong on several levels. See the inline comment:
arr = []
for i in lines:
    i = i.replace(', ', ' ')
    i = i.replace(';', '')
    i = i.replace('\n', '')
arr.append(i) # Wrong indentation. You will only get the last line in arr
You can simply do
arr = []
for i in lines:
    i = i.strip().replace(';', '').split(", ")
    arr.append(i)
This removes the newline character and the ';', and splits each line into an [index, value] list.
2) This code can be simplified to one line:
arr = [] # It should not be named `arr` because it overwrites the arr created in stage 1
for i in range(40, 81):
    arr.append(i)
print arr
becomes:
result = list(range(40, 81))
But it is not an ideal data structure for your problem. You should use a dictionary instead. In other words, you can drop this bit of code altogether.
3) Finally, you are ready to iterate over arr and build the result:
from collections import defaultdict
result = defaultdict(list)
for a in arr:
    result[a[1]].append(a[0])
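Putting the three steps together, a minimal Python 3 sketch might look like the following. The helper name group_by_value and the inline sample text are assumptions made for illustration; with the real data you would pass the open file object for cues.txt instead.

```python
from collections import defaultdict

def group_by_value(lines):
    """Map each value to the list of indices that carry it.

    `lines` is any iterable of strings shaped like "2, 60;".
    (Hypothetical helper name, used only for this illustration.)
    """
    result = defaultdict(list)
    for line in lines:
        line = line.strip().rstrip(';')
        if not line:
            continue  # skip blank lines
        index, value = line.split(', ')
        result[int(value)].append(int(index))
    return result

# Sample data copied from the question; with a real file, use
# `with open('cues.txt') as f: grouped = group_by_value(f)`.
sample = "2, 60;\n3, 67;\n4, 67;\n5, 60;\n6, 60;\n7, 67;\n8, 67;\n"
grouped = group_by_value(sample.splitlines())

for value in sorted(grouped):
    print("{0}, {1}".format(value, " ".join(str(i) for i in grouped[value])))
```

Converting to int here is a choice, not a requirement: it makes sorted() order the values numerically rather than as strings.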
You should use a dict to store the data, as in the following code:
d = {}
with open('cues.txt') as f:
    lines = f.readlines()
for line in lines:
    line = line.split(',')
    key = line[1].strip()[0:-1]
    if key in d:
        d[key].append(line[0])
    else:
        d[key] = [line[0]]
for key, value in d.items():
    print("{0}, {1}".format(key, " ".join(value)))
I would like to know how I could create a dictionary from three lists, with coun_keys as the keys and months_values and cases_values as the values.
I only found sources that use the zip() function to get key: value, but how can I get key: value1, value2?
def main(csvfile,country ,type ):
    with open(csvfile,"r") as file:
        if type.lower() == "statistics ":
            coun_keys = []
            months_values = []
            cases_values = []
            listname =[]
            coun_month={}
            for line in file:
                columns = (line.strip().split(","))
                listname.append(columns)
            listname.pop(0)
            for line in listname:
                date1 = line[3].split("/")
                coun_keys.append(str(line[2]))
                months_values.append(int(date1[1]))
                cases_values.append(int(line[4]))
Do you mean, like:
list1 = [1, 2, 3]
list2 = 'abc'
list3 = [5, 6, 7]
print(dict(zip(list1,zip(list2, list3))))
#############################
For your code specifically, I would break up what you want to do into pieces. First define what you want to do with each line of your file:
def process_line(line):
    line = line.strip().split(',')
    date1 = line[3].split("/")
    key = str(line[2])
    month = int(date1[1])
    case = int(line[4])
    return key,(month,case)
Notice I group the values I want in a tuple, in particular, I want the process_line function to return my "key" and my "value" (a pair). Now open your file and process the lines:
f = open(csvfile)
next(f) #Skip the first line
result = dict(process_line(line) for line in f)
f.close()
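One caveat with dict(...) here: if a country appears on more than one row, later rows silently replace earlier ones. Below is a sketch that keeps every (month, cases) pair per country instead. It assumes the column layout implied by the question (country in column 2, a day/month/year date in column 3, cases in column 4); the sample rows and the name group_rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; only the column positions follow the question.
sample_csv = io.StringIO(
    "id,code,country,date,cases\n"
    "1,AF,Afghanistan,01/03/2020,5\n"
    "2,AF,Afghanistan,02/04/2020,7\n"
    "3,IT,Italy,15/03/2020,100\n"
)

def group_rows(file_obj):
    """Return {country: [(month, cases), ...]} for every data row."""
    reader = csv.reader(file_obj)
    next(reader)  # skip the header row
    grouped = defaultdict(list)
    for row in reader:
        month = int(row[3].split("/")[1])
        grouped[row[2]].append((month, int(row[4])))
    return dict(grouped)

result = group_rows(sample_csv)
print(result)
```

With a real file, the same function could be called as group_rows(open(csvfile, newline="")) inside a with block.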
This might help. Zip the two value lists together first, so that each key is paired with a (month, cases) tuple:
newDict = dict(zip(coun_keys, zip(months_values, cases_values)))
Assuming they're the same length, you can also do:
your_dict = {coun_keys[i] : (months_values[i], cases_values[i]) for i in range(len(coun_keys))}
For example,
lst1 = ['a','b','c','d']
lst2 = [1,2,3,4]
lst3 = [5,6,7,8]
dict1 = dict(zip(lst1,zip(lst2,lst3)))
I have this text file:
English
hello bye
italian
ciao hola
spanish
hola chao
I want to create a dictionary from each 2 consecutive lines:
{
'English': 'hello bye',
'italian': 'ciao hola',
'spanish': 'hola chao',
}
Here's my code:
d= {}
with open("test.txt", 'r') as f:
    l = f.readlines()
    for line in l:
        (key,val) = line.split()
        d[key]=val
I get the error:
too many values to unpack error
You can also use this approach:
d = {}
with open("test.txt", 'r') as f:
    l = f.readlines()
    i = 0
    while i < len(l):
        d[l[i].replace("\n","")] = l[i+1].replace("\n","")
        i += 2
In your original code, you read all the lines of the file in one go with f.readlines() and then split each line. The problem is that not every line splits into exactly two elements, so key, val = line.split() raises a ValueError, since you are trying to unpack a list of the wrong length into two names. For example, a, b = [2] results in this error:
In [66]: a,b = [2]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-66-f9f79b7d1d3c> in <module>
----> 1 a,b = [2]
ValueError: not enough values to unpack (expected 2, got 1)
To avoid it, we just iterate through the lines we read, and every even element is a key and every odd element is a value in the dictionary.
dct = {}
with open("file.txt", 'r') as f:
    l = f.readlines()
    idx = 0
    while idx < len(l):
        #Even element is key, Odd element is value
        key = l[idx].strip()
        value = l[idx+1].strip()
        dct[key] = value
        idx+=2
#{'English': 'hello bye', 'italian': 'ciao hola', 'spanish': 'hola chao'}
Or a more terse solution using a dict comprehension is
l = []
with open("file.txt", 'r') as f:
    l = f.readlines()
#This will be a list of tuples, with the first element of each tuple being the key
#and the second element being the value
#Keys are obtained by slicing all even indexes, and values by slicing all odd indexes
key_value_tups = zip(l[::2], l[1::2])
#[('English \n', 'hello bye \n'), ('italian \n', 'ciao hola\n'), ('spanish\n', 'hola chao\n')]
#Iterate through the tuples and create the dict via dict-comprehension
dct = {key.strip() : value.strip() for key, value in key_value_tups}
print(dct)
#{'English': 'hello bye', 'italian': 'ciao hola', 'spanish': 'hola chao'}
i = 0
d = {}
prev_key = None
for line in l:
    if i % 2 == 0:
        prev_key = line
    else:
        d[prev_key] = line
    i += 1
You can do it in a single line:
with open("test.txt", 'r') as f:
    lines = f.readlines()
dict( zip( lines[::2], lines[1::2] ) )
lines[::2] will give you all the elements of lines that have an even index
lines[1::2] will give you all the elements of lines that have an odd index
zip will create an iterator of (list1 elem, list2 elem) tuples from the two lists
dict will take each tuple (key, value) from the iterator as a dictionary item and create a dictionary
That one line is the equivalent of:
keys = []
values = []
for index, elem in enumerate(lines):
    if index % 2 == 0:
        keys += [elem]
    else:
        values += [elem]
d = {}
for key, val in zip(keys, values):
    d[key] = val
Use a dictionary comprehension with zip():
with open("test.txt", 'r') as f:
    l = f.readlines()
d = {x: y for x, y in zip(l[::2], l[1::2])}
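Note that the slicing versions keep the trailing newline on every key and value unless you strip it. As a sketch, the lines can be stripped first and then paired by zipping one iterator with itself. This is a different technique from the slicing shown above, and the in-memory sample text stands in for test.txt.

```python
import io

# Stand-in for open("test.txt"); contents copied from the question.
f = io.StringIO("English\nhello bye\nitalian\nciao hola\nspanish\nhola chao\n")

# Strip each line lazily, then let zip pull two items per step
# from the same iterator: (line 0, line 1), (line 2, line 3), ...
stripped = (line.strip() for line in f)
d = dict(zip(stripped, stripped))

print(d)
```

If the file has an odd number of lines, zip stops early and the final unpaired key is dropped without an error.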
I want to match a certain string in a CSV file and return the index of the column that contains it. For example:
import csv
data = ['a','b','c'],['d','e','f'],['h','i','j']
For example, if I'm looking for the word e, I want it to return [1], as it is in the second column.
The solution using a csv.reader object and the enumerate function (to get index/value pairs):
import csv
def get_column(file, word):
    with open(file) as csvfile:
        reader = csv.reader(csvfile)
        for row in reader:
            for k,v in enumerate(row):
                if v == word:
                    return k # immediate value return to avoid further loop iteration
search_word = 'e'
print(get_column("data/sample.csv", search_word)) # "data/sample.csv" is an exemplary file path
The output:
1
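The function above stops at the first match. If the word can occur more than once, a variant that yields every (row, column) position may be more useful. This is only a sketch: find_positions is a hypothetical name, and an in-memory io.StringIO takes the place of a real file.

```python
import csv
import io

def find_positions(file_obj, word):
    """Yield (row_index, column_index) for every cell equal to `word`."""
    for row_index, row in enumerate(csv.reader(file_obj)):
        for column_index, cell in enumerate(row):
            if cell == word:
                yield row_index, column_index

# In-memory stand-in for the CSV file, using the data from the question.
sample = io.StringIO("a,b,c\nd,e,f\nh,i,j\n")
positions = list(find_positions(sample, "e"))
print(positions)
```

Both indices are zero-based, so a match in the second row and second column is reported as (1, 1).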
I am not sure why you need csv in this example.
>>> data = ['a','b','c'],['d','e','f'],['h','i','j']
>>>
>>>
>>> string = 'e'
>>> for lst in data:
...     if string in lst:
...         print(lst.index(string))  # column index within the row
...
1
A variation of wolendranh's answer:
>>> data = ['a','b','c'],['d','e','f'],['h','i','j']
>>> word = 'e'
>>> for row in data:
...     try:
...         print(row.index(word))
...     except ValueError:
...         continue
Try the following:
>>> data_list = [['a','b','c'],['d','e','f'],['h','i','j']]
>>> col2_list = []
>>>
>>> for d in data_list:
...     col2=d[1]
...     col2_list.append(col2)
So in the end you get a list with all the values of column [1]:
col2_list = ["b","e","i"]
For example, I've got a file with multiple lines like
<<something>> 1, 5, 8
<<somethingelse>> hello
<<somethingelseelse>> 1,5,6
I need to create a dict with keys like
dict = { "something":[1,5,8], "somethingelse": "hello" ...}
I need to somehow read what is inside << >> and use it as a key, and I also need to check whether there are several elements or only one. If there is only one, I store it as a string. If there is more than one, I need to store them as a list of elements.
Any ideas how to do this?
Maybe regexes, but I'm not great with them.
I easily wrote a function that reads the file's lines, but I don't know how to separate those values:
f = open('something.txt', 'r')
lines = f.readlines()
f.close()
def finding_path():
    for line in lines:
        print line
finding_path()
f.close()
Any ideas? Thanks :)
Assuming that your keys will always be single words, you can play with split(char, maxSplits). Something like below
import sys
def finding_path(file_name):
    f = open(file_name, 'r')
    my_dict = {}
    for line in f:
        # split on first occurrence of space
        key_val_pair = line.split(' ', 1)
        # if we do have a key separated by a space
        if len(key_val_pair) > 1:
            key = key_val_pair[0]
            # proceed only if the key is enclosed within '<<' and '>>'
            if key.startswith('<<') and key.endswith('>>'):
                key = key[2:-2]
                # put more than one value in list, otherwise directly a string literal
                val = key_val_pair[1].split(',') if ',' in key_val_pair[1] else key_val_pair[1]
                my_dict[key] = val
    print(my_dict)
    f.close()
if __name__ == '__main__':
    finding_path(sys.argv[1])
Using a file like below
<<one>> 1, 5, 8
<<two>> hello
// this is a comment, it will be skipped
<<three>> 1,5,6
I get the output
{'three': ['1', '5', '6\n'], 'two': 'hello\n', 'one': ['1', ' 5', ' 8\n']}
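The output above still carries the trailing newlines and the leading spaces. A small variation that strips each piece, keeping the same split-on-first-space idea, could look like this sketch. parse_line is a hypothetical helper, and the values are deliberately left as strings.

```python
def parse_line(line):
    """Return (key, value) for a '<<key>> v1, v2' line, or None otherwise."""
    parts = line.strip().split(' ', 1)
    if len(parts) < 2:
        return None
    key, rest = parts
    if not (key.startswith('<<') and key.endswith('>>')):
        return None
    values = [item.strip() for item in rest.split(',')]
    # A single value stays a plain string; several become a list.
    return key[2:-2], values[0] if len(values) == 1 else values

# Same sample lines as the file shown above.
sample = [
    "<<one>> 1, 5, 8\n",
    "<<two>> hello\n",
    "// this is a comment, it will be skipped\n",
    "<<three>> 1,5,6\n",
]
parsed = dict(pair for pair in map(parse_line, sample) if pair is not None)
print(parsed)
```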
Please check the code below.
It uses a regex to get the key and the value.
If the value list has length 1, it is converted into a string.
import re
demo_dict = {}
with open("val.txt",'r') as f:
    for line in f:
        m= re.search(r"<<(.*?)>>(.*)",line)
        if m is not None:
            k = m.group(1)
            v = m.group(2).strip().split(',')
            if len(v) == 1:
                v = v[0]
            demo_dict[k]=v
print(demo_dict)
Output:
C:\Users\dinesh_pundkar\Desktop>python demo.Py
{'somethingelseelse': [' 1', '5', '6'], 'somethingelse': 'hello', 'something': [
' 1', ' 5', ' 8']}
My answer is similar to Dinesh's. I've added a function to convert the values in the list to numbers if possible, and some error handling so that if a line doesn't match, a useful warning is given.
import re
import warnings
regexp =re.compile(r'<<(\w+)>>\s+(.*)')
lines = ["<<something>> 1, 5, 8\n",
         "<<somethingelse>> hello\n",
         "<<somethingelseelse>> 1,5,6\n"]
#In real use, iterate over an open file object instead of the list
#lines = open('something.txt','r')
def get_value(obj):
    """Converts an object to a number if possible,
    or a string if not possible"""
    try:
        return int(obj)
    except ValueError:
        pass
    try:
        return float(obj)
    except ValueError:
        return str(obj)
dictionary = {}
for line in lines:
    line = line.strip()
    m = re.search(regexp, line)
    if m is None:
        warnings.warn("Match failed on \n {}".format(line))
        continue
    key = m.group(1)
    value = [get_value(x) for x in m.group(2).split(',')]
    if len(value) == 1:
        value = value[0]
    dictionary[key] = value
print(dictionary)
output
{'something': [1, 5, 8], 'somethingelse': 'hello', 'somethingelseelse': [1, 5, 6]}
I have a file with its contents like this:
1 257.32943114
10 255.07893867
100 247.686049588
1000 248.560238357
101 250.673715233
102 250.150281581
103 247.076694596
104 257.491337952
105 250.804702983
106 252.043717069
107 253.786482488
108 255.588547067
109 251.253294801
...
What I want to do is create an array from this list with the numbers in the first column as index. For example, the 1st element of the array will be 257.32943114 which corresponds to 1 in the list, the 109th element of the array will be 251.253294801 which corresponds to number 109 in the list, and so on. How can I achieve this in Python?
If you insist on using a list, here is another solution:
with open('test.in', 'r') as f:
    pairs = sorted((int(l.split()[0]), float(l.split()[-1])) for l in f)
# Pad with 0 so that every value sits at the index given in the first column
r = [0] * (pairs[-1][0] + 1) if pairs else []
for a, b in pairs:
    r[a] = b
And r is what you are looking for.
Separator: you can split each line on tabs or spaces.
file = open(location, 'r')
dictionary = {}
for line in file.readlines():
    aux = line.split(' ') #separator
    dictionary[aux[0]] = aux[1]
print(dictionary)
If your values end with a newline, like '257.32943114\n', you can use dictionary[aux[0]] = aux[1][:-1] instead to drop the newline character.
Likely you want a dictionary, not a list, but if you do want a list:
def insert_and_extend(lst, location, value):
    if len(lst) <= location:
        lst.extend([None] * (location - len(lst) + 1))
    lst[location] = value
mylist = []
insert_and_extend(mylist, 4, 'a')
insert_and_extend(mylist, 1, 'b')
insert_and_extend(mylist, 5, 'c')
print(mylist)
To do it as a dictionary (named d here so that it does not shadow the built-in dict):
d = {}
d[4] = 'a'
d[1] = 'b'
d[5] = 'c'
print(d)
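Applied to the file format from the question, a sketch could read the pairs into a dict first and then, only if a list is really needed, place each value at its index. The sample text is a shortened stand-in for the real file, and filling the unused slots with None is an assumption.

```python
import io

# Shortened stand-in for the real file; rows copied from the question.
f = io.StringIO("1 257.32943114\n10 255.07893867\n100 247.686049588\n")

values = {}
for line in f:
    parts = line.split()  # splits on any run of spaces or tabs
    if len(parts) != 2:
        continue  # skip blank or malformed lines
    values[int(parts[0])] = float(parts[1])

# Optional: turn the dict into a list indexed by the first column.
as_list = [None] * (max(values) + 1)
for index, value in values.items():
    as_list[index] = value

print(values[10], as_list[100])
```

The dict alone is usually enough, since values[109] reads the same as a list lookup and does not waste space on missing indices.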