I have a file with its contents like this:
1 257.32943114
10 255.07893867
100 247.686049588
1000 248.560238357
101 250.673715233
102 250.150281581
103 247.076694596
104 257.491337952
105 250.804702983
106 252.043717069
107 253.786482488
108 255.588547067
109 251.253294801
...
What I want to do is create an array from this list with the numbers in the first column as index. For example, the 1st element of the array will be 257.32943114 which corresponds to 1 in the list, the 109th element of the array will be 251.253294801 which corresponds to number 109 in the list, and so on. How can I achieve this in Python?
If you insist on using a list, here is another, more Pythonic solution:
# Python 2 only: tuple-unpacking lambdas and xrange were removed in Python 3
with open('test.in', 'r') as f:
    r = []
    map(lambda (a,b): [0, [r.append(0) for i in xrange(a - len(r))]] and r.append(b), sorted([(int(l.split(' ')[0]), float(l.split(' ')[-1])) for l in f], key=lambda (a,b): a))
And r is what you are looking for. Note that r[0] and any index missing from the file are filled with 0.
Separator: you can split each line on tabs or spaces.
file = open(location, 'r')
dictionary = {}
for line in file.readlines():
    aux = line.split(' ') #separator
    dictionary[aux[0]] = aux[1]
print dictionary
If your values come out like '257.32943114\n', you can use dictionary[aux[0]] = aux[1][:-1] instead to drop the trailing newline character.
Likely you want a dictionary, not a list, but if you do want a list:
def insert_and_extend(lst, location, value):
    if len(lst) <= location:
        lst.extend([None] * (location - len(lst) + 1))
    lst[location] = value
mylist = []
insert_and_extend(mylist, 4, 'a')
insert_and_extend(mylist, 1, 'b')
insert_and_extend(mylist, 5, 'c')
print mylist
To do it as a dictionary:
d = {}  # named d so the built-in dict is not shadowed
d[4] = 'a'
d[1] = 'b'
d[5] = 'c'
print d
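For the file in the question, the two ideas above can be combined. The following is only a minimal Python 3 sketch: the function name load_indexed_values and the in-memory sample lines are assumptions made for illustration, and it leaves None at every position that has no entry in the file (including position 0, since the file starts at 1).

```python
def load_indexed_values(lines):
    """Build a list where position i holds the value listed for index i.

    `lines` is any iterable of "index value" strings, such as an open file.
    Positions with no entry are left as None.
    """
    by_index = {}
    for line in lines:
        parts = line.split()  # splits on spaces or tabs, drops the newline
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        by_index[int(parts[0])] = float(parts[1])
    if not by_index:
        return []
    values = [None] * (max(by_index) + 1)
    for index, value in by_index.items():
        values[index] = value
    return values


# Sample lines standing in for the real file (hypothetical subset of the data).
sample = [
    "1 257.32943114\n",
    "10 255.07893867\n",
    "100 247.686049588\n",
    "109 251.253294801\n",
]
values = load_indexed_values(sample)
print(values[1], values[109])
```

With a real file you would pass the open file object instead of the sample list. If the indices are sparse, keeping the intermediate dict is probably the better choice.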
I would like to know how I could create a dictionary from the three lists: coun_keys should supply the keys, and months_values and cases_values should supply the values.
I only found sources where I could use the zip() function to get key: value, but how can I get key: value1, value2?
def main(csvfile,country ,type ):
    with open(csvfile,"r") as file:
        if type.lower() == "statistics ":
            coun_keys = []
            months_values = []
            cases_values = []
            listname =[]
            coun_month={}
            for line in file:
                columns = (line.strip().split(","))
                listname.append(columns)
            listname.pop(0)
            for line in listname:
                date1 = line[3].split("/")
                coun_keys.append(str(line[2]))
                months_values.append(int(date1[1]))
                cases_values.append(int(line[4]))
Do you mean, like:
list1 = [1, 2, 3]
list2 = 'abc'
list3 = [5, 6, 7]
print(dict(zip(list1,zip(list2, list3))))
#############################
For your code specifically, I would break up what you want to do into pieces. First define what you want to do with each line of your file:
def process_line(line):
    line = line.strip().split(',')
    date1 = line[3].split("/")
    key = str(line[2])
    month = int(date1[1])
    case = int(line[4])
    return key,(month,case)
Notice I group the values I want in a tuple, in particular, I want the process_line function to return my "key" and my "value" (a pair). Now open your file and process the lines:
f = open(csvfile)
next(f) #Skip the first line
result = dict(process_line(line) for line in f)
f.close()
This might help...
# zip the two value lists together first, so each key gets a (month, cases) pair
newDict = dict(zip(coun_keys, zip(months_values, cases_values)))
Assuming they're the same length, you can also do:
your_dict = {coun_keys[i] : (months_values[i], cases_values[i]) for i in range(len(coun_keys))}
For example,
lst1 = ['a','b','c','d']
lst2 = [1,2,3,4]
lst3 = [5,6,7,8]
dict1 = dict(zip(lst1,zip(lst2,lst3)))
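One caveat with all of the dict(zip(...)) approaches: a dictionary keeps only the last value stored under a key, so if the same country appears on several rows, earlier (month, cases) pairs are overwritten. If that matters, a sketch along these lines collects every pair per key instead. The sample values are made up for illustration and do not come from the original CSV.

```python
from collections import defaultdict

# Hypothetical sample data shaped like the three lists in the question.
coun_keys = ["Italy", "Italy", "Spain"]
months_values = [3, 4, 3]
cases_values = [100, 250, 80]

grouped = defaultdict(list)
for country, month, cases in zip(coun_keys, months_values, cases_values):
    # each key maps to a list of (month, cases) tuples, one per input row
    grouped[country].append((month, cases))

result = dict(grouped)
print(result)
```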
I have a text file with tuples in it that I would like to convert to a list with indices as follows:
2, 60;
3, 67;
4, 67;
5, 60;
6, 60;
7, 67;
8, 67;
Needs to become:
60, 2 5 6
67, 3 4 7 8
And so on with many numbers...
I've made it as far as reading in the file and getting rid of the punctuation and casting it as ints, but I'm not quite sure how to iterate through and add multiple items at a given index of a list. Any help would be much appreciated!
Here is my code so far:
with open('cues.txt') as f:
    lines = f.readlines()
arr = []
for i in lines:
    i = i.replace(', ', ' ')
    i = i.replace(';', '')
    i = i.replace('\n', '')
arr.append(i)
array = []
for line in arr: # read rest of lines
    array.append([int(x) for x in line.split()])
arr = []
#make array of first values 40 to 80
for i in range(40, 81):
    arr.append(i)
print arr
for j in range(0, len(array)):
    for i in array:
        if (i[0] == arr[j]):
            arr[i[0]].extend(i[1])
Do you need it in a list? If not, you can simply collect them into a dict:
i = {}
with open('cues.txt') as f:
    # strip the newline first, otherwise the trailing ';' is not at the end of the string
    for (x, y) in (l.strip().rstrip(';').split(', ') for l in f):
        i.setdefault(y, []).append(x)
for k, v in i.iteritems():
    print "{0}, {1}".format(k, " ".join(v))
You could use the defaultdict class from the collections module.
from collections import defaultdict
with open('file') as f:
    l = []
    for line in f:
        l.append(tuple(line.replace(';','').strip().split(', ')))
    m = defaultdict(list)
    for i in l:
        m[i[1]].append(i[0])
    for j in m:
        print j+", "+' '.join(m[j])
You can use a dict to store the index:
results = {}
with open("cues.txt") as f:
    for line in f:
        value, index = line.strip()[:-1].split(", ")
        if index not in results:
            results[index] = [value]
        else:
            results[index].append(value)
for index in results:
    print("{0}, {1}".format(index, " ".join(results[index])))
1) This code is wrong on many levels. See the inline comment:
arr = []
for i in lines:
    i = i.replace(', ', ' ')
    i = i.replace(';', '')
    i = i.replace('\n', '')
arr.append(i) # Wrong indentation. You will only get the last line in arr
You can simply do
arr = []
for i in lines:
    i = i.strip().replace(';', '').split(", ")
    arr.append(i)
This removes the newline character and the ;, and splits each line into a two-element list of [value, index].
2) This code can be simplified to one line
arr = [] # It should not be named `arr` because it overwrites the arr created in stage 1
for i in range(40, 81):
    arr.append(i)
print arr
becomes:
result = range(40, 81)
But it is not an ideal data structure for your problem. You should use a dictionary instead. In other words, you can drop this bit of code altogether.
3) Finally you are ready to iterate over arr and build the result
from collections import defaultdict
result = defaultdict(list)
for a in arr:
    result[a[1]].append(a[0])
You should use a dict to store the data, as in the following code:
d = {}
with open('cues.txt') as f:
    lines = f.readlines()
for line in lines:
    line = line.split(',')
    key = line[1].strip()[0:-1]
    if key in d:
        d[key].append(line[0])
    else:
        d[key] = [line[0]]
for key, value in d.iteritems():
    print "{0}, {1}".format(key, " ".join(value))
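Most of the answers above are written for Python 2 (print statements, iteritems, has_key). For reference, here is a minimal Python 3 sketch of the same grouping idea. The function name group_by_index and the in-memory sample lines are assumptions for illustration. It converts both numbers to int and sorts by index when printing, which the answers above do not do.

```python
from collections import defaultdict


def group_by_index(lines):
    """Group the first number of each "value, index;" line under its index."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip().rstrip(";")  # drop the newline, then the semicolon
        if not line:
            continue  # ignore blank lines
        value, index = (int(part) for part in line.split(","))
        groups[index].append(value)
    return groups


# Sample lines standing in for cues.txt.
sample = ["2, 60;\n", "3, 67;\n", "4, 67;\n", "5, 60;\n", "6, 60;\n", "7, 67;\n", "8, 67;\n"]
groups = group_by_index(sample)
output = [
    "{}, {}".format(index, " ".join(map(str, values)))
    for index, values in sorted(groups.items())
]
for row in output:
    print(row)
```

For the seven sample lines this prints "60, 2 5 6" and "67, 3 4 7 8", matching the layout asked for in the question.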
Hi, I'm a Python newbie in a spot of bother. I am taking a file from the internet, reading it and cleaning it up by splitting on the newline and then on the comma. The output is thousands of rows that look like this:
[59, 'Self-emp-inc', 'none', 'none', 10, 'Married-civ-spouse',
'Craft-repair', 'Husband', 'White', 'Male', 0, 0, 50, 'none', '>50K']
What I am trying to do then is loop through each line, count each attribute and, depending on whether the last element is >50K or <=50K, put it in either age_over_dict or age_under_dict. At the end I should have, for each attribute, something like age_over_dict = {59: 79, 'Self-emp-inc': 56}, meaning that the number of people who are 59 and earn >50K is 79, and so on. I can't seem to get this part working. Any help would be greatly appreciated, thanks in advance. This is the code I have at the moment:
def trainClassifier(f):
    age_over = {}
    age_under = {}
    count = 0
    count_over = 0
    count_under = 0
    for row in f:
        row = row.split(", ")
        count +=1
        if row[-1]in f == " >50K":
            if row[0] in f == age_over:
                age_over +=1
                count_over+=1
            else:
                age_over = age_over + 1
                count_over+=1
    print(age_over,count_over,count)
    return age_over
This should do what you are asking:
def trainClassifier(f):
    count = 0
    count_over = 0
    count_under = 0
    age_over = {}
    age_under = {}
    for line in f:
        row = [ x.strip() for x in line.split(",") ]
        print row
        count +=1
        # pick the target dict from the income label in the last column
        if row[-1] == '>50K':
            dest_dict = age_over
        else:
            dest_dict = age_under
        for attr in row:
            if attr not in dest_dict:
                dest_dict[attr] = 1
            else:
                dest_dict[attr] += 1
    return age_over,age_under
However, notice that duplicated attributes would be counted multiple times for each record in the csv. Not sure that is the behavior you want.
The Counter from collections creates histograms from iterables.
from collections import Counter
First you need to filter for the information you want to count on. In your case, age, the first element, and income, the last element. I grab these elements in the list comprehension, and pass the result to Counter for the counts of each age, income pair.
Counter([(i[0],i[-1]) for i in f])
Here's an example, a list of lists with [age, nonsense, income]:
>>> a
[[30, 1, '>50'], [59, 2, '<50'], [30, 3, '>50']]
The intermediate step of filtering on just the data we need:
>>> b = [(i[0], i[-1]) for i in a]
>>> b
[(30, '>50'), (59, '<50'), (30, '>50')]
and the result of building the counter:
>>> c = Counter(b)
>>> c
Counter({(30, '>50'): 2, (59, '<50'): 1})
From here, if you wanted to know the number of people who are 30 and make greater than 50k, you can use c like a dictionary:
>>> c[30,'>50']
2
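To get the two per-attribute tallies the question describes, the Counter idea can be extended as in the sketch below. This is an illustration under assumptions, not a drop-in answer: the function name count_attributes is made up, the rows are assumed to be already split and stripped, and the label is assumed to be exactly '>50K' for the over group. Keys are (column position, value) pairs, so the same value appearing in two different columns (for example 0 in two numeric columns) is not lumped together.

```python
from collections import Counter


def count_attributes(rows):
    """Return (over, under) Counters keyed by (column position, value)."""
    over, under = Counter(), Counter()
    for row in rows:
        *attributes, label = row  # the last element is the income label
        target = over if label == ">50K" else under
        # enumerate yields (position, value) pairs, which become the keys
        target.update(enumerate(attributes))
    return over, under


# Shortened, made-up rows in the same spirit as the data in the question.
rows = [
    [59, "Self-emp-inc", 0, 0, ">50K"],
    [59, "Private", 0, 0, "<=50K"],
    [59, "Self-emp-inc", 0, 0, ">50K"],
]
over, under = count_attributes(rows)
print(over[(0, 59)], over[(1, "Self-emp-inc")], under[(1, "Private")])
```

Here over[(0, 59)] is the number of people aged 59 who earn more than 50K, which is the kind of count the question asks for.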
This is my code. I have to add a whole list of things (here: http://pastebin.com/u5S0rF9D) into a list. How do I do that? This is my Excel file that I imported into Python: https://docs.google.com/spreadsheet/ccc?key=0Atza5UMAhSHRdHJMWGZqZlRrZWpySnU1SHhKOXFlN2c#gid=0
What do I "append" into the maleList?
import random
gender = raw_input("Enter your character gender (Male/Female): ")
start = raw_input("Enter Please enter starting letter of name(A to B): ")
import csv
readerFileHandle = open("Book1.csv", "rb")
malenames = csv.reader(readerFileHandle)
for row in malenames:
    y = []
    for x in row:
        if x[-1] == '\xa0':
            y.append(x[:-2])
        else:
            y.append(x)
    for z in y:
        print z
maleList = []
maleList.append()
print maleList
readerFileHandle.close()
If you want them all in one list like the paste bin, just append them directly
malenames = csv.reader(readerFileHandle)
maleList = []
for row in malenames:
    for x in row:
        maleList.append(x.rstrip('\xa0'))
You could instead write that as a list comprehension
malenames = csv.reader(readerFileHandle)
maleList = [x.rstrip('\xa0') for row in malenames for x in row]
Edit: looks like there is a combo '\xc2\xa0' attached to the end of some of the entries. So it should be x.rstrip('\xc2\xa0') to clean them properly
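The '\xc2\xa0' pair is the UTF-8 encoding of a non-breaking space, which is why it shows up as two bytes in Python 2. In Python 3, assuming the file is UTF-8 and is opened in text mode, the same character arrives as the single code point '\u00a0'. A minimal sketch of the flattening step under that assumption, with made-up names standing in for the real spreadsheet contents:

```python
import csv
import io

# Stand-in for open("Book1.csv", newline="", encoding="utf-8");
# the names below are invented sample data, not the real file.
sample = io.StringIO("Aaron\u00a0,Abel\nAdam,Alan\u00a0\n")

# Flatten every row into one list, trimming trailing non-breaking and plain spaces.
male_list = [cell.rstrip("\u00a0 ") for row in csv.reader(sample) for cell in row]
print(male_list)
```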
If you have a=[1,2] and b=[3,4], and want list b to be appended to list a, then
>>> a=[1,2]
>>> b=[3,4]
>>> a.extend(b)
>>> a
[1, 2, 3, 4]
I have a list of values in a for loop. e.g. myList = [1,5,7,3] which I am using to create a bar chart (using google charts)
I want to label each value with a letter of the alphabet (A-Z) e.g. A = 1, B = 5, C = 7, D = 3
What is the best way to do this while running through a for loop?
e.g.
for x in myList:
    x.label = LETTER OF THE ALPHABET
The list can be any length, so it won't always be just A to D.
EDIT
myList is a list of objects, not numbers as I put in the example above. Each object will have a title attached to it (could be numbers, text, letters etc.), but these titles are quite long, so they mess things up when displayed on Google Charts. Therefore I was going to label the chart with letters going A, B, C, ... up to the length of myList, and then have a key on the chart cross-referencing the letters with the actual titles. The length of myList is more than likely going to be less than 10, so there would be no worries about running out of letters.
Hope this clears things up a little
If you want to go on like ..., Y, Z, AA, AB ,... you can use itertools.product:
import string
import itertools
def product_gen(n):
    for r in itertools.count(1):
        for i in itertools.product(n, repeat=r):
            yield "".join(i)
mylist=list(range(35))
for value, label in zip(mylist, product_gen(string.ascii_uppercase)):
    print(value, label)
    # value.label = label
Part of output:
23 X
24 Y
25 Z
26 AA
27 AB
28 AC
29 AD
import string
for i, x in enumerate(myList):
    x.label = string.uppercase[i]
This will of course fail if len(myList) > 26
import string
myList = [1, 5, 7, 3]
labels = [string.uppercase[x+1] for x in myList]
# ['C', 'G', 'I', 'E']
for i in range(len(myList)):
    myList[i].label = chr(i+65)
chr(65) is 'A'. More on the chr function in the Python documentation.
charValue = 65
for x in myList:
    x.label = chr(charValue)
    charValue += 1
Be careful if your list is longer than 26 items.
First, if myList is a list of integers, then,
for x in myList:
    x.label = LETTER OF THE ALPHABET
won't work, since int has no attribute label. You could loop over myList and store the labels in a list (here: pairs):
import string
pairs = []
for i, x in enumerate(myList):
    label = string.letters[i] # will work for i < 52 !!
    pairs.append( (label, x) )
# pairs is now a list of (label, value) pairs
If you need more than 52 labels, you can use some random string generating function, like this one:
import random
def rstring(length=4):
    return ''.join([ random.choice(string.uppercase) for x in range(length) ])
Since I like list comprehensions, I'd do it like this:
[(i, chr(x+65)) for x, i in enumerate([1, 5, 7, 3])]
Which results in:
[(1, 'A'), (5, 'B'), (7, 'C'), (3, 'D')]
import string
for val in zip(myList, string.uppercase):
    val[0].label = val[1]
You can also use something like this:
from string import uppercase
res = ((x , uppercase[i%26]*(i//26+1)) for i,x in enumerate(inputList))
Or you can use something like this - note that this is just an idea how to deal with long lists not the solution:
from string import uppercase
res = ((x , uppercase[i%26] + uppercase[i/26]) for i,x in enumerate(inputList))
Are you looking for a dictionary, where each of your values are keyed to a letter of the alphabet? In that case, you can do:
from string import lowercase as letters
values = [1, 23, 3544, 23]
mydict = {}
for (let, val) in zip(letters, values):
    mydict[let] = val
# mydict == {'a': 1, 'b': 23, 'c': 3544, 'd': 23}
# mydict['a'] == 1
You'll have to add additional logic if you need to handle lists longer than the alphabet.
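Several answers above rely on string.uppercase, string.lowercase or string.letters, which exist only in Python 2. In Python 3 the equivalents are string.ascii_uppercase, string.ascii_lowercase and string.ascii_letters. The sketch below is one possible way to put the pieces together for the edited question: it labels objects A, B, ..., Z, AA, AB, ... and builds the key for the chart. The Item class and the titles are hypothetical stand-ins for the real objects in myList.

```python
import itertools
import string


class Item:
    """Hypothetical stand-in for the objects in myList."""

    def __init__(self, title):
        self.title = title
        self.label = None


def labels():
    """Yield A..Z, then AA, AB, ... like spreadsheet column names."""
    for size in itertools.count(1):
        for letters in itertools.product(string.ascii_uppercase, repeat=size):
            yield "".join(letters)


my_list = [Item("title %d" % n) for n in range(28)]
for item, label in zip(my_list, labels()):
    item.label = label  # zip stops when my_list runs out

# Key for the chart: short label -> full title.
key = {item.label: item.title for item in my_list}
print(my_list[0].label, my_list[25].label, my_list[27].label)
```

Because the generator never runs out, this works for any list length, although for fewer than 26 items a plain zip with string.ascii_uppercase is enough.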