how to print year-month in proper order - python

I need to output data to a file in the following format: year-month,val. It should be sorted by year-month,
for example:
2016-1,5
2016-7,1
2016-9,3
2016-11,4
2016-12,2
But, I am getting:
2016-1,5
2016-11,4
2016-12,2
2016-7,1
2016-9,3
the code is as follows:
for k,v in sorted(dictD.items()):
    drow = [k,v]
    writer.writerow(drow)
How to get the desired output?

Split the date at the hyphen and convert it to a tuple of numbers rather than strings.
for row in sorted(dictD.items(), key=lambda x: tuple(map(int, x[0].split('-')))):
    writer.writerow(row)
x is the (key, value) tuple returned by items(), so x[0] is the key, which is a date like '2016-1'. split turns this into the list ['2016', '1'], and tuple(map(int, ...)) converts that to the tuple of integers (2016, 1). Using this as the sort key orders the rows numerically instead of lexicographically. (The tuple() call is needed in Python 3, where map returns an iterator that cannot be compared; the parenthesized lambda(x) form only works in Python 2.)
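A minimal, self-contained sketch of the same idea for Python 3, using a named key function instead of a lambda. The name year_month_key and the in-memory buffer standing in for the output file are assumptions made for illustration; they are not part of the original code.

```python
import csv
import io


def year_month_key(item):
    """Turn a ('2016-11', value) pair into the sortable tuple (2016, 11)."""
    year, month = item[0].split('-')
    return int(year), int(month)


dictD = {'2016-1': 5, '2016-11': 4, '2016-12': 2, '2016-7': 1, '2016-9': 3}

buffer = io.StringIO()  # hypothetical stand-in for the real output file
writer = csv.writer(buffer, lineterminator='\n')
for row in sorted(dictD.items(), key=year_month_key):
    writer.writerow(row)

output = buffer.getvalue()
print(output)
```

Because the key is a tuple of integers, the comparison is by year first and then by month, so the same function should also handle data that spans several years.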

This isn't a direct fix, but one simple option is to zero-pad the month so that the keys sort correctly as plain strings. Note that the output keys then read '2016-01' rather than '2016-1':
dictD = {'2016-1':5, '2017-7':1,'2016-9':3, '2016-11':4, '2016-12':2}
# Rebuild each key with a two-digit month, e.g. '2016-1' -> '2016-01'
dictD2 = {}
for key, value in dictD.items():
    year, month = key.split('-')
    dictD2[year + '-' + '{:02d}'.format(int(month))] = value
for k,v in sorted(dictD2.items()):
    drow = [k,v]
    print(drow)
I didn't use the writer, but I hope this helps.

Assuming your dictionary is keyed by the YYYY-MM string and the value is the number after the comma, you can add a key argument to your sorted() call.
The key func could be:
lambda item: item[0][:5] + ('0' if len(item[0]) < 7 else '') + item[0][5:]
So your sorted call goes from:
sorted(dictD.items())
to:
sorted(dictD.items(), key=lambda item: <the rest from above>)
This leaves sorting by strings, but by adding the leading zero to the one-digit month, things come out as you want.
As a side note, you can pass a named function in as the key. You're not limited to using a lambda.
When you call sorted() without a key function, the elements are compared using their default ordering. Tuples are compared element by element, starting with the first. Your .items() call produces (key, value) tuples, with the dict key as the first element, so the tuples get sorted by the dict keys. Those keys are strings, so they are compared character by character, ignoring any numeric value. By padding the one-digit months with a leading zero, the dates can be properly sorted as strings. The lambda does just that: it adds the extra '0' when necessary so that the sort produces the desired order.
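As a sketch of the named-function variant mentioned above, the padding can be written with str.zfill. The function name padded_year_month is made up for this example, and the keys are assumed to always have the form 'YYYY-M' or 'YYYY-MM'.

```python
def padded_year_month(item):
    """Build a string sort key such as '2016-07' from a ('2016-7', value) pair."""
    year, month = item[0].split('-')
    return '{}-{}'.format(year, month.zfill(2))  # zfill pads with leading zeros


dictD = {'2016-1': 5, '2016-11': 4, '2016-12': 2, '2016-7': 1, '2016-9': 3}

ordered_keys = [k for k, _ in sorted(dictD.items(), key=padded_year_month)]
print(ordered_keys)
```

The key is only used for comparison, so the original unpadded keys are what end up in the output.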

Related

Ordering links in python

I have a list containing some links: ["http://link1.rar", "http://link1.rev","http://link2.rar","http://link2.rev"]
Is there a way to sort them, in order to look like:
["http://link1.rar", "http://link2.rar", "http://link1.rev", "http://link2.rev"]
I've tried with this:
def order(x):
    if "rar" not in x:
        return x
    else:
        return ""
new_links = sorted(links, key=order)
But in this way, rev links are sorted from the highest.
You want to sort according to multiple criteria: first, the file extension; then, the whole string.
The usual trick to sort according to multiple criteria is to use a tuple as the key. Tuples are sorted in lexicographical order, which means the first criterion is compared first; and in case of a tie, the second criterion is compared.
For instance, the following key returns a tuple such as (False, 'http://link1.rar') or (True, 'http://link1.rev'):
new_links = sorted(links, key=lambda x: ('rar' not in x, x))
Alternatively, you could use str.rsplit to split the string on the last '.' and get a tuple such as ('http://link1', 'rev'). Since you want the extension to be the first criterion, use slicing [::-1] to reverse the order of the tuple:
new_links = sorted(links, key=lambda x: x.rsplit('.', maxsplit=1)[::-1])
Note that using rsplit('.', maxsplit=1) to split on the last '.' is a bit of a hack. If the filename contains a double extension, such as '.tar.gz', only the last extension will be isolated.
One last point to consider: numbers in strings. As long as your strings contain only single digits, such as '1' or '2', sorting the strings will work as you expect. However, if you have a 'link10' somewhere in there, the order might not be the one you expect: lexicographically, 'link10' comes before 'link2', because the first character of '10' is '1'. In that case, I refer you to this related question: Is there a built in function for string natural sort?
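For the 'link10' case, one possible approach is to combine the extension-first tuple key with a simple natural-sort key. This is only a sketch: natural_key and extension_then_natural are hypothetical helper names, and the regular expression assumes that plain runs of ASCII digits are the only numbers that matter.

```python
import re


def natural_key(text):
    """Split text into digit and non-digit runs, turning digit runs into ints."""
    # re.split with a capturing group keeps the digit runs in the result,
    # so the list always alternates: str, int, str, int, ...
    return [int(part) if part.isdigit() else part
            for part in re.split(r'(\d+)', text)]


def extension_then_natural(link):
    """Sort by extension first, then by the rest of the link in natural order."""
    stem, _, ext = link.rpartition('.')
    return (ext, natural_key(stem))


links = ["http://link10.rar", "http://link2.rev", "http://link2.rar",
         "http://link10.rev", "http://link1.rar"]
new_links = sorted(links, key=extension_then_natural)
print(new_links)
```

With this key, 'link2' sorts before 'link10' within each extension, because the digit runs are compared as integers rather than as characters.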

how to sort the elements in a list in numerical order and not alphabetical

I have a list containing sentences with numbers in them. I want to sort them numerically and not alphabetically, but when I use the sort function, it sorts the elements alphabetically. How do I fix this?
num = ['ryan has 8 apples','charles has 16 bananas','mylah has 3 watermelons']
num.sort()
print(num)
output:
['charles.....','mylah.....','ryan......']
as you can see, it is sorted alphabetically but that is not my expected output
the dots represent the rest of the sentence
expected result:
['mylah has 3 watermelons','ryan has 8 apples','charles has 16 bananas']
here's the expected output where the elements are sorted numerically and not alphabetically
You need to pass in a key function that splits the string and parses the number as an integer to sort on. By making the key a tuple with the string itself as the second element, entries with the same number are then sorted alphabetically.
sorted(num, key=lambda x: (int(x.split()[2]), x))
or
num.sort(key=lambda x: (int(x.split()[2]), x))
You need to use key= to let sort know which value you want to assign each element for sorting.
If the strings are always that structured, you can use:
num = ['ryan has 8 apples','charles has 16 bananas','mylah has 3 watermelons']
num.sort(key=lambda x: int(x.split(' ')[2]))
print(num)
If they are more complex, take a look at regular expressions.
The solutions so far focus on a solution specific to the structure of your strings but would fail for slightly different string structures. If we extract only the digit parts of the string and then convert to an int, we'll get a value that we can pass to the sort function's key parameter:
num.sort(
    key=lambda s: int(''.join(c for c in s if c.isdigit()))
)
The key parameter lets you specify some alternative aspect of your datum to sort by, in this case, a value provided by the lambda function.
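The regular-expression route mentioned above could look like the following sketch. It takes the first run of digits in each sentence as the sort value. The helper name first_number and the choice to sort sentences without any number last are assumptions made for this example.

```python
import re


def first_number(sentence):
    """Return the first integer found in the sentence, or infinity if none."""
    match = re.search(r'\d+', sentence)
    # Sentences without a number get float('inf') so they sort to the end.
    return int(match.group()) if match else float('inf')


num = ['ryan has 8 apples', 'charles has 16 bananas', 'mylah has 3 watermelons']
num.sort(key=first_number)
print(num)
```

Unlike joining every digit in the string, this uses only the first number, so a sentence such as 'room 2 has 16 chairs' would be sorted by 2 rather than by 216.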

Splitting and sorting a list based on substring

I'm trying to take a list from an array, and split the strings to sort the list sequentially by the last series of 6 numbers (for instance '042126'). To do this I would split by '.', use the second to the last split of the string [-2], and then sort matchfiles[1] with this substring.
The files should end up sorted like:
erl1.041905, erl1.041907, erl2.041908, erl1.041909, erl2.041910, etc.
Two questions: how do I specify an unlimited number of splits per string (in case of longer names using additional '.')? I am using 4 splits, but this case may not hold. Else, how would I just split two times working backwards?
More importantly, I am returned an error: 'list' object is not callable. What am I doing wrong?
Thanks
matchfiles = [ [1723], ['blue.2017-09-05t15-15-07.erl1.041905.png',
'blue.2017-09-05t15-15-11.erl1.041907.png',
'blue.2017-09-05t15-15-14.erl1.041909.png',
'blue.2017-09-05t14-21-35.erl2.041908.png',
'blue.2017-09-05t14-21-38.erl2.041910.png',
'blue.2017-09-05t14-21-41.erl2.041912.png',
'blue.2017-09-05t14-21-45.erl2.041914.png'],
[09302] ]
matchtry = sorted(matchfiles[1], key = [i.split('.', 4)[-2] for i in
matchfiles[1]])
The key argument expects a function, but you give it a list, hence the error 'list' object is not callable.
You should use split('.')[-2] which always takes the second to last element.
matchfiles = [ [1723], ['blue.2017-09-05t15-15-07.erl1.041905.png',
'blue.2017-09-05t15-15-11.erl1.041907.png',
'blue.2017-09-05t15-15-14.erl1.041909.png',
'blue.2017-09-05t14-21-35.erl2.041908.png',
'blue.2017-09-05t14-21-38.erl2.041910.png',
'blue.2017-09-05t14-21-41.erl2.041912.png',
'blue.2017-09-05t14-21-45.erl2.041914.png'],
[9302] ]
matchtry = sorted(matchfiles[1], key=lambda x: x.rsplit('.')[-2])
print(matchtry)
# ['blue.2017-09-05t15-15-07.erl1.041905.png', 'blue.2017-09-05t15-15-11.erl1.041907.png',
#  'blue.2017-09-05t14-21-35.erl2.041908.png', 'blue.2017-09-05t15-15-14.erl1.041909.png',
#  'blue.2017-09-05t14-21-38.erl2.041910.png', 'blue.2017-09-05t14-21-41.erl2.041912.png',
#  'blue.2017-09-05t14-21-45.erl2.041914.png']
The key parameter to sorted requires a function. [i.split('.', 4)[-2] for i in matchfiles[1]] is a list, not a function. The expected function acts on a single element from the list, so you need a function that takes a string, splits it on the '.' character, and returns the second last column, possibly converted to an integer.
Also, Python does not allow integer literals to begin with a zero, so you must change that [09302] to [9302]. (Beginning with 0 signifies that the number will be non-decimal. In Python 2, 0427 would be 427 octal, but in Python 3, octal numbers must be preceded by 0o instead. 09302 is invalid in both versions, as an octal number cannot contain 9.)
matchfiles = [ [1723], ['blue.2017-09-05t15-15-07.erl1.041905.png',
'blue.2017-09-05t15-15-11.erl1.041907.png',
'blue.2017-09-05t15-15-14.erl1.041909.png',
'blue.2017-09-05t14-21-35.erl2.041908.png',
'blue.2017-09-05t14-21-38.erl2.041910.png',
'blue.2017-09-05t14-21-41.erl2.041912.png',
'blue.2017-09-05t14-21-45.erl2.041914.png'],
[9302] ]
matchtry = sorted(matchfiles[1], key=lambda s: int(s.split('.')[-2]))
Remember that the key argument to sorted takes each element of your iterable (a list in your case) and converts it to some value. The values of each element after being transformed by key determine the sort order. So a simple way to get this to work every time is to define a function that takes one element and converts it to something that's easy to sort:
import os
def fname_to_value(fname):
    name, ext = os.path.splitext(fname) # remove extension
    number = name.split('.')[-1] # Get the last set of stuff after the last '.'
    return number # no need to convert to int: string comparison works while the numbers are zero-padded to the same width
So now you have a function converting the filename to a sortable value. Simply supply this to sorted as the key argument and you're done.
matchtry = sorted(matchfiles[1], key = fname_to_value)
for match in matchtry:
    print(match)
result:
blue.2017-09-05t15-15-07.erl1.041905.png
blue.2017-09-05t15-15-11.erl1.041907.png
blue.2017-09-05t14-21-35.erl2.041908.png
blue.2017-09-05t15-15-14.erl1.041909.png
blue.2017-09-05t14-21-38.erl2.041910.png
blue.2017-09-05t14-21-41.erl2.041912.png
blue.2017-09-05t14-21-45.erl2.041914.png
You can then process the resulting list as needed.
Yes, the issue is your key. You can use a lambda expression: https://en.wikipedia.org/wiki/Anonymous_function#Python
Imagine this as a mathematical map. The key being used to sort needs a function, so you define a lambda like:
lambda curr: curr.split('.')[-2]
This gives each current object in the list the name "curr" and applies the expression following the :.
So in your case this should do the thing:
matchtry = sorted(matchfiles[1], key=lambda curr: curr.split('.')[-2])
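On the question of splitting only twice while working backwards: str.rsplit accepts a maximum number of splits, so a sketch along the following lines should work no matter how many extra '.' characters appear earlier in the name. The helper name sequence_number is hypothetical, and the file names are assumed to always end in '.<digits>.<extension>'.

```python
def sequence_number(fname):
    """Return the numeric field just before the extension as an int."""
    # rsplit('.', 2) splits at most twice, starting from the right:
    # 'blue.2017-09-05t15-15-07.erl1.041905.png'
    #   -> ['blue.2017-09-05t15-15-07.erl1', '041905', 'png']
    return int(fname.rsplit('.', 2)[-2])


files = ['blue.2017-09-05t15-15-07.erl1.041905.png',
         'blue.2017-09-05t15-15-11.erl1.041907.png',
         'blue.2017-09-05t15-15-14.erl1.041909.png',
         'blue.2017-09-05t14-21-35.erl2.041908.png',
         'blue.2017-09-05t14-21-38.erl2.041910.png',
         'blue.2017-09-05t14-21-41.erl2.041912.png',
         'blue.2017-09-05t14-21-45.erl2.041914.png']

matchtry = sorted(files, key=sequence_number)
for match in matchtry:
    print(match)
```

Converting to int is optional while all the numbers have the same width, but it keeps the ordering correct if the width ever varies.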

python dictionary sorted based on time

I have a dictionary such as below.
d = {
'0:0:7': '19734',
'0:0:0': '4278',
'0:0:21': '19959',
'0:0:14': '9445',
'0:0:28': '14205',
'0:0:35': '3254'
}
Now I want to sort it by keys with time priority.
Dictionaries are not sorted. If you want to print one out or iterate through it in sorted order, build a sorted list from it first, e.g.:
def parseTime(s):
    return tuple(int(x) for x in s.split(':'))
# items() yields (key, value) tuples, so apply parseTime to the key only
sorted_dict = sorted(d.items(), key=lambda kv: parseTime(kv[0]))
#or
for t in sorted(d, key=parseTime):
    pass
Note that sorted_dict is a list of (key, value) tuples, so you cannot use the d['0:0:7'] syntax on it.
Passing a key argument to sorted tells Python which value to compare for each item in your list; standard string comparison will not work to sort by time.
Dictionaries in Python are not kept in sorted key order (before Python 3.7 they made no ordering guarantees at all; since 3.7 they preserve insertion order). There is collections.OrderedDict, which retains insertion order, but if you want to work through the keys of a standard dictionary in order you can just do:
for k in sorted(d):
    print(k, d[k])
In your case, the problem is that your time strings won't sort correctly. You need to include the additional zeroes needed to make them do so, e.g. "00:00:07", or interpret them as actual time objects, which will sort correctly. This function may be useful:
def padded(s, c=":"):
    return c.join("{0:02d}".format(int(i)) for i in s.split(c))
You can use this as a key for sorted if you really want to retain the current format in your output:
for k in sorted(d, key=padded):
    print(k, d[k])
Have a look at the collections.OrderedDict class.
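To keep dictionary-style lookups and still iterate in time order, one option is to load the sorted items into an OrderedDict. The sketch below also shows the "interpret them as actual time objects" idea by using datetime.timedelta as the sort key. The helper name to_timedelta is an assumption, and the keys are assumed to always be 'hours:minutes:seconds'.

```python
from collections import OrderedDict
from datetime import timedelta


def to_timedelta(key):
    """Convert a key such as '0:0:21' into a timedelta that compares correctly."""
    hours, minutes, seconds = (int(part) for part in key.split(':'))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


d = {
    '0:0:7': '19734',
    '0:0:0': '4278',
    '0:0:21': '19959',
    '0:0:14': '9445',
    '0:0:28': '14205',
    '0:0:35': '3254',
}

# Sort the (key, value) pairs by the time encoded in the key, then keep that order.
sorted_d = OrderedDict(sorted(d.items(), key=lambda kv: to_timedelta(kv[0])))

for k, v in sorted_d.items():
    print(k, v)
```

Lookups such as sorted_d['0:0:7'] still work, and the original key strings are left unchanged.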

How to sort a list of lists (non integers)?

I have a list of lists that looks like this:
[['10.2100', '0.93956088E+01'],
['11.1100', '0.96414905E+01'],
['12.1100', '0.98638361E+01'],
['14.1100', '0.12764182E+02'],
['16.1100', '0.16235739E+02'],
['18.1100', '0.11399972E+02'],
['20.1100', '0.76444933E+01'],
['25.1100', '0.37823686E+01'],
['30.1100', '0.23552237E+01'],...]
(here it looks as if it is already ordered, but some of the rest of the elements not included here to avoid a huge list, are not in order)
and I want to sort it by the first element of each pair. I have seen several very similar questions, but in all of them the examples use integers. I don't know if that is why, when I use list.sort(key=lambda x: x[0]), sorted(), or the version with operator.itemgetter(0), I get the following:
[['10.2100', '0.93956088E+01'],
['100.1100', '0.33752517E+00'],
['11.1100', '0.96414905E+01'],
['110.1100', '0.25774972E+00'],
['12.1100', '0.98638361E+01'],
['14.1100', '0.12764182E+02'],
['14.6100', '0.14123326E+02'],
['15.1100', '0.15451733E+02'],
['16.1100', '0.16235739E+02'],
['16.6100', '0.15351242E+02'],
['17.1100', '0.14040859E+02'],
['18.1100', '0.11399972E+02'], ...]
Apparently what it is doing is sorting by the first character appearing in the first element of each pair.
Is there a way of using list.sort or sorted() for ordering these pairs with respect to the first element?
Don't use list as a variable name!
some_list.sort(key=lambda x: float(x[0]))
This converts the first element to a float and compares it numerically instead of alphabetically.
(Note that the conversion to float is only for comparing; the item is still a string in the list.)
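A short sketch contrasting the two orderings on a few of the rows from the question. The variable names rows, by_string and by_number are chosen for illustration only.

```python
rows = [['100.1100', '0.33752517E+00'],
        ['10.2100', '0.93956088E+01'],
        ['14.6100', '0.14123326E+02'],
        ['11.1100', '0.96414905E+01']]

# String comparison goes character by character, so '100.1100' lands before '11.1100'.
by_string = sorted(rows, key=lambda pair: pair[0])

# Converting to float in the key compares the actual numeric values.
by_number = sorted(rows, key=lambda pair: float(pair[0]))

print([pair[0] for pair in by_string])
print([pair[0] for pair in by_number])
```

float() also accepts the exponent notation used in the second column, so the same pattern with pair[1] could be used to sort by that column instead.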
