How to select number columns in pandas dataframe - python

How can I select only the columns whose names are numbers from the column names below?
output_df.columns = Index(['EVENT_ID', 'Date', 'Time', 'Track', '#', 'Distance', 'Betfair Grade','Runners', 'Win Trap', 'Win BSP', '1', '2', '3', '4', '5', '6', '7',
'8', '9', '10', 'Trap1 Odds Band', 'Trap2 Odds Band', 'Trap3 Odds Band'],
dtype='object')
I tried this function and I got the below output.
output_df.filter(regex="\d+", axis=1).columns
Index(['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'Trap1 Odds Band',
'Trap2 Odds Band', 'Trap3 Odds Band'],dtype='object')
I just want the number columns:
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

new_df = df.loc[:, df.columns.str.isnumeric()]
This should work, as long as the column names are strings.

Try filtering on a full match:
output_df.filter(regex=r"^\d+$", axis=1).columns
Or, better, without filter:
df.columns[df.columns.str.isdigit()]
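The two approaches above can be compared side by side. This is a minimal sketch on a made-up one-row frame that reuses a few of the column names from the question; the data values and the variable names `mask`, `digit_names`, `digit_frame`, and `filtered_names` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical frame that reuses a few of the column names from the question.
columns = ['EVENT_ID', 'Date', '#', 'Win BSP', '1', '2', '10', 'Trap1 Odds Band']
output_df = pd.DataFrame([list(range(len(columns)))], columns=columns)

# Boolean mask: True where the whole column name consists of digits.
mask = output_df.columns.str.isdigit()

digit_names = list(output_df.columns[mask])   # just the names
digit_frame = output_df.loc[:, mask]          # the matching columns as a DataFrame

# filter() applies re.search, so the ^ and $ anchors are what exclude 'Trap1 Odds Band'.
filtered_names = list(output_df.filter(regex=r"^\d+$", axis=1).columns)

print(digit_names)
print(filtered_names)
```

Indexing with `.loc[:, mask]` selects columns; passing the same mask as `df[mask]` would be interpreted as a row selection instead.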

Related

Insertion Sort not working for single digit entries if #entries> 9

I am trying to implement insertion sort using python. Below is my code:
import numpy as np
N=int(input())
A=input().split()
#A=[int(x) for x in input().split()]
B_sorted=A.copy()
C= A.copy()
def inssort(arr):
    for i in range(1, len(arr)):
        key = arr[i]
        j = i-1
        while (j >=0 and key < arr[j]):
            arr[j+1] = arr[j]
            j -= 1
        arr[j+1] = key
inssort(B_sorted)
for i in range(0,len(A)):
    for j in range(0,len(A)):
        if A[i]==B_sorted[j]:
            C[i]=j+1
            break
        j=j+1
    i=i+1
for x in B_sorted:
    print (x, "",end="")
My code works for the following set of entries:
[9,8,7,...,2,1]
[20,19,...,12,11]
[20,19,...,11,10]
but it doesn't work for the following:
[10,9,8,...,2,1]
I don't understand what exactly is the issue here.
I tried to go to the detail by printing the output at every loop:
10 9 8 7 6 5 4 3 2 1
['10', '9', '8', '7', '6', '5', '4', '3', '2', '1']
['10', '8', '9', '7', '6', '5', '4', '3', '2', '1']
['10', '7', '8', '9', '6', '5', '4', '3', '2', '1']
['10', '6', '7', '8', '9', '5', '4', '3', '2', '1']
['10', '5', '6', '7', '8', '9', '4', '3', '2', '1']
['10', '4', '5', '6', '7', '8', '9', '3', '2', '1']
['10', '3', '4', '5', '6', '7', '8', '9', '2', '1']
['10', '2', '3', '4', '5', '6', '7', '8', '9', '1']
['1', '10', '2', '3', '4', '5', '6', '7', '8', '9']
Looks like for some reason at the first iteration j is not taking the value 0 for this particular sequence. As I mentioned before, it works perfectly for a lot of other sequences.
Kindly help.
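One likely explanation, given that input().split() returns strings: the sort compares text, not numbers, so '10' sorts before '2' because the character '1' comes before '2'. Lists where every entry has the same number of digits happen to come out in numeric order, which would explain why the other test cases pass. The sketch below assumes numeric order is the goal; `insertion_sort` is the same algorithm as `inssort` above under a different name, and the hard-coded `raw` list stands in for the user input.

```python
def insertion_sort(arr):
    """Sort arr in place using insertion sort (same algorithm as inssort above)."""
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        # Shift every element larger than key one slot to the right.
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key


raw = "10 9 8 7 6 5 4 3 2 1".split()   # stands in for input().split()

as_strings = raw.copy()
insertion_sort(as_strings)             # compares text, character by character

as_numbers = [int(x) for x in raw]     # convert to int first, then sort
insertion_sort(as_numbers)

print(as_strings)
print(as_numbers)
```

The commented-out line in the question, A=[int(x) for x in input().split()], performs the same conversion.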

how can i split string with no blank?

input = '1+2++3+++4++5+6+7++8+9++10'
string = input.split('+')
print(string)
When we run this code, the output is ['1', '2', '', '3', '', '', '4', '', '5', '6', '7', '', '8', '9', '', '10']
But I want to split the string with no blanks, like ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
Is there any function or method to remove the blanks without using a for loop like this?
for i in string:
    if i == '':
        string.remove(i)
Generate a list based on the output of split, and only include the elements which are not empty.
You can achieve this in multiple ways. The cleanest way here would be to use regex.
Regex:
import re
inp = '1+2++3+++4++5+6+7++8+9++10'
re.split(r'\++', inp)
#['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
List Comprehension:
inp = '1+2++3+++4++5+6+7++8+9++10'
[s for s in inp.split('+') if s]
#['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
Loop & Append:
result = []
for s in inp.split('+'):
    if s:
        result.append(s)
result
#['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
Simplest way:
customStr = "1+2++3+++4++5+6+7++8+9++10"
list(filter(lambda x: x != "", customStr.split("+")))
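One difference between these approaches shows up only when the string starts or ends with the separator: re.split still returns an empty string at each end, while the filtering approaches do not. A small sketch, using a hypothetical input that is not from the question:

```python
import re

inp = '+1+2++3+'   # hypothetical input with '+' at both ends

# Splitting on runs of '+' leaves an empty string before the first
# and after the last separator.
with_split = re.split(r'\++', inp)

# Matching runs of non-'+' characters never produces empty strings.
with_findall = re.findall(r'[^+]+', inp)

# filter(None, ...) drops every falsy item, which includes ''.
with_filter = list(filter(None, inp.split('+')))

print(with_split)
print(with_findall)
print(with_filter)
```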

How to sort a Python list so that 1 comes before 10 and 2 before 20

['2', '8', '2', '3', '6', '4', '1', '1', '10', '6', '3', '3', '6', '1', '3', '8', '4', '6', '1', '10', '8', '4', '10', '4', '1', '3', '2', '3', '2', '6', '1', '5', '2', '9', '8', '5', '10', '8', '7', '9', '6', '4', '2', '6', '3', '8', '8', '9', '8', '2', '9', '10', '3', '10', '7', '5', '7', '1', '7', '5', '1', '4', '7', '6', '1', '10', '5', '4', '8', '4', '2', '7', '8', '1', '1', '7', '4', '1', '1', '9', '8', '6', '5', '9', '9', '3', '7', '6', '3', '10', '8', '10', '7', '2', '5', '1', '1', '9', '9', '5']
After using a lambda function in the following way:
a.sort(key=lambda a: int(a.split()[0]))
a = a[::-1]
I got
['10', '10', '10', '10', '10', '10', '10', '10', '10', '9', '9', '9', '9', '9', '9', '9', '9', '9', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8', '7', '7', '7', '7', '7', '7', '7', '7', '7', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '5', '5', '5', '5', '5', '5', '5', '5', '4', '4', '4', '4', '4', '4', '4', '4', '4', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '2', '2', '2', '2', '2', '2', '2', '2', '2', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1']
But I want 10 at the end, after 1. Likewise, if I put 20 and 2 in the list, then 2 should come before 20, and 20 before 10, etc.
The operation:
a.sort()
with no other options sets a to:
['1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '10', '10', '10', '10', '10', '10', '10', '10', '10', '2', '2', '2', '2', '2', '2', '2', '2', '2', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '4', '4', '4', '4', '4', '4', '4', '4', '4', '5', '5', '5', '5', '5', '5', '5', '5', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '7', '7', '7', '7', '7', '7', '7', '7', '7', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8', '9', '9', '9', '9', '9', '9', '9', '9', '9']
You want an ordered list of strings. Those strings are representations of numbers. Now you want a grouped sorting. First everything that starts with a '9', then '8' and down to '1'. In each of those groups the values should be sorted in numeric order.
An example list:
a = ['11', '105', '2', '8', '2', '3', '6', '4', '1', '1', '10', '81', '3', '3', '5', '10', '8', '7', '9', '6', '4', '2']
Now let's do a grouped sorting with a.sort(key=lambda v: v[0]):
['11', '105', '1', '1', '10', '10', '2', '2', '2', '3', '3', '3', '4', '4', '5', '6', '6', '7', '8', '81', '8', '9']
We see that the values are grouped now, but we want the values starting with '9' first. We're going to fix this by reversing the result with a.sort(key=lambda v: v[0], reverse=True)
['9', '8', '81', '8', '7', '6', '6', '5', '4', '4', '3', '3', '3', '2', '2', '2', '11', '105', '1', '1', '10', '10']
The groups are correct, now we have to sort the values in the groups. So after the sorting according to the first character we have to sort the value by number. That's easy, we just have to create a tuple for the key: a.sort(key=lambda v: (v[0], int(v)), reverse=True)
['9', '81', '8', '8', '7', '6', '6', '5', '4', '4', '3', '3', '3', '2', '2', '2', '105', '11', '10', '10', '1', '1']
OK, the values are sorted now, but we have to reverse them within the groups. The easiest way to do that is to take the negative number: a.sort(key=lambda v: (v[0], -int(v)), reverse=True).
['9', '8', '8', '81', '7', '6', '6', '5', '4', '4', '3', '3', '3', '2', '2', '2', '1', '1', '10', '10', '11', '105']
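The final step can be wrapped in a small function. This is a sketch only: `grouped_sort` is a hypothetical name, and it assumes every item is a non-empty string of digits.

```python
def grouped_sort(values):
    """Group by first character (highest first), numeric order within each group."""
    # reverse=True flips both parts of the key, so the number is negated
    # to keep ascending numeric order inside each group.
    return sorted(values, key=lambda v: (v[0], -int(v)), reverse=True)


a = ['11', '105', '2', '8', '2', '3', '6', '4', '1', '1', '10',
     '81', '3', '3', '5', '10', '8', '7', '9', '6', '4', '2']

result = grouped_sort(a)
small = grouped_sort(['10', '1', '20', '2'])

print(result)
print(small)
```

Unlike a.sort(), sorted() returns a new list and leaves a unchanged.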
You can use the following sample, which sorts with key=str:
integers = ['2', '8', '2', '3', '6', '4', '1', '1', '10', '6', '3', '3', '6', '1', '3', '8', '4', '6', '1', '10', '8', '4', '10', '4', '1', '3', '2', '3', '2', '6', '1', '5', '2', '9', '8', '5', '10', '8', '7', '9', '6', '4', '2', '6', '3', '8', '8', '9', '8', '2', '9', '10', '3', '10', '7', '5', '7', '1', '7', '5', '1', '4', '7', '6', '1', '10', '5', '4', '8', '4', '2', '7', '8', '1', '1', '7', '4', '1', '1', '9', '8', '6', '5', '9', '9', '3', '7', '6', '3', '10', '8', '10', '7', '2', '5', '1', '1', '9', '9', '5']
print(sorted(integers, key=str))
You don't need a lambda here. You can call the list's .sort() method with no arguments to sort the items, like this:
li=['2', '8', '2', '3', '6', '4', '1', '1', '10', '6', '3', '3', '6', '1', '3', '8', '4', '6', '1', '10', '8', '4', '10', '4', '1', '3', '2', '3', '2', '6', '1', '5', '2', '9', '8', '5', '10', '8', '7', '9', '6', '4', '2', '6', '3', '8', '8', '9', '8', '2', '9', '10', '3', '10', '7', '5', '7', '1', '7', '5', '1', '4', '7', '6', '1', '10', '5', '4', '8', '4', '2', '7', '8', '1', '1', '7', '4', '1', '1', '9', '8', '6', '5', '9', '9', '3', '7', '6', '3', '10', '8', '10', '7', '2', '5', '1', '1', '9', '9', '5','20']
li.sort()
print(li)
output:
['1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '10', '10', '10', '10', '10', '10', '10', '10', '10', '2', '2', '2', '2', '2', '2', '2', '2', '2', '20', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '4', '4', '4', '4', '4', '4', '4', '4', '4', '5', '5', '5', '5', '5', '5', '5', '5', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '7', '7', '7', '7', '7', '7', '7', '7', '7', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8', '9', '9', '9', '9', '9', '9', '9', '9', '9']
I hope this answers your question.
This might help you:
a.sort(key=str)
or
new_list = sorted(a, key=str)  # if you don't want to change a
This will sort the list as you want, even if the items are integers.
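The effect of the key function is easiest to see on a short list. The sketch below uses made-up values to contrast key=str, which compares items as text, with key=int, which compares them as numbers.

```python
mixed = [2, 10, 1, 20]                 # integers, not strings

# key=str compares the text form, so 10 lands between 1 and 2.
as_text = sorted(mixed, key=str)

strings = ['2', '10', '1', '20']       # strings that represent numbers

# key=int compares the numeric value, so '10' comes after '2'.
as_number = sorted(strings, key=int)

print(as_text)
print(as_number)
```

In both cases the items keep their original type; the key only decides how they are compared.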

missing last bin in histogram plot from matplot python

I'm trying to draw a histogram based on my values:
import matplotlib.pyplot as plt
x = ['3', '1', '4', '1', '5', '9', '2', '6', '5', '3', '5',
'2', '3', '4', '5', '6', '4', '2', '0', '1', '9', '8',
'8', '8', '8', '8', '9', '3', '8', '0', '9', '5', '2',
'5', '7', '2', '0', '1', '0', '6', '5']
x_num = [int(i) for i in x]
key = '0123456789'
for i in key:
    print(i," count =>",x.count(i))
plt.hist(x_num, bins=[0,1,2,3,4,5,6,7,8,9])
The bins for the last two numbers, 8 and 9, should have counts of 6 and 4.
But the histogram combines 8 and 9 into one bin with a count of 10 instead of separating them. The total number of bins should be 10, but the graph only shows 9.
How can I separate 8 and 9 into their own bins?
The bins argument lists the bin edges, so ten bins need eleven edges. Add 10 as the final edge:
import matplotlib.pyplot as plt
x = ['3', '1', '4', '1', '5', '9', '2', '6', '5', '3', '5',
'2', '3', '4', '5', '6', '4', '2', '0', '1', '9', '8',
'8', '8', '8', '8', '9', '3', '8', '0', '9', '5', '2',
'5', '7', '2', '0', '1', '0', '6', '5']
x_num = [int(i) for i in x]
key = '0123456789'
for i in key:
    print(i, " count =>", x.count(i))
plt.hist(x_num, bins=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
plt.show()
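The counts can be checked without drawing anything, because plt.hist bins its data with numpy.histogram. In numpy.histogram every bin is half-open except the last one, which also includes its right edge; with edges 0 to 9 the last bin is therefore [8, 9] and collects both values. A minimal sketch using the data from the question (the names `counts_9` and `counts_10` are illustrative):

```python
import numpy as np

x = ['3', '1', '4', '1', '5', '9', '2', '6', '5', '3', '5',
     '2', '3', '4', '5', '6', '4', '2', '0', '1', '9', '8',
     '8', '8', '8', '8', '9', '3', '8', '0', '9', '5', '2',
     '5', '7', '2', '0', '1', '0', '6', '5']
x_num = [int(i) for i in x]

# Ten edges (0..9) define nine bins; the last bin [8, 9] holds both 8 and 9.
counts_9, edges_9 = np.histogram(x_num, bins=np.arange(10))

# Eleven edges (0..10) define ten bins; 8 and 9 each get their own bin.
counts_10, edges_10 = np.histogram(x_num, bins=np.arange(11))

print(counts_9.tolist())
print(counts_10.tolist())
```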

Swap indexes using slices?

I know that you can swap 2 single indexes in Python
r = ['1', '2', '3', '4', '5', '6', '7', '8']
r[2], r[4] = r[4], r[2]
output:
['1', '2', '5', '4', '3', '6', '7', '8']
But why can't you swap 2 slices of indexes in python?
r = ['1', '2', '3', '4', '5', '6', '7', '8']
I want to swap the numbers 3 + 4 with 5 + 6 + 7 in r:
r[2:4], r[4:7] = r[4:7], r[2:4]
output:
['1', '2', '5', '6', '3', '4', '7', '8']
expected output:
['1', '2', '5', '6', '7', '3', '4', '8']
What did I do wrong?
The slicing is working as it should. You are replacing slices of different lengths. r[2:4] is two items, and r[4:7] is three items.
>>> r = ['1', '2', '3', '4', '5', '6', '7', '8']
>>> r[2:4]
['3', '4']
>>> r[4:7]
['5', '6', '7']
So when ['3', '4'] is replaced by ['5', '6', '7'], the list grows by one element to ['1', '2', '5', '6', '7', '5', '6', '7', '8']. The second assignment then replaces r[4:7] of that longer list, which is now ['7', '5', '6'], with ['3', '4'], leaving ['1', '2', '5', '6', '3', '4', '7', '8'].
If you want to replace the slices, you have to start slices at the right places and allocate an appropriate size in the array for each slice:
>>> r = ['1', '2', '3', '4', '5', '6', '7', '8']
>>> r[2:5], r[5:7] = r[4:7], r[2:4]
>>> r
['1', '2', '5', '6', '7', '3', '4', '8']
old index: 4 5 6 2 3
new index: 2 3 4 5 6
Think of this:
r[2:4], r[4:7] = r[4:7], r[2:4]
as similar to this:
original_r = list(r)
r[2:4] = original_r[4:7]
r[4:7] = original_r[2:4]
So, by the time it gets to the third line of that, the 4th element isn't what you think it is anymore... You replaced '3', '4' with '5', '6', '7', and now the [4:7] slice starts with that '7'.
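The two assignments can be run one at a time to see the intermediate list. A short sketch; the names `first`, `second`, `after_first`, and `after_second` are only for illustration.

```python
r = ['1', '2', '3', '4', '5', '6', '7', '8']

# The right-hand side is evaluated first, from the original list.
first, second = r[4:7], r[2:4]

r[2:4] = first             # two items replaced by three: the list grows to 9 items
after_first = list(r)

r[4:7] = second            # this slice now covers '7', '5', '6'
after_second = list(r)

print(after_first)
print(after_second)
```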
>>> r = ['1', '2', '3', '4', '5', '6', '7', '8']
>>> r[2:5], r[5:7] = r[4:7], r[2:4]
>>> r
['1', '2', '5', '6', '7', '3', '4', '8']
In your code:
>>> r[2:4], r[4:7] = r[4:7], r[2:4]
You are assigning r[4:7], which has 3 elements, to r[2:4], which has only 2.
In the code I posted:
>>> r[2:5], r[5:7] = r[4:7], r[2:4]
r[4:7], which is ['5', '6', '7'], replaces
r[2:5], which is ['3', '4', '5'],
resulting in ['1', '2', '5', '6', '7', '6', '7', '8']
and then:
r[2:4] which was ['3', '4'], replaces
r[5:7] which is ['6', '7']
So final result being:
['1', '2', '5', '6', '7', '3', '4', '8']
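When the two slices sit next to each other, as they do here, another option is a single slice assignment over the combined range, which avoids working out the new boundaries by hand. This is a sketch under that assumption; `swap_adjacent_slices` is a hypothetical helper, not a built-in, and it expects start <= mid <= stop.

```python
def swap_adjacent_slices(seq, start, mid, stop):
    """Swap seq[start:mid] with seq[mid:stop] in place (the slices must be adjacent)."""
    # Both pieces are read before the assignment, so nothing shifts mid-swap.
    seq[start:stop] = seq[mid:stop] + seq[start:mid]


r = ['1', '2', '3', '4', '5', '6', '7', '8']
swap_adjacent_slices(r, 2, 4, 7)   # swap ['3', '4'] with ['5', '6', '7']

print(r)
```

The total length is unchanged because the replaced range and its replacement contain the same items in a different order.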
