I have a dataframe containing clinical readings of hospital patients. For example, a similar dataframe could look like this:
heartrate pid time
0 67 151 0.0
1 75 151 1.2
2 78 151 2.5
3 99 186 0.0
In reality there are many more columns, but I will just keep those 3 to make the example more concise.
I would like to "expand" the dataset. In short, I would like to be able to give an argument n_times_back and another argument interval.
For each iteration i, which corresponds to for i in range(n_times_back + 1), we do the following:

1. Create a new, unique pid [OLD ID | i]. (As long as the new pid is unique for each duplicated entry, the exact name isn't really important to me, so feel free to change this if it makes things easier.)
2. For every patient (pid), remove the rows whose time is greater than that patient's final time minus i * interval. For example, if i * interval = 2.0 and the times associated with one pid are [0, 0.5, 1.5, 2.8], the new times will be [0, 0.5], as final time - 2.0 = 0.8.
3. Iterate.
Since I realize that explaining this textually is a bit messy, here is an example.
With the dataset above, if we let n_times_back = 1 and interval=1 then we get
heartrate pid time
0 67 15100 0.0
1 75 15100 1.2
2 78 15100 2.5
3 67 15101 0.0
4 75 15101 1.2
5 99 18600 0.0
For n_times_back = 2, the result would be
heartrate pid time
0 67 15100 0.0
1 75 15100 1.2
2 78 15100 2.5
3 67 15101 0.0
4 75 15101 1.2
5 67 15102 0.0
6 99 18600 0.0
n_times_back = 3 and above would lead to the same result as n_times_back = 2, as no patient data goes below that point in time
I have written code for this.
def expand_df(df, n_times_back, interval):
    for curr_patient in df['pid'].unique():
        patient_data = df[df['pid'] == curr_patient]
        final_time = patient_data['time'].max()
        for i in range(n_times_back + 1):
            new_data = patient_data[patient_data['time'] <= final_time - i * interval]
            new_data['pid'] = patient_data['pid'].astype(str) + str(i).zfill(2)
            new_data['pid'] = new_data['pid'].astype(int)
            # check if there is any time left; if not, don't add a useless entry to the dataframe
            if new_data['time'].count() > 0:
                df = df.append(new_data)
        df = df[df['pid'] != curr_patient]  # remove original patient data, now duplicate
    df.reset_index(inplace=True, drop=True)
    return df
As far as functionality goes, this code works as intended. However, it is very slow. I am working with a dataframe of 30,000 patients and the code has been running for over 2 hours now.
Is there a way to use pandas operations to speed this up? I have looked around, but so far I haven't managed to reproduce this functionality with high-level pandas functions.
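One way to vectorize this (a sketch, not necessarily the only approach) is to broadcast each patient's final time with groupby().transform('max'), cross-join the frame against the shift indices 0..n_times_back, and filter once, instead of looping per patient. Note that merge(how='cross') requires pandas >= 1.2; the zero-padded pid suffix mirrors the original code, and the function name expand_df_fast is just an illustrative choice:

```python
import pandas as pd

def expand_df_fast(df, n_times_back, interval):
    # Final observation time per patient, broadcast to every row
    out = df.assign(final_time=df.groupby('pid')['time'].transform('max'))
    # Cross-join every row with every shift index 0..n_times_back (pandas >= 1.2)
    out = out.merge(pd.DataFrame({'i': range(n_times_back + 1)}), how='cross')
    # Keep only rows that fall inside each shifted window
    out = out[out['time'] <= out['final_time'] - out['i'] * interval].copy()
    # New pid: old pid followed by the zero-padded shift index, as in the question
    out['pid'] = (out['pid'].astype(str) + out['i'].astype(str).str.zfill(2)).astype(int)
    return out.drop(columns=['final_time', 'i']).reset_index(drop=True)

# The example dataframe from the question
df = pd.DataFrame({'heartrate': [67, 75, 78, 99],
                   'pid': [151, 151, 151, 186],
                   'time': [0.0, 1.2, 2.5, 0.0]})
res = expand_df_fast(df, 2, 1)
```

This produces the same 7 rows as the n_times_back = 2 example above (row order may differ, since windows are filtered in one pass rather than patient by patient).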
I ended up using a groupby function, breaking when no more times were available, and creating an "iteration" column that I merge with the "pid" column at the end.
def expand_df(group, n_times, interval):
    df = pd.DataFrame()
    final_time = group['time'].max()
    for i in range(n_times + 1):
        new_data = group[group['time'] <= final_time - i * interval]
        new_data['iteration'] = str(i).zfill(2)
        # check if there is any time left; if not, don't add a useless entry to the dataframe
        if new_data['time'].count() > 0:
            df = df.append(new_data)
        else:
            break
    return df

new_df = df.groupby('pid').apply(lambda x: expand_df(x, n_times_back, interval))
new_df = new_df.reset_index(drop=True)
new_df['pid'] = new_df['pid'].map(str) + new_df['iteration']
I am trying to filter the actions a user has done once the number of actions reaches a threshold.
Here is the data set: (Only Few records)
user_id,session_id,item_id,rating,length,time
123,36,28,3.5,6243.0,2015-03-07 22:44:40
123,36,29,2.5,4884.0,2015-03-07 22:44:14
123,36,30,3.5,6846.0,2015-03-07 22:44:28
123,36,54,6.5,10281.0,2015-03-07 22:43:56
123,36,61,3.5,7639.0,2015-03-07 22:43:44
123,36,62,7.5,18640.0,2015-03-07 22:43:34
123,36,63,8.5,7189.0,2015-03-07 22:44:06
123,36,97,2.5,7627.0,2015-03-07 22:42:53
123,36,98,4.5,9000.0,2015-03-07 22:43:04
123,36,99,7.5,7514.0,2015-03-07 22:43:13
223,63,30,8.0,5412.0,2015-03-22 01:42:10
123,36,30,5.5,8046.0,2015-03-07 22:42:05
223,63,32,8.5,4872.0,2015-03-22 01:42:03
123,36,32,7.5,11914.0,2015-03-07 22:41:54
225,63,35,7.5,6491.0,2015-03-22 01:42:19
123,36,35,5.5,7202.0,2015-03-07 22:42:15
123,36,36,6.5,6806.0,2015-03-07 22:42:43
123,36,37,2.5,6810.0,2015-03-07 22:42:34
225,63,41,5.0,15026.0,2015-03-22 01:42:37
225,63,45,6.5,8532.0,2015-03-07 22:42:25
I can groupby the data using user_id and session_id and get a count of items a user has rated in a session:
df.groupby(['user_id', 'session_id']).agg({'item_id':'count'}).rename(columns={'item_id': 'count'})
The list of items a user has rated in a session can be obtained with:
df.groupby(['user_id','session_id'])['item_id'].apply(list)
The goal is the following: if a user has rated more than 3 items in a session, I want to pick only the first three items (keep only the first three per user per session) from the original dataframe. Maybe use the time column to sort the items?
I first tried to find which sessions contain more than 3 items, but I'm struggling to go beyond that:
df.groupby(['user_id', 'session_id'])['item_id'].apply(
    lambda x: (x > 3).count())
Example: from the original df, user 123 should keep only the first three records belonging to session 36.
It seems like you want to use groupby with head:
In [8]: df.groupby([df.user_id, df.session_id]).head(3)
Out[8]:
user_id session_id item_id rating length time
0 123 36 28 3.5 6243.0 2015-03-07 22:44:40
1 123 36 29 2.5 4884.0 2015-03-07 22:44:14
2 123 36 30 3.5 6846.0 2015-03-07 22:44:28
10 223 63 30 8.0 5412.0 2015-03-22 01:42:10
12 223 63 32 8.5 4872.0 2015-03-22 01:42:03
14 225 63 35 7.5 6491.0 2015-03-22 01:42:19
18 225 63 41 5.0 15026.0 2015-03-22 01:42:37
19 225 63 45 6.5 8532.0 2015-03-07 22:42:25
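If "first three" should mean the three earliest actions by time, as the question hints, a small tweak is to sort before grouping, so head(3) keeps the earliest rows per group. A sketch on a hypothetical miniature of the data:

```python
import pandas as pd

# Hypothetical miniature of the question's data
df = pd.DataFrame({
    'user_id':    [123, 123, 123, 123, 223],
    'session_id': [36, 36, 36, 36, 63],
    'item_id':    [28, 29, 30, 54, 30],
    'time': pd.to_datetime([
        '2015-03-07 22:44:40', '2015-03-07 22:44:14',
        '2015-03-07 22:44:28', '2015-03-07 22:43:56',
        '2015-03-22 01:42:10',
    ]),
})

# Sort by time first, so head(3) keeps the three EARLIEST actions per group
res = df.sort_values('time').groupby(['user_id', 'session_id']).head(3)
```

Groups with 3 or fewer rows are kept whole, since head(3) never drops from groups that small.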
One way is to use sort_values followed by groupby.cumcount. A method I find useful is to extract any series or MultiIndex data before applying any filtering.
The example below keeps only user_id / session_id combinations with at least 3 items, and takes the first 3 rows in each such group.
df = df.sort_values('time')  # sort first, so the counting below follows time order
sizes = df.groupby(['user_id', 'session_id']).size()
counter = df.groupby(['user_id', 'session_id']).cumcount() + 1  # cumcount begins at 0
indices = df.set_index(['user_id', 'session_id']).index
res = df[(indices.map(sizes.get) >= 3) & (counter <= 3)]
print(res)
user_id session_id item_id rating length time
13 123 36 32 7.5 11914.0 2015-03-07 22:41:54
11 123 36 30 5.5 8046.0 2015-03-07 22:42:05
15 123 36 35 5.5 7202.0 2015-03-07 22:42:15
19 225 63 45 6.5 8532.0 2015-03-07 22:42:25
14 225 63 35 7.5 6491.0 2015-03-22 01:42:19
18 225 63 41 5.0 15026.0 2015-03-22 01:42:37
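An equivalent filter can also be written without building a separate index, using transform('size') and cumcount on the same groupby object. A sketch with the question's column names but made-up values:

```python
import pandas as pd

# Made-up data: group (1, 5) has 4 rows, group (2, 7) only 2
df = pd.DataFrame({
    'user_id':    [1, 1, 1, 1, 2, 2],
    'session_id': [5, 5, 5, 5, 7, 7],
    'item_id':    [10, 11, 12, 13, 14, 15],
})

g = df.groupby(['user_id', 'session_id'])
# Keep groups with at least 3 rows, and within those only the first 3 rows
res = df[(g['item_id'].transform('size') >= 3) & (g.cumcount() < 3)]
```

transform('size') broadcasts each group's size to its rows, so both conditions are plain boolean Series aligned with df.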
I need help looping through table rows and putting them into a list. On this website, there are three tables, each with different statistics - http://www.fangraphs.com/statsplits.aspx?playerid=15640&position=OF&season=0&split=0.4
For instance, these three tables have rows for 2016, 2017, and a total row. I would like the following:
A list of the following -->table 1 - row 1, table 2 - row 1, table 3 - row 1
A second list of the following -->table 1 - row 2, table 2 - row 2, table 3 - row 2
A third list: -->table 1 - row 3, table 2 - row 3, table 3 - row 3
I know I obviously need to create lists and use the append function; however, I am not sure how to loop through just the first row of each table, then the second row of each table, and so on through each row (the number of rows will vary in each instance; this one just happens to have 3).
Any help is greatly appreciated. The code is below:
from bs4 import BeautifulSoup
import requests
import pandas as pd
import csv

idList2 = ['15640', '9256']
splitList = [0.4, 0.2, 0.3, 0.4]

for id in idList2:
    pos = 'OF'
    for split in splitList:
        url = ('http://www.fangraphs.com/statsplits.aspx?playerid=' +
               str(id) + '&position=' + str(pos) + '&season=0&split=' +
               str(split))
        r = requests.get(url)
        for season in range(1, 4):
            print(season)
            soup = BeautifulSoup(r.text, "html.parser")
            tableStats = soup.find("table", {"id": "SeasonSplits1_dgSeason" + str(season) + "_ctl00"})
            column_headers = [th.getText() for th in soup.findAll('th')]
            statistics = soup.find("table", {"id": "SeasonSplits1_dgSeason" + str(season) + "_ctl00"})
            tabledata = [td.getText() for td in statistics('td')]
            print(tabledata)
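Separately from the scraping itself, the regrouping the question asks for (row 1 of each table bundled together, then row 2, and so on) is just a transpose across tables, which zip handles for any number of rows. A sketch with hypothetical already-scraped row lists standing in for the three season tables:

```python
# Hypothetical rows, as if already scraped from the three season tables
table1 = [['2016', '27'], ['2017', '66'], ['Total', '93']]
table2 = [['2016', '7.8 %'], ['2017', '13.4 %'], ['Total', '12.0 %']]
table3 = [['2016', '0.74'], ['2017', '1.14'], ['Total', '1.02']]

# zip pairs up row i of every table; the number of rows/tables can vary
row_groups = [list(rows) for rows in zip(table1, table2, table3)]
# row_groups[0] bundles the first row of each table, row_groups[1] the second, etc.
```

(As an aside, pandas' read_html can pull every table on a page directly as DataFrames, if lxml or html5lib is installed.)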
This will be my last attempt. It has everything you should need. I created a traceback to where the tables, rows and columns are being scraped. This all happens in the function extract_table(). Follow the traceback markers and don't worry about any other code. Don't let the large file size worry you; it's mostly documentation and spacing.
Traceback marker: ### ... ###
Start at line 95, with the traceback marker ### START HERE ###
from bs4 import BeautifulSoup as Soup
import requests
import urllib

###### GO TO LINE 95 ######

### IGNORE ###
def generate_urls (idList, splitList):
    """ Using an id list and a split list, generate a list of urls """
    urls = []
    url = 'http://www.fangraphs.com/statsplits.aspx'
    for id in idList:
        for split in splitList:
            # The parameters used in creating the url
            url_payload = {'split': split, 'playerid': id, 'position': 'OF', 'season': 0}
            # Create the url and add it to the collection of urls
            urls += ['?'.join([url, urllib.urlencode(url_payload)])]
    return urls  # Return the list of urls

### IGNORE ###
def extract_player_name (soup):
    """ Extract the player name from the browser title """
    # Browser title contains player name, strip all but name
    player_name = repr(soup.title.text.strip('\r\n\t'))
    player_name = player_name.split(' \\xbb')[0]  # Split on ` »`
    player_name = player_name[2:]  # Erase leading characters introduced by `repr`
    return player_name

########## FINISH HERE ##########
def extract_table (table_id, soup):
    """ Extract data from a table, return the column headers and the table rows """
    ### IMPORTANT: THIS CODE IS WHERE ALL THE MAGIC HAPPENS ###
    # - First:  Find the lowest-level tag containing all the data we want (container).
    # - Second: Extract the table column headers; requires minimal mining.
    # - Third:  Gather a list of tags that represent the table's rows.
    # - Fourth: Loop through the list of rows.
    #   A): Mine all columns in the row.

    ### IMPORTANT: Get A Reference To The Table ###
    # SCRAPE 1:
    table_tag = soup.find("table", {"id" : 'SeasonSplits1_dgSeason%d_ctl00' % table_id})
    # SCRAPE 2:
    columns = [th.text for th in table_tag.findAll('th')]
    # SCRAPE 3:
    rows_tags = table_tag.tbody.findAll('tr')  # All 'tr' tags in the table `tbody` tag are row tags

    ### IMPORTANT: Cycle Through Rows And Collect Column Data ###
    # SCRAPE 4:
    rows = []  # List of all table rows
    for row_tag in rows_tags:
        ### IMPORTANT: Mine All Columns In This Row || LOWEST LEVEL IN THE MINING OPERATION. ###
        # SCRAPE 4.A
        row = [col.text for col in row_tag.findAll('td')]  # `td` represents a column in a row.
        rows.append(row)  # Add this row to all the other rows of this table

    # RETURN: The column headers and the rows of this table
    return [columns, rows]

### Look Deeper ###
def extract_player (soup):
    """ Extract player data and store in a list. ['name', [columns, rows], [table2]] """
    player = []  # A list to store data in
    # player name is first in the player list
    player.append(extract_player_name(soup))
    # Each table is a list entry
    for season in range(1, 4):
        ### IMPORTANT: No Table Related Data Has Been Mined Yet. START HERE ###
        ### - Line: 37
        table = extract_table(season, soup)  # `season` represents the table id
        player.append(table)  # Add this table (a list) to the player data list
    # Return the player list
    return player

##################################################
################## START HERE ####################
##################################################
###
### OBJECTIVE:
###
###  - Follow the trail of important lines that extract the data
###  - Important lines will be marked as the following `### ... ###`
###
###  All this code really needs is a url and the `extract_table()` function.
###
###  The `main()` function is where the journey starts
###
##################################################
##################################################
def main ():
    """ The main function is the core program code. """
    # Luckily the pages we will scrape all have the same layout, making mining easier.
    all_players = []  # A place to store all the data

    # Values used to alter the url when making requests to access more player statistics
    idList2 = ['15640', '9256']
    splitList = [0.4, 0.2, 0.3, 0.4]

    # Instead of looping through variables that don't tell a story,
    # let's create a list of urls generated from those variables.
    # This way the code is self-explanatory and human-readable.
    urls = generate_urls(idList2, splitList)  # The creation of the url is not important right now

    # Let's scrape each url
    for url in urls:
        print url
        # First step: get a web page via http request.
        response = requests.get(url)
        # Second step: use a parsing library to create a parsable object
        soup = Soup(response.text, "html.parser")  # Create a soup object (once)
        ### IMPORTANT: Parsing Starts and Ends Here ###
        ### - Line: 75
        # Final step: given a soup object, mine player data
        player = extract_player(soup)
        # Add the new entry to the list
        all_players += [player]
    return all_players

# If this script is being run, not imported, run the `main()` function.
if __name__ == '__main__':
    all_players = main()
    print all_players[0][0]           # Player List -> Name
    print all_players[0][1]           # Player List -> Table 1
    print all_players[0][2]           # Player List -> Table 2
    print all_players[0][3]           # Player List -> Table 3
    print all_players[0][3][0]        # Player List -> Table 3 -> Columns
    print all_players[0][3][1]        # Player List -> Table 3 -> All Rows
    print all_players[0][3][1][0]     # Player List -> Table 3 -> All Rows -> Row 1
    print all_players[0][3][1][2]     # Player List -> Table 3 -> All Rows -> Row 3
    print all_players[0][3][1][2][0]  # Player List -> Table 3 -> All Rows -> Row 3 -> Column 1
I've updated the code, separated the functionality, and used lists instead of dictionaries (as requested). Lines 85+ are output testing (you can ignore them).
I see now that you're making multiple requests (4) for the same player to gather more data on them. In the last answer I provided, the code only kept the last request made. Using a list eliminates this problem.
You may want to condense the list so that there is only one entry per player.
The core of the program is on lines 65-77.
Everything above the all_players declaration on line 57 is a function to handle scraping.
UPDATED: scrape_players.py
from bs4 import BeautifulSoup as Soup
import requests

def http_get (id, split):
    """ Make a get request, return the response. """
    # Create the url parameters dictionary
    payload = {'split': split, 'playerid': id, 'position': 'OF', 'season': 0}
    url = 'http://www.fangraphs.com/statsplits.aspx'
    return requests.get(url, params=payload)  # Pass payload through `requests.get()`

def extract_player_name (soup):
    """ Extract the player name from the browser title """
    # Browser title contains player name, strip all but name
    player_name = repr(soup.title.text.strip('\r\n\t'))
    player_name = player_name.split(' \\xbb')[0]  # Split on ` »`
    player_name = player_name[2:]  # Erase leading characters introduced by `repr`
    return player_name

def extract_table (table_id, soup):
    """ Extract data from a table, return the column headers and the table rows """
    # SCRAPE: Get a table
    table_tag = soup.find("table", {"id" : 'SeasonSplits1_dgSeason%d_ctl00' % table_id})
    # SCRAPE: Extract table column headers
    columns = [th.text for th in table_tag.findAll('th')]
    rows = []
    # SCRAPE: Extract table contents
    for row in table_tag.tbody.findAll('tr'):
        rows.append([col.text for col in row.findAll('td')])  # Gather all columns in the row
    # RETURN: [columns, rows]
    return [columns, rows]

def extract_player (soup):
    """ Extract player data and store in a list. ['name', [columns, rows], [table2]] """
    player = []
    # player name is first in the player list
    player.append(extract_player_name(soup))
    # Each table is a list entry
    for season in range(1, 4):
        player.append(extract_table(season, soup))
    # Return the player list
    return player

# A list of all players
all_players = [
    #'playername',
    #[table_columns, table_rows],
    #[table_columns, table_rows],
    #[['Season', 'vs R as R'], [['2015', 'yes'], ['2016', 'no'], ['2017', 'no'],]],
]

# I don't know what these values are. Sorry!
idList2 = ['15640', '9256']
splitList = [0.4, 0.2, 0.3, 0.4]

# Scrape data
for id in idList2:
    for split in splitList:
        response = http_get(id, split)
        soup = Soup(response.text, "html.parser")  # Create a soup object (once)
        all_players.append(extract_player(soup))
        # or all_players += [extract_player(soup)]

# Output data
def PrintPlayerAsTable (player, show_name=True):
    if show_name: print player[0]  # First entry is the player name
    for table in player[1:]:       # All other entries are tables
        PrintTableAsTable(table)

def PrintTableAsTable (table, table_sep='\n'):
    print table_sep
    PrintRowAsTable(table[0])  # The first item in the table is the column headers
    for row in table[1]:       # The second item in the table is a list of rows
        PrintRowAsTable(row)

def PrintRowAsTable (row=[], prefix='\t'):
    """ Print out the list in a table format. """
    print prefix + ''.join([col.ljust(15) for col in row])

# There are 4 entries for every player, one for each request made
PrintPlayerAsTable(all_players[0])
PrintPlayerAsTable(all_players[1], False)
PrintPlayerAsTable(all_players[2], False)
PrintPlayerAsTable(all_players[3], False)

print '\n\nScraped %d player Statistics' % len(all_players)
for player in all_players:
    print '\t- %s' % player[0]

# 5th player entry
print '\n\n'
print all_players[4][0]  # Player name
print '\n'
#print all_players[4][1]          # Table 1
print all_players[4][1][0]        # Table 1 Column Headers
#print all_players[4][1][1]       # Table 1 Rows
print all_players[4][1][1][1]     # Table 1 Rows -> Row 2
print all_players[4][1][1][2]     # Table 1 Rows -> Row 3
print all_players[4][1][1][-1]    # Table 1 Rows -> Last Row
print '\n'
#print all_players[4][2]          # Table 2
print all_players[4][2][0]        # Table 2 Column Headers
#print all_players[4][2][1]       # Table 2 Rows
print all_players[4][2][1][1]     # Table 2 Rows -> Row 2
print all_players[4][2][1][2]     # Table 2 Rows -> Row 3
print all_players[4][2][1][-1]    # Table 2 Rows -> Last Row

print '\nTable 3'
PrintRowAsTable(all_players[4][2][0], '')     # Column Headers
PrintRowAsTable(all_players[4][2][1][1], '')  # Row 2
PrintRowAsTable(all_players[4][2][1][2], '')  # Row 3
PrintRowAsTable(all_players[4][2][1][-1], '') # Last Row
OUTPUT:
The script outputs the scraped data, so you can see how all_players is structured.
Aaron Judge
Season vs R as R G AB PA H 1B 2B 3B HR R RBI BB IBB SO HBP SF SH GDP SB CS AVG
2016 vs R as R 27 69 77 14 8 2 0 4 8 10 6 0 32 1 1 0 2 0 0 .203
2017 vs R as R 66 198 231 65 34 10 2 19 37 42 31 3 71 2 0 0 8 3 0 .328
Total vs R as R 93 267 308 79 42 12 2 23 45 52 37 3 103 3 1 0 10 3 0 .296
Season vs R as R BB% K% BB/K AVG OBP SLG OPS ISO BABIP wRC wRAA wOBA wRC+
2016 vs R as R 7.8 % 41.6 % 0.19 .203 .273 .406 .679 .203 .294 7 -1.7 .291 79
2017 vs R as R 13.4 % 30.7 % 0.44 .328 .424 .687 1.111 .359 .426 54 26.1 .454 189
Total vs R as R 12.0 % 33.4 % 0.36 .296 .386 .614 1.001 .318 .394 62 24.4 .413 162
Season vs R as R GB/FB LD% GB% FB% IFFB% HR/FB IFH% BUH% Pull% Cent% Oppo% Soft% Med% Hard% Pitches Balls Strikes
2016 vs R as R 0.74 13.2 % 36.8 % 50.0 % 0.0 % 21.1 % 7.1 % 0.0 % 50.0 % 29.0 % 21.1 % 7.9 % 42.1 % 50.0 % 327 117 210
2017 vs R as R 1.14 27.6 % 38.6 % 33.9 % 2.3 % 44.2 % 6.1 % 0.0 % 45.7 % 26.8 % 27.6 % 11.0 % 39.4 % 49.6 % 985 395 590
Total vs R as R 1.02 24.2 % 38.2 % 37.6 % 1.6 % 37.1 % 6.3 % 0.0 % 46.7 % 27.3 % 26.1 % 10.3 % 40.0 % 49.7 % 1312 512 800
Season vs R as L G AB PA H 1B 2B 3B HR R RBI BB IBB SO HBP SF SH GDP SB CS AVG
2016 vs R as L 3 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 1 .000
2017 vs R as L 20 0 0 0 0 0 0 0 13 0 0 0 0 0 0 0 0 3 1 .000
Total vs R as L 23 0 0 0 0 0 0 0 15 0 0 0 0 0 0 0 0 3 2 .000
Season vs R as L BB% K% BB/K AVG OBP SLG OPS ISO BABIP wRC wRAA wOBA wRC+
2016 vs R as L 0.0 % 0.0 % 0.00 .000 .000 .000 .000 .000 .000 0 0.0 .000
2017 vs R as L 0.0 % 0.0 % 0.00 .000 .000 .000 .000 .000 .000 0 0.0 .000
Total vs R as L 0.0 % 0.0 % 0.00 .000 .000 .000 .000 .000 .000 0 0.0 .000
Season vs R as L GB/FB LD% GB% FB% IFFB% HR/FB IFH% BUH% Pull% Cent% Oppo% Soft% Med% Hard% Pitches Balls Strikes
2016 vs R as L 0.00 0.0 % 0.0 % 0.0 % 0.0 % 0.0 % 0.0 % 0.0 % 0 0 0
2017 vs R as L 0.00 0.0 % 0.0 % 0.0 % 0.0 % 0.0 % 0.0 % 0.0 % 0 0 0
Total vs R as L 0.00 0.0 % 0.0 % 0.0 % 0.0 % 0.0 % 0.0 % 0.0 % 0 0 0
Season vs L as R G AB PA H 1B 2B 3B HR R RBI BB IBB SO HBP SF SH GDP SB CS AVG
2016 vs L as R 11 15 18 1 1 0 0 0 0 0 3 0 10 0 0 0 0 0 0 .067
2017 vs L as R 26 47 61 16 9 1 1 5 9 12 13 0 16 1 0 0 2 0 0 .340
Total vs L as R 37 62 79 17 10 1 1 5 9 12 16 0 26 1 0 0 2 0 0 .274
Season vs L as R BB% K% BB/K AVG OBP SLG OPS ISO BABIP wRC wRAA wOBA wRC+
2016 vs L as R 16.7 % 55.6 % 0.30 .067 .222 .067 .289 .000 .200 0 -2.3 .164 -8
2017 vs L as R 21.3 % 26.2 % 0.81 .340 .492 .723 1.215 .383 .423 17 9.1 .496 218
Total vs L as R 20.3 % 32.9 % 0.62 .274 .430 .565 .995 .290 .387 16 6.8 .421 166
Season vs L as R GB/FB LD% GB% FB% IFFB% HR/FB IFH% BUH% Pull% Cent% Oppo% Soft% Med% Hard% Pitches Balls Strikes
2016 vs L as R 0.33 20.0 % 20.0 % 60.0 % 0.0 % 0.0 % 0.0 % 0.0 % 20.0 % 60.0 % 20.0 % 20.0 % 40.0 % 40.0 % 81 32 49
2017 vs L as R 0.73 16.1 % 35.5 % 48.4 % 0.0 % 33.3 % 0.0 % 0.0 % 29.0 % 48.4 % 22.6 % 16.1 % 35.5 % 48.4 % 295 135 160
Total vs L as R 0.67 16.7 % 33.3 % 50.0 % 0.0 % 27.8 % 0.0 % 0.0 % 27.8 % 50.0 % 22.2 % 16.7 % 36.1 % 47.2 % 376 167 209
Season vs R as R G AB PA H 1B 2B 3B HR R RBI BB IBB SO HBP SF SH GDP SB CS AVG
2016 vs R as R 27 69 77 14 8 2 0 4 8 10 6 0 32 1 1 0 2 0 0 .203
2017 vs R as R 66 198 231 65 34 10 2 19 37 42 31 3 71 2 0 0 8 3 0 .328
Total vs R as R 93 267 308 79 42 12 2 23 45 52 37 3 103 3 1 0 10 3 0 .296
Season vs R as R BB% K% BB/K AVG OBP SLG OPS ISO BABIP wRC wRAA wOBA wRC+
2016 vs R as R 7.8 % 41.6 % 0.19 .203 .273 .406 .679 .203 .294 7 -1.7 .291 79
2017 vs R as R 13.4 % 30.7 % 0.44 .328 .424 .687 1.111 .359 .426 54 26.1 .454 189
Total vs R as R 12.0 % 33.4 % 0.36 .296 .386 .614 1.001 .318 .394 62 24.4 .413 162
Season vs R as R GB/FB LD% GB% FB% IFFB% HR/FB IFH% BUH% Pull% Cent% Oppo% Soft% Med% Hard% Pitches Balls Strikes
2016 vs R as R 0.74 13.2 % 36.8 % 50.0 % 0.0 % 21.1 % 7.1 % 0.0 % 50.0 % 29.0 % 21.1 % 7.9 % 42.1 % 50.0 % 327 117 210
2017 vs R as R 1.14 27.6 % 38.6 % 33.9 % 2.3 % 44.2 % 6.1 % 0.0 % 45.7 % 26.8 % 27.6 % 11.0 % 39.4 % 49.6 % 985 395 590
Total vs R as R 1.02 24.2 % 38.2 % 37.6 % 1.6 % 37.1 % 6.3 % 0.0 % 46.7 % 27.3 % 26.1 % 10.3 % 40.0 % 49.7 % 1312 512 800
Scraped 8 player Statistics
- Aaron Judge
- Aaron Judge
- Aaron Judge
- Aaron Judge
- A.J. Pollock
- A.J. Pollock
- A.J. Pollock
- A.J. Pollock
A.J. Pollock
[u'Season', u'vs R as R', u'G', u'AB', u'PA', u'H', u'1B', u'2B', u'3B', u'HR', u'R', u'RBI', u'BB', u'IBB', u'SO', u'HBP', u'SF', u'SH', u'GDP', u'SB', u'CS', u'AVG']
[u'2013', u'vs R as R', u'115', u'270', u'295', u'70', u'52', u'12', u'2', u'4', u'25', u'21', u'21', u'1', u'54', u'1', u'0', u'3', u'4', u'3', u'1', u'.259']
[u'2014', u'vs R as R', u'71', u'215', u'232', u'66', u'42', u'17', u'3', u'4', u'21', u'14', u'15', u'0', u'41', u'2', u'0', u'0', u'3', u'7', u'1', u'.307']
[u'Total', u'vs R as R', u'395', u'1120', u'1230', u'330', u'225', u'67', u'13', u'25', u'122', u'102', u'93', u'1', u'199', u'5', u'9', u'3', u'23', u'41', u'6', u'.295']
[u'Season', u'vs R as R', u'BB%', u'K%', u'BB/K', u'AVG', u'OBP', u'SLG', u'OPS', u'ISO', u'BABIP', u'wRC', u'wRAA', u'wOBA', u'wRC+']
[u'2013', u'vs R as R', u'7.1 %', u'18.3 %', u'0.39', u'.259', u'.315', u'.363', u'.678', u'.104', u'.311', u'29', u'-3.0', u'.301', u'84']
[u'2014', u'vs R as R', u'6.5 %', u'17.7 %', u'0.37', u'.307', u'.358', u'.470', u'.828', u'.163', u'.365', u'35', u'9.6', u'.364', u'128']
[u'Total', u'vs R as R', u'7.6 %', u'16.2 %', u'0.47', u'.295', u'.349', u'.445', u'.793', u'.150', u'.337', u'168', u'30.7', u'.345', u'113']
Table 3
Season vs R as R BB% K% BB/K AVG OBP SLG OPS ISO BABIP wRC wRAA wOBA wRC+
2013 vs R as R 7.1 % 18.3 % 0.39 .259 .315 .363 .678 .104 .311 29 -3.0 .301 84
2014 vs R as R 6.5 % 17.7 % 0.37 .307 .358 .470 .828 .163 .365 35 9.6 .364 128
Total vs R as R 7.6 % 16.2 % 0.47 .295 .349 .445 .793 .150 .337 168 30.7 .345 113