Ranking MLB Organizational Strength by Position

The charts that follow combine each team’s share of 2018 WAR at a given position with their share of Future Value (FV) at that same position (as determined by the Fangraphs BOARD prospect ratings). The sum of these two shares determines the relative positional strength at each position, which is how the charts are ordered. So, for example, the Indians have Francisco Lindor and a whole bunch of shortstop prospects. Lindor and friends combined for about 9% of the WAR at shortstop in 2018, and Cleveland’s plethora of SS prospects makes up 8% of the future value at SS as well. When we combine the two shares (9% + 8%) we get 17% – far and away the best mark at shortstop, and well in front of the runner-up Nationals.
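In code, the combined score is just the sum of the two shares. A minimal sketch, using the approximate Cleveland shortstop figures from the example above:

```python
# A minimal sketch of the combined positional-strength score described
# above. The 9%/8% Cleveland shortstop shares are approximate figures
# from the example, not exact data.

def combined_strength(war_share, fv_share):
    """Sum a team's share of 2018 WAR at a position with its share of
    prospect Future Value (FV) at the same position."""
    return war_share + fv_share

cleveland_ss = combined_strength(war_share=0.09, fv_share=0.08)
print(f"{cleveland_ss:.0%}")  # 17%
```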

This iteration reflects rosters as of 12/10/2018 – so Paul Goldschmidt’s share of WAR is embedded in the Cardinals’ portion of the 1B chart, and Patrick Corbin’s with the Nats. As you can see, a considerable share of 2018 Catcher and 2B WAR is still available on the FA market. I’ll have a longer post where I walk through each of these at some point, but it’s a little difficult at this juncture since they’re changing almost every day – with the M’s and Dbacks both in some kind of tear-down mode, and the bulk of free agents yet to find a team, these charts are far from what they’ll look like on Opening Day.

The Baseball-Nerdiest Cities in the US According to Google Trends

Here we’ll use data from Google Trends to determine the nerdiest baseball cities in the US. About 90% of the words in this post cover the basis and methodology of the analysis. If you just want to see the rankings, skip to the end.

Whenever I discover that someone I know is a baseball fan, I try throwing a few advanced metrics into the conversation just to gauge familiarity. I should preface this by mentioning I don’t think any level of familiarity with advanced metrics changes a person’s value as a baseball fan – whether we’re exporting Fangraphs data or listening to sports talk radio, passion is passion regardless of how we waste our time with something as pointless as baseball fandom. But while I enjoy baseball conversations with fans of all types simply because it’s baseball, I love getting insight from fellow stat geeks because I want to know what others value. There’s so much to learn from the massive collection of data generated by baseball that it’s impossible for one person to know everything on their own. The biggest problem I have throwing in the statistical jargon is simply the lack of bites on the other side of the conversation; I never get the long-awaited bWAR versus fWAR debate that I truly long for in casual chitchat. I see it on the message boards in droves. Fangraphs, Beyond the Box Score, and even Reddit all seem packed with geeks, so why isn’t the bar by Angel Stadium flooded with a few of the same people an hour before first pitch? This had me thinking about something that’s probably kind of dumb, but to me at least, is still very interesting…

Geographically speaking, where do I find all the baseball stat geeks? Near which stadium’s closest brewery am I most likely to find someone as annoyed as I am about being unable to split half-seasons and years in the same export at Fangraphs?

Thanks to Google’s dominance in both search engine quality and creepy monitoring of our every move, Google Trends was my go-to resource for the data I collected. For the uninitiated, Google Trends is a way to measure the search interest in a particular term over time or space (geography), and to compare the interest in different terms over those same dimensions. “Search interest” is probably better defined by the less-marketable term “search volume”, though the data produced by Google Trends isn’t a direct measure of volume like total searches or search percentage – it’s a 0-100 scale that controls for general search activity in a given area (roughly the same as controlling for population). Now I probably could’ve simply looked up Fangraphs on Google Trends (which I did) and called it a day, but the lack of rigor made it seem shallow.

It’s obvious from the Google Trends graphic when and where Fangraphs garners the most interest: during baseball season, and in the Pacific Northwest, the area between Chicago and St. Louis (I looked this up and it’s apparently called the “North Central Midwest”), and the Pittsburgh area. But we’re still not done. For US-only searches, Google Trends usually returns more data points using the “Metro” subregion, which is actually the Designated Market Area (DMA) used by Nielsen (the TV ratings people) rather than the Metropolitan Statistical Area as I’d first assumed. The exported data from Google Trends for Fangraphs revealed a handful of DMAs with search volume too low to register any quantifiable level of interest. I wondered whether the same places would produce similar results for Baseball-Reference, and much to my delight, I found a correlation coefficient of r = 0.85 (R² is shown on the chart). I also pulled Google Trends data for the phrase “Happy Thanksgiving” (it was trending at the time) as a control set to reassure myself the correlation between Fangraphs and Baseball-Reference wasn’t a probable outcome for any random Google search; this yielded a correlation coefficient of r = -0.10 with Fangraphs…hooray!
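For anyone curious how a correlation like that is computed, here’s a quick sketch of a Pearson correlation across DMA-level Trends exports. The five DMA rows below are invented for illustration; they aren’t the actual export:

```python
import math

# Sketch of the correlation check, assuming one 0-100 interest score per
# DMA for each search term. The data below is made up, not the real export.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

fangraphs = [80, 65, 40, 90, 20]  # hypothetical interest by DMA
bbref     = [75, 60, 45, 85, 25]
r = pearson_r(fangraphs, bbref)
print(round(r, 2), round(r ** 2, 2))  # the real Fangraphs/B-Ref data gave r = 0.85
```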

This means it’s very likely certain regions are more baseball-nerdy than others – not that searching for Fangraphs or Baseball-Reference makes a person a baseball nerd, but the aggregated data certainly represents a solid proxy. I wanted to collect more Google Trends data on search terms that are similarly stat-geeky, so I tried “sabermetrics”, “Bill James”, and “Moneyball”, but neither sabermetrics nor Bill James yielded enough data points due to lack of volume, and the vast majority of Moneyball’s search volume was generated when the movie came out – not a desirable trait. Still, with more search terms we have more data, and we’ll be a lot more confident in the results while mitigating bias – similar to diversifying a portfolio to mitigate risk. So I eventually ran the following through Google Trends, both individually and combined (for volume comparison between terms):

My goal, if you couldn’t tell, was to use the aggregated data to determine the best (and worst) baseball-nerd cities and regions by summing the total interest generated for the five Google searches by location. To accurately reflect their search proportions, each Google Search was weighted by its individual search volume relative to the combined volume of all five (visualized in the appropriately titled donut chart).
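The volume weighting works out to each term’s share of the combined volume. A small sketch, with placeholder term names and volumes since the real five terms and their relative volumes come from the Trends export (and the donut chart):

```python
# Sketch of the weighting scheme: each search term's weight is its share
# of the combined search volume of all five. The term names and volumes
# below are placeholders, not the actual figures.

volumes = {"term_1": 50, "term_2": 25, "term_3": 12, "term_4": 8, "term_5": 5}

total = sum(volumes.values())
weights = {term: vol / total for term, vol in volumes.items()}
print(weights)  # weights sum to 1, mirroring the donut chart's proportions
```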

As a finishing touch, I added a sixth and final Google search-related measurement that doesn’t fall completely in line with the other five:

  • The search volume of MLB compared to the search volume of NFL
    • Each metro has a combined MLB-NFL search score of 100
      • One score for MLB
      • One score for NFL
      • They sum to 100
    • So the most baseball-ish city possible would have a MLB score of 100 and an NFL score of 0 (this place doesn’t exist…or else it’s way too small to be a blip on Google’s radar)
      • A sad (though not too relevant) side note – no metros returned an MLB score above 42…but way to go Peoria-Bloomington, IL!
    • Since it isn’t measured in terms of volume, I weighted the MLB-NFL score at 1/6th (~16.7%) of the combined score and re-weighted the other five accordingly
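The re-weighting step above can be sketched like so: the MLB-NFL score takes a fixed 1/6 of the combined score, and the five volume-based weights are scaled by 5/6 to make room. The five volume weights below are placeholders, not the real figures:

```python
# Sketch of the final weighting: the MLB-NFL score is fixed at 1/6
# (~16.7%), and the five volume-based weights are rescaled to fill the
# remaining 5/6. The volume weights here are hypothetical.

volume_weights = [0.50, 0.25, 0.12, 0.08, 0.05]  # placeholder, sums to 1
mlb_nfl_weight = 1 / 6

final_weights = [w * (1 - mlb_nfl_weight) for w in volume_weights]
final_weights.append(mlb_nfl_weight)
print([round(w, 4) for w in final_weights])  # still sums to 1
```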

The final weighting of the combined score is shown in the next chart:

RESULTS

The very last table in this post reflects the complete rankings, which I can’t say are too surprising. The map generated by the initial Fangraphs search made them a little more predictable, but there’s certainly more clarity after we combine all the data. I also put together a heat map in Tableau that visualizes nerdiness in the 48 contiguous states (sorry Hawaii; Honolulu [112th] did register some data…nowhere in Alaska did). Let me also mention that Tableau doesn’t recognize Designated Market Areas as a geographic variable, so I had to map them out by ZIP code…which was the greatest pain in this entire post before I learned it would’ve been much easier had I mapped them out by county instead of ZIP.

Well…the numbers basically speak for themselves. As much as I cringe hearing Cardinals fans claim the “Best Fans in Baseball” designation, they’re easily the nerdiest. They’re also the most engaged on social media, which makes their nerdiness pretty understandable. So congratulations St. Louis and surrounding area, you’re a bunch of nerds – which makes me reeeeeeally want to visit Busch Stadium when the A’s come to town in 2019. The Columbia-Jefferson City area is the market directly west of St. Louis, and directly east of Kansas City, though the Cardinals generate about twice the search volume the Royals do in the area. The third result, Champaign-Springfield-Decatur, IL, generates more search volume for the Cubs than the Cardinals, however – SO YOU AREN’T THAT GREAT CARDINALS FANS! The other strong areas include Pittsburgh, Chicago, and New England, each one home to notably loyal and passionate fanbases – though I have to admit Pittsburgh ended up higher than I might’ve guessed. I’m also guessing Meg Rowley and Patrick Dubuque are solely responsible for Wisconsin appearing twice in the top 15…and probably a little for Seattle not being as sad as the rest of the west coast.

The west coast is basically inept when it comes to nerding out on baseball, which is sad news for me. 23rd-ranked Seattle-Tacoma is the only west coast area in the first 38, and then the Bay Area finally joins in at 39th. I find the very bottom of the rankings interesting – maybe even more so than the top. These areas could easily be the places where the most blue-chip high school football prospects come from in any given year – 12 of the bottom 15 are from deeeeeep football country – Texas, Oklahoma, Florida, Mississippi, and Georgia. Compared to the top of the list, though, the bottom is also generally much farther from any MLB team.

What’s all this mean? Probably not a whole lot. But I’ve been to the bars near Fenway, and they were definitely enthusiastic about baseball in a way I don’t ever expect to witness in Anaheim. If that same enthusiasm is topped by the nerdiness engulfing the area between St. Louis and Chicago, I actually look forward to visiting the Midwest – something I’ve never felt before in my life. At the very least, I’m guessing it beats trying to talk about run differential in El Paso.

Building a Farm of the Best Unranked Prospects – Part 1: Pitchers

A lot is made about the strength of some farm systems, and likewise the weakness of some others. There are plenty of reasons why stockpiling noteworthy, ranked, and on-the-radar prospects can be advantageous:

  • Longevity and cost control at the positions held by your prospects
  • You can easily fill the voids of outgoing free agents
  • Prospects are the preferred currency of low-budget teams who can land big names on the trade market but not the free agent market
  • A healthy farm is often seen as a proxy for the future health of the organization

With all that said, how good of a farm could you possibly have if you don’t have any ranked prospects? If your farm consists of nothing but under-the-radar guys, the benefits bulleted above pretty much all go away. The best you could really hope for is that you saw something that everyone else either missed or undervalued. But I’m guessing there’s still something in the public data that could help us find a few hidden gems. So I’m going to build a farm of unranked prospects and see if it becomes anything some time down the line.

I’m going to start with pitchers, TINSTAAPP be damned. Looking at THE BOARD over at Fangraphs, I noticed that there are only 31 pitchers in the top 100 (compared to 49 at MLB.com as of 11/25/18), so it’s not exactly uncommon for a team to find itself without a ranked pitching prospect. Since my goal is to build the best possible farm system that appears to be worse than basically every other MLB team’s – but (hopefully) only on the surface – I have to set some baseline criteria to establish who’s eligible for my Island of Misfit Prospects, and do so in a way that ensures my farm looks really bad.

Eligibility Requirements & Composition of Farm

If this is done right, my fantasy farm will be at or near the bottom in each of the charts and tables above, and the eligibility requirements can assist with that. So here are the rules for a pitcher to be eligible for my farm:

  1. The pitcher cannot be ranked any higher than 220 overall on THE BOARD at Fangraphs.
    1. This is due in part to how common it is for a team to lack a pitcher in the top 100, which is how we generally understand “unranked” prospects.
    2. This increases the likelihood that no pitcher in our farm will be included on the most recent iteration of the major top-100 prospect lists (Baseball America, Keith Law, MLB.com, etc.)
    3. Finally, the Brewers are the only team without a pitcher in the top 220. I’d have made the cutoff lower than the Brewers’ highest-ranking pitcher, but unfortunately Caden Lemons is ranked 807th on THE BOARD, and that would have been extremely limiting.
  2. The pitcher cannot have any MLB experience.
  3. The pitcher must have thrown at least 10 IP in 2018.
  4. The pitcher must be part of an MLB organization.
  5. The pitcher must have already reached the MiLB assignment level.
    1. This means that a pitcher on my triple-A team must have reached triple-A at some point in his career.
    2. So I’m not necessarily advancing anyone to a higher level, but we’re going to say my fantasy farm is being created retroactive to the start of the 2018 season.

The farm should realistically reflect the experience at each level, just so I’m not throwing a bunch of unranked pitchers who threw well in low-A ball onto my triple-A team. I’m also not really interested in inexperienced pitchers who never got further than short-season A-ball, so I’m only going as deep as high-A in my fantasy farm. That gives me three teams to make up my farm: A+, AA, and AAA. Let’s also say I’m building the 2018 version of my farm, just so I don’t have to worry about 2019 assignments or promotions; the goal is to see how everyone pans out moving forward. So here’s how each team’s pitching staff will be composed:

  • 8 relief pitchers; over half of minor league appearances must have been made in relief
  • 5 starting pitchers; must have started at least 50% of minor league appearances
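The SP/RP split above amounts to a simple rule of thumb; the appearance counts in this sketch are hypothetical:

```python
# The SP/RP classification rule from the bullets above, as code.
# Appearance counts are made up for illustration.

def role(appearances, starts):
    """'SP' if at least half of minor league appearances were starts,
    otherwise 'RP' (over half of appearances came in relief)."""
    return "SP" if starts / appearances >= 0.5 else "RP"

print(role(appearances=25, starts=24))  # SP
print(role(appearances=50, starts=3))   # RP
```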

So without further delay, let’s move on to the players I’ve selected…

Triple-A

Double-A



High-A

Zac Lowther


Overview

In retrospect, it seems I maybe should’ve been more restrictive on the eligibility of Fangraphs BOARD players, either by setting a rank cutoff at something worse than 220, or by capping the number of players who appear on the BOARD at a maximum that better reflects a bottom-of-the-barrel collection of minor league pitching. I say this not because my Fantasy Farm is “good”, but because it didn’t turn out remarkably “bad” in terms of perception. While it’s completely devoid of stud prospects, and appears pretty below average when the analysis gets deeper, it isn’t exactly a bottom feeder. Check out the following charts:

Notice how the red column (representing my Fantasy Farm) moves a little more to the left as the charts descend – that’s by design. It illustrates how my collection of pitchers may appear a little less hopeless depending on how we want to evaluate farms.

Nevertheless, by common standards, I’ve put together an unremarkable, below-average farm system with no big-time prospects as of the end of the 2018 season. Other than the eligibility rules I set up, the biggest factor in my selection was a data model of MLB performance based on minor league performance. The end result is the collection of no-name prospects you saw in the tables above, who are generally young for their league and possess either good K%-BB% or GB% rates, if not both.
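The model itself isn’t shown here, but a toy screen in the same spirit might look like the following. The league-average ages and thresholds are my own guesses for illustration, not the model’s:

```python
# Toy screen in the spirit of the selection criteria: flag pitchers who
# are young for their level and show a strong K%-BB% or GB%. The average
# ages and thresholds are assumptions, not values from the actual model.

AVG_AGE = {"A+": 21.5, "AA": 23.5, "AAA": 25.5}  # rough guesses by level

def looks_interesting(level, age, k_minus_bb, gb_rate):
    young = age <= AVG_AGE[level]
    skills = k_minus_bb >= 0.15 or gb_rate >= 0.50
    return young and skills

print(looks_interesting("AA", 22.0, 0.18, 0.42))   # young with a big K%-BB%
print(looks_interesting("AAA", 27.0, 0.20, 0.55))  # skills, but old for the level
```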

Evaluating Outcome & Final Thoughts

In terms of how I’ll measure my success, I probably put my selections at a slight disadvantage by prohibiting any pitchers with MLB experience and assigning them only to levels they’ve already pitched at. Those two rules make it so I have a collection of guys who are, in all likelihood, set to repeat triple-A in 2019. I don’t have data to cite, but I’d imagine non/unranked prospects who repeat triple-A are considerably less likely to make it to the Big Leagues than a guy who ended the prior year in double-A and is assigned to triple-A at the start of the following year. Additionally, it’s hard to find a diamond in the rough who’s managed to reach the highest levels of the minors without gaining any traction as a prospect, so I’d imagine the true talent level of my Fantasy Farm is much lighter at the top than at the bottom.

That being said, I still think my farm would outperform whatever expectations would be set for it if it actually existed. I don’t have anything set in stone as far as how I’ll determine the success of my farm, but I have a few ideas, and I’ll probably use at least a couple of them (save for those that I ultimately conclude are relatively useless). Keeping in mind that I’m referring to future performance/debuts/numbers, here’s what I’m thinking…

  • Measure success by MLB debuts and/or performance
    • 160 pitchers made their MLB debut in 2018 (though not all were full-time pitchers, i.e. Willians Astudillo)
    • The average number of debuts by a team was 5.3
    • This number is skewed toward bad teams and doesn’t necessarily represent talent on the farm
    • Only about 20% of debuts came from playoff teams, which make up a third of MLB (33.3%)
    • These pitchers went 265-276 with a 4.55 ERA and 1.39 WHIP, striking out 4248 and walking 1978 in 4766.1 IP, so the bar probably won’t be set too high
  • Measure success by comparing performance across all levels to that of top-ranked farms (compile aggregated figures by org and compare)
  • Measure success by aggregating numbers of all pitching prospects ranked better than 239 on the 2018 update of the Fangraphs BOARD
    • According to the version of the Fangraphs BOARD in reference, the highest ranking member of my Fantasy Farm is Jose Suarez of the Angels at 239
    • By compiling figures of all pitchers ranked above 239, I’m looking exclusively at pitchers regarded more highly than my top-ranked pitcher
    • 98 pitchers are ranked in front of Jose Suarez, so there’d likely be a large body of work to compare
    • Since both my fantasy farm and the top 98 pitchers are likely to put up numbers at all levels, the data can be collected and measured both as a whole, and/or by level

Consider this particular exercise just part of an ongoing analysis. Regardless of how the results look at the end of 2019, it still won’t be finished. The conclusion of all this probably won’t come for a few seasons, but I’ll keep checking in on the results periodically and reporting them. Sure, in the end I’d like to be able to say my farm did better with a bunch of no-names (at the time) than the top-ranked farms, but I might crash and burn too. All we can do is watch it unfold.

MLB Starting Rotations: Using Data to Define an Ace (and a 2 and a 3 and a 4…)

Since I really want to use the blogosphere to solve as many of baseball’s infinite puzzles as I possibly can (within the constraints of life), it probably seems like I’m not being very ambitious with this post – at least if you’re judging by the title. I get it…there’s even a definition of “Ace” provided by Major League Baseball at MLB.com. That’s about as official as it gets, so consider this a closed case, right? Well, you can probably assume from the inclusion of hundreds of words below this paragraph that my answer is no. There’s really not much existing literature that delineates the parameters of an “ace” (or any other spot in the rotation) in an objective, data-driven manner. The great Jeff Sullivan was on the cusp with this Fangraphs post, but ultimately conducted an opinion poll in which readers were asked if they considered the top SP in each team’s rotation an “ace”. Both Jeff’s methodology and his conclusions underscore additional benefits of establishing objective, context-neutral parameters:

(NOTE: This isn’t a criticism of Jeff Sullivan or his post…he’s probably my favorite baseball writer by a wide margin, and his objectives with said post were not the same as my objectives with this post):

  1. Posted prior to the 2016 season, the content is contemporaneously relevant, and 71% of respondents considered Sonny Gray an ace. With statistically-rigid definitions of what an ace is, we could compare Sonny Gray’s performance at that point instead of laughing at the mere thought of being asked “Is Sonny Gray an ace?”. At this juncture I’d imagine Gray’s perception is that of a fringe starter who fills in when someone goes down. But is that what he really is? I don’t know, we haven’t established what makes a fringe starter either. With context-neutral definitions of each rotation spot, we can eliminate the contemporaneous relevance and easily make comparisons across seasons or even eras.
  2. Jeff concluded there were about 20 starting pitchers in Major League Baseball that most people would agree were aces, which makes us 10 shy of what we’d expect given the MLB definition of “ace” (the top starting pitcher on a team). While small year-to-year variances are to be expected, we should consistently find about 30 pitchers to fall within the parameters of acehood. So really, Jeff’s poll found there was a perception that 20 aces were active at the time – I contend that there were actually around 30, and roughly a third of them weren’t all that obvious. We want to eliminate the perception aspect with definitive criteria that undeniably establishes acehood.
  3. It turns out that the perception of an ace wasn’t completely performance-based (shocker!): pitchers from more talented rotations were penalized for being teammates with other good starting pitchers. Stephen Strasburg outperformed many of the pitchers who scored higher than him, yet only 57% of respondents considered him an ace – largely due to being in the same rotation as Max Scherzer (and probably injuries). While some may consider it fundamentally incorrect to label multiple pitchers from the same rotation “aces”, it’s going to be harder to convince me that a league-average pitcher who leads a rotation where he’s followed by 4 below-average teammates is more worthy of the ace label. Objectively speaking, an ace is unconditionally an ace based on performance (not on that of his teammates). The ace parameters will rid us of the perception penalty incurred by aces who are teammates with aces, and likewise the perception benefit bestowed on non-aces who overshadow their relatively inferior rotation mates.

Before we go any further, I want to make it clear that I’m writing under the assumption that an “ace” and a “#1” are synonymous. On a recent episode of Effectively Wild, Ben, Jeff, and Meg Rowley all bantered about how we define an ace, and even briefly attempted to distinguish the differences between an ace and a #1; not that they’re mutually exclusive, but it sounded more like the beginning of an LSAT logic game where ‘all aces are #1s, but not all #1s are aces…’ from what I gathered. I don’t want to strictly adhere to the MLB.com definition, but for the sake of this post, we’re going to at least continue under the assumption that aces and #1s meet the same defining criteria as each other.

Perhaps counterintuitively, the task of defining each role within a rotation is even more important given the lightening workloads of starting pitchers and, inversely, the increasing workloads of relievers. The paradigm shifts with caution, and no team should have a perennial Cy Young candidate throw anything less than the greatest quantity of innings he can possibly throw without sacrificing performance or health.

With the advent of the Opener, what truly constitutes a “Starting Pitcher” is becoming increasingly vague. It wouldn’t be much of a surprise to see some of the more traditional roles played by back-of-the-rotation starting pitchers completely disappear in the pretty near future. But it should be a little more than obvious that this evolutionary process isn’t necessary for all SPs, right? Perhaps the most likely progression begins with the teams under tighter budget constraints, those with relatively deeper relief corps than starting corps, and the ones just a little more forward-thinking. We saw the Rays unveil the strategy out of necessity, soon followed by the injury-stricken Athletics. But what was spawned initially out of necessity for the early adopters should presumably expand to teams doing it out of practicality.

But in the wake of all this, one puzzle we’re left to figure out revolves around the pitchers to cut from their traditional role – who should be sacrificed to this developing experiment?

I’m not going to try and answer that in THIS post, because we need to solve another puzzle as a prerequisite – the definition of each spot in the rotation. On one hand, it couldn’t be simpler; each spot is based on the order of talent within a given pool of starting pitchers, beginning with the most talented at the top. On the other hand, it’s a complex and generally subjective matter, albeit unnecessarily; a lot of credible baseball people might require seemingly arbitrary attributes, like a minimum fastball velocity for an ace, or more strikeouts than innings for anyone in the one or two spot. I’m not saying these ideas are necessarily incorrect either, but my goal is to wash away the ambiguity. Defining the performance expectations of each spot in the rotation can be done objectively by analyzing some key metrics and keeping the parameters simple.

First we’ll define the parameters. We know MLB’s definition of an “ace” is the best starting pitcher on a given team. We also concede that not every team has an ace because talent isn’t equally distributed. So how we divide the pitcher roles will be across teams rather than within them; this means “aces” will be the top 30 starting pitchers in MLB, not the single best starting pitcher from each of the 30 teams (which is how we’d determine acehood using MLB’s definition).

As easy as it is to envision the stereotypical grumpy baseball traditionalist reciting how only a few pitchers handled the majority of innings decades ago, 5-man rotations outnumbered all other combinations for the first time in 1926 (believe it or not, the 6-man rotation was actually more common than the 3-man rotation at that point). So we can call a rotation a pool of five starting pitchers without much controversy. However, given how improbable it is to expect the same 5 pitchers to make all their scheduled starts in a given year, every team generally has a 6th pitcher who can start (either in theory or occupying an actual place on the 25-man roster) whenever someone from the top 5 can’t. As a role every team has been forced to utilize, and the means by which many SPs crack their first rotation, the 6th spot is by no means trivial. So, while we’ll call a rotation a set of 5 SPs, we’re also saying they’re the top 5 from a pool of 6 pitchers. This establishes 6 tiers that, under optimal conditions, would be represented by sextiles (that’s what you call 6 equally-sized groups) of talent, where the first sextile holds the top 16.7% of talent, descending with each tier.
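The sextile split is straightforward to sketch: rank a pool of starters by some talent proxy (lower is better) and cut the pool into six equal tiers, Tier 1 being the top ~16.7%. The twelve scores below are invented:

```python
# Sketch of the sextile tier assignment described above. Scores are a
# stand-in for whatever talent proxy we settle on; lower is better.
# The twelve values are made up for illustration.

def assign_tiers(scores, n_tiers=6):
    """Map each score to a tier 1..n_tiers by rank (lower score = better)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    tiers = [0] * len(scores)
    for rank, i in enumerate(ranked):
        tiers[i] = rank * n_tiers // len(scores) + 1
    return tiers

scores = [70, 95, 110, 85, 120, 101, 99, 78, 130, 104, 90, 115]
print(assign_tiers(scores))  # two pitchers land in each of the six tiers
```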

Unfortunately, since true talent can’t really be quantified, we’ll have to proxy talent with performance metrics. Here I’m going to use ERA-, FIP-, and xFIP-. This lets us compare the metrics equally across different seasons, leagues, and parks, creating a context-neutral benchmark for comparison. I assume anyone who finds themselves on this blog is familiar with these three metrics and why they’re more useful than their slightly-more-traditional-non-minus counterparts. But if not, I highly recommend checking out their entries in the Fangraphs Glossary (you’ll learn a ton in like 5 minutes).

(NOTE: If you REALLY don’t feel like leaving the page, the key here is the number 100; 100 is average. An ERA-/FIP-/xFIP- under 100 is better than average, and anything above 100 is worse than average, with the absolute difference representing the percent better or worse than average. For example, a FIP- of 75 is 25% better (lower) than league average: 75 − 100 = −25. For normalized stats that end in “-”, any measure below 100 is good, while the opposite holds true for normalized metrics ending with a “+”, such as wRC+.)

Instead of using these metrics individually for our approximation of talent, I’m going to use their average. ERA often comes under fire because it’s a relatively poor predictor of future performance due to the amount of luck associated with its inputs – which is well warranted given that FIP, xFIP, and even K%-BB% all predict future ERA better than past ERA does. But I’m including ERA here because I don’t see any reason to omit past success as a component that defines an ace, or any other tier of a rotation, lucky or unlucky. However, since we’re attempting to approximate talent to define each tier, it’s important we limit the magnitude of ERA since much of its variance is fielding-dependent. We do this by including the other two metrics, FIP(-) and xFIP(-), both of which are obviously fielding-independent and rely exclusively on the pitcher. Furthermore, while each metric is results-based, the most forward-looking of them is xFIP, which predicts its future self better than both FIP and ERA predict theirs. So while xFIP might be the worst descriptor of what actually happened, it’s easily the best indicator of what will eventually happen. This is important because it makes future expectations a part of the equation.

Additionally, while it won’t be perfect given the incomparable year-to-year variance of each respective metric, the average also gives us an idea of the rough cutoff for each metric individually. So once we establish our cutoffs, we could say, “Player X had an ERA- of 99 but an xFIP- of 75. So he pitched like a #3 starter, but I expect him to pitch like an ace moving forward”.

So our talent proxy is simply the average of ERA-, FIP-, and xFIP-, which I’ll call MEAN-. Once we establish the cutoff for each sextile, our tiers will be defined. Using data from 2002 through 2018, I looked at every pitcher who threw at least 100 IP as a starter, calculated both their MEAN- and their respective MEAN- percentile rank, and here’s what we have:
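The MEAN- calculation itself is as simple as it sounds; the sample line below is illustrative, not a real player’s:

```python
# MEAN- as defined above: the simple average of ERA-, FIP-, and xFIP-.
# The sample inputs are made up for illustration.

def mean_minus(era_minus, fip_minus, xfip_minus):
    return (era_minus + fip_minus + xfip_minus) / 3

print(mean_minus(80, 85, 90))  # 85.0, i.e. 15% better than a league-average SP
```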

While splitting our data into sextiles gives us the mathematical explanation as to why this happened, at first glance it might seem odd to see Tier 4 begin with the league average MEAN-…because league average should be a #3, shouldn’t it? Actually it shouldn’t. There’s a reason top pitching prospects are often given labels that imply something as seemingly underwhelming as a “3rd starter” – it’s because 3rd starters are (barely) above average pitchers. Sure, they’re seen as the midpoint in the rotation, but they’re only the midpoint when the best 5 options make all their scheduled starts, themselves included. At some point, every team utilizes their 6th option, with few exceptions. In 2018, the Indians and Rockies used the fewest starting pitchers with 7, while the average big league team utilized 12. Starting pitchers whose innings total ranked 6th or lower on their respective teams accounted for 18.8% of starting pitcher innings – only the top ranked starting pitcher (and presumable ace) accounted for more with 21%. This helps explain why the 4th Tier is where league average goes, and not the 3rd Tier.

The table above shows some average performance metrics of the starting pitchers within each tier dating back to 2002. Everything descends or ascends in the order you’d expect it to, but one interesting thing about the table is the WAR column. Tiers 2 through 6 are separated pretty evenly, ranging anywhere between a 0.6 and 0.8 WAR differential with the adjacent tier. The exception is Tier 1 (our Ace Tier), which is a full 1.5 WAR ahead of Tier 2. We can see this more clearly in the table of average WAR by tier; the linearity holds steady for the most part in tiers 2 through 6, only to slope sharper from 1 to 2. So even while we’ll find roughly the same number of pitchers within each tier on an annual basis, upgrading from a Tier 3 pitcher to a Tier 2 pitcher won’t yield the same improvement you’d see from upgrading a Tier 2 to a Tier 1. The roughly equal tier-by-tier difference in WAR from the bottom 5 tiers suggests we get essentially flat marginal returns from any single-tier upgrade unless we’re adding a Tier 1 guy (an ace!).


That may have been tough to follow, but let me put it another way. Let’s say you’re a GM headed into the offseason with the goal of upgrading your rotation via trade. For the sake of this hypothetical, you’re only able to offer one trade package comprised of a starting pitcher from your current rotation, a prospect, and cash. In return, you’ll receive a starting pitcher that’s 1 tier better than the SP you’re trading away (the prospect and cash are irrelevant other than making the tier downgrade worthwhile for your trade partner). We’ll hold the prospect and cash fixed, so the only part of the offer you can change is the tier of the pitcher you give up, and therefore, the tier of the pitcher you receive. So here’s what you’re looking at in the trade for a new SP:

  • Assume your 5-man rotation is comprised of a starting pitcher from each of the top 5 tiers
  • You also have a Tier 6 pitcher you use as a spot starter
  • Your ace is the only pitcher you’re unable to trade
  • If you give up a Tier 6, you’ll receive a Tier 5    (~0.8 net WAR)
  • If you give up a Tier 5, you’ll receive a Tier 4    (~0.7 net WAR)
  • If you give up a Tier 4, you’ll receive a Tier 3    (~0.6 net WAR)
  • If you give up a Tier 3, you’ll receive a Tier 2    (~0.8 net WAR)
  • If you give up a Tier 2, you’ll receive a Tier 1    (~1.5 net WAR)

The right thing to do here is to give up your Tier 2 pitcher, so you end up getting a Tier 1 SP. Sure, you get two aces in the rotation now, but the reason for giving up your #2 isn’t as simple as ‘adding an ace’. The reason you gave up your Tier 2 for a Tier 1 is because it represented the only offer with a marginal upgrade compared to what was on the table. In other words, the added benefit from swapping a Tier 6 with a Tier 5 is roughly the same as the added benefit from swapping a Tier 5 for a Tier 4, a Tier 4 for a Tier 3, and a Tier 3 for a Tier 2.
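The hypothetical above boiled down to code, using the rounded net-WAR gaps from the bullet list (the tier numbers and WAR figures are the ones quoted; everything else is a toy):

```python
TIER_SWAPS = {
    # tier given up -> (tier received, approximate net WAR added)
    6: (5, 0.8),
    5: (4, 0.7),
    4: (3, 0.6),
    3: (2, 0.8),
    2: (1, 1.5),  # the ace premium
}

def best_swap(tradeable_tiers: list[int]) -> tuple[int, int, float]:
    """Return (give, receive, net WAR) for the highest-value one-tier upgrade."""
    give = max(tradeable_tiers, key=lambda t: TIER_SWAPS[t][1])
    receive, gain = TIER_SWAPS[give]
    return give, receive, gain

# Rotation holds tiers 1-5 plus a tier-6 spot starter; the ace is untouchable.
print(best_swap([2, 3, 4, 5, 6]))  # (2, 1, 1.5): trade the #2, land an ace
```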

Since I have a habit of overexplaining things, I’ll end with some examples of each tier using numbers from the 2018 season. The table of 2018 Tier Examples shows 5 randomly selected pitchers from each tier, just to give readers a better idea of who falls in line with a given tier.

#pitching

How to Identify Bounceback Candidates (Pitcher Edition)

Okay, a lot of people think ERA sucks. Sure, I don’t really disagree, in the sense that it’s luck-laden and a poor predictor of future performance. It’s a shallow measure, but it still seems to get the best of decision-makers even at the highest levels; Jon Gray was left off the Rockies’ playoff roster after posting a 5.12 ERA that wasn’t really compatible with his 9.6 K/9 and 2.72 BB/9. Domingo German couldn’t stay in the Majors with his 5.57 ERA in spite of striking out nearly 11 per 9 and walking 3.5 per 9.

This isn’t a defense of ERA by any means – it’s not. This is a guide to figuring out whose 2019 ERA is (probably) going to be better than their 2018 ERA, and it’s pretty simple. Fangraphs features a metric called “E-F”, which is simply a pitcher’s ERA minus his FIP. This can give us some idea of how representative the pitcher’s ERA actually is – grossly oversimplified, it gives us a measure of luck. The following facts have been fairly well documented, but just as a refresher, I want to reiterate them:

  • ERA is a relatively poor predictor of future ERA
  • FIP is a better predictor of future ERA but still not great
  • xFIP is a better predictor of future ERA and future FIP than both ERA and FIP

Results-based analysis is tricky business, but it’s not totally unreliable when done correctly. ERA is far from the ideal indicator of a pitcher’s ability – a shortcoming addressed by FIP, which itself still includes a lot of noise that’s washed away in xFIP. Inputs that show little or no year-to-year correlation, such as HR/FB% or BABIP, are controlled for by applying constants in the calculation of xFIP, which is why it’s probably the best metric we have for evaluating how good a pitcher has been, at least in the same context as ERA. Unfortunately, fans, fantasy leagues, and the general consumption of baseball continue to emphasize ERA in spite of its obvious shortcomings, probably due to a fear of adaptation. So even though it would be more practical (and easier) to predict future xFIP, we’re going to predict future ERA with xFIP, since it’s still the best we’ve got.

Let’s check out the correlation matrix of ERA predictors I put together. This uses all big-league pitchers from 2010-2017 with at least 30 IP in a given half-season who also threw at least 30 IP in the subsequent half-season. I did notice that the within-period correlations aren’t identical in both time periods (ERA’s respective correlations to FIP and xFIP are .67 and .49 in t=0, but .70 and .55 in t+1…this still occurs even when ERA-/FIP-/xFIP- are used instead, so my working theory is that it’s just a matter of pitchers gaining consistency with an additional year of experience, but that’s another post for another day). We can see that each of the bullet points above is reflected in the matrix, and that xFIP does a much better job of predicting the future than any other metric. So what am I trying to prove here? That xFIP is a super useful metric that isn’t used enough for predictive analysis! And unlike ERA, xFIP is a superb predictor of itself, which is why I highlighted that particular part of the matrix and added the chart on xFIP predictability. Worth noting is that the full-season correlation between ERA and xFIP is a much better-looking 0.64, compared to the half-season correlations shown in the matrix, so being able to predict xFIP from one period to the next is pretty valuable.
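For anyone who wants to rebuild a matrix like this, here’s a rough sketch, assuming you’ve already paired each pitcher’s half-season metrics (t=0) with his subsequent half-season (t+1); the dict-of-lists data structure is hypothetical, not Fangraphs’ export format.

```python
from math import sqrt

def pearson(xs: list[float], ys: list[float]) -> float:
    """Plain Pearson correlation between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def predictor_matrix(t0: dict[str, list[float]],
                     t1: dict[str, list[float]]) -> dict[tuple[str, str], float]:
    """Correlate every t=0 metric with every t+1 metric (rows align by pitcher)."""
    return {(a, b): pearson(t0[a], t1[b]) for a in t0 for b in t1}
```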


So now that I’ve emphasized the value of xFIP versus the other metrics as predictors with some visual overkill, I’m going to rework the Fangraphs’ metric I mentioned earlier: instead of E-F (ERA-FIP), we’ll be using E-X (ERA-xFIP).

Let’s set up some definitions that will apply to the remainder of this post:

  1. Overachiever – A pitcher whose xFIP exceeds his ERA. In this case, E-X is negative.

    2018 Example: Wade Miley; 2.57 ERA / 4.30 xFIP / -1.73 E-X with MIL

  2. Underachiever – A pitcher whose xFIP is less than his ERA. In this case, E-X is positive.

    2018 Example: Marcus Stroman; 5.54 ERA / 3.84 xFIP / 1.70 E-X with TOR
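The definitions above in code form, checked against the two 2018 examples (a pitcher with an E-X of exactly zero is arbitrarily lumped with the underachievers here):

```python
def e_minus_x(era: float, xfip: float) -> float:
    """E-X: ERA minus xFIP, rounded to two decimals."""
    return round(era - xfip, 2)

def label(era: float, xfip: float) -> str:
    """Negative E-X = ERA flattered by luck; positive = ERA worse than deserved."""
    return "overachiever" if e_minus_x(era, xfip) < 0 else "underachiever"

miley = e_minus_x(2.57, 4.30)    # -1.73 -> overachiever
stroman = e_minus_x(5.54, 3.84)  #  1.70 -> underachiever
```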

The intuition here is simple enough – overachievers are due for positive regression (remember that “positive” is bad when it comes to ERA/FIP/xFIP) and underachievers are due for negative regression. In other words, pitchers with a negative E-X should see their ERAs increase, while pitchers with a positive E-X should see their ERAs decrease. I said “should”, but I really mean “do”, because the effect is quite robust when we use aggregated data. The first chart looking at ERA changes from 2017 to 2018 suggests that, while E-X is a good indicator of the direction a pitcher’s ERA is headed, underachievers appear to be more predictable than overachievers – at least using non-normalized metrics.

[Chart: 2017→2018 ERA change by E-X – standard ERA & xFIP]

Now since ERA is known to fluctuate over time and we need normalized metrics to compare across eras, I wanted to see how predictability changes (if it does at all) when we use ERA- and xFIP- instead of standard ERA and xFIP. Here, the effect is consistent across both groups (both overachievers and underachievers). Take a look at the chart below:

[Chart: 2017→2018 ERA change by E-X – normalized (ERA- & xFIP-)]

This tells us that roughly 73% of overachieving pitchers in 2017 saw a rise in their 2018 ERA, while an almost identical portion of 2017 underachievers (72%) saw a decline in their 2018 ERA. That means, with respect to this sample, nearly three-quarters of the time we accurately predicted the direction of future ERA by subtracting xFIP- from ERA-. This is pretty powerful, but it’s limited in the sense that we’re looking at a binary prediction – it’s yes or no; while we can reasonably expect the ERA to increase or decrease, we don’t know by how much. And we all know to be skeptical when sample sizes are small; just 169 pitchers threw at least 40 IP in both 2017 and 2018, so let’s see what happens when we have a sample 8.5 times larger than what’s reflected in the 2017/2018 chart…

[Chart: half-season ERA change by E-X – normalized (ERA- & xFIP-)]

And there you go; 71% of overachievers saw their ERA go up in the subsequent half-season, and 72% of underachievers saw their ERA go down – basically unchanged from the previous chart. Here, time is grouped into half seasons rather than full seasons, which gives us an even greater sample to look at. So E-X is legit when it comes to predicting improvement or decline, but why not build on that if we can? If we’re trying to identify bounceback candidates, wouldn’t it be nice if we could know exactly how likely it is that a pitcher’s ERA will be lower next season (or next half-season) than it was in the most recent one?

Obviously the answer is ‘yes’, so I modeled the probability of ERA improvement using E-X as the sole explanatory variable, running a logistic regression on the binary outcome of whether or not ERA improved in half-season t+1. The summary statistics are shown below, as well as how to calculate the probability.

N=1454

Calculating the probability estimate of this model isn’t like a typical linear regression, so if you wanted to apply it to a particular pitcher on your own, here’s how it works:
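Here’s a sketch of that calculation, using the rounded coefficients (intercept -0.06, slope 0.059) that appear in the worked examples later in the post. The published probabilities come from unrounded coefficients, so this version lands within a point or so of them.

```python
from math import exp

INTERCEPT, SLOPE = -0.06, 0.059  # rounded; the fitted model carries more decimals

def p_era_improves(e_x: float) -> float:
    """Estimated probability ERA improves next (half) season, given E-X in ERA-/xFIP- points."""
    return 1 / (1 + exp(-(INTERCEPT + SLOPE * e_x)))
```

With E-X = 16 (Sonny Gray’s 2018 figure below), this returns roughly 0.708, a shade under the 71.8% the full-precision model produces.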

So rather than going through too much more math, let’s move on to what the model tells us by using the probability of ERA Improvement chart:

This shows us the estimated probability of a given pitcher improving his ERA in the next time period (in this case, half of a season), based on his E-X in the most recent period. While the model is built off half-season samples, we can reasonably apply it to different time spans that occur consecutively, like a full season. (We don’t want to stray too far from the half-season, though, because we’d fail to account for a lot of player-specific changes that might occur between the two periods. For example, we wouldn’t want t=0 to be the last 5 years, trying to predict improvement over the next 5 years, because a lot could change with the pitcher we’re looking at: his pitch mix might change, his velocity almost certainly will, perhaps Tommy John surgery, etc.) So, at an E-X of 0, we see the probability of improving ERA is 50%, which is right where we’d expect it to be (actually it’s 49.8% if we take it out to the thousandths place…the absolute probability difference between an E-X of 0 and -10 is almost the same as the difference between 0 and +10, but I kept the probability estimates to two decimal places for the sake of simplicity). The greater the E-X in the most recent (half) season, the more likely it is the pitcher’s ERA will drop in the next (half) season; even though only 18% of pitchers post E-Xs of at least 20, it’s certainly worth noting that their probability of improvement is better than three-quarters. Even more rare is an E-X of 40 or greater, which occurs just 4% of the time but is practically a guarantee of improvement at 91%.

So just for fun, let’s apply the model to a pitcher using his 2018 E-X, and determine the probability that his ERA will improve. One guy a lot of people might be curious about is Sonny Gray; are greener pastures ahead for Sonny in 2019? Or was all that chaos in New York City the catalyst to an irreversible downward trend? Well…let’s find out!

2018 Sonny Gray – NYY

ERA: 4.90    xFIP: 4.10

ERA-: 113    xFIP-: 97

E-X = 113-97 = 16    Now we’ll apply the model…

1/(1+e^-[-0.06+{0.059*16}]) = 0.718

Estimated probability of improvement is 71.8%! So Sonny Gray’s got a pretty good shot at being a better pitcher in 2019 than he was in 2018.

Let’s do another…how about NL Cy Young Award winner Jacob DeGrom? DeGrom had an absolutely insane year that a bunch of morons tried discrediting at various stages, but most of the people reading this are probably aware of how special it actually was. So how likely is it that DeGrom could be even better next year?

2018 Jacob DeGrom – NYM

ERA: 1.70    xFIP: 2.60

ERA-: 45    xFIP-: 64

    E-X = 45-64 = -19

1/(1+e^-[-0.06+{0.059*-19}]) = 0.245

So the model gives DeGrom a 24.5% shot at improving his ERA in 2019, which isn’t that bad considering there’s not much room for improvement when your ERA is 1.7…the closer you get to 0, the more improbable improvement becomes!

Instead of continuing with random case-by-case examples, I added a few names to the probability chart to go along with Sonny Gray and Jacob DeGrom. I also built a table of 25 semi-randomly selected pitchers alongside their 2018 numbers and their respective 2019 ERA improvement probabilities. One thing that’s fairly clear, though also quite intuitive, is that it’s difficult to improve upon good performances; DeGrom, Max Scherzer, and Justin Verlander are unlikely to be better in 2019 than they were in 2018, largely because they were just so good. Applying that same intuition to the other end of the spectrum, it’s pretty easy to improve on bad performances – Clayton Richard is almost certainly going to be better in 2019 because he set the bar so low. Those are the predictable cases – the ones in which the probability model does nothing but reaffirm what we basically already knew. Among those shown in the table, the more interesting cases are those of Josh Hader and Carlos Carrasco, both of whom enjoyed incredible 2018 seasons and are actually more likely than not to improve in 2019. There are also a few names not shown in the table who are in the same boat as Hader and Carrasco, such as Patrick Corbin, Dellin Betances, Ross Stripling, and Edwin Diaz – all of them are likely to improve in 2019 after being phenomenal in 2018.

#aaron-sanchez, #alex-cobb, #anibal-sanchez, #carlos-carrasco, #chris-sale, #clayton-richard, #dallas-keuchel, #dellin-betances, #domingo-german, #e-f, #e-x, #edwin-diaz, #edwin-jackson, #era, #jacob-degrom, #jake-odorizzi, #jakob-junis, #joe-musgrove, #jon-gray, #jose-quintana, #jose-urena, #josh-hader, #justin-verlander, #kenta-maeda, #kyle-freeland, #madison-bumgarner, #marcus-stroman, #matt-harvey, #max-scherzer, #michael-fulmer, #mike-leake, #patrick-corbin, #pitching, #pitching-projections, #rich-hill, #robbie-erlin, #ross-stripling, #sonny-gray, #tyler-anderson, #tyler-mahle, #wade-miley, #xfip

Fun Times with MLB Trade Data

NOTE: I wrote this just after the 2018 Non-Waiver Trade Deadline but didn’t post it until November.  I figured it was still a worthwhile read, so here it goes…

The final hours approaching the 2018 MLB non-waiver Trade Deadline were characterized by a frenzied cluster of mostly-pedestrian exchanges that were overshadowed by one headline-grabber (Chris Archer to the Pirates). Perhaps the most noteworthy part of the 2018 deadline wasn’t any particular trade, but the sheer volume of exchanges that went down. After tuning into episode 1249 of Effectively Wild last week, I wanted to pay especially close attention to the teams making deals as the deadline approached – not necessarily the players getting traded – but the teams making the trades. On the EW podcast, Ben and Jeff brought in Adam Fisher (formerly of the Braves’ and Mets’ front offices) to discuss the anatomy of trades, which piqued my interest in who traded what, and why they might have done it (outside their obvious needs lining up). Adam seemed to imply that the rapport established between front offices goes a long way in initiating the discussion between teams who ultimately make a deal. Anecdotally this sounded about right; prior to even looking at the data, it seemed to me (an A’s fan) that half the deals made by the A’s the last few years involved the Nationals…I’ve been under the assumption Billy Beane and David Forst are more cordial with Mike Rizzo than they are with most other GMs. So I set out to find who the preferred trade partners of every team are, the teams that make the most deals, the ones who make the least, and a multitude of additional frivolities to captivate the masses.

So in this post, we’re looking at deals made since the 2011 World Series – a somewhat arbitrary cutoff, but it’s an approximation of when all teams were operating with the knowledge that divisional alignment would look like it does today. While the Astros hadn’t yet moved to the AL West, their front office was aware their primary future competition would be comprised of the Angels, A’s, Mariners, and Rangers, and we assume that knowledge influenced their trading activity, along with that of the teams in their current and future divisions.

One thing of note – when we aggregate trade data and compare team totals, we count each trade per team involved; when the Rangers dealt Cole Hamels to the Cubs, that was a deal for the Rangers, and also a deal for the Cubs. Since we’re looking at each trade from the perspective of a given team, the Hamels deal will be counted as two deals. I recommend not thinking too hard about it.

Trade Frequency & Game Changing GMs

Unsurprisingly, Jerry Dipoto’s Mariners lead the pack by a wide margin with 93 trades – 14 more than the next highest figure: 79 deals made by the A’s. The 14-trade margin between 1st and 2nd place is wider than any margin from consecutively ranked teams. Interestingly enough, Dipoto’s only served as the Mariners’ GM since 2015 after he was let go by the Angels, for whom he didn’t exhibit the same trigger-happy tendencies he has since his arrival in the Pacific Northwest. Nevertheless, in the 2.5 years Dipoto’s been in Seattle, he’s made 60 deals. Compare that to the 32 deals made by the Mariners the previous 4 years, and it’s pretty remarkable – not necessarily for better or worse, but remarkable in how much it stands out.

Just by looking at the chart of total trades since the 2011 World Series, we can see a gradual, pretty linear slope from the A’s down to the Cubs, who complete the top half of MLB. In contrast, the bottom half is distinguished by a steeper slope that begins with the Marlins and ends with the Giants. The visually evident divide between the top and bottom half illustrates just how top heavy trade behavior tends to be; the upper 50% has out-traded the bottom 50% 1,023 to 599 in the last six years.

Jerry Dipoto’s propensity for trading for trading’s sake has been noticed by almost everyone. With that said, Erik Neander of the Tampa Bay Rays has actually traded even more frequently than Dipoto since his promotion after the 2016 season, with roughly one more trade per year than the Mariners’ GM. And while Dipoto’s wheeling and dealing has been considerably more frequent than that of his predecessor, Jack Zduriencik, the Mariners’ change in trade frequency ranks third behind Neander and Mike Girsch of the Cardinals. Girsch has bumped the Cardinals into the middle of the pack as far as trade frequency goes, but that’s a mighty leap for a team that was among the most infrequent traders under John Mozeliak.

Probability Chicken Scratch: Who trades (and doesn’t trade) with whom?

In absolute terms, the Rays and Mariners connect on more deals than any other duo in baseball; since the 2011 World Series, they’ve struck an accord 11 times. Once we get past the 5 most common trade partnerships, this gets far less interesting, as the frequency of ties increases rapidly with every subsequent move down the rankings (that is, until we reach 1 trade, at which point 117 team pairs are tied…93 pairs are tied at 0).

What’s probably more interesting to investigate is the proportion of all trades made by team X that involve team Y. Unlike the totals, the trade proportions give us a better idea of which teams are more inclined to strike a deal with a particular team when a deal is imminent. For example, since the conclusion of the 2011 season, the Mariners have made 9 deals with the Yankees. Meanwhile, the Nationals made 9 deals with the A’s. The proportions show us that 9.7% of Mariners trades involve the Yankees, whereas 20.5% of Nats trades involve the A’s. So while a Nats-A’s deal has occurred equally as often as an M’s-Yankees deal, it’s roughly twice as likely that any given deal involving the Nationals also involves the A’s. Put another way – the Nats involve the A’s in 1 out of every 5 deals they make, while the M’s deal with the Yankees in 1 out of every 10.
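The arithmetic here is trivial, but as a sketch (the Mariners’ 93-trade total is the one quoted earlier; the 9 Yankees deals come from the paragraph above):

```python
def partner_share(deals_with_partner: int, total_deals: int) -> float:
    """Fraction of a team's trades that involve one specific partner."""
    return deals_with_partner / total_deals

# 9 of the Mariners' 93 post-2011 deals involved the Yankees:
sea_nyy = partner_share(9, 93)  # ~0.097, i.e. roughly 1 in 10
```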

On the flipside, team pairs that never trade with each other would be equally as interesting (if not more so) were it not so unremarkably common; 93 team pairs have combined for 0 trades. Hypothetically, however, if we assume trade partners were randomly selected by the initiating team (this obviously isn’t the case), the probability that team Y isn’t involved in any of team X’s trades becomes worse than a coin flip after X’s 20th consecutive trade not involving Y. This means the further a team goes past 20 trades without involving a specific team, the less likely it is to be random chance, and the more likely it is that those teams intentionally avoid dealing with each other. That all holds under an assumption we know isn’t true; teams don’t randomly select their trade partners. Instead, trade partnerships are presumably a function of both teams’ front-office rapport, current records, temporal outlooks (long term or short term), present needs, financial flexibility, competing bids, geographic markets, division placement, and other variables. And because these deals aren’t sought out randomly, it stands to reason that exclusion from trades has an even smaller threshold at which intentional aversion becomes more likely than not (i.e., before the 20th consecutive trade of exclusion). Since we don’t know the true probability function of trade agreements, we can approximate it under the assumption that trade partners are randomly selected (which, as established, gives us a more conservative estimate than whatever the true function would).
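The coin-flip threshold can be verified directly: under uniform random selection a team has 29 possible partners, so the chance a specific team is absent from n straight trades is (28/29)^n.

```python
def p_never_partnered(n_trades: int, n_partners: int = 29) -> float:
    """P(team Y absent from all n of team X's trades) under uniform random selection."""
    return ((n_partners - 1) / n_partners) ** n_trades

# First trade count at which exclusion-by-chance drops below 50%:
threshold = next(n for n in range(1, 200) if p_never_partnered(n) < 0.5)
print(threshold)  # 20
```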

So who avoids whom? To make our data richer and get a better idea of which teams really don’t like dealing with each other, we’re going back 4 more years and starting at the 2007 World Series. This gives us 700 additional trades to observe, but more importantly, 700 more trades in which 47 pairs of teams failed to complete a deal together. How likely is it that these are coincidences? In every case, it’s pretty unlikely. In some cases, it’s almost impossible. Incorporating each team’s share of all MLB trades since 2007 and applying it to probability, we can easily calculate how probable it is that each pair hasn’t yet made a deal to this point, assuming that trade partners are randomly selected by the team initiating a deal.

The table tells us a lot, but to make it easier, we’ll look at a few examples. Let’s take the pair at the very top – the Rays and Jays. They haven’t made a deal together in over a decade. The table also tells us the Rays made 92 deals with everyone who’s not the Blue Jays, while the Blue Jays made 85 deals with everyone who’s not the Rays. The probability of the Rays excluding the Blue Jays for 92 consecutive deals just by random chance (unintentionally) is 0.0785%; if we subtract that from 1, we get the probability of the Rays intentionally excluding the Blue Jays for 92 consecutive trades, or roughly 99.9% (5th column). The Blue Jays are also 99.9% likely to have avoided the Rays (6th column). The probability that at least one of the teams is avoiding making a deal with the other is a near certainty at >99.9% (7th column), while the probability of both teams avoiding each other is 99.8% (last column).

At the bottom of the table we find two seldom traders in the Cardinals and Giants, who, among all the pairs, may be the least surprising – at least among the non-divisional, non-market-sharing teams. So while it’s still very probable that someone’s avoiding someone, it’s no guarantee. The Cards have an 80.7% probability of intentionally avoiding the Giants, while the Giants have a 76.9% chance of intentionally avoiding the Cards. The probability of Team 1 avoiding Team 2 isn’t the same as the probability of Team 2 avoiding Team 1 because they don’t trade with equal frequency; since the Cardinals have made 5 more trades than the Giants, they’re slightly more likely to be avoiding them than vice versa. When a team trades once in a blue moon, they’re not avoiding a particular team so much as avoiding trading altogether. What we see with the Giants, in comparison to other teams, is general trade aversion; rather than an aversion to trading with any specific team, they’re simply less inclined to trade with anyone and everyone. Their aversion is relatively less discriminating than other teams’, though it’s still likely they’re avoiding deals with some teams.

A random point – the Dodgers and Tigers never trade with each other and it’s really weird. They’ve had different GMs/philosophical approaches/general front office personnel, and they’ve both made tons of trades with everyone else. But the last time the two teams traded with each other was in 2004…when Cody Ross was sent to the Dodgers for Steve Colyer! The Tigers also don’t trade with the Padres though, so maybe they’re just anti-SoCal NL teams. The Yankees and Royals also haven’t made a deal since 2000, but there’s been less turnover in their respective front offices, so maybe Brian Cashman and Dayton Moore aren’t the best of friends.

The Understandable Side of Trade Aversion

Notice how a lot of the non-trading pairs are either in the same division or the same geographic market. This isn’t random chance either. As often as writers and fans label trade “winners” and “losers”, the fact is deals are generally made between teams with at least some expectation of mutual benefit; GMs with reputations for slighting trade counterparts won’t find themselves in the good graces of their colleagues for long. When a deal is agreed upon, we often see present value being exchanged for future value (proven veterans for prospects). How much a team competes with a given trade partner is undeniably a factor when a trade is consummated, and games are just one arena of competition; leagues, divisions, free agents, fans, employees, and TV ratings can all be at stake depending on the teams involved in a given deal. Adam Fisher mentioned this on Effectively Wild when he explained why there was almost no way Chris Archer or Jacob DeGrom would ultimately be dealt to the Yankees. In the case of Archer, it would represent a deal in which a team on the cusp of competing (the Rays) sends a controllable and productive asset to a team that represents more than 10% of their scheduled opponents annually and competes for the same division. The benefit may be mutual, but you’re still offering a benefit to a team that has an edge in every competitive aspect imaginable – why contribute to it? In the case of DeGrom, you’re competing not so much on the field as off it; the benefit of sending one of your biggest stars across town won’t come to fruition for years, while alienated Mets fans find themselves capable of watching winning baseball if they can only convince themselves to jump ship. When you’re the #2 show in town, don’t send your biggest assets to the #1 show; you may not be on-field competitors, but economically, there’s no bigger competition.

Do teams actually prefer sending players outside their league altogether? Well, 52% of deals are interleague deals, meaning over half the trades that go down involve players swapping leagues. The difference between that and random chance is not statistically significant (15 of a team’s 29 potential trade partners are in the other league…that’s 52%). Some specific teams, however, may be more averse to intra-league trading than others; the Diamondbacks, Cubs, and Brewers trade within the NL less than a third of the time. Curiously, the Reds are highly averse to intra-division trading, yet they exhibit a strong preference for trading within the National League, at better than 3 out of every 4 trades.

However, as far as divisions go, we do see statistically significant trade aversion between teams residing in the same division. 8% of all trades are intra-division, while random chance would place intra-division trading at 14%. Some teams are extremely reluctant to deal in their division – most notably the Cubs and Indians, who’ve only made one intra-division deal in nearly 7 years. It was an underappreciated quasi-milestone when the Tigers sent Leonys Martin to the Indians in the final hour before the deadline this season, because it broke a streak of 8 consecutive years of non-divisional trading for the Indians.

While most teams are understandably less trigger happy when it comes to dealing with teams in their own division, three teams out-trade the random chance rate of 14% (it’s really 13.7%) within their division:

  1. The Houston Astros (15.7%)
  2. The New York Mets (15.6%)
  3. The Oakland A’s (16.5%)

There’s not much to glean here since they’re barely above 13.7%, but maybe there’s something to it – the Astros and A’s are often commended for their smart and savvy front offices. Intra-divisional trades are rarely blockbusters, and typically involve deals made at the margins. When so many others seem to prefer stagnation over marginal improvement if it requires dealing with a division rival, it’s worth noting that improvement may be worthwhile regardless of who else it involves or benefits.

Ten teams find themselves sharing a market with another – the Yankees and Mets, the Dodgers and Angels, the Cubs and White Sox, the Giants and A’s, and the Nationals and Orioles. These teams almost never trade with each other. In fact, since the 2011 World Series, these ten teams have accounted for 569 trades; they’ve made a total of 3 – the Dodgers and Angels made 2, and the Cubs and White Sox made the Quintana deal. Go back 4 more years and they account for 791 trades collectively; they’ve still made just those 3 deals. That’s 0.76% of trades (again, you have to double count each deal so we have it from the perspective of each team involved, so it’s 6/791 instead of 3/791) – random chance would have it at 3.4%, so this is quite significant.
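As a quick check on the same-market numbers (counts as quoted above, with each deal counted once per team involved, hence 6 rather than 3):

```python
# 3 same-market deals since the 2007 World Series, double-counted to 6,
# against 791 total trades made by the ten market-sharing teams.
observed = 6 / 791
# Baseline: each of these teams has exactly 1 same-market partner among
# its 29 possible partners - the uniform-random assumption used above.
baseline = 1 / 29
print(f"{observed:.2%} observed vs {baseline:.2%} expected")  # 0.76% vs 3.45%
```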

Going back to what inspired this particular body of research, Adam Fisher mentioned on Effectively Wild that it’s difficult to make a trade with a team you compete with for fans. Economists might refer to market-sharing teams as substitute goods, where the slightest shift in appeal for Team X’s substitute (call it Team Y) changes the relative appeal of Team X to the consumers of baseball within a given market. Let’s say the Nationals were buyers instead of whatever they actually were this year – and they managed to pry Manny Machado from the Orioles by outbidding the Dodgers. Obviously the Nationals were already the more appealing team between them and the Orioles in 2018, but now they’ve just added a premier player for the remainder of the year. The Nationals’ product is even more appealing than it already was, while the Orioles become even less appealing. The Nats just took a considerable chunk of the O’s marginalized fans with that addition; there are the local Manny Machado fans, the O’s fans who root for the Nationals as their NL team (the Nats fans who root for the O’s as their AL team left any thought of Camden Yards a while back), the pissed-off O’s fans looking for any reason or excuse to root for someone or something else, and maybe even those just trying to catch baseball at the highest level. Sure, those people are a small portion of gate revenue and merchandise, and there’s a reasonable chance they’ll be back at some point in the future, but you could’ve avoided those losses by not improving a competing product. By sending Machado to the Dodgers, the Orioles don’t directly improve the in-market competition; the Nationals may be relatively more appealing as a baseball product than they were prior to Machado leaving for LA simply because the Orioles got worse, but the Orioles didn’t proactively improve the Nationals at their own expense.

Performance Returns from Trades

Now for a little controversy – if we can even call it that. It’s hard to objectively evaluate trade returns for a couple of reasons. The first is the obvious time factor: trades for prospects who are years from the Majors will carry a temporal bias until those prospects debut, are traded again, or leave baseball permanently. The second is the varying objectives between trade partners, which are difficult to quantify for rentals who help a team win a championship or even boost them into the postseason – especially if they’re traded for prospects who go on to contribute positively at the MLB level. It’s difficult to say a team lost a trade that helped them win the ultimate prize.

But, as it goes, we try to be objective with the numbers and take note of the caveats. Table 7 shows the WAR acquired and lost by each team since the 2011 World Series, sorted by net WAR gained per trade. What’s interesting here is that the seldom-trading Giants actually lead all of baseball in net WAR gained per trade, adding nearly 1.2 Wins Above Replacement per deal. In other words, the Giants rarely make trades, but when they do, they get more out of them than anyone else. The Reds are the antithesis of the Giants’ economical trade behavior, giving away 1.05 WAR per deal. Like the Giants, the Reds rarely engage in trades, but unlike the Giants, they get the least out of them – a full win worse per trade than a replacement-level player they theoretically already possess.
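The net WAR per trade column could be computed along these lines. The trade records below are made up purely for illustration (the real Table 7 aggregates every trade since the 2011 World Series), but the bookkeeping is the same:

```python
from collections import defaultdict

# Hypothetical trade records: (team, WAR acquired, WAR lost)
trades = [
    ("Giants", 3.0, 1.5),
    ("Giants", 2.0, 1.1),
    ("Reds",   0.5, 1.8),
    ("Reds",   1.0, 1.8),
]

# Accumulate each team's WAR acquired, WAR lost, and trade count
totals = defaultdict(lambda: {"acquired": 0.0, "lost": 0.0, "n": 0})
for team, acquired, lost in trades:
    totals[team]["acquired"] += acquired
    totals[team]["lost"] += lost
    totals[team]["n"] += 1

for team, t in totals.items():
    net_per_trade = (t["acquired"] - t["lost"]) / t["n"]
    print(f"{team}: {net_per_trade:+.2f} net WAR per trade")
# Giants: +1.20 net WAR per trade
# Reds: -1.05 net WAR per trade
```

The made-up numbers were chosen so the per-trade figures land on the +1.2 (Giants) and -1.05 (Reds) values cited above.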

The last 4 columns split the trades into two separate time periods: the 2011 World Series up to the 2014 World Series, and the 2014 World Series up to now. This gives us an idea of how much of the data in the middle columns comes from trades old enough that evaluating the outcome is appropriate, and how much is too recent to judge. Comparing the two periods makes only a handful of teams look noticeably different from what the middle columns suggest. The biggest discrepancy belongs to the Tigers: the last 3.5 years versus the 2012-2014 period certainly represents a divergence in the on-field product, but this is nearly a 90 WAR swing! By comparison, the A’s were both good and bad over the same periods, and Oakland’s swing is 49.4 WAR. Perhaps worth noting is that the Tigers’ initial period was entirely Dave Dombrowski, while their latter period was about a quarter Dombrowski and the remainder Al Avila.

If I weren’t so cautious, I might label Dave Dombrowski a far better GM than I’d ever given him credit for; his Red Sox trades have been far more beneficial than the pre-Dombrowski Sox trades. And there’s no denying his Tigers trades look a lot better at the moment than Al Avila’s, even if the jury is still out on both Avila’s Tigers moves and Dombrowski’s Sox moves. If anything, I’m willing to admit that Dombrowski is a good trader, and I’d be wary of making a deal with him – he’s gotten a lot more than he’s given up, at least since the 2011 World Series.

A final worthwhile look at Table 7 is the Gross WAR per Trade column (column 9). This one is interesting because it shows the average WAR changing hands per deal – we can read it as identifying the teams willing to make the biggest splashes, and perhaps even the teams most inclined to be involved in a “blockbuster” deal. If we draw the line at 4 gross WAR per trade, we see teams that are often involved in big, memorable deals: Arizona, Boston, Cincinnati, Detroit, and Oakland. Table 8 is more of an anecdotal rehash than anything else, but it adds some support to the idea that gross WAR per trade is a metric worthy of our time.
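For clarity, gross WAR per trade simply sums both directions of each deal rather than netting them out. A tiny sketch with made-up records (not the actual Table 7 data):

```python
# Hypothetical trade records: (team, WAR acquired, WAR lost)
trades = [
    ("Red Sox", 5.0, 2.0),
    ("Red Sox", 1.0, 1.5),
]

# Gross WAR counts talent moving in either direction
gross = sum(acquired + lost for _, acquired, lost in trades)
gross_per_trade = gross / len(trades)  # (5.0+2.0+1.0+1.5) / 2 = 4.75

print(f"gross WAR per trade: {gross_per_trade:.2f}")
# gross WAR per trade: 4.75
```

A team like this hypothetical one would clear the 4-gross-WAR line even though its net gain per trade is modest – which is exactly the distinction between the splash-making teams and the efficient ones.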

Conclusions

There’s not a whole lot I’d like to put here, mainly because this was like 4 times longer than I expected it to be. But if anything is to be taken away from this, I’d hope that it would be the following points:

  • Jerry Dipoto trades a lot, but not as much as Erik Neander, the Rays’ GM.
  • The Mariners and Rays trade with each other more often than any other pair of teams.
  • The Nationals have a stronger preference to trade with the A’s than any other team prefers a trade partner; the 2nd-strongest is the Phillies’ preference to trade with the Dodgers.
  • Teams intentionally avoid trading with specific teams; the more at stake between the two teams, the more probable it is they’ll avoid trading with one another.
  • Most teams avoid trading in their division, but not all; about 8.4% of trades are intra-division.
  • Teams in the same market almost never trade with each other; 0.76% of trades made by 2-market teams are intra-market.
  • The Giants make fewer trades than anyone, but also acquire more WAR per trade than anyone.
  • Dave Dombrowski is probably pretty good at making trades.
  • The Dbacks, Red Sox, Reds, Tigers, and A’s have more talent changing hands on average than the rest of MLB.

First Post!

Welcome to my blog, Random Walks.  The name is kind of stupid, but it’s a play on a statistics term (or I guess a math term, more generally) that sounds like it could almost be referring to a baseball term sort of-maybe-ish.  Regardless of what you might think of the name, my intent in naming the blog was to reflect my interest in baseball and stats, which is what this blog will be all about (for the most part)!

I spend a lot of time messing around with Fangraphs data, and I occasionally discover something that might be mildly interesting to other people, so I figured I’d blog about those cases, plus a few more.  I have a bit of advanced training in statistics from my Master’s in economics, but I’m not a PhD-holding data scientist by any means.  Still, baseball stats are among my favorite things in the world, so hopefully my insights prove useful to anyone who finds this place!