Thomas Reichert, Doctrine, Data, and the Death of DuPont
Abstract: For fifty years, courts have claimed to apply a comprehensive thirteen-factor test for trademark confusion. They are lying, or at least deeply mistaken. Using AI-powered analysis of 4,000 decisions, this Article proves what practitioners have long suspected: the test has collapsed to just two factors.
Using a large language model to extract scored findings for all thirteen factors from approximately 4,000 TTAB inter partes decisions (2000-2025), the study applied statistical models to predict case outcomes. Mark similarity (Factor 1) and goods/services relatedness (Factor 2) alone achieve 99.37% accuracy. Adding the remaining eleven factors increases accuracy to only 99.79%, a mere 0.42-point improvement with no practical significance. More striking still, a simple categorical rule predicting confusion if and only if both Factors 1 and 2 favor confusion achieves 99.52% accuracy, outperforming the regression models. Further analysis confirms that most secondary factors either repeat information already captured by the core two factors or contribute nothing meaningful to outcomes.

These findings confirm at scale what prior scholarship has suggested: in determining trademark confusion, courts pay lip service to comprehensive multi-factor analysis while actually deciding cases based on just two considerations. The results also reveal concrete harms from this doctrinal gap: parties spend substantial resources litigating factors that do not influence outcomes, case results become harder to predict in advance, and adjudicators exercise broad discretion without meaningful constraints.

The Article explores how these findings might inform doctrinal reform: reforms would center the two determinative factors and limit secondary considerations to narrow tiebreakers in genuinely ambiguous cases. Finally, it advances a broader “multifactor collapse” hypothesis and outlines a research agenda for testing whether other legal balancing frameworks exhibit similar patterns, in which doctrinal complexity masks simpler underlying decision-making.
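To make the headline rule concrete, here is a minimal sketch (my own illustration, not the paper’s code) of the categorical rule the abstract describes: predict confusion if and only if both Factor 1 and Factor 2 favor it. The toy cases and their layout are hypothetical.

    # Sketch of the abstract's categorical AND-rule (illustrative only).
    # Each case is (factor1_favors, factor2_favors, confusion_found);
    # the data below are invented, not drawn from the paper's TTAB set.

    def predict_confusion(factor1_favors: bool, factor2_favors: bool) -> bool:
        """Predict confusion iff both core factors favor confusion."""
        return factor1_favors and factor2_favors

    cases = [
        (True, True, True),
        (True, False, False),
        (False, True, False),
        (True, True, True),
    ]

    correct = sum(predict_confusion(f1, f2) == outcome
                  for f1, f2, outcome in cases)
    print(f"accuracy: {correct / len(cases):.2%}")

Per the abstract, this rule alone gets 99.52% of the roughly 4,000 decisions right.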
My comments: Empirical support for John Welch’s mantra, which turns out to be understated: mark and goods don’t predict 95% of the outcomes of 2(d) appeals to the TTAB, they predict 99%!
A small point: I think the article understates Beebe’s findings on the importance of intent, the one factor he finds to be important that this analysis doesn’t. This may be related to the big point: you can’t directly compare registration inquiries, which are conducted in the abstract, to infringement inquiries, which consider all the relevant context. Evidence of actual confusion is especially unlikely in 2(d) inquiries, and so is intent evidence.
This paper could be very useful, but without attention to the differences between 2(d) and infringement, it will not reach its potential and might serve to confuse people who aren’t already conversant in trademark law. This shows up already in the abstract, which starts off with “courts” but then discusses the TTAB. Likewise, on p. 44, right after saying clearly that the results are about the TTAB, the paper says: “Some readers may object that courts must have reasons for discussing all thirteen factors. This objection conflates rhetoric with reality. Courts discuss Factor 8 (concurrent use) because doctrine requires it, not because it changes outcomes.” But the TTAB is not a court. This also means that, e.g., claims about litigation costs aren’t comparable; the proper figure is estimates for the costs of opposition, which AIPLA collects separately from litigation costs in its surveys.
The results also have fascinating implications for the question of crowding on the register. If crowded fields rarely matter, that gives existing registrants even more of an advantage than the crowding literature might suggest, even as 2(d) refusals seem to be rising.