Fourteenth Trademark Scholars’ Roundtable, part 3 (Evidence)

Changes in Trademark Law and Evidentiary Rules

Introduction: Jake Linford

Before courts admitted surveys routinely, they were
concerned about hearsay. Sometimes rejecting surveys seems like judicial notice—“cola”
as generic; the court doesn’t want to hear contrary survey evidence. Some objections
go to the weight of the evidence.

Instead of surveys, can we look at things like Google
results or large text databases reflective of use in a particular community? Comicon
case: corpus linguists filed a brief trying to establish how comicon was used prior
to the first San Diego Comicon; court didn’t find that persuasive but shows
possibilities of use. You can get a sense for what the use of the putative mark
was like at first use, if your data are rich enough. If survey evidence isn’t
at the right time (Coach’s claim of fame in the Coach case), corpus evidence
might be able to fix that problem. Skeptical that it could help w/confusion
though.

Brain scans for consumers? New article proposes judging
similarity—he’s very skeptical both of cost and whether it tells us anything
[we don’t already know]. What about ChatGPT? Skeptical of that too.

If courts rely too much on dictionaries, the corpus evidence
might create a wedge b/t judge and dictionary.

Barton Beebe: How to integrate this into a larger rethinking
of competition? At the origin (in technical TM times), confusion wasn’t the
harm sought to be addressed; confusion was the evidence of the harm sought to
be addressed, which was trespass to property rights. What would it mean if we
said TM law is primarily concerned with promoting competition: what would a
survey look like for infringement? [Unsurprisingly, I say it would have to
consider materiality.] Would secondary meaning surveys change? Maybe not. But
would consumer uncertainty play a role in our understanding of how preventing
confusion promotes competition?

Earliest surveys—1921 Coca-Cola v. Chera-Cola, used as a
matter of course a Q including what we now call a Likert scale, measuring level
of certainty. So did 1923 survey. Then we lost that, especially w/ face to face
interviews. Surveys became trinary: yes/no/unsure—pushing people to a yes/no
answer. Most people are not really sure! These surveys are now designed to
suppress that uncertainty and not let it appear in the courtroom b/c we just
want numbers.

Imagine 20% confusion in a survey, but everyone was only
somewhat sure, while 80% are not confused at all and are absolutely certain. If
you present that nuanced distribution, maybe 20% isn’t enough b/c of the large
number of correct consumers. We urge introducing that next level of detail
into survey evidence.
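
A minimal sketch of the arithmetic behind that hypothetical, assuming a 5-point certainty scale and a linear weighting. Neither was specified in the discussion: the `CERTAINTY_WEIGHT` table, the `Response` type, and the coding of "somewhat sure" as 3 are all illustrative assumptions. It only shows how a headline 20% figure can look different once certainty is reported alongside it.

```python
from dataclasses import dataclass

# Hypothetical mapping from a 5-point certainty answer to a weight in [0, 1].
# These weights are an illustrative assumption, not a validated scale.
CERTAINTY_WEIGHT = {1: 0.0, 2: 0.25, 3: 0.5, 4: 0.75, 5: 1.0}


@dataclass(frozen=True)
class Response:
    confused: bool   # did the respondent give the "confused" answer?
    certainty: int   # 1 = not at all sure ... 5 = absolutely certain


def raw_confusion_rate(responses):
    """Share of respondents coded as confused, ignoring certainty."""
    return sum(r.confused for r in responses) / len(responses)


def certainty_weighted_shares(responses):
    """Return (confused, not_confused) shares, each discounted by certainty."""
    n = len(responses)
    confused = sum(CERTAINTY_WEIGHT[r.certainty] for r in responses if r.confused)
    unconfused = sum(CERTAINTY_WEIGHT[r.certainty] for r in responses if not r.confused)
    return confused / n, unconfused / n


# The hypothetical from the text: 20% confused but only "somewhat sure"
# (coded here as 3, an assumption), 80% not confused and absolutely certain (5).
sample = [Response(True, 3)] * 20 + [Response(False, 5)] * 80

raw = raw_confusion_rate(sample)
weighted_confused, weighted_unconfused = certainty_weighted_shares(sample)
print(f"raw confusion: {raw:.0%}")
print(f"certainty-weighted confused: {weighted_confused:.0%}")
print(f"certainty-weighted not confused: {weighted_unconfused:.0%}")
```

Under these assumed weights the confused share drops from 20% to 10% while the unconfused share stays at 80%. A different weighting would give different numbers, which is part of why any standardized format would have to specify the scale.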

Also, materiality: this is a way to consider materiality in
the rubric of uncertainty.

How do we survey for arbitrariness, fancifulness, suggestiveness?

Discussant: Mark Lemley

Maybe corpus linguistics can help w/things like
descriptiveness and nominative fair use, though skeptical about confusion or
fame (b/c you need a standard). Corpuses are attractive b/c the alternatives are
so bad: Surveys are infinitely manipulable; dictionaries have an opposite goal
to that of a statutory interpreter (to cover possible meanings instead of
the correct one), and so they are abused when used in legal settings.

Rogers is a rule of evidence; its goal is to replace
the cumbersome, problematic LOC multifactor analysis with a rule that irrebuttably
presumes no confusion except in limited circumstances. [At least the 9th
Cir. version.]

One possibility: what if we required or at least said a
presumptively admissible survey was X—could be drafted into court rules. If we
did that right, we could get rid of a lot of fighting back and forth and
manipulation and problematic Qs asking about permission. We could standardize
minimum percentages, affected by Likert scale. Could also give us an idea about
what heightened confusion is—having a standard would allow us to actually
heighten the standard in Rogers cases, which generally don’t actually do
so.

Statutory presumption of secondary meaning after five years
of exclusive use: it says literally that the Director “may” presume this. There’s
a nice naïve literalism that would say the PTO can do this but courts have no
power to do so: if the PTO hasn’t so presumed, courts are not in a position to
do so.

Incontestability: maybe we ought to have reverse
incontestability: if I’ve been around for a certain period of time, we presume
that I’m not interfering with another TM owner’s rights. Goes beyond laches;
would require killing Dawn Donut, which is an additional plus in his
view.

We might think about a TM anti-SLAPP provision.

Fromer: Deciding what we’re going to measure + how do we use
a new tool—new questions arise with each new type. Corpus linguistics: consider
fame. Is fame use of a word in a particular context a sufficient amount, or is
it that the word is only used in that context? We’ve never asked that before! How
do we know whether the use is shorthand for a class and what that means [what
the Google court called synecdoche use]. Sometimes survey formats get
standardized by default, so it has to be very thoughtful.

Grynberg: TM right now only has limited concern with mental
availability for confusion; courts normally talk about confusion even though TM
owners want to control availability of mark and surveys may be measuring that.
On Likert scale, a person w/high recognition and low certainty may be reflecting
a belief in what “belongs” to the TM owner.

Heymann: efficiency/cost trades off with accuracy in
surveys. People are notoriously bad at reporting on their own mental processes
and that’s what we’re asking them to do. [especially with “why do you say that?”
where they just guess.] In UX, we don’t ask people how they’d navigate a
website; we watch them do that. So we could construct a mock situation where we
ask people to make choices and derive from that whether they are confused or
not. Choices instead of thoughts.

Silbey: companies invest in those already!

Sprigman: could be a Squirt test with context.

Heymann: but actually ask them to perform the selection task,
not just report on what they think they’d do. We’d have to think about whether
hesitation counts.

RT: [Want to strongly endorse Fromer’s point: Measuring
almost always changes what the construct is—you are answering a different Q
than the non-surveyed or non-corpus analyzed question; the survey and the
corpus themselves answer different questions (what do people say in response to
direct Qs, sometimes after training, versus how do they talk when not thinking
about it?). Maybe it’s a better Q, but that needs to be identified.]

UX point: 1-800-Contacts: 10th Circuit says look
at clickthrough rate to find maximum possible confusion—possibly there’s opportunity
to focus more on this.

Past work on how people walk around in a fog of confusion:
Jacob Jacoby did this work in the 1980s—20% of people misunderstand any given
factual claim in advertising! They walk around confused about sponsorship b/c
of their priors!

How to survey arbitrariness/TM function: Washingmachine.com
in booking.com—maybe along with not sure we also need the option “neither a
common name nor a brand name.”

Lemley’s proposed standardization: (1) Burrell’s caution
yesterday that they standardized badly, including about permission; (2) Teflon
is a pretty standard format, but questions always arise: (a) original Teflon
survey used STP as a well known brand name; wouldn’t do that now; (b) take a
look at FOOTLONG surveys, both of which were Teflon surveys w/essentially the
same Qs but different training and testing examples. 14 point swing in FOOTLONG
results—which might even seem reassuring about the general tenor of the answers
since even manipulation didn’t change a ton, but it was a lot.

McKenna: we have substantial standardization in Eveready,
Teflon, Squirt (garbage, but much less common), but there’s enormous
variability introduced by stimulus, control, universe, and those are extremely
hard if not impossible to standardize. What has happened in TM is fetishization
of confusion for its own sake: confusion is itself the harm. So one goal is to
push against that construct: things in the context of survey evidence need to
connect to marketplace behavior, and certainty/Likert scales can help with
that.

There’s a strain of marketing literature that uses scanner
data: tries to measure effect of brand extension on parent brand. Mostly
extensions have no effect unless the products are used together; consumers just segment.
Marketing studies: TM’s notion of related products is much more
expansive than consumer behavior supports; things have to be really close together for consumers
to care. Mental state is relevant, but it is relevant in conjunction with
behavior.

Linford: if we’re right in our empirical work that
supposedly tarnishing uses often have a burnishing effect, what are firms doing
when they challenge those uses?

Sprigman: don’t assume firms are rational: they think it’s
theirs!

Linford: firms do try to figure out what marketing decisions
will work [but there is a difference between the expertise and methodology of
the person who does that at P&G and that of the person who does that at
Gucci—this is an example of high fashion possibly distorting our analysis of
other things].

Ivorine tooth-whitening gum & Ivory soap—the court didn’t
think that, as Ivory claimed, you could brush teeth with soap; court looked at
survey responses and heavily weighted the “I don’t know” responses against Ivory.

If we’re looking for marketplace harm, that is looking for
confusion—but in theory under current law we should be looking for the likelihood
of marketplace harm—evidence of likely changed behavior. How would we do
that?

Sheff: You could have rigorous standards about substitutes
and cross-elasticity of demand that could lead you to the proper universe for a
given survey. But one reason you think competition is good might be a deep
suspicion of private economic power, so anything that adds friction to that is
good. In that circumstance, neither universe nor stimuli selection might matter
very much; even confusion might not matter very much. We could do things like
market share analysis—but TM might not try to prevent copying others’ TMs in
general.

Ramsey: standardize by requiring Qs to separate out different
kinds of confusion?

McKenna: courts settled on a low percentage b/c this was
about an injury to the TM owner, and the Q was what we should care about. Thinking
about injury to TM owner as not the only Q could let us consider, as Sprigman
argues, the benefit to the other consumers (85% even) who have benefited from
the new option and would have to face loss of the information they got from D’s
use.

Fromer: things change over time: marketing channels mattered
more when the multifactor test was initially propounded; this makes
standardization a challenge (what do we do with increasing brand self-parody?).

McGeveran: as antitrust swings back towards power
containment and towards thinking holistically about efficiency in the whole market rather than
whether a particular merger is efficient, that could change how you think about
TM’s competition goal. The narrative
about market concentration and power and the ways in which tech creates
bottlenecks instead of long tails suggest productive TM inquiries: when we say “competition”
in TM, it’s no longer clear what we’re talking about. New market entry
opportunities? Competition on store shelves like with private labels? How does
TM contribute to the shape of the whole market?

Bone: TM has always been about protecting sellers and
avoiding fraud on the public; fraud reflects a moral wrong. Our reconfiguration
has to take those into account.

Mid-Point Discussants: Eric Goldman

Antitrust is not a model of empirical evaluation at law, but
empirical evidence does matter across consumer law—formation of TOS. Does a
link to a privacy policy provide effective notice? One court says a reasonably
prudent smartphone consumer knows that an underlined blue link is a hyperlink—that’s
just made up; no survey evidence or consumer testimony, no literature. Chances
are the judges are not themselves reasonable smartphone users. So much of what we
talked about today are riffs on broader consumer law cases: what do we know and
how do we know it? Emoji evidence: courts are just, again, making it up. A “good”
emoji evidence case includes a statement from law enforcement saying “I’ve
worked this beat and I know how they talk.” That’s the best evidence they use
(and it’s bad!).

Chris Sprigman: need a way to account for consumer
heterogeneity, which is almost always what we see under the hood. Different from
saying that consumer belief is sovereign/a trump. Vintage Brand case: court
asks whether consumer belief about sponsorship/licensing should be taken
seriously or should be considered a legal conclusion about licensing. This isn’t
wholly empirical.

Stabs at a theory of competition: start w/ preferences. We’ve
been agnostic about them for a while, removing moralism from competition. One
form of private differentiation is as good as another; we don’t question why
consumers want something. But: Producers enjoy more surplus in differentiated
markets; consumers enjoy less. The response is that consumer surplus lost in
the form of price is gained in the form of differentiated preferences being satisfied
more closely. But is that true of a manufactured-by-marketing preference? To
say that in a room of economists is to risk being called an idiot, but that
doesn’t mean it’s wrong. People’s valuation of attribution depends on the frame—they
value it more if it’s an entitlement and less if they have to pay for it. This
isn’t super hard, but the response from economists is “who cares?” We set a
legal rule, create preferences, then satisfy them. We forget at the end of the
day that the preferences only arose from the entitlements created by the legal
rule. This is unsatisfying. The neo-Brandeisians say: the market has an organic
existence, which is not about people being induced to have preferences, but
manifesting them. That’s too innocent; people are convinced into preferences
all the time—but a highly engineered preference might not be as important to
satisfy. If we were to use that model of competition in TM and designing
surveys, we’d look for what would cause consumers to substitute—focus on source
primarily, and we’d need some sort of materiality requirement either as an element
of claim or a belief about source strong enough to undergird a choice in the
market.

Need more coordination about surveys, outside the context of
particular cases. Judges and survey experts need to be able to interact in ways
that make accepted surveys more credible.

TM surveys should be more incentive-compatible. In copyright, he has done
incentive-compatible experiments where people buy & sell things; these are
more reliable b/c people don’t tell you what they think but show you what they
do: their incentives and their behavior are compatible. Correct answers should
benefit participants more than incorrect ones, which simulates the real world
better. Even if it’s getting the right chewing gum, something small is at stake
in the real world.

Robert Burrell

Looking behind curtain of preference satisfaction: it’s very
difficult to convince courts to do that. One possibility: makers of Nurofen
pain range were fined AU$10 million for having misled consumers by selling the
same product as different products for, e.g., period pain with huge price
differences. But Nurofen still has a massive price premium over generic
ibuprofen: is that ok? We need other examples to move the dial.

Double identity as a solution to evidentiary challenges: it
is in Art. 16 of TRIPS! He’s a fan of it for goods cases: it seems to cause
relatively little harm and acts as an escape valve for the property intuition:
it defines what the “property” is. It doesn’t work so well for advertising!
Advertising can convey all sorts of nonconfusing messages, so we need robust
defensive doctrines to prevent undue harm in advertising, but there are ways to
design that.

Evidentiary problems exist everywhere, though they vary across
jurisdictions. When we look at what’s actually happening, we see courts getting
really uncomfortable w/certain evidence but we don’t use that to ask more general
questions about the quality of evidence. Particularly in the UK and Singapore,
acquired distinctiveness is a reliance standard for at least some forms of TMs
(trade dress)—don’t provide evidence of use, show me that consumers rely on
this to make decisions. The move to failure to function and away from acquired
distinctiveness might be a similar rule of evidence—we don’t trust the usual
proxies. But if we don’t trust them in trade dress, why do we ever trust them?

In Australia, occasional examples in which proxies for
secondary meaning are regarded as unreliable—e.g., consumers had to buy
a thing, as when everyone had to buy a set-top box to transition from analog to
digital. Massive sales aren’t enough to prove consumers care about manufacturer—similarly
with hand sanitizers and masks. But we don’t ask similar questions about shoes,
soap, rice, even though those are staples in the modern world.

Evidence of actual confusion: sometimes ignored b/c wrong
sort of person is giving answer or answer is so manifestly wrong. Anyone who
thought Nike would make toilet cleaner had an extreme and fanciful reaction,
according to SCt of Australia. Anyone who thought McDonald’s would make wine
was operating under an erroneous assumption.

Maybe consumers aren’t good at explaining their responses
not b/c of problems in articulation but b/c that’s not how they make decisions!
Might be worth looking comparatively—he previously thought German TM law loved
surveys, and that’s true for distinctiveness in general, but not in likely
confusion cases where they are verboten. Lack of trust plus commitment to
normative view of likely confusion. But the effect (answering Lemley Q) was to
broaden rights.

What about marketing experts? Do they have specialized
knowledge, or are they charlatans? Australia has debated that as well. Ongoing question
of admissibility of Wayback Machine evidence. Every now & then, more in the
UK than Australia, courts just assert things about how people shop. And damages
are often just asserted. At quotidian level, there’s just a plague of fake
evidence—e.g., invoices. In light of that, a narrow property rule makes more
sense; otherwise you should have to show real harm.

Silbey: to demand TM have a theory of competition is to
displace the consumer as sovereign in terms of what evidence matters. We’re not
measuring systems; these are private rights of action.

McKenna: but we structure private rights of action towards
systematic goals all the time—what counts as secondary meaning, w/in TM.

Silbey: if it’s not a market for lemons story, it’s a
consumer sovereignty story. All these tests arise out of and are reified at a
time when the story about what’s optimal in a market is a view of advertising as
beneficial/creating the best kind of consumer market.

Sheff: evidentiary rules depend on the theoretical
structure. Is a deceptive used car dealer competing fairly? Why does it matter
that he be honest? The problem is the seller has information the buyer doesn’t—that’s
a form of power that we think ought not to exist in consumer markets. Why do we
think that? It’s not b/c it’s inefficient to allow asymmetric info to be used,
but b/c we have a thicker theory of autonomy underlying market structure.
[McGeveran: or both!] Theories that are quite different can converge on certain
aspects of regulatory structure.

But the interesting thing is how we choose when they
diverge. If the problem of used cars is that falsities will cause markets to
fail, then some critics said to Akerlof that if he was right there should be no
market so he must not be right; the response is “this market is worse than it
could be.” That’s a comparative claim. We could take care of the market for
lemons if we forbade all ads and required plain packaging. [I don’t think
that’s true.] Why would that be worse than the market we have now? We have some
substantive notions of freedom and autonomy, including in commercial behavior,
that shape our idea of the appropriate comparator market and what type of
regulatory regimes are preferable which then informs what types of evidence
count.

Lemley: There’s a perfectly good information story: bad
information makes markets less efficient and decreases purchases. You don’t
need a new theory of TM.

Sheff: by the same token, there’s an efficiency story about
branding; if we only care about efficiency we’d have a lot more regulation of
persuasive advertising than we do.

RT: (1) Anti-ESG movement/anti-DEI movement—think not just
about progressivism but about challenges to preference formation from the
right. What interventions into the market are justified because preferences are
currently distorted? Who is distorting preferences or at least affecting
formation of preferences in a way they have a vested right to do? If that’s the
wrong framing, why is it the wrong framing?

(2) In response to FTC actions against “results not typical,”
industry mounted defense of “results not typical” for weight loss products: consumers
are buying hope—even small chance might be material depending on circumstances,
which pushes back on the Likert scale?

(3) double identity and advertising: note that in the US
comparisons for house brands are often made on the bottle or package!

(4) False advertising has similar theories to those
discussed by Burrell in the US: misleading v. misunderstood; puffery and no
reasonable consumer would believe that, e.g., Froot Loops have fruit in them. Also, there are now two
cases accepting the Nurofen-style deception theory in the US.

(5) periodic reminder that registration is not actually
consumer focused. If we thought of the US as at least half registration based
we would have to say that it’s at least half about competition. [Lemley says
that registration might not be about competition either, but it’s at least
closer.]

McKenna: the choice is not between satisfying preferences
and not, but which preferences we will have and who will satisfy them. There is
no neutral position. Economist move: it doesn’t matter where the preference
comes from—that is actually a sneaky way of prioritizing certain preferences,
since different preferences would have emerged under different rules. Pretending
there aren’t rules around who got to shape those preferences is itself
deceptive.

Dogan: economists are also rethinking things. What do we do
about the fact that medical professionals buy generic drugs but uninformed
consumers, often poor/resource-constrained, buy the brand names? So their behavior
probably doesn’t match their preferences in some other world—but with what do
we replace revealed preferences?

We also need to understand what’s happening within the 85%
of unconfused consumers—it may not be a loss to them; it may mean nothing to
them.

Gangjee: Registration system is focused not really on
consumers or on competition—by and large it’s a different project [I describe
it as businesses ordering their own relations]. There is a morality for that. Historically,
there was a moral account. If our drive is toward a more normative account, it’s a
hard thing to get right. First UK TM act: no registration for word-only marks
at all—totally normative belief that words had multiple uses. Retreat from
normative system over time. It was not competition, but availability to
producers of words to use to speak—very interesting different vision, replaced
by distinctiveness over time.

from Blogger http://tushnet.blogspot.com/2023/02/fourteenth-trademark-scholars_25.html
