On August 23, a group of publishers, including Penguin Random House, HarperCollins, and Simon & Schuster, sued Audible for copyright infringement. Audible, which is a subsidiary of Amazon, sells and produces audiobooks, and it planned to launch a new speech-to-text feature on September 10. The feature, dubbed Audible Captions, would automatically convert the licensed audio of an audiobook into unlicensed text that appears on the user’s screen as the audiobook is played. In a video posted to YouTube on August 2, Audible mentions that Captions is powered by Amazon Transcribe, which appears to provide real-time transcriptions by sending audio to Amazon’s servers where it is converted to text and then sent back to the user.
The publishers allege that, through its Captions feature, Audible infringes their reproduction, adaptation, public distribution, and public display rights, either directly or indirectly, and they’ve moved for a preliminary injunction. The crux of their argument as to direct liability is that Audible—and not the user—directly causes the various infringements to occur. The publishers don’t specifically mention the role of Amazon or its Transcribe feature in their arguments, though they do note that “Audible has further arranged for software programs and servers” to create the unlicensed text. But it seems quite likely that Audible will use distinctions over exactly who is doing what—the user, Audible, or Amazon—to argue that its role in the copying is too remote to make it a direct infringer.
Audible has agreed not to enable Captions for the publishers’ works until the court rules on the preliminary injunction motion, and its response is due this coming Friday, September 13. Other than hinting that it will argue fair use because it believes that the Captions feature will “help kids,” Audible has not tipped its hand as to what it will argue on the merits of the publishers’ prima facie infringement case. But, given the law in the Second Circuit where the publishers have filed suit, we can make an educated guess.
In a recent article at Ars Technica, Timothy B. Lee suggests that Audible is likely to rely on two seminal copyright cases in its defense: Sony v. Universal and Cartoon Network v. CSC Holdings (aka Cablevision). Lee points out the role of Amazon’s cloud-based Transcribe service in Audible’s automated captioning process, and he contends that it “strengthens the company’s argument that it can do this without a license from publishers.” With Sony, Lee references the Supreme Court’s holding that unauthorized time-shifting can be fair use, and he further notes that, while cases like this can be unpredictable, the courts might decide that automated speech-to-text conversion is likewise fair use.
Of course, there are significant differences between the time-shifting in Sony and the transcription that occurs with Captions. Sony time-shifting created a copy of an audiovisual work without changing its work-of-authorship category: both the original broadcast and the copy made with the VCR were audiovisual works. With Captions, by contrast, the audiobook, which is a sound recording, gets transformed into a literary work. This isn’t merely creating a copy of a work to watch later; it’s creating an entirely different—though derivative—work. And, of course, making this derivative work also entails a reproduction since the two works are substantially similar. But Captions nevertheless is an entirely different beast from time-shifting, or even space-shifting for that matter, since it in fact creates a new—and separate—work of authorship.
Moreover, the analysis in Sony turned on copying by the user, not Sony itself, and thus the fair use factors were applied differently than they would be here with the publishers’ direct liability claims. Most notably, the first and fourth factors here heavily favor the publishers since Audible’s for-profit use isn’t transformative and causes harm to an established market. Indeed, Audible itself currently offers Immersion Reading, where licensed audiobooks are combined with licensed e-books, allowing the user to follow along with the text while listening to the audio. Importantly, the user must purchase both the audiobook and the e-book to use this functionality in the Audible App.
Turning to Cablevision, Lee argues that “Audible’s case will likely be strengthened by the fact that its app never creates or saves a permanent, full transcript of an audiobook” since “the software only displays a few words on the screen at a time.” To be sure, the buffer copies at issue in Cablevision, which existed for at most 1.2 seconds before being automatically overwritten, were held to be unfixed and thus not copies that could give rise to infringement liability. But Lee misconstrues the import of this holding: There was no doubt that a work could be copied by chopping it up into little pieces at a time; the question was whether those pieces existed for more than a transitory duration. Unlike the buffer copies at issue there, the copies here are clearly fixed since the user can pause the text in Captions and keep it on the screen indefinitely.
Lee notes Cablevision’s holding that it was the user, and not Cablevision itself, that caused a copy to be made with the cloud-based DVR. And he posits that Audible will argue that it thus has the right to distribute “software tools that allow customers to do speech-to-text conversion.” I agree, and I think Audible is very likely to argue that the user, not Audible, makes the copy, and that the user’s copying is fair use. But I disagree with Lee’s further claim that, if the transcription actually takes place on Amazon’s servers, the “publishers are likely to argue this means Amazon—not users—are creating the transcripts.” If the publishers thought that Amazon was making the copies, Amazon would have been named as a defendant. It wasn’t—and for good reason.
As I understand the facts, I don’t think it’s likely that Amazon would be directly liable for the copying that takes place with its Transcribe feature. That system is fully automated, and Amazon plays no role in selecting or supplying the content that gets converted. The same, however, is not true for Audible. The publishers claim that Audible is a direct infringer since it selects which specific audiobooks have the Captions feature enabled and integrates the functionality for making the unauthorized transcriptions within its Audible App. To analyze this claim, we must determine whether Audible’s actions are sufficiently proximate such that Audible itself can be said to be doing the copying, and for that we need to look no further than the Cablevision opinion itself.
Cablevision turned on whether the remote-storage DVR (RS-DVR) was more analogous to a VCR or a video-on-demand (VOD) service. With a VCR, the user who pressed the record button supplied the necessary volition to be held liable as a direct infringer—not the company that manufactured and sold the VCR, which had no control over the content that was recorded. With VOD, by contrast, the user still pressed the button to initiate the streaming, but the service provider was the one that could be held directly liable since it selected and supplied the works that were available on its service. Cablevision’s cloud-based DVR fell somewhere in between, and the Second Circuit held that it was more like a VCR since Cablevision did not control the specific content that its users could record and stream:
Cablevision, we note, also has subscribers who use home VCRs or DVRs (like TiVo), and has significant control over the content recorded by these customers. But this control is limited to the channels of programming available to a customer and not to the programs themselves. Cablevision has no control over what programs are made available on individual channels or when those programs will air, if at all. In this respect, Cablevision possesses far less control over recordable content than it does in the VOD context, where it actively selects and makes available beforehand the individual programs available for viewing. For these reasons, we are not inclined to say that Cablevision, rather than the user, “does” the copying produced by the RS–DVR system.
Lee mentions that the “courts could decide that Amazon plays too active a role in the conversion process to portray itself as a passive supplier of technology like the maker of a VCR.” I think this is very likely, but only as to Audible, not Amazon. As noted above, Amazon Transcribe is a passive system; it will transcribe any audio that it receives. Captions, on the other hand, only transcribes the specific works that Audible has decided ex ante to include. The user can’t transcribe every sound recording on a device or even every audiobook within the Audible App. Audible instead chooses the specific works within its App that the user is allowed to convert into a derivative, literary work. And it’s this selection of the specific content that makes Audible more like a VOD service than a VCR or DVR.
The Supreme Court opinions in ABC v. Aereo—both the majority opinion by Justice Breyer and the dissent by Justice Scalia—buttress this conclusion. Like Cablevision, Aereo supplied a cloud-based DVR, though the content available on Aereo’s service depended on what was publicly available via over-the-air transmissions. The Aereo majority held that it was Aereo, and not the user, that caused the transmission to occur when the user pressed a button to initiate the streaming, and this was true even though Aereo had no control over the specific content that was made available to the user. The Court held that Aereo’s role in the transmissions—by setting up in-house antennas, transcoders, and servers that would retransmit television broadcasts to the public—was sufficiently proximate to render it the direct infringer.
In dissent, Justice Scalia took a far more limited view of the volition necessary to hold a service provider directly liable. Relying on Cablevision, Justice Scalia argued that direct liability “demands conduct directed to the plaintiff’s copyrighted material,” and since Aereo did not select the specific content to be streamed, he would have held that it could not be directly liable. Indeed, he argued that a VOD service could be directly liable precisely because the “selection and arrangement by the service provider constitutes a volitional act directed to specific copyrighted works and thus serves as a basis for direct liability.” In his estimation, “Aereo does not ‘perform’ for the sole and simple reason that it does not make the choice of content.”
Thus, even under Justice Scalia’s narrow view of direct liability, Audible can quite reasonably be said to have crossed the line from being a passive conduit to an active participant in the copying because Audible itself selects the specific works that can be transcribed with its Captions feature. Once Audible files its response brief this coming Friday, I hope to then take a deeper dive into these fascinating issues. And perhaps then I’ll discuss the secondary liability issues that I ignored in this post. But for now I’ll conclude by saying that, by making its Captions feature available only for the specific works that it selects, Audible will have an uphill battle in arguing that it’s more like a VCR or DVR than a VOD service.