Why I took the $2,500 check to license my book for AI

You may have read that the publisher HarperCollins was offering payments to authors to license their books for AI training. Well, it’s really happening. They contacted my agent, who forwarded their offer to have me license my 2016 book Writing Without Bullshit. I signed the contract and a month or so later I got the check you see here. The amount for the license is $2,500; as with all other payments from the publisher to me, the agent takes 15%.
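For anyone checking the math on that check, here is a minimal sketch of the arithmetic. It assumes the 15% commission applies to the full $2,500 fee and that nothing else (taxes, for example) is deducted; the function and constant names are made up for illustration.

```python
from decimal import Decimal

# Figures from the offer described above.
LICENSE_FEE = Decimal("2500")        # author's per-title fee in US dollars
AGENT_COMMISSION = Decimal("0.15")   # agent's share of publisher payments

def author_net(fee, commission_rate):
    """Return what the author keeps after the agent's commission.

    Assumption: the commission is taken on the whole fee and there are
    no other deductions.
    """
    commission = fee * commission_rate
    return fee - commission

print(author_net(LICENSE_FEE, AGENT_COMMISSION))
```

Under those assumptions the agent's share is $375 and the author keeps $2,125.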
The terms of the deal
I carefully read the contract and I believe it is a fair deal. Here are some excerpts from the letter describing the deal that HarperCollins sent to my agent, who forwarded it to me:
HarperCollins has negotiated a unique opportunity with a tech company that compensates our authors for the right to use existing titles to train, fine-tune, and test AI models. This offer is opt-in, meaning works will only be used if an author agrees to it, and is being offered only for a selection of nonfiction titles. In return, the tech company will pay a flat fee to the author and to HarperCollins for the right to use the works for a limited term and set technical guardrails around the inclusion of copyrighted material in outputs that are more restrictive than “fair use” under copyright law.
Terms and conditions:
- Three-year deal
- Author will receive a per-title fee of US$2,500. HarperCollins will receive an equal amount of US$2,500 per title. The author fee will be credited to Author’s royalty account and paid to Author at the time of Publisher’s next accounting. Such monies will be deemed non-recoupable and will not be deducted from any other payments due to Author. [This means it’s outside the normal royalty accounting.]
- In return for the fee, HarperCollins will provide the company with Author’s book and the associated metadata created by HarperCollins for researching and developing AI models, including training, fine-tuning, and testing of AI models. This deal is only for AI model training, grants no right to generate derivative works, and includes no waiver or release of potential infringement claims against output; authors reserve all rights with respect to output. Following such three-year period, the company will cease use of the book for AI training but may continue to use the AI models trained on the book.
- Guardrails on output:
  - The AI company represents that its technical guardrails will disallow the inclusion in AI output of (i) more than 200 consecutive words of verbatim text in a single output and/or (ii) 5% of the text of the book across a series of outputs during a user session.
  - The AI company represents that it will expressly prohibit commercial users from attempting to infringe copyright of others and will monitor and enforce any infringing uses through its terms of service.
  - The AI company represents that it will not scrape or otherwise obtain training data for its AI models from websites that are listed in the US Trade Representative’s Notorious Markets for Counterfeiting and Piracy List or identified by HarperCollins as hosting pirated content.
  - The AI company represents that it will promptly address breaches of copyright once HarperCollins brings them to their attention.
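To make the two output thresholds concrete, here is a rough sketch of how a check like that could work. The letter does not describe the AI company's actual guardrails, so everything here is an assumption: the function names are hypothetical, words are split on whitespace, the 5% limit is read as "more than 5%", and the session total is crudely approximated by adding up the longest verbatim run in each output.

```python
def longest_shared_run(book_words, output_words):
    """Length of the longest run of consecutive words found in both lists."""
    best = 0
    prev = [0] * (len(output_words) + 1)
    for book_word in book_words:
        curr = [0] * (len(output_words) + 1)
        for j, out_word in enumerate(output_words, start=1):
            if book_word == out_word:
                # Extend the run that ended at the previous word of each text.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best

def breaches_guardrails(book_text, session_outputs,
                        max_run=200, max_fraction=0.05):
    """Return True if a session appears to exceed either threshold.

    Hypothetical check: (i) any single output shares more than max_run
    consecutive words with the book, or (ii) the summed longest runs
    across the session exceed max_fraction of the book's word count.
    """
    book_words = book_text.split()
    if not book_words:
        return False
    runs = [longest_shared_run(book_words, out.split())
            for out in session_outputs]
    over_run = any(run > max_run for run in runs)
    over_fraction = sum(runs) / len(book_words) > max_fraction
    return over_run or over_fraction
```

With a 1,000-word book, a single 10-word quote passes, a 201-word quote trips the first limit, and six separate 10-word quotes in one session (6% of the text) trip the second. A real system would need something far faster and more robust than this word-by-word comparison.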
Why I took the deal
While my agent recommended against their clients taking the deal, they had an obligation to present it to me. I have no information on the number of authors who took the deal.
Much of the discussion around AI and copyright is concerned with AI companies just ripping off content willy-nilly and then claiming they have a right to do so. They claim they have the right to just use copyrighted content that is visible on the open web. I disagree. Even if you feel their behavior regarding web content is acceptable, there is certainly no justification for them to violate ebook copy protection so they can rip off mass collections of book content wholesale.
Books — especially nonfiction books — are highly valued for AI training because they have been professionally written and edited and represent clear, logical thought (at least clearer and more logical than the average random web page).
The authors of such books should be able to make a choice about whether to license a book they created. However, it’s exceedingly cumbersome for an AI company to negotiate individually with thousands of authors. The obvious approach is through publishers that publish many titles. That’s what this AI company (likely Microsoft) agreed to do, and it’s how this offer trickled down to me. If you believe AI companies should pay to license content, this is the simplest way for them to do so.
Basically, I took the deal because:
- I want to support AI companies actually using legitimate channels to license books.
- I believe authors should have a choice — and they gave me a choice.
- I want the text of Writing Without Bullshit to be part of the training set, because I believe it will in a small way improve the quality of the output of an LLM that is trained on it. I want people using LLMs to benefit from insights from my book.
- I believe that $5,000 per book ($2,500 to the author and $2,500 to the publisher) is a fair rate of compensation, one that compensates authors but is not prohibitive for AI companies.
- I believe an even split with the publisher is fair, as it is in line with revenue sharing for other forms of subsidiary rights such as foreign translations.
- I like getting an additional payment for a book that has not earned out its advance and is therefore not generating any royalties.
Do not interpret this as my recommendation for what you, an author, should do about your book. You need to make your own decision. But I made a rational decision that will pay me and benefit everyone.
Do you agree? Would you license your book? Or are you morally opposed to the whole regime?
I think you made the correct choice. The best outcome for the publishing industry would be for this type of licensing model to become common among all AI companies. By supporting a pay-for-use model you made it more likely to happen.
One of my brothers-in-law was part of the screenwriters guild strike a few years ago. At a family event, he and I, along with another BIL who is CEO of a largish company, had quite a long discussion about the threat posed to writers by AI. While I understand the desire to keep AI out of writing, I can’t imagine that it’s possible to do so. Neither could my CEO BIL, who I find to be a savvy and well-intentioned guy. IMO, the best we can do is to find compromises that allow writers to get paid. You just did that.