A spreadsheet submitted as evidence in a copyright lawsuit against Midjourney allegedly lists thousands of artists whose images the startup’s AI picture generator “can successfully mimic or imitate.”
The spreadsheet is part of an ongoing case that argues Midjourney unlawfully profits from creators’ intellectual property by allowing its text-to-image tool to specifically rip off their work without permission, in violation of US copyright law.
The list was allegedly curated by Midjourney, and catalogs more than 4,700 artist names, plus labels for various image styles – from “cyberpunk” to “zombiecore” – and a bunch of NSFW terms that are presumably blocked from being used in prompts. Under one tab, titled “Proposed Additions,” another 15,800 names are listed.
As reported elsewhere, the creators range from the likes of Andy Warhol and Norman Rockwell to a six-year-old kid who won a Magic: The Gathering card art competition that raised funds for a hospital.
Midjourney and other text-to-image developers including Stability AI, Runway AI, and DeviantArt have been sued by artists claiming these machine-learning houses lifted copyrighted images to train models, and made those models available so netizens can produce infringing works on demand, without permission and without recompense. The creatives allege their rights were trampled and that the software can be used to flood the market with knock-off work to their detriment, and they want damages from the startups as well as other measures levied against them.
The plaintiffs are using the spreadsheet as evidence their images were unfairly scraped to train Midjourney’s neural networks: their names are on the list because the system was trained on their artwork, it is alleged.
In fact, the lawsuit goes further than that: it claims Midjourney CEO David Holz collected their names into the Google Sheet so that, ultimately, whenever users mention those artists in input prompts, the software can identify and mimic their specific styles.
“In other words, Holz published a list of artists who the Midjourney Image Product recognizes with the express purpose of these names being used by users and licensees of the Midjourney Image Product as terms in prompts. Holz’s comment, and the list, have remained available ever since,” the plaintiffs’ lawyers alleged in court documents [PDF].
The documents were filed late last year, and highlighted on social media in the past few days.
The lawsuit, unfolding in northern California, is the latest lunge in an ongoing AI copyright brawl that was initially led by illustrators Sarah Andersen and Kelly McKernan, and painter Karla Ortiz. Judge William Orrick previously dismissed all copyright violation claims made by McKernan and Ortiz since neither of them registered their work with the US Copyright Office.
Orrick was also dubious about their claims that text-to-image tools directly copy artists and that the machines’ outputs are simply derivatives of their artwork. The plaintiffs’ legal team was asked to amend their complaint, and the latest court documents include more artists in the class-action lawsuit and use the list of names, referred to as Exhibit J, as evidence.
Supporting evidence includes screenshots of what appear to be internal conversations between Holz and other staff at Midjourney discussing copyright infringement and knowingly scraping artists’ work. “All you have to do is just use those scraped datasets and then conveniently forget what you used to train the model. Boom legal problems solved forever,” one Discord message read.
The Register has asked Midjourney and the lawyers representing the plaintiffs for more comment. ®