
7 Scraping

Rachael Samberg

As explained in the chapter on Text and Data Mining, publishers sometimes try to prohibit automatic downloading of content in provisions labeled “Scraping,” or some variation of “Robots, Crawlers, & Scraping.” Such a prohibition could have an outsized impact on TDM.

Why would they do this? “Crawlers” are an automated way to download mass quantities of content, and this worries vendors and publishers for a few reasons. First, large volumes of download requests can inundate the vendor’s servers and impair the functionality of its system. Second, bulk downloads look suspicious and could be a sign (to the publisher) that the network has been hacked. Third, content downloaded in bulk could be more easily distributed en masse in violation of the license agreement. Those are all valid concerns.

However, your TDM users typically need to collect large quantities of content to conduct their research. So you can aim for a middle ground with language like:

Robots, spiders, crawlers or other automated downloading programs, tools, or devices to search, scrape, extract, deep link, or index the Subscribed Products may be used only to the extent reasonably necessary to conduct the TDM research.
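To make “only to the extent reasonably necessary” more concrete, here is a minimal sketch of what a restrained download script might look like on the researcher’s side. It is illustrative only: the function name `throttled_fetch`, its parameters, the two-second pause, and the 1,000-item cap are all assumptions made for this example, not terms drawn from any license or vendor policy.

```python
import time
from typing import Callable, Iterable, List, Optional


def throttled_fetch(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    min_interval_s: float = 2.0,   # assumed pause between requests (illustrative)
    max_items: int = 1000,         # assumed cap on items for the project (illustrative)
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[bytes]:
    """Download items one at a time, pausing between requests and
    stopping once the cap is reached.

    `fetch` is a hypothetical caller-supplied function that retrieves
    one item; no particular vendor API is assumed here.
    """
    results: List[bytes] = []
    last_start: Optional[float] = None
    for url in urls:
        if len(results) >= max_items:
            break  # stop at the cap instead of taking everything available
        if last_start is not None:
            # Wait out whatever remains of the minimum interval so the
            # vendor's servers are not hit with back-to-back requests.
            remaining = min_interval_s - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        results.append(fetch(url))
    return results
```

A script along these lines speaks to the first concern above (server load) by spacing out requests, and the cap keeps the download tied to what the research project actually needs. The right pause and cap are something to work out with the vendor, not values to copy from this sketch.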

Please see Text and Data Mining for all other relevant analysis.

License


E-Resource Licensing Explained Copyright © 2024 by Sandra Enimil, Rachael Samberg, Samantha Teremi, Katie Zimmerman, Erik Limpitlaw is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.