
Web Archiving Service (WAS) Rights Management: California Digital Library (CDL)

  • 0:04 - 0:06
    This is a Web Archiving Service video tutorial
  • 0:06 - 0:09
    on rights management.
  • 0:09 - 0:11
    This tutorial is a brief overview
  • 0:11 - 0:14
    of rights management as it relates to web archiving.
  • 0:14 - 0:17
    It is highly recommended that you utilize the additional information
  • 0:17 - 0:20
    we provide on the WAS help page.
  • 0:20 - 0:23
    We provide two very useful documents:
  • 0:23 - 0:25
    one on WAS rights management practices,
  • 0:25 - 0:33
    and one with answers to frequently asked questions about robots.txt.
  • 0:33 - 0:35
    There are two areas of copyright law
  • 0:35 - 0:37
    that are most relevant to web archiving.
  • 0:37 - 0:40
    Section 108 of the Copyright Act,
  • 0:40 - 0:43
    which relates to the limitations on exclusive rights
  • 0:43 - 0:44
    as they pertain to reproductions
  • 0:44 - 0:46
    by libraries and archives;
  • 0:46 - 0:49
    and Fair Use, addressed in Section 107
  • 0:49 - 0:50
    of the Copyright Act,
  • 0:50 - 0:53
    which states that permission
  • 0:53 - 0:54
    for reproduction and display of works
  • 0:54 - 0:55
    protected by copyright
  • 0:55 - 0:57
    isn't required in certain circumstances
  • 0:57 - 1:00
    such as scholarship and research.
  • 1:00 - 1:05
    Guidance for web archiving best practices
  • 1:05 - 1:07
    regarding copyright can be found
  • 1:07 - 1:09
    in two documents.
  • 1:09 - 1:11
    The first is the Section 108 Study Group Report,
  • 1:11 - 1:13
    which contains recommendations
  • 1:13 - 1:15
    from copyright and library experts
  • 1:15 - 1:17
    representing libraries, copyright holders,
  • 1:17 - 1:19
    and others, on how copyright law's
  • 1:19 - 1:25
    library exceptions could be updated for the digital world.
  • 1:25 - 1:27
    The second is the Association of Research Libraries
  • 1:27 - 1:30
    2012 Code of Best Practices
  • 1:30 - 1:35
    in Fair Use for Academic and Research Libraries.
  • 1:35 - 1:37
    Links to these documents can be found
  • 1:37 - 1:40
    in the WAS rights management practices file.
  • 1:40 - 1:42
    WAS has distilled 9 assertions
  • 1:42 - 1:44
    regarding the rights of website owners
  • 1:44 - 1:47
    from the Section 108 Study Group Report
  • 1:47 - 1:50
    and the ARL Code of Best Practices.
  • 1:50 - 1:53
    This tutorial will go over each assertion
  • 1:53 - 1:57
    and explain how they inform WAS policies.
  • 1:57 - 1:59
    Assertion #1 states that libraries and archives
  • 1:59 - 2:02
    should have the right to capture and archive
  • 2:02 - 2:04
    publicly available websites without requesting
  • 2:04 - 2:06
    advance permission.
  • 2:06 - 2:09
    WAS does not enforce any requirement
  • 2:09 - 2:11
    for proof of permission before capture,
  • 2:11 - 2:12
    and both the Study Group Report and
  • 2:12 - 2:17
    the Code of Best Practices support this policy.
  • 2:17 - 2:20
    Assertion #2 states that US Federal, State
  • 2:20 - 2:22
    and local government agencies,
  • 2:22 - 2:25
    political parties, political candidates and
  • 2:25 - 2:26
    political action committees
  • 2:26 - 2:28
    should not have the right to prevent
  • 2:28 - 2:32
    publicly available content from being harvested.
  • 2:32 - 2:35
    While WAS respects robots.txt exclusions by default,
  • 2:35 - 2:37
    curators do have the ability
  • 2:37 - 2:39
    to override these settings
  • 2:39 - 2:42
    in order to capture and archive content.
  • 2:42 - 2:45
    Assertion #3 states that claims of fair use
  • 2:45 - 2:47
    relating to material posted with
  • 2:47 - 2:49
    'bot exclusion' headers to ward off
  • 2:49 - 2:51
    automatic harvesting may be stronger
  • 2:51 - 2:53
    when the institution has adopted
  • 2:53 - 2:56
    and follows a consistent policy.
  • 2:56 - 2:58
    The Section 108 Study Group addressed
  • 2:58 - 3:01
    government and political sites specifically.
  • 3:01 - 3:03
    The ARL Code provides broader guidance
  • 3:03 - 3:05
    supporting WAS's practice of having default
  • 3:05 - 3:09
    settings that respect robots.txt exclusions,
  • 3:09 - 3:11
    that a curator is able to override.
  • 3:11 - 3:14
    In addition, WAS enforces consistent
  • 3:14 - 3:16
    requirements when curators choose to
  • 3:16 - 3:19
    override robots.txt.
  • 3:19 - 3:21
    Assertion #4 states that,
  • 3:21 - 3:23
    to the extent reasonably possible,
  • 3:23 - 3:26
    the legal proprietors of the sites
  • 3:26 - 3:27
    in question should be identified
  • 3:27 - 3:29
    according to the prevailing conventions
  • 3:29 - 3:31
    of attribution.
  • 3:31 - 3:33
    In order to override robots.txt exclusions,
  • 3:33 - 3:36
    WAS curators must provide
  • 3:36 - 3:38
    owner attribution for a site.
  • 3:38 - 3:41
    This owner attribution will appear
  • 3:41 - 3:43
    on the site details screen for any
  • 3:43 - 3:45
    archived website where robots.txt rules
  • 3:45 - 3:48
    have been overridden.
  • 3:48 - 3:51
    Assertion #5 states that libraries and archives
  • 3:51 - 3:53
    should be required to label prominently
  • 3:53 - 3:56
    all copies of captured online content
  • 3:56 - 3:58
    that are made accessible to users,
  • 3:58 - 4:00
    stating that the content is an archived copy
  • 4:00 - 4:04
    for use only for private study, scholarship and research
  • 4:04 - 4:07
    and providing the date of capture.
  • 4:07 - 4:10
    CDL's WAS service agreement expressly states,
  • 4:10 - 4:13
    "The Curatorial Partner recognizes Web Content
  • 4:13 - 4:16
    captured with WAS is for education purposes only
  • 4:16 - 4:19
    and may not be used for commercial purposes
  • 4:19 - 4:23
    by the Curatorial Partner or other third party."
  • 4:23 - 4:25
    In addition, all publicly displayed content
  • 4:25 - 4:28
    is marked "This document is an archived copy
  • 4:28 - 4:31
    for study and research."
  • 4:31 - 4:33
    Assertion #6 states that archived materials
  • 4:33 - 4:35
    are represented as they were captured,
  • 4:35 - 4:37
    with appropriate information on mode
  • 4:37 - 4:39
    of harvesting and date.
  • 4:39 - 4:42
    It is important to know that WAS does not
  • 4:42 - 4:44
    edit or alter content in any way.
  • 4:44 - 4:46
    In fact, CDL's preservation repository
  • 4:46 - 4:48
    has an ongoing fixity checking process
  • 4:48 - 4:50
    for all harvested web content,
  • 4:50 - 4:54
    to ensure that no changes have been made.
  • 4:54 - 4:56
    In addition, details of harvest and date
  • 4:56 - 4:58
    are available on the "Show Description" screen
  • 4:58 - 5:00
    of any publicly displayed archived file.
  • 5:00 - 5:04
    Assertion #7 states that an embargo of
  • 5:04 - 5:06
    a "reasonable period of time"
  • 5:06 - 5:08
    should be observed before archived materials
  • 5:08 - 5:11
    are made available to the general public.
  • 5:11 - 5:12
    WAS complies with this assertion
  • 5:12 - 5:14
    by enforcing an embargo between
  • 5:14 - 5:16
    the time that content is harvested and
  • 5:16 - 5:19
    when that archived material can become publicly accessible.
  • 5:19 - 5:23
    At the moment, this embargo time is 6 months.
  • 5:23 - 5:26
    Assertion #8 states that libraries and archives
  • 5:26 - 5:29
    should be prohibited from engaging
  • 5:29 - 5:32
    in any activities that are likely
  • 5:32 - 5:34
    to materially harm the value or operations
  • 5:34 - 5:36
    of the Internet site hosting the online content
  • 5:36 - 5:39
    that is sought to be captured and made available.
  • 5:39 - 5:43
    WAS service agreements with CDL's Curatorial Partners
  • 5:43 - 5:46
    clearly state that CDL may intervene or stop
  • 5:46 - 5:48
    a capture in progress based on an objection
  • 5:48 - 5:50
    raised by the content owner.
  • 5:50 - 5:53
    In addition, WAS crawlers are always run
  • 5:53 - 5:54
    with a significant delay, to avoid
  • 5:54 - 5:58
    impacting a site's performance.
  • 5:58 - 6:00
    Assertion #9 states that libraries
  • 6:00 - 6:01
    should provide copyright owners
  • 6:01 - 6:03
    with a simple tool for registering objections
  • 6:03 - 6:05
    to making items from such a collection
  • 6:05 - 6:07
    available online, and respond to such objections promptly.
  • 6:07 - 6:12
    WAS crawlers leave identification information
  • 6:12 - 6:15
    on each server that they interact with.
  • 6:15 - 6:17
    This includes a URL to a webpage
  • 6:17 - 6:20
    explaining the crucial role of web archiving
  • 6:20 - 6:22
    in preserving our cultural heritage and history.
  • 6:22 - 6:25
    This page contains a phone number, a webform,
  • 6:25 - 6:28
    and email address to contact CDL.
  • 6:28 - 6:32
    It is also possible for curators and CDL
  • 6:32 - 6:35
    to block specific archived files, directories
  • 6:35 - 6:38
    or all content from public view.
  • 6:38 - 6:40
    This provides the ability to block content
  • 6:40 - 6:42
    that has been requested to be taken down
  • 6:42 - 6:44
    without blocking an entire archive.
  • 6:44 - 6:48
    Selecting your desired robots.txt settings
  • 6:48 - 6:50
    can be easily done when you create a site.
  • 6:50 - 6:53
    Under the "Capture Settings" screen
  • 6:53 - 6:55
    you have the option of leaving the default
  • 6:55 - 6:57
    "yes" to honor robots.txt,
  • 6:57 - 6:59
    or to choose "no."
  • 6:59 - 7:02
    If you are interested in changing
  • 7:02 - 7:04
    the robots.txt settings for a previously
  • 7:04 - 7:07
    created site, find the site you wish
  • 7:07 - 7:10
    to adjust on your "Manage Sites" screen.
  • 7:10 - 7:15
    Choose "Edit," and then adjust the robots.txt
  • 7:15 - 7:21
    settings as desired, and click "Save."
  • 7:21 - 7:23
    This has been a Web Archiving Service
  • 7:23 - 7:26
    video tutorial on rights management.
  • 7:26 - 7:27
    As always, if you have questions,
  • 7:27 - 7:31
    contact us at washelp@ucop.edu
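
The robots.txt behavior described in this tutorial (honor exclusions by default, and allow a curator to override them only when owner attribution is supplied) can be sketched roughly as follows. This is a minimal illustration and not WAS's actual implementation: the function name `may_capture`, its parameters, and the crawler name are assumptions made for this example. It uses Python's standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser


def may_capture(robots_txt, user_agent, url,
                honor_robots=True, owner_attribution=None):
    """Decide whether a URL may be captured (hypothetical helper).

    honor_robots=True mirrors the default "yes" setting described in
    the tutorial. An override is accepted only when owner attribution
    is provided, as the tutorial says is required of curators.
    """
    if not honor_robots:
        if not owner_attribution:
            raise ValueError("owner attribution is required to override robots.txt")
        return True

    # Parse the site's robots.txt text and apply its rules to this URL.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    blocked_url = "http://example.org/private/page.html"

    # Default setting: the exclusion is respected.
    print(may_capture(rules, "example-crawler", blocked_url))

    # Curator override with owner attribution supplied.
    print(may_capture(rules, "example-crawler", blocked_url,
                      honor_robots=False, owner_attribution="Example Organization"))
```

Requiring attribution at the moment of override keeps the policy consistent from one site to the next, which is the point made under Assertion #3.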
Description:

This Web Archiving Service tutorial video will give you a brief overview of rights management as it relates to web archiving.

