-
This is a Web Archiving Service video tutorial
-
on rights management.
-
This tutorial is a brief overview
-
of rights management as it relates to web archiving.
-
It is highly recommended that you utilize the additional information
-
we provide on the WAS help page.
-
We provide two very useful documents:
-
one on WAS rights management practices,
-
and one with answers to frequently asked questions about robots.txt.
-
There are two areas of copyright law
-
that are most relevant to web archiving.
-
Section 108 of the Copyright Act,
-
which relates to the limitations on exclusive rights
-
as they pertain to reproductions
-
by libraries and archives;
-
and Fair Use, addressed in Section 107
-
of the Copyright Act,
-
which states that permission
-
for reproduction and display of works
-
protected by copyright
-
isn't required in certain circumstances
-
such as scholarship and research.
-
Guidance for web archiving best practices
-
regarding copyright can be found
-
in two documents.
-
The first is the Section 108 Study Group Report,
-
which contains recommendations
-
from copyright and library experts
-
representing libraries, copyright holders,
-
and others, on how copyright law's
-
library exceptions could be updated for the digital world.
-
The second is the Association of Research Libraries
-
2012 Code of Best Practices
-
in Fair Use for Academic and Research Libraries.
-
Links to these documents can be found
-
in the WAS rights management practices file.
-
WAS has distilled 9 assertions
-
regarding the rights of website owners
-
from the Section 108 Study Group Report
-
and the ARL Code of Best Practices.
-
This tutorial will go over each assertion
-
and explain how they inform WAS policies.
-
Assertion #1 states that libraries and archives
-
should have the right to capture and archive
-
publicly available websites without requesting
-
advance permission.
-
WAS does not enforce any requirement
-
for proof of permission before capture,
-
and both the Study Group Report and
-
the Code of Best Practices support this policy.
-
Assertion #2 states that US Federal, State
-
and local government agencies,
-
political parties, political candidates and
-
political action committees
-
should not have the right to prevent
-
publicly available content from being harvested.
-
While WAS respects robots.txt exclusions by default,
-
curators do have the ability
-
to override these settings
-
in order to capture and archive content.
-
Assertion #3 states that claims of fair use
-
relating to material posted with
-
'bot exclusion' headers to ward off
-
automatic harvesting may be stronger
-
when the institution has adopted
-
and follows a consistent policy.
-
The Section 108 Study Group addressed
-
government and political sites specifically.
-
The ARL Code provides broader guidance
-
supporting WAS's practice of having default
-
settings that respect robots.txt exclusions,
-
that a curator is able to override.
-
In addition, WAS enforces consistent
-
requirements when curators choose to
-
override robots.txt.
-
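Honoring a site's robots.txt exclusions, as WAS does by default, amounts to parsing the file's rules before fetching any page. Here is a minimal sketch using Python's standard-library urllib.robotparser; the rules, user-agent name, and URLs are hypothetical, not WAS's actual crawler code:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; a real crawler would fetch the
# file from the site being captured.
rules = [
    "User-agent: *",
    "Disallow: /private/",
    "Crawl-delay: 10",
]

parser = RobotFileParser()
parser.parse(rules)

# can_fetch() reports whether the rules permit the given user agent
# to retrieve a URL; crawl_delay() reports the requested delay.
print(parser.can_fetch("ExampleArchiveBot", "https://example.org/index.html"))  # True
print(parser.can_fetch("ExampleArchiveBot", "https://example.org/private/a"))   # False
print(parser.crawl_delay("ExampleArchiveBot"))                                  # 10
```

Overriding robots.txt, by contrast, simply means skipping this check, which is why WAS attaches additional requirements to that choice.
-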
Assertion #4 states that,
-
to the extent reasonably possible,
-
the legal proprietors of the sites
-
in question should be identified
-
according to the prevailing conventions
-
of attribution.
-
In order to override robots.txt exclusions,
-
WAS curators must provide
-
owner attribution for a site.
-
This owner attribution will appear
-
on the site details screen for any
-
archived website where robots.txt rules
-
have been overridden.
-
Assertion #5 states that libraries and archives
-
should be required to label prominently
-
all copies of captured online content
-
that are made accessible to users,
-
stating that the content is an archived copy
-
for use only for private study, scholarship and research
-
and providing the date of capture.
-
CDL's WAS service agreement expressly states,
-
"The Curatorial Partner recognizes Web Content
-
captured with WAS is for education purposes only
-
and may not be used for commercial purposes
-
by the Curatorial Partner or other third party."
-
In addition, all publicly displayed content
-
is marked "This document is an archived copy
-
for study and research."
-
Assertion #6 states that archived materials
-
are represented as they were captured,
-
with appropriate information on mode
-
of harvesting and date.
-
It is important to know that WAS does not
-
edit or alter content in any way.
-
In fact, CDL's preservation repository
-
has an ongoing fixity checking process
-
for all harvested web content,
-
to ensure that no changes have been made.
-
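Fixity checking of this kind generally works by recording a cryptographic checksum when content is ingested and periodically recomputing it to confirm nothing has changed. A minimal illustration using Python's standard-library hashlib; the content and digest below are hypothetical, not the repository's actual process:

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical archived content and the digest recorded at ingest.
content = b"<html>archived page</html>"
recorded_digest = sha256_of(content)

# A later fixity check recomputes the digest and compares it to the
# recorded value; a mismatch would indicate the content has changed.
if sha256_of(content) == recorded_digest:
    print("fixity check passed")
else:
    print("fixity check FAILED: content has changed")
```
-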
In addition, details of harvest and date
-
are available on the "Show Description" screen
-
of any publicly displayed archived file.
-
Assertion #7 states that an embargo of
-
a "reasonable period of time"
-
should be observed before archived materials
-
are made available to the general public.
-
WAS complies with this assertion
-
by enforcing an embargo between
-
the time that content is harvested and
-
when that archived material can become publicly accessible.
-
At present, this embargo period is 6 months.
-
Assertion #8 states that libraries and archives
-
should be prohibited from engaging
-
in any activities that are likely
-
to materially harm the value or operations
-
of the Internet site hosting the online content
-
that is sought to be captured and made available.
-
WAS service agreements with CDL's Curatorial Partners
-
clearly state that CDL may intervene or stop
-
a capture in progress based on an objection
-
raised by the content owner.
-
In addition, WAS crawlers are always run
-
with a significant delay, to avoid
-
impacting a site's performance.
-
Assertion #9 states that libraries
-
should provide copyright owners
-
with a simple tool for registering objections
-
to making items from such a collection
-
available online, and respond to such objections promptly.
-
WAS crawlers leave identification information
-
on each server that they interact with.
-
This includes a URL to a webpage
-
explaining the crucial role of web archiving
-
in preserving our cultural heritage and history.
-
This page contains a phone number, a webform,
-
and an email address to contact CDL.
-
It is also possible for curators and CDL
-
to block specific archived files, directories
-
or all content from public view.
-
This provides the ability to block content
-
that has been requested to be taken down
-
without blocking an entire archive.
-
You can easily select your desired robots.txt settings
-
when you create a site.
-
Under the "Capture Settings" screen
-
you have the option of leaving the default
-
"yes" to honor robots.txt,
-
or to choose "no."
-
If you are interested in changing
-
the robots.txt settings for a previously
-
created site, find the site you wish
-
to adjust on your "Manage Sites" screen.
-
Choose "Edit," and then adjust the robots.txt
-
settings as desired, and click "Save."
-
This has been a Web Archiving Service
-
video tutorial on rights management.
-
As always, if you have questions,
-
contact us at washelp@ucop.edu.