We need to focus on the interactions that are enabled because of the intentional acts of description or arrangement that transform a collection of resources into an organizing system. With physical resources, it is easy to distinguish the interactions that are designed into and directly supported by an organizing system because of intentional acts of description or arrangement from those that can take place with resources after they have been accessed. For example, when a book is checked out of a library it might be read, translated, summarized, criticized, or otherwise used—but none of these interactions would be considered a capability of the book that had been designed into the library. Some physical resources can initiate interactions, as surely “human resources” and “smart” objects with sensors and other capabilities can, but most physical resources are passive. We will discuss this idea of resource agency in “Resource Agency”.
In contrast, in organizing systems that contain digital resources the logical boundary between the resources and their interactions is less clear because what you can do with a digital resource is often not apparent. Furthermore, some of the interactions that are outside of the boundary with physical resources can be inside of it with digital ones. For example, when you check a printed book out of the library, it is no longer in the library when you translate it. But a digital book in the Google Books library is not removed when you start reading it, and a language translation service runs “inside” of it.
Additional issues in the design of interactions with resources are whether users have direct or mediated access to the resources, and whether they interact with the resources themselves or only with copies or descriptions of them. For example, users have direct access to original resources in a collection when they browse through library stacks or wander in museum galleries. Users have mediated or indirect access when they use catalogs or search engines. Because digital resources can be easily reproduced, it can be difficult to distinguish a copy from the original, which raises questions of authenticity we will discuss in “Authenticity”.
Affordance and Capability
The concept of affordance, introduced by J. J. Gibson, then extended and popularized by Donald Norman, captures the idea that physical resources and their environments have inherent actionable properties that determine, in conjunction with an actor’s capabilities and cognition, what can be done with the resource.
Including capabilities and cognition brings accessibility considerations into the definition of affordance. A resource is only accessible when it supports interactions, and it is ineffective design to implement interactions with resources that some people are unable to perform. A person who cannot see text cannot read it, and a person who uses a wheelchair cannot select a book from a tall library shelf. Describing or transforming resources to ensure their accessibility is discussed in greater detail in “Accessibility”.
When organizing resources involves arranging physical resources using boxes, bins, cabinets, or shelves, the affordances and the implications for access and use can be easily perceived. Resources of a certain size and weight can be picked up and carried away. Books on the lower shelves of bookcases are easy to reach, but those stored ten feet from the ground cannot be easily accessed.
We can analyze the organizing systems with physical resources to identify the affordances and the possible interactions they imply. We can compare the affordances or overall interaction capability enabled by different organizing systems for some type of physical resources, and we often do this without thinking about it. The tradeoffs between the amount of work that goes into organizing a collection of resources and the amount of work required to find and use them are inescapable when the resources are physical objects or information resources in physical form. We can immediately see that storing information on scrolls does not enable the random access capability that is possible with books.
Deciding what to count, and how to count it, when comparing the capabilities of organizing systems becomes more challenging the further we get from collections of static physical resources, like books or shoes, where it is usually easy to perceive and understand the possible interactions. With computers, information systems, and digital resources in general, considerations about affordances and capabilities are not as straightforward.
First, the affordances we can perceive might not be tied to any useful interaction. Donald Norman joked that every computer screen within reaching distance affords touching, but unless the display is touch-sensitive, this affordance only benefits companies that sell screen-cleaning materials.
Second, most of the interactions that are supported by digital resources are not apparent when you encounter them. You cannot tell from their names, but you probably know from past experience what interactions are possible with files of types “.doc” and “.pdf.” You probably do not know what interactions take place with “.xpi” and “.mobi” files.
A similar difficulty exists when we look at resource descriptions and data collections, where we often cannot tell just by examining their values what kinds of interactions and operations with them are sensible. Think of all the different kinds of information that might be associated with a collection of people like the students in a university. A database might contain student names, student IDs, gender, birth dates, addresses, a numeric code for academic major, course units completed, grade point average, and other information. These pieces of information differ in their data type; some are integers, some are real numbers, some are Boolean, and some are just text strings. The numeric data also differs in the level of measurement it represents. Student IDs and the academic major codes are nominal data, the house or apartment number in the address is ordinal data, and the course units and grade point average are interval data. Data type and level of measurement influence the kind of interactions that are meaningful; we can create an alphabetical list of students using their last names, count up the number of students with the same academic major, and calculate the average GPA or units completed. But it makes no sense to use the numeric codes for academic major to compute an average major.
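The point about data type and level of measurement can be made concrete with a minimal sketch. The student records, field names, and the table of permitted operations below are invented for illustration, and the levels assigned to each field simply follow the classification given above; this is one way to encode the idea, not a standard scheme.

```python
from collections import Counter
from statistics import mean

# Hypothetical student records; all names and values are invented.
students = [
    {"last_name": "Nguyen", "major_code": 17, "units": 60, "gpa": 3.4},
    {"last_name": "Alvarez", "major_code": 42, "units": 90, "gpa": 3.8},
    {"last_name": "Baker", "major_code": 17, "units": 30, "gpa": 3.0},
]

# Level of measurement for each field, following the classification in the text.
LEVELS = {
    "last_name": "text",
    "major_code": "nominal",
    "units": "interval",
    "gpa": "interval",
}

# Operations treated as meaningful at each level (a deliberate simplification).
ALLOWED = {
    "text": {"sort"},
    "nominal": {"count"},
    "ordinal": {"count", "sort"},
    "interval": {"count", "sort", "mean"},
}


def summarize(records, field, operation):
    """Apply an operation to a field only if it suits the field's level."""
    level = LEVELS[field]
    if operation not in ALLOWED[level]:
        raise ValueError(
            f"'{operation}' is not meaningful for {level} field '{field}'"
        )
    values = [record[field] for record in records]
    if operation == "sort":
        return sorted(values)
    if operation == "count":
        return dict(Counter(values))
    return mean(values)
```

Asking for `summarize(students, "gpa", "mean")` succeeds, while `summarize(students, "major_code", "mean")` raises an error, even though both fields hold numbers: the values alone do not tell us which interactions are sensible.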
Once you have discovered them, the capabilities of digital resources and information systems can be assessed by counting the number of functions, services, or application program interfaces. However, this very coarse measure does not take into account differences in the capability or generality of a particular interaction. For example, two organizing systems might both have a search function, but differences in the operators they allow, the sophistication of pre-processing of the content to create index terms, or their usability can make them differ vastly in power, precision, and effectiveness.
An analogous measure of functional capability for a system with dynamic or living resources is the behavioral repertoire, the number of different activities, or range of actions, that can be initiated.
We should not assume that supporting more types of interactions necessarily makes a system better or more capable; what matters is how much value is created or invoked in each interaction. A smartphone cluttered with features and apps you never use enables a great many interactions, but most of them add little value. Doors that open automatically when their sensors detect an approaching person do not need handles or require explicit interactions. Organizing systems can use stored or computed information about user preferences or past interactions to anticipate user needs or personalize recommendations. This has the effect of substituting information for interaction to make interactions unnecessary or simpler.
For example, a “smart travel agent” service can use a user’s appointment calendar, past travel history, and information sources like airline and hotel reservation services to transform a minimal interaction like “book a business trip to New York for next week’s meeting” into numerous hidden queries that would have otherwise required separate interactions. These queries are interconnected by logical or causal dependencies that are represented by information that overlaps between them. For example, all travel-related services (airlines, hotels, ground transportation) need the traveler’s identity and the time and location of the travel. A New York trip might involve all of these services, and they need to fit together in time and location for the trip to make sense. The hotel reservation needs to begin the day the flight arrives in the destination city, the limousine service needs to meet the traveler shortly after the plane lands, and the restaurant reservation should be convenient in time and location to the hotel.
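A minimal sketch can show how overlapping information links such hidden queries. The record types, the `plan_trip` function, and the thirty-minute pickup delay are all hypothetical; a real service would query reservation systems rather than compute values locally.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta


@dataclass
class Flight:
    traveler: str
    destination: str
    arrives: datetime


@dataclass
class HotelStay:
    traveler: str
    city: str
    check_in: date


@dataclass
class Pickup:
    traveler: str
    city: str
    at: datetime


def plan_trip(flight, pickup_delay=timedelta(minutes=30)):
    """Derive dependent reservations from the information in the flight.

    The traveler's identity, the destination, and the arrival time are
    shared across all three records, so one interaction supplies them all.
    """
    hotel = HotelStay(flight.traveler, flight.destination, flight.arrives.date())
    pickup = Pickup(flight.traveler, flight.destination, flight.arrives + pickup_delay)
    return hotel, pickup


# Usage with an invented traveler and arrival time.
flight = Flight("A. Traveler", "New York", datetime(2024, 5, 6, 15, 10))
hotel, pickup = plan_trip(flight)
```

Because the hotel and pickup records are computed from the flight, they cannot drift out of agreement with it, which is the sense in which information substitutes for interaction.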
Interaction and Value Creation
A useful way to distinguish types of interactions with resources is according to the way in which they create value, using a classification proposed by Apte and Mason. They noted that interactions differ not just in their overall intensity but in the absolute and relative amounts of physical manipulation, interpersonal or empathetic contact, and symbolic manipulation or information exchange involved in the interaction.
Furthermore, Apte and Mason recognized that the proportions of these three types of value-creating activities can be treated as design parameters, especially where the value created by retrieving or computing information could be completely separated from the value created by physical actions and person-to-person encounters. This configuration of value creation enables automated self-service, in which the human service provider can be replaced by technology, and outsourcing, in which the human provider is separated in space or time from the customer.
Value Creation with Physical Resources
Physical manipulation is often the intrinsic type of interaction with collections of physical resources. The resource might have to be handled or directly perceived in order to interact with it, and often the experience of interacting with the resource is satisfying or entertaining, making it a goal in its own right. People often visit museums, galleries, zoos, animal theme parks or other institutions that contain physical resources because they value the direct, perceptual, or otherwise unmediated interaction that these organizing systems support.
Physical manipulation and interpersonal contact might be required to interact with information resources in physical form like the printed books in libraries.
A large university library contains millions of books and academic journals, and access to those resources can require a long walk deep into the library stacks after a consultation with a reference librarian or a search in a library catalog. For decades library users searched through description resources—first printed library cards, and then online catalogs and databases of bibliographic citations—to locate the primary resources they wanted to access. The surrogate descriptions of the resources needed to be detailed so that users could assess the relevance of the resource without expending the significant effort of obtaining and examining the primary resource.
However, for most people the primary purpose of interacting with a library is to access the information contained in its resources. Many people prefer accessing digital documents or books to accessing the original physical resource because the incidental physical and interpersonal interactions are unnecessary. In addition, many library searches are for known items, which are easily supported by digital search.
In some organizing systems, interactions with physical resources are carried out by robotic devices, computational processes, or other entities that can act autonomously with no need for a human agent. Robots have profoundly increased efficiency in materials management, “picking and packing” in warehouse fulfillment, office mail delivery, and in many other domains where human agents once located, retrieved, and delivered physical resources. A “library robot” system that can locate books and grasp them from the shelves can manage seven times as many books in the same space used by conventional open stacks.
Interactions with physical resources often have highly tangible results; in the preceding examples of fulfillment and delivery interactions, resources move from one location to another. However, an abstract or architectural perspective on interaction design and value creation can create more flexibility in carrying out the interactions while still producing the expected value for the user. In general, more abstract descriptions of interactions and services allow for transparent substitution of the implementation, potentially enabling a computational process to be a substitute for one carried out by a person, or vice versa.
For example, a user buying from an internet-based store need not know and probably does not care which service delivers the package from the warehouse. Presenting the interaction to the shopper as the “delivery service” rather than as a “FedEx” or “UPS” service allows the retailer to choose the best service provider for each delivery. Going even further, if you need printed documents at a conference, sales meeting, or anywhere other than your current location, the interaction you desire is “provide me with documents” and not “deliver my documents.” It does not matter that FedEx will print your documents at their destination rather than shipping them there.
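The idea of presenting an abstract “delivery service” and substituting the implementation behind it can be sketched in a few lines. The `DeliveryService` interface, the two carriers, and their pricing formulas are invented for illustration and do not describe any real carrier.

```python
from typing import Protocol


class DeliveryService(Protocol):
    """The abstract interaction the shopper sees."""

    def quote(self, weight_kg: float) -> float: ...

    def deliver(self, package_id: str) -> str: ...


class CarrierA:
    # Hypothetical carrier: low base fee, higher per-kilogram rate.
    def quote(self, weight_kg: float) -> float:
        return 5.0 + 1.0 * weight_kg

    def deliver(self, package_id: str) -> str:
        return f"{package_id} delivered by carrier A"


class CarrierB:
    # Hypothetical carrier: higher base fee, lower per-kilogram rate.
    def quote(self, weight_kg: float) -> float:
        return 8.0 + 0.5 * weight_kg

    def deliver(self, package_id: str) -> str:
        return f"{package_id} delivered by carrier B"


def ship(package_id: str, weight_kg: float, carriers: list) -> str:
    """Pick the cheapest provider for this package, unseen by the shopper."""
    best = min(carriers, key=lambda carrier: carrier.quote(weight_kg))
    return best.deliver(package_id)
```

The shopper's interaction is always `ship(...)`; which carrier fulfills it can change from one package to the next without the shopper knowing or caring.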
Value Creation with Digital Resources
With digital resources, neither physical manipulation nor interpersonal contact is required, and the essence of the interaction is information exchange or symbolic manipulation of the information contained in the resource. Put another way, by replacing interactions that involve people and physical resources with symbolic ones, organizing systems can lower costs without reducing user satisfaction. This is why so many businesses have automated their information-intensive processes with self-service technology.
Similarly, web search engines eliminate the physical effort required to visit a library and enable users to consult more readily accessible digital resources. A search engine returns a list of the page titles of resources that can be directly accessed with just another click, so it takes little effort to go from the query results to the primary resource. This reduces the need for the rich surrogate descriptions that libraries have always been known for because it enables rapid evaluation and iterative query refinement.
The ease of use and speed of search engines in finding web resources creates the expectation that any resource worth looking at can be found on the web. This is certainly false, or Google would never have begun its ambitious and audacious project to digitize millions of books from research libraries. While research libraries strive to provide access to authoritative and specialized resources, the web is undeniably good enough for answering most of the questions ordinary users put to search engines, which largely deal with everyday life, popular culture, personalities, and news of the day.
Libraries recognize that they need to do a better job integrating their collections into the “web spaces” and web-based activities of their users if they hope to change the provably suboptimal strategies of “information foraging” most people have adopted that rely too much on the web and too little on the library. Some libraries are experimenting with Semantic Web and “Linked Data” technologies that would integrate their extensive bibliographic resources with resources on the open web.
Museums have aggressively embraced the web to provide access to their collections. While few museum visitors would prefer viewing a digital image over experiencing an original painting, sculpture, or other physical artifact, the alternative is often no access at all. Most museum collections are far larger than the space available to display them, so the web makes it possible to provide access to otherwise hidden resources.
The variety and functions of interactions with digital resources are determined by the amount of structure and semantics represented in their digital encoding, in the descriptions associated with the resources, or by the intelligence of the computational processes applied to them. Digital resources can support enhanced interactions of searching, copying, zooming, and other transformations. Digital books, or “ebooks,” demonstrate how access to content can be enhanced once it is no longer tied to the container of the printed book, but ebook readers vary substantially in their interaction repertoires; the baseline they all share is “page turning,” resizing, and full-text search.
To augment digital resources with text structures, multimedia, animation, interactive 3-D graphics, mathematical functions, and other richer content types requires much more sophisticated representation formats that tend to require a great deal of “hand-crafting.” An alternative to hand-crafted resource description is sophisticated computer processing guided by human inputs. For example, Facebook and many web-based photo organizing systems implement face recognition analysis that detects faces in photos, compares features of detected faces to features of previously identified faces, and encourages people to tag photos to make the recognition more accurate. Some online services use similar image classification techniques to bring together shoes, jewelry, or other items that look alike.
Richer interactions with digital text resources are possible when they are encoded in an application or presentation-independent format. Automated content reuse and “single-source publishing” is most efficiently accomplished when text is encoded in XML, but much of this XML is produced by transforming text originally created in word processing formats. Once it is in XML, digital information can be distributed, processed, reused, transformed, mixed, remixed, and recombined into different formats for different purposes, applications, devices, or users in ways that are almost impossible to imagine when it is represented in a tangible (and therefore static) medium like a book on a shelf or a box full of paper files.
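As a minimal sketch of single-source publishing, the following uses Python's standard `xml.etree.ElementTree` module to render one small XML source in two output formats. The `recipe`, `title`, and `step` element names and the two rendering functions are invented for illustration; production publishing pipelines typically rely on established vocabularies and transformation tools.

```python
import xml.etree.ElementTree as ET
from html import escape

# A single presentation-independent source (element names are hypothetical).
SOURCE = (
    "<recipe>"
    "<title>Tea</title>"
    "<step>Boil water.</step>"
    "<step>Steep the leaves.</step>"
    "</recipe>"
)


def to_html(root):
    """Render the source as an HTML heading and ordered list."""
    items = "".join(f"<li>{escape(step.text)}</li>" for step in root.findall("step"))
    return f"<h1>{escape(root.findtext('title'))}</h1><ol>{items}</ol>"


def to_text(root):
    """Render the same source as numbered plain text."""
    lines = [root.findtext("title").upper()]
    lines += [f"{n}. {step.text}" for n, step in enumerate(root.findall("step"), start=1)]
    return "\n".join(lines)


root = ET.fromstring(SOURCE)
html_version = to_html(root)
text_version = to_text(root)
```

Because the source records what each piece of content is rather than how it should look, adding another output format means adding another rendering function, not re-authoring the content.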
Businesses that create or own their information resources can readily take advantage of the enhanced interactions that digital formats enable. For libraries, however, copyright is often a barrier to digitization, both as a matter of law and because digitization enables copyright enforcement to a degree not possible with physical resources.
As a result, digital books are somewhat controversial and problematic for libraries, whose access models were created based on the economics of print publication and the social contract of the copyright first sale doctrine that allowed libraries to lend printed books.
Software-based agents do analogous work to robots in “moving information around” after accessing digital resources such as web services or physical resources with sensors attached that produce digital information. Agents can control or choreograph a set of interactions with digital resources to carry out complex business processes.
The United Nations Convention on the Rights of Persons with Disabilities recognizes accessibility to information and communications technologies as a basic human right. There is also a strong business case for accessibility: studies show that accessible websites are used more often, are easier to maintain, and produce better search results.
Many of the techniques for making a resource accessible involve transforming the resource or its description into a different form so someone who could not perceive it or interact with it in its original form can now do so. The most common operating systems all come with general-purpose accessibility features such as reading text aloud, recognizing speech, magnifying text, increasing cursor size, signaling with flashing lights instead of with sounds, providing keyboard shortcuts for selecting and navigating, and connecting to devices for displaying Braille. Google Translate converts text in one language to another, and many people use it to create a rough draft that is finished by a human translator.
Other techniques are not generic and automatic, and instead require investment by authors or designers to make information accessible. Websites are more accessible when images or other non-text content types have straightforward titles, captions, and “alt text” that describes what they are about. Consistent placement and appearance of navigation controls and interaction widgets is essential; for example, in a shopping site “My Cart” might always be found at the top right corner of the page.
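A simple check for the alt text mentioned above can be sketched with Python's standard `html.parser` module. The class and function names are our own, and the rule is deliberately crude: it flags every image whose alt text is absent or empty, even though an empty value is sometimes used on purpose for purely decorative images.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the sources of images that have no usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # A missing, valueless, or whitespace-only alt attribute is flagged.
        if not (attributes.get("alt") or "").strip():
            self.missing.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


# Usage with an invented page fragment.
page = (
    '<img src="shoe.png" alt="A red running shoe">'
    '<img src="banner.png">'
    '<img src="chart.png" alt="">'
)
flagged = images_missing_alt(page)
```

A check like this finds only where descriptions are absent; writing descriptions that say what an image is about still requires an author's judgment.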
If authors apply semantic and structural markup to the text and use formats that distinguish it from presentation instructions, page outlines and summaries can be generated to enhance navigation, and search can be made more precise by limiting it to particular sections or content types. As the “Information IQ” of the source format increases, more can be done to make it more accessible (see “Resource Format” and Figure: Information IQ.).
The Smithsonian Museum in Washington, DC invites visitors to record audio descriptions on mobile devices of the nearly 137 million objects in its collection, and then makes these available to everyone. This is just a small part of its efforts to make its exhibits more accessible. A company called D-Scriptive enables blind people to enjoy Broadway shows more by recording hundreds of audio descriptions that are synchronized with dialog spoken by the actors.
Transforming recorded spoken language to text to make it accessible and searchable is called transcription. At times transcription is necessary to comply with accessibility requirements, but it is often done simply to add organization to content, as when a script is created to separate the multiple voices in a radio or television interview or story.
Transcriptions created by skilled people are highly accurate but labor-intensive to produce, so speech-to-text software is increasingly being used to transcribe speech using pre-trained acoustic and language models. Training these models is computationally intensive, and there are many clever techniques to acquire the “labeled” inputs. However, most of them are conceptually simple; they take the huge amount of data collected by voice search applications and analyze what the searcher does with the results to assess the accuracy of the transcription. Transcription accuracy can be improved when models can be specialized by industry or application. For example, speech-to-text software for doctors is trained to recognize medical terminology, while software for use by generic voice recognition services like Apple’s is trained to understand dictation and commands or questions one would ask of a smartphone.
Since text transcripts are machine-readable, unlike audio or video files, adding text transcripts makes it possible for search engines to index audio and video in ways that were previously impossible. Pop Up Archive, an audio search company in Oakland, California, works with speech-to-text software specially trained for news media and spoken word content to make radio, podcasts, and archival audio searchable. A challenge for audio search is that even though a transcription with a few mistakes works just fine for search engines, people often expect transcriptions to be perfect.
When the speech is in a language that is not understood, it needs to be translated as well. Perhaps you have watched a movie on an international flight and were able to choose from subtitles in many different languages. Creating subtitles for a foreign film is an asynchronous task that is substantially easier than doing a real-time translation, and the demand for skilled translators for speeches and other synchronous situations (and interpreters, who translate speech to sign language for people with hearing disabilities) remains high.
Different levels of interactions or access can apply to different resources in a collection or to different categories of users. For example, library collections can range from completely open and public, to allowing limited access, to wholly private and restricted.
The library stacks might be open to anyone, but rare documents in a special collection are only accessible to authorized researchers. The same is true of museums, which typically have only a fraction of their collections on public display.
Because of their commercial and competitive purposes, organizing systems in business domains are more likely to enforce a granular level of access control that distinguishes people according to their roles and the nature of their interactions with resources. For example, administrative assistants in a company’s Human Resources department are not allowed to see salaries; HR employees in a benefits administration role can see salaries but not change them; management-level employees in HR can change the salaries. Some firms limit access to specific times from authorized computers or IP addresses.
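The Human Resources example amounts to role-based access control, which can be sketched as a table that maps roles to permitted interactions. The role names, permission names, and salary figure below are invented for illustration, and a real system would also authenticate users and log each access.

```python
# Permissions granted to each role, following the HR example in the text.
ROLE_PERMISSIONS = {
    "hr_assistant": frozenset(),
    "benefits_admin": frozenset({"read_salary"}),
    "hr_manager": frozenset({"read_salary", "write_salary"}),
}

# Hypothetical salary records keyed by employee ID.
salaries = {"emp-001": 50000}


class AccessDenied(Exception):
    """Raised when a role lacks the permission an interaction requires."""


def require(role, permission):
    # Unknown roles receive no permissions at all.
    if permission not in ROLE_PERMISSIONS.get(role, frozenset()):
        raise AccessDenied(f"role '{role}' lacks permission '{permission}'")


def read_salary(role, employee_id):
    require(role, "read_salary")
    return salaries[employee_id]


def change_salary(role, employee_id, amount):
    require(role, "write_salary")
    salaries[employee_id] = amount
```

Here `read_salary("benefits_admin", "emp-001")` returns the salary, while `change_salary("benefits_admin", "emp-001", 55000)` raises `AccessDenied`; only the `hr_manager` role can do both.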
A noteworthy situation arises when the person accessing the organizing system is the one who designed and implemented it. In this case, the person will have qualitatively better knowledge of the resources and the supported interactions. This situation most often arises in the organizing systems in kitchens, home closets, and other highly personal domains but can also occur in knowledge-intensive business and professional domains like consulting, customer relationship management, and scientific research.
Many of the organizing systems used by individuals are embedded in physical contexts where access controls are applied in a coarse manner. We need a key to get into the house, but we do not need additional permissions or passwords to enter our closets or to take a book from a bookshelf. In our online lives, however, we readily accept and impose more granular access controls. For example, we might allow or block individual “friend” requests on Facebook or mark photos on Flickr as public, private, or viewable only by named groups or individuals.
We can further contrast access policies based on their origins or motivations.
Designed resource access policies are established by the designer or operator of an organizing system to satisfy internally generated requirements. Examples of designed access policies are:
giving more access to “inside” users (e.g., residents of a community, students or faculty members at a university, or employees of a company) than to anonymous or “outside” users;
giving more access to paying users than to users who do not pay;
giving more access to users with capabilities or competencies that can add value to the organizing system (e.g., material culture researchers like archaeologists or anthropologists, who often work with resources in museum collections that are not on display).
Imposed policies are mandated by an external entity and the organizing system must comply with them. For example, an organizing system might have to follow information privacy, security, or other regulations that restrict access to resources or the interactions that can be made with them.
University libraries typically complement or replace parts of their print collections with networked access to digital content licensed from publishers. Typical licensing terms then require them to restrict access to users that are associated with the university, either by being on campus or by using virtual private network (VPN) software that controls remote access to the library network. Copyright law limits the uses of a substantial majority of the books in the collections of major libraries, prohibiting them from being made fully available in digital formats. Museums often prohibit photography because they do not own the rights to modern works they display.
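One common way to implement the kind of licensing restriction described above is to check whether a request comes from a network address the university controls. The sketch below uses Python's standard `ipaddress` module; the two address ranges are reserved documentation ranges standing in for a hypothetical campus network and VPN pool, not any real university's addresses.

```python
from ipaddress import ip_address, ip_network

# Placeholder ranges (reserved for documentation) for a hypothetical
# campus network and a hypothetical VPN address pool.
LICENSED_NETWORKS = [
    ip_network("192.0.2.0/24"),     # assumed campus range
    ip_network("198.51.100.0/24"),  # assumed VPN range
]


def may_access_licensed_content(client_ip):
    """Return True if the request originates from a licensed network."""
    address = ip_address(client_ip)
    return any(address in network for network in LICENSED_NETWORKS)
```

A user working from home is refused until the VPN assigns an address from the licensed pool, at which point the same request succeeds.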
Whether an access policy is designed or imposed is not always clear. Policies that were originally designed for a particular organizing system may over time become best practices or industry standards, which regulators or industry groups not satisfied with “self-regulation” later impose. Museums might aggressively enforce a ban on photography not just to comply with copyright law, but also to enhance the revenue they get from selling posters and reproductions.
Except when the resources on display are replicas of the originals, which is more common than you might suspect. Many nineteenth-century museums in the United States largely contained copies of pieces from European museums. Today, museums sometimes display replicas when the originals are too fragile or valuable to risk damage (Wallach 1998). Whether the “resource-based interaction” is identical for the replica and original is subjective and depends on how well the replica is implemented.
The “.xpi” file type is used for Mozilla/Firefox browser extensions, small computer programs that can be installed in the browser to provide some additional user interface functionality or interaction. The “.mobi” file type was originally developed to enable better document display and interactions on devices with small screens. Today its primary use is as the base ebook format for the Amazon Kindle, except that the Kindle version is more highly compressed and locked down with digital rights management.
(Apte and Mason 1995) introduced this framework to analyze services rather than interactions per se.
Furthermore, many of the resources might not be available in the user’s own library and could only be obtained through inter-library loan, which could take days or weeks.
In contrast, far fewer interactions in museum collections are searches for known items, and serendipitous interactions with previously unknown resources are often the goal of museum visitors. As a result, few museum visitors would prefer an online visit to experiencing an original painting, sculpture, or other physical artifact. However, it is precisely because of the unique character of museum resources that museums allow access to them but do not allow visitors to borrow them, in clear contrast to libraries.
See also the Popular Science article “How It Works: Underground Robot Library.”
Providing access to knowledge is a core mission of libraries, and it is worth pointing out that library users obtain knowledge both from the primary resources in the library collection and from the organizing system that manages the collection.
It also erodes the authority and privilege that apply to resources because they are inside the library when a web search engine can search the “holdings” of the web faster and more comprehensively than you can search a library’s collection through its online catalog.
See (Simon 2010). An exemplary project to enhance museum access is Delphi (Schmitz and Black 2008), the collections browser for the Phoebe A. Hearst Museum of Anthropology at University of California, Berkeley. Delphi very cleverly uses natural language processing techniques to build an easy-to-use faceted browsing user interface that lets users view over 600,000 items stored in museum warehouses. Delphi is being integrated into Collection Space (http://www.collectionspace.org/), an open source web collections management system for museum collections, collaboratively being developed by University of California, Berkeley, Cambridge University, Ontario Academy of Art and Design (OCAD), and numerous museums.
Even sophisticated text representation formats such as XML have inherent limitations: one important problem that arises in complex management scenarios, humanities scholarship, and bioinformatics is that XML markup cannot easily represent overlapping substructures in the same resource (Schmidt 2009).
Digital books change the economics, and the first sale doctrine is not as well established for digital works, which are licensed rather than sold (Aufderheide and Jaszi 2011). To protect their business models, many publishers are limiting the number of times ebooks can be lent before they “self-destruct.” Some librarians have called for boycotts of publishers in response (
In contrast to these new access restrictions imposed by publishers on digital works, many governments as well as some progressive information providers and scientific researchers have begun to encourage the reuse and reorganization of their content by making geospatial, demographic, environmental, economic, and other datasets available in open formats, as web services, or as data feeds rather than as “fixed” publications (Bizer 2009a), (Robinson et al. 2008). And we have made this book available as an open content repository so that it can be collaboratively maintained and customized.
We cannot explain this any better than the UN does: “The Convention follows decades of work by the United Nations to change attitudes and approaches to persons with disabilities. It takes to a new height the movement from viewing persons with disabilities as ‘objects’ of charity, medical treatment and social protection towards viewing persons with disabilities as ‘subjects’ with rights, who are capable of claiming those rights and making decisions for their lives based on their free and informed consent as well as being active members of society.” See
The Web Accessibility Initiative works to make the Web accessible to people with visual, auditory, speech, cognitive, neurological, and physical disabilities.
Smithsonian Guidelines for Accessible Exhibition Design
Because not every performance of a Broadway show is exactly the same, the D-Scriptive audio snippets are tied to particular bits of dialog. The theater's stage manager triggers the D-Scriptive system to broadcast the corresponding visual explanations to audience members listening on earpieces. For example, in The Lion King a snippet might explain that “on the left are two giraffes and a cheetah” (Giridharadas 2014).
In 2015 Netflix began a similar audio description service to accompany some of its original series. See
For a recent historical and highly technical review of speech recognition written by some of the most prominent researchers in the field, see (Huang, Baker, and Reddy 2014). An easier-to-read story about Apple's Siri voice recognition program is (Geller 2012). Popup Archive is https://www.popuparchive.org/ and its audio search service is
These access controls to the organizing system or its host computer are enforced using passwords and more sophisticated software and hardware techniques. Some access control policies are mandated by regulations to ensure privacy of personal data, and policies differ from industry to industry and from country to country. Access controls can improve the credibility of information by identifying who created or changed it, which is especially important when traceability is required (e.g., financial accounting).
An important difference between interactions with physical resources and those with digital resources is how they use resource descriptions for access control. Resources sometimes have associated security classifications like “Top Secret” that restrict who can learn about their existence or obtain them. Nonetheless, if you get your hands on a top secret printed document, nothing can prevent you from reading it. Similarly, printed resources often have “All rights reserved” copyright notices that say that you cannot copy them, but nothing can prevent you from making copies with a copy machine. On the other hand, learning of the existence of a digital resource might be of little value if copyright or licensing restrictions prevent you from obtaining it. Moreover, obtaining a digital resource might be of no value if its content is only available using a password, decryption key, or other resource description that enforces access control directly rather than indirectly like the security classifications.
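The contrast between indirect and direct enforcement can be made concrete with a small sketch. Everything in it is a hypothetical illustration of our own: the class names `PrintedDocument` and `ProtectedDigitalResource`, the sample text, and the password are assumptions. The sketch also simplifies by gating access with a password check rather than encrypting the content, which a real system would more likely do.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass
class PrintedDocument:
    """Physical resource: the classification is only a description."""
    classification: str
    text: str

    def read(self) -> str:
        # The label states a policy but cannot enforce it;
        # whoever holds the document can read it.
        return self.text


class ProtectedDigitalResource:
    """Digital resource: the credential check is part of every access."""

    def __init__(self, text: str, password: str) -> None:
        self._salt = os.urandom(16)
        self._digest = self._derive(password)
        self._text = text

    def _derive(self, password: str) -> bytes:
        # Derive a salted hash so the password itself is never stored.
        return hashlib.pbkdf2_hmac(
            "sha256", password.encode("utf-8"), self._salt, 100_000
        )

    def read(self, password: str) -> str:
        # Constant-time comparison of the derived digests.
        if not hmac.compare_digest(self._derive(password), self._digest):
            raise PermissionError("access denied")
        return self._text


paper = PrintedDocument("Top Secret", "launch plans")
print(paper.read())  # possession is sufficient

digital = ProtectedDigitalResource("launch plans", "correct horse")
print(digital.read("correct horse"))  # succeeds only with the credential
```

In the first class the classification travels with the resource but plays no part in `read`. In the second the resource description (here, the stored digest) is consulted on every access, so obtaining the object without the password yields nothing.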
In response to this trend, however, many libraries are supporting “open access” initiatives that strive to make scholarly publications available without restriction (Bailey 2007). Libraries and ebook vendors are engaged in a tussle about the extent to which the “first sale” rule that allows libraries to lend physical books without restrictions also applies to ebooks (Howard 2011).