20 Four Distinctions about Resources
The nature of the resource is critical for the creation and maintenance of quality organizing systems. There are four distinctions we make in discussing resources: domain, format, agency, and focus. Figure: Resource Domain, Format, Focus and Agency. depicts these four distinctions, perspectives or points of view on resources; because they are not independent, we cannot portray these distinctions as categories of resources.
Resource Domain
Resource domain is an intuitive notion that groups resources according to the set of natural or intuitive characteristics that distinguishes them from other resources. It contrasts with the idea of ad hoc or arbitrary groupings of resources that happen to be in the same place at some time.
For physical resources, domains can be coarsely distinguished according to the type of matter they are made of using easily perceived properties. The top-level classification of all things into the animal, vegetable, and mineral kingdoms by Carl Linnaeus in 1735 is deeply embedded in most languages and cultures to create a hierarchical system of domain categories. [1] Many aspects of this system of domain categories are determined by natural constraints on category membership that exist as patterns of shared and correlated properties; a resource identified as a member of one category must also be a member of another with which it shares some but not all properties. For example, a marble statue in a museum must also be a kind of material, and a fish in an aquarium must also be a kind of animal.
For information resources, easily perceived properties like a book’s color or size are less reliably correlated with resource domain, so we more often distinguish domains based on semantic properties; the definitions of the “encyclopedia,” “novel,” and “invoice” resource types distinguish them according to their typical subject matter, or the type of content, rather than according to the great variety of physical forms in which we might encounter them. Arranging books by color or size might be sensible for very small collections, or in a photo studio, but organizing according to physical properties would make it extremely impractical to find books in a large library.
We can arrange types of information resources in a hierarchy. However, because the category boundaries are not sharp it is more useful to view domains of information resources on a continuum from weakly-structured narrative content to highly structured transactional content. This framework, called the Document Type Spectrum by Glushko and McGrath, captures the idea that the boundaries between resource domains, like those between colors in the rainbow, are easy to see for colors far apart in the spectrum but hard to see for adjacent ones.[2] (See the sidebar, The Document Type Spectrum, and its corresponding depiction as Figure: Document Type Spectrum.)
Resource Format
Information resources can exist in numerous formats with the most basic distinction between physical and digital ones. This distinction is most important in the implementation of a resource storage or preservation system because that is where physical properties are usually considerations, and very possibly constraints. This distinction is less important at the logical level when we design interactions with resources because digital surrogates for the physical resources can overcome the constraints posed by physical properties. When we search for cars or appliances in an online store it does not matter where the actual cars or appliances are located or how they are organized. (See the sidebar, The Three Tiers of Organizing Systems).
Many digital representations can be associated with either physical or digital resources, but it is important to know which one is the original or primary resource, especially for unique or valuable ones.
Today many resources in organizing systems are born digital, created in word processors, digital cameras, and audio and video recorders. Other digital resources are created by sensors in “smart things” and by the systems that create digital resources when they interact with barcodes, QR codes, RFID tags, or other mechanisms for tracking identity and location.[3]
Other digital resources are created by digitization, the process for transforming an artifact whose original format is physical so it can be stored and manipulated by a computer. We digitize the printed word, photographs, blueprints, and record albums. Printed text, for example, can be digitized by scanning pages and using character recognition software or simply re-typing it.[4]
There are a vast number of digital formats that differ in many ways, but we can coarsely compare them on two dimensions: the degree to which they distinguish information content from presentation or rendering, and the explicitness with which content distinctions are represented. Taken together, these two dimensions allow us to compare formats on their overall “Information IQ” —with the overarching principle being that “smarter” formats contain more computer-processable information, as illustrated in Figure: Information IQ.
Simple digital formats for “plain text” documents contain only the characters that you see on your computer keyboard. ASCII is the most commonly used simple format, but ASCII is inadequate for most languages, which have larger character sets, and it also cannot handle mathematical characters.[5] The Unicode standard was designed to overcome these limitations.[6] (ASCII and Unicode are discussed in great detail in “Notations”.)
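The limits of ASCII can be seen in a few lines of code. The following is a minimal sketch, not part of the original text; the sample strings and the helper name `fits_in_ascii` are assumptions chosen for illustration, and UTF-8 stands in as one common encoding of Unicode.

```python
# Sample strings (illustrative assumptions); escapes keep the source unambiguous.
samples = [
    "invoice",        # English word, only basic Latin letters
    "na\u00efve",     # "naive" with a diaeresis on the i
    "\u2211",         # the mathematical summation sign
]

def fits_in_ascii(text: str) -> bool:
    """Return True if every character of text is one of ASCII's 128 characters."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

# Which samples can a plain ASCII format represent?
ascii_report = {s: fits_in_ascii(s) for s in samples}

# UTF-8, an encoding of Unicode, can represent all of them,
# using more than one byte for characters outside the ASCII range.
utf8_lengths = {s: len(s.encode("utf-8")) for s in samples}
```

Only the first sample survives ASCII encoding; the other two need a Unicode encoding, which spends extra bytes on the characters that ASCII lacks.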
Most document formats also explicitly encode a hierarchy of structural components, such as chapters, sections or semantic components like descriptions or procedural steps, and sometimes the appearance of the rendered or printed form.[7] Another important distinction to note is whether the information is encoded as a sequence of text characters so that it is human readable as well as computer readable. Encoding character content with XML, for example, allows for layering of intentional coding or markup interwoven with the “plain text” content. Because XML processors are required to support Unicode, any character can appear in an XML document. The most complex digital formats are those for multimedia resources and multidimensional data, where the data format is highly optimized for specialized analysis or applications.[8]
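To make the idea of markup layered over plain text concrete, here is a small sketch using Python's standard `xml.etree.ElementTree` module. The document and its element names (`recipe`, `title`, `procedure`, `step`) are hypothetical and invented for this illustration; they do not come from any particular schema.

```python
import xml.etree.ElementTree as ET

# A hypothetical XML document: the tags encode structure (title, description,
# procedural steps) around the character content, which includes non-ASCII letters.
doc = """<recipe>
  <title>Cr\u00e8me br\u00fbl\u00e9e</title>
  <description>A baked custard with a caramelized sugar crust.</description>
  <procedure>
    <step>Heat the cream.</step>
    <step>Whisk in the yolks and sugar.</step>
    <step>Bake, chill, then torch the sugar.</step>
  </procedure>
</recipe>"""

root = ET.fromstring(doc)

# Because the structure is explicit, a program can address components by name.
title = root.findtext("title")
steps = [step.text for step in root.iter("step")]

# Stripping the markup leaves only the "plain text" layer.
plain_text = " ".join("".join(root.itertext()).split())
```

The same characters are available either as undifferentiated text (`plain_text`) or as named components (`title`, `steps`), which is the extra computer-processable information that explicit markup contributes.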
Digitization of non-text resources such as film photography, drawings, and analog audio and visual recordings raises a complicated set of choices about pixel density, color depth, sampling rate, frequency filtering, compression, and numerous other technical issues that determine the digital representation.[9]
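A little arithmetic shows why these choices matter. The sketch below estimates uncompressed storage sizes; the function names and the example photograph dimensions are assumptions for illustration, while 44,100 samples per second at 16 bits in two channels is the familiar audio CD format.

```python
def image_bytes(width_in: float, height_in: float, ppi: int, bits_per_pixel: int) -> int:
    """Uncompressed size in bytes of a scan at the given pixel density and color depth."""
    pixels = round(width_in * ppi) * round(height_in * ppi)
    return pixels * bits_per_pixel // 8

def audio_bytes(seconds: int, sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed size in bytes of a recording at the given sampling rate and depth."""
    return seconds * sample_rate_hz * bits_per_sample * channels // 8

# An 8 x 10 inch photograph in 24-bit color, scanned at two pixel densities.
photo_300 = image_bytes(8, 10, 300, 24)   # 21,600,000 bytes
photo_600 = image_bytes(8, 10, 600, 24)   # 86,400,000 bytes

# One minute of stereo audio at CD quality.
minute_cd = audio_bytes(60, 44_100, 16, 2)  # 10,584,000 bytes
```

Doubling the pixel density quadruples the uncompressed image size, which is one reason digitization projects must weigh fidelity against storage and often choose different formats for archival masters and for delivery copies.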
There may be multiple intended uses and devices for a digitized resource that could require different digitization approaches and formats. Downstream users of digitized resources need to know the format in which a digital artifact has been created, so they can reuse it as is, or process it in other ways.
Some digital formats support interactions that are qualitatively different and more powerful than those possible with physical resources. Museums are using virtual world technology to create interactive exhibits in which visitors can fly through the solar system, scan their own bodies, and change gravity so they can bounce off walls. Sophisticated digital document formats can enable interactions with annotated digital images or video, 3-D graphics or embedded datasets. The Google Art Project contains extremely high resolution photographs of famous paintings that make it possible to see details that are undetectable under the normal viewing conditions in museums.[10]
Nevertheless, digital representations of physical resources can also lose important information and capabilities. The distinctive sounds of hip hop music produced by “scratching” vinyl records on turntables cannot be produced from digital MP3 music files.[11]
Copyright often presents a barrier to digitization, both as a matter of law and because digitization itself enables copyright enforcement to a degree not possible with physical resources. Digital formats can eliminate common forms of access and interaction that are inherent in printed books, such as the ability to give or sell them to someone else.[12]
Resource Agency
Agency is the extent to which a resource can initiate actions on its own. We can define a continuum between completely passive resources that cannot initiate any actions and active resources that can initiate actions based on information they sense from their environments or obtain through interactions with other resources. A book being read at the beach will grow warm from absorbing the sun’s energy but it has no way of measuring its temperature and is a completely passive resource. An ordinary mercury thermometer senses and displays the temperature but is not capable of communicating its own reading, whereas a digital wireless thermometer or weather station can.
Passive resources serve as nouns or operands that are acted upon, while active resources serve as verbs or operants that cause and carry out actions. We need a concept of agency to bring resources that are active information sources, or computational in character, into the organizing system framework. This concept also lets us include living resources, or more specifically, humans, into discussions about organizing systems in a more general way that emphasizes their agency.[13]
Passive or Operand Resources
Organizing systems that contain passive or operand resources are ubiquitous for the simple reason that we live in a world of physical resources that we identify and name in order to interact with them. Passive resources are usually tangible and static and thus they become valuable only as a result of some action or interaction with them.
Most organizing systems with physical resources, or with digitized equivalents of physical resources, treat those resources as passive. A printed book on a library shelf, a digital book in an ebook reader, a statue in a museum gallery, or a case of beer in a supermarket refrigerator creates value only when it is checked out, viewed, or consumed. None of these resources exhibits any agency, and none can initiate actions to create value on its own.
Active or Operant Resources
Active resources create effects or value on their own, sometimes when they initiate interactions with passive resources. Active resources can be people, other living resources, computational agents, active information sources, web-based services, self-driving cars, robots, appliances, machines or otherwise ordinary objects like light bulbs, umbrellas, and shoes that have been made “smarter.” We can exploit computing capability, storage capacity, and communication bandwidth to create active resources that can do things and support interactions that are impossible for ordinary physical passive resources.
We can analyze active resources according to five capabilities that progressively increase their agency. These capabilities build on each other to give resources and the organizing systems in which they participate more ways to create value through interactions and information exchanges.
- Sensing or awareness
The minimal capability for a resource to have some agency is for it to be able to sense or be aware of some aspect of its environment or its interactions with other resources. A thermometer measures temperature, a photodetector measures light, a gauge measures the fuel left in a car’s gas tank, a GPS device computes its location after detecting and analyzing signals from satellites, a wearable fitness sensor tracks your heartbeat and how far you walk. But sensing something in itself does not create any value in an organizing system. Something needs to be done.
- Actuation
A resource has the capability to actuate when it can create effects or value by initiating some action as a result of the information it senses; “actuator” is often used to describe a resource that can move or control a physical mechanism or system, while “effector” is used when the resource is a biological one. Resources can actuate by turning on lights, speakers, cameras, motors, switches, by sending a message about the state or value of a sensor, or by moving themselves around (as with robots).
A potential or latent actuation is created when a resource can display or broadcast some aspect of its state, but value is only created if another resource (possibly human) happens to see the display or hear the broadcast and then acts upon it.
For example, RFID chips, which are essentially bar codes with built-in radio transponders, can be attached to otherwise passive resources to make them active. RFID chips begin transmitting when they detect the presence of an RFID reading device. This enables automated location tracking and context sensing. RFID receivers are built into assembly lines, loading docks, parking lots, toll booths, or store shelves to detect when some RFID-tagged resource is at some meaningful location. RFID tags can be made more useful by having them record and transmit information from sensors that detect temperature, humidity, acceleration, and even biological contamination.[14]
- Connectivity
For an active resource to do useful work it must be connected in some way to the actuation mechanism that manipulates or controls some other resource. This connection might be a direct and permanent one between the resource and the thing it actuates, like that of a thermostat whose temperature sensing capability has a fixed connection to a heating or cooling system that it turns off or on depending on the temperature.
An important innovation in the design of active resources is “wrapping” physical resources with software so they can be given IP addresses and make connections with Internet protocols, which allows them to send information to an application with more capability to act on it. Such resources are said to be part of the “Internet of Things.”
Smart phones are active resources that can identify and share their own location, orientation, acceleration and a growing number of other contextual parameters to enable personalization of information services. Smart phones can also run the applications that receive messages from and send messages to other smart resources to monitor and optimize how they work.
- Computation or programmability
Simple active resources operate in a deterministic manner: given this sensor reading, do this. Other active resources have computational capabilities that enable them to analyze the current and historical information from their sensors, identify significant data values or patterns in these interaction resources, and then adapt their behavior accordingly.
Many thermostats are programmable, but most people do not bother to program them so they miss out on potential energy savings. Nest Labs makes a learning thermostat that programs itself. The Nest thermostat uses sensors for temperature, humidity, motion, and light to figure out whether people are at home, and a Wi-Fi connection to get local weather data.
The Roomba vacuum cleaning robot navigates around furniture, power cords, and stairs, and optimizes its cleaning paths to go over particularly dirty places. But vacuuming is all it does. More sophisticated robots are designed to be versatile and adaptable so they can repetitively perform whatever task is needed for some manufacturing process, and their capabilities can be continually upgraded by software updates, just like the apps on your smart phone. A new generation of robots typified by one called Baxter can be trained by example; a person moves Baxter’s arms and hands to show him what to do, and when Baxter has programmed himself to repeat it, he nods.
- Composability and cooperating
The “smartest” active resources can do more than analyze the information they collect and adapt what they do. In addition, they expose what they know and can do to other resources using standard or non-proprietary formats and protocols. This means that active resources that were independently designed and implemented can work together to create value.
Many organizing systems on the web consist of collections or configurations of active digital resources. Interactions among these active resources often implement information-intensive business models where value is created by exchanging, manipulating, transforming, or otherwise processing information, rather than by manipulating, transforming, or otherwise processing physical resources.
We are beginning to see the same principles of modularity and composability applied to physical resources, with open source software libraries for using sensors and micro-controllers and easy to use APIs. In essence, we are using software and physical resources in much the same way as functional building blocks, and standards will be critically important.
Service Oriented Architecture (SOA) is an emerging design discipline for organizing active software resources as functional business components that can be combined in different ways. SOA is generally implemented using web services that exchange XML documents in real-time information flows to interconnect the business service components.
A familiar design pattern for an organizing system composed from active digital resources is the “online store.” The store can be analyzed as a composition or choreography in which some web pages display catalog items, others serve as “shopping carts” to assemble the order, and then a “checkout” page collects the buyer’s payment and delivery information that gets passed on to other service providers who process payments and deliver the goods.
Design patterns for composing organizing systems from “smart” physical resources are emerging in work on the “smart home,” “smart office building,” or “smart city.” Many experiments are underway and new products emerging that are trying out different combinations of hardware and software to understand the design tradeoffs between them to best determine where the “smarts” should go. For example, we can compare a “smart home” built around a super-intelligent hub device that communicates and coordinates with many other “not so smart” devices from the same manufacturer to one in which all of the devices are equally smart and come from different makers.
At more complex scales, a truly smart building will not just have programmable thermostats to control heating and cooling systems. It will take in weather forecasts, travel calendars, information about the cost of electricity from different sources, and other relevant information as inputs to a model of how the building heats and cools to optimize energy use and cost while keeping the rooms at appropriate temperatures.
Standard application interfaces enable active resources to interact with people to get information that might otherwise come from sensors or that enhances the value of the sensor information. A programmable thermostat that can record time-based preferences of the people who use the space controlled by the thermostat is more capable than one with just a single temperature threshold. A standard Internet protocol for communicating with the thermostat would enable it to be controlled remotely.
Open and standard data formats and communication protocols enable the aggregation and analysis of information from many instances of the same type of active resource. For example, smart phones running the Google Maps application transmit information about their speed and location. Machine learning and sophisticated optimization techniques applied to this dataset can yield collective intelligence that can then be given back to the resources from which it was derived. In this case, Google can identify traffic jams and generate alternative routes for the drivers stuck in traffic.
There is a great deal of hype about the Internet of Things, but there is also a great deal of innovation. If you search for the phrase “Internet of Things” along with almost any physical resource, chances are you will find something. Try “baby,” “dog,” “fork,” “lettuce,” “pajamas,” “streetlamp,” and then a few of your own.
But not everything can be done best by computers. The web has enabled the use of people as active resources to carry out tasks of short duration that can be precisely described but which cannot be done reliably by computers. These tasks often require aesthetic or subjective judgment. The people doing these web-based tasks are often called “Mechanical Turks” by analogy to a fake chess playing machine from the 18th century that had a human hidden inside who secretly moved the pieces.[15]
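The capabilities of active resources described in this section can be summarized in a small sketch built on the thermostat example. Everything here is hypothetical: the class names `Heater` and `Thermostat`, the three-reading averaging window, and the dictionary returned by `report` are assumptions chosen for illustration and do not describe any real product.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class Heater:
    """A passive (operand) resource: it changes state only when acted upon."""
    on: bool = False

@dataclass
class Thermostat:
    """A hypothetical active (operant) resource."""
    heater: Heater                                 # connectivity: a fixed link to what it controls
    setpoint_c: float = 20.0
    history: list = field(default_factory=list)    # past sensor readings in degrees Celsius

    def sense(self, reading_c: float) -> None:
        """Sensing: record one temperature reading."""
        self.history.append(reading_c)

    def compute(self, window: int = 3) -> float:
        """Computation: smooth the most recent readings instead of reacting to one."""
        if not self.history:
            raise ValueError("no readings yet")
        return mean(self.history[-window:])

    def actuate(self) -> bool:
        """Actuation: switch the heater based on the computed temperature."""
        self.heater.on = self.compute() < self.setpoint_c
        return self.heater.on

    def report(self) -> dict:
        """Composability: expose state as a plain structure other resources can read."""
        return {"setpoint_c": self.setpoint_c,
                "heater_on": self.heater.on,
                "readings": len(self.history)}

heater = Heater()
thermostat = Thermostat(heater)

for reading in (18.0, 18.5, 19.0):    # a cool room
    thermostat.sense(reading)
cold_result = thermostat.actuate()    # average 18.5 is below 20.0, so the heater turns on

for reading in (21.0, 22.0, 23.0):    # the room warms up
    thermostat.sense(reading)
warm_result = thermostat.actuate()    # average 22.0 is above 20.0, so the heater turns off

status = thermostat.report()
```

The heater never acts on its own, while the thermostat senses, computes, and actuates; the `report` method hints at how a standard interface would let independently built resources cooperate with it.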
Resource Focus
A fourth contrast between types of resources distinguishes original or primary resources from resources that describe them. Any primary resource can have one or more description resources associated with it to facilitate finding, interacting with, or interpreting the primary one. Description resources are essential in organizing systems where the primary resources are not under the system’s control and can only be accessed or interacted with through the description. Description resources are often called metadata.
The distinction between primary resources and description resources, or metadata, is deeply embedded in library science and traditional organizing systems whose collections are predominantly text resources like books, articles, or other documents. In these contexts description resources are commonly called bibliographic resources or catalogs, and each primary resource is typically associated with one or more description resources.
In business enterprises, the organizing systems for digital information resources, such as business documents, or data records created by transactions or automated processes, almost always employ resources that describe, or are associated with, large sets or classes of primary resources.[16]
The contrast between primary resources and description resources is very useful in many contexts, but when we look more broadly at organizing systems, it is often difficult to distinguish them, and determining which resources are primary and which are metadata is often just a decision about which resource is currently the focus of our attention.[17]
For example, many Twitter users treat the 140-character message body as the primary resource, while the associated metadata about the message and sender (is it a forward, reply, link, etc.) is less important. However, for firms that use Twitter metadata to measure sender and brand impact, or identify social networks and trends, the focus is the metadata, not the content.[18]
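The shift in focus can be sketched with a toy data structure. The records, field names, and account names below are invented for illustration and are not Twitter's actual data format; the point is only that the same records serve two different notions of "primary."

```python
from collections import Counter

# Hypothetical message records: a body plus metadata about sender and message kind.
messages = [
    {"body": "Reading about organizing systems.", "sender": "@alice", "kind": "original"},
    {"body": "Reading about organizing systems.", "sender": "@bob",   "kind": "forward"},
    {"body": "Which chapter?",                    "sender": "@carol", "kind": "reply"},
    {"body": "Reading about organizing systems.", "sender": "@dave",  "kind": "forward"},
]

def reader_focus(msgs):
    """A reader treats the message body as primary and ignores the metadata."""
    return [m["body"] for m in msgs]

def analyst_focus(msgs):
    """An analyst treats the metadata as primary and never looks at the body."""
    return Counter(m["kind"] for m in msgs)

bodies = reader_focus(messages)
kinds = analyst_focus(messages)   # counts of originals, forwards, and replies
```

Neither view is more correct than the other; which fields count as the resource and which as its description depends on the task at hand.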
As another example, players on professional sports teams are human resources, but millions of people participate in fantasy sports leagues where teams consist of resources based on the statistics generated by the actual human players. Put another way, the associated resources in the actual sports are treated as the primary ones in the fantasy leagues.[19]
Resource Format x Focus
Applying the format contrast between physical and digital resources to the focus distinction between primary and descriptive resources yields a useful framework with four categories of resources (Figure: Resource Format x Focus.).
Physical Description of a Primary Physical Resource
The oldest relationship between descriptive resources and physical resources is when descriptions or other information about physical resources are themselves encoded in a physical form. Nearly ten thousand years ago in Mesopotamia small clay tokens kept in clay containers served as inventory information to count units of goods or livestock. It took 5000 years for the idea of stored tokens to evolve into Cuneiform writing in which marks in clay stood for the tokens and made both the tokens and containers unnecessary.[20]
Printed cards served as physical description resources for books in libraries for nearly two centuries.[21]
Digital Description of a Primary Physical Resource
Here, the digital resource describes a physical resource. The most familiar example of this relationship is the online library catalog used to find the shelf location of physical library resources, which beginning in the 1960s replaced the physical cards with database records. The online catalogs for museums usually contain a digital photograph of the painting, item of sculpture, or other museum object that each catalog entry describes.
Digital description resources for primary physical resources are essential in supply chain management, logistics, retailing, transportation, and every business model that depends on having timely and accurate information about where things are or about their current states. This digital description resource is created as a result of an interaction with a primary physical resource like a temperature sensor or with some secondary physical resource that is already associated with the primary physical resource like an RFID tag or barcode.
Augmented reality systems combine a layer of real-time digital information about some physical object with a digital view or representation of it. The yellow “first down” lines superimposed in broadcasts of football games are a familiar example. Augmented reality techniques that superimpose identifying or descriptive metadata are used in displays to support the operation or maintenance of complex equipment, in smart phone navigation and tourist guides, in advertising, and in other domains where users might otherwise need to consult a separate information source. Advanced airplane cockpit technology includes heads-up displays that present critical data based on available instrumentation, including augmented reality runway lights when visibility is poor because of clouds or fog.
Augmented reality displays have recently been incorporated into wearable technology like Google Glass, which mounts on eyeglass frames to display information obtained from the Internet after being requested by voice commands. Some luxury car brands have incorporated similar technology to project dashboard data, traffic conditions, and directions on the driver’s windshield.
Digital Description of a Primary Digital Resource
A digital resource describes a digital resource. This is the relationship in a digital library or any web-based organizing system, making it possible to access a primary digital resource directly from the digital secondary resource.
Physical Description of a Primary Digital Resource
This is the relationship implemented when we encounter an embedded QR barcode in newspaper or magazine advertisements, on billboards, sidewalks, t-shirts, or on store shelves. Scanning the QR code with a mobile phone camera can launch a website that contains information about a product or service, place an order for one unit of the pointed-to item in a web catalog, dial a phone number, or initiate another application or service identified by the QR code.[22]
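The four categories in this framework are simply the cross product of two binary distinctions, which a short sketch can make explicit. The enum and function names are assumptions for illustration; the example resources paraphrase those discussed above.

```python
from enum import Enum
from itertools import product

class Format(Enum):
    PHYSICAL = "physical"
    DIGITAL = "digital"

# Keys are (format of the description resource, format of the primary resource).
EXAMPLES = {
    (Format.PHYSICAL, Format.PHYSICAL): "printed catalog card for a library book",
    (Format.DIGITAL, Format.PHYSICAL): "online catalog record for a shelved book",
    (Format.DIGITAL, Format.DIGITAL): "digital library record that links to a digital text",
    (Format.PHYSICAL, Format.DIGITAL): "printed QR code that points to a website",
}

def describe(description: Format, primary: Format) -> str:
    """Name one cell of the Format x Focus framework and give its example."""
    example = EXAMPLES[(description, primary)]
    return f"{description.value} description of a primary {primary.value} resource: {example}"

# Enumerate every combination of description format and primary format.
categories = [describe(d, p) for d, p in product(Format, repeat=2)]
```

Because the two distinctions are treated as independent dimensions, every combination is populated, including the physical description of a digital resource.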
-
(Linnaeus 1735). Linnaeus is sometimes called the father of modern taxonomy (which is unfair to Aristotle) but he certainly deserves enormous credit for the systematic approach to biological classification that he proposed in Systema Naturae, published in 1735. This seminal work contains the familiar kingdom, class, order, family, genus, species hierarchy.
-
Project Gutenberg, begun in 1971, was the first large-scale effort to digitize books; its thousands of volunteers have created about 40,000 digital versions of classic printed works. Systematic research in digital libraries began in the 1990s when the US National Science Foundation (NSF), the Advanced Research Projects Agency (ARPA), and NASA launched a Digital Library Initiative that emphasized the enabling technologies and infrastructure. At about the same time numerous pragmatic efforts to digitize library collections began, characterized by some as a race against time as old books in libraries were literally disintegrating and turning into dust. The Internet Archive, started in 1996, now has a collection of over 3 million texts and has estimated the cost of digitizing to be about $30 for the average book. Multiply this by the scores of millions of books held in the world’s research libraries and it is easy to see why many libraries endorsed Google’s offer to digitize their collections.
-
The ASCII scheme was standardized in the 1960s when computer memory was expensive and most computing was in English-speaking countries, so it is minimal and distinguishes only 128 characters (Cerf 1969). American Standard Code for Information Interchange (ASCII) is an ANSI specification. (See http://en.wikipedia.org/wiki/ASCII.)
-
Unicode 6.0 (http://www.unicode.org/) encodes 109,449 characters for the writing systems of the world, so a single standard can represent the characters of every existing language, even “dead” ones like Sumerian and Hittite. Unicode encodes the scripts used in languages, rather than languages per se, so there only needs to be one representation of the Latin, Cyrillic, Arabic, and other scripts that are used for writing multiple languages. Unicode also distinguishes characters from glyphs, the different forms for the same character, enabling different fonts to be identified as the same character.
-
Encoding of structure in documents is valuable because titles, sections, links and other structural elements can be leveraged to enhance the user interface and navigational interactions with the digital document and enable more precise information retrieval. Some uses of documents require formats that preserve their printed appearance. “Presentational fidelity” is essential if we imagine a banker or customs inspector carefully comparing a printed document with a computer-generated one to ensure they are identical.
-
Text encoding specs are well-documented; see (http://www.wotsit.org/list.asp?fc=10).
-
The ambitious use of virtual world technology to create novel forms of interaction described by (Rothfarb and Doherty 2007) reflects the highly interactive character of its host museum, the Exploratorium in San Francisco (http://www.exploratorium.edu/). Similarly, the Google Art Project (http://googleartproject.com) is notable for its goal of complementing and extending, rather than merely imitating, the museum visitor’s encounter with artwork (Proctor 2011). A feature that lets people create a “personal art collection” is very popular, enabling a fan of Vincent Van Gogh to bring together paintings that hang in different museums.
-
However, scratching can be simulated using a smart phone or tablet app called djay. See http://www.algoriddim.com/djay.
-
As a result, digital books are somewhat controversial and problematic for libraries, whose access models were created based on the economics of print publication and the social contract of the copyright first sale doctrine that allowed libraries to lend printed books. Digital books change the economics and first sale is not as well-established for digital works, which are licensed rather than sold (Aufderheide and Jaszi 2011). To protect their business models, many publishers are limiting the number of times ebooks can be lent before they “self-destruct.” Some librarians have called for boycotts of publishers in response (http://boycottharpercollins.com).
-
The opposing categories of operands and operants have their roots in debates in political economics about the nature of work and the creation of value (Vargo and Lusch 2004) and have more recently played a central role in the development of modern thinking about service design (Constantin and Lusch 1994), (Maglio et al. 2009).
-
See (Allmendinger and Lombreglia 2005), (Want 2006), and (Crawford and Johnson 2012).
-
Luis von Ahn (von Ahn 2004) was the first to use the web to get people to perform “microwork” or “human computation” tasks when he released what he called “the ESP game” that randomly paired people trying to agree on labeling an image. Not long afterward Amazon created the MTurk platform (http://www.mturk.com) that lets people propose microwork and others sign up to do it, and today there are both hundreds of thousands of tasks offered and hundreds of thousands of people offering to be paid to do them.
-
For semi-structured or more narrative documents these descriptions might be authoring templates used in word processors or other office applications, document schemas in XML applications, style sheets, or other kinds of transformations that change one resource representation into another one. Primary resources that are highly and regularly structured are invariably organized in databases or enterprise information management systems in which a data schema specifies the arrangement and type of data contained in each field or component of the resource.
-
Describing information as “metadata” suggests that it is of secondary importance, not as essential or informative as the resource being described. This is surely the reason why the US National Security Agency and those of other governments, whose unauthorized surveillance of global communications was revealed in 2013 by Edward Snowden, often stressed that they were only collecting message metadata, not its content. Of course, information about who you communicate with and when you do so defines your social network, information that is potentially very valuable, and the NSA knows this just as Facebook and Twitter do.
-
There are a large number of third-party Twitter apps. See http://twitter.pbworks.com/w/page/1779726/Apps. For a scholarly analysis see (Efron 2011).
-
The basic idea behind fantasy sports is quite simple. You select a team of existing players in any sport, and then compare their statistical performance against other teams similarly selected by other people. Fantasy sports appeal mostly to die-hard fans who study player statistics carefully before “drafting” their players. The global fantasy sports business for companies who organize and operate fantasy leagues is estimated as between 1 and 2 billion US dollars annually (Montague 2010).
-
The oldest known lists of books were created about 4000 years ago in Sumeria. The first use of cards in library catalogs was literal; when the revolutionary government of France seized private book collections, an inventory was created starting in 1791 using the blank backs of playing cards. 110 years later the US Library of Congress began selling pre-printed catalog cards to libraries, but in the mid-1960s the creation of the Machine-Readable Cataloging (MARC) format marked the beginning of the end of printed cards. See (Strout 1956). The MARC standards are at http://www.loc.gov/marc/.
-
We treat resource format and resource focus as distinct dimensions, so there are four categories here. This contrasts with David Weinberger’s three “orders of order” that he proposes in the first chapter of a book called Everything is Miscellaneous (Weinberger 2007). Weinberger starts with the assumption that physical resources are inherently the primary ones, so the first “order of order” emerges when physical resources are arranged. The second “order of order” emerges when physical description resources are arranged, and the third “order of order” emerges when digital description resources for physical resources are arranged. Later in the book Weinberger mentions the use of bar codes associated with websites, a physical description of a digital resource, but because he started with the assumption that physical resources define the “first order” this example does not fit into his orders of order.