In the previous sections, as we surveyed the five perspectives on analyzing relationships, we mentioned numerous examples in which relationships play important roles in organizing systems. In this final section we examine three contexts for organizing systems where relationships are especially fundamental: the Semantic Web and Linked Data, bibliographic organizing systems, and situations involving system integration and interoperability.
The Semantic Web and Linked Data
In a classic 2001 paper, Tim Berners-Lee laid out a vision of a Semantic Web in which all information could be shared and processed by automated tools as well as by people. The essential technologies for making the web more semantic and relationships among web resources more explicit are applications of XML, including RDF (see “Resource Description Framework (RDF)”) and OWL (see “Ontologies”). Many tools have been developed to support more semantic encoding, but most still require substantial expertise in semantic technologies and web standards.
More likely to succeed are applications that aim lower, not trying to encode all the latent semantics in a document or web page. For example, some wiki and blogging tools contain templates for semantic annotation, and Wikipedia has thousands of templates and “infoboxes” to encourage the creation of information in content-encoded formats.
The “Linked Data” movement is an extension of the Semantic Web idea to reframe the basic principles of the web’s architecture in more semantic terms. Instead of the limited role of links as simple untyped relationships between HTML documents, links between resources described by RDF can serve as the bridges between islands of semantic data, creating a Linked Data network or cloud.
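The idea of typed links acting as bridges between islands of data can be illustrated with a minimal sketch. It models RDF statements as plain (subject, predicate, object) tuples rather than using an RDF library. The two datasets and every example.org URI are hypothetical; the predicate URIs are the published Dublin Core, FOAF, and OWL identifiers.

```python
# Minimal sketch: typed links (RDF-style triples) bridging two data "islands".
# The datasets and all example.org URIs are hypothetical illustrations.

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DCT_CREATOR = "http://purl.org/dc/terms/creator"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Island 1: a library dataset that knows about works and their creators.
library = {
    ("http://library.example.org/work/hamlet", DCT_CREATOR,
     "http://library.example.org/person/shakespeare"),
    # The bridge: a typed link asserting that two URIs name the same person.
    ("http://library.example.org/person/shakespeare", OWL_SAME_AS,
     "http://people.example.org/id/ws1564"),
}

# Island 2: a biographical dataset that knows people's names.
people = {
    ("http://people.example.org/id/ws1564", FOAF_NAME, "William Shakespeare"),
}


def objects(graph, subject, predicate):
    """Return all objects of triples matching the subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def creator_names(work_uri):
    """Follow creator and sameAs links across both datasets to find names."""
    graph = library | people  # merging graphs is just a set union of triples
    names = []
    for creator in objects(graph, work_uri, DCT_CREATOR):
        equivalents = [creator] + objects(graph, creator, OWL_SAME_AS)
        for uri in equivalents:
            names.extend(objects(graph, uri, FOAF_NAME))
    return names


print(creator_names("http://library.example.org/work/hamlet"))
```

Because each link carries a type, a program can decide which links to follow and what they mean, which an untyped HTML hyperlink does not allow.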
Bibliographic Organizing Systems
Much of our thinking about relationships in organizing systems for information comes from the domain of bibliographic cataloging of library resources and the related areas of classification systems and descriptive thesauri. Bibliographic relationships provide an important means to build structure into library catalogs.
Bibliographic relationships are common among library resources. Smiraglia and Leazer found that approximately 30% of the works in the Online Computer Library Center (OCLC) WorldCat union catalog have associated derivative works. Relationships among items within these bibliographic families differ, but the average family size for those works with derivative works was found to be 3.54 items. Moreover, “canonical” works that have strong cultural meaning and influence, such as “the plays of William Shakespeare” and The Bible, have very large and complex bibliographic families.
Barbara Tillett, in a study of 19th- and 20th-century catalog rules, found that many different catalog rules have existed over time to describe bibliographic relationships. She developed a taxonomy of bibliographic relationships that includes equivalence, derivative, descriptive, whole-part, accompanying, sequential or chronological, and shared characteristic. These relationship types span the relationship perspectives defined in this chapter: equivalence, derivative, and descriptive are semantic types; whole-part and accompanying are part semantic and part structural types; sequential or chronological are part lexical and part structural types; and shared characteristic is part semantic and part lexical.
Smiraglia expanded on Tillett’s derivative relationship to create seven subtypes: simultaneous derivations, successive derivations, translations, amplifications, extractions, adaptations, and performances.
In “Identity and Bibliographic Resources,” we briefly mentioned the four-level abstraction hierarchy for resources introduced in the Functional Requirements for Bibliographic Records (FRBR) report. FRBR was highly influenced by Tillett’s studies of bibliographic relationships, and it prescribes how the relationships among resources at different levels are to be expressed (work-work, expression-expression, work-expression, expression-manifestation, and so on).
Resource Description and Access (RDA)
Many cataloging researchers have recognized that online catalogs do not do a very good job of encoding bibliographic relationships among items, due both to catalog display design and to the limitations of how information is organized within catalog records. Author name authority databases, for example, provide information about variant author names, which can be very important in finding all of the works by a single author, but this information is not held within a catalog record. Similarly, MARC records can be formatted and displayed in web library catalogs, but the data within the records are not available for re-use, re-purposing, or re-arranging by researchers, patrons, or librarians.
The Resource Description and Access (RDA) next-generation cataloging rules are attempting to bring together disconnected resource descriptions to provide more complete and interconnected data about works, authors, publications, publishers, and subjects.
RDA uses RDF to assert relationships among bibliographic materials.
RDA and the Semantic Web
The move in RDA to encode bibliographic data in RDF stems from the desire to make library catalog data more web-accessible. As web-based data mash-ups, application programming interfaces (APIs), and web searching are becoming ubiquitous and expected, library data are becoming increasingly isolated. The developers of RDA see RDF as the means for making library data more widely available online.
In addition to simply making library data more web accessible, RDA seeks to leverage the distributed nature of the Semantic Web. Once rules for describing resources, and the relationships between them, are declared in RDF syntax and made publicly available, the rules themselves can be mixed and mashed up. Creators of information systems that use RDF can choose elements from any RDF schema. For example, we can use the Dublin Core metadata schema (which has been aligned with the RDF model) and the Friend of a Friend (FOAF) schema (a schema to describe people and the relationships between them) to create a set of metadata elements about a journal article that goes beyond the standard bibliographic information. RDA’s process of moving to RDF is well underway.
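To make the Dublin Core and FOAF example concrete, here is a minimal sketch that describes a journal article with elements drawn from both schemas and serializes the statements as N-Triples. It uses only plain Python rather than an RDF library. The article, the people, and all example.org URIs are hypothetical; the property names (title, creator, name, knows) are real terms in the Dublin Core and FOAF vocabularies.

```python
# Minimal sketch: mixing two RDF vocabularies in one resource description.
# The article, the people, and every example.org URI are hypothetical.

DCTERMS = "http://purl.org/dc/terms/"   # Dublin Core terms namespace
FOAF = "http://xmlns.com/foaf/0.1/"     # Friend of a Friend namespace

article = "http://example.org/articles/42"
author = "http://example.org/people/alice"
colleague = "http://example.org/people/bob"

# Each statement is (subject, predicate, (object kind, object value)).
triples = [
    # Standard bibliographic information, from Dublin Core.
    (article, DCTERMS + "title", ("literal", "A Hypothetical Journal Article")),
    (article, DCTERMS + "creator", ("uri", author)),
    # Information about people and their relationships, from FOAF.
    (author, FOAF + "name", ("literal", "Alice Example")),
    (author, FOAF + "knows", ("uri", colleague)),
]


def to_ntriples(statements):
    """Serialize statements as N-Triples, one statement per line."""
    lines = []
    for subject, predicate, (kind, value) in statements:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            # Escape backslashes and double quotes inside literal values.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(triples))
```

The last statement, that the author knows a colleague, is the kind of information that goes beyond a standard bibliographic record, and it is available here only because a second schema was mixed in.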
Integration and Interoperability
Integration is the controlled sharing of information between two (or more) business systems, applications, or services within or between firms. Integration means that one party can extract or obtain information from another; it does not imply that the recipient can make use of the information.
Interoperability goes beyond integration to mean that systems, applications, or services that exchange information can make sense of what they receive. Interoperability can involve identifying corresponding components and relationships in each system, transforming them syntactically to the same format, structurally to the same granularity, and semantically to the same meaning.
For example, an Internet shopping site might present customers with a product catalog whose items come from a variety of manufacturers who describe the same products in different ways. Likewise, the end-to-end process from customer ordering to delivery requires that customer, product and payment information pass through the information systems of different firms. Creating the necessary information mappings and transformations is tedious or even impossible if the components and relationships among them are not formally specified for each system.
In contrast, when these models exist as data or document schemas or as classes in programming languages, identifying and exploiting the relationships between the information in different systems to achieve interoperability or to merge different classification systems can often be completely automated. Because of the substantial economic benefits to governments, businesses, and their customers of more efficient information integration and exchange, efforts to standardize these information models are important in numerous industries. Interactions with Resources will dive deeper into interoperability issues, especially those that arise in business contexts.
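The three kinds of transformation that interoperability can require (syntactic, structural, and semantic) can be sketched for the product catalog example. This is a minimal illustration under stated assumptions, not a description of any real system: the two manufacturer formats, the field names, and the target catalog schema are all hypothetical.

```python
import csv
import io
import json

# Hypothetical records: two manufacturers describe a product differently.
RECORD_A = '{"sku": "A-100", "title": "Desk lamp", "price": "19.99 USD", "weight_lb": 2.0}'
RECORD_B = "item_no,description,amount,currency,mass_kg\nB-7,Desk lamp,18.50,EUR,0.9\n"

LB_TO_KG = 0.45359237  # exact definition of the international pound


def from_manufacturer_a(text):
    """Map manufacturer A's JSON record to the shared catalog schema."""
    raw = json.loads(text)                    # syntactic: JSON to dict
    amount, currency = raw["price"].split()   # structural: split combined field
    return {
        "id": raw["sku"],
        "name": raw["title"],
        "price_amount": float(amount),
        "price_currency": currency,
        # semantic: the same property, converted to the shared unit
        "weight_kg": round(raw["weight_lb"] * LB_TO_KG, 3),
    }


def from_manufacturer_b(text):
    """Map manufacturer B's CSV record to the shared catalog schema."""
    raw = next(csv.DictReader(io.StringIO(text)))  # syntactic: CSV to dict
    return {
        "id": raw["item_no"],                 # semantic: same meaning, new name
        "name": raw["description"],
        "price_amount": float(raw["amount"]),
        "price_currency": raw["currency"],
        "weight_kg": float(raw["mass_kg"]),
    }


catalog = [from_manufacturer_a(RECORD_A), from_manufacturer_b(RECORD_B)]
for item in catalog:
    print(item)
```

Each mapping function here is written by hand. When both source formats are described by formal schemas, correspondences of this kind are what tools can derive or check automatically.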
Ironically, the web was not semantic originally because Berners-Lee implemented web documents using a presentation-oriented HTML markup language. Designing HTML to be conceptually simple and easy to implement led to its rapid adoption. HTML documents can make assertions and describe relationships using REL and REV attributes, but browsers still do not provide useful interactions for link relations.
For example, Protégé is a free, open-source platform with a suite of tools to construct domain models and knowledge-based applications with ontologies. (See
Barbara Tillett has written extensively about the theory of bibliographic relationships; (Tillett 2001) is an especially useful resource because it is a chapter in a comprehensive discussion ambitiously titled Relationships in the Organization of Knowledge (Bean and Green 2001).
See Section 188.8.131.52.
See (Coyle 2010a).
The FRBR entities, RDA data elements, and RDA value vocabularies have been defined in alignment with RDF using the Simple Knowledge Organization System (SKOS). SKOS is an “RDF-compliant language specifically designed for term lists and thesauri” (Coyle 2010b). The SKOS website provides lists of registered RDF metadata schemas and vocabularies. From these, information system designers can create application profiles for their resources, selecting elements from multiple schemas, including FRBR and RDA vocabularies.