79 Single-Source Textbook Publishing
Overview. The fourth case is also an actual case—a self-referential one. It is a case study about the organizing system involved in the creation, production, and distribution of The Discipline of Organizing. See (Glushko 2015).
We have known since the beginning of this project that this book should not just be a conventional text. A printed book is an intellectual snapshot that is already dated in many respects the day it is published. In addition, the pedagogical goal of The Discipline of Organizing as a textbook for information schools and similar programs is made more difficult by the relentless growth of computing capability and the resulting technology innovation in our information-intensive economy and culture. We think that the emergence of ebook publishing opens up innovative possibilities as long as we can use a single set of source files to produce and update the print and digital versions of this book.
What is being organized? The content of this book began in early 2010 as more than 1000 slides and associated instructor notes for a graduate course “Information Organizing and Retrieval” that Robert J. Glushko, the primary author and editor of The Discipline of Organizing, was teaching at the University of California, Berkeley. These slides and notes were created in XML and transformed to HTML for presentation in a web browser.
The first decision to be made about resource organization led to the iterative sorting of the slides from 26 lectures into the 10 chapters in the initial outline for the book. The second decision concerned the granularity of the new content resources being created for the book. The team of authors was organized by chapters, which made chapters the natural granularity for file management and version control. Because authors were widely dispersed, we relied on the Dropbox cloud storage service to synchronize work. Nevertheless, the broad and deep topical coverage of the book meant that chapters had substantial internal structure (four levels of headings in some places), and many of these subsections became separately identified resources that moved from chapter to chapter until they found their natural home.
In addition to the text content and illustrations that make up the printed text, we needed to organize short videos, interactive examples, and other applications to incorporate in digital versions of the book.
Finally, it has been essential to view the software that transforms, assembles, formats, and assigns styles when turning source files into deliverable artifacts as resources that must be managed. For the first and second editions of the book, we were fortunate to get much of the software required to build both print and ebooks from O’Reilly and Associates, an innovative technology publisher that has been developing a single-source publishing system called Atlas. Because we have recently been experimenting with including richer interactivity and navigation capability, reader-controlled personalization, and other features that go beyond what Atlas enables, we now use our own custom-built single-source publishing system.
Why is it being organized? Publishing print and ebook versions of a text from the same source files is the only way to produce both in a cost-effective and maintainable fashion. Approaches that require any “hand-crafting” would make it impossible to revise the book on a timely schedule. Furthermore, a survey of Berkeley students in the summer of 2012 revealed a great diversity of preferred platforms for reading digital books that included laptop computers, Apple and Android tablets, and seven different dedicated ebook readers. Only an automated single-source publishing strategy could produce all these outputs.
The highly granular structure for the content resources that comprise this book makes cross-referencing vastly more precise, making it easier to use the book as a textbook and job aid. It will also make it easier to maintain and adapt the text for use in online courses. (The emerging best practice for online courses is to break up lectures and study content into smaller units than used in traditional classroom lectures.)
How much is it being organized? The nature and extent of resource organization for this book reflects its purpose of bringing together multiple disciplines that recognize organizing as a fundamental issue but from different perspectives. The book contains many specialized topics and domain-specific examples that might overwhelm the shared concepts. Our solution was to write a lean core text and to move much of the disciplinary and domain-specific content into tagged endnotes. These categories of endnotes are somewhat arbitrary, but the authoring task of identifying content to go into endnotes is a non-trivial one.
The extent of resource organization is also affected by the choice of XML vocabulary, and we carefully considered whether to choose DITA or DocBook. DITA has the benefit of having more native support for modular authoring and transparent customization and updating, but DocBook is much older and hence has better toolkits. We eventually chose DocBook.
When is it being organized? Despite the fact that the lecture notes with which the book began were in XML, we decided to author the book using Microsoft Word. Many of the authors had little experience with XML editors, and the highly developed commenting and revision management facilities in Word proved very useful. This tradeoff imposed the burden of converting files to XML during the production process, but only two of the authors were still working on the book at that stage, and both have decades of experience with hypertext markup languages.
How or by whom is it being organized? The chapter authors used Word style sheets in a careful manner, tagging text with styles rather than using formatting overrides. This enabled a conversion vendor to convert most of the book from Word to XML semi-automatically. Some cleanup of the markup is inevitable because of the ambiguity created when the source markup with Word styles is less granular than the target markup in XML. We do not know whether the amount of work left for us was atypical.
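The granularity mismatch described above can be illustrated with a minimal sketch. It is not the conversion vendor's actual process, which the text does not describe. The Word style names, the `STYLE_TO_DOCBOOK` table, the `convert_run` helper, and the `needs-review` role value are all hypothetical; only the DocBook element names (`para`, `title`, `emphasis`, `citetitle`, `firstterm`) are real. A style that maps to exactly one element converts automatically, while a style with several plausible targets is flagged for manual cleanup.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word style names to candidate DocBook elements.
# A style with more than one candidate is less granular than the target markup.
STYLE_TO_DOCBOOK = {
    "Body Text": ["para"],
    "Heading 1": ["title"],
    "Italic": ["emphasis", "citetitle", "firstterm"],  # ambiguous on purpose
}


def convert_run(style, text):
    """Return (element, needs_review) for one styled run of text.

    Unambiguous styles convert directly. Ambiguous or unknown styles get a
    best-guess element marked with role="needs-review" (an assumed convention)
    so that a markup editor can resolve them by hand.
    """
    candidates = STYLE_TO_DOCBOOK.get(style, [])
    if len(candidates) == 1:
        element = ET.Element(candidates[0])
        element.text = text
        return element, False

    # Fall back to the first candidate, or to a plain paragraph if the style is unknown.
    element = ET.Element(candidates[0] if candidates else "para")
    element.text = text
    element.set("role", "needs-review")
    return element, True


if __name__ == "__main__":
    for style, text in [("Body Text", "Chapters had internal structure."),
                        ("Italic", "Macbeth")]:
        element, needs_review = convert_run(style, text)
        print(ET.tostring(element, encoding="unicode"), needs_review)
```

In this sketch the italic run cannot be resolved automatically because the same visual style could mark emphasis, a cited title, or the first use of a glossary term, which is the kind of ambiguity that leaves cleanup work after a semi-automatic conversion.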
Nevertheless, waiting until the book was substantially finished to convert to XML meant that we were also deferring the effort to mark up the text with cross references, citations, glossary terms, and index entries, because these types of content were not included in the Word authoring templates and style sheets. As a result, a substantial amount of effort has been required of our copy and markup editor that could have been done by chapter editors if they had authored natively in XML. However, having a single markup editor has given this book a more consistent and complete bibliography, glossary, and index than would have been possible with multiple authors.
Other considerations. Because every bit of content in the book is tagged as either “core” or discipline-specific, our source files collectively represent a “family of books” with 2048 different members, any one of which we can build by filtering the content to include any combination from zero to eleven disciplines. It is impractical to publish this many editions, but we hope to use this flexibility to enable instructors to tailor the text for a wide range of courses in many different academic disciplines and customize the text for both graduate and undergraduate students. Better still would be an approach that defers the generation of a particular version of an ebook from “publishing time” to “reading time.” The same algorithms apply, but now the reader decides when and how to apply them, enabling the dynamic configuration of the book’s content. This radical capability is experimental as of August 2015, but we expect it to be generally available before too long.
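The figure of 2048 follows from the tagging scheme: each of the eleven disciplines is either included or excluded, so there are 2^11 = 2048 possible selections. The sketch below illustrates that arithmetic and the filtering step under stated assumptions. It is not the book's actual build software. The `Block` type, the placeholder discipline names, and the functions `build_edition` and `count_editions` are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations

# Placeholder tags: the text says there are eleven disciplines but does not
# list them here, so these names are invented for illustration.
DISCIPLINES = [f"discipline_{i}" for i in range(1, 12)]


@dataclass(frozen=True)
class Block:
    """One unit of content, tagged as "core" or with a single discipline."""
    text: str
    tag: str = "core"


def build_edition(blocks, selected):
    """Keep core blocks plus blocks whose discipline was selected, in source order."""
    selected = set(selected)
    return [b for b in blocks if b.tag == "core" or b.tag in selected]


def count_editions(disciplines):
    """Count every selection of disciplines, from none of them to all of them."""
    return sum(1
               for size in range(len(disciplines) + 1)
               for _ in combinations(disciplines, size))


if __name__ == "__main__":
    blocks = [
        Block("Shared concept."),
        Block("Note for the first discipline.", "discipline_1"),
        Block("Note for the second discipline.", "discipline_2"),
    ]
    edition = build_edition(blocks, {"discipline_2"})
    print([b.text for b in edition])
    print(count_editions(DISCIPLINES))
```

Because `build_edition` is a pure function of the tagged source and a selection, the same call could in principle run at publishing time for a fixed edition or at reading time with the reader's own selection, which is the shift the paragraph above describes.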
This design for a book challenges conventional definitions of book editions and forces us to imagine new ways to acknowledge collaborative authorship. But asking “What is The Discipline of Organizing?,” given these new authoring and publishing models, is a similar question to the one asked in Resources in Organizing Systems, “What is Macbeth?”
(Wilde and Catin 2007). Looking back, it seems ironic to start with a single-source XML publishing system, abandon it to author the book in Word, and then convert the Word files back to XML to enable single-source publishing.
(Kimber 2012) seems destined to become the definitive resource for DITA-based publishing. The definitive source for DocBook has long been (Walsh 2010).