Requirements define what must be done but not how to do it; that’s the role of the design and implementation phases. Being explicit about requirements and the intended scope and scale of an organizing system before moving on to these phases in an organizing system’s lifecycle avoids two problems. The first is taking a narrow and short-term focus on the initial resources in a collection, which might not be representative of the collection when it reaches its planned scope and scale. This can result in overly customized and inflexible resource descriptions or arrangements that cannot easily accommodate the future growth of the collection. A second problem, often a corollary of the first, is not separating design principles from their implementation in some specific environment or technology.
Choosing Scope- and Scale-Appropriate Technology
A simple organizing system to satisfy personal record keeping or some short-lived information management requirements can be implemented using folders and files on a personal computer or by using “off the shelf” generic software such as web forms, spreadsheets, databases, and wikis. Other simple organizing systems run as applications on smart phones. Some small amount of configuration, scripting, structuring, or programming might be involved, but in many cases this work can be done in an ad hoc manner. The low initial cost to get started with these kinds of applications must be weighed against the possible cost of having to redo a lot of the work later because the resources and the resource descriptions might not be easily exported to new applications.
More capable organizing systems that enable the persistent storage and efficient retrieval of large amounts of structured information resources generally require additional design and implementation efforts. Flat word processing files and spreadsheets are not adequate. Instead, XML document models and database schemas often must be developed to ensure more control of and validation of the information content and its descriptions. Software for version and configuration management, security and access control, query and transformation, and for other functions and services must also be developed to implement the organizing system.
Technology for organizing systems will always evolve to enable new capabilities. For example, cloud computing and storage are radically changing the scale of organizing systems and the accessibility of the information they contain. It might be possible to add these capabilities and services to an organizing system incrementally, using informal design and implementation methods. However, if information models, processing logic, business rules, and other constraints are encoded in the software without explicit traceability to requirements and design decisions, the organizing system will be difficult to maintain when the context, scope, or requirements change. This is why we have repeatedly emphasized the importance of architectural thinking about organizing systems, beginning in “The Concept of “Organizing Principle”” where we proposed that organizing principles should ideally be expressed in a way that does not assume how they will be implemented. (See also ““Information Architecture” and Organizing Systems”, “Classification vs. Physical Arrangement”, and “Introduction”.)
Much of the advice about designing and implementing an organizing system can be summarized as “architectural thinking,” introduced in “The Concept of “Organizing Principle””. The overall purpose of architectural thinking is to separate design issues from implementation ones to make a system more robust and flexible. Architectural thinking leads to more modularity and abstraction in design, making it easier to change an implementation to satisfy new requirements or to take advantage of new technologies or procedures. It is also important to think architecturally about the design of the vocabularies and schemas for resource description and of classification systems to leave room for expansion to accommodate new resource types (“Implementing Categories” and “Principles for Maintaining the Classification over Time”). Doing so is easier if the descriptions are logically and physically distinct from the resources they describe. A checklist that brings together useful principles and processes for architectural thinking from all parts of this book is in the nearby sidebar.
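One way to picture this separation is a minimal Python sketch in which the organizing principle is a plain function over resource descriptions and the storage technology sits behind a small interface. Every name here (Description, Store, InMemoryStore, arrange, by_author_then_title, by_title) is hypothetical and chosen only for illustration; it is a sketch under those assumptions, not a prescribed design.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Description:
    """A resource description, kept distinct from the resource it describes."""
    resource_id: str
    title: str
    author: str


# Organizing principles: stated without reference to any storage technology.
def by_author_then_title(d: Description) -> tuple:
    return (d.author.lower(), d.title.lower())


def by_title(d: Description) -> str:
    return d.title.lower()


class Store(ABC):
    """Implementation layer: where and how descriptions are kept."""

    @abstractmethod
    def add(self, d: Description) -> None: ...

    @abstractmethod
    def all(self) -> list: ...


class InMemoryStore(Store):
    """One possible implementation; a database-backed one could replace it."""

    def __init__(self) -> None:
        self._items = {}

    def add(self, d: Description) -> None:
        self._items[d.resource_id] = d

    def all(self) -> list:
        return list(self._items.values())


def arrange(store: Store, principle) -> list:
    """Apply any organizing principle to any store implementation."""
    return sorted(store.all(), key=principle)
```

Because arrange depends only on the Store interface and on a principle passed in as a function, either side can change without touching the other: a new arrangement is a new function, and a new technology is a new Store subclass.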
Nevertheless, architectural thinking requires more careful analysis of resources and implementation alternatives, and most people do not think this way, especially for personal and informal organizing systems. You can imagine that someone might arrange a collection of paperback books in a small bookcase whose shelf height and width are perfectly suited to the paperbacks they currently own. However, this organizing system would not work at all for large format books, and a new paperback could not be added to the collection unless another was purged from it. It would be more sensible to start with a bigger bookcase with adjustable shelves so that the organizing system would have a longer lifetime.
You might think that large institutional organizing systems would avoid these problems caused by tying a collection too tightly to the physical environment in which it is initially organized, but sometimes they do not. A famous example involves the art collection of the Barnes Foundation, which had to keep its paintings in the exact same crowded arrangements when the museum made a controversial move from a small building to a larger one because the donor had mandated that the paintings never be moved from their original settings. (See the sidebar, The Barnes Collection).
For digital resources, inexpensive storage and high bandwidth have largely eliminated capacity as a constraint for organizing systems, with an exception for big data, which is defined as a collection of data that is too big to be managed by typical database software and hardware architectures. Even so, big data collections are often large but homogeneous, so their scale is not their most important challenge from an organizing system perspective (“Scope and Scale of the Collection”).
Distinguishing Access from Control
Because large resource collections are often used for multiple purposes by many different people or projects, they illustrate another important architectural issue for collections of digital resources. A requirement for access to resources does not imply a need to directly own or control them, and information-intensive and web-based businesses have increasingly adopted organizing system designs that involve storage of digital resources in the cloud, licensing of globally distributed resources, and outsourcing of information services. Designs that use these architectural concepts can realize functional and quality improvements because the location and identity of the service provider are hidden by an abstraction layer (“Value Creation with Physical Resources”, “Distinguish Identifying and Resolving”). However, separating access from ownership has been a cultural challenge for some libraries and museums whose institutional identities emphasize the resources they directly control and the physical buildings in which they control them.
Standardization and Legacy Considerations
As we noted with the Barnes Collection, a building becomes old and outdated over time. The technology used in digital organizing systems becomes obsolete faster than physical buildings do. The best way to slow the inevitable transformation of today’s cutting edge technology to tomorrow’s legacy technology is to design with standard data formats, description vocabularies and schemas, and classification systems unless you have specific requirements that preclude these choices.
Even a requirement to interoperate with an organizing system that uses proprietary or non-standard specifications can usually be satisfied by transforming from a standard format (“Institutional Semantics”, “Implementing Interactions”). Similarly, it is better to design the APIs and data feeds of an organizing system in a generic or standard way that abstracts from their hidden implementation. This design principle makes it easier for external users to understand the supported interactions, and also prevents disclosure of any aspects of resource description or organization that provide competitive advantage. For example, the way in which a business classifies products, suppliers, customers, or employees can be competitively important.
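As a rough illustration of transforming from a standard format, the Python sketch below keeps a description under Dublin Core style element names (title, creator, date) and derives a partner's non-standard record from it. The partner field names (TTL, AUTH, PUB_YR), the internal_class field, and the function name are all assumptions made up for this example.

```python
# Hypothetical mapping from Dublin Core style element names to the
# field names of an imagined partner system with a non-standard format.
PARTNER_FIELD_MAP = {
    "title": "TTL",
    "creator": "AUTH",
    "date": "PUB_YR",
}


def to_partner_format(record: dict) -> dict:
    """Derive the partner's record from the standard one.

    Fields without a mapping are left out, so internal descriptions
    (for example, a private classification code) are not disclosed.
    """
    return {
        PARTNER_FIELD_MAP[name]: value
        for name, value in record.items()
        if name in PARTNER_FIELD_MAP
    }


standard_record = {
    "title": "Walden",
    "creator": "Thoreau",
    "date": "1854",
    "internal_class": "A7",  # hypothetical internal classification
}
partner_record = to_partner_format(standard_record)
```

The standard record remains the authoritative description; the non-standard one is generated only at the boundary, so supporting another partner means adding another mapping rather than redesigning the descriptions.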
Two important design questions that arise with data transformation or conversion, whether it is required by a technology upgrade or an interoperability requirement, are when to do it and where to do it. The job of converting all the resources in a collection can typically be outsourced to a firm that specializes in format conversion or resource description, and a batch or pre-emptive conversion of an entire collection enables an upgraded or new organizing system to operate more efficiently when it is not distracted by ongoing conversion activity. On the other hand, if resources vary greatly in their frequency of use, a “do-it-yourself on-demand” method is probably more cost effective as long as the conversion does not impact the interactions that need to be supported.
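The two timing options can be contrasted in a short Python sketch. The class and method names (ConvertingCollection, convert_all, get, pending) are hypothetical, and the conversion itself is passed in as a function, since the actual format change depends on the collection.

```python
from typing import Callable, Dict


class ConvertingCollection:
    """Resources in a legacy format, with two conversion strategies."""

    def __init__(self, legacy: Dict[str, str], convert: Callable[[str], str]):
        self._legacy = dict(legacy)      # resource id -> legacy content
        self._convert = convert          # the format conversion to apply
        self._converted: Dict[str, str] = {}

    def convert_all(self) -> int:
        """Batch strategy: convert everything not yet converted, up front.

        Returns the number of resources converted by this call.
        """
        count = 0
        for rid, content in self._legacy.items():
            if rid not in self._converted:
                self._converted[rid] = self._convert(content)
                count += 1
        return count

    def get(self, rid: str) -> str:
        """On-demand strategy: convert at first access, then reuse the result."""
        if rid not in self._converted:
            self._converted[rid] = self._convert(self._legacy[rid])
        return self._converted[rid]

    def pending(self) -> int:
        """Number of resources still held only in the legacy format."""
        return len(self._legacy) - len(self._converted)
```

With on-demand conversion, rarely used resources may never incur a conversion cost, but the first access to each resource pays it during an interaction. Calling convert_all before the new system goes live moves that cost out of the interaction path.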
Note that this definition of big data does not include any specific size threshold, such as some number of terabytes (thousands of gigabytes). This allows the threshold size that makes a collection a big data one to increase as storage technology advances. It also recognizes that different industries or domains have different thresholds (Manyika et al. 2011).