50 Key Points in Chapter Seven
-
What are categories?
Categories are equivalence classes: sets or groups of things or abstract entities that we treat the same.
-
What determines the size of the equivalence class?
The size of the equivalence class is determined by the properties or characteristics we consider.
-
Why do we contrast cultural, individual, and institutional categorization?
Cultural, individual, and institutional categorization share some core ideas but they emphasize different processes and purposes for creating categories.
-
What distinguishes individual categories?
Individual categories are created by intentional activity that usually takes place in response to a specific situation.
(See “Individual Categories”)
-
What distinguishes institutional categories?
Institutional categories are most often created in abstract and information-intensive domains where unambiguous and precise categories are needed.
-
What is the relation between categories and classification?
The rigorous definition of institutional categories enables classification, the systematic assignment of resources to categories in an organizing system.
-
When is it necessary to create categories by computational methods rather than by people?
Computational categories are created by computer programs when the number of resources, or the number of descriptions or observations associated with each resource, is so large that people cannot think about them effectively.
-
What is the difference between supervised and unsupervised learning?
In supervised learning, a machine learning program is trained by giving it sample items or documents that are labeled by category. In unsupervised learning, the program gets the samples but has to come up with the categories on its own.
-
Why does it matter if every resource in a collection has a sortable identifier?
Any collection of resources with sortable identifiers (alphabetic or numeric) as an associated property can benefit from using sorting order as an organizing principle.
(See “Single Properties”)
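As a minimal sketch of sorting order as an organizing principle, the following Python snippet orders a small collection by identifier and then locates an item with binary search. The collection, the `id` and `title` field names, and the helper functions are hypothetical, chosen only for illustration.

```python
from bisect import bisect_left

# Hypothetical resources; "id" is the sortable identifier.
resources = [
    {"id": "QA76", "title": "Computing"},
    {"id": "B105", "title": "Philosophy"},
    {"id": "Z666", "title": "Library Science"},
]

def sort_by_identifier(items):
    """Return the items ordered by their identifier."""
    return sorted(items, key=lambda r: r["id"])

def find(sorted_items, identifier):
    """Binary search; this only works because the items are sorted."""
    ids = [r["id"] for r in sorted_items]
    i = bisect_left(ids, identifier)
    if i < len(ids) and ids[i] == identifier:
        return sorted_items[i]
    return None

shelf = sort_by_identifier(resources)
```

Once the collection is sorted, lookup needs far fewer comparisons than scanning every item, which is one practical benefit of a sortable identifier.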
-
What is the concern when only a single property is used to assign category membership?
If only a single property is used to distinguish among some set of resources and to create the categories in an organizing system, the choice of property is critical because different properties often lead to different categories.
(See “Single Properties”)
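The effect of the property choice can be sketched in a few lines of Python. The shirt data and the `categorize` helper below are hypothetical; the point is only that the same resources fall into different categories depending on which single property is used.

```python
from collections import defaultdict

def categorize(items, prop):
    """Group item names by the value of one property."""
    groups = defaultdict(list)
    for item in items:
        groups[item[prop]].append(item["name"])
    return dict(groups)

# Hypothetical resources with two candidate organizing properties.
shirts = [
    {"name": "A", "color": "blue", "sleeve": "long"},
    {"name": "B", "color": "blue", "sleeve": "short"},
    {"name": "C", "color": "white", "sleeve": "long"},
]

by_color = categorize(shirts, "color")    # A and B end up together
by_sleeve = categorize(shirts, "sleeve")  # A and C end up together
```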
-
What is a hierarchical category system?
A sequence of organizing decisions based on a fixed ordering of resource properties creates a hierarchy, a multi-level category system.
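A hierarchy of this kind can be sketched as repeated grouping, one property per level. The `build_hierarchy` function and the sample data are assumptions made for illustration, not a prescribed implementation.

```python
def build_hierarchy(items, properties):
    """Nest items by each property in turn; the order of `properties` fixes the levels."""
    if not properties:
        return [item["name"] for item in items]
    first, rest = properties[0], properties[1:]
    groups = {}
    for item in items:
        groups.setdefault(item[first], []).append(item)
    return {value: build_hierarchy(members, rest) for value, members in groups.items()}

# Hypothetical resources.
shirts = [
    {"name": "A", "color": "blue", "sleeve": "long"},
    {"name": "B", "color": "blue", "sleeve": "short"},
    {"name": "C", "color": "white", "sleeve": "long"},
]

tree = build_hierarchy(shirts, ["color", "sleeve"])
```

Reversing the property order to `["sleeve", "color"]` would produce a different hierarchy over the same resources.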
-
What can one say about any member of a classical category in terms of how it represents the category?
An important implication of necessary and sufficient category definition is that every member of the category is an equally good member or example of the category.
-
What is aboutness?
For most purposes, the most useful property of information resources for categorizing them is their aboutness, which is not directly perceivable and which is hard to characterize.
-
When is it necessary to adopt a probabilistic or statistical view of properties in defining categories?
In domains where properties lack one or more of the characteristics of separability, perceptibility, and necessity, a probabilistic or statistical view of properties is needed to define categories.
-
What is family resemblance?
Family resemblance describes categories whose members share some but not all properties, much as members of a family resemble one another without any single trait being common to all.
-
What is similarity?
Similarity is a measure of the resemblance between two things that share some characteristics but are not identical.
(See “Similarity”)
-
What are the four psychologically-motivated approaches that propose different functions for computing similarity?
Feature- or property-based, geometry-based, transformational, and alignment- or analogy-based approaches are psychologically-motivated approaches that propose different functions for computing similarity.
(See “Similarity”)
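Two of these approaches are easy to sketch in code. The example below uses the Jaccard coefficient as one simple feature-based measure and Euclidean distance as one simple geometric measure. Both are stand-ins chosen for illustration; the psychological models are more elaborate, and the transformational and alignment-based approaches are not shown.

```python
import math

def jaccard_similarity(features_a, features_b):
    """Feature-based: shared features divided by all features of either item."""
    union = features_a | features_b
    if not union:
        return 1.0  # assumption: two items with no features are treated as identical
    return len(features_a & features_b) / len(union)

def euclidean_distance(point_a, point_b):
    """Geometric: items are points in a space; a smaller distance means more similar."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(point_a, point_b)))

# Hypothetical feature sets and coordinates.
robin = {"flies", "sings", "has_feathers"}
penguin = {"swims", "has_feathers"}

shared = jaccard_similarity(robin, penguin)     # 1 shared feature out of 4
distance = euclidean_distance((0, 0), (3, 4))
```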
-
What are so-called “classical categories”?
Classical categories can be defined precisely with just a few necessary and sufficient properties.
-
How does the breadth of a category affect the recall/precision tradeoff?
Broader or coarse-grained categories increase recall, but lower precision.
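The tradeoff can be made concrete with a small calculation. The sets below are hypothetical: `relevant` holds the resources a searcher actually wants, while `narrow` and `broad` are the contents of a fine-grained and a coarse-grained category.

```python
def precision_recall(retrieved, relevant):
    """Precision: share of retrieved items that are relevant.
    Recall: share of relevant items that were retrieved."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

relevant = {1, 2, 3, 4}
narrow = {1, 2}                   # small category, all of it relevant
broad = {1, 2, 3, 4, 5, 6, 7, 8}  # large category, half of it relevant

narrow_scores = precision_recall(narrow, relevant)
broad_scores = precision_recall(broad, relevant)
```

In this example the narrow category gives perfect precision but misses half of the relevant items, and the broad category finds all of them at the cost of precision.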
-
What is a decision tree?
A simple decision tree is an algorithm for determining a decision by making a sequence of logical or property tests.
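Written as code, a simple decision tree is a nest of property tests in which each answer leads to the next test or to a decision. The properties and category names below are hypothetical, chosen only to show the shape of the algorithm.

```python
def classify_item(item):
    """Walk a fixed sequence of property tests until a category is reached."""
    if item["is_digital"]:
        # Second-level test, applied only to digital items.
        if item["pages"] > 50:
            return "ebook"
        return "article"
    if item["is_serial"]:
        return "periodical"
    return "book"

# Hypothetical resource descriptions.
report = {"is_digital": True, "pages": 12, "is_serial": False}
journal = {"is_digital": False, "pages": 90, "is_serial": True}
```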
-
What is the practical benefit of defining categories according to necessary and sufficient features?
The most conceptually simple and straightforward implementation of categories in technologies for organizing systems adopts the classical view of categories based on necessary and sufficient features.
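A minimal sketch of this classical view: a category is a set of required property values, and membership is a yes-or-no test. The `is_member` function and the triangle definition are illustrative assumptions.

```python
def is_member(resource, required):
    """True only if the resource has every required property value.

    Each required property is necessary; together they are sufficient,
    so no other property of the resource is consulted.
    """
    return all(resource.get(name) == value for name, value in required.items())

# Hypothetical classical definition: a closed figure with exactly three sides.
TRIANGLE = {"closed": True, "sides": 3}

red_triangle = {"closed": True, "sides": 3, "color": "red"}
square = {"closed": True, "sides": 4}
```

Because the test returns only true or false, it has no way to rank one member as a better example than another, which matches the classical view that all members are equally good.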
-
How do artificial languages like mathematical notation and programming languages enable precise specification of categories?
An artificial language expresses ideas concisely by introducing new terms or symbols that represent complex ideas along with syntactic mechanisms for combining and operating on them.
-
How do Naïve Bayes classifiers learn?
Naïve Bayes classifiers learn by revising, for each property, the conditional probability that a resource with that property belongs to a class. The revision combines the base rates of the class and of the property in the training data with how likely it is that a member of the class has the property.
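The calculation for a single property can be sketched with Bayes' rule. The training data, labels, and `posterior` function below are hypothetical. A full Naïve Bayes classifier would multiply the likelihoods of many properties, assuming they are independent, and would usually add smoothing; neither step is shown here.

```python
from fractions import Fraction

def posterior(examples, prop, label):
    """P(label | prop) = P(prop | label) * P(label) / P(prop), estimated from counts.

    Assumes the label and the property each occur at least once in the examples.
    """
    total = len(examples)
    in_class = [props for props, lab in examples if lab == label]
    with_prop = [props for props, _ in examples if prop in props]
    prior = Fraction(len(in_class), total)          # base rate of the class
    evidence = Fraction(len(with_prop), total)      # base rate of the property
    likelihood = Fraction(sum(prop in p for p in in_class), len(in_class))
    return likelihood * prior / evidence

# Hypothetical training data: (set of properties, category label).
examples = (
    [({"free"}, "spam")] * 3 + [(set(), "spam")] * 1
    + [({"free"}, "ham")] * 1 + [(set(), "ham")] * 5
)

p_spam_given_free = posterior(examples, "free", "spam")
```

Here 4 of 10 messages are spam, 4 of 10 contain "free", and 3 of the 4 spam messages contain "free", so the estimate is (3/4 × 4/10) ÷ (4/10) = 3/4.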
-
How do clustering techniques create categories?
Because clustering techniques are unsupervised, they create categories based on calculations of similarity between resources, maximizing the similarity of resources within a category and maximizing the differences between categories.
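One common clustering technique, k-means, is sketched here for one-dimensional data. It is used only as an illustration; the text does not prescribe a particular algorithm, and the data points and starting centroids are hypothetical.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then move each centroid
    to the mean of its cluster; repeat a fixed number of times."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its old centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return clusters

# Unlabeled observations; no categories are supplied in advance.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
clusters = kmeans_1d(points, centroids=[0.0, 5.0])
```

The program is never told that there is a "low" group and a "high" group; the two categories emerge from distances between the points alone.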