MusiCog: An Integrated Cognitive Architecture for Music Learning and Generation

Music composition is a complex, multi-modal human activity, engaging faculties of perception, memory, motor control, and cognition, and drawing on skills in abstract reasoning, problem solving, creativity, and aesthetic evaluation. For centuries musicians, theorists, mathematicians, and more recently computer scientists have attempted to systematize composition, proposing various formal methods for combining sounds (or symbols representing sounds) into structures that might be considered musical. Many of these systems are grounded in the statistical modelling of existing music, or in the mathematical formalization of the underlying rules of music theory. This thesis presents a different approach, looking at music as a holistic phenomenon arising from the integration of perceptual and cognitive capacities. The central contribution of this research is an integrated cognitive architecture (ICA) for music learning and generation called MusiCog. Informed by previous ICAs, MusiCog features a modular design, implementing functions for perception, working memory, long-term memory, and production/composition. MusiCog’s perception and memory modules draw on established experimental research in the field of music psychology, integrating both existing and novel approaches to modelling perceptual phenomena like auditory stream segregation (polyphonic voice-separation) and melodic segmentation, as well as higher-level cognitive phenomena like “chunking” and hierarchical sequence learning. Through the integrated approach, MusiCog constructs a representation of music informed specifically by its perceptual and cognitive limitations. Thus, in a manner similar to human listeners, its knowledge of different musical works or styles is not equal or uniform, but is rather informed by the specific musical structure of the works themselves.

MusiCog’s production/composition module does not attempt to model explicit knowledge of music theory or composition. Rather, it proposes a “musically naive” approach to composition, bound by the perceptual phenomena that inform its representation of musical structure, and the cognitive constraints that inform its capacity to articulate its knowledge through novel compositional output.

This dissertation outlines the background research and ideas that inform MusiCog’s design, presents the model in technical detail, and demonstrates through quantitative testing and practical music-theoretical analysis the model’s capacity for melodic style imitation. Strengths and limitations, of both the conceptual approach and the specific implementation, are discussed in the context of autonomous music generation and computer-assisted composition (CAC), and avenues for future research are presented. The integrated approach is shown to offer a viable path forward for the design and implementation of intelligent musical agents and interactive CAC systems.