Grouping variable media inputs to reflect a user session

Embodiments of the invention present a system and method for identifying relationships between different types of data captured by a pen-based computing system, such as a smart pen. The pen-based computing system generates one or more sessions, each including different types of data that are associated with each other. In one embodiment, the pen-based computing system generates an index file including captured audio data and written data, where the written data is associated with the temporal location in the audio data that corresponds to the time the written data was captured. For example, the pen-based computing system applies one or more heuristic processes to the received data to identify relationships between the various types of received data, and uses those relationships to associate the different types of data with each other.
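The association described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `AudioClip`, `Stroke`, `IndexEntry`, `build_index`, and `group_sessions` are hypothetical, as is the assumption that audio and written data share one capture clock measured in seconds. The abstract does not specify the heuristic processes, so the time-gap rule shown here (a pause longer than `max_gap` starts a new session) is only one assumed example of such a heuristic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AudioClip:
    clip_id: str
    start: float  # capture-clock time when recording began (seconds)
    end: float    # capture-clock time when recording ended (seconds)


@dataclass
class Stroke:
    stroke_id: str
    captured_at: float  # capture-clock time when the stroke was written


@dataclass
class IndexEntry:
    stroke_id: str
    clip_id: str
    audio_offset: float  # position within the clip, in seconds


def build_index(strokes: List[Stroke], clips: List[AudioClip]) -> List[IndexEntry]:
    """Associate each stroke with the temporal location in the audio
    that was being recorded when the stroke was captured."""
    entries = []
    for stroke in strokes:
        for clip in clips:
            if clip.start <= stroke.captured_at <= clip.end:
                offset = stroke.captured_at - clip.start
                entries.append(IndexEntry(stroke.stroke_id, clip.clip_id, offset))
                break  # use the first clip that covers this stroke
    return entries


def group_sessions(timestamps: List[float], max_gap: float = 30.0) -> List[List[float]]:
    """Assumed heuristic: capture times separated by more than
    max_gap seconds are placed in different sessions."""
    sessions: List[List[float]] = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= max_gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


clips = [AudioClip("a1", start=10.0, end=60.0)]
strokes = [Stroke("s1", captured_at=12.5), Stroke("s2", captured_at=100.0)]
index = build_index(strokes, clips)
sessions = group_sessions([0.0, 5.0, 100.0, 110.0], max_gap=30.0)
```

In this sketch, stroke `s1` is indexed against clip `a1` at the offset where it was written, while `s2` falls outside any recording and receives no entry. Selecting a stroke could then seek playback to its `audio_offset`, and the grouping step separates capture times into sessions wherever a long pause occurs.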