It is hypothesized that the cognitive process in program comprehension involves “search” and “knowledge building” activities. In the former, gaze is characterized by saccades, in which the programmer scans code for key features, structure and patterns. In the latter, gaze consists primarily of fixations, in which critical areas of code are examined in order to understand what the code does. These cognitive activities cannot be observed directly and are likely iterative in nature. Formulating and analysing gaze events as sequences, for example with a hidden Markov model, would therefore make it possible to test for statistically significant differences between types of programmer and types of program.
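To illustrate the hidden Markov model idea, the following minimal sketch treats “search” and “knowledge building” as two hidden states that emit observable gaze events (saccades or fixations), and uses Viterbi decoding to recover the most likely state sequence. All probabilities here are hypothetical values chosen for illustration, not estimates from any data set; in practice they would be learned from recorded gaze data, for example with the Baum-Welch algorithm.

```python
import math

# Hidden cognitive activities from the hypothesis above.
STATES = ("search", "knowledge_building")

# ASSUMPTION: every probability below is invented for illustration only.
START = {"search": 0.6, "knowledge_building": 0.4}
TRANS = {
    "search": {"search": 0.7, "knowledge_building": 0.3},
    "knowledge_building": {"search": 0.2, "knowledge_building": 0.8},
}
EMIT = {
    "search": {"saccade": 0.8, "fixation": 0.2},
    "knowledge_building": {"saccade": 0.1, "fixation": 0.9},
}


def viterbi(events):
    """Return the most likely hidden-state path for a list of gaze events."""
    if not events:
        return []
    # Log-probability of the best path ending in each state at the first event.
    score = {s: math.log(START[s]) + math.log(EMIT[s][events[0]]) for s in STATES}
    backpointers = []
    for event in events[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][event])
            )
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


gaze = ["saccade", "saccade", "fixation", "fixation", "fixation"]
print(viterbi(gaze))
```

With these illustrative parameters, the decoded path switches from “search” to “knowledge building” at the point where fixations begin to dominate. Comparing fitted parameters or decoded paths across groups of programmers is one possible route to the statistical comparisons described above.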
The PhD builds upon existing areas of work within the School of Computing, including eye movements in programming, sequence mining of IPTV viewing data and the fundamentals of process analytics. The School holds two existing eye-tracking data sets relating to program comprehension by programmers with dyslexia. These will enable initial experiments in which the researcher can explore ways of modelling this type of data using sequence analysis techniques such as Needleman–Wunsch alignment, Markov chain models and process mining algorithms. There is also scope for further data collection using the School’s eye trackers. This could involve augmenting the existing data sets or designing new experiments. Of particular interest would be the collection of gaze data from professional programmers.
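As one example of the techniques named above, the sketch below computes a Needleman–Wunsch global alignment score between two gaze sequences, each encoded as a list of area-of-interest (AOI) labels. The AOI names and the scoring values (match, mismatch, gap) are assumptions made for illustration; the existing data sets may use a different encoding, and suitable scores would need to be chosen for the study.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return the global alignment score between two sequences of AOI labels."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty sequence costs one gap per element.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            score[i][j] = max(
                diagonal,                 # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,    # gap in b
                score[i][j - 1] + gap,    # gap in a
            )
    return score[-1][-1]


# Hypothetical AOI sequences for two participants reading the same program.
participant_1 = ["signature", "loop", "loop", "return"]
participant_2 = ["signature", "loop", "return"]
print(needleman_wunsch(participant_1, participant_2))
```

A higher score indicates that two participants visited code regions in a more similar order. Pairwise scores of this kind could feed into clustering or group comparisons.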
The results of the work are likely to be of benefit in a number of respects. For example, they could lead to recommendations on code layout and spacing that facilitate overall comprehension, with particular reference to programmers with dyslexia or other neurodiverse characteristics. They could also allow for novel visual representations of programmer gaze, building on the existing vocabulary of the Eye Movements in Programming coding scheme.