Mental Workload in Multi-Device Personal Information Management

Manas Tungare
Dept. of Computer Science, Virginia Tech.

Manuel A. Pérez-Quiñones
Dept. of Computer Science, Virginia Tech.

Copyright is held by the author/owner(s).
CHI 2009, April 4 – 9, 2009, Boston, MA, USA

Abstract
Knowledge workers increasingly use multiple devices such as desktop computers, laptops, cell phones, and PDAs for personal information management (PIM) tasks. The use of several of these devices together creates higher task difficulty for users than when used individually (as reported in a recent survey we conducted). Prompted by this, we are conducting an experiment to study mental workload in multi-device scenarios. While mental workload has been shown to decrease at sub-task boundaries, it has not been studied if this still holds for sub-tasks performed on different devices. We hypothesize that the level of support provided by the system for task migration affects mental workload. Mental workload measurements can enable designers to isolate critical sub-tasks and redesign or optimize the user experience selectively. In addition, we believe that mental workload shows promise as a cross-tool, cross-task method of evaluating PIM tools, services and strategies, thus fulfilling a need expressed by several researchers in the area of personal information management. In this paper, we describe our ongoing experiment of measuring mental workload (via physiological as well as subjective measures) and its implications for users, designers and researchers in PIM.

Keywords
Personal Information Management, Mental Workload,

ACM Classification Keywords
H.5.2 Information Interfaces and Presentation: User

Introduction & Motivation
As we amass vast quantities of personal information, managing it has become an increasingly complex endeavor. The emergence of multiple information devices and services such as desktops, laptops, cell phones, PDAs and cloud computing adds a level of complexity beyond simply the use of a single computer. In traditional single terminal computer systems, the majority of a user’s attentional and cognitive resources are focused on the terminal while performing a specific task. However, in an environment where multiple devices require intermittent attention and present useful information at unexpected times, the user is subjected to different mental workload.
In an earlier study we conducted [15], users consistently reported difficulties in performing information tasks with multiple devices, especially when transitioning between/among devices. From the responses we received, we observed (from a content analysis of free-form responses) that users’ adoption of various technological alternatives is guided by an innate sense of certain specific factors. We noted that several of these factors constitute mental workload, e.g. frustration level, temporal demand, and mental effort. In systems where users lacked the freedom of choice, they turned to solving problems by adopting workarounds motivated by one or more of these factors.

It has been shown that an operator’s task performance is inversely correlated with high levels of mental workload [12]. Thus, we set out to explore if mental workload estimates could be used to compare task difficulty in PIM tasks. Prior work in mental workload measurement has established that physiological measures such as changes in pupillary diameter (known as Task-Evoked Pupillary Response [3]) can be used to estimate mental workload. Such continuous measures of mental workload can help locate sub-tasks of high task difficulty. Iqbal et al. [8] demonstrated that within a single task, mental workload decreases at sub-task boundaries. A fundamental goal of our research is to examine if their finding still applies when the latter sub-task is performed on a different device than the former. Our contrary hypothesis is that mental workload rises just before the moment of transition, and returns to its normal level a short duration after the transition is complete.

Systems differ in the level of support they provide for pausing a task on one device, and resuming it on another [13]. A related goal of our research is to examine if the increase in mental workload at the point of transition is correlated with the level of system support available for the sub-task of transitioning. I.e., if the system incorporates full support for task migration, we hypothesize that mental workload will be less than in case of another system where such support

In addition, there has been no standard way to compare the effectiveness of tools, services, and techniques developed independently at different research labs. Kelly [9] notes the methodological difficulties in studying PIM because of its highly personal nature, leading to challenges in developing a set of reference tasks or cross-tool cross-task metrics. In several other task domains, workload assessments such as NASA TLX [6] have been administered instead of direct measurement of task performance metrics for several reasons: chief among them is that subjective workload assessments require less effort and instrumentation of the task, and are easier to administer. If mental workload in PIM tasks can be shown to be inversely correlated with task performance (as has already been shown in several other domains [12, 2, 5]), such a measure can be used to compare the effectiveness of these tools across varying tasks. Thus, a tertiary goal of our research is to examine whether mental workload estimates captured using the NASA TLX scale can serve as a predictor of task performance for personal information management

Related Prior Work
Mental workload is an important, practically relevant, and measurable entity [6]. The NASA Task Load Index (NASA TLX) [6] is a multi-dimensional subjective workload assessment technique that has been applied
in studies of airline cockpits [2], navigation [14], and in the medical field [5]. It combines information about specific sources of workload weighted by their relevance, thus reducing the influence of those that are experimentally irrelevant, and emphasizing the contributions of others that are experimentally relevant. This reduces between-subject variability for the measure as compared to other subjective scales.

Physiological measures such as changes in pupillary diameter (known as Task-Evoked Pupillary Response) have been shown to be responsive to changes in mental workload [3] and used as a physiological measure of mental workload in several studies [7, 1]. Within a single task, mental workload decreases at sub-task boundaries [8]. Such continuous measures of mental workload can help locate sub-tasks of high task difficulty.

As the problem of information overload has worsened over the years, human attentional resources have stayed constant [11]. The issue of information fragmentation across multiple devices (the condition of having a user’s data in different formats, distributed across multiple locations, manipulated by different applications, and residing in a generally disconnected manner [4]) threatens the effectiveness of users as

An understanding of mental workload in PIM tasks is not only expected to lead to a better understanding of why a particular tool causes high frustration or mental demand in users, but also can be used to isolate critical sub-tasks and for comparing different tools against one another.

Results from Preliminary Studies
Experimental tasks for the current study were chosen from among the most common representative tasks identified in an exploratory survey study [15] and another ethnographic investigation [16] (reported

File management across multiple machines stood out as the most reported problematic task. 12 out of 79 survey users said that they encountered difficulties while syncing data between multiple machines, 11 reported unexpected deletion of their data while copying across machines, and 6 reported having trouble with managing conflicting versions of files that were copied manually. Based on these findings, our first experimental task involves managing files across a desktop and a laptop, with and without support for

From the ethnographic investigation of calendar use [16], we found that paper calendars were actively used by a majority of interviewees despite the widespread prevalence of electronic calendars (corroborating the findings reported in previous studies). 35% of participants reported printing their electronic calendar for offline use. Based on this, our second experimental task is calendar management, and involves managing schedules using an online calendar and paper

From the survey, we also found that several devices are often used in groups, e.g. laptops and cell phones (reported by 52 participants), and integrated multi-function portable devices such as Palm Treos, Blackberries and Apple iPhones have begun to replace single-function devices for communication (e.g. email and IM). Given this, we picked contact management as
Methodology and Experimental Setup
This mixed-method study consists of an experiment, preceded by a questionnaire, and followed by an interview. Participants are invited to perform three tasks in two sessions each to cover three different information collections: (1) files, (2) calendars and (3) contacts. Each task is performed in two different ways in the two sessions; the difference in treatments is the level of system support for task migration. E.g. for the files task, users perform the task using either USB drives (low level of task migration support) or network

Each task consists of a set of instructions (between 15 and 20 each) to locate, read, modify, and save information. In each task, a few instructions include questions directly related to the information at hand. The experimenter collects the answers and uses them as a metric of task performance (details later).

Interspersed within these are instructions to switch devices, e.g. one of the instructions for the file management task reads: “Complete all your work on the desktop, and prepare to travel to a different office

The second session is conducted (at least) two weeks after the first session, in order to minimize the learning effects caused by the first session. In this within-subjects design, ordering effects are minimized by randomizing the order of treatments between sessions, as well as the ordering of tasks within each session.

Mental workload is measured in two different ways:

Physiological Measure: Task-Evoked Pupillary Response
Subtle yet measurable changes in pupil diameter have been associated with cognitive workload and referred to as the Task-Evoked Pupillary Response (TEPR) [3]. Participants wear a head-mounted eye-tracker throughout the duration of the experiment that permits free head movement while still tracking eye gaze and pupil diameter with reasonable accuracy. Pupil diameter (adjusted and normalized for other factors) has been shown to be a good predictor of cognitive workload [7, 10]. This technique provides a continuous measure of

Subjective Measure: NASA Task Load Index (TLX)
After every task, participants are requested to record their subjective assessment of mental workload via the NASA TLX questionnaire. This offers a task-level estimate of mental workload that is useful as a cross-

Direct task-related metrics such as time taken, errors encountered, information overwritten or not correctly propagated across devices, and incorrect information used are being measured and used to determine if high mental workload correlates negatively with task performance. These are measured after the participant session has concluded, by (1) analyzing eye-gaze video, (2) automatic instruction-level time-tracking in the system that displays task instructions, (3) analyzing the end products of interaction, e.g. saved files, modified calendars and (4) answers to questions posed at the end of individual instructions.

As of January 2009, pilot studies have been conducted
with 8 participants, and a few initial participants have been recruited and scheduled for the first session.

Expected Results & Design Implications
Designers of PIM products and services strive to create solutions that make it easier for users to get their tasks done. However, an evaluation of the effectiveness of these tools poses tricky challenges. Kelly [9] notes that “research and theory concerning PIM behavior and tools have been stymied, since it is difficult to accumulate, compare, and integrate results across studies” and expresses an urgent need for “developing evaluation methods and metrics that produce valid, generalizable, sharable knowledge about how users go about the PIM activities and interactions in their daily lives.”

We believe that the results of our experiment will contribute to exactly such an endeavor. Mental workload already accounts for subjective factors such as frustration and mental demand, factors that users have reported as important in influencing their choice of device/tool/strategy. If, further, mental workload can be shown to be correlated with task performance, then it has tremendous potential in being used for cross-tool evaluations and for comparing vastly different PIM methodologies with one another. If, as we expect, we are able to find significant correlation among physiological and subjective measures of mental workload and task performance, designers will be able to evaluate their tools using non-intrusive low-overhead subjective workload assessment tests such as NASA TLX.

Not only will we be able to determine if a particular system causes higher or lower mental workload in a user, we will also be able to understand where within a task users face problems. Measures of mental workload can be used in both formative and summative evaluations of PIM products in the testing phase, and changes and/or optimizations can be introduced in case mental workload is found to be unexpectedly high during certain task sequences in a higher-level task.

In this paper, we describe a study in progress that seeks to understand the changes in mental workload during personal information management tasks performed using multiple information devices. We extend prior work in mental workload measurement to the domain of PIM, and seek to examine its correlation with task performance. Mental workload is measured via physiological as well as subjective measures, while task performance is measured using several task-specific metrics for three independent tasks (each of which was selected based on the results of two prior studies). This study has important implications for PIM system designers who can then use mental workload measures as a cross-task, cross-tool method for comparing the effectiveness of PIM tools and services developed independently of one another.

Acknowledgments
We would like to thank Tonya L. Smith-Jackson for instigating some of the ideas behind this project. Steve Harrison, Edward A. Fox, Stephen Edwards and Pardha S. Pyla also provided important insights that led to the design of this study in its current form. We wish to thank our pilot participants as well as future participants for

References
[1] B. P. Bailey and S. T. Iqbal. Understanding changes in mental workload during execution of goal-directed tasks and its application for interruption management. ACM Trans. Comput.-Hum. Interact., 14(4):1–28, 2008.
[2] J. Ballas, C. Heitmeyer, and M. Pérez-Quiñones.
Evaluating two aspects of direct manipulation in advanced cockpits. In CHI ’92: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 127–134, New York, NY, USA, 1992.
[3] J. Beatty. Task-evoked pupillary responses, processing load, and the structure of processing resources. Psychological Bulletin, 91(2):276–92, 1982.
[4] O. Bergman, R. Beyth-Marom, and R. Nachmias. The project fragmentation problem in personal information management. In CHI ’06: Proceedings of the SIGCHI conference on Human Factors in computing systems, pages 271–274, New York, NY, USA, 2006.
[5] D. A. Bertram, D. A. Opila, J. L. Brown, S. J. Gallagher, R. W. Schifeling, I. S. Snow, and C. O. Hershey. Measuring physician mental workload: Reliability and validity assessment of a brief instrument.
[6] S. G. Hart and L. E. Staveland. Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research. Human Mental Workload, 1:139–
[7] S. T. Iqbal, P. D. Adamczyk, X. S. Zheng, and B. P. Bailey. Towards an index of opportunity: understanding changes in mental workload during task execution. In CHI ’05: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 311–320,
[8] S. T. Iqbal and B. P. Bailey. Investigating the effectiveness of mental workload as a predictor of opportune moments for interruption. In CHI ’05: CHI ’05 extended abstracts on Human factors in computing systems, pages 1489–1492, New York, NY, USA, 2005.
[9] D. Kelly. Evaluating personal information management behaviors and tools. Commun. ACM,
[10] J. Klingner, R. Kumar, and P. Hanrahan. Measuring the task-evoked pupillary response with a remote eye tracker. In Eye Tracking Research and Applications
[11] D. M. Levy. To grow in wisdom: Vannevar Bush, Information Overload, and the Life of Leisure. In JCDL ’05: Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries, pages 281–286, New
[12] R. D. O’Donnell and F. T. Eggemeier. Workload assessment methodology, volume 2 of Handbook of perception and human performance: Vol. 2. Cognitive processes and performance, chapter Workload assessment methodology, pages 42/1–42/49. Wiley,
[13] P. S. Pyla, M. Tungare, and M. Pérez-Quiñones. Multiple user interfaces: Why consistency is not everything, and seamless task migration is key. In Proceedings of the CHI 2006 Workshop on The Many Faces of Consistency in Cross-Platform Design, 2006.
[14] J. C. Schryver. Experimental validation of navigation workload metrics. Human Factors and Ergonomics Society Annual Meeting Proceedings,
[15] M. Tungare and M. Pérez-Quiñones. It’s not what you have, but how you use it: Compromises in mobile device use. Technical report, Computing Research
[16] M. Tungare and M. Pérez-Quiñones. An exploratory study of personal calendar use. Technical report, Computing Research Repository (CoRR), 2008.