“What makes a good sleep tracker?”
“I didn’t sleep too well last night, and I felt so groggy this morning”
“Stressful day at work?”
“No, my stupid SleepApp didn’t work properly and I spent the night trying to get it to play relaxing whale noises. When I did drift off by myself, it didn’t wake me up in between sleep cycles. It better not fail tonight, I’ve got an important meeting tomorrow and I can’t sleep properly without it…”
Stilted conversation aside, we spend a large amount of our daily lives glued to screens of some description. These keep us active and engaged well beyond the time when we should be switching off for the night. For example, light from phones and laptops can alter our normal patterns of sleep and make it harder for us to drift off in the evening. This plays havoc with the body’s multitudinous clocks which naturally set in motion a number of chemical changes that allow us to unwind steadily before we drift off to sleep. However, that’s a point for another day.
For now, I want to talk about how we use technology to seemingly help us get to sleep and to, for want of a better way of putting it, tell us that we were in fact asleep the previous night (with or without a pretty graph to go with it).
Commercial ways to assess sleep:
There are two main ways of objectively measuring sleep in the research community: actigraphy and polysomnography (see What makes a good sleep tracker?). However, the business world has managed to split this up into a large selection of methods to tell us whether we’re sleeping enough. It may be that these different commercially available devices and apps are actually pretty good, but the available evidence tells a different story.
Types of Sleep Tracker
I won’t go into too much detail about all of the different types of sleep tracking devices, of which there are many, but the purpose of the below list is to show the range of ways you can measure your sleep outside of the lab.
Sleep apps developed for use on phones (e.g. from the Android or Apple stores) vary in cost and in the features they promise. A popular sleep app for Android users, ‘Sleep as Android’, has over 10 million downloads and boasts the ability to differentiate between wakefulness, light sleep and deep sleep. It also claims that it can wake you up during your lightest sleep stages in the morning to promote a “natural” awakening which supposedly avoids the grogginess and tiredness of waking up in the deeper stages of sleep. This app, amongst others, relies on movement-based information.
Movement-based trackers can also take the form of watches which track our sleep (alongside activity, exercise, and heart rate). Both apps and watches aim to give us lots of detail about our sleep and provide us with statistics which try to educate us about our own sleep, mainly with the intention of helping us sleep better. These work by applying an algorithm to movement data which is logged by something called an accelerometer. This is already present in your mobile phone and the app simply makes use of this data to predict your sleep.
There are a number of devices which you can buy which will offer a rudimentary attempt to track brain activity. Typically, if you want a good picture of your sleep you will track a wide range of different bodily functions in something known as polysomnography (see What makes a good sleep tracker?). These trackers, however, try to track sleep through the use of a couple of electrodes placed on the scalp during sleep.
Temperature / Heart Rate / Muscle Response Based
Typically, these will be used alongside movement data to attempt to give a more accurate estimate of sleep.
Again, these are technically based on movement and temperature data but are placed on the bed in some respect and not worn. These can take the form of a bed covering, or even a trendy-looking ball such as Sense.
What makes a good sleep tracker?
So what does an ideal measurement of sleep look like?
The gold-standard measure of sleep is known as polysomnography. This measures a number of different things including:
- Electroencephalogram (EEG; records electrical activity produced by cells in the brain)
- Electrooculography (EOG; records eye movements)
- Electromyography (EMG; records electrical activity generated by muscle activity)
- Cardiac rhythms (ECG)
- Respiratory activity
This setup, understandably, requires you to come into a sleep laboratory and sleep in a rather unfamiliar bed with electrodes planted on your head and body. This enables sleep researchers to record brain activity, eye movements, muscle activity and heart rhythms. Together, these readings allow us to identify how long you sleep, how often you wake up, how long it takes for you to get to sleep, when you wake up, and to identify the individual sleep stages and cycles which make up a normal night’s sleep. There are problems with this approach but it is the best option we have for understanding more about an individual’s sleep. As you’ll likely agree, this is a lot of information to obtain and wade through. Sleep apps and watches cannot possibly recreate this and so are limited, but this is not necessarily a problem if they correlate well with other less rigorous but well-accepted measures used in sleep research.
For example, most sleep apps and watches work by determining how often you move on a given night, treating the times you are moving less as sleep and the times you are moving more as wakefulness. These devices measure movement by using a little sensor called an accelerometer, which is found within your smartphone and in watches such as the FitBit.
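To make the idea concrete, here is a minimal sketch of a movement-threshold classifier. It is not the algorithm used by any particular app or watch: the function name, the per-epoch movement counts and the threshold of 10 are all assumptions made for illustration, and real scoring algorithms are considerably more involved.

```python
from typing import List


def classify_epochs(activity_counts: List[float], threshold: float = 10.0) -> List[str]:
    """Label each epoch 'sleep' if its movement count is below the
    threshold, otherwise 'wake'.

    The threshold value is hypothetical, chosen only for this example.
    """
    return ["sleep" if count < threshold else "wake" for count in activity_counts]


# Made-up movement counts, one per epoch of the night.
counts = [42.0, 15.0, 3.0, 0.0, 1.0, 25.0, 2.0, 0.0]
labels = classify_epochs(counts)
print(labels)
```

Note the weakness built into this approach: anyone lying awake but still would be scored as asleep.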
Within sleep research, we also use a device which is based on this same technology. The device is known as an actigraphy watch and it is generally seen as an acceptable alternative to sleep lab measurements, which are often costly, time-consuming and cumbersome. An actigraphy watch is usually worn on the non-dominant wrist, can be used outside a sleep lab, and thus allows researchers to assess sleep objectively as people go about their normal day-to-day lives. Actigraphy also works by detecting movement through accelerometers, and algorithms are applied to this data to explore certain facets of sleep (there are limits to this) in a more portable format.
Do these sleep apps actually work?
The main question you’ve probably come here to hear answered. The short answer: yes, but there are limitations to them all.
Quite simply put, there is a lack of research conducted here, and the sample sizes are minuscule at best in the studies that do currently exist. Yet, these research studies do hint that perhaps sleep apps and wearable devices are actually assessing your levels of sleep in terms of duration, sleep efficiency and how often and for how long you are awake during a night of recording.
When sleep researchers try to determine whether wearables are actually tracking sleep they look for the following things: sensitivity (i.e. the ability of the app / device to detect sleep when you’re actually asleep), specificity (i.e. the ability of the app / device to detect wakefulness when you’re actually awake) and accuracy (i.e. how often it is right overall, across both sleep and wake).
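These three measures can be sketched in a few lines of code. The example below assumes the night has been split into epochs and that both the device and a reference method (such as polysomnography) have labelled each epoch as 'sleep' or 'wake'. The function name and the data are invented for illustration and do not come from any of the studies discussed here.

```python
from typing import Dict, List


def agreement_metrics(device: List[str], reference: List[str]) -> Dict[str, float]:
    """Compare device labels with reference labels, epoch by epoch."""
    if len(device) != len(reference) or not reference:
        raise ValueError("Both recordings need the same, non-zero number of epochs")

    pairs = list(zip(device, reference))
    true_sleep = sum(1 for d, r in pairs if d == "sleep" and r == "sleep")
    true_wake = sum(1 for d, r in pairs if d == "wake" and r == "wake")
    ref_sleep = sum(1 for r in reference if r == "sleep")
    ref_wake = sum(1 for r in reference if r == "wake")

    def ratio(numerator: int, denominator: int) -> float:
        # Undefined (NaN) if the reference contains no epochs of that kind.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(true_sleep, ref_sleep),  # sleep correctly detected
        "specificity": ratio(true_wake, ref_wake),    # wake correctly detected
        "accuracy": (true_sleep + true_wake) / len(pairs),
    }


# Made-up night: the device misses two of the three wake epochs.
reference = ["wake", "sleep", "sleep", "sleep", "wake",
             "wake", "sleep", "sleep", "sleep", "sleep"]
device = ["wake", "sleep", "sleep", "sleep", "sleep",
          "sleep", "sleep", "sleep", "sleep", "sleep"]
metrics = agreement_metrics(device, reference)
print(metrics)
```

In this invented night the device catches every sleep epoch (sensitivity of 1.0) but only one of the three wake epochs (specificity of about 0.33), yet overall accuracy still comes out at 0.8. Because most of a night is spent asleep, a respectable accuracy figure can hide poor detection of wakefulness.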
What does the current science say about commercial sleep trackers (e.g. FitBit)?
Wearables, or commercial sleep trackers, tend to show the same pattern when it comes to estimating our sleep patterns as that which is seen in actigraphy. Wearables using movement data typically overestimate total sleep time and sleep latency (how long it takes you to fall asleep) but are generally pretty accurate at telling when you’re asleep and for how long. By contrast, wearables such as the FitBit are poor at identifying periods of wakefulness during the night and will significantly underestimate disruptions in the night – known as wake after sleep onset (WASO). So, for the average person it is fair to say that wearables such as the numerous incarnations of the FitBit may give you some insight into what your sleep looks like. However, if you are prone to fitful sleep or suffering from insomnia then these devices will be less accurate (despite claims on FitBit’s website that they may have a fix for people with disrupted sleep) and should not be depended upon. This is doubly the case if you consider using these in lieu of going to a doctor about your sleep issues.
How about the claims made by, worryingly, many sleep apps and wearables that they can track different stages of sleep? The bold claim that a sleep app can measure REM sleep, for a start, is simply unverified and extremely doubtful, even if some apps do add in more than just movement measures (e.g. heart rate). Other wearables such as the Jawbone UP make more reasonable claims that they can detect “light” versus “sound” sleep over a given night. Although it’s not entirely clear what “sound” sleep actually means, one research team took this to mean deep sleep and examined whether there was any truth behind this claim. In a large sample of adolescents, they found that the light and sound sleep measures were rather poor at measuring what they claimed to. Rather worryingly, the ‘light’ sleep measure was found to be associated with time spent in the deepest stage of sleep, which highlights that healthy scepticism should be applied to apps claiming they can pick out individual, or broad, sleep stages (see further reading).
A study by de Zambotti and colleagues examined the sensitivity and specificity of a different wearable, the Jawbone UP, compared to polysomnography and actigraphy in a group of middle-aged women (average age 50 years old). They found, like others, that the Jawbone UP was generally accurate at determining when and for how long participants slept (but overestimated this value), but was pretty poor at determining when participants woke up during the night (it underestimated this value, so had good sensitivity but poor specificity). The ability to detect periods of waking and total sleep during the night was notably bad during particularly disrupted sleep as detected by polysomnography (PSG). This suggests that the Jawbone UP and FitBit Ultra are not particularly accurate at detecting the amount of sleep and waking in individuals with disrupted / fragmented sleep.
The same research group (de Zambotti et al., 2015b) also examined the Jawbone UP in a sample of adolescents and young adults (age range 12-22 years old) and found similar results. That is, the Jawbone UP overestimated total sleep time, sleep efficiency and sleep onset latency but underestimated total wake time during the night (it was much worse at detecting time spent awake than any of the sleep measures). Overall, it was found that the Jawbone UP was good at detecting when participants were asleep, but rather poor at identifying when they woke up during the night. This is important as fragmentation of your sleep will impact on sleep quality.
The same study also tried to examine whether the Jawbone UP’s dichotomisation of ‘sound sleep’ versus ‘light sleep’ was appropriately linked to deep versus light stages of sleep as assessed in the sleep laboratory. The ‘light sleep’ count was linked to movement and awakenings, and also to stage 3 of sleep (known as the deepest stage of sleep). In fact, none of the lightest stages of sleep were shown to be associated with the ‘light sleep’ count produced by Jawbone UP. The opposite was found for the ‘sound sleep’ count, whereby typically deep stages of sleep were not found to be associated with this measure, but rather overall measurement of movement (specifically reduced movements) was. This suggests that wearables such as Jawbone and FitBit may be reasonable at detecting sleep during the night, but not night-time awakenings or the different stages of sleep. Furthermore, in populations where sleep is fitful or fragmented, the accuracy of these devices drops further still.
A previous study by Hawley Montgomery-Downs and colleagues in 2012 found that a wrist-worn FitBit was comparable to actigraphy, but poorer than polysomnography in detecting when people were asleep. However, the FitBit was particularly poor in identifying when people woke up compared to actigraphy. This suggests that healthy adult populations could still gain useful information from a FitBit but, as the authors highlight, it is a long way off being appropriate for measuring sleep in people with diagnosed or suspected sleep disorders.
As you can hopefully see, there are only a handful of studies which have attempted to understand how good market-leading wearables are at detecting sleep and wake. The emphasis on sleep and wake is intentional. These devices have not been evaluated on anything beyond total sleep time, wake after sleep onset, sleep onset latency and sleep efficiency.
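For readers curious how those four measures relate to one another, here is a rough sketch that derives them from a night of epoch labels. It assumes 30-second epochs, and that the recording starts when you try to fall asleep and stops when you get up. The function name and the example night are made up for illustration.

```python
from typing import Dict, List


def sleep_summary(labels: List[str], epoch_minutes: float = 0.5) -> Dict[str, float]:
    """Summarise a night of 'sleep' / 'wake' epoch labels (times in minutes)."""
    if "sleep" not in labels:
        raise ValueError("No sleep epochs found in this recording")

    onset = labels.index("sleep")              # first epoch scored as sleep
    time_in_bed = len(labels) * epoch_minutes
    total_sleep_time = labels.count("sleep") * epoch_minutes
    sleep_onset_latency = onset * epoch_minutes
    # Wake after sleep onset: every wake epoch from sleep onset to the end.
    waso = labels[onset:].count("wake") * epoch_minutes

    return {
        "total_sleep_time": total_sleep_time,
        "sleep_onset_latency": sleep_onset_latency,
        "wake_after_sleep_onset": waso,
        "sleep_efficiency": 100.0 * total_sleep_time / time_in_bed,  # percent
    }


# Made-up night: 10 minutes to fall asleep and one 5-minute awakening.
night = ["wake"] * 20 + ["sleep"] * 400 + ["wake"] * 10 + ["sleep"] * 370
summary = sleep_summary(night)
print(summary)
```

Every wake epoch that a device wrongly scores as sleep lowers wake after sleep onset and raises both total sleep time and sleep efficiency, which is the pattern of errors reported in the studies above.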
Other sleep apps also include an alarm intended to wake individuals up during the lighter stages of sleep so that they feel more wakeful in the morning. The app tracks an individual’s pattern of sleep to define an ‘optimal’ window in which to wake up, and the alarm tries to sound within that window. However, there is currently limited evidence to back up these claims (Kelly et al., 2012), and considerably more research is needed before such claims can be validated (SLEEPIO Article). It is somewhat surprising that such research has not been carried out considering how easy it would be to create an experiment where participants are randomized to either an optimal or sub-optimal wake-up alarm condition over 1-2 weeks (see Kelly et al., 2012 for expansion on this very point).
Sleep is something which can be fickle at many times throughout our lives, and it is not surprising that we would want to learn more about it. For those of us who struggle with our sleep on a regular basis, there is an obvious appeal to being able to track our sleep in the comfort of our own homes and on a regular (perhaps even nightly) basis. However, such tracking may provide a skewed notion of how an individual is sleeping (for better or for worse), and this can cause unfounded alarm or comfort. The current wearables have a reasonable ability to tell when and for how long you’re asleep, but be sceptical of their ability to tell you anything about the quality of your sleep. If you are genuinely concerned about your sleep then please consult your doctor.
Sleep apps, wearables and other sleep trackers are a fantastic idea and if they can prove to be comparable to other methods such as actigraphy then I see few reasons to discourage their use. However, there is a lack of available data to really understand whether these different sleep trackers are accurate. If they are simply measuring time spent asleep and awake then they seem to be okay, and comparable to measures used in sleep experiments. Yet, bolder claims about smart alarms, tracking individual sleep stages, and their use in sleep disorders have not been conclusively studied at this moment in time. That is not to say they never will be, but there is good reason to be cautious until that evidence is available to us. So, by all means use these trackers and make them part of your daily routine if you so wish. However, understand their limitations and be aware that paying minute attention to your sleep may also create its own problems. More on that point in my next post.
de Zambotti, M., Baker, F. C., & Colrain, I. M. (2015). Validation of Sleep-Tracking Technology Compared with Polysomnography in Adolescents. Sleep, 38(9), 1461-1468.
de Zambotti, M., Claudatos, S., Inkelis, S., Colrain, I. M., & Baker, F. C. (2015). Evaluation of a consumer fitness-tracking device to assess sleep in adults. Chronobiology International, 32(7), 1024-1028.
** Kelly, J. M., Strecker, R. E., & Bianchi, M. T. (2012). Recent developments in home sleep-monitoring devices. ISRN Neurology, 2012. (Good expansion on the different sleep trackers available)
Montgomery-Downs, H. E., Insana, S. P., & Bond, J. A. (2012). Movement toward a novel activity monitoring device. Sleep and Breathing, 16(3), 913-917.