Is it readable? What's the Evidence?

November 2016

Earnsy Liu, TechCommNZ member and GDID student, looks for evidence (not just opinions) to help you manage the daily conundrums we face in our profession. If you have a question for Earnsy to tackle, please email comms@techcomm.nz. This month, Earnsy looks at the evidence for readability.

Once upon a time, before I’d heard of technical communication, I had to review a particular business case for content (not writing). It didn’t look very appealing, but I didn’t re-format it, partly out of fear that the action would be criticised as silly and unnecessary – did it really matter, as long as the content was right? But boy, did I get told off for that call.

Just how important is the ‘look’ of a document? If it’s important, should I have used a serif or sans serif typeface? What size? How far apart and how long should the lines have been? Let’s look at the evidence from the last 10 years or so, and at guidelines (not evidence) for accessibility, especially for those with visual impairments and dyslexia.

Do looks matter?

Lonsdale, Dyson, and Reynolds (2006) concluded that looks matter. Thirty participants were given three passages in one of three layouts. Designed for different degrees of readability (the authors call it ‘legibility’), the layouts varied different elements:

  • typeface for headings
  • font size
  • line length
  • line spacing
  • alignment
  • number of columns
  • margins
  • paragraph denotation.

Participants carrying out a matching task based on the passages performed faster, more accurately, and more efficiently using the high readability (legibility) layout. Performance decreased for the medium readability layout, and was poorest with the low readability layout.

Performance aside, there’s visual appeal. Varela, Mäki, Skorin-Kapov, & Hoßfeld (2013) created four websites and then deliberately violated design principles in them, varying the number and ‘goodness’ (suitability) of colours and typefaces. Their crowdsourced experiment with close to 700 participants from 45 countries found that the goodness of colours and typefaces strongly affected visual appeal, as did participants’ countries. In contrast, the number of colours and typefaces didn’t matter, nor did the participants’ ages.

There’s also the argument that legible, uncluttered documents are easier to read and therefore more accessible for everyone, especially those with vision loss (Round Table on Information Access for People with Print Disabilities, 2011).

So it looks like I should have re-formatted the business case. But how?

Serif or sans serif?

The results of the three studies below suggest it doesn’t matter whether serif or sans serif typefaces are used. Despite that, the participants of one study (Ling & van Schaik, 2005) definitely preferred sans serif. This echoed the findings of a previous study, in which ‘even though participants performed tasks more quickly with serif fonts, they still preferred sans serif fonts’ (Bernard et al, 2003, cited in Ling & van Schaik, 2005, p. 396).

Researchers: Ling & van Schaik (2005)
Typefaces: Arial and Times New Roman
Tasks:
  • Visual search: Finding target hyperlinks on a screen.
  • Information retrieval: Answering questions by browsing web pages.
Results: The typeface had no effect on speed, accuracy, or efficiency, but participants preferred Arial, especially after the search task.

Researchers: Beymer, Russell, & Orton (2008)
Typefaces: Helvetica and Georgia
Tasks: Reading articles on a screen and answering multiple-choice questions to measure retention (see note 1).
Results: No statistically significant (see note 2) differences in reading speed or retention for either native or non-native English speakers.

Researchers: Akhmadeeva, Tukhvatullin, & Veytsman (2012)
Typefaces: PT Serif and PT Sans
Tasks: Reading text in print and answering multiple-choice questions immediately after.
Results: No difference in reading speed or comprehension, both of which varied greatly.
Comment: Conducted in Russian. Tested short-term memory only.

For more on the serif-sans serif discussion, check out Alex Poole’s Overview of legibility research: serif vs sans serif (note that the research described is not current).

Serifs don’t matter either way for general accessibility. What’s more important is letters that are easy to distinguish from one another, such as ‘e’ from ‘o’ (Round Table, 2011). However, sans serif typefaces are recommended for those with dyslexia (British Dyslexia Association, n.d.). ‘Serifs tend to obscure the shapes of letters, making the letters run together. But a sans-serif [typeface] … increases the spacing between letters and makes them more distinguishable’ (Anthony, 2011).

Font size

A 10 pt font is too small. Beymer et al found the ‘reaction to the small 10 pt font was fairly negative’ (2008, p. 18) and thought they would probably recommend 12 pt, based on speed and preference. On the other hand, Rello, Pielot, & Marcos’s (2016) advice is to ‘make it big!’, 18 pt big. While larger fonts don’t improve reading speed, they do improve comprehension. Fixations (see note 3) are shorter, which suggests reading is easier.

Just as importantly, readers like it big: up to 26 pt in one study (Rello & Marcos, 2012). Another study found perceived readability peaked at 18 pt, and perceived comprehension was better with larger sizes (Rello et al, 2016).

If you’re wondering whether to upsize only for mature readers, the answer’s no. In one study, eight out of ten participants were in their 30s and 40s (Beymer et al, 2008), while in the other two, the average ages were 26 and 30 years (Rello & Marcos, 2012; Rello et al, 2016).

Guidelines suggest at least 12 pt for everyone (Round Table, 2011), and 12-14 pt for those with dyslexia (British Dyslexia Association, n.d.).

Researchers: Beymer et al (2008)
Fonts: Helvetica and Georgia, 10, 12, and 14 pt
Tasks: Reading articles on a screen and answering multiple-choice questions.
Results: For both native and non-native speakers of English:
  • reading speeds and retention were not statistically different
  • fixations were longer at 10 than at 14 pt
  • return sweeps (see note 4) took significantly longer at 14 than at 10 pt.

Researchers: Rello & Marcos (2012)
Fonts: Arial, 14, 18, 22, and 26 pt
Tasks: Reading fragments of text with different elements that were varied (colours, font size, spacing, etc.). After reading all the text, picking a preferred layout for each fragment from the options presented.
Results: The shortest fixations were for 22 and 26 pt, which 79 per cent of the users preferred. Also, ‘… the greatest correlation among performance and preference was found in font size’ (p. 4).
Comment: Conducted in Spanish.

Researchers: Rello et al (2016)
Fonts: Arial, 10, 12, 14, 18, 22, and 26 pt
Tasks: Reading text and answering literal and inferential questions.
Results:
  • Fixations kept getting shorter as font size increased. This suggests text was more readable at larger sizes.
  • Comprehension was better at 18-22 pt than 10-12 pt.
  • Subjective perceptions of readability improved up to 18 pt.
  • Subjective perceptions of comprehension were similar for 10-12 pt and 14-26 pt.

Line spacing

‘If the space between lines is too narrow, the print can be difficult to read. Lines of text may appear to merge with the text on the lines above and below, making it difficult to recognise word shapes. For larger font sizes, more spacing is required between lines’ (Round Table, 2011, p. 12).

It’s not clear to me what the evidence is for line spacing: three studies, three different findings. Ling & van Schaik (2007) suggest double spacing may enhance performance, especially with justified text.

On the other hand, Rello et al (2016) reported that participants felt single spacing was better for comprehension than 1.8 line spacing; the authors suggest not deviating ‘too much’ (p. 3646) from single spacing, although 1.5 would still be fine. Exact spacing isn’t critical: ‘The effects of line spacing were less pronounced [than the effects of font size]. Our study revealed significant effects on comprehension, but not on readability’ (p. 3645).

Perhaps the main thing is to be generous with spacing, and to use double-spacing ‘when text is presented to be read or scanned quickly’ (Ling & van Schaik, 2007, p. 65).

As with font size, line spacing doesn’t matter only for older readers. Over three-quarters of Ling & van Schaik’s (2007) participants were 25, and the participants in the other two studies averaged 26 and 30 years (Rello & Marcos, 2012; Rello et al, 2016).

The British Dyslexia Association recommends 1.5-line spacing.

Researchers: Ling & van Schaik (2007)
Fonts: Arial, 10 pt
Spacing: 1, 1.5, and 2 lines
Tasks: Visual search task: finding target hyperlinks on a screen.
Results:
  • Performance was best at 2 lines, then 1.5, then 1.
  • Participants preferred 2 lines to 1.5, and 1.5 to 1.
Comment: Double spacing seemed to improve performance when text was justified. See next section on line length.

Researchers: Rello & Marcos (2012)
Fonts: Arial, 14, 18, 22, and 26 pt
Spacing: 0.8, 1, 1.2, and 1.4 lines
Results: Fixations were shortest at 1.4 lines, the spacing preferred by 43 per cent of participants.

Researchers: Rello et al (2016)
Fonts: Arial, 10, 12, 14, 18, 22, and 26 pt
Spacing: 0.8, 1, 1.4, and 1.8 lines
Results:
  • Comprehension was significantly better at larger spacing than at 0.8 lines.
  • Fixation durations were unaffected.
  • Subjective perceptions of comprehension were significantly higher at 1 than at 1.8 lines. The authors concluded ‘participants felt that their comprehension was impaired by the largest line spacing’ (p. 3644).

Line length

Line length is usually measured in characters per line, or cpl. Although longer lines are read faster, shorter lines may be read more accurately, and readers prefer them (Ling & van Schaik, 2005). So if it’s a choice between long and short lines, ‘opt for the latter, because ... the majority of textual web content is presented to be read rather than skimmed’ (Ling & van Schaik, 2005, p. 403).

But tread carefully when discussing line length. Shaikh’s (2005) results suggest it can be subjective, and hence controversial.

The British Dyslexia Association recommends 60-70 cpl.
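For plain text, characters per line can be counted directly. The following is a minimal Python sketch, not a method taken from any of the studies: the function names `line_lengths` and `rewrap` and the sample text are made up for illustration, and spaces and punctuation are counted as characters, which is an assumption about how cpl is measured.

```python
import textwrap


def line_lengths(text: str) -> list[int]:
    """Return the character count of each non-empty line."""
    return [len(line) for line in text.splitlines() if line.strip()]


def rewrap(text: str, max_cpl: int = 70) -> str:
    """Re-wrap one paragraph so that no line is longer than max_cpl characters."""
    # Collapse existing line breaks and repeated spaces before wrapping.
    return textwrap.fill(" ".join(text.split()), width=max_cpl)


# Illustrative sample text (hypothetical, not from any study).
sample = (
    "Line length is usually measured in characters per line. "
    "Longer lines may be read faster, but readers tend to prefer shorter ones. "
) * 3

wrapped = rewrap(sample, max_cpl=70)
print(wrapped)
print("Longest line:", max(line_lengths(wrapped)), "characters")
```

Changing `max_cpl` to 44 or 95 gives a quick feel for how different the lengths tested in the studies below look on screen.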

Researchers: Shaikh (2005)
Fonts: Arial, 10 pt
Line lengths: 35, 55, 75, and 95 cpl
Tasks: Reading short news articles and answering comprehension questions.
Results:
  • Reading speed and efficiency were best when lines were longest (95 cpl).
  • Comprehension and satisfaction were unaffected by length.
  • Preference was mixed: 30 per cent liked 35 cpl best because short lines could be read faster and required less eye movement. Another 30 per cent liked 95 cpl best because there was more information on a page. And all 20 participants picked either 35 or 95 cpl as their least preferred length.

Researchers: Ling & van Schaik (2005)
Fonts: Arial, 10 pt, and Times New Roman, 12 pt
Line lengths: 55, 70, 85, and 100 cpl
Tasks:
  • Visual search: Finding target hyperlinks on a screen.
  • Information retrieval: Answering questions by browsing web pages.
Results:
  • For visual search:
    • speed improved when identifying hyperlinks at 85-100 cpl.
    • accuracy improved when noting the absence of hyperlinks at 70 cpl.
  • Longer lines ‘… allow users to scan quickly across the page, while reducing the number of separate lines that need to be scanned for a given amount of information … In addition … [scrolling] would obviously be reduced for longer line lengths, hence increasing speed’ (p. 402).
  • For information retrieval, performance was not affected by line length.
  • Participants preferred shorter lines, especially after the information retrieval task.

Researchers: Rello & Marcos (2012)
Fonts: Arial, sizes 14, 18, 22, and 26 pt
Line lengths: 22, 44, 66, and 88 cpl
Results:
  • The shortest fixations (0.174 sec) were for the longest lines (88 cpl), but only 8 per cent of the participants liked this length.
  • The second shortest fixations (0.175 sec) were at 44 cpl, preferred by 69 per cent.

It is possible that short lines are particularly important for non-native readers. In a study of Asians who had lived in the U.S. for less than five years, ‘many users mentioned that [the traditional] format is too crowded and hard to read, especially on a lengthy page. In addition, [it] felt very uncomfortable to their eyes and brain’ (Yu & Miller, 2010, p. 6). To help such readers, some software breaks up text at appropriate junctions into a cascading or a Jenga-like format.

It hadn’t occurred to me before, but line length ‘can also be considered in terms of alignment, the way in which text is spread across the page’ (Ling & van Schaik, 2007, p. 61). Left-aligned text tends to speed up searching, but participants prefer justified text. This is not entirely surprising; some participants in Lonsdale et al’s (2006) study preferred the medium readability layout partly because its justified text made it look organised.

Researchers: Ling & van Schaik (2007)
Fonts: Arial, 10 pt
Alignment: Left-aligned or justified. Characters per line not specified.
Tasks: Visual search task: finding target hyperlinks on a screen.
Results:
  • Performance with left-aligned text was better than or the same as with justified text. Search results suggested that ‘justified text was more difficult to search accurately’ (p. 65), but double spacing compensated for this.
  • Participants preferred justified text, regardless of which alignment they had been given.

In line with the findings on performance, use left-alignment for clear, readable documents (Round Table, 2011). It is also recommended for those with dyslexia: ‘Justified text … creates large uneven spaces between letters and words. When these spaces line up above one another, a distracting river of white space can appear. This can cause dyslexic readers to repeatedly lose their place when reading’ (Anthony, 2011).

Conclusion

It’s disappointing that the findings have almost exclusively been for screens and would not have applied to my business case. Apparently ‘extensive research’ has been conducted for print (Ling & van Schaik, 2007, p. 60), but the paucity of print research in recent years, and changes in reading habits, make it difficult to know what applies today.

The print-screen distinction matters. For example, many typefaces ‘…are designed to be printed and so may be less easy to read when presented on screen . . . . Investigations of the effect of line length in both printed and on-screen formats have generated differing results’ (Ling & van Schaik, 2005, p. 396).

Let’s make the best of what we have and look at what applies to screens, print, and both formats.

Screens

Based on recent findings for screens, the following would seem advisable:

  • Use either serif or sans serif typefaces (no difference in performance); participants prefer sans serif, which is also recommended for readers with dyslexia.
  • Use large font sizes, for example, 18 pt (better performance and preferred).
  • Space lines generously (better performance and preferred), especially for scanning.
  • Make lines 44-70 cpl (better for reading and preferred).
  • Left align lines (better performance) or justify them (preferred).
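
The list above can be turned into a rough checklist. The Python sketch below is only one possible reading of it: the `ScreenLayout` class and `review` function are hypothetical names, and treating 18 pt as a minimum (rather than as one example of a large size) is my assumption, not a finding from the studies.

```python
from dataclasses import dataclass


@dataclass
class ScreenLayout:
    """Hypothetical description of an on-screen text layout."""
    font_size_pt: float
    chars_per_line: int
    alignment: str  # "left" or "justified"


def review(layout: ScreenLayout) -> list[str]:
    """Return reminders where a layout departs from the suggestions above."""
    notes = []
    # 18 pt is the example size in the list; using it as a floor is an assumption.
    if layout.font_size_pt < 18:
        notes.append("font size is below 18 pt")
    # 44-70 cpl is the range suggested in the list.
    if not 44 <= layout.chars_per_line <= 70:
        notes.append("line length is outside 44-70 cpl")
    # Left alignment performed better; justified text was preferred.
    if layout.alignment == "justified":
        notes.append("justified text is preferred but may be harder to search")
    elif layout.alignment != "left":
        notes.append("alignment is neither left nor justified")
    return notes


print(review(ScreenLayout(font_size_pt=10, chars_per_line=100, alignment="justified")))
print(review(ScreenLayout(font_size_pt=18, chars_per_line=60, alignment="left")))
```

Line spacing is left out on purpose, because the studies disagree on a specific value.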

Print

With print, my interpretation of the only study is:

  • Don’t sweat over using serif or sans serif typefaces (no difference in performance).

The following points merely outline the format in Lonsdale et al’s (2006) high readability layout. Some of the studies and opinions they were based on are decades old, and some are just opinions. Also, although the high readability format worked, it is hard to know which elements are recommended, so please take the following points with a large pinch of salt:

  • Typeface and font size were Times New Roman, 10.5 pt.
  • Line spacing was 14 pt.
  • Lines were 70 cpl and left aligned.
  • By the way, do you know how long lines in MS Word are? I didn’t, so I checked and found they were too long:
    • Times New Roman, 10.5 pt gave approximately 103 cpl
    • Calibri, 11 pt, gave approximately 97 cpl.

Both

Font size, line spacing, and line length need to be balanced (Lonsdale et al, 2006). As font size is increased, increase line spacing accordingly.
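
One simple way to increase line spacing ‘accordingly’ is to keep the ratio of line spacing to font size constant. The Python sketch below uses the ratio from the high readability layout described above (14 pt line spacing on 10.5 pt type). Holding that ratio fixed at other sizes is my assumption, not a recommendation from Lonsdale et al, and the function name `line_spacing_pt` is made up for illustration. Note that this ratio is measured in points, so it is not the same thing as the ‘1.5 lines’ setting in a word processor.

```python
import math

# Ratio taken from the high readability layout: 14 pt spacing on 10.5 pt type.
LAYOUT_RATIO = 14 / 10.5


def line_spacing_pt(font_size_pt: float, ratio: float = LAYOUT_RATIO) -> float:
    """Scale line spacing in proportion to font size (an assumed rule of thumb)."""
    if font_size_pt <= 0:
        raise ValueError("font size must be positive")
    return font_size_pt * ratio


for size in (10.5, 12, 14, 18):
    print(f"{size} pt type: about {line_spacing_pt(size):.1f} pt line spacing")

# Sanity check: the original layout's values are reproduced.
assert math.isclose(line_spacing_pt(10.5), 14.0)
```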

If evidence on performance and preference contradict, consider the purpose of the design. ‘For commercial sites, aesthetic decisions should perhaps be prioritised over more objective ones. However for other sites where site traffic is of less relevance designers may be better served by the performance data’ (Ling & van Schaik, 2007, p. 65).

Notes

  1. Retention: Memory.
  2. Statistically significant: In this article, 'significant' means statistically significant: a result that is unlikely to have occurred by chance alone. It does not mean important or great.
  3. Fixations: Brief pauses for processing. Longer fixations suggest readers are having difficulty.
  4. Return sweeps: Moving from the end of a line to the beginning of the next.

Acknowledgements

Thank you to Emma Harding for being my sounding board and to Kevin Prince for the references on accessibility.

References

Akhmadeeva, L., Tukhvatullin, I., & Veytsman, B. (2012). Do serifs help in comprehension of printed text? An experiment with Cyrillic readers. Vision Research, 65, 21-24. Retrieved from Academic OneFile.

Anthony (2011). 6 surprising bad practices that hurt dyslexic users. Retrieved Oct 16, 2016, from UX Movement website: http://uxmovement.com/content/6-surprising-bad-practices-that-hurt-dyslexic-users/

Beymer, D., Russell, D., & Orton, P. (2008). An eye tracking study of how font size and type influence online reading. Conference proceedings of 22nd British HCI Group Annual Conference on People and Computers: Culture, Creativity, Interaction (15-18). Liverpool. Retrieved from https://www.semanticscholar.org/paper/An-eye-tracking-study-of-how-font-size-and-type-Beymer-Russell/501066ccf251484eabff32fb1d5dd93faac5716d

British Dyslexia Association (n.d.). Dyslexia style guide. Retrieved Oct 16, 2016, from www.bdadyslexia.org.uk/common/ckeditor/filemanager/userfiles/About_Us/policies/Dyslexia_Style_Guide.pdf

Ling, J., & van Schaik, P. (2005). The influence of font type and line length on visual search and information retrieval in web pages. International Journal of Human-Computer Studies, 64, 395-404. Retrieved from Academic OneFile.

Ling, J., & van Schaik, P. (2007). The influence of line space and text alignment on visual search of web pages. Displays, 28, 60-67. Retrieved Sep 21, 2016, from http://www.academia.edu/download/40627517/The_influence_of_line_spacing_and_text_a20151204-15111-1outxwg

Lonsdale, M. d. S., Dyson, M. C., & Reynolds, L. (2006). Reading in examination-type situations: The effects of text layout on performance. Journal of Research in Reading, 29(4), 433-453. Retrieved from EBSCO database.

Rello, L., & Marcos, M.-C. (2012). An eye tracking study on text customization for user performance and preference. Web Congress (LA-WEB), 2012 Eighth Latin American. Retrieved from http://www.academia.edu/download/30709425/LAWEB_2012.pdf

Rello, L., Pielot, M., & Marcos, M.-C. (2016). Make it big! The effect of font size and line spacing on online readability. Conference proceedings of 2016 CHI Conference on Human Factors in Computing Systems (3637-3648). Retrieved from https://www.researchgate.net/publication/301935601_Make_It_Big_The_Effect_of_Font_Size_and_Line_Spacing_on_Online_Readability

Round Table on Information Access for People with Print Disabilities Inc. (2011). Guidelines for producing clear print. Retrieved Oct 16, 2016, from http://printdisability.org/wp-content/uploads/2013/09/round_table_-clear_print_guidelines-PDF.pdf

Shaikh, A. D. (2005). The effects of line length on reading online news. Retrieved Sep 11, 2016, from Usability news website: http://usabilitynews.org/the-effects-of-line-length-on-reading-online-news/

Varela, M., Mäki, T., Skorin-Kapov, L., & Hoßfeld, T. (2013). Towards an understanding of visual appeal in website design. Conference proceedings of QoMex 2013. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.716.707&rep=rep1&type=pdf

Yu, C.-H., & Miller, R. C. (2010). Enhancing web page readability for non-native readers. Conference proceedings of the 28th International Conference on Human Factors in Computing Systems (CHI'10) (2523-2532). Retrieved Sep 11, 2016, from https://groups.csail.mit.edu/uid/projects/froggy/chi10-froggy.pdf