Why does written Chinese not need spaces?

When English is written without spaces between the words, a sentence looks like a string of jumbled characters, such as: @#¥%……&*(). Adding spaces between the words of a Chinese sentence, by contrast, feels redundant. Even native English speakers depend on spaces for reading comprehension.

So why does English require spaces between words, while Chinese doesn’t? What are the underlying reasons for this difference? Scientists at the Institute of Psychology, Chinese Academy of Sciences have found that the answer lies in an ‘economic’ principle.

English is an alphabetic writing system where each letter represents a phoneme, and words are typically composed of multiple letters. Spaces in English text serve to clearly define word boundaries, marking the start and end of each word. 

Chinese, in contrast, is a logographic writing system where each character represents a syllable or morpheme. Chinese text flows with consecutive characters, without spaces separating words. Most Chinese words are short, typically made up of one or two characters, with an average word length of 1.40 characters. This consistency makes it easier for Chinese readers to predict word lengths, allowing them to quickly recognize word boundaries with less ambiguity.

On the other hand, English words are longer and more variable in length, averaging 3.78 letters, which makes it harder for readers to predict where one word ends and another begins. This creates greater uncertainty regarding word boundaries.
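To make the contrast in predictability concrete, here is a minimal Python sketch that computes the mean word length and the Shannon entropy of the word-length distribution for two tiny hand-made samples. The sample sentences, the function name `length_stats`, and the use of length entropy as a rough proxy for boundary uncertainty are all illustrative assumptions; this is not the researchers' corpus or their measure.

```python
import math
from collections import Counter


def length_stats(words):
    """Return (mean word length, entropy in bits of the length distribution)."""
    lengths = [len(w) for w in words]
    total = len(lengths)
    mean = sum(lengths) / total
    counts = Counter(lengths)
    # Shannon entropy: higher means word length is harder to predict.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return mean, entropy


# Toy samples (assumptions for illustration only).
english_words = "the cat sat on a remarkably comfortable mat".split()
chinese_words = ["我们", "在", "北京", "学习", "中文"]  # segmented by hand

for name, words in [("English", english_words), ("Chinese", chinese_words)]:
    mean, entropy = length_stats(words)
    print(f"{name}: mean length {mean:.2f}, length entropy {entropy:.2f} bits")
```

In this toy sample the English word lengths range from 1 to 11 letters, while the Chinese words are all one or two characters long, so the English length distribution carries more entropy. Real corpora would be needed to say anything quantitative.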

Researchers used information theory to analyze how different writing systems mark word boundaries in a large-scale corpus of 27 languages. They found that whether a writing system uses spaces to define word boundaries is tied to the amount of information these spaces provide. In languages like English, where spaces are used, they provide significant information (2.90 bits). In languages like Chinese, where spaces are not typically used, the added information from spaces is much lower (1.10 bits).
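As a reading aid for these figures, the sketch below converts a quantity in bits into the equivalent number of equally likely alternatives, using the standard relation that n bits distinguish 2^n options. The function name `equivalent_choices` is hypothetical, and the conversion is only an intuition pump; it does not reproduce the researchers' corpus analysis.

```python
def equivalent_choices(bits):
    """Number of equally likely alternatives that `bits` of information can distinguish."""
    return 2 ** bits


# Values reported in the article for spaces in English and in Chinese.
for language, bits in [("English", 2.90), ("Chinese", 1.10)]:
    print(f"{language}: {bits:.2f} bits ~ {equivalent_choices(bits):.2f} equally likely alternatives")
```

Read this way, a space in English resolves roughly as much uncertainty as picking one option out of seven or so, whereas a space in Chinese resolves about as much as a choice between two.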

This difference in the value of spaces stems from the varying levels of uncertainty regarding word boundaries in different writing systems. Since Chinese has less uncertainty about word boundaries, inserting spaces provides limited additional information. In contrast, English word boundaries are more ambiguous, making spaces critical in helping readers identify words more easily.

The amount of information provided by spaces for marking word boundaries reflects the cognitive effort readers invest in segmenting words when reading text without spaces. Without spaces, readers must break a continuous string into distinct words, a process known as ‘word segmentation’ or ‘sentence breaking.’

During this process, readers rely on contextual clues and linguistic knowledge to identify word boundaries. In some cases, errors in segmentation can occur, requiring readers to detect and correct them. Both the segmentation process and error correction demand cognitive effort, which influences reading speed. In English, spaces provide a large amount of information, and removing them forces readers to exert more cognitive effort, increasing the likelihood of segmentation errors. On the other hand, in Chinese, where the information provided by spaces is minimal, readers can segment text without spaces with less cognitive strain. Therefore, English relies on spaces to reduce the cognitive burden of word segmentation, while Chinese, with its more predictable word boundaries, forgoes them.
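The segmentation problem described above can be sketched in a few lines of Python. The example enumerates every way to split an unspaced string into words from a small lexicon; the function name `segmentations`, the toy lexicon, and the example string are assumptions chosen for illustration, and human readers of course do not segment text by exhaustive search.

```python
from functools import lru_cache


def segmentations(text, lexicon):
    """Return every way to split `text` into words drawn from `lexicon`."""
    lexicon = frozenset(lexicon)

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the string means one complete segmentation.
        if start == len(text):
            return [[]]
        results = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in lexicon:
                for rest in split_from(end):
                    results.append([word] + rest)
        return results

    return split_from(0)


# Toy lexicon (an assumption for illustration).
toy_lexicon = {"god", "is", "now", "here", "nowhere"}
for words in segmentations("godisnowhere", toy_lexicon):
    print(" ".join(words))
```

With this lexicon the string has two valid readings, "god is now here" and "god is nowhere". Choosing between them requires context, which is the kind of extra effort that spaces spare English readers.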

Consistent with these findings, previous research has shown that altering how word boundaries are marked affects reading efficiency differently across languages. Studies have demonstrated that removing spaces from writing systems with high space-based information (e.g., English) can reduce reading speed significantly, by as much as 50%. Conversely, in systems with low space-based information (e.g., Chinese), inserting spaces does not significantly enhance reading speed.

The decision to use spaces in English and not in Chinese may be an efficiency-driven choice aimed at achieving economy in reading. In reading, the range of visual perception at a single point of gaze is limited. Inserting spaces reduces the number of characters visible at once, which in turn decreases the efficiency of visual processing. For Chinese, the information provided by spaces is minimal, and readers can easily segment text without them. As a result, the benefit of using spaces in Chinese is insufficient to outweigh the cost they impose on visual perception, making it more efficient to omit spaces.

In contrast, spaces in alphabetic systems like English provide significant information, and removing them would require readers to invest more cognitive effort in word segmentation. Therefore, in English, the benefits of spaces for aiding word segmentation far outweigh their visual perception costs.
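The visual cost side of this trade-off can be illustrated with some back-of-the-envelope arithmetic. The sketch below estimates what fraction of a fixed-width viewing window is occupied by actual characters, rather than spaces, when one space follows every word. It uses the average word lengths quoted earlier (1.40 and 3.78) and assumes that a space is as wide as one character; that assumption, and the function name `content_fraction`, are mine rather than part of the reported study.

```python
def content_fraction(mean_word_length, space_width=1.0):
    """Fraction of horizontal extent taken by characters when each word is followed by a space.

    `space_width` is measured in character widths (assumed to be 1.0 here).
    """
    return mean_word_length / (mean_word_length + space_width)


for language, mean_len in [("Chinese", 1.40), ("English", 3.78)]:
    print(f"{language}: {content_fraction(mean_len):.1%} of the window carries characters")
```

Under these assumptions, spaces would consume over 40% of the window in Chinese but only about 20% in English, which is consistent with the argument that spaces are proportionally more expensive for a script with short words.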

Ultimately, whether a writing system uses spaces or not reflects a choice to balance the cognitive effort of word segmentation with the efficiency of visual processing. Both systems adopt the more economical approach to marking word boundaries, optimizing for efficient reading.

The evolutionary history of alphabetic writing systems shows that they have been gradually reformed to achieve the most economical method of marking word boundaries. Historically, alphabetic systems did not always use spaces between words. In early texts, word boundaries were not clearly marked, partly because spoken language didn’t provide explicit boundary information and because writing materials were costly. To comprehend the text, readers had to read aloud, leading to less efficient reading. During this time, writing was largely restricted to a small group of scribes and missionaries.

It wasn’t until the Renaissance, with the rising demand for literacy among the general public, that spaces between words were introduced. This change improved reading efficiency and facilitated literacy by adapting the writing system to better meet the cognitive needs of readers, aligning with the principle of economy.

In contrast, Chinese texts have historically not used spaces to indicate word boundaries. The introduction of punctuation in Chinese helped clarify sentence structures, reduce the difficulty of reading, and improve reading efficiency by making sentence boundaries clearer. However, despite the addition of punctuation, Chinese has not adopted interword spaces like alphabetic languages. This suggests that punctuation alone is sufficient to reduce the cognitive load for Chinese readers, and the added benefit of spaces is not enough to outweigh the negative impact they would have on visual processing efficiency.

Thus, the evolutionary path of Chinese, while different from alphabetic writing systems, also follows the principle of economy. By making minimal changes while preserving its original structure, Chinese has effectively improved reading efficiency.