Abstract:
Like many other complex systems, natural language is studied from a variety of perspectives and attracts diverse academic disciplines, ranging from the humanities to the formal and natural sciences. One direction of research focuses on the quantitative properties of language: it aims to identify statistical laws that characterize various elements of language and their mutual relations. A famous example of such a law is Zipf's law, which describes the distribution of word frequencies in texts. An interesting and as yet unexplored issue concerns the statistical properties of punctuation, which introduces a specific organization into written language: punctuation marks divide texts into logically and grammatically coherent parts, clarify the meaning of potentially ambiguous phrases, and indicate where to take a breath when reading aloud. It turns out that certain features of punctuation seem to be largely universal across languages; for example, its distribution can be characterized by just two parameters that are fairly easy to interpret. On the other hand, the values of these parameters may differ significantly between texts in different languages and point to features specific to particular languages.