H. C. Andersen's Style as Features of Variation, Complexity and Contextual Stylistic Regularity. A Preliminary Approach


The main purpose of this investigation is - of course - to contribute to the stylistic description of H. C. Andersen's texts, but since style and stylistics are quite broad terms, I find it necessary to clarify the use of the concepts in this context.

In general the theoretical notion of style covers features of artefacts (or human acts) which display the "way" things are done or the "mode" of doing them, in case there actually are different ways or modes. The denotation of the term "style" may include a large number of phenomena that are not always intuitively related, and in this context I shall confine myself to what is normally conceived of as either literary or linguistic style.

Linguistic Style

I have previously argued that

... linguistic style is the form of a specific linguistic sign, expressing specific information about a specific universe of discourse or context (a referential universe), if there are in fact alternative forms which may be interpreted as referring to basically the 'same' referential universe. (Götzsche 1994a p. 159)

I still hold this view to some extent, but I shall redefine and qualify the point of view below.

But first I have to clarify the distinction between my approach to the investigation of linguistic style and other stylistic approaches. The first kind of approach can be labelled "literary style", because its primary aim is to characterise the metaphorical and poetical features of literary works. As is well known, the central part of the standard account of Danish stylistics, Albeck (1939), is concerned with these matters, i.e. the tropes (metaphors and metonyms and derivations thereof), and the stylistic figures (i.e. the musical features of rhythm and rhyme on the one hand, and the patterns of different kinds of word order on the other). Albeck also deals with the choice of words and syntactic constructions, but there can be no doubt that the purpose of characterising these features is to contribute to the description of the literary work. It is evident that what is investigated is linguistic form and content, but it is equally evident that the form is investigated only as a medium of the content, and that the result of the analysis is a description of the conceptual and emotional functions of the form in the fictional universe of the literary work.

The other kind of approach can be labelled the "semiotic approach", because it takes up any kind of (linguistic) communication as its material of analysis and investigates all the "communicative (linguistic) levels" of (linguistic or other) media, often written texts. This approach implies the assumption that there are a number of "deeper levels" under the surface of the communicative (linguistic) form, and that these patterns of meaning have certain functions in the (linguistic) communication, the most important function being the "effects" on the person(s) who receive the (linguistic) message. In the case of texts it is evident that such an approach involves the investigation of the "semantics", the "pragmatics" and the "situation context" of the text, and furthermore it involves interpretations of general phenomena concerning the psychological and social characteristics of people. A proponent of this view is Cassirer (cf. Cassirer 1970 and 1986), and in a recent paper (Cassirer 1994b) he elaborates his line of reasoning.1 His point of view implies that the most important feature of style is the effect of the linguistic product on persons, and consequently he proposes that stylistics as a notion within philological and linguistic research should be replaced by the notion of rhetoric (cf. Cassirer 1994a p. 14). This constitutes a coherent theory of (some of) the functions of the linguistic sign, and therefore it will be a logical step to classify it as a "semiotic approach". But it follows from the central concepts of the theory that certain linguistic forms are supposed to have been chosen by the author (primarily?) because of their assumed or predicted effects, and that there may have been alternative linguistic forms, the use of which has been rejected because they might have had other effects.
This furthermore implies that the most important reasons for choosing specific linguistic forms seem to be the assumed or predicted effects on the communicating persons, and not what they actually are talking or writing about. Therefore this approach may yield the most convincing analyses when the materials of the investigation are commercial or political language, or literary works.

I do not reject the approaches mentioned above, but I shall point to the fact that there seem to be some features of linguistic signs - including literary works - which in general are not captured by these approaches. Such features - which may be almost impossible to translate (cf. Götzsche 1994b) - have to do with the overall impression of the mode of expressing oneself which constitutes the style of an author or the style of a genre, and they may be captured by a focused investigation of the linguistic form of a linguistic product. Within the scope of literary fiction such investigations are only justified insofar as they contribute to the descriptions of literary works, but they may also be applied to other linguistic products as a part of the research in the functions of oral and written communication in societies.

If an investigation like this is to be made applicable, a number of theoretical, conceptual classifications have to be made. First of all the theoretical concept of style is by no means clear, and I shall propose a definition saying that: style is the form of a linguistic sign. This definition has the advantage (i.e. it makes the notion precise and clearly specified) that it is connected with an overall theory of semiotics, but that it deals only with linguistic expressions. Other kinds of style may be labelled with other terms. Furthermore it has the advantage that it does not deal with the "deeper levels" of a linguistic product, such as its specific "semantics" and "pragmatics", but is delimited to linguistic form (cf. below). Accordingly, one is not tempted to make empathic interpretations of the forms to explain the choice of expressions; one only has to describe the characteristics of the linguistic product - expressed in a particular language - to say something about its style. Finally the definition has the advantage that it is not especially controversial what the subject of investigation is, namely the linguistic expressions and not the special "contextual" meanings expressed by them, or the pattern of the universe of discourse. To be able to apply a terminology in such descriptions it is necessary to make some theoretical distinctions.

Text and Work

First of all one has to separate the literary work from the text. Since H. C. Andersen's fairy tales and other stories are the materials of this investigation, these literary "texts" each have both a form and a content, and the content includes both the meaning of words and sentences, and the narrative or poetic (fictional) universe of the "texts". But the fictional universe based on the special meanings of words and sentences employed by each single "text" - and the semantics of words and sentences - is relevant to the stylistic investigation only to the extent that their general meanings are drawn upon in the characterisation of the linguistic form. Accordingly the special meanings of words and sentences, and their establishing a certain fictional universe, constitute the elements of the literary work, and "The text is just a sequence of words, and the only kind of interpreting that one can do of a text is to expound the meanings of its constituent words and sentences as they are given by the conventions of the language" (Currie 1991 p. 338), i.e. a text can be investigated by grammatical (including sentence and word semantic) methods only. Consequently the stylistic investigation carried out in the following only considers the texts of H. C. Andersen as its subject, not the literary works expressed by those texts.

Then what are grammatical methods, if only such methods can be applied to texts, and to what extent is it plausible to draw upon the (general) semantics and pragmatics of the words and sentences without being theoretically inconsistent? If one follows the line of reasoning behind the definition of the theoretical concept of style proposed above, then there emerges a difference between on the one hand the form of the linguistic sign as a whole, and on the other hand the form of the minimal linguistic sign. In other theoretical contexts (cf. Götzsche 1994a p. 54 sqq) I have argued that the definition of a sentence should be just that, namely the form of the minimal linguistic sign. For a sequence of linguistic expressions to be interpreted as a sentence, it must be constructed in accordance with a certain linguistic structure, namely the syntactic structure of the particular language in which the sentence is constructed; and this is the scope of syntax. Style, then, is the form of the whole of a linguistic sign, while syntax is the specific structure of the parts of the linguistic sign that makes these parts sentences. Both syntax and style can be conceived of within the scope of grammar in a broad sense. If one asks what the entities subjected to scrutiny in each case actually are, then the normal answer is that they are words in sentences and words investigated as cross-sentence entities. Theoretically, this is not without problems (cf. Götzsche 1994a p. 175 sqq), but as for this project it will do, and accordingly the two subjects of this paper are some salient stylistic features: 1) the nature of the syntactic constructions, and 2) the lexical characteristics of the texts of H. C. Andersen called fairy tales and stories ("Eventyr og Historier").

In connection with these considerations concerning the theoretical notion of style it is crucial to observe that the notion proposed above does not presuppose a notion of stylistic variation, and in this respect it differs substantially from traditional stylistics. In accordance with the point of view held by me now - a view that differs from my previous accounts (cf. the quotation above) - variation and variability are a possible property of style, not a prerequisite, and this should solve the problem of how to describe the style of expressions for which there are no alternative forms, e.g. numerals in arithmetic.

The Methods of Investigation

The overall purpose of the following investigation is to ascertain the general features of the syntactic constructions in some of H. C. Andersen's texts, and furthermore the general lexical features of the same. Accordingly, the textual properties examined are the ones mentioned in the title of this paper: complexity and regularity. This includes both the syntax and the lexicon of the texts. Thus syntactic constructions can be said to be either complex or simple, and either type may occur more or less often in the texts, and in parallel the words of a text may display different occurrences. This implies that the features are not parameters in a strict sense, because the corresponding values have no either/or (but not both) instantiations. Complexity and simplicity are relative notions, and one has to decide about more or less arbitrary limits between them. But once these limits have been established they can function as parameters, and the frequency of the values can be estimated. Frequency is also a relative concept in this investigation, because the observed frequencies of the units of the different texts are fairly easy to find, but what is interesting is the relative frequency of these units in each text, or as cross-text relative frequency (for convenience relative frequency will be expressed as percentage in the following). And to make the results of the statistics relevant to a stylistic description one has to decide about a more or less arbitrary borderline between frequent and non-frequent occurrences.

It should be noticed that the occurrence of words is not a straightforward concept. In this context I shall only look upon words as word forms (i.e. specific combinations of letters - so-called word types of a lexicon - found as word tokens between blanks in texts) and as lexemes (abstract words of which the word forms are inflections). The term word will be used in no other sense than word form, and accordingly lexical occurrence is the (observed or relative) frequency of the word tokens (henceforth: text words) of word forms and lexemes.

To carry out a limited part of the task mentioned above two kinds of preliminary investigation have been made: a lexical statistical survey of the first 31 of the texts of the corpus "Eventyr og Historier", and a syntactic analysis of the sentence(s) of randomly chosen periods in the first 31 texts of the corpus. It would be difficult to carry out manual parsing of all 31 texts, so I have selected the period (defined as the text between full stops) after the 23rd full stop and carried out a syntactic analysis of the sentence(s) incorporated in the period. If the text is too short - i.e. there are not 23 full stops (or there are fewer than 5 full stops between the 23rd and the end of the text) - I have selected the period after the 5th full stop. Should there be any peculiarities about the period after the 23rd full stop (e.g. it is not a full sentence but a sentence fragment) I have chosen the 47th period as the pivot.
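The selection rule can be restated as a short routine. The following is only a minimal sketch under stated assumptions, not the procedure actually used: the function name select_period and the is_peculiar callback are hypothetical, the judgement of "peculiarity" is left to the caller because it was made by hand, and "the 47th period" is read literally as the period following the 46th full stop.

```python
def select_period(text, is_peculiar=lambda period: False):
    """Return the period picked for syntactic analysis (illustrative sketch)."""
    # A period is the text between full stops; segments[n] follows the n-th full stop.
    segments = [s.strip() for s in text.split(".")]
    full_stops = len(segments) - 1

    def after(n):
        # Period that follows the n-th full stop, if the text has one.
        if n >= len(segments) or not segments[n]:
            raise ValueError(f"no period after full stop {n}")
        return segments[n]

    # Too short: fewer than 23 full stops, or fewer than 5 between the 23rd and the end.
    if full_stops < 23 or full_stops - 23 < 5:
        return after(5)

    chosen = after(23)
    if is_peculiar(chosen):
        # Assumption: "the 47th period" is the period after the 46th full stop.
        return after(46)
    return chosen
```

A plain split on full stops also breaks at abbreviations and initials, so a real run would need the same manual checking as the selection described above.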

The Style of H. C. Andersen's Texts

In the introduction above I have mentioned the syntactic analysis first, but in fact I shall start with the statistics of the lexical units. In the traditional account of H. C. Andersen's (HCA's) style and language by Jensen (1929) it is claimed that HCA is especially fond of the adjectives nydelig and dejlig:

...nydelig er et antageligt ord, men bedre er - hvilket? [...] Hvorhen vi end vender os i eventyrenes verden, møder vi dette adjektiv [i.e. dejlig]. ... pretty is an acceptable word, but what is better is - which one? [...] Wherever we look in the world of the fairy tales we meet this adjective. (Jensen 1929 p. 11)

This is not a precise evaluation of the occurrence of the words, and by means of modern technology it is possible to test the claim. The questions can be posed like this: is it true that the words mentioned are especially frequent in HCA's texts; are there any words which - apart from these - are especially frequent; and are these words also frequent in Modern Standard Danish?

To try to answer the questions I have counted the words of each of the first 11 texts in the electronic version of "Eventyr og Historier" and also the words of a combined corpus of the first 31 texts. The texts are named by the "Eventyrkode" (Fairy Tale code) number according to Møller (1967 pp. 9-11), and they will be referred to by these figures. All the words of the texts and the corpus have been counted by means of some simple computational routines, and out of these sequences of text words has been automatically produced a list of lexicon words (word types) in which each word is followed by a figure stating the number of occurrences (text words or word tokens). A typical example of such a list is displayed in appendix 1 which is an extract of the list made on the basis of text 010, selecting a part of the list with the initial letter h - which is a letter with a medium frequency as an initial letter (ca. 4500 words of the corpus). Examining appendix 1, what is striking at a first glance is the fact that relatively few of the words occur more than one or two times, although the text is the longest of the 11 texts (9 509 text words).
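The "simple computational routines" are not specified here, but a list of this kind can be produced in a few lines. The sketch below is an assumption about how such a count might look, with hypothetical names (count_word_forms, forms_with_initial); it treats every unbroken run of letters as a text word and keeps upper and lower case apart, as the word-form definition above suggests.

```python
import re
from collections import Counter

# Runs of letters only (digits and underscore excluded); covers æ, ø, å as well.
WORD = re.compile(r"[^\W\d_]+")

def count_word_forms(text):
    """Map each word form (word type) to its number of text words (tokens)."""
    return Counter(WORD.findall(text))

def forms_with_initial(counts, letter):
    """Extract the part of the list whose word forms begin with a given letter."""
    letter = letter.lower()
    return sorted((form, n) for form, n in counts.items()
                  if form.lower().startswith(letter))
```

Calling forms_with_initial(counts, "h") on the counts of a text would give an extract comparable in shape to the one in appendix 1.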

After that I have, from each text, selected the lexemes with a total of word forms of 10 or more for further investigation, and these words form a special list. By experience it can be assumed that the length of the texts does not significantly affect the probability of occurring 10 or more times, because the longer the text the larger the lexicon (cf. appendix 1) and the possibility of variation, and the shorter the text the higher the relative frequency of frequent words, so presumably not many words will be eliminated because of the limit at 10 words. In addition I have, in the special list, included the lexemes deilig and nydelig because they are of special interest, although they do not have 10 or more word form occurrences. In the special list I have marked the words which either belong to the core set of words in Modern Standard Danish or which are thematic words, i.e. words belonging naturally to the theme of the story. The core words are the ones isolated by means of advanced statistical methods by Ruus (1995), and they are marked with a K in the list. Words which are thematic are marked with a T in the list, and words not belonging to the core set of Modern Standard Danish (MSDan) but belonging to the set of the 5 000 most frequent MSDan words (cf. DFO) are marked with an F.
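The filtering and marking step can be sketched as follows. This is an illustration under assumptions, not the routine actually used: the function names, the lexeme_of mapping from word forms to lexemes, and the three word sets (core, thematic, frequent) are hypothetical inputs that would have to be supplied from Ruus (1995), the reading of the story, and DFO respectively.

```python
from collections import Counter

def lexeme_totals(form_counts, lexeme_of):
    """Sum word-form counts per lexeme; unmapped forms count as their own lexeme."""
    totals = Counter()
    for form, n in form_counts.items():
        totals[lexeme_of.get(form, form)] += n
    return totals

def special_list(totals, core, thematic, frequent, always_include=(), threshold=10):
    """Keep lexemes at or above the threshold (plus forced ones) and mark them K/T/F."""
    rows = {}
    for lexeme, n in totals.items():
        if n < threshold and lexeme not in always_include:
            continue
        marks = ""
        if lexeme in core:
            marks += "K"  # core word of Modern Standard Danish
        if lexeme in thematic:
            marks += "T"  # belongs to the theme of the story
        if lexeme not in core and lexeme in frequent:
            marks += "F"  # among the most frequent MSDan words, but not core
        rows[lexeme] = (n, marks)
    return rows

def residual(rows):
    """Lexemes that received no mark at all."""
    return sorted(lexeme for lexeme, (_, marks) in rows.items() if not marks)
```

The always_include argument corresponds to adding deilig and nydelig to the list even where they fall below the limit of 10.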

The interesting feature of the list is that there is only one word in the residual list: nydelig. Most of the words have been sorted out because they are K-words, and words like Hex, Hund, Penge, Prinsesse, Soldat, Øine are considered thematic words because they are connected with the theme (including the persons) of the text. The only word which is not a K-word but an F-word is Øine, but as mentioned it is also thematic. The residual word nydelig is one of the words mentioned by Jensen above, but (as mentioned above) it does not appear in the list by nature, because it does not occur 10 or more times; in fact it occurs only two times in 001. The lexeme does in fact occur in various word forms in most of the other 10 texts (in total 47 times), but most of the occurrences are concentrated in a limited number of texts: 004 (13), 005 (18) and 006 (11); and none of the isolated word form occurrences (text words) per text amounts to 10 or more times; only the word form totals do so. One of the word forms (nydelig) also occurs with a relatively high frequency in the whole (31-text) corpus of 94 206 text words (e.g. nydelig: 0.032 (observed frequency: 30)), as compared with the very low MSDan frequency: 0.00075 (observed frequency: 30), but one might ask whether this difference is especially relevant. An observed frequency of 30 nydelig text words in a corpus of 94 206 words is, of course, relatively higher than an observed frequency of 30 nydelig text words in a corpus of 4 000 000 text words, but it is questionable whether one can justifiably claim that in the first case the word nydelig is especially frequent, when one only meets this text word 30 times when reading 94 206 words. 
If nydelig is included in a set (a semantic paradigm) with other word forms of the same lexeme, the number of occurrences increases to 83, and although this figure is almost three times as high, it does not undermine the argument: 83 occurrences in a corpus containing 94 206 text words still does not deserve a predicate like "Wherever we look we meet".
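The arithmetic behind these figures is easy to check. The sketch below only recomputes the percentages from the observed frequencies and corpus sizes quoted above; the function name relative_frequency is a hypothetical helper.

```python
def relative_frequency(observed, corpus_size):
    """Relative frequency expressed as a percentage of the text words."""
    return 100.0 * observed / corpus_size

hca_form = relative_frequency(30, 94_206)      # word form nydelig, 31-text corpus
msdan_form = relative_frequency(30, 4_000_000) # same observed frequency in MSDan
hca_lexeme = relative_frequency(83, 94_206)    # all word forms of the lexeme

ratio = hca_form / msdan_form      # how many times more frequent than in MSDan
words_per_hit = 94_206 / 30        # average distance between two occurrences

print(round(hca_form, 3), round(msdan_form, 5), round(hca_lexeme, 3))
print(round(ratio, 1), round(words_per_hit))
```

The ratio comes out at a little over 42, yet the reader meets the word form on average only about once in every 3 140 text words, which is the point of the argument.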

The same kind of reasoning may be applied to the lexeme deilig and its inflectional word forms. It is, in fact, one of the core words of MSDan, but only in two cases in the 11-text corpus does a word form occur in a single text as a text word more than 9 times (D/deilige 16 times in 007 and 21 times in 008). As opposed to the lexeme nydelig, the relative frequency of deilig/e is actually significant (technically in a non-statistical way). In fact, the word form deilig occurs 26 times as often in text 001 as in MSDan, and deilige occurs 51 times as often as in MSDan in text 004, but these ratios are based on a rather small number of occurrences: 4 in text 001 (2 531 text words) and 7 in text 004 (2 729 text words). So, even though the relative frequencies of related word forms (inflections and derivations) of deilig in the texts are fairly high compared with MSDan, the difference in observed frequencies between MSDan and the 31-text corpus (315 occurrences in the corpus of 94 206 text words; relative frequency: 0.334) does not justify the claim that "Wherever we look we meet" in HCA. To put it tersely: one might question whether 630 occurrences (or 945 or 1 260) would affect the overall impression of a word being used very often if the occurrences were equally distributed in the text corpus of 94 206 text words. On the other hand the general impression of HCA's tendency to use these particular words in certain texts may seem justified (against the background of the limited data that have been examined), but with the proviso that the occurrences are concentrated in specific texts and display fairly low frequencies in other texts.

In connection with salient and semantically significant words like deilig and nydelig one may ask whether there are other less salient words the frequency of which by itself makes them interesting as an example of the lexical favourites of HCA. In text 002 one finds the word ganske which has a relative frequency in the text of 0.228, which means that it occurs 9.0 times as often as in MSDan, and in text 004 the ratio is 20 : 1 in favour of HCA. The overall occurrence in the corpus is 7.7 times as high as in MSDan, so one might assume that HCA - one way or the other - "prefers" this "insignificant" word. The statistics evidently have to be related to the semantics of a word. For instance the word dandse occurs 12 times in text 004 (a relative frequency of 0.440), and this means that the frequency is 146 times as high as in MSDan, but this obviously has to do with the theme of the story. The word dandse cannot be substituted by another word when the author wants to tell about somebody who dances, while the word ganske - both as an adjective and (especially) as an adverb - can be substituted by a number of other words.

This pattern is repeated throughout the statistics of the 11 texts: words with very high HCA-frequencies (e.g. raabte, 120 times as frequent as in MSDan, text 002) are semantically specific words, and words with medium high frequencies are either possible HCA-preferences like ganske or stor (3 times the MSDan frequency, text 001) or context bound and semantically more or less "empty" words like the deictic hun (in the 31-text corpus 2.3 times the MSDan frequency).

By means of this combination of semantic identification and statistical evaluation it appears (on the basis of the limited set of data) that there are very few, if any, favourite words of HCA. And if there are actually some words, then they are concentrated in the context. In connection with the syntactic analysis below I shall call this concentration stylistic contextual regularity.2 While frequency says something about global textual occurrence (or corpus occurrence) of words but nothing about how they are concentrated, regularity can be said to be an effect of the local distribution of words in certain contextual parts of a text. How a reader experiences the importance of a word - what kind of psychological impression it makes on him - seems to be more a matter of contextual regularity (combined with the thematic meaning of the word) than it is a matter of its observed or relative frequency. Consequently the claim made by Jensen (1929) concerning the whole of HCA's work seems to be a result of a psychological mechanism rather than empirical facts, and the problem of "Wherever we look we meet" seems to belong to the realm of literary works and not to the stylistic examination of texts.
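The contrast between global frequency and local concentration can be made concrete with the figures reported above for the lexeme nydelig. The sketch below is purely illustrative: the function concentration_share and the idea of measuring the share held by the three densest texts are assumptions introduced here, not a measure used in the investigation.

```python
def concentration_share(per_text_counts, total, k=3):
    """Share of all occurrences that falls in the k texts with the most occurrences."""
    top = sorted(per_text_counts.values(), reverse=True)[:k]
    return sum(top) / total

# Figures quoted above: 47 occurrences across ten texts, of which
# 13 in text 004, 18 in text 005 and 11 in text 006.
nydelig_peaks = {"004": 13, "005": 18, "006": 11}
share = concentration_share(nydelig_peaks, total=47)
print(f"{share:.0%} of the occurrences fall in three of the ten texts")
```

On these figures 42 of the 47 occurrences sit in three texts, so a low corpus frequency and a strong local presence are fully compatible.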

It is a salient feature of HCA's texts and syntax that the periods (text between full stops) of his texts are often short, and that - if the periods are long - the sentences are short. I have not investigated the statistics of short vs. long periods. Such an investigation would require theoretical reasoning about the operational concepts of shortness vs. length, and furthermore software which is capable of testing the data. The software is fairly easy to develop, but there is no reason to do so before meaningful notions of the theoretical concepts have been established.
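As a rough indication of how little software is involved, the raw measurement could look like the sketch below. It is an assumption-laden illustration only: period_lengths is a hypothetical name, length is counted in text words, and the function deliberately leaves open where the limit between short and long should be drawn, since that is the theoretical question raised above.

```python
import re

WORD = re.compile(r"[^\W\d_]+")  # runs of letters, as in a simple word count

def period_lengths(text):
    """Number of text words in each period (text between full stops)."""
    return [len(WORD.findall(period))
            for period in text.split(".")
            if period.strip()]
```

Any classification into short and long periods would have to be added on top of these counts once the operational concepts have been settled.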

Instead I have made a random selection of periods in the first 31 texts, estimated the length as short or long, and analysed the syntactic constructions of some of the sentences. The random selection has not been made by a random generator but according to the method mentioned above. Of the selection of 31 periods it appears that about half of them (14) are rather short periods (cf. appendix 3). As for the rest of the periods some of them have a "normal" length:

004 [.23] Saa gik hun med dem hen til alt sit andet Legetøi, der stod paa et pænt lille Bord, og hele Skuffen var fuld af Stads.

while others are fairly long:
019 [.23] De seilede saa længe, at der ingen Land var at øine mere, og de saae en Flok Storke, de kom ogsaa hjemme fra og vilde til de varme Lande; den ene Stork fløi bag ved den anden og de havde allerede fløiet saa langt, saa langt; een af dem var saa træt, at hans Vinger næsten ikke kunde bære ham længer, han var den allersidste i Rækken og snart kom han et stort Stykke bag efter, tilsidst sank han med udbredte Vinger lavere og lavere, han gjorde endnu et Par Slag med Vingerne, men det hjalp ikke; nu berørte han med sine Fødder Tougværket paa Skibet, nu gled han ned af Seilet og bums! der stod han paa Dækket.

The function of the length of periods is mostly a matter of the reading process, but in general there is a relation between the periods and the sentences of the periods, i.e. one normally does not expect a long sequence of sentences (i.e. equivalent syntactic constructions on the same level in a syntactic hierarchy) with no full stops separating the sentences. Rather one would (in MSDan) expect a number of subordinate clauses (and subordinate clauses in subordinate clauses) as the reason for the distance between the full stops. But this does not apply to many of HCA's periods, exemplified in example 019 above. So, on the one hand one may characterise example 019 as a complex sequence of text words in a period - which may be difficult to read, and on the other hand the syntactic structure of the sentences of the period is rather simple:3

019 [.23] [Interlinear syntactic analysis of the period quoted above. The constituent labels (ns, s, v, pa, na, ra, c) and the sentence limits (ß) were set under the words of the period; the alignment is not preserved here.]

There are a limited number of syntactic constituents in each sentence (including the subordinate clauses), most of the constituents are rather short, and the phrase structure of each constituent is quite simple. So, apart from the extreme length of the period the most significant feature of HCA's syntax - based on this example and the short periods in appendix 3 - is that it is fairly simple or primitive. Before passing over to some more complicated examples I shall mention a salient feature in example 019 which is related to the account of word frequency and regularity above: the word og as a conjunction between sentences occurs 4 times, and it is even more frequent in the local context of:

005 [.23] En nydelig lille hvid Sommerfugl blev ved at flyve rundt omkring hende, og satte sig tilsidst ned paa Bladet, for den kunde saa godt lide Tommelise, og hun var saa fornøiet, for nu kunde Skruptudsen ikke naae hende og der var saa deiligt, hvor hun seilede; Solen skinnede paa Vandet, det var ligesom det deiligste Guld. (My emphasis, HG)

Here we find also the word deiligt, and this concentration of alleged HCA-features is an example of what I would label with the above mentioned notion of stylistic contextual regularity.

Actually HCA is able to employ a fairly complicated syntax:

031 [.05] [Interlinear syntactic analysis; the alignment is not preserved here. Surviving text fragments: ikke noget, at den ... desto mere da Hofhunden ... til den, ... at Springgaasen ... af god Familie; den gamle Raadmand, der ... tre Ordener for at tie stille, at Springgaasen ... begavet med ... paa Ryggen af den om ... en mild ... en stræng ... ikke engang see paa Ryggen af ham der. Surviving labels: o>sc, s, pa>sc, va, o, pa, with clause ends (<) and sentence limits (ß).]


Sentence no. 2 follows a primitive and short initial sentence, and it contains a subordinate clause as an object (o> <ß). This is followed by a 3rd sentence beginning with an adverbial subordinate clause (pa> <) and ending with another subordinate clause as an object (o> <ß). In the 4th sentence the subject is furnished with an attributive subordinate clause (>cs <), and immediately after the verb another subordinate clause follows functioning as an object, and this subordinate clause contains in itself a subordinate clause, also as an object (o> o> <<ß). The last sentence of the period displays similar complexities. This is not heavy syntax, but I have in other texts (e.g. "Historien om en Moder") found even more complex syntactic constructions, so HCA does not avoid syntactic complexity if it serves his purposes. Example 019 can be said to be an instance of syntactic (stylistic) contextual regularity; as opposed to what may be called lexical (stylistic) contextual regularity (cf. example 005 above). Of course syntactic contextual regularity is also a matter of frequent occurrence, but it should be noticed that what is being repeated is not identical units, as with words. What is being repeated in a certain context is the phenomenon of syntactic simplicity; not identical syntactic structures.

As an overall impression of the style of HCA the most salient feature seems to be the fact that in longer texts he does not repeat himself very often (cf. appendix 1). It is true - to some extent - that he has certain iterated ways of expressing himself: a few special words, short periods and sentences; but looking through the lists of observed frequency of lexicon words, the most striking feature is his capacity for variation. In my view the special (peculiar) features of his style are not very special. Maybe the results of traditional descriptions of his style have not all been based on empirical excerptions. Instead the phenomenon of what I call stylistic contextual regularity may yield a subjective impression of frequency, and this impression may be psychologically confirmed by the scattered distribution of the features in other contexts without stylistic regularity. Accordingly - in my view - the most important feature of HCA's style is the variation of the linguistic expressions.


This is - as mentioned in the title - a preliminary approach. One purpose of this paper is to contribute to the clarification of the theoretical notion of linguistic style, another purpose is to contribute to the description of HCA's style and language use. I have not made any suggestions about the relation between my description and the well known literary qualities of HCA's work, but I suppose that such linguistic investigations are relevant, both to the literary interpretations of HCA, to the characterisation of 19th century Danish in historical linguistics, and to modernisation and translations of his texts.

If the methods employed and the results achieved are to be improved, it involves some of the measures mentioned above, and furthermore it would be an advance if some kind of automatic interpretation of the syntactic constructions of texts - also parsing of the resulting sentences - could be made possible. Then the whole HCA-corpus containing 174 texts would be more easily handled.


1. Somehow I have a feeling that his arguments are - in part - addressed to some of my previously published reflections on the theoretical notion of style, but they are not being referred to.

2. In the original version of this paper I used the technical term stylistic contextual density for this phenomenon. Later I found out that "stylistic density" had been used by Göran Kjellmer (cf. Kjellmer 1993 p. 29) for exactly the opposite phenomenon, and I replaced the term density with regularity.

3. The letter symbols represent the normal constituents of sentences: s: subject, v: verb, a: adverb, etc. (cf. Götzsche 1994a), and their specific meaning is not essential in understanding the syntactic complexity. As for the more peculiar symbols, the notation means:

sentence limit ß
subordinate clause > ... <
sentence substitution ∑ _


Albeck, U. 1939: Dansk Stilistik. København: Gyldendal.

Cassirer, P. 1970: Deskriptiv stilistik. Acta Universitatis Gothoburgensis: Nordistica Gothoburgensia 4. Göteborg: Almqvist & Wiksell.

Cassirer, P. 1986: Stilistik & stilanalys. Stockholm: Biblioteksförlaget.

Cassirer, P. 1994a: "Några reflexioner kring stilistikens territorium, eller: Hur många ben har elefanten", in Stilsymposiet i Göteborg, 21-23 maj 1992. Institutionen för svenska språket, Göteborgs universitet, pp. 7-30.

Cassirer, P. 1994b: "Stilistikens plats i nordistiken". Working paper. Institutionen för svenska språket, Göteborgs universitet.

Currie, G. 1995: "Work and Text", in Mind, Vol. 100, No. 399, July 1991, pp. 325-40.

Dansk Frekvensordbog. 1992. København: G. E. C. Gads Forlag. (DFO)

Götzsche, H. 1994a: Deviational Syntactic Structures. A Contrastive Linguistic Study in the Syntax of Danish and Swedish. Doctoral Thesis. Göteborgs universitet, Institutionen för svenska språket.

Götzsche, H. 1994b: "Översättning av stil, eller: Varför kan man inte översätta H. C. Andersen till svenska?", in Stilsymposiet i Göteborg, 21-23 maj 1992. Institutionen för svenska språket, Göteborgs universitet, pp. 119-35.

Jensen, A. 1929: Studier over H. C. Andersens Sprog. Haderslev: Carl Nielsens Forlag.

Kjellmer, G. 1993: "Lexical differentiators of style", in M. Gellerstam (ed.): Studies in Lexicology 1. Reports from the Lexicology Research Programme. University of Göteborg, Faculty of Arts, pp. 24-34.

Møller, Sv. J. 1967: Bidrag til H. C. Andersens Bibliografi, I. København: Det kongelige Bibliotek.

Ruus, H. 1995: Danske Kerneord, I & II. København: Museum Tusculanums Forlag.

Appendix 1, Text 010

Haab:1 heldig:1 Himmelen:1
Haand:4 hele:15 hinanden:6
Haanden:2 Hele:4 Hindringer:1
Haar:1 heller:2 hist:1
Hallandsaas:1 hellig:1 Historie:1
Halsen:1 Helligdom:1 Historier:1
halv:2 hellige:1 Hittegodset:1
halvaaben:1 helst:1 hjalp:2
halvaabne:1 hen:11 Hjem:1
Halvdeel:2 hende:2 hjem:2
Halvdelen:1 hendes:2 hjemme:1
Halve:1 henover:2 hjemligt:1
halvnøgne:1 Henseende:1 Hjemmet:1
halvsnees:1 Hensyn:1 Hjerte:6
ham:41 hentet:1 Hjerter:1
Han:11 her:36 Hjerterne:3
han:191 Her:7 Hjertes:1
handlede:1 Herbergeersteder:1 Hjertestød:1
handler:1 Herefter:1 Hjertet:11
Hanekamme:1 herinde:4 Hjertets:1
hang:2 herlige:1 hjælpe:1
Hannibal:1 Herlighed:1 Hjørne:1
Hans's:3 herligt:2 Hm:1
hans:28 Hermed:1 hm:1
Har:2 hernede:1 Ho:1
har:45 herneden:1 ho:2
harsk:1 herover:1 Hoffet:1
Hat:1 herpaa:1 holde:2
Hatte:1 Herr:1 holdt:2
Hatten:1 Herre:12 hollandske:1
Havbugter:1 Herreblade:1 Holmen:2
havde:57 Herren:1 holsteenske:1
Have:2 Herrer:3 Hop:1
have:22 Herskab:1 hoppede:1
Haven:2 hertil:1
Havet:1 Hest:1
havt:1 Heste:1
hede:1 hialp:1
Heden:1 hiem:1
Hedendømmet:1 Hiemmet:1
hedt:1 Hierte:1
heed:1 hiin:1
heel:2 Hilsen:1
Heiberg:2 himmelblaa:1

Appendix 2, Text 001

001 TXTWRDs:2531 T Hunden: 19 K store: 14
K af: 20 T Hundene: 3 K stort: 6
K alle: 12 T Hundens: 1 K tag: 4
K at: 28 K hvor: 16 K tage: 2
K da: 19 K Hvor: 2 K taget: 1
K de: 28 K i: 49 K tog: 16
K deilig: 4 K ikke: 28 K var: 39
K deilige: 2 K ind: 10 TF Øine: 11
K deiligste: 1 K jeg: 16
K deiligt: 1 K Jeg: 3
K den: 25 K kan: 12
K der: 32 K Kan: 2
K Det: 11 K kom: 13
K det: 54 K komme: 1
K Dig: 4 K kommer: 3
K dig: 7 K kunde: 18
K din: 2 K mange: 12
K dit: 1 K med: 30
K Du: 45 K men: 21
K du: 5 K Men: 5
K en: 43 K mig: 11
K er: 19 K min: 4
K et: 27 K mit: 7
K fik: 5 K nu: 17
K faae: 14 K Nu: 9
K faaet: 1 K nydelig: 2
K for: 16 K og: 124
K ham: 20 K om: 13
K Han: 4 K Om: 4
K han: 61 K op: 17
K Har: 1 K paa: 47
K har: 6 T Penge: 11
K Havde: 1 T Prindsesse: 5
K havde: 26 T Prindsessen: 11
K have: 11 T Prindsessens: 1
K hende: 11 K saae: 9
K hendes: 2 K see: 15
T Hex: 4 K seer: 3
T Hexen: 10 K seet: 2
T Hexens: 2 T Soldat: 7
K hun: 15 T Soldaten: 35
K Hun: 4 T Soldatens: 1
T Hund: 6 T Soldaterne: 4
T Hunde: 1 K stor: 4

Appendix 3

001 [.23] Saa huggede Soldaten Hovedet af hende.
002 [.23] "Ak ja!" sukkede lille Claus oppe paa Skuret, da han saae al Maden blive borte.
003 [.05] Det var en Prindsesse, som stod udenfor.
008 [.23] Nu var da den ældste Prindsesse 15 Aar og turde stige op over Havfladen.
009 [.23] Alle Mennesker i Byen talte om det prægtige Tøi.
010 [.23] "Hele Fortouget er væk og alle Lygterne slukkede!"
012 [.23] "Sei Du!" sagde den ene, "der ligger en Tinsoldat! han skal ud at seile!"
016 [.47] Nu kunde de da forstaae, at det var Tyrkeguden selv, som skulde have Prindsessen.
017 [.23] Nu kom Drengene nede paa Gaden og sang deres Vise: "Storke, Storke Steie!"
020 [.23] "Gud bevar' os!" sagde Hofdamen.
024 [.05] "Gid jeg aldrig faae Pidsk om jeg lyver!" svarede Toppen.
026 [.23] "Iaften," sagde de Allesammen, "iaften skal det straale!"
028 [.23] Gud, hvor det dog var et Veir, og hvor Gaden saae ud!
029 [.23] Han tog sin Kone paa Kridt, som man siger!

Bibliographic information about the text:

Götzsche, Hans: "H. C. Andersen's Style as Features of Variation, Complexity and Contextual Stylistic Regularity. A Preliminary Approach", in: Johan de Mylius, Aage Jørgensen and Viggo Hjørnager Pedersen (eds.): Hans Christian Andersen. A Poet in Time. Papers from the Second International Hans Christian Andersen Conference 29 July to 2 August 1996. The Hans Christian Andersen Center, Odense University, Odense University Press. 576 pages, Odense, Denmark 1999.