Over the last few years, a new approach to linguistic analysis has begun to emerge. This approach, which has come to be known under various labels such as 'data-oriented parsing', 'corpus-based interpretation' and 'treebank grammar', assumes that human language comprehension and production work with representations of concrete past language experiences rather than with abstract grammatical rules. It operates by decomposing the given representations into fragments and recomposing those pieces to analyze (infinitely many) new utterances. This book shows how this general approach can be applied to various kinds of linguistic representations. Experiments with the approach suggest that the productive units of natural language cannot be defined by a minimal set of rules or principles, but need to be defined by a large, redundant set of previously experienced structures. Bod argues that this outcome has important consequences for linguistic theory, leading to an entirely new view of the nature of linguistic competence.

Notice that some subtrees occur twice (a subtree may be extracted from different trees, and even several times from a single tree if the same node configuration appears at different positions). By means of the composition operation, new sentence-analyses can be constructed out of this subtree collection.

[Figure: Analyzing Mary likes Susan by combining subtrees]

The probability of this particular derivation is the joint probability of three stochastic events: (1) selecting the subtree S[NP VP[V[likes] NP]] among the subtrees with root label S, (2) selecting the subtree NP[Mary] among the subtrees with root label NP, and (3) selecting the subtree NP[Susan] among the subtrees with root label NP.

In the context of stochastic language theory, however, we are not so much interested in tree languages as in stochastic tree languages. It is therefore more interesting to compare the stochastic tree languages of strongly equivalent grammars.

Proposition 6. There exists an STSG for which there is a strongly equivalent SCFG but no strongly stochastically equivalent SCFG.

The string language generated by G is {ab*}. Thus the only (proper) SCFG G' that is strongly equivalent with G consists of the following productions:

S -> Sb (1)
S -> a (2)

[3] This STSG is also interesting because it can be projected from a DOP1 model whose corpus of sentence-analyses consists only of tree t1.
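To see what an SCFG of this shape can and cannot express, the following Python sketch computes the probability it assigns to each tree. It assumes that the two productions of G' are S -> Sb and S -> a (the second is inferred from the string language {ab*}), and the rule probability `p` is a hypothetical value chosen only for illustration. It is not a reproduction of the proof of Proposition 6.

```python
from fractions import Fraction

def scfg_tree_probability(n, p):
    """Probability of the single tree for the string a b^n under
    S -> S b (probability p) and S -> a (probability 1 - p).
    The tree uses the first rule n times and the second rule once."""
    return p ** n * (1 - p)

p = Fraction(1, 3)  # hypothetical rule probability, for illustration only

# Probabilities of the trees for a, ab, abb, abbb, abbbb.
probs = [scfg_tree_probability(n, p) for n in range(5)]

# Ratio between the probabilities of successive trees.
ratios = [probs[n + 1] / probs[n] for n in range(4)]

print(probs)
print(ratios)
```

Whatever value is chosen for `p`, the ratio between the probabilities of successive trees is always `p` itself, so the SCFG can only produce a geometric distribution over its trees. An STSG whose elementary trees span more than one production is not bound by this constraint, which is one way to read why a strongly equivalent SCFG need not be strongly stochastically equivalent.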

[Figure: Elementary tree corresponding to the construction flights from ... to ...]

This is a serious shortcoming, since for the correct disambiguation of a new sentence that contains an NP-construction like flights from ... to ..., it may be important to describe this general construction as one statistical unit. STSG, on the other hand, can easily describe this NP as a statistical unit by simply attaching a probability to the construction. The same limitation of SHBG arises with dependencies between words such as nearest and to in the ATIS NP-construction the nearest airport to Dallas.

