
Cognitive Models of Language and Beyond

Assignments Week 2: DOP Trees


Hielke Prins, 6359973

Answers

1. With no constraints on tree depth, the treebank will include a subtree of the first sentence
in the corpus that leaves out just the word “relatives”. We can then construct both sentences by
taking this (rather large) subtree as a starting point and adding parts of the tree from the
second sentence:

(a) “Mary hates visiting bees”; (b) “Mary hates visiting buzzing bees”
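
This construction can be sketched concretely. The nested-tuple trees below are only a hypothetical
encoding of the corpus structure (the actual corpus trees may be bracketed differently), and
substitute_leftmost is an assumed helper that fills the leftmost open substitution site whose
category matches the root of the inserted fragment:

    # A minimal sketch, assuming a nested-tuple encoding of trees in which
    # bare uppercase strings denote open substitution sites. The exact
    # shapes of the corpus trees below are hypothetical.

    def is_open(node):
        """A bare category label (e.g. "NP") is an open substitution site."""
        return isinstance(node, str) and node.isupper()

    def substitute_leftmost(tree, fragment):
        """Fill the leftmost open node that matches the fragment's root
        category. Returns (new_tree, whether a substitution was made)."""
        if is_open(tree):
            root = fragment[0] if isinstance(fragment, tuple) else fragment
            return (fragment, True) if tree == root else (tree, False)
        if isinstance(tree, tuple):
            label, *children = tree
            new_children, done = [], False
            for child in children:
                if not done:
                    child, done = substitute_leftmost(child, fragment)
                new_children.append(child)
            return (label, *new_children), done
        return tree, False

    # The large fragment of the first corpus tree with only "relatives"
    # left out (an open NP slot), plus fragments taken from the second tree:
    large_fragment = ("S", ("NP", "mary"),
                           ("VP", ("V", "hates"),
                                  ("NP", ("VG", "visiting"), "NP")))
    bees = ("NP", "bees")
    buzzing_bees = ("NP", ("AP", "buzzing"), ("NP", "bees"))

    tree_a, _ = substitute_leftmost(large_fragment, bees)          # (a)
    tree_b, _ = substitute_leftmost(large_fragment, buzzing_bees)  # (b)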

2. When we restrict ourselves to using only the smallest possible subtrees (Illustration 1), we
first have to construct the subtree used in the previous question out of these atomic parts. This
involves the steps labeled 1-7 in the derivation for Question 1a.

These steps can be taken in any order compatible with leftmost node substitution and are thus
not unique. For example, step 2 (substituting the leftmost NP with a subtree corresponding to
NP → Mary) can be taken at any point after step 1, since the root node of this subtree only
requires an open NP. Step 7 has to be taken after step 2, however, since its subtree has the same
root node category.
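
One such derivation can be sketched by reusing substitute_leftmost and the hypothetical tree
shapes from the previous sketch; the numbering of the depth-one fragments below is illustrative
and need not match the labels in the original illustration:

    # Depth-one fragments (context-free rewrite rules), applied in one of
    # the orders compatible with leftmost node substitution.
    steps = [
        ("S", "NP", "VP"),    # 1
        ("NP", "mary"),       # 2
        ("VP", "V", "NP"),    # 3
        ("V", "hates"),       # 4
        ("NP", "VG", "NP"),   # 5  the gerund phrase "visiting _"
        ("VG", "visiting"),   # 6
        ("NP", "bees"),       # 7
    ]

    tree = steps[0]
    for fragment in steps[1:]:
        tree, ok = substitute_leftmost(tree, fragment)
        assert ok
    assert tree == tree_a     # same parse as the two-step derivation in Question 1a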

3. The sentences “Mary hates visiting bees” and “Mary hates visiting buzzing bees” are
variations on the original trees in the corpus that recycle larger parts of their structure:

[Tree diagrams: the corpus trees for “Mary hates visiting relatives” and “John likes visiting
buzzing bees”, and the derived trees for “Mary hates visiting bees” and “Mary hates visiting
buzzing bees”.]

Imposing a maximal depth on the subtrees extracted from the corpus increases the number of
possible derivations for a single sentence, as can be seen by comparing the answers to Questions
1 and 2.

A minimal depth larger than one, on the other hand, will prevent us from isolating the NP → bees
subtree (step 7) as a single unit, independent of its original parent and sibling nodes. That in
turn would make the shortest derivation of these sentences longer, or perhaps even impossible, in
contrast to the derivations in Question 1.

In fact, the derivations given for Question 1 are both shortest and unique, but they require
subtrees of various depths: a minimal depth of 1 and a maximal depth of at least 3. To be able to
exchange the subjects John and Mary between the sentences, or to replace them with arbitrary
nouns acquired from other trees added to the corpus, the maximal depth has to increase to at
least 4. Such constraints correspond to the depth of both trees in the corpus, which makes them
effectively pointless.
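
How a depth bound restricts what can be stored can also be sketched. The helper fragments below
is an assumption for illustration (not part of any standard DOP toolkit); it enumerates all
subtree fragments of a corpus tree up to a maximal depth, leaving open category slots at the
cut-off points:

    from itertools import product

    def fragments(tree, max_depth):
        """All fragments rooted at some node of `tree`, with frontier nodes
        left as open categories so that no fragment exceeds max_depth."""
        results = []

        def cuts(node, budget):
            # All ways to cut `node` into a piece of depth <= budget.
            if not isinstance(node, tuple):
                return [node]
            options = [node[0]]            # cut here: leave an open slot
            if budget > 0:
                child_options = [cuts(child, budget - 1) for child in node[1:]]
                for combo in product(*child_options):
                    options.append((node[0], *combo))
            return options

        def walk(node):
            if isinstance(node, tuple):
                results.extend(f for f in cuts(node, max_depth)
                               if isinstance(f, tuple))
                for child in node[1:]:
                    walk(child)

        walk(tree)
        return results

    # With max_depth=1 only context-free rewrite rules are extracted;
    # a larger bound also admits the big fragments used in Question 1.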

4. A sentence for which there is no single (unique) shortest derivation is one with slots of
different categories at the leaf nodes of different branches, so that they can be filled in
arbitrary order.

The shortest derivation in such a case consists of only the two node substitutions necessary to
attach subtrees with the corresponding root categories at the leaves of the largest common tree.
Because their categories differ, this can be done in either order, so that no single shortest
derivation exists.

An example of such a sentence is “Mary likes visiting bees”, where V → likes and NP → bees have
to be attached to the structure corresponding to “Mary _ visiting _ ”, which can be done in
either order.
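
Using the earlier sketch, this non-uniqueness can be checked mechanically; the partial structure
below is again a hypothetical encoding of “Mary _ visiting _ ”:

    partial = ("S", ("NP", "mary"),
                    ("VP", "V", ("NP", ("VG", "visiting"), "NP")))
    likes = ("V", "likes")
    bees  = ("NP", "bees")

    # Order 1: attach "likes" first, then "bees".
    t1, _ = substitute_leftmost(partial, likes)
    t1, _ = substitute_leftmost(t1, bees)

    # Order 2: attach "bees" first, then "likes".
    t2, _ = substitute_leftmost(partial, bees)
    t2, _ = substitute_leftmost(t2, likes)

    assert t1 == t2   # both two-step derivations yield the same parse,
                      # so no single shortest derivation exists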

5. In Question 3 we have seen that any constraint on the depth of subtrees will impair the
derivation of new sentences that efficiently vary on the structure of previously acquired ones.

Subtrees with a larger depth impose more constraints, which make them context-dependent over
longer distances, while the minimal subtree depth of one corresponds to a context-free rewrite
rule.

Illustration 1: Minimal subtrees with a depth of one correspond to context-free rewrite rules.

Releasing constraints on subtree depth allows us to store both short-distance and long-distance
structural relationships within sentences efficiently in the treebank. It decreases the length of
the shortest derivation while retaining its flexibility.

Question 4 showed that using the shortest derivation for ranking different tree structures is
itself not without problems. It does not always provide a unique solution, which makes the
ranking arbitrary. The example sentence “Mary likes visiting bees” shows that this need not be
uncommon at all, although the implied meaning makes it look unlikely. It is useful to extract
such patterns from sentences in order to be able to express that “Mary likes visiting me” or even
“the photo museum” once more trees are added to the corpus.

The assumptions behind derivation length as a criterion for ranking, however, make it efficient
to derive such new sentences by recycling large chunks of the highly context-specific trees that
are stored. Some of these context-dependent patterns are only cognitively plausible when Mary is
your girlfriend. Otherwise the pattern is better stored in a subject-independent form, in order
to be able to say the same thing about others.

This shows that the optimization task behind finding the shortest derivation does not necessarily
reflect cognitive processes. The efficiency constraints implied by the shortest derivation seem
to rely on highly context-specific stored patterns, without any penalty for rarely used or
implausible ones. Actual usage might be a better predictor of cognitively plausible patterns than
their utility in an arbitrary optimization process.

6. (This is a bonus extension of Question 5.)


Depending on the source of the corpus, the treebank will of course already reflect some of the
relevant structural usage statistics that might play an underestimated role in generating new
sentences. Cross-modal corpora of material consumed and written on social networking sites might
soon make it possible to re-estimate this role. There nevertheless seems to be room for
improvement in the cognitive plausibility of the shortest derivation model.

An obvious improvement is to address the shortcomings mentioned above by implementing ways to
account for the utility and plausibility of the stored representations. The treebank could be
adapted to reflect the probabilities of certain subtrees and derivations by assigning weights to
the trees (memory) and to the connections between their nodes (associativity). These weights
could then be initialized or trained on usage data, combining the two approaches of Data-Oriented
Parsing (DOP) and tree ranking. Decay functions and learning algorithms can be applied to these
weights, and the network starts to acquire more of the functionality of neural processing and
Bayesian reasoning models.
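
A speculative sketch of such a weighted treebank is given below. The class and its methods
(reinforce, decay, rank) and the decay rate are assumptions for illustration only, not part of
any existing DOP implementation: fragment weights stand for memory strength, they are reinforced
whenever a fragment is reused, they decay over time, and ranking prefers the derivation whose
fragments carry the most weight rather than simply the shortest one.

    import math

    class WeightedTreebank:
        """Hypothetical usage-weighted fragment store (a sketch only)."""

        def __init__(self, decay_rate=0.01):
            self.weights = {}              # fragment -> weight ("memory")
            self.decay_rate = decay_rate

        def add(self, fragment, weight=1.0):
            self.weights[fragment] = self.weights.get(fragment, 0.0) + weight

        def reinforce(self, fragments_used, reward=1.0):
            # Strengthen every fragment used in a successful derivation.
            for f in fragments_used:
                self.weights[f] = self.weights.get(f, 0.0) + reward

        def decay(self, elapsed=1.0):
            # Exponential forgetting applied to all stored fragments.
            factor = math.exp(-self.decay_rate * elapsed)
            for f in self.weights:
                self.weights[f] *= factor

        def rank(self, derivations):
            # Prefer the derivation whose fragments carry the most
            # accumulated weight, not necessarily the shortest one.
            return max(derivations,
                       key=lambda d: sum(self.weights.get(f, 0.0) for f in d))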

Development of such models could profit from research on Bayesian cortical models, which
illustrates the potential of progress in that direction when it comes to increasing cognitive
plausibility. Ultimately, any real cognitive process is implemented in the brain. A plausible
model has to account for that and should take into account the constraints and the existing
machinery revealed by these models. This is a claim constructionists would approve of.

The shortest derivation criterion should be replaced by algorithms that model the execution of
specific tasks, which opens the door to a more integrated framework of language processing. Tasks
vary from learning to retrieval, but they all involve employing and maintaining data in the
treebank.

Combining techniques from DOP and tree ranking seems to be a promising first step towards more
cognitive plausibility. An incremental reward-learning algorithm that models cortico-striatal
processes and performs supervised learning as well as working memory retrieval could be the last.
The _ derivation criterion will then almost certainly not be the shortest one anymore.

