Information Structure and Syntax Series: Prologue

There’s much about theoretical syntax that is either misunderstood or flat-out misrepresented. Often, this is because syntactic theories are complex and hard to present in a concise, easy-to-understand manner. There’s simply too much material floating around for anyone to effectively read every line of every work (a general problem in academia). Theoretical syntax in its modern form has only been around for about 70 years, but the explosion of theories and models in that time can make approaching the subject daunting. Personally, despite having completed a Master’s degree in linguistics, I still feel out of touch with a lot of the syntax going on. However, from a certain point of view (Kenobi, 4 ABY), this complexity and multiplicity of theories makes it an exciting and vibrant field.¹

One of these misunderstandings centers on the idea that syntacticians aren’t interested in how language is actually used. This is far from the truth: those who are interested in syntax have to be interested in how language is used, given that usage is our primary source of evidence for understanding human syntax. Sure, not everyone may agree with this, but it seemed to be the case for me and for a lot of the people I went to school and worked with. One supposedly irreconcilable “interface” is that between syntax (as understood in the generative tradition) and pragmatics. Syntax is thought of, or at least was thought of, as being “autonomous” from other linguistic systems, such as semantics and pragmatics. As Lakoff (2008) states in his review of Jackendoff’s 2007 book:

Because the rules could not look outside the system, language had to be ‘autonomous’—independent of the rest of the mind. Meaning and communication could play no role in the structure of language. The brain was irrelevant. This approach was called generative linguistics, and it continues to have adherents in many linguistics departments in the United States. (para. 4)

However, it need not be this way. In fact, it is not quite the case that the syntactic system is wholly divorced from everything else to do with the mind/brain. Chomsky (as cited in Newmeyer, 1998) alludes to this in Reflections on Language, where he agrees with Searle’s assertion that “it is reasonable to suppose that the means of communication influenced [language] structure” (p. 155). Further, Chomsky (1995) notes that a derivation must satisfy “external conditions” imposed on the syntax by the sensorimotor (SM) and conceptual-intentional (CI) systems.² Newmeyer (1998) tries to argue, though some may not agree, that formalists and functionalists are often looking at the same problem from two different angles.

One such area of interest is the dislocation of constituents due to information structure, which is the packaging of discourse information in a given sentence. In particular, this area of linguistic research looks at the topic, focus, and contrast status of particular constituents in a phrase (see LingSpace for a brief overview). Some languages show this only to a limited degree (such as English), whereas others, such as Japanese in the following example, display a rich surface structure relating to information structure.

Kinoo Taroo-ga Ginza-de susi-o tabe:ta

yesterday Taroo-NOM Ginza-in sushi-ACC eat:PAST

‘Taro ate sushi in Ginza yesterday.’

Taroo-ga Ginza-de kinoo susi-o tabeta

Kinoo susi-o Taroo-ga Ginza-de tabeta

Susi-o kinoo Taroo-ga Ginza-de tabeta

Ginza-de Taroo-ga kinoo susi-o tabeta

Kinoo Ginza-de susi-o Taroo-ga tabeta

            (example from Tsujimura, 2007)

The prominence of information, and how it relates to other pieces of information in a particular discourse, is explicitly marked in Japanese. This is done primarily through the dislocation of constituents, but it also manifests in the use of the topic marker -wa. Languages like Dagaare (see Sakurai, 2014) or Gungbe (Aboh, 2007) extend this to a set of information-structural particles, as seen in the following data from Gungbe, which include a topic marker and a focus marker:

1. Ún  sè   ᶑɔ̀,   dán   lɔ́ yà, Kòfi hù       ì

   1sg hear that, snake D  Top Kofi kill-2sg it

   “I heard that, as for the snake, Kofi killed it.”

2. Ún  sè   ᶑɔ̀,   xwé   lɔ́ yà  Kòfi wɛ̀, Àsibá gbá-ɛ̀     ná

   1sg hear that, house D  Top Kofi Foc Asiba build-3sg for

   “I heard that, as for the house, Asiba built it for KOFI.”

There are also particularly interesting stress patterns associated with information structure, insofar as prominent and contrastive stress are concerned (as in English, with capitals marking the stress: JOHN went to the store, not Sally).

Why is this of interest? First, a quick primer in Minimalism, particularly derivational minimalism.³ Following the current Minimalist understanding of the structure of the faculty of language, a set of lexical items (called the “numeration”) is processed by the operation Merge (with additional computational constraints, which I won’t get into here), and the result is eventually handed over as a hierarchically structured set to the interfaces of SM and CI: sound and meaning. From the standpoint of computational efficiency, this means that the derivation must be complete before it reaches SM and CI in order to be properly interpretable. The derivation will contain the information necessary for language-external operations to be performed (such as physical externalization as sign or speech, or subsets of thought). Thus, we have the following process:

{α, β, γ, δ} → Computational processes [Merge, labelling, etc.] → {α, {β, {γ, δ}}} → given to CI and SM
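As a rough illustration of the schema above, here is a minimal Python sketch that treats Merge as binary set formation. Everything in it is an assumption made for illustration: the function names `merge` and `derive` are hypothetical, the numeration is passed as an ordered list so that the order of Merge operations is fixed by hand, and labelling and the other computational constraints are left out entirely.

```python
def merge(a, b):
    """Binary Merge, modelled as forming the unordered set {a, b}."""
    return frozenset({a, b})


def derive(numeration):
    """Build a right-branching structure from an ordered list of items.

    Assumption: the last two items are merged first, and each earlier
    item is then merged with the structure built so far.
    """
    if len(numeration) < 2:
        raise ValueError("Merge needs at least two items")
    *earlier, second_last, last = numeration
    structure = merge(second_last, last)
    for item in reversed(earlier):
        structure = merge(item, structure)
    return structure


# The object that would be handed to CI and SM in the schema above.
result = derive(["α", "β", "γ", "δ"])
assert result == frozenset({"α", frozenset({"β", frozenset({"γ", "δ"})})})
```

Using frozensets keeps the output unordered, so the sketch encodes hierarchy but not linear order, which is what the nested-set notation is meant to convey.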

A caveat: since neither the semantics (scope, etc.) nor the final structure of any given derivation is “known” to the syntax before it is completed (what would be the point?), there is a need to account for how this set of lexical items reaches the endpoint in a way that satisfies both SM and CI conditions. If the derivation is improperly formed or contains “illegible” objects when it reaches either of the interfaces, we get a failure of interpretability, or “crash”. So, syntax can’t know where it is going, but at the same time it needs to get there correctly.⁴ There’s a lot more to this, which I may or may not talk about in the next few posts, depending on the topics that come up. In any case, it is an intriguing line of thought. Of course, this is still an open area of contention even within generative linguistics, let alone linguistics as a field.

But running with this idea, we can see that there is a bit of a paradox: for a crash-proof derivation to contain information structure, the syntax must be privy to that information ahead of time (given that it is known from the discourse). Further, given that information structure has both externalization (phonological) and semantic consequences, all of the relevant information must be contained within the completed derivation at the point where it is handed over to SM and CI. Thus, there has to be some sort of syntactic mechanism for this structure, with no extra-syntactic, pre-interface modules. At least, this is the line of thought I will be pursuing.
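The notion of a “crash” described above can be made concrete with a toy model. The following Python sketch is only an illustration under stated assumptions: `LexicalItem`, `Crash`, and `transfer` are hypothetical names, and “illegibility” is reduced to an item still carrying unchecked features (here a made-up feature `uFoc`) when the structure is handed over. The check happens only at the handover; nothing in `merge` looks ahead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalItem:
    form: str
    # Hypothetical: features that would be illegible to SM/CI if left unchecked.
    unchecked: frozenset = frozenset()


class Crash(Exception):
    """Raised when an illegible object reaches the interfaces."""


def merge(a, b):
    """Binary Merge as unordered set formation; it never inspects features."""
    return frozenset({a, b})


def leaves(obj):
    """Yield every lexical item inside a nested merged structure."""
    if isinstance(obj, LexicalItem):
        yield obj
    else:
        for part in obj:
            yield from leaves(part)


def transfer(structure):
    """Hand the finished structure to the interfaces, or crash."""
    illegible = sorted(item.form for item in leaves(structure) if item.unchecked)
    if illegible:
        raise Crash(f"illegible at the interfaces: {illegible}")
    return structure


converging = merge(LexicalItem("α"), merge(LexicalItem("β"), LexicalItem("γ")))
crashing = merge(LexicalItem("α", frozenset({"uFoc"})), LexicalItem("β"))

assert transfer(converging) is converging
try:
    transfer(crashing)
except Crash as error:
    print(error)
```

The design choice worth noticing is that `merge` has no access to the outcome of `transfer`, which mirrors the idea that the syntax cannot know where it is going but still has to get there correctly.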

For information structure, we have a very interesting constellation of theories centered on two approaches to this problem: (a) there are specific lexical items that function only to “mark” constituents as topic, focus, etc., or (b) there are few or no such “functional” items, and information structure is instead handled via extra-syntactic processes. Either way, there is an increase in computational workload, but where it is placed, inside or outside the syntax, has potentially important theoretical consequences for our understanding of how language and the mind work. How this situation is resolved is, in my opinion, an interesting test case for theoretical models of syntax.

Now, this may seem very high-level and full of technical jargon, and in some instances that is very much true. However, I plan to work through this apparent paradox incrementally over the next few posts in this series, and I will update the list of posts below so that it stays comprehensive. The next post will give some test cases from particularly interesting languages, as well as a brief overview of some of the main proposals concerning syntax and information structure.


Aboh, E. (2007). Information structuring begins in the numeration (Pre-print manuscript). Retrieved from

Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.

Chomsky, N. (2000). Minimalist inquiries: The framework. In R. Martin, D. Michaels, & J. Uriagereka (Eds.), Step by step: Essays on minimalist syntax in honor of Howard Lasnik (pp. 89-156). Cambridge, MA: MIT Press.

Lakoff, G. (2008). The functionalist’s dilemma. American Scientist, 96(1). Retrieved from

Newmeyer, F. (1998). Language form and language function. Cambridge, MA: MIT Press.

Sakurai, K. (2014). The syntax and semantics of focus: Evidence from Dagaare (Doctoral dissertation).

Tsujimura, N. (2007). An introduction to Japanese linguistics. Malden, MA: Blackwell Pub.


1. There are a number of great books and papers out there on the differing theories in linguistics. Here are a few free ones: Andrew Carnie’s great Syntax book, Grammatical Theory by Stefan Müller, and (for busier folk) Hagstrom’s wonderful class outline. For minimalist ideas, I also recommend Cedric Boeckx’s book Understanding Minimalist Syntax and Lasnik’s A Course in Minimalist Syntax. These will help in approaching more foundational books like Chomsky’s Minimalist Program.

2. SM and CI are roughly thought of as the sound and meaning, though Chomsky points out:

To say that phonetic features are ‘instructions’ to sensorimotor systems at the interface is not to say that they have the form ‘Move the tongue in such-and-such a way’ or ‘Perform such-and-such analysis of signals.’ Rather, it expresses the hypothesis that the features provide information in the form required for the sensorimotor systems to function in language-independent ways. Similar observations hold on the (far more obscure) meaning side. The framework imposes a distinction between (a) linguistic expressions Exp = <PF, LF> that are internal to the mind/brain, and (b) observable events, utterances, and actions – externalization of (mentally constructed) speech acts. No questions arise about the ontological status of the set of expressions {Exp} generated by L; its status is somewhat like that of potential visual images or plans for limb motions. (Chomsky, 2000, p. 91)

With this clarification comes the point that,

However these matters are resolved, we have two “imperfections” to consider: uninterpretable features and the dislocation property. These properties (in fact, morphology altogether) are never built into special-purpose symbolic systems. We might suspect, then, that they have to do with externally imposed legibility conditions. With regard to dislocation, that has been suggested from the earliest days of modern generative grammar, with speculations about facilitation of processing (on the sound side) and the dissociation of “deep” and “surface” interpretive principles (on the meaning side). The boundaries are not clear, nor are the mechanisms to express them. One approach to the array of problems was to distinguish the role of deep and surface structure (D- and S-Structure) in semantic interpretation: the former enters into determining quasi-logical properties such as entailment and θ-structure; the latter such properties as topic-comment, presupposition, focus, specificity, new/old information, agentive force, and others that are often considered more discourse-oriented and appear to involve the “edge” of constructions. Theories of LF and other approaches sought to capture the distinctions in other ways. The “deep” (LF) properties are of the general kind found in language-like systems; the “surface” properties appear to be specific to human language. If the distinction is real, we would expect to find that language design marks it in some systematic way, perhaps by the dislocation property, at least in part. To the extent that such ideas can be given substance, it would follow that the dislocation property is required; it falls within the design specifications given to the super-engineer seeking an optimal solution to conditions imposed by the external systems. (Chomsky, 2000, pp. 120-121)

3. See Epstein, Groat, Kawashima, and Kitahara (1998) as well as Epstein and Seely (2002; 2006).

4. This gets more complicated with the concept of phases in minimalist syntax. More importantly, the syntax should hypothetically be “blind” to the intended outcome. Otherwise, the final structure would be “known” before the derivation ever reached that state, which would be self-defeating (why have a structure-building device if you already know the structure?) and would add unnecessary computational load (from having to “look ahead” at every possible outcome).
