This wiki page records a discussion about terminology. The Annotation Standards working group considered the various terms in use for the different paths ("path" being the logical term if we use the annotation graph representation) and for describing the various types of relationship among those paths. For now this page simply reproduces the bodies of the e-mail messages in the thread. It could be edited later into a coherent description of the terminological ambiguities, related to the discussion of the ontology in the paragraph where the "white paper" cites the Annotation Graph framework of Bird and Liberman (2001), and linked into the main page there if/when these pages migrate to their eventual home.
n.b. Calling the paths "streams" invokes Hertz's Delta system, as described in, e.g.:
Hertz, Susan R. (1990). The Delta programming language: an integrated approach to non-linear phonology, phonetics, and speech synthesis. In John Kingston & Mary E. Beckman (eds.), Papers in Laboratory Phonology I: Between the Grammar and Physics of Speech, pp. 215-25. Cambridge University Press.
_____________________________________________________________________
From: Charles Fillmore
Subject: "layered annotation"
Date: Sun, 6 Sep 2009 16:58:45 -0700
I think I might have introduced a confusion of terminology in connection with "layered annotation."
Since we're in general talking about stand-off annotations, we can have one layer that shows part of speech, one that shows lexical meanings, one that shows speaker switch, etc., etc., and that's all fine. But I find that I've also used "layering" to refer to the problem of using one annotator's product to form the basis of another level of analysis (and have claimed that this is both necessary and problematic). Can anybody think of a way of phrasing the latter that would avoid confusion?
From: Sarah Churng
Date: Sun, 6 Sep 2009 17:20:23 -0700
For 'layers' aimed at the simultaneous display of different time-aligned annotations such as POS tags, lexical transcriptions, intonational phrases, etc. (in contrast to the levels-of-analysis 'layers'), I think we referred to these in our meetings as tiers for linguistic types?
In ELAN, stackable tiers are available both as independent tiers which are time-alignable (such as direct transcriptions) and as referring tiers which are not time-alignable and must inherit from parent transcription tiers (such as with translations).
Is this getting close to what you mean?
From: Stuart Robinson
Date: Mon, 7 Sep 2009 00:36:54 +0000 (UTC)
At the risk of muddying the waters, a tier in Sarah's sense could have multiple layers then, no? So Chuck's example would be one where a tier for morphosyntax or whatever has multiple layers (i.e., annotation on annotation). On top of that you might also have versioning.
From: Sarah Churng
Date: Sun, 6 Sep 2009 18:41:04 -0700
Right, and, to a large extent, I think the documentation for ELAN actually faces the same conundrum Chuck brings up---that is, they seem to use "tier" in both senses Chuck is trying to distinguish. First, there are the tiers I mentioned in my earlier message for aligning "layers" of POS tagging, lexical meaning, etc., and these can straightforwardly be handled across a time axis.
Second, there are tier-to-tier relationships, so to speak, between different tier types. These are the kind for which Chuck was originally asking, in his e-mail, for a label other than "layer". ELAN gets around this by distinguishing "parent" vs. "child" tiers. So, the product of a parent tier is available for the annotation of a child tier, but not vice versa.
Is this something we can adopt? It nicely admits that there is an overlap of "layers" for both senses of the word, but makes it clear that not all layers are created equal in the sense of what each annotation can inherit. And the issue of multiple embedded layers that Stuart brings up is independent and can coexist with the different 'parent' vs. 'child' layers.
The documentation on tiers in ELAN: http://www.lat-mpi.eu/tools/elan/manual/ch05s01.html/view
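n.b. (editorial sketch, not part of the thread) The parent/child relation described in the two messages above can be pictured roughly as follows. This is only an illustration of the simplified description given in the messages: independent tiers carry their own time spans, while referring tiers carry none and inherit from an annotation on the parent tier. The names `Tier`, `Annotation`, `add`, and `span` are hypothetical and are not ELAN's file format or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Annotation:
    value: str
    start_ms: Optional[int] = None          # set only on independent tiers
    end_ms: Optional[int] = None
    parent: Optional["Annotation"] = None   # set only on referring tiers

    def span(self) -> Tuple[Optional[int], Optional[int]]:
        """Return (start, end), inherited from the parent annotation if any."""
        if self.parent is not None:
            return self.parent.span()
        return (self.start_ms, self.end_ms)


@dataclass
class Tier:
    name: str
    parent: Optional["Tier"] = None          # None means an independent tier
    annotations: List[Annotation] = field(default_factory=list)

    def add(self, value, start_ms=None, end_ms=None, parent_annotation=None):
        if self.parent is None:
            # Independent tier: must be aligned to the time axis directly.
            if start_ms is None or end_ms is None:
                raise ValueError("independent tier needs its own time span")
            ann = Annotation(value, start_ms, end_ms)
        else:
            # Referring tier: must point at an annotation on the parent tier.
            on_parent = any(a is parent_annotation for a in self.parent.annotations)
            if not on_parent:
                raise ValueError("referring tier needs an annotation from its parent tier")
            ann = Annotation(value, parent=parent_annotation)
        self.annotations.append(ann)
        return ann


# Hypothetical example data: a transcription tier and a dependent translation tier.
transcription = Tier("transcription")
translation = Tier("translation", parent=transcription)

utt = transcription.add("bonjour", start_ms=0, end_ms=600)
gloss = translation.add("hello", parent_annotation=utt)
print(gloss.span())  # (0, 600), inherited from the parent annotation
```

The asymmetry is the point of the sketch: the child tier can use what the parent tier provides, but nothing in the parent tier refers to the child.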
From: Mary Beckman <mbeckman@ling.osu.edu>
Date: Mon, 7 Sep 2009 09:41:19 -0400
Muddying the waters further (or perhaps clarifying?), I think the distinction that we're making maybe is not between types of annotation streams and their relationships, but between a static and a dynamic view, no?
That is, when I hear/read "tier", I tend to think of the simultaneous display and/or the associated simultaneous development of different streams of parallel annotations that are anchored to the same primary data. This anchoring can be either via a time stamp (as in Figure 1) or just via reference to shared nodes in the annotation graph (as in the phrase-internal sharing of nodes for the word-by-word transcription and gloss in Figure 2). But this is a static "result-oriented" view of the parallel analysis streams and their relationships.
By contrast, when I hear or read "layer", I tend to think of the dynamics of how the different annotation streams were originally developed. Although there are cases where the analyses have a necessary order -- e.g., a word-by-word gloss probably has to come after a transcription and tokenization of the primary data -- there are also many cases where the "layering" is arbitrary or idiosyncratic. For example, in (ame_)ToBI labelling, some people, such as Stef Shattuck-Hufnagel, find that they have to mark the Break Indices first and then go back and mark the Tones. Others, such as Nanette Veilleux (and me), can't do Break Indices before we do Tones. If the annotation isn't left in a partial state, though, there is no way to recover the difference between Stef and Nanette. So this is a dynamic "process-oriented" view of the relationship.
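n.b. (editorial sketch, not part of the thread) The static vs. dynamic contrast in the message above can be made concrete with a toy annotation graph. The sketch assumes a graph is just an unordered set of labelled arcs between nodes, with nodes optionally carrying a time stamp. The names `Arc`, `build`, and `tiers_between`, and all node names, times, and labels, are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Iterable, NamedTuple, Set


class Arc(NamedTuple):
    src: str    # start node
    dst: str    # end node
    tier: str   # which annotation stream the arc belongs to
    label: str


def build(arcs: Iterable[Arc]) -> FrozenSet[Arc]:
    """The finished annotation: an unordered set of arcs (the static view)."""
    return frozenset(arcs)


def tiers_between(graph: FrozenSet[Arc], src: str, dst: str) -> Set[str]:
    """Tiers anchored to the same pair of nodes, i.e. sharing those nodes."""
    return {a.tier for a in graph if a.src == src and a.dst == dst}


# Nodes may or may not carry a time stamp; n1 is anchored only via shared arcs.
node_times = {"n0": 0.0, "n1": None, "n2": 1.2}

# Made-up labels standing in for a Tones stream and a Break Indices stream.
tones = [Arc("n0", "n1", "tones", "H*"), Arc("n1", "n2", "tones", "L-L%")]
breaks = [Arc("n0", "n1", "breaks", "1"), Arc("n1", "n2", "breaks", "4")]

# Two labellers working in opposite orders (the dynamic view) ...
breaks_first = build(breaks + tones)
tones_first = build(tones + breaks)

# ... end up with indistinguishable results once the annotation is complete.
print(breaks_first == tones_first)                          # True
print(sorted(tiers_between(breaks_first, "n0", "n1")))      # ['breaks', 'tones']
```

Under these assumptions, the order of work could only be recovered if it were recorded separately, for instance through versioning or a saved partial state.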