1. Annotation standards: Existing and needed
2. Standards for Storage, Retrieval, and Search of Data
3. Tools: Existing and future “killer apps”
4. Protecting data reliability and provenance
5. Following in their footsteps: Models from other fields
6. Funding models
7. Collaboration structure

The first three working groups (annotation standards, other standards, and tools) will be charged with identifying and documenting on the wiki what already exists in their target area, and then projecting which further standards and tools need to be developed. These, too, should be documented on the wiki so that interested parties can begin work on them. The tools group in particular will be charged with considering which types of tools would lead to widespread adoption of cyberinfrastructure and data sharing among linguists.
The fourth working group will focus on protecting both the reliability and the provenance of data as it moves through the cyberinfrastructure. This is critical for data providers (who need credit for the work they have done and for the academic contribution of collecting, curating, and annotating data) and for data users (who need to know where the data came from in order to judge how much credence to give it). Furthermore, once one person’s analysis is encoded as annotation, it becomes the next person’s data, so the provenance and reliability mechanisms need to scale to multiple layers of annotation over one original data set.
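To make the layering requirement concrete, here is a minimal sketch in Python of how provenance might be carried through successive annotation layers. It is purely illustrative and not a proposed standard: the `Layer` class, its fields, and the contributor names are all hypothetical, and a real mechanism would also need versioning, licensing, and reliability information.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Layer:
    """One layer of a data set: the original data or an annotation over it.

    All names and fields here are hypothetical, chosen for illustration only.
    """
    name: str                          # e.g. "recording", "transcription"
    contributor: str                   # who should receive credit for this layer
    source: Optional["Layer"] = None   # the layer this one annotates; None for original data


def provenance_chain(layer: Layer) -> List[Layer]:
    """Walk from a layer back to the original data, newest layer first."""
    chain: List[Layer] = []
    current: Optional[Layer] = layer
    while current is not None:
        chain.append(current)
        current = current.source
    return chain


def contributors(layer: Layer) -> List[str]:
    """List everyone whose work underlies this layer, original data first."""
    return [item.contributor for item in reversed(provenance_chain(layer))]


# Three layers over one original data set: each analysis becomes the next person's data.
recording = Layer("recording", "Fieldworker A")
transcription = Layer("transcription", "Annotator B", source=recording)
glosses = Layer("morpheme glosses", "Analyst C", source=transcription)

print(contributors(glosses))  # ['Fieldworker A', 'Annotator B', 'Analyst C']
```

Because each layer records only the layer it was built on, the chain grows by one link per new annotation, so a user of the glosses can still trace back to the original recording and credit every contributor along the way.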
The fifth and sixth working groups will consider how other fields have organized themselves to create and fund cyberinfrastructure projects. Exploration of funding models is important, as usable software will be key to widespread adoption, and usable software requires money for user interface design, portability, and other software engineering issues that are not well addressed as part of linguistics or computational linguistics research projects.
Finally, the collaboration structure group will be charged with designing a means for ensuring ongoing collaboration and coordination that is efficient, relying on on-line communication and emphasizing a light-weight committee structure, if one is needed at all.