cyberling.org

Design Doc for Linguistics-Oriented Data Sharing/Publishing Platform


Use cases

who would be interested in the platform?

  1. the Kwa lexicographer
    a traditional linguist who wants to do the right thing.
  2. tracy: a linguist who has done work that should not be lost.
  3. alicia: (big) data without time to annotate/analyse.
  4. virach: competition on Thai word segmentation.

1. the Kwa lexicographer

a linguist uses Toolbox to write a lexicon of a Kwa language dialect,
using custom markers (custom fields of lexical entries), so Toolbox doesn't work anymore.
what the linguist wants to do is prepare a printed book.
the linguist would also be interested in getting interlinear glossed examples for the lexicon.

concerns:
  • data conversion
  • merging of data
  • cross-platform publishing (including the non-web world)
extension (once on the platform):
  • find new data (on kwa), transcribe it -> online collaboration for annotation

subfields:
  • language documentation
  • field methodologies
  • phonology
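
The data conversion concern in this use case can be illustrated with a minimal sketch. It assumes the lexicon is stored in Toolbox's backslash-marker text format, with `\lx` starting each entry, and that markers beginning with `_` are file-internal and can be skipped. The set of "known" markers, the custom marker `\tone`, and the sample entries are all invented placeholders, not real Kwa data. The idea is to keep custom fields rather than drop them, so a later conversion or merge step can decide what to do with them.

```python
# Sketch only: parse Toolbox-style marker text and separate known from custom fields.
# KNOWN_MARKERS, the "\tone" marker and SAMPLE are illustrative assumptions.

KNOWN_MARKERS = {"lx", "ps", "ge", "xv", "xe"}

SAMPLE = r"""\_sh v3.0 400 MDF

\lx form1
\ps n
\ge house
\tone HL

\lx form2
\ps v
\ge go
  away
"""


def parse_records(text):
    """Return a list of records; each record is a list of (marker, value) pairs."""
    records, current = [], []
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue
        if line.startswith("\\"):
            marker, _, value = line[1:].partition(" ")
            if marker.startswith("_"):
                continue  # assumed to be file-internal header information
            if marker == "lx" and current:
                records.append(current)  # a new lexeme closes the previous record
                current = []
            current.append((marker, value.strip()))
        elif current:
            # A line without a marker continues the previous field's value.
            marker, value = current[-1]
            current[-1] = (marker, (value + " " + line.strip()).strip())
    if current:
        records.append(current)
    return records


def to_entry(record, known=KNOWN_MARKERS):
    """Sort a record's fields into 'fields' (known) and 'custom' (everything else)."""
    entry = {"fields": {}, "custom": {}}
    for marker, value in record:
        bucket = "fields" if marker in known else "custom"
        entry[bucket].setdefault(marker, []).append(value)
    return entry


entries = [to_entry(r) for r in parse_records(SAMPLE)]
for e in entries:
    print(e)
```

Keeping the custom fields in their own bucket means nothing is lost on import, and a publishing step (print or web) can map them explicitly.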

2. traci: a linguist who has done work that no one should have to do again.
(radio shows, dictionary entries)
the squib
-> backup
-> interest in "publication clearance process" (the "lawyer button")


subfields:
  • syntax

concerns:
  • backup
  • documentation
  • reproducibility
extension:



3. alicia: there is (big) data, but no time to annotate/analyse it.
(Pacific NW phonetics) ("light annotation problem")
-> the case for video
-> the researcher that wants to reuse/reannotate data of others

subfields:
  • sociolinguistics
  • phonology
concerns:
  • backup
  • sharing with access control (graded access)
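
One way to think about graded access is as an ordered set of levels, where each resource states the minimum level needed to see it. The sketch below is only an illustration: the level names, the `Resource` type and the example resources are hypothetical and not part of this design.

```python
# Sketch only: graded access as ordered levels. All names here are assumptions.
from dataclasses import dataclass
from enum import IntEnum


class AccessLevel(IntEnum):
    PUBLIC = 0        # anyone
    REGISTERED = 1    # logged-in platform users
    COLLABORATOR = 2  # people invited by the data owner
    OWNER = 3         # the depositor


@dataclass(frozen=True)
class Resource:
    name: str
    required: AccessLevel


def visible_to(user_level, resources):
    """Return the names of resources a user at user_level may see."""
    return [r.name for r in resources if user_level >= r.required]


resources = [
    Resource("word list", AccessLevel.PUBLIC),
    Resource("audio recordings", AccessLevel.REGISTERED),
    Resource("raw video", AccessLevel.COLLABORATOR),
]

print(visible_to(AccessLevel.REGISTERED, resources))
```
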
extension:


4. virach: competition on Thai word segmentation
the tool builder
-> tools in search of data
-> results in search of judges

subfields:
  • computational linguistics
concerns:
  • find users for tools
  • software as a service (potential for mashups)
  • distribution mechanism
  • attribution

extension:

with a platform that solves the concerns mentioned above, what else do we get?

foster educational interaction: connect students with the community and
with data. (attribution will help here)

now it turns out that some concerns match up with extensions.
-> once people are on the platform, collaboration and network
effects set in.
- potential for citation, publication


Requirements

  • data backup
  • data conversion
  • remote access
  • publishing (legal/copyright/etc)
  • graded access
  • metadata: The platform should provide the basic metadata needed for attribution (who created which data, and when). Tagging data with language codes is necessary to enable network effects.
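
As a rough sketch of the metadata requirement, the record below carries just enough to answer "who created which data when" and to find data by language. The field names, the choice of three-letter ISO 639-3 codes (for example `tha` for Thai), and the sample records are assumptions made for illustration; the design does not fix any of them.

```python
# Sketch only: minimal attribution metadata plus a language-code index.
# Field names and the use of ISO 639-3 codes are assumptions.
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Metadata:
    title: str
    creator: str
    created: datetime
    language_codes: tuple

    def __post_init__(self):
        # Only checks the shape of a code, not whether it is actually assigned.
        for code in self.language_codes:
            if not (len(code) == 3 and code.isalpha() and code.islower()):
                raise ValueError(f"not a three-letter language code: {code!r}")

    def attribution(self):
        return f"{self.creator} ({self.created.year}): {self.title}"


def index_by_language(records):
    """Map each language code to the titles tagged with it."""
    index = defaultdict(list)
    for record in records:
        for code in record.language_codes:
            index[code].append(record.title)
    return dict(index)


records = [
    Metadata("segmented news text", "user_a",
             datetime(2009, 7, 19, tzinfo=timezone.utc), ("tha",)),
    Metadata("bilingual word list", "user_b",
             datetime(2009, 7, 20, tzinfo=timezone.utc), ("tha", "eng")),
]

print(records[0].attribution())
print(index_by_language(records))
```

The language index is what would let a tool builder find data for a given language, or a linguist find other people's work on the same language.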

Potential

  • platform to deliver software as a service



Latest page update: made by robert_forkel, Jul 19 2009, 2:33 AM EDT
Keyword tags: spec