Version User Scope of changes
Jul 19 2009, 1:00 PM EDT (current) MarkYLiberman 286 words deleted
Jul 19 2009, 12:48 PM EDT MarkYLiberman 9 words added, 1 word deleted

By funding models, we mean ways of securing funding to build and maintain the various pieces of the linguistic cyberinfrastructure: standards, corpora, archives, tools. This includes the tools that linguists use to access the data, and also the tools that make it possible (or even easy) for ordinary linguists to produce and publish standards-compatible data.

The Funding Models group will consider how to fund the creation and maintenance of this ongoing cyberinfrastructure for linguistic data. Most pieces of this cyberinfrastructure are created as part of larger projects or as spin-offs from other activities, and a particular corpus or tool will often have different funding models for creation, maintenance/evolution, and distribution.

[Some preliminary thoughts by group co-chair Mark Liberman are here.]

[The outline of our preliminary report on Sunday morning is here.]

We note that there is a prior question: how to organize the process. In particular, the balance between central planning and bottom-up initiative is an issue to think about.

Funding Sources
- Individual in-kind donations (of time, facilities, etc.)
- Individual financial donations
- Institutional in-kind donations (space, salary, facilities, etc.)
- Institutional financial support
  - from companies
  - from foundations
- Government grants and contracts
- Private grants (like Ford, Hewlett, MacArthur, Sloan, etc.)
- Membership fees
- Usage fees
- ???

Fundable Entities
- Scholarly or technical societies (LSA, ACL)
- Universities
  - individual researchers
  - centers
- Consortiums
- Free-standing not-for-profit (e.g. 501(c)(3)) entities
- (For-profit) companies
- ???

Advantages/Drawbacks of Above
- How does an open development model fit into the above?
- Piggy-backing on other development communities?

CyberInfrastructure Examples: This is NOT intended to be a complete inventory, but rather a diverse set of examples for discussion, exemplifying the diverse "sources and sinks" listed above. Note that most of these are mainly or at least partly oriented towards speech and language research, but a few more general links have been included as well, such as Audacity, Oyez, and LibriVox.
- Software Tools: Audacity, AGTK, Champollion, CLAN, Emu, ESPS, Festival, HTK, Linguistica, LIWC, NLTK, OpenFST, Praat, SpeechStation2, SenSyn, Transcriber, Wavesurfer, Xaira, ...
- Datasets: WordNet, WALS, Rosetta Language Base
- Building Community: Open Translation Tools 2009 and the Open Translation Tools manual created in the subsequent book sprint (funded by Open Society Institute and the Ford Foundation); OLAC; NLTK courses; Praat community
- Archives: BAS, CHILDES, ELDA, LDC, LibriVox, OTA, Oyez, Rosetta, TalkBank
- Portals: ANAE, BNC, Corpus.byu.edu, Ethnologue, FrameNet, LDC Online, WALS, MultiTree ...
- Standards: GOLD (grant-based), OLAC (open community), ISO 639-3 (SIL funding model - also for development work in general), Unicode (consortium membership model)

Other resource pages: Linguist List Software Page, Natural Language Software Registry, SIL's Linguistics Computing Resources on the Internet, Open Translation Tools