|
Events
|
Sep 2 2009, 4:37 AM EDT |
|
edit |
78 words added
78 words deleted
|
Change:
The workshop is part of a series of explorative workshops financed by NOS-HS under the heading "Research Infrastructure for Linguistic Variation Studies" (RILiVS). RILiVS is
(Word count: 275)
|
|
Events
|
Sep 2 2009, 4:37 AM EDT |
|
edit |
78 words added
|
Change:
The workshop is part of a series of explorative workshops financed by NOS-HS under the heading "Research Infrastructure for Linguistic Variation Studies" (RILiVS). RILiVS is
(Word count: 275)
|
|
Group 4 White Paper
|
Aug 31 2009, 11:06 AM EDT |
|
edit |
11 words added
11 words deleted
|
Change:
Reliable Provenance Through Publication
A major question is how to achieve reliable data provenance in the linguistic community and promote the sharing of data. Creating
(Word count: 3595)
|
|
Group 4 White Paper
|
Aug 31 2009, 11:05 AM EDT |
|
edit |
120 words added
328 words deleted
|
Change:
Comprehensibility is key to data reliability. All data must be tagged with the appropriate metadata and linked to its documentation. This allows researchers to understand
(Word count: 3595)
|
|
Group 4 White Paper
|
Aug 31 2009, 10:51 AM EDT |
|
edit |
136 words added
120 words deleted
|
Change:
Given that linguists change institutions and that URLs shift over time, it is important that future researchers be able to access the same data that is being used today and to be certain that this is the same data as was used by other researchers. On the Internet, the
(Word count: 3804)
|
|
Group 4 White Paper
|
Aug 31 2009, 10:42 AM EDT |
|
edit |
3 words added
|
Change:
signed dissertation is submitted. Finally, the establishment of electronic data publishing journals in conjunction with a cyberinfrastructure should be considered, so as to provide a formal channel for establishing authorship of data sets and creating a scholarly reference in addition to a framework for peer review.
(Word count: 3786)
|
|
Group 4 White Paper
|
Aug 31 2009, 10:42 AM EDT |
|
edit |
60 words added
42 words deleted
|
Change:
until the signed dissertation is submitted. Finally, the establishment of electronic data publishing journals in conjunction with a cyberinfrastructure should be considered, so as to provide a formal channel for establishing authorship and creating a scholarly reference in addition to a framework for peer review.
(Word count: 3783)
|
|
Group 4 White Paper
|
Aug 31 2009, 10:35 AM EDT |
|
edit |
45 words added
12 words deleted
|
Change:
It will therefore be useful for the linguistic community to engage in extensive dissemination and training efforts and to establish links with ongoing generic projects on metadata standards and preservation (e.g. PREMIS). There are several ways to encourage linguists to
(Word count: 3764)
|
|
Group 4 White Paper
|
Aug 31 2009, 10:25 AM EDT |
|
edit |
174 words added
147 words deleted
|
Change:
This again is a challenge for provenance information. Ideally, a handle can be assigned to every step in the pipelining process -- note that at every step, intermediate data could be cached. Furthermore, Rosetta, Freebase, the Internet Archive etc. allow for mashups of data. Cyberinfrastructures
(Word count: 3730)
|
|
Group 4 White Paper
|
Aug 31 2009, 10:08 AM EDT |
|
edit |
120 words added
115 words deleted
|
Change:
As part of provenance, in recording the who, what, and when of metadata, it is necessary to have trusted identification of individuals, organizations, and services.
(Word count: 3706)
|
|
Group 4 White Paper
|
Aug 31 2009, 10:01 AM EDT |
|
edit |
1415 words added
1 word deleted
|
Change:
Researchers are often unwilling to turn over their data for storage and distribution in repositories. One reason is that some people feel their data is
(Word count: 3701)
|
|
Group 4 White Paper
|
Aug 31 2009, 9:29 AM EDT |
|
edit |
296 words added
141 words deleted
|
Change:
Every entity involved in data set creation can be identified by a unique handle. These include entities such as people, organizations, and their roles, the
(Word count: 2273)
|
|
Group 4 White Paper
|
Aug 30 2009, 11:12 AM EDT |
|
edit |
131 words added
100 words deleted
|
Change:
Individuals contributing to and creating these data sets need to get institutional credit for data publication. For example, these should count for tenure reviews and
(Word count: 2116)
|
|
Group 4 White Paper
|
Aug 30 2009, 11:01 AM EDT |
|
edit |
273 words added
264 words deleted
|
Change:
is important to know whether the trees were manually constructed, created automatically, or bootstrapped by manually correcting automatically constructed trees.
How to Achieve Provenance
A major question is how to achieve reliable data provenance in the
(Word count: 2085)
|
|
Group 4 White Paper
|
Aug 30 2009, 10:37 AM EDT |
|
edit |
40 words added
56 words deleted
|
Change:
it. It is important to know who contributed to a data set by collecting the data and by providing its publication. The data might come from native-speaker informants, from published works of literature, from the web, etc. Adequate information about provenance, i.e.
(Word count: 2077)
|
|
Group 4 White Paper
|
Aug 24 2009, 4:12 PM EDT |
|
edit |
16 words added
1 word deleted
|
Change:
community in a context where we are moving from simple data sets to more complex cyberinfrastructures. We then suggest some first steps to promote data sharing and publication in the linguistics community.
Data Provenance
Provenance is the who, what, and when of metadata. When a data set
(Word count: 2092)
|
|
Group 4 Notes
|
Aug 15 2009, 3:48 PM EDT |
|
edit |
69 words added
88 words deleted
|
Change:
that match initial metadata query [...] of data and then searching and filtering might not be the optimal solution. Furthermore, downloading data should be avoided
(Word count: 4481)
|
|
Group 4 Notes
|
Aug 15 2009, 3:42 PM EDT |
|
edit |
201 words added
107 words deleted
|
Change:
are located. An infrastructure would not necessarily replace or copy existing repositories and services, but could aim at connecting them. Provenance information is very important as an underlying feature, since a user might combine a parser X with a tagger Y from different places
(Word count: 4502)
|
|
Group 4 Notes
|
Aug 15 2009, 3:27 PM EDT |
|
edit |
162 words added
80 words deleted
|
Change:
therefore crucial that repositories offer versioning and updating of the stored materials. Some researchers might prefer to control distribution themselves from their own homepage. A possible solution could be that links or webpages could be generated from the repository automatically. This could
(Word count: 4410)
|
|
Group 4 Notes
|
Aug 15 2009, 3:15 PM EDT |
|
edit |
102 words added
112 words deleted
|
Change:
It is challenging to reference mashup data since the mashup process is dynamic and every combination of specific versions of data produces a new mashup
(Word count: 4328)
|