LapDevelopment_Status

StephanOepen edited this page Mar 4, 2016 · 27 revisions

House Cleaning

  • standardize option names (e.g. ‘--sentence’, ‘--token’, and such; following annotation types); provide sensible defaults in all tools
  • review and harmonize ‘--process’ (and tool) naming in GT and OBT stacks
  • review annotation structures and ‘finalize’ (for now)
  • investigate name mismatches observed by milen
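
As a rough illustration of the option-name item above, a shared parser helper could give every tool the same flags and defaults. This is only a sketch: the helper `build_parser`, the default values, and the tool name in the usage line are assumptions made for the example, not existing LAP code; only the ‘--sentence’ and ‘--token’ names come from the list.

```python
import argparse


def build_parser(tool_name):
    """Return a parser with the option names shared by all tools.

    Hypothetical helper: the defaults below are placeholders that
    simply mirror the annotation type names.
    """
    parser = argparse.ArgumentParser(prog=tool_name)
    # One option per annotation type, each with a sensible default,
    # so a tool can be invoked without any options at all.
    parser.add_argument("--sentence", default="sentence",
                        help="name of the sentence annotation to read")
    parser.add_argument("--token", default="token",
                        help="name of the token annotation to read")
    return parser


# Defaults apply when nothing is given; explicit values override them.
args = build_parser("tagger").parse_args(["--token", "word"])
```

Keeping the definitions in one place means a renamed option or a changed default reaches every tool at once.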

Authentication

  • simplesaml metadata (for Feide)

  • edugain: CoCo

  • CLARIN SPF

milen & nikolay on the technical side; oe (with input from francesca) driving the legal side

for the time being, standardize on the ‘mail’ attribute, since Galaxy requires user ids to be valid email addresses.

once we have the production service working, look into more sophisticated IdP discovery, e.g. discojuice or discopower.

Certification

  • enroll in Type A Service trial

Visualization

  • in-browser rendering of tagged and parsed text, using brat

DELPH-IN Integration

Tool Integration

  • constituent structure parsing
  • language identification
  • lemmatization
  • classification

Interoperability with Other CLARINO Centers

Data Collections & Parallelization

  • import archive (of document collection)
  • chunking: tool to iterate through the datasets in a collection and parallelize
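
The chunking item above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the planned tool: `chunks`, `process_chunk`, and `process_collection` are hypothetical names, datasets are represented as plain strings, and the per-chunk work is a placeholder standing in for a real tool invocation.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunks(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def process_chunk(chunk):
    # Placeholder work (assumption): a real tool would run an
    # annotation step on every dataset in the chunk.
    return [name.upper() for name in chunk]


def process_collection(datasets, size=2, workers=4):
    """Process a collection chunk by chunk, with chunks run in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in submission order, so the output
        # lines up with the input collection.
        results = pool.map(process_chunk, chunks(datasets, size))
        return [item for chunk in results for item in chunk]
```

Threads are used here only to keep the sketch short; a production version would more likely hand each chunk to a separate job on the cluster.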

LAP Library in Java

Explicit Modeling of Tool Ontology

  • data types and metadata