VISL Format Consistency Checker

This consistency checker comments on formal inconsistencies in VISL-format treebanks (e.g. indentation or bracketing problems), plus a few content inconsistencies (e.g. impossible form/function combinations). The program was written in Perl by Eckhard Bick for use in the PaNoLa project, and can be downloaded here.

Easy errors are corrected automatically and marked #¤¤¤¤¤¤¤¤¤¤F#, while a #¤¤¤¤¤¤¤¤¤¤!# mark serves as a warning in cases where correction needs human intervention. At the bottom of the output file, statistics are provided for all form and function tags used. The examples below are from Icelandic PaNoLa.
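
For post-processing checker output, the marks described above can be separated from the node text. The following Python sketch is an illustration only and is not part of the Perl program; the function name `split_marks` and the regular expression are assumptions based on the sample output on this page, where a combined `F!` mark also occurs and is treated here as a kind of its own.

```python
import re
from typing import List, Tuple

# Assumption: a mark is '#', one or more '¤', a kind ('F', '!' or 'F!'), then '#',
# followed by a free-text message that runs until the next mark or end of line.
MARK_RE = re.compile(r"#¤+(F!|F|!)#\s*([^#]*)")


def split_marks(line: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the node text and a list of (kind, message) pairs found on a line."""
    start = line.find("#¤")
    if start == -1:
        # No checker comment on this line.
        return line.rstrip(), []
    node = line[:start].rstrip()
    marks = [(kind, msg.strip()) for kind, msg in MARK_RE.findall(line[start:])]
    return node, marks


def needs_review(line: str) -> bool:
    """True if any mark on the line contains '!', i.e. asks for a human check."""
    _, marks = split_marks(line)
    return any("!" in kind for kind, _ in marks)
```

Applied to a checker output file line by line, `needs_review` would pick out the lines a human annotator still has to look at, while lines carrying only `F` marks have already been corrected.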

Function & form statistics

Functions used:
A 289
CJT 68
CO 43
COM 5
Co 13
Cs 35
D 464
EXC 1
H 454
Od 92
Oi 9
P 254
QUE 35
S 234
STA 187
SUB 3
Sf 14
Ss 2
cl 1 ### cave
Forms used:
0 3 ### cave
adj 168
adv 114
cl 255
conj 43
g 453
infm 38
koma 1 ### cave
n 354
num 18
par 33
pron 168
prop 71
prp 173
v 311
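
A tally like the one above can be reproduced from the node lines of a treebank. The sketch below is a hypothetical Python illustration, not the checker's own code: `tag_statistics` and `TAG_RE` are made-up names, and it assumes that every node line starts with optional `=` indentation followed by `function:form`. Lines without such a pair (sentence headers, `A1`, punctuation, nodes lacking a function) are simply skipped.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumption: a node line is '=' * depth, then 'function:form', then optionally a word.
TAG_RE = re.compile(r"=*([^\s:=]+):(\S+)")


def tag_statistics(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count function tags and form tags over all node lines."""
    functions: Counter = Counter()
    forms: Counter = Counter()
    for line in lines:
        match = TAG_RE.match(line.strip())
        if match:
            functions[match.group(1)] += 1
            forms[match.group(2)] += 1
    return functions, forms


def sorted_counts(counter: Counter) -> List[Tuple[str, int]]:
    """Sort by tag in plain code point order (upper case before lower case)."""
    return sorted(counter.items())
```

Sorting by code point puts upper-case tags before lower-case ones, which matches the order of the listing above (for example `CO`, `COM`, `Co`, `Cs`). The `### cave` flags would need a list of valid tags, which this sketch does not assume.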

Tree format input

IS89) Gamli maðurinn las kvæðið hátt og skýrt.
A1
STA:cl
S:g
D:adj Gamli
H:n maðurinn
P:v las
Od:n kvæðið
=A:par
CJT:adv hátt
CO:conj og
CJT:adv skýrt
.

IS97) Það eru innlendu fréttirnar sem byrja klukkan 19.
A1
STA:cl
Sf:pron Það
P:v eru
S:g
=D:adj innlendu
=H:n fréttirnar
=P:cl
==conj sem
==P:v byrja
==A :g
===D:n klukkan
===H:num 19
.

Output

IS89) Gamli maðurinn las kvæðið hátt og skýrt.
A1
STA:cl
S:g
=D:adj Gamli #¤¤¤¤¤¤¤¤¤¤F!# missing indentation, check next
=H:n maðurinn #¤¤¤¤¤¤¤¤¤¤F# missing indentation
P:v las
Od:n kvæðið
A:par #¤¤¤¤¤¤¤¤¤¤F# indentation reduced once
=CJT:adv hátt #¤¤¤¤¤¤¤¤¤¤F!# missing indentation, check next
=CO:conj og #¤¤¤¤¤¤¤¤¤¤F# missing indentation
=CJT:adv skýrt #¤¤¤¤¤¤¤¤¤¤F# missing indentation
.

IS97) Það eru innlendu fréttirnar sem byrja klukkan 19.
A1
STA:cl
Sf:pron Það
P:v eru
S:g
=D:adj innlendu
=H:n fréttirnar
P:cl #¤¤¤¤¤¤¤¤¤¤F# indentation reduced once #¤¤¤¤¤¤¤¤¤¤!# impossible tag combination
=conj sem #¤¤¤¤¤¤¤¤¤¤F# indentation reduced once #¤¤¤¤¤¤¤¤¤¤!# missing function
=P:v byrja #¤¤¤¤¤¤¤¤¤¤F# indentation reduced once
=A:g #¤¤¤¤¤¤¤¤¤¤F# spurious space or tab in node #¤¤¤¤¤¤¤¤¤¤F# indentation reduced once
==D:n klukkan #¤¤¤¤¤¤¤¤¤¤F# indentation reduced once
==H:num 19 #¤¤¤¤¤¤¤¤¤¤F# indentation reduced once
.
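
Two of the messages in the output above, "missing function" and "spurious space or tab in node", concern a single line and can be detected without looking at the rest of the tree. The Python sketch below illustrates that idea under stated assumptions: `Node`, `parse_node` and both regular expressions are hypothetical, a lone tag is taken to be a form, and the indentation repairs shown in the output are not attempted, since the rules the Perl program uses for them are not described here. Sentence headers and lines such as `A1` are expected to be filtered out by the caller.

```python
import re
from typing import List, NamedTuple, Optional


class Node(NamedTuple):
    depth: int               # number of leading '=' signs
    function: Optional[str]  # e.g. 'S', 'P', 'Od'
    form: Optional[str]      # e.g. 'g', 'v', 'n'
    word: Optional[str]      # surface token(s), None for non-terminals
    problems: List[str]      # messages worded as in the checker output


# 'function:form', tolerating stray whitespace around the colon.
TAGGED = re.compile(r"(=*)([^\s:=]+)(\s*):(\s*)(\S+)(?:\s+(.*))?")
# Anything else: a single tag (or punctuation mark) plus an optional word.
UNTAGGED = re.compile(r"(=*)(\S+)(?:\s+(.*))?")


def parse_node(line: str) -> Node:
    """Parse one node line and record purely local format problems."""
    line = line.strip()
    match = TAGGED.fullmatch(line)
    if match:
        indent, function, gap1, gap2, form, word = match.groups()
        problems = []
        if gap1 or gap2:
            problems.append("spurious space or tab in node")
        return Node(len(indent), function, form, word, problems)
    match = UNTAGGED.fullmatch(line)
    if match is None:
        raise ValueError(f"not a node line: {line!r}")
    indent, tag, word = match.groups()
    if not any(ch.isalnum() for ch in tag):
        # Bare punctuation such as '.' carries no tags in the examples.
        return Node(len(indent), None, None, tag, [])
    # Assumption: a lone tag is a form whose function is missing.
    return Node(len(indent), None, tag, word, ["missing function"])
```

For instance, `parse_node("==A :g")` reports the spurious space and `parse_node("==conj sem")` reports the missing function, mirroring the two corresponding messages in the IS97 output.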