Module morph

An application of Data-Oriented Parsing to Esperanto. Combines a syntax and a morphology corpus.

Functions
 
chapelitoj(word)

malchapelitoj(word)
 
cnf(tree)
make sure all terminals have POS tags; invent one if necessary ("parent_word")
 
stripfunc(tree)
strip all function labels from a tree with labels of the form "function:form", e.g. S:np (subject, np) becomes np.
 
dos(words)
'Data-Oriented Segmentation 1': given a sequence of segmented words (i.e., each word is a sequence of morphemes), produce a dictionary with extrapolated segmentations (mapping words to sequences of morphemes).
 
dos1(words)
'Data-Oriented Segmentation 2': given a sequence of segmented words (i.e., each word is a sequence of morphemes), produce a dictionary with extrapolated segmentations (mapping words to sequences of morphemes).
 
dos2(words)

dos3(words)
 
segmentor(segmentd)
wrap a segmentation dictionary in a naive segmentation function for unknown words, using some heuristics (phonological rules could probably improve this further)
 
morphmerge(tree, md, segmented)
merge morphology into a phrase structure tree
 
morphology(train)
an interactive interface to the toy corpus
 
toy()

interface()
 
monato()
produce the Goodman reduction of the full Monato corpus

Variables
  __package__ = None

Function Details

stripfunc(tree)

strip all function labels from a tree with labels of the form "function:form", e.g. S:np (subject, np) becomes np.
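
The label stripping can be illustrated with a minimal sketch. It does not reproduce the module's own code: the nested-list tree representation ([label, child, ...] with plain strings as leaves) and the helper name strip_function_labels are assumptions made for this example only.

```python
def strip_function_labels(tree):
    """Return a copy of `tree` with "function:form" labels reduced to "form".

    Assumption: a tree is a list [label, child, ...]; leaves are plain strings.
    """
    if isinstance(tree, str):
        # A leaf (word or morpheme) is returned unchanged.
        return tree
    label, *children = tree
    # Keep only the part after the first colon; labels without a colon
    # pass through untouched.
    form = label.split(":", 1)[-1]
    return [form] + [strip_function_labels(child) for child in children]


example = ["s", ["S:np", "la", "kato"], ["P:vp", "dormas"]]
stripped = strip_function_labels(example)
print(stripped)  # ['s', ['np', 'la', 'kato'], ['vp', 'dormas']]
```

The function builds a new tree rather than editing in place, so the labelled original remains available.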

dos(words)

'Data-Oriented Segmentation 1': given a sequence of segmented words (i.e., each word is a sequence of morphemes), produce a dictionary with extrapolated segmentations (mapping words to sequences of morphemes). Assumes non-ambiguity. Method: Cartesian product of all possible morphemes at position 0..n, where n is the maximum word length in morphemes.
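
One possible reading of this method, as a rough sketch rather than the module's actual implementation: collect the morphemes seen at each position and combine them with itertools.product. The name dos_sketch is hypothetical, and generating words of every length from 1 up to the maximum is an assumption, since the description does not say which lengths are produced.

```python
from itertools import product


def dos_sketch(words):
    """Map each extrapolated word to its morpheme sequence (hypothetical helper)."""
    words = [tuple(word) for word in words]
    longest = max((len(word) for word in words), default=0)
    # slots[i] collects every morpheme observed at position i in any word.
    slots = [sorted({word[i] for word in words if len(word) > i})
             for i in range(longest)]
    segmentations = {}
    # Assumption: words of every length 1..longest are generated.
    for length in range(1, longest + 1):
        for combo in product(*slots[:length]):
            # Non-ambiguity is assumed, so a later combination that spells
            # the same word silently overwrites an earlier one.
            segmentations["".join(combo)] = combo
    return segmentations


corpus = [("kat", "o"), ("hund", "o", "j"), ("bel", "a")]
segmentations = dos_sketch(corpus)
print(segmentations["belaj"])  # ('bel', 'a', 'j')
```

Note that "belaj" never occurs in the toy input; it is extrapolated from "bel" at position 0, "a" at position 1 and "j" at position 2.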

dos1(words)

'Data-Oriented Segmentation 2': given a sequence of segmented words (i.e., each word is a sequence of morphemes), produce a dictionary with extrapolated segmentations (mapping words to sequences of morphemes). Discards ambiguous results. Method: Cartesian product of all words with the same number of morphemes.
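
A sketch of how this variant might work, under the assumption that "ambiguous" means a word that can be spelled by more than one morpheme sequence. Words are grouped by morpheme count, the product is taken within each group, and any word with competing segmentations is dropped. The name dos1_sketch and the grouping details are illustrative, not taken from the module.

```python
from collections import defaultdict
from itertools import product


def dos1_sketch(words):
    """Extrapolate segmentations per morpheme count, dropping ambiguous words."""
    groups = defaultdict(list)
    for word in words:
        groups[len(word)].append(tuple(word))

    # candidates[spelling] holds every distinct segmentation producing it.
    candidates = defaultdict(set)
    for length, group in groups.items():
        # Morphemes seen at each position among words of this length only.
        slots = [sorted({word[i] for word in group}) for i in range(length)]
        for combo in product(*slots):
            candidates["".join(combo)].add(combo)

    # Keep only the words that have exactly one segmentation.
    return {spelling: next(iter(segs))
            for spelling, segs in candidates.items() if len(segs) == 1}


corpus = [("kat", "o"), ("ka", "to"), ("bel", "a"), ("hund", "o", "j")]
unambiguous = dos1_sketch(corpus)
print("kato" in unambiguous)  # False: both kat+o and ka+to spell "kato"
print(unambiguous["bela"])    # ('bel', 'a')
```

Compared with the positional product over the whole corpus, restricting each product to words of equal morpheme count keeps, for example, a plural ending from being attached to slots that only ever appeared in two-morpheme words.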