bitpar.BitParChartParser

__init__(self, weightedrules=None, lexicon=None, rootsymbol=None, unknownwords=None, openclassdfsa=None, cleanup=True, n=10, name='')
(Constructor)

Interface to bitpar chart parser. Expects a list of weighted productions with frequencies (not probabilities).
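
As an illustration of what such a frequency-weighted list might look like, the sketch below counts productions and renders them as tab-separated rule strings paired with frequencies. The helper to_weightedrules and the toy productions are hypothetical and not part of this module; only the tuple format follows the description of weightedrules under Parameters.

```python
from collections import Counter

def to_weightedrules(productions):
    """Count (lhs, rhs) productions and render them as (rule, frequency) tuples.

    `productions` is an iterable of (lhs, rhs) pairs, where rhs is a tuple
    of symbols. Hypothetical helper, not part of the bitpar module.
    """
    counts = Counter(productions)
    # Join lhs and rhs symbols with tabs, the form weightedrules expects.
    return [("\t".join((lhs,) + tuple(rhs)), freq)
            for (lhs, rhs), freq in sorted(counts.items())]

# Productions read off two toy trees; S -> NP VP and VP -> walks occur twice.
observed = [
    ("S", ("NP", "VP")), ("NP", ("mary",)), ("VP", ("walks",)),
    ("S", ("NP", "VP")), ("NP", ("john",)), ("VP", ("walks",)),
]
wrules = to_weightedrules(observed)
lexicon = {"mary", "john", "walks"}  # the set of terminals
```

The resulting wrules and lexicon have the shape of the first two constructor arguments; raw counts are kept as they are instead of being normalised into probabilities.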

Parameters:

weightedrules - sequence of tuples, each pairing a string (lhs and rhs separated by tabs, e.g. "S\tNP\tVP") with a frequency. This format is used because it is close to bitpar's file format; converting a weighted grammar with probabilities to frequencies would be a detour, and bitpar wants frequencies so that it can do smoothing.
lexicon - set of strings belonging to the lexicon (i.e., the set of terminals)
rootsymbol - starting symbol for the grammar
unknownwords - a file with a list of open class POS tags with frequencies
openclassdfsa - a deterministic finite state automaton; refer to the bitpar manpage.
cleanup - boolean; when set to True, the grammar files will be removed when the BitParChartParser object is deleted.
name - filename for the grammar files, in case you want to export them; if not given, defaults to a unique identifier
n - the n best parse trees will be requested

>>> wrules = ( ("S\tNP\tVP", 1), ("NP\tmary", 1), ("VP\twalks", 1) )
>>> p = BitParChartParser(wrules, set(("mary","walks")))
>>> tree = p.parse("mary walks".split())
>>> print tree
(S (NP mary) (VP walks)) (p=1.0)
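
The p=1.0 in the example above is what plain relative-frequency estimation would give for these rules, since each left-hand side has exactly one rule. The sketch below works that out under the assumption of unsmoothed relative frequencies; the function relative_frequencies is hypothetical, and bitpar's own estimates may differ because it applies smoothing.

```python
from collections import defaultdict

def relative_frequencies(weightedrules):
    """Map each tab-separated rule to freq / (total freq of its lhs).

    Hypothetical helper; assumes no smoothing is applied.
    """
    totals = defaultdict(int)
    for rule, freq in weightedrules:
        totals[rule.split("\t")[0]] += freq  # first field is the lhs
    return {rule: freq / float(totals[rule.split("\t")[0]])
            for rule, freq in weightedrules}

wrules = (("S\tNP\tVP", 1), ("NP\tmary", 1), ("VP\twalks", 1))
probs = relative_frequencies(wrules)

# Probability of the derivation S -> NP VP, NP -> mary, VP -> walks
# is the product of the probabilities of the rules it uses.
p = probs["S\tNP\tVP"] * probs["NP\tmary"] * probs["VP\twalks"]
```

With a second NP rule of frequency 3 added, the same function would assign NP -> mary a probability of 1/4, which is the sense in which frequencies, not probabilities, are the input.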

>>> from dopg import GoodmanDOP
>>> d = GoodmanDOP([tree], parser=InsideChartParser)
>>> d.parser.parse("mary walks".split())
ProbabilisticTree('S', [ProbabilisticTree('NP@1', ['mary'])
(p=1.0), ProbabilisticTree('VP@2', ['walks']) (p=1.0)])
(p=0.444444444444)
>>> d.parser.nbest_parse("mary walks".split(), 10)
[ProbabilisticTree('S', [ProbabilisticTree('NP@1', ['mary']) (p=1.0),
        ProbabilisticTree('VP@2', ['walks']) (p=1.0)]) (p=0.444444444444),
ProbabilisticTree('S', [ProbabilisticTree('NP', ['mary']) (p=1.0),
        ProbabilisticTree('VP@2', ['walks']) (p=1.0)]) (p=0.222222222222),
ProbabilisticTree('S', [ProbabilisticTree('NP@1', ['mary']) (p=1.0),
        ProbabilisticTree('VP', ['walks']) (p=1.0)]) (p=0.222222222222),
ProbabilisticTree('S', [ProbabilisticTree('NP', ['mary']) (p=1.0),
        ProbabilisticTree('VP', ['walks']) (p=1.0)]) (p=0.111111111111)]

>>> d = GoodmanDOP([tree], parser=BitParChartParser)
    writing grammar
>>> d.parser.parse("mary walks".split())
ProbabilisticTree('S', [Tree('NP@1', ['mary']), Tree('VP@2', ['walks'])]) (p=0.444444)
>>> list(d.parser.nbest_parse("mary walks".split()))
[ProbabilisticTree('S', [Tree('NP@1', ['mary']), Tree('VP@2', ['walks'])]) 
(p=0.444444),
ProbabilisticTree('S', [Tree('NP', ['mary']), Tree('VP@2', ['walks'])])
(p=0.222222),
ProbabilisticTree('S', [Tree('NP@1', ['mary']), Tree('VP', ['walks'])])
(p=0.222222), 
ProbabilisticTree('S', [Tree('NP', ['mary']), Tree('VP', ['walks'])])
(p=0.111111)]

TODO: parse bitpar's chart output / parse forest

Class BitParChartParser

__init__(self, weightedrules=None, lexicon=None, rootsymbol=None, unknownwords=None, openclassdfsa=None, cleanup=True, n=10, name='') (Constructor)

writegrammar(self, f, l)
