The module is a probability-based, corpus-trained tagger that assigns
part-of-speech (POS) tags to English text using a lookup dictionary and a set of
probability values. The tagger assigns appropriate tags based on
conditional probabilities: it examines the preceding tag to determine
the most likely tag for the current word. Unknown words are classified
according to their morphology, or the tagger can be configured to treat
them as nouns or other parts of speech. The tagger also extracts as many
nouns and noun phrases as it can, using a set of regular expressions.
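
The approach described above can be illustrated with a minimal sketch in
Python. This is not the module's own code or API: the names LEXICON,
TRANSITIONS, guess_unknown, tag and noun_phrases are hypothetical, the
probability values are made up for illustration, and the tag names only
loosely follow Penn Treebank conventions. The sketch assumes a greedy,
left-to-right choice of tag based on the preceding tag, a few suffix rules
for unknown words, and a single simplified regular expression for noun
phrases.

```python
import re

# Toy lookup dictionary: word -> {tag: probability of that tag for the word}.
# All values are invented for illustration, not taken from any corpus.
LEXICON = {
    "the": {"DET": 1.0},
    "old": {"JJ": 1.0},
    "dog": {"NN": 0.9, "VB": 0.1},
    "can": {"MD": 0.7, "NN": 0.2, "VB": 0.1},
    "barks": {"VBZ": 0.6, "NNS": 0.4},
}

# Toy conditional probabilities: previous tag -> {tag: P(tag | previous tag)}.
TRANSITIONS = {
    "<s>": {"DET": 0.6, "NN": 0.2, "JJ": 0.1, "MD": 0.1},
    "DET": {"NN": 0.6, "JJ": 0.3, "NNS": 0.1},
    "JJ": {"NN": 0.7, "NNS": 0.2, "JJ": 0.1},
    "NN": {"VBZ": 0.4, "MD": 0.3, "NN": 0.2, "NNS": 0.1},
}

FLOOR = 1e-4  # small probability for tag pairs missing from TRANSITIONS


def guess_unknown(word, unknown_tag=None):
    """Return candidate tags for a word that is not in the lexicon."""
    if unknown_tag is not None:  # caller forces a fixed tag, e.g. "NN"
        return {unknown_tag: 1.0}
    # Otherwise fall back to a few crude morphological rules.
    if word[0].isdigit():
        return {"CD": 1.0}
    if word.endswith("ing"):
        return {"VBG": 1.0}
    if word.endswith("ly"):
        return {"RB": 1.0}
    if word.endswith("ed"):
        return {"VBD": 1.0}
    if word.endswith("s"):
        return {"NNS": 1.0}
    return {"NN": 1.0}


def tag(words, unknown_tag=None):
    """Greedily tag each word using the tag chosen for the preceding word."""
    tagged, prev = [], "<s>"
    for word in words:
        candidates = LEXICON.get(word.lower()) or guess_unknown(word, unknown_tag)
        # Score = lexical probability * P(tag | previous tag).
        best = max(
            candidates,
            key=lambda t: candidates[t] * TRANSITIONS.get(prev, {}).get(t, FLOOR),
        )
        tagged.append((word, best))
        prev = best
    return tagged


# Simplified noun phrase: optional adjectives followed by one or more nouns.
NP_PATTERN = re.compile(r"(?:\S+/JJ )*(?:\S+/NNS? )+")


def noun_phrases(tagged):
    """Extract noun phrases by matching a regex over 'word/TAG ' tokens."""
    text = "".join(f"{word}/{t} " for word, t in tagged)
    phrases = []
    for match in NP_PATTERN.finditer(text):
        words = [token.rsplit("/", 1)[0] for token in match.group().split()]
        phrases.append(" ".join(words))
    return phrases


if __name__ == "__main__":
    result = tag("The old dog barks".split())
    print(result)                # [('The', 'DET'), ('old', 'JJ'), ('dog', 'NN'), ('barks', 'VBZ')]
    print(noun_phrases(result))  # ['old dog']
```

In this sketch the word "can" is tagged as a noun after a determiner
("the can") but as a modal after a noun ("the dog can"), which shows how
the preceding tag resolves an ambiguous dictionary entry. Passing
unknown_tag="NN" to tag() treats every unknown word as a noun instead of
applying the suffix rules.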