Stanford
The Stanford output is included mainly as a convenience and a debugging aid, but it can also serve as a more reliable alternative to corenlp.run if you need one.
This is the result of the Stanford CoreNLP analysis of the text using the following annotators:
- tokenize: recognizes individual words.
- ssplit: recognizes sentence boundaries.
- pos: identifies the token's part of speech.
- lemma: identifies the root form of the token.
- parse: creates a syntax tree of the sentence. Main clause, independent clause, noun phrase, verb phrase, etc.
- depparse: creates a dependency graph of the sentence. Subject, verb, object, modifier, auxiliary, etc.
- ner: recognizes named entities. Is this a person? Is this a place?
- coref: links pronouns to antecedents.
- quote: recognizes quotations and links quotes to speakers.
- sentiment: classifies the emotional content of a sentence on a scale from most negative to most positive.
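For readers who want to reproduce this output themselves, here is a minimal sketch of requesting the same annotator list from a CoreNLP server over HTTP. It assumes a server you run yourself at `http://localhost:9000` (that address, and the helper names `build_url` and `annotate`, are illustrative and not part of this project). It is not necessarily how the output described here was generated.

```python
import json
import urllib.parse
import urllib.request

# The annotators listed above, in the same order.
ANNOTATORS = ["tokenize", "ssplit", "pos", "lemma", "parse",
              "depparse", "ner", "coref", "quote", "sentiment"]


def build_url(base_url="http://localhost:9000"):
    """Build the request URL; the server reads its settings from `properties`."""
    properties = {"annotators": ",".join(ANNOTATORS), "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    return f"{base_url}/?{query}"


def annotate(text, base_url="http://localhost:9000"):
    """POST raw text to the (assumed) server and return the parsed JSON."""
    request = urllib.request.Request(build_url(base_url),
                                     data=text.encode("utf-8"))
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `annotate("Some text.")` only works while such a server is running; `build_url()` can be inspected on its own to check the annotator list being sent.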
The structure of the STANFORD JSON is as follows:
- document
  - various document level annotations
  - corefs
  - quotes
  - sentences
    - various sentence level annotations
    - tokens
      - various token level annotations
      - word
      - lemma
      - pos
      - ner
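As a rough illustration of walking this structure, the sketch below collects the four token fields from every sentence. It assumes the document is the top-level JSON object, that `sentences` and `tokens` are lists, and that each token is an object keyed by `word`, `lemma`, `pos` and `ner`. The function name `token_rows` and the `sample` data are hypothetical, hand-written for the example rather than real analysis output.

```python
def token_rows(document):
    """Return one (word, lemma, pos, ner) tuple per token, in document order."""
    rows = []
    for sentence in document.get("sentences", []):
        for token in sentence.get("tokens", []):
            rows.append((token.get("word"), token.get("lemma"),
                         token.get("pos"), token.get("ner")))
    return rows


# Hand-written stand-in for a parsed document (not real output).
sample = {
    "corefs": {},
    "quotes": [],
    "sentences": [
        {"tokens": [
            {"word": "Stanford", "lemma": "Stanford",
             "pos": "NNP", "ner": "ORGANIZATION"},
            {"word": "parses", "lemma": "parse", "pos": "VBZ", "ner": "O"},
        ]},
    ],
}

for row in token_rows(sample):
    print(row)
```

Using `.get` keeps the walk tolerant of tokens that lack a field, for example when an annotator was not run; missing values come back as `None`.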