Updated README for new Sentence object.
Hugo-ter-Doest committed Jan 30, 2018
1 parent a5dbada commit 0aa8918
Showing 2 changed files with 25 additions and 31 deletions.
47 changes: 24 additions & 23 deletions README.md
@@ -1184,9 +1184,19 @@ var rules = new natural.RuleSet(rulesFilename);
var tagger = new natural.BrillPOSTagger(lexicon, rules);

var sentence = ["I", "see", "the", "man", "with", "the", "telescope"];
console.log(JSON.stringify(tagger.tag(sentence)));
// [["I","NN"],["see","VB"],["the","DT"],["man","NN"],["with","IN"],["the","DT"],["telescope","NN"]]

console.log(tagger.tag(sentence));
```
This outputs the following:
```
Sentence {
taggedWords:
[ { token: 'I', tag: 'NN' },
{ token: 'see', tag: 'VB' },
{ token: 'the', tag: 'DT' },
{ token: 'man', tag: 'NN' },
{ token: 'with', tag: 'IN' },
{ token: 'the', tag: 'DT' },
{ token: 'telescope', tag: 'NN' } ] }
```
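Code that relied on the old array-of-pairs output can be adapted by reading `taggedWords` directly. The following is a minimal sketch, not part of the library: `result` is a hand-written stand-in for the object returned by `tagger.tag(sentence)`, and `toPairs` is a hypothetical helper name. Only the `taggedWords`, `token` and `tag` properties shown in the output above are assumed.

```javascript
// Hand-written stand-in for the value returned by tagger.tag(sentence).
// Only the shape printed above is assumed: { taggedWords: [{ token, tag }] }.
var result = {
  taggedWords: [
    { token: 'I', tag: 'NN' },
    { token: 'see', tag: 'VB' },
    { token: 'the', tag: 'DT' },
    { token: 'man', tag: 'NN' },
    { token: 'with', tag: 'IN' },
    { token: 'the', tag: 'DT' },
    { token: 'telescope', tag: 'NN' }
  ]
};

// Hypothetical helper: rebuild [token, tag] pairs from the object shape.
function toPairs(sentence) {
  return sentence.taggedWords.map(function(taggedWord) {
    return [taggedWord.token, taggedWord.tag];
  });
}

console.log(JSON.stringify(toPairs(result)));
```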

### Lexicon
@@ -1227,21 +1237,16 @@ VBD NN PREV-TAG DT
Here the category of the previous word must be <code>DT</code> for the rule to be applied.

### Algorithm
The tagger applies transformation rules that may change the category of words. The input sentence must be split into words which are assigned with categories. The tagged sentence is then processed from left to right. At each step all rules are applied once; rules are applied in the order in which they are specified. Algorithm:
The tagger applies transformation rules that may change the category of words. The input sentence is a Sentence object with tagged words. The tagged sentence is processed from left to right. At each step all rules are applied once; rules are applied in the order in which they are specified. Algorithm:
```javascript
function(sentence) {
var tagged_sentence = new Array(sentence.length);

// snip

// Apply transformation rules
for (var i = 0, size = sentence.length; i < size; i++) {
this.transformation_rules.forEach(function(rule) {
rule.apply(tagged_sentence, i);
Brill_POS_Tagger.prototype.applyRules = function(sentence) {
for (var i = 0, size = sentence.taggedWords.length; i < size; i++) {
this.ruleSet.getRules().forEach(function(rule) {
rule.apply(sentence, i);
});
}
return(tagged_sentence);
}
return sentence;
};
```
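To see the loop in action without loading a lexicon or rule file, here is a self-contained sketch. It assumes only that a rule exposes `apply(sentence, i)` and that a sentence has a `taggedWords` array of `{ token, tag }` objects. `makePrevTagRule` and the free-standing `applyRules` are hypothetical stand-ins for illustration, not the library's own rule objects.

```javascript
// Hypothetical rule factory: change oldTag to newTag when the previous
// word carries prevTag (compare the rule "VBD NN PREV-TAG DT").
function makePrevTagRule(oldTag, newTag, prevTag) {
  return {
    apply: function(sentence, i) {
      var words = sentence.taggedWords;
      if (i > 0 && words[i].tag === oldTag && words[i - 1].tag === prevTag) {
        words[i].tag = newTag;
      }
    }
  };
}

// Same traversal as the method above: left to right over the words,
// applying every rule once per position, in the order given.
function applyRules(rules, sentence) {
  for (var i = 0, size = sentence.taggedWords.length; i < size; i++) {
    rules.forEach(function(rule) {
      rule.apply(sentence, i);
    });
  }
  return sentence;
}

// Plain object standing in for a Sentence.
var sentence = {
  taggedWords: [
    { token: 'the', tag: 'DT' },
    { token: 'cut', tag: 'VBD' }
  ]
};

applyRules([makePrevTagRule('VBD', 'NN', 'DT')], sentence);
console.log(sentence.taggedWords[1].tag);
```

Because the rules mutate the sentence in place, a tag changed at position `i` is already visible to rules evaluated at position `i + 1`.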

### Adding a predicate
@@ -1270,13 +1275,13 @@ A typical entry for a rule templates looks like this:
"parameter1Values": nextTagParameterValues
}
```
A predicate function accepts a tagged sentence, the current position in the
A predicate function accepts a Sentence object, the current position in the
sentence that should be tagged, and the outcome(s) of the predicate.
An example of a predicate that checks the category of the next word:
```javascript
function next_tag_is(tagged_sentence, i, parameter) {
if (i < tagged_sentence.length - 1) {
return(tagged_sentence[i+1][1] === parameter);
function next_tag_is(sentence, i, parameter) {
if (i < sentence.taggedWords.length - 1) {
return(sentence.taggedWords[i + 1].tag === parameter);
}
else {
return(false);
@@ -1296,10 +1301,6 @@ function nextTagParameterValues(sentence, i) {
}
}
```
Please note that these functions work with a different data type. Here, a
sentence is an array of tokens and tokens are maps that have at least a
token (word) and a tag.


### Training
The trainer lets you learn a new set of transformation rules from a corpus.
9 changes: 1 addition & 8 deletions lib/natural/brill_pos_tagger/lib/Brill_POS_Tagger.js
@@ -31,7 +31,7 @@ function Brill_POS_Tagger(lexicon, ruleSet) {
// the word itself followed by its lexical category
Brill_POS_Tagger.prototype.tag = function(sentence) {
var taggedSentence = this.tagWithLexicon(sentence);
console.log(taggedSentence);
//console.log(taggedSentence);
return this.applyRules(taggedSentence);
};

@@ -51,13 +51,6 @@ Brill_POS_Tagger.prototype.tagWithLexicon = function(sentence) {
// A tagged word is an array consisting of the word itself followed by its lexical category.
// Returns an array of tagged words as well
Brill_POS_Tagger.prototype.applyRules = function(sentence) {
// Apply transformation rules
/*
var sentence = new Sentence();
taggedSentence.forEach(function(tokenPlusTag) {
sentence.addTaggedWord(tokenPlusTag[0], tokenPlusTag[1]);
});
*/
for (var i = 0, size = sentence.taggedWords.length; i < size; i++) {
this.ruleSet.getRules().forEach(function(rule) {
rule.apply(sentence, i);