Is the English dictionary US or GB? #119
-
Subject basically sums it up but:
Thanks
Replies: 2 comments
-
The English dictionary is built from this data set. I do not believe it distinguishes between the different flavors of English, but I could be wrong. Assuming US and GB words are mixed together, I do not know how to separate them. I doubt there is an easy method unless the sources themselves were separated out.
-
To improve dictionary quality, we could consider using a large, recent dataset such as GigaWord (~4 M documents) or the far larger The Pile (825 GB). The latter is more diverse (medical documents, emails, etc.). I have a script that can run over both datasets and output a JSON dict.
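For anyone who wants to try this, here is a minimal sketch of what such a script might look like. It is not the script mentioned above, only an illustration under some assumptions: the corpus is available as an iterable of plain-text lines, a simple regex is good enough for tokenizing, and the dictionary is a flat word-to-count JSON object. The names `count_words` and `write_frequency_json` are hypothetical.

```python
import json
import re
from collections import Counter
from typing import Dict, Iterable

# Assumption: lowercase ASCII words, optionally with one apostrophe ("don't").
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def count_words(lines: Iterable[str], min_count: int = 1) -> Dict[str, int]:
    """Count word occurrences across lines, dropping words seen fewer than min_count times."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(WORD_RE.findall(line.lower()))
    return {word: n for word, n in counts.items() if n >= min_count}


def write_frequency_json(lines: Iterable[str], out_path: str, min_count: int = 1) -> Dict[str, int]:
    """Write the word-to-count mapping to out_path as JSON and return it."""
    freq = count_words(lines, min_count)
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(freq, fh, ensure_ascii=False, sort_keys=True)
    return freq
```

Because the input is any iterable of lines, a large corpus can be streamed from an open file handle instead of being loaded into memory. Raising `min_count` is one way to filter out rare misspellings, though the right threshold depends on the corpus size.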