About
In natural language processing, tokens can be things like:
- words,
- numbers,
- acronyms,
- word roots,
- or fixed-length character strings.
A token is the result of parsing (tokenizing) a document down to its atomic elements, generally those of a language.
The tokens are then searchable.
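As a rough illustration of the idea above, the following Python sketch splits text into word and number tokens, produces fixed-length character strings, and builds a small lookup table so that tokens become searchable. The regular expression, the function names (`tokenize`, `char_ngrams`, `build_index`) and the sample documents are all assumptions made for this example; real tokenizers apply language-specific word break rules.

```python
import re
from collections import defaultdict

# Assumed pattern: a token is a run of letters or digits,
# optionally joined by inner apostrophes or hyphens.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*")


def tokenize(text):
    """Split text into lowercase word and number tokens."""
    return [match.group(0).lower() for match in TOKEN_PATTERN.finditer(text)]


def char_ngrams(text, n):
    """Return the fixed-length character strings (n-grams) of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


# Hypothetical sample documents.
documents = {
    1: "NASA launched 2 rockets.",
    2: "The rockets reached orbit.",
}
index = build_index(documents)

print(tokenize(documents[1]))    # ['nasa', 'launched', '2', 'rockets']
print(char_ngrams("token", 3))   # ['tok', 'oke', 'ken']
print(sorted(index["rockets"]))  # [1, 2]
```

Searching then amounts to tokenizing the query the same way and looking each token up in the index.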
Management
Create
See Natural Language Processing - (Tokenization|Parser|Text Segmentation|Word Break rules|Text Analysis)
Tokenization
See Natural Language Processing - (Tokenization|Parser|Text Segmentation|Word Break rules|Text Analysis)
Stemming
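Stemming reduces a token to a word root, one of the token types listed earlier. The sketch below is a deliberately naive suffix stripper, not an established algorithm such as Porter's; the suffix list, the minimum root length of 3 and the name `naive_stem` are assumptions chosen only for illustration.

```python
# Assumed suffix list, checked in order (longer suffixes first).
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token):
    """Strip the first matching suffix, keeping a root of at least 3 characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[:-len(suffix)]
    return token


print(naive_stem("rockets"))   # rocket
print(naive_stem("launched"))  # launch
print(naive_stem("bus"))       # bus (root would be too short, so unchanged)
```

A stripper this simple produces roots that are not always dictionary words (for example "parsing" becomes "pars"), which is acceptable when the roots are only used as search keys.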
Visualize
- https://github.com/wooorm/common-words - Rare word visualization