Index created by elasticlunr-rs doesn't work with elasticlunr.js for characters that can't be represented by a single UTF-16 Code Unit #53

Sunshine40 · 2024-06-05T08:25:43Z

Lines 40 to 42 in 29d97e4

    
    fn add_token(&mut self, doc_ref: &str, token: &str, term_freq: f64) {
        let mut iter = token.chars();
        if let Some(character) = iter.next() {

During index building, elasticlunr-rs iterates over the contents of the token (a &str) one Unicode scalar value at a time.

The JS library, by contrast, does it this way:

elasticlunr.InvertedIndex.prototype.addToken = function (token, tokenInfo, root) {
  var root = root || this.root,
      idx = 0;

  while (idx <= token.length - 1) {
    var key = token[idx];

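The JS string is actually iterated in UTF-16 code units. A single code unit is an entire character for English, most alphabetic text, and common Chinese characters, but not for most emoji or rare Chinese characters, which take two code units (a surrogate pair). The two libraries therefore key the index differently for those characters.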
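To make the mismatch concrete, here is a minimal sketch using only the Rust standard library. The helper names `keys_by_scalar_value` and `keys_by_utf16_unit` are hypothetical and are not part of elasticlunr-rs; they only mimic the two ways of walking a token described above.

```rust
/// Keys seen when a token is walked by Unicode scalar value,
/// which is what `str::chars` yields (the elasticlunr-rs side).
/// Hypothetical helper, for illustration only.
fn keys_by_scalar_value(token: &str) -> Vec<char> {
    token.chars().collect()
}

/// Keys seen when a token is walked by UTF-16 code unit,
/// which is what `token[idx]` yields on a JS string (the elasticlunr.js side).
/// Hypothetical helper, for illustration only.
fn keys_by_utf16_unit(token: &str) -> Vec<u16> {
    token.encode_utf16().collect()
}
```

For "abc" and "中文" both helpers return the same number of keys. For "😀" (U+1F600) the first returns one key while the second returns two, the surrogates 0xD83D and 0xDE00, so a lookup that walks code units will not find a node that was stored under a single scalar value.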

Related issue with mdBook.


ImUrX · 2025-03-19T05:19:27Z

3.0.3 should probably be yanked for now as it breaks mdbook

mattico added bug help wanted labels Jun 15, 2024
