Skip to content

jannisbecker/ve-rs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Ve for Rust

This is a port of the ruby gem https://github.com/Kimtaro/ve to Rust. 🦀

The Rust version is meant to be used with https://github.com/daac-tools/vibrato/, a great and blazingly fast mecab-compatible tokenizer, and an IPADIC dictionary which can be found in the same repo (under Releases).

(Support for other tokenizer crates like Lindera planned as well, should be easy)

I'm trying to mostly stay close to the original codebase, and use Rust ways and idioms where applicable.

In the future I'm planning to add tests comparing outputs from the ruby and rust version, making sure there's no unexpected differences in logic.

Getting started

You can play around with the library as-is simply by cloning the repo and cargo running it. This will tokenize an example string using vibrato and then postprocess the tokens to return a more meaningful array of words.

The example code also shows in a simple way how to use this crate in your own application, provided that you're working with vibrato for tokenization.

About

A Rust port of the popular linguistic framework ve https://github.com/Kimtaro/ve

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages