Add the FastSS and Levenshtein modules to docs (#3279)
* fix TFIDF docs

* add the FastSS and Levenshtein modules to docs

* add doc source for FastSS
piskvorky authored Dec 24, 2021
1 parent 7d7bb84 commit 8b8203d
Showing 6 changed files with 45 additions and 18 deletions.
2 changes: 2 additions & 0 deletions docs/src/apiref.rst
@@ -60,6 +60,8 @@ Modules:
similarities/termsim
similarities/annoy
similarities/nmslib
+similarities/levenshtein
+similarities/fastss
test/utils
topic_coherence/aggregation
topic_coherence/direct_confirmation_measure
34 changes: 17 additions & 17 deletions docs/src/auto_examples/index.rst
@@ -71,7 +71,7 @@ Understanding this functionality is vital for using gensim effectively.

.. raw:: html

-<div class="sphx-glr-thumbcontainer" tooltip="Introduces transformations and demonstrates their use on a toy corpus.">
+<div class="sphx-glr-thumbcontainer" tooltip="Introduces transformations and demonstrates their use on a toy corpus. ">

.. only:: html

@@ -92,7 +92,7 @@ Understanding this functionality is vital for using gensim effectively.

.. raw:: html

-<div class="sphx-glr-thumbcontainer" tooltip="Demonstrates querying a corpus for similar documents.">
+<div class="sphx-glr-thumbcontainer" tooltip="Demonstrates querying a corpus for similar documents. ">

.. only:: html

@@ -169,14 +169,14 @@ Learning-oriented lessons that introduce a particular gensim feature, e.g. a mod

.. raw:: html

-<div class="sphx-glr-thumbcontainer" tooltip="Introduces Gensim&#x27;s fastText model and demonstrates its use on the Lee Corpus.">
+<div class="sphx-glr-thumbcontainer" tooltip="Introduces Gensim&#x27;s EnsembleLda model">

.. only:: html

-.. figure:: /auto_examples/tutorials/images/thumb/sphx_glr_run_fasttext_thumb.png
-:alt: FastText Model
+.. figure:: /auto_examples/tutorials/images/thumb/sphx_glr_run_ensemblelda_thumb.png
+:alt: Ensemble LDA

-:ref:`sphx_glr_auto_examples_tutorials_run_fasttext.py`
+:ref:`sphx_glr_auto_examples_tutorials_run_ensemblelda.py`

.. raw:: html

@@ -186,18 +186,18 @@ Learning-oriented lessons that introduce a particular gensim feature, e.g. a mod
.. toctree::
:hidden:

-/auto_examples/tutorials/run_fasttext
+/auto_examples/tutorials/run_ensemblelda

.. raw:: html

-<div class="sphx-glr-thumbcontainer" tooltip="Introduces Gensim&#x27;s EnsembleLda model">
+<div class="sphx-glr-thumbcontainer" tooltip="Introduces Gensim&#x27;s fastText model and demonstrates its use on the Lee Corpus. ">

.. only:: html

-.. figure:: /auto_examples/tutorials/images/thumb/sphx_glr_run_ensemblelda_thumb.png
-:alt: Ensemble LDA
+.. figure:: /auto_examples/tutorials/images/thumb/sphx_glr_run_fasttext_thumb.png
+:alt: FastText Model

-:ref:`sphx_glr_auto_examples_tutorials_run_ensemblelda.py`
+:ref:`sphx_glr_auto_examples_tutorials_run_fasttext.py`

.. raw:: html

@@ -207,11 +207,11 @@ Learning-oriented lessons that introduce a particular gensim feature, e.g. a mod
.. toctree::
:hidden:

-/auto_examples/tutorials/run_ensemblelda
+/auto_examples/tutorials/run_fasttext

.. raw:: html

-<div class="sphx-glr-thumbcontainer" tooltip="Introduces the Annoy library for similarity queries on top of vectors learned by Word2Vec.">
+<div class="sphx-glr-thumbcontainer" tooltip="Introduces the Annoy library for similarity queries on top of vectors learned by Word2Vec. ">

.. only:: html

@@ -309,7 +309,7 @@ These **goal-oriented guides** demonstrate how to **solve a specific problem** u

.. raw:: html

-<div class="sphx-glr-thumbcontainer" tooltip="Demonstrates simple and quick access to common corpora and pretrained models.">
+<div class="sphx-glr-thumbcontainer" tooltip="Demonstrates simple and quick access to common corpora and pretrained models. ">

.. only:: html

@@ -330,7 +330,7 @@ These **goal-oriented guides** demonstrate how to **solve a specific problem** u

.. raw:: html

-<div class="sphx-glr-thumbcontainer" tooltip="How to author documentation for Gensim.">
+<div class="sphx-glr-thumbcontainer" tooltip="How to author documentation for Gensim. ">

.. only:: html

@@ -447,13 +447,13 @@ Blog posts, tutorial videos, hackathons and other useful Gensim resources, from
.. container:: sphx-glr-download sphx-glr-download-python
-:download:`Download all examples in Python source code: auto_examples_python.zip </auto_examples/auto_examples_python.zip>`
+:download:`Download all examples in Python source code: auto_examples_python.zip <//Volumes/work/workspace/gensim/trunk/docs/src/auto_examples/auto_examples_python.zip>`
.. container:: sphx-glr-download sphx-glr-download-jupyter
-:download:`Download all examples in Jupyter notebooks: auto_examples_jupyter.zip </auto_examples/auto_examples_jupyter.zip>`
+:download:`Download all examples in Jupyter notebooks: auto_examples_jupyter.zip <//Volumes/work/workspace/gensim/trunk/docs/src/auto_examples/auto_examples_jupyter.zip>`
.. only:: html
8 changes: 8 additions & 0 deletions docs/src/similarities/fastss.rst
@@ -0,0 +1,8 @@
+:mod:`similarities.fastss` -- Fast Levenshtein edit distance
+==================================================================
+
+.. automodule:: gensim.similarities.fastss
+    :synopsis: Fast fuzzy search between strings, using the Levenshtein edit distance
+    :members:
+    :inherited-members:

8 changes: 8 additions & 0 deletions docs/src/similarities/levenshtein.rst
@@ -0,0 +1,8 @@
+:mod:`similarities.levenshtein` -- Fast soft-cosine semantic similarity search
+==============================================================================
+
+.. automodule:: gensim.similarities.levenshtein
+    :synopsis: Fast fuzzy search between strings, using the Soft-Cosine Semantic Similarity
+    :members:
+    :inherited-members:

9 changes: 9 additions & 0 deletions gensim/similarities/fastss.pyx
@@ -137,6 +137,15 @@ def bytes2set(b):


 class FastSS:
+    """
+    Fast implementation of FastSS (Fast Similarity Search): https://fastss.csg.uzh.ch/
+    FastSS enables fuzzy search of a dynamic query (a word, string) against a static
+    dictionary (a set of words, strings). The "fuzziness" is configurable by means
+    of a maximum edit distance (Levenshtein) between the query string and any of the
+    dictionary words.
+    """

     def __init__(self, words=None, max_dist=2):
         """
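Editor's illustration (not part of the commit): the docstring added above describes matching a query against a fixed dictionary, keeping only words within a maximum Levenshtein distance. The sketch below shows that behaviour with a brute-force scan. It is not the FastSS index and does not use the gensim API; the names `edit_distance` and `fuzzy_search` are hypothetical, and only the `max_dist=2` default is borrowed from the `__init__` signature shown in the diff.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic programme."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_search(query, words, max_dist=2):
    """Group dictionary words by their distance to query, up to max_dist.

    Hypothetical helper: a linear scan over the dictionary, shown only to
    make the "maximum edit distance" idea concrete.
    """
    matches = {d: [] for d in range(max_dist + 1)}
    for word in words:
        d = edit_distance(query, word)
        if d <= max_dist:
            matches[d].append(word)
    return matches
```

For example, `fuzzy_search("helo", ["hello", "help", "world"], max_dist=1)` keeps "hello" and "help" at distance 1 and drops "world". A linear scan like this costs one distance computation per dictionary word, which is the cost an index such as FastSS is meant to avoid.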
2 changes: 1 addition & 1 deletion gensim/similarities/levenshtein.py
@@ -29,7 +29,7 @@ class LevenshteinSimilarityIndex(TermSimilarityIndex):
"Levenshtein similarity" is a modification of the Levenshtein (edit) distance,
defined in [charletetal17]_.
-This implementation uses the FastSS neighbourhood algorithm
+This implementation uses the :class:`~gensim.similarities.fastss.FastSS` algorithm
for fast kNN nearest-neighbor retrieval.
Parameters
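Editor's illustration (not part of the commit): the docstring above calls "Levenshtein similarity" a modification of the edit distance but does not spell it out here. The sketch below shows one way an edit distance can be turned into a similarity score. The formula is an assumption about the general shape of such a score: it normalises the distance by the longer string's length, then applies a multiplicative factor `alpha` and an exponent `beta`. It is not taken from this diff; see [charletetal17]_ and the module source for the authoritative definition. The function names and the default values of 1.0 are hypothetical placeholders.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance: insert, delete and substitute each cost 1."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,
                current[j - 1] + 1,
                previous[j - 1] + (ca != cb),
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(t1, t2, alpha=1.0, beta=1.0):
    """Assumed form: alpha * (1 - distance / max_length) ** beta.

    alpha and beta are placeholder defaults chosen for readability,
    not values taken from the library.
    """
    max_len = max(len(t1), len(t2))
    if max_len == 0:
        return alpha  # two empty strings are treated as identical
    return alpha * (1.0 - levenshtein(t1, t2) / max_len) ** beta
```

With these placeholder defaults, identical terms score 1.0, terms with no characters in common score 0.0, and a larger `beta` penalises distant pairs more sharply.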
