update readme & add ollama vignette #52

Merged 3 commits on Oct 25, 2024
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -4,6 +4,7 @@
^.*\.Rproj$
^CODE_OF_CONDUCT\.md$
^CONTRIBUTING.md$
^Dockerfile$
^LICENSE\.md$
^README\.Rmd$
^README\.md$
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: pkgmatch
Title: Find R Packages Matching Either Descriptions or Other R Packages
Version: 0.4.0.065
Version: 0.4.0.068
Authors@R: c(
person("Mark", "Padgham", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-2172-5265")),
36 changes: 10 additions & 26 deletions README.Rmd
@@ -26,9 +26,15 @@
packages currently on [CRAN](https://cran.r-project.org).

## Installation

The easiest way to install this package is via the [associated
`r-universe`](https://ropensci-review-tools.r-universe.dev/ui#builds). As
shown there, simply enable the universe with
This package relies on a locally-running instance of
[ollama](https://ollama.com), which must be installed before this package
can be used. Procedures for setting it up are described in a separate
vignette.

Once ollama is running, the easiest way to install this package is via the
[associated
`r-universe`](https://ropensci-review-tools.r-universe.dev/ui#builds). As shown
there, simply enable the universe with

```{r options, eval = FALSE}
options (repos = c (
@@ -63,29 +69,7 @@
The package takes input either from a text description or local path to an R
package, and finds similar packages based on both Large Language Model (LLM)
embeddings, and more traditional text and code matching algorithms. The LLM
embeddings require a locally-running instance of [ollama](https://ollama.com),
as described in the following sub-section.


## Setting up the LLM embeddings

This package does not access LLM embeddings through external APIs, for reasons
explained in
[`vignette("why-local-lllms")`](https://docs.ropensci.org/pkgmatch/articles/why-local-llms.html).
The LLM embeddings are extracted from a locally-running instance of
[ollama](https://ollama.com). That means you need to download and install
ollama on your own computer in order to use this package. Once downloaded,
ollama can be started by calling `ollama serve`. The particular models used to
extract the embeddings will be automatically downloaded by this package if
needed, or you can do this manually by running the following two commands (in a
system console; not in R):

``` bash
ollama pull jina/jina-embeddings-v2-base-en
ollama pull ordis/jina-embeddings-v2-base-code
```

You'll likely need to wait up to half an hour or more for the models to
download before proceeding.
as described in a separate vignette.

## Using the `pkgmatch` package

41 changes: 13 additions & 28 deletions README.md
@@ -16,7 +16,13 @@
from all packages currently on [CRAN](https://cran.r-project.org).

## Installation

The easiest way to install this package is via the [associated
This package relies on a locally-running instance of
[ollama](https://ollama.com), which must be installed before this
package can be used. Procedures for setting it up are described in a
separate vignette.

Once ollama is running, the easiest way to install this package is via
the [associated
`r-universe`](https://ropensci-review-tools.r-universe.dev/ui#builds).
As shown there, simply enable the universe with

@@ -53,28 +59,7 @@
The package takes input either from a text description or local path to
an R package, and finds similar packages based on both Large Language
Model (LLM) embeddings, and more traditional text and code matching
algorithms. The LLM embeddings require a locally-running instance of
[ollama](https://ollama.com), as described in the following sub-section.

## Setting up the LLM embeddings

This package does not access LLM embeddings through external APIs, for
reasons explained in
[`vignette("why-local-lllms")`](https://docs.ropensci.org/pkgmatch/articles/why-local-llms.html).
The LLM embeddings are extracted from a locally-running instance of
[ollama](https://ollama.com). That means you need to download and
install ollama on your own computer in order to use this package. Once
downloaded, ollama can be started by calling `ollama serve`. The
particular models used to extract the embeddings will be automatically
downloaded by this package if needed, or you can do this manually by
running the following two commands (in a system console; not in R):

``` bash
ollama pull jina/jina-embeddings-v2-base-en
ollama pull ordis/jina-embeddings-v2-base-code
```

You’ll likely need to wait up to half an hour or more for the models to
download before proceeding.
[ollama](https://ollama.com), as described in a separate vignette.

## Using the `pkgmatch` package

@@ -117,11 +102,11 @@

``` r
pkgmatch_similar_pkgs (".")
```

## $text
## [1] "rdataretriever" "pkgcheck" "c14bazAAR" "elastic"
## [5] "textreuse"
## [1] "pkgcheck" "rdataretriever" "elastic" "codemetar"
## [5] "robotstxt"
##
## $code
## [1] "autotest" "pkgcheck" "roreviewapi" "rtweet" "srr"
## [1] "autotest" "pkgcheck" "roreviewapi" "dynamite" "cffr"

And the most similar packages in terms of text descriptions include
several general search and retrieval packages, and only [the `pkgcheck`
@@ -136,10 +121,10 @@

``` r
pkgmatch_similar_pkgs (".", corpus = "cran")
```

## $text
## [1] "searcher" "typetracer" "ore" "ehelp" "RWsearch"
## [1] "librarian" "ore" "ehelp" "searcher" "RWsearch"
##
## $code
## [1] "remotes" "RInno" "workflowr" "cffr" "miniCRAN"
## [1] "workflowr" "RInno" "remotes" "pkgload" "miniCRAN"

The `input` parameter can also be a local path to a compressed `.tar.gz`
object downloaded directly from CRAN.
2 changes: 1 addition & 1 deletion codemeta.json
@@ -8,7 +8,7 @@
"codeRepository": "https://github.com/ropensci-review-tools/pkgmatch",
"issueTracker": "https://github.com/ropensci-review-tools/pkgmatch/issues",
"license": "https://spdx.org/licenses/MIT",
"version": "0.4.0.065",
"version": "0.4.0.068",
"programmingLanguage": {
"@type": "ComputerLanguage",
"name": "R",
63 changes: 63 additions & 0 deletions vignettes/ollama.Rmd
@@ -0,0 +1,63 @@
---
title: "ollama"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{ollama}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set (
collapse = TRUE,
comment = "#>"
)
```

The "pkgmatch" package does not access LLM embeddings through external APIs,
for reasons explained in
[`vignette("why-local-llms")`](https://docs.ropensci.org/pkgmatch/articles/why-local-llms.html).
The LLM embeddings are extracted from a locally-running instance of
[ollama](https://ollama.com). That means you need to download and install
ollama on your own computer in order to use this package. The following
sub-sections describe two distinct ways to do this; you generally need to
follow only one of them.

## Local installation

This sub-section describes how to install and run ollama on your local
computer. This may not be possible for everybody, in which case the following
sub-section describes how to run ollama within a docker container.

General download instructions are given at https://ollama.com/download. Once
downloaded, ollama can be started by calling `ollama serve &`, where the final
`&` starts the process in the background.
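That start-and-wait step can be sketched as follows. This is an illustrative sketch, not part of the package: it assumes `ollama` and `curl` are on your PATH, and polls ollama's default port 11434 to detect when the server is ready.

```shell
# Sketch: start ollama in the background, then poll its default HTTP
# endpoint (port 11434) until it responds, or give up after 30 seconds.
# Prints a message instead if ollama is not installed.
if command -v ollama >/dev/null 2>&1; then
    ollama serve &
    for i in $(seq 1 30); do
        if curl -s http://localhost:11434 >/dev/null 2>&1; then
            echo "ollama is up"
            break
        fi
        sleep 1
    done
else
    echo "ollama not installed"
fi
```

The guard around `command -v ollama` means the snippet degrades gracefully on machines where ollama is not yet installed.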

The particular models used to extract the embeddings will be automatically
downloaded by this package if needed, or you can do this manually by running
the following two commands (in a system console; not in R):

``` bash
ollama pull jina/jina-embeddings-v2-base-en
ollama pull ordis/jina-embeddings-v2-base-code
```

Downloading the models may take tens of minutes. Once they have
downloaded, both models should appear in the
output of `ollama list`.
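As a quick sanity check, the following sketch (which assumes `ollama` is on your PATH) greps the output of `ollama list` for the two model names pulled above:

```shell
# Sketch: confirm both embedding models appear in `ollama list`.
# Prints a message instead if ollama is not installed, or if the
# models have not yet been pulled.
if command -v ollama >/dev/null 2>&1; then
    ollama list | grep -E "jina-embeddings-v2-base-(en|code)" \
        || echo "models not yet pulled"
else
    echo "ollama not installed"
fi
```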

## Docker

This package comes with a "Dockerfile" containing all code needed to build and
run the necessary ollama models within a docker container. To do this, download
the [Dockerfile from this
link](https://github.com/ropensci-review-tools/pkgmatch/blob/main/Dockerfile).
Then from the same directory as that file, run these lines:

``` bash
docker build . -t ollama-models
docker run --rm -p 11434:11434 ollama-models &
```

The running container can be stopped by calling `docker stop` followed by the
"CONTAINER ID" listed in the output of `docker ps`.
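Those two commands can be combined into a single sketch, assuming docker is installed and the container was started from the `ollama-models` image built above:

```shell
# Sketch: find the ID of the container started from the
# 'ollama-models' image and stop it. Does nothing harmful if no such
# container is running, or if docker is unavailable.
cid=$(docker ps -q --filter "ancestor=ollama-models" 2>/dev/null || true)
if [ -n "$cid" ]; then
    docker stop "$cid"
else
    echo "no running ollama-models container found"
fi
```

The `--filter ancestor=ollama-models` option restricts `docker ps` to containers started from that image, and `-q` prints only their IDs.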