Merge pull request #89 from ropensci-review-tools/review
updates in response to pre-review for #88
mpadge authored Dec 5, 2024
2 parents 52eaf4e + 7340667 commit f05026a
Showing 9 changed files with 101 additions and 40 deletions.
5 changes: 3 additions & 2 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: pkgmatch
Title: Find R Packages Matching Either Descriptions or Other R Packages
Version: 0.4.2.015
Version: 0.4.2.021
Authors@R: c(
person("Mark", "Padgham", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-2172-5265")),
@@ -11,11 +11,12 @@ License: MIT + file LICENSE
URL: https://docs.ropensci.org/pkgmatch/,
https://github.com/ropensci-review-tools/pkgmatch
BugReports: https://github.com/ropensci-review-tools/pkgmatch/issues
Requires: R (>= 4.1.0)
Imports:
brio,
checkmate,
cli,
curl,
curl (>= 6.0.0),
dplyr,
fs,
httr2,
12 changes: 11 additions & 1 deletion R/bm25.R
@@ -12,7 +12,11 @@
#' rOpenSci corpus will be automatically downloaded and used.
#' @param corpus If `txt` is not specified, data for nominated corpus will be
#' downloaded to local cache directory, and BM25 values calculated against
#' those. Must be one of "ropensci", "ropensci-fns".
#' those. Must be one of "ropensci", "ropensci-fns", or "cran". Note that the
#' "ropensci-fns" corpus contains entries for every single function of every
#' rOpenSci package, and the resulting BM25 values can be used to determine the
#' best-matching function. The other two corpora are package-based, and the
#' results can be used to find the best-matching package.
#'
#' @return A `data.frame` of package names and 'BM25' measures against text
#' from whole packages both with and without function descriptions.
@@ -84,6 +88,12 @@ m_pkgmatch_bm25 <- memoise::memoise (pkgmatch_bm25_internal)
#' Calculate a "BM25" index from function-call frequencies between a local R
#' package and all packages in specified corpus.
#'
#' Note that the results of this function are entirely different from
#' \link{pkgmatch_bm25} with `corpus = "ropensci-fns"`. The latter returns BM25
#' values from text descriptions of all functions in all rOpenSci packages,
#' whereas this function returns BM25 values based on frequencies of function
#' calls within packages.
#'
#' @param path Local path to source code of an R package.
#' @param corpus One of "ropensci" or "cran"
#' @return A `data.frame` of two columns:
2 changes: 1 addition & 1 deletion R/similar-fns.R
@@ -1,6 +1,6 @@
#' Identify R functions best matching a given input string
#'
#' @description Function matching is only available for Only applies to
#' @description Function matching is only available for
#' functions from the corpus of rOpenSci packages.
#'
#' @inheritParams pkgmatch_similar_pkgs
28 changes: 17 additions & 11 deletions codemeta.json
@@ -8,7 +8,7 @@
"codeRepository": "https://github.com/ropensci-review-tools/pkgmatch",
"issueTracker": "https://github.com/ropensci-review-tools/pkgmatch/issues",
"license": "https://spdx.org/licenses/MIT",
"version": "0.4.2.015",
"version": "0.4.2.021",
"programmingLanguage": {
"@type": "ComputerLanguage",
"name": "R",
@@ -229,6 +229,7 @@
"@type": "SoftwareApplication",
"identifier": "curl",
"name": "curl",
"version": ">= 6.0.0",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
@@ -286,6 +287,11 @@
"sameAs": "https://CRAN.R-project.org/package=memoise"
},
"9": {
"@type": "SoftwareApplication",
"identifier": "methods",
"name": "methods"
},
"10": {
"@type": "SoftwareApplication",
"identifier": "pbapply",
"name": "pbapply",
@@ -297,7 +303,7 @@
},
"sameAs": "https://CRAN.R-project.org/package=pbapply"
},
"10": {
"11": {
"@type": "SoftwareApplication",
"identifier": "Rcpp",
"name": "Rcpp",
@@ -309,7 +315,7 @@
},
"sameAs": "https://CRAN.R-project.org/package=Rcpp"
},
"11": {
"12": {
"@type": "SoftwareApplication",
"identifier": "rvest",
"name": "rvest",
@@ -321,7 +327,7 @@
},
"sameAs": "https://CRAN.R-project.org/package=rvest"
},
"12": {
"13": {
"@type": "SoftwareApplication",
"identifier": "tibble",
"name": "tibble",
@@ -333,7 +339,7 @@
},
"sameAs": "https://CRAN.R-project.org/package=tibble"
},
"13": {
"14": {
"@type": "SoftwareApplication",
"identifier": "tidyr",
"name": "tidyr",
@@ -345,7 +351,7 @@
},
"sameAs": "https://CRAN.R-project.org/package=tidyr"
},
"14": {
"15": {
"@type": "SoftwareApplication",
"identifier": "tokenizers",
"name": "tokenizers",
@@ -357,7 +363,7 @@
},
"sameAs": "https://CRAN.R-project.org/package=tokenizers"
},
"15": {
"16": {
"@type": "SoftwareApplication",
"identifier": "treesitter",
"name": "treesitter",
@@ -369,7 +375,7 @@
},
"sameAs": "https://CRAN.R-project.org/package=treesitter"
},
"16": {
"17": {
"@type": "SoftwareApplication",
"identifier": "treesitter.r",
"name": "treesitter.r",
@@ -381,7 +387,7 @@
},
"sameAs": "https://CRAN.R-project.org/package=treesitter.r"
},
"17": {
"18": {
"@type": "SoftwareApplication",
"identifier": "vctrs",
"name": "vctrs",
@@ -393,15 +399,15 @@
},
"sameAs": "https://CRAN.R-project.org/package=vctrs"
},
"18": {
"19": {
"@type": "SoftwareApplication",
"identifier": "R",
"name": "R",
"version": ">= 3.5.0"
},
"SystemRequirements": {}
},
"fileSize": "3709.612KB",
"fileSize": "492.3KB",
"readme": "https://github.com/ropensci-review-tools/pkgmatch/blob/main/README.md",
"contIntegration": [
"https://github.com/ropensci-review-tools/pkgmatch/actions?query=workflow%3AR-CMD-check",
6 changes: 5 additions & 1 deletion man/pkgmatch_bm25.Rd

7 changes: 5 additions & 2 deletions man/pkgmatch_bm25_fn_calls.Rd

2 changes: 1 addition & 1 deletion man/pkgmatch_similar_fns.Rd

26 changes: 26 additions & 0 deletions tests/README.md
@@ -0,0 +1,26 @@
# Testing this package

Note that all tests here are mocked, and the entire test suite can be run
without even installing `ollama`, which is otherwise required to actually use
the package itself.

Tests can be run using any common testing functions, such as:

- `devtools::test()`
- `testthat::test_local()`
- `covr::package_coverage()`

Note that test coverage reports currently [exclude the following
files](https://github.com/ropensci-review-tools/pkgmatch/blob/52eaf4e841f627c315619e73300cb8cd175af929/.github/workflows/test-coverage.yaml#L41-L440o):

- [`R/ollama.R`](https://github.com/ropensci-review-tools/pkgmatch/blob/main/R/ollama.R),
which contains functions to ascertain the status of a locally running instance
of [`ollama`](https://ollama.com).
- [`R/browse.R`](https://github.com/ropensci-review-tools/pkgmatch/blob/main/R/browse.R),
which contains a single function used to open URLs of results from this package
in a local browser.

Running `covr::package_coverage()` locally without excluding these two files
will thus yield lower package coverage than the values reported from
[`codecov`](https://app.codecov.io/gh/ropensci-review-tools/pkgmatch), yet
still over 75%.
53 changes: 32 additions & 21 deletions vignettes/ollama.Rmd
@@ -1,8 +1,8 @@
---
title: "ollama"
title: "Before you begin: ollama installation"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{ollama}
%\VignetteIndexEntry{Before you begin: ollama installation}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
@@ -14,20 +14,20 @@ knitr::opts_chunk$set (
)
```

The "pkgmatch" package does not access language model (LM) embeddings through external APIs,
for reasons explained in
[`vignette("why-local-lms")`](https://docs.ropensci.org/pkgmatch/articles/why-local-lms.html).
The LM embeddings are extracted from a locally-running instance of
[ollama](https://ollama.com). That means you need to download and install
ollama on your own computer in order to use this package. The following
sub-sections describe two distinct ways to do this. You will generally need to
The "pkgmatch" package works through a locally installed and running instance
of [ollama](https://ollama.com), for reasons described in
[`vignette("why-local-lms")`](https://docs.ropensci.org/pkgmatch/articles/why-local-lms.html).
To use "pkgmatch", [ollama](https://ollama.com) must therefore be installed and
running. This vignette describes how to do that. There are two distinct
ways, each described in a separate sub-section. You will generally need to
follow one and only one of these sections.

## Local ollama API endpoint

The "pkgmatch" package presumes by default that the local ollama instance has
an API endpoint at "127.0.0.1:11434". If this is not the case, alternative
endpoints can be set using [the `ollama_set_url()`
Regardless of which method you use to install and run
[ollama](https://ollama.com), the "pkgmatch" package presumes by default that
the local ollama instance has an API endpoint at "127.0.0.1:11434". If this is
not the case, alternative endpoints can be set using [the `ollama_set_url()`
function](https://docs.ropensci.org/pkgmatch/reference/ollama_set_url.html) or
by setting the environment variable `OLLAMA_HOST` before pkgmatch is loaded.
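
As a minimal sketch of the environment-variable route, the following could be
run in the shell from which R is then started. The port `11435` is purely an
assumed example of a non-default endpoint, and the `host:port` form simply
mirrors the default given above; adjust both to wherever your ollama instance
is actually listening.

```bash
# Assumed example: ollama listening on a non-default port.
# Replace the value with the host and port of your own instance.
export OLLAMA_HOST="127.0.0.1:11435"

# Confirm the value that R will inherit when started from this shell.
echo "OLLAMA_HOST is set to: ${OLLAMA_HOST}"
```

Because the variable needs to be set before pkgmatch is loaded, export it
before starting R rather than after the package has been attached.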

@@ -39,26 +39,37 @@ sub-section describes how to run ollama within a docker container.

General download instructions are given at https://ollama.com/download. Once
downloaded, ollama can be started by calling `ollama serve &`, where the final
`&` starts the process in the background.

The particular models used to extract the embeddings will be automatically
downloaded by this package if needed, or you can do this manually by running
`&` starts the process in the background. Alternatively, omit the `&` to run as
a foreground process. You can then interact with that process elsewhere (for
example in a different terminal shell), and you'll see full logs in the
original shell.

Once ollama is running with `ollama serve`, the particular models used here
will be automatically downloaded by this package when needed. Alternatively,
you can do this manually before using the package for the first time by running
the following two commands (in a system console; not in R):

``` bash
ollama pull jina/jina-embeddings-v2-base-en
ollama pull ordis/jina-embeddings-v2-base-code
```

You'll likely need to wait a few tens of minutes for the models to
download before proceeding. Once downloaded, both models should appear in the
output of `ollama list`.
You'll likely need to wait a few tens of minutes for the models to download
before proceeding. Once downloaded, you should be able to run `ollama list`,
and see both models appear in the output.
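
That check can also be scripted. The sketch below defines a small helper,
`missing_models` (a hypothetical name, not part of ollama or pkgmatch), which
reads the output of `ollama list` from standard input and prints whichever of
the two models does not appear in it. It assumes only that the name of each
installed model appears somewhere in the listing.

```bash
# Print the names of any required models that are absent from an
# `ollama list` listing supplied on standard input.
# `missing_models` is a hypothetical helper name used for illustration.
missing_models () {
    listing=$(cat)
    for model in \
        jina/jina-embeddings-v2-base-en \
        ordis/jina-embeddings-v2-base-code
    do
        # -F matches the model name as a fixed string, not a pattern.
        if ! printf '%s\n' "$listing" | grep -qF -- "$model"; then
            printf '%s\n' "$model"
        fi
    done
}
```

Run it as `ollama list | missing_models`. Empty output means both models are
available; otherwise each printed name can be passed to `ollama pull`.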

## Docker

If you cannot, or would rather not, install [`ollama`](https://ollama.com) on your local
machine, you can build a Docker container to download and run
[`ollama`](https://ollama.com). For this, you will need to have [Docker
installed](https://docs.docker.com/get-started/get-docker/). (Docker is not
required for the local installation procedure described above; only for this
alternative procedure.)

This package comes with a "Dockerfile" containing all code needed to build and
run the necessary ollama models within a docker container. This can be either
built locally, or downloaded from the GitHub container registry.
run the necessary ollama models within a Docker container. This can be either
built locally, or downloaded from the GitHub container registry.

### Building the Docker container locally
