update readme & add ollama vignette #52

Merged 3 commits on Oct 25, 2024
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -4,6 +4,7 @@
^.*\.Rproj$
^CODE_OF_CONDUCT\.md$
^CONTRIBUTING.md$
^Dockerfile$
^LICENSE\.md$
^README\.Rmd$
^README\.md$
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: pkgmatch
Title: Find R Packages Matching Either Descriptions or Other R Packages
Version: 0.4.0.065
Version: 0.4.0.068
Authors@R: c(
person("Mark", "Padgham", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-2172-5265")),
36 changes: 10 additions & 26 deletions README.Rmd
@@ -26,9 +26,15 @@
packages currently on [CRAN](https://cran.r-project.org).

## Installation

The easiest way to install this package is via the [associated
`r-universe`](https://ropensci-review-tools.r-universe.dev/ui#builds). As
shown there, simply enable the universe with
This package relies on a locally-running instance of
[ollama](https://ollama.com), which must be installed before this package
can be used. Procedures for setting it up are described in a separate
vignette.

Once ollama is running, the easiest way to install this package is via the
[associated
`r-universe`](https://ropensci-review-tools.r-universe.dev/ui#builds). As shown
there, simply enable the universe with

```{r options, eval = FALSE}
options (repos = c (
@@ -63,29 +69,7 @@
The package takes input either from a text description or local path to an R
package, and finds similar packages based on both Large Language Model (LLM)
embeddings, and more traditional text and code matching algorithms. The LLM
embeddings require a locally-running instance of [ollama](https://ollama.com),
as described in the following sub-section.


## Setting up the LLM embeddings

This package does not access LLM embeddings through external APIs, for reasons
explained in
[`vignette("why-local-lllms")`](https://docs.ropensci.org/pkgmatch/articles/why-local-llms.html).
The LLM embeddings are extracted from a locally-running instance of
[ollama](https://ollama.com). That means you need to download and install
ollama on your own computer in order to use this package. Once downloaded,
ollama can be started by calling `ollama serve`. The particular models used to
extract the embeddings will be automatically downloaded by this package if
needed, or you can do this manually by running the following two commands (in a
system console; not in R):

``` bash
ollama pull jina/jina-embeddings-v2-base-en
ollama pull ordis/jina-embeddings-v2-base-code
```

You'll likely need to wait up to half an hour or more for the models to
download before proceeding.
as described in a separate vignette.

## Using the `pkgmatch` package

41 changes: 13 additions & 28 deletions README.md
@@ -16,7 +16,13 @@
from all packages currently on [CRAN](https://cran.r-project.org).

## Installation

The easiest way to install this package is via the [associated
This package relies on a locally-running instance of
[ollama](https://ollama.com), which must be installed before this
package can be used. Procedures for setting it up are described in a
separate vignette.

Once ollama is running, the easiest way to install this package is via
the [associated
`r-universe`](https://ropensci-review-tools.r-universe.dev/ui#builds).
As shown there, simply enable the universe with

@@ -53,28 +59,7 @@
The package takes input either from a text description or local path to
an R package, and finds similar packages based on both Large Language
Model (LLM) embeddings, and more traditional text and code matching
algorithms. The LLM embeddings require a locally-running instance of
[ollama](https://ollama.com), as described in the following sub-section.

## Setting up the LLM embeddings

This package does not access LLM embeddings through external APIs, for
reasons explained in
[`vignette("why-local-lllms")`](https://docs.ropensci.org/pkgmatch/articles/why-local-llms.html).
The LLM embeddings are extracted from a locally-running instance of
[ollama](https://ollama.com). That means you need to download and
install ollama on your own computer in order to use this package. Once
downloaded, ollama can be started by calling `ollama serve`. The
particular models used to extract the embeddings will be automatically
downloaded by this package if needed, or you can do this manually by
running the following two commands (in a system console; not in R):

``` bash
ollama pull jina/jina-embeddings-v2-base-en
ollama pull ordis/jina-embeddings-v2-base-code
```

You’ll likely need to wait up to half an hour or more for the models to
download before proceeding.
[ollama](https://ollama.com), as described in a separate vignette.

## Using the `pkgmatch` package

@@ -117,11 +102,11 @@

``` r
pkgmatch_similar_pkgs (".")
```

## $text
## [1] "rdataretriever" "pkgcheck" "c14bazAAR" "elastic"
## [5] "textreuse"
## [1] "pkgcheck" "rdataretriever" "elastic" "codemetar"
## [5] "robotstxt"
##
## $code
## [1] "autotest" "pkgcheck" "roreviewapi" "rtweet" "srr"
## [1] "autotest" "pkgcheck" "roreviewapi" "dynamite" "cffr"

And the most similar packages in terms of text descriptions include
several general search and retrieval packages, and only [the `pkgcheck`
@@ -136,10 +121,10 @@

``` r
pkgmatch_similar_pkgs (".", corpus = "cran")
```

## $text
## [1] "searcher" "typetracer" "ore" "ehelp" "RWsearch"
## [1] "librarian" "ore" "ehelp" "searcher" "RWsearch"
##
## $code
## [1] "remotes" "RInno" "workflowr" "cffr" "miniCRAN"
## [1] "workflowr" "RInno" "remotes" "pkgload" "miniCRAN"

The `input` parameter can also be a local path to a compressed `.tar.gz`
object downloaded directly from CRAN.
2 changes: 1 addition & 1 deletion codemeta.json
@@ -8,7 +8,7 @@
"codeRepository": "https://github.com/ropensci-review-tools/pkgmatch",
"issueTracker": "https://github.com/ropensci-review-tools/pkgmatch/issues",
"license": "https://spdx.org/licenses/MIT",
"version": "0.4.0.065",
"version": "0.4.0.068",
"programmingLanguage": {
"@type": "ComputerLanguage",
"name": "R",
63 changes: 63 additions & 0 deletions vignettes/ollama.Rmd
@@ -0,0 +1,63 @@
---
title: "ollama"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{ollama}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set (
collapse = TRUE,
comment = "#>"
)
```

The "pkgmatch" package does not access LLM embeddings through external APIs,
for reasons explained in
[`vignette("why-local-llms")`](https://docs.ropensci.org/pkgmatch/articles/why-local-llms.html).
The LLM embeddings are extracted from a locally-running instance of
[ollama](https://ollama.com). That means you need to download and install
ollama on your own computer in order to use this package. The following
sub-sections describe two distinct ways to do this; you generally need to
follow only one of them.

## Local installation

This sub-section describes how to install and run ollama on your local
computer. This may not be possible for everybody, in which case the following
sub-section describes how to run ollama within a docker container.

General download instructions are given at https://ollama.com/download. Once
downloaded, ollama can be started by calling `ollama serve &`, where the final
`&` starts the process in the background.
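That start-and-wait step can be sketched as follows. This is an illustrative sketch, not part of the package: it assumes `ollama` and `curl` are on your PATH, and polls ollama's default port 11434 to detect when the server is ready.

```shell
# Sketch: start ollama in the background, then poll its default HTTP
# endpoint (port 11434) until it responds, or give up after 30 seconds.
# Prints a message instead if ollama is not installed.
if command -v ollama >/dev/null 2>&1; then
    ollama serve &
    for i in $(seq 1 30); do
        if curl -s http://localhost:11434 >/dev/null 2>&1; then
            echo "ollama is up"
            break
        fi
        sleep 1
    done
else
    echo "ollama not installed"
fi
```

The guard around `command -v ollama` means the snippet degrades gracefully on machines where ollama is not yet installed.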

The particular models used to extract the embeddings will be automatically
downloaded by this package if needed, or you can do this manually by running
the following two commands (in a system console; not in R):

``` bash
ollama pull jina/jina-embeddings-v2-base-en
ollama pull ordis/jina-embeddings-v2-base-code
```

Downloading the models may take tens of minutes. Once they have
downloaded, both models should appear in the
output of `ollama list`.
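As a quick sanity check, the following sketch (which assumes `ollama` is on your PATH) greps the output of `ollama list` for the two model names pulled above:

```shell
# Sketch: confirm both embedding models appear in `ollama list`.
# Prints a message instead if ollama is not installed, or if the
# models have not yet been pulled.
if command -v ollama >/dev/null 2>&1; then
    ollama list | grep -E "jina-embeddings-v2-base-(en|code)" \
        || echo "models not yet pulled"
else
    echo "ollama not installed"
fi
```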

## Docker

This package comes with a "Dockerfile" containing all code needed to build and
run the necessary ollama models within a docker container. To do this, download
the [Dockerfile from this
link](https://github.com/ropensci-review-tools/pkgmatch/blob/main/Dockerfile).
Then from the same directory as that file, run these lines:

``` bash
docker build . -t ollama-models
docker run --rm -p 11434:11434 ollama-models &
```

The running container can be stopped by calling `docker stop` followed by the
"CONTAINER ID" listed in the output of `docker ps`.
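Those two commands can be combined into a single sketch, assuming docker is installed and the container was started from the `ollama-models` image built above:

```shell
# Sketch: find the ID of the container started from the
# 'ollama-models' image and stop it. Does nothing harmful if no such
# container is running, or if docker is unavailable.
cid=$(docker ps -q --filter "ancestor=ollama-models" 2>/dev/null || true)
if [ -n "$cid" ]; then
    docker stop "$cid"
else
    echo "no running ollama-models container found"
fi
```

The `--filter ancestor=ollama-models` option restricts `docker ps` to containers started from that image, and `-q` prints only their IDs.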