-
Using the code from the doc site https://llama-node.vercel.app/docs/backends/llama.cpp/inference with the latest llama-node:

```js
// inference.js
import { LLM } from 'llama-node';
import { LLamaCpp } from 'llama-node/dist/llm/llama-cpp.js';

const model = '/Users/linonetwo/Downloads/openbuddy-openllama-13b-v7-q4_K.bin';
const llama = new LLM(LLamaCpp);

const config = {
  modelPath: model,
  enableLogging: true,
  nCtx: 1024,
  seed: 0,
  f16Kv: false,
  logitsAll: false,
  vocabOnly: false,
  useMlock: false,
  embedding: false,
  useMmap: true,
  nGpuLayers: 0,
};

// Prompt asks (in Chinese): "Briefly introduce TiddlyWiki."
const template = `简短地介绍一下Tiddlywiki吧`;
const prompt = `A chat between a user and an assistant.
USER: ${template}
ASSISTANT:`;

const parameters = {
  nThreads: 4,
  nTokPredict: 2048,
  topK: 40,
  topP: 0.1,
  temp: 0.2,
  repeatPenalty: 1,
  prompt,
};

const run = async () => {
  await llama.load(config);
  await llama.createCompletion(parameters, (response) => {
    process.stdout.write(response.token);
  });
};

run();
```
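For reference, here is a small variation on the same flow that accumulates the streamed tokens into a single string and reports failures instead of writing directly to stdout. This is only a sketch: it reuses the `llama`, `config`, and `parameters` definitions from the script above and assumes the same callback shape (each chunk exposing a `token` string), as shown in the doc example.

```js
// Sketch (untested): same flow as inference.js, but collect the streamed
// tokens and surface load/completion errors. Assumes `llama`, `config`, and
// `parameters` are defined as in the script above.
const runAndCollect = async () => {
  let output = '';
  try {
    await llama.load(config);
    await llama.createCompletion(parameters, (response) => {
      output += response.token;
    });
    console.log('\n--- full completion ---\n' + output);
  } catch (err) {
    console.error('inference failed:', err);
  }
};

runAndCollect();
```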
-
The openllama-7b-v5-q5_K.bin model from https://huggingface.co/OpenBuddy/openbuddy-ggml/tree/main throws an error at
`await runnerInstance.load(loadConfig);`
while ggml-vic7b-q5_1.bin loads and runs properly with the same code. @44670 @rujews, do you know what this message means?
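One way to make the backend's error message easier to capture is a minimal repro that isolates the load step. The following is a sketch under assumptions: it mirrors the config from the script above, keeps `enableLogging: true` so the underlying llama.cpp output is printed, and uses a hypothetical local path in place of the real model location.

```js
// Sketch (untested): isolate the load step to surface whatever error the
// backend reports for openllama-7b-v5-q5_K.bin.
import { LLM } from 'llama-node';
import { LLamaCpp } from 'llama-node/dist/llm/llama-cpp.js';

const runnerInstance = new LLM(LLamaCpp);

const loadConfig = {
  modelPath: '/path/to/openllama-7b-v5-q5_K.bin', // hypothetical path, replace with the real one
  enableLogging: true, // keep logging on so llama.cpp's own output is visible
  nCtx: 1024,
  seed: 0,
  f16Kv: false,
  logitsAll: false,
  vocabOnly: false,
  useMlock: false,
  embedding: false,
  useMmap: true,
  nGpuLayers: 0,
};

try {
  await runnerInstance.load(loadConfig);
  console.log('model loaded OK');
} catch (err) {
  // Print the full error object so the failing step is visible.
  console.error('load failed:', err);
}
```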