Some fixes to all of this hope its stable now for 1.4 release
syonfox committed Jan 8, 2023
1 parent fc36ade commit d37ff39
Showing 18 changed files with 2,423 additions and 183 deletions.
3 changes: 2 additions & 1 deletion .github/workflows/node.js.yml
@@ -16,7 +16,7 @@ jobs:

strategy:
matrix:
-node-version: [12.x, 14.x]
+node-version: [12.x, 14.x, 16.x, 18.x]

steps:
- uses: actions/checkout@v2
@@ -26,5 +26,6 @@ jobs:
node-version: ${{ matrix.node-version }}
- run: npm ci
- run: npm run build --if-present
+- run: npm run docs --if-present
- run: npm test

1 change: 1 addition & 0 deletions LICENSE
@@ -1,6 +1,7 @@
MIT License

Copyright (c) 2020 AIDungeon
+Copyright (c) 2023 syonfox

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
25 changes: 15 additions & 10 deletions README.md
@@ -52,7 +52,8 @@ BPE. The returned object includes the following properties:
- `total`: the total number of tokens in the text.
- `unique`: the number of unique tokens in the text.
- `frequencies`: an object containing the frequency of each token in the text.

+- `postions`: an object mapping tokens to positions in the encoded string
+- `tokens`: same as the output to tokens
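
The shape of the object described by the properties above can be pictured with a small sketch. This is not the library's implementation: `sketchTokenStats` is a hypothetical helper that takes an already-encoded array of token ids and builds an object with the same kind of fields. The key is spelled `positions` here, which is an assumption about the intended name.

```javascript
// Hypothetical helper (not part of gpt-3-encoder): derives the statistics
// described above from an array of token ids.
function sketchTokenStats(tokens) {
  const frequencies = {}; // token id -> number of occurrences
  const positions = {};   // token id -> indexes where it occurs
  tokens.forEach((token, index) => {
    frequencies[token] = (frequencies[token] || 0) + 1;
    if (!positions[token]) positions[token] = [];
    positions[token].push(index);
  });
  return {
    total: tokens.length,                      // total number of tokens
    unique: Object.keys(frequencies).length,   // number of distinct tokens
    frequencies,
    positions,
    tokens,                                    // the input array, unchanged
  };
}

const stats = sketchTokenStats([10, 20, 10, 30]);
console.log(stats.total, stats.unique); // prints "4 3"
```

In real use the token array would come from the library's `encode` function; the numbers above are made up for illustration.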
Compatibility

This library is compatible with both Node.js and browser environments, we have used webpack to build /dist/bundle.js 1.5 MB including the data. A compiled version for both environments is included in the package.
@@ -62,6 +63,9 @@ This library was created as a fork of the original GPT-3-Encoder library by lati

## Example

+See browser.html and demo.js
+Note you may need to include it from the appropriate place in node modules / npm package name

```js

import {encode, decode, countTokens, tokenStats} from "gpt-3-encoder"
@@ -90,27 +94,28 @@ console.log('We can decode it back into:\n', decoded)
## Developers

I have added som other examples to the examples folder.
-Please take a look at pakege.json for how to do stuff
+Please take a look at package.json for how to do stuff

```sh
git clone https://github.com/syonfox/GPT-3-Encoder.git

cd GPT-3-Encoder

-npm install
+npm install # install dev deps (docs tests build)

-npm run test
-npm run docs
+npm run test # run tests
+npm run docs # build docs

-npm run browser
-npm run demo
+npm run build # builds it for the browser
+npm run browser # launches demo inf firefox
+npm run demo # runs node.js demo


-less Encoder.js
+less Encoder.js # the main code is here

-firefox ./docs/index.html
+firefox ./docs/index.html # view docs locally

-npm publish --access public
+npm publish --access public # dev publish to npm



3 changes: 1 addition & 2 deletions bpe_ranks.js

Large diffs are not rendered by default.

9 changes: 9 additions & 0 deletions browser.html
@@ -5,6 +5,15 @@
</head>
<body>
<h1>gpt-3-encoder Demo</h1>
+<p>To install with npm:</p>
+<pre class="prettyprint source"><code><span class="pln">npm install </span><span class="lit">@syonfox</span><span class="pun">/</span><span class="pln">gpt</span><span class="pun">-</span><span class="lit">3</span><span class="pun">-</span><span class="pln">encoder</span></code></pre>
+<h2>Usage</h2>
+<a href="https://www.npmjs.com/package/@syonfox/gpt-3-encoder">
+<img src="https://img.shields.io/npm/v/@syonfox/gpt-3-encoder.svg" alt="npm version">
+</a>
+<p><a href="https://syonfox.github.io/GPT-3-Encoder/"><img src="https://img.shields.io/badge/JS%20Docs-Read%20them%20maybe-brightgreen" alt="JSDocs"></a></p>
+<p><img src="https://img.shields.io/github/last-commit/syonfox/GPT-3-Encoder" alt="GitHub last commit"></p>
+<p>Compatible with Node &gt;= 12</p>
<p>Enter some text in the text field below to see how it is encoded and decoded by the gpt-3-encoder library:</p>
<textarea id="input"></textarea>
<button id="encode-button">Encode</button>
3 changes: 0 additions & 3 deletions build_encoder.js
@@ -1,9 +1,6 @@
const fs = require('fs');
const path = require('path');

-const bpe_file = fs.readFileSync(path.join(__dirname, './vocab.bpe'), 'utf-8');
-
-const lines = bpe_file.split('\n');

const encoder = JSON.parse(fs.readFileSync(path.join(__dirname, './encoder.json')));

2 changes: 1 addition & 1 deletion demo.js
@@ -1,7 +1,7 @@
// import {encode, decode, countTokens, tokenStats} from "gpt-3-encoder"
//or

-const {encode, decode, countTokens, tokenStats} = require('./index')
+const {encode, decode, countTokens, tokenStats} = require('../index')

const str = 'This is an example sentence to try encoding out on!'
const encoded = encode(str)
1 change: 0 additions & 1 deletion dist/bundle.js

This file was deleted.

2 changes: 1 addition & 1 deletion docs/Encoder.js.html
@@ -370,7 +370,7 @@ <h2><a href="index.html">Home</a></h2><h3>Global</h3><ul><li><a href="global.htm
<br class="clear">

<footer>
-Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.0</a> on Sat Jan 07 2023 23:11:38 GMT-0500 (Eastern Standard Time)
+Documentation generated by <a href="https://github.com/jsdoc/jsdoc">JSDoc 4.0.0</a> on Sun Jan 08 2023 03:07:22 GMT-0500 (Eastern Standard Time)
</footer>

<script> prettyPrint(); </script>
11 changes: 10 additions & 1 deletion docs/browser.html
@@ -5,6 +5,15 @@
</head>
<body>
<h1>gpt-3-encoder Demo</h1>
+<p>To install with npm:</p>
+<pre class="prettyprint source"><code><span class="pln">npm install </span><span class="lit">@syonfox</span><span class="pun">/</span><span class="pln">gpt</span><span class="pun">-</span><span class="lit">3</span><span class="pun">-</span><span class="pln">encoder</span></code></pre>
+<h2>Usage</h2>
+<a href="https://www.npmjs.com/package/@syonfox/gpt-3-encoder">
+<img src="https://img.shields.io/npm/v/@syonfox/gpt-3-encoder.svg" alt="npm version">
+</a>
+<p><a href="https://syonfox.github.io/GPT-3-Encoder/"><img src="https://img.shields.io/badge/JS%20Docs-Read%20them%20maybe-brightgreen" alt="JSDocs"></a></p>
+<p><img src="https://img.shields.io/github/last-commit/syonfox/GPT-3-Encoder" alt="GitHub last commit"></p>
+<p>Compatible with Node &gt;= 12</p>
<p>Enter some text in the text field below to see how it is encoded and decoded by the gpt-3-encoder library:</p>
<textarea id="input"></textarea>
<button id="encode-button">Encode</button>
@@ -16,7 +25,7 @@ <h1>gpt-3-encoder Demo</h1>
<p>Token Count: <span id="count"></span></p>
<p>Token Stats: <span id="stats"></span></p>

-<script type="application/javascript" src="../browser.js"></script>
+<script type="application/javascript" src="./browser.js"></script>

<script>
const tokens = gpt3encoder.encode('Hello, world!');