Steffen/cleanup #1

Open

wants to merge 386 commits into base: main

Changes from all commits (386 commits)
ec23196
Implement cleanup loop in validator and associated local model store …
Dec 27, 2023
8b59645
Add implementations for storing/retrieving data on chain and in Huggi…
Dec 27, 2023
5730065
Format all files for consistency. (#3)
Dec 27, 2023
beeb0c1
Refactor to use hotkeys not uids for miner identification. (#4)
Dec 27, 2023
f4a0ad3
Adds the Perf Monitor
Dec 28, 2023
3bb2289
Merge pull request #5 from RaoFoundation/perf-tracker
Dec 28, 2023
1b180c2
Merge branch 'dev' into miner_tracker
Dec 28, 2023
cac0feb
Improve model tracker comments and logging.
Dec 27, 2023
bcb801d
Delete .vscode/settings.json which is now in the ..gitignore.
Dec 29, 2023
b8a3193
Merge pull request #6 from RaoFoundation/miner_tracker
Dec 29, 2023
07eaedb
Merge pull request #7 from RaoFoundation/model_cleaner
Dec 30, 2023
fac0eb0
Add helper to get hash of directory.
Dec 30, 2023
71e789f
Add logic to redownload and get hash in upload_model.
Dec 30, 2023
9768cb7
Update to only store model for hash in a tmp folder.
Dec 30, 2023
762792d
Address PR feedback.
Dec 30, 2023
6d674ae
Merge pull request #8 from RaoFoundation/dirHash
Dec 30, 2023
3f90a28
Update the Miner
Dec 30, 2023
56e8137
Address feedback
Dec 30, 2023
b6e4e47
More PR feedback
Dec 30, 2023
2755cf9
Merge pull request #9 from RaoFoundation/miner-updates
Dec 30, 2023
f78cfd3
Update model tracker to track metadata.
Dec 30, 2023
c891add
Update validator eval loop to use new stores.
Dec 30, 2023
7f17118
Miner fixes
Dec 30, 2023
6d656fa
Merge pull request #11 from RaoFoundation/miner-fixes
Dec 30, 2023
d33d4fb
Use AutoModelForCausalLM.
Dec 30, 2023
8c1b7ac
Also update mining test to use same model type.
Dec 30, 2023
c8855a5
Merge pull request #12 from RaoFoundation/autoModelLM
Dec 30, 2023
29ae1a0
Pass netuid to the chain store
Dec 31, 2023
c94f391
Handle exceptions calculating miner losses.
Dec 31, 2023
b4d2325
Support loading a non hugging face saved model
Dec 31, 2023
d3d2f3d
Make a new wandb run for the validator if logging there.
Dec 31, 2023
38a11a5
Merge pull request #14 from RaoFoundation/miner-fixes2
Dec 31, 2023
44a3e73
Address PR fixes.
Dec 31, 2023
67574c9
Merge branch 'dev' into valEval
Dec 31, 2023
ced19a6
Add size check before downloading from hugging face.
Dec 30, 2023
cd0e717
Merge pull request #10 from RaoFoundation/valEval
Dec 31, 2023
4f773e8
Merge pull request #13 from RaoFoundation/checkRepoSize
Dec 31, 2023
7f66a39
Add checks in Model Updater for bad models.
Dec 31, 2023
b0fc67d
Merge pull request #15 from RaoFoundation/exceptOnBadModels
Dec 31, 2023
d6f2904
Improve test logging.
Dec 31, 2023
db56c27
Collected fixes.
Dec 31, 2023
de6edac
Exception handling improvements.
Dec 31, 2023
3004f06
Fix update loop sleep logic when revisiting recently.
Dec 31, 2023
928b61f
Uid state handling fixes.
Dec 31, 2023
a8e8a2f
Sleep in run step for readability.
Dec 31, 2023
3d22ac1
Align local and remote directory pathing.
Dec 31, 2023
81edb9d
Compute_losses on the pt_model not the Model.
Dec 31, 2023
695e738
Validator wandb run logging fixes.
Dec 31, 2023
359fa9c
Update comments on expected directory structure.
Dec 31, 2023
d5b7825
Merge pull request #16 from RaoFoundation/vali-fixes
Dec 31, 2023
adaf416
Add a new tool to upload a trained model
Dec 31, 2023
a3e073f
Merge pull request #17 from RaoFoundation/miner-push-only
Dec 31, 2023
88c8216
Clean-up
Dec 31, 2023
7d80098
Create a new validator wandb run every 100 run steps.
Dec 31, 2023
87d4f88
Merge pull request #18 from RaoFoundation/clean-up
Dec 31, 2023
8fdb630
Add auto-update script
Dec 31, 2023
28e5769
Fix directory hash after downloading models.
Dec 31, 2023
4b08bdf
Merge pull request #20 from RaoFoundation/auto-update
Dec 31, 2023
da72955
Merge pull request #21 from RaoFoundation/hash_location_fix
Dec 31, 2023
ee0b22e
Merge pull request #19 from RaoFoundation/new_wandb_runs
Dec 31, 2023
3b27b56
Remove unused import
Dec 31, 2023
c57f2ef
Merge pull request #22 from RaoFoundation/logs
Dec 31, 2023
9da0a5c
Split out miner/vali docs and update.
Dec 31, 2023
0278385
Improve Miner docs.
Jan 1, 2024
175a58c
Merge pull request #23 from RaoFoundation/docs
Jan 1, 2024
ae42a47
Update scoring temperature to 0.04.
Jan 3, 2024
ef67494
Merge pull request #24 from RaoFoundation/temp_update
surcyf123 Jan 3, 2024
0ddead7
Update validator score boosting of earlier models.
Jan 3, 2024
43d6a6a
Merge pull request #25 from RaoFoundation/epsilon_update
surcyf123 Jan 3, 2024
972950a
Merge pull request #26 from RaoFoundation/dev
Jan 3, 2024
93a1d98
Formatting fixes for miner docs
Jan 3, 2024
8e91a9f
Merge pull request #27 from RaoFoundation/doc-format
Jan 3, 2024
2aac764
Merge pull request #28 from RaoFoundation/dev
Jan 3, 2024
4c9f60f
Fix for pending uids to eval in next loop.
Jan 5, 2024
3d65475
Merge pull request #29 from RaoFoundation/updatedEvalCheck
Jan 5, 2024
5e2aaa9
Also update to a new uids file.
Jan 5, 2024
9958cd4
Merge pull request #30 from RaoFoundation/updatedEvalCheck
Jan 5, 2024
9cb69dc
Merge pull request #31 from RaoFoundation/dev
Jan 5, 2024
edeac8d
Realize symlinks on download from remote store.
Jan 8, 2024
0ac4c65
Update to improve error logging around failures to parse the metadata…
Jan 9, 2024
bf1dc9d
Model_id locality fix.
Jan 9, 2024
b3dde1a
Merge pull request #32 from RaoFoundation/log_improvements
Jan 9, 2024
173ad38
Merge pull request #33 from RaoFoundation/remove_symlink
Jan 9, 2024
88ee418
Merge pull request #34 from RaoFoundation/dev
Jan 9, 2024
a8abc8b
Add a notebook to check latest vali perf
Jan 13, 2024
62ca7e9
Clear all outputs
Jan 13, 2024
143f0cd
Merge pull request #35 from RaoFoundation/vali-perf
Jan 14, 2024
014531d
Increase max model size to 186M
Jan 15, 2024
960163e
Perform a full eval after vali upgrade
Jan 15, 2024
560d8e6
Make the clean loop delay larger
Jan 15, 2024
2eea7a5
Update the miner docs
Jan 15, 2024
8b46e81
Keep losses to math.inf when failing to evaluate model.
Jan 15, 2024
6647082
Merge pull request #38 from RaoFoundation/model_loss_none_fix
Jan 15, 2024
ddc0e58
Merge pull request #36 from RaoFoundation/vali-updates
Jan 15, 2024
d1d4b50
Include repo_id in error messages
Jan 15, 2024
6ed4577
Merge pull request #39 from RaoFoundation/improve-errors
Jan 15, 2024
25b91ed
Read back the metadata commit after writing
Jan 15, 2024
2300785
Merge pull request #40 from RaoFoundation/dev
Jan 15, 2024
ae65103
Merge pull request #41 from RaoFoundation/read-metadata
Jan 15, 2024
b1a0bdd
Update setup.py to point to new version location.
Jan 16, 2024
714dff7
Correct the docs
Jan 17, 2024
56b4a52
Merge pull request #37 from RaoFoundation/model-increase
Jan 17, 2024
27fa33b
Merge pull request #42 from RaoFoundation/setup_fix
Jan 17, 2024
edf58fb
Bump version
Jan 17, 2024
9c25951
Merge pull request #43 from RaoFoundation/bump-version
Jan 17, 2024
06eecdd
Merge pull request #44 from RaoFoundation/dev
Jan 17, 2024
c9ec6bc
Simplify the mining API
Jan 20, 2024
5d45fc7
Merge pull request #45 from RaoFoundation/api
Jan 20, 2024
ac31bb6
Run each eval in a subprocess to avoid a bad model being able to corr…
Feb 2, 2024
bdae9e6
Merge pull request #46 from RaoFoundation/debug
Feb 2, 2024
198e103
Remove model with inf loss
Feb 2, 2024
1f96e89
Fix dict .get()
Feb 2, 2024
45595cc
Merge pull request #47 from RaoFoundation/remove-bad-miners
Feb 2, 2024
65b29aa
Clean-up accidental test code
Feb 2, 2024
563dfdb
Merge pull request #48 from RaoFoundation/clean-up2
Feb 2, 2024
4402b91
Merge pull request #49 from RaoFoundation/dev
Feb 2, 2024
4d09328
Correctly call is_dir() method.
Feb 2, 2024
9a6695d
Add test for is_dir() behavior.
Feb 3, 2024
c563c26
Log but do not throw for expected model sync failures.
Feb 3, 2024
3ab91cd
Only keep hotkeys to be evaluated in storage.
Feb 3, 2024
c2c8f6a
Only allow at most 10 new models to be pending eval.
Feb 3, 2024
eb6b471
Merge pull request #50 from RaoFoundation/is_dir_fix
Feb 3, 2024
34e08e0
Merge pull request #51 from RaoFoundation/downgrade_model_size_log
Feb 3, 2024
7b1e494
Add lock around metagraph for sub threads and remove grace period on …
Feb 3, 2024
f877806
Merge pull request #52 from RaoFoundation/limit_stored_models
Feb 3, 2024
8dba6f3
Merge pull request #53 from RaoFoundation/limit_pending_models
Feb 3, 2024
69c2749
Only filter out uids with weights at 0 in addition to inf loss.
Feb 4, 2024
c496bf2
Merge pull request #54 from RaoFoundation/inf_and_weight_check
Feb 4, 2024
45d9bc1
Move state file to the model dir
Feb 4, 2024
bcb696e
Merge pull request #55 from RaoFoundation/perplexity
Feb 4, 2024
1f7345d
Revert "Only allow at most 10 new models to be pending eval."
Feb 4, 2024
172e4e3
Merge pull request #56 from RaoFoundation/revert-53-limit_pending_models
Feb 4, 2024
c8a9eba
Only allow at most 20 new models to be pending eval.
Feb 3, 2024
47a444c
PR Feedback.
Feb 4, 2024
c247220
Handle shutil.rmtree FIleNotFoundError.
Feb 4, 2024
a89a67f
Merge pull request #58 from RaoFoundation/shutil_exception
Feb 4, 2024
4c313ce
Merge pull request #57 from RaoFoundation/limit_pending_models
Feb 4, 2024
2d86ecd
Catch all exceptions from shutil rmtree.
Feb 4, 2024
613fe76
Merge pull request #59 from RaoFoundation/catch_all_rmtree
Feb 4, 2024
c952148
Reapply grace period of 300s.
Feb 4, 2024
56e1665
Catch exceptions in the clean-up loop.
Feb 4, 2024
47e166d
Add handling around computation of file timestamps if the file no lon…
Feb 4, 2024
f6206de
Merge pull request #60 from RaoFoundation/grace_reapply
Feb 4, 2024
7702da1
Merge pull request #61 from RaoFoundation/catch-cleanup
Feb 4, 2024
78864de
Update docs to point to the leaderboard
Feb 4, 2024
4321f85
Fix get_newest_datetime_under_path to get newest not oldest.
Feb 4, 2024
fbbd159
Merge pull request #63 from RaoFoundation/get_latest_under_path_fix
Feb 5, 2024
40f31f8
Standardize the loss function
Feb 5, 2024
5a4ebd0
Bump version
Feb 5, 2024
fb44be8
Merge pull request #66 from RaoFoundation/loss
Feb 5, 2024
71dd311
Merge pull request #65 from RaoFoundation/bump_version
Feb 5, 2024
1dffefc
Merge pull request #62 from RaoFoundation/update-docs
Feb 5, 2024
7e3b2c4
Merge pull request #67 from RaoFoundation/dev
Feb 5, 2024
430cb5a
Require models have max_position_embeddings=1024.
Feb 11, 2024
ccba669
Also reduce severity of logs when failing to download model.
Feb 11, 2024
3b2d967
Update spec version to 2.2.1 to ensure validators get new state.
Feb 11, 2024
b341ed6
Restrict model types.
Feb 11, 2024
cdb622d
Move list of allowed models to constants.
Feb 11, 2024
c8573f9
Merge pull request #69 from RaoFoundation/restrict_model_types
Feb 11, 2024
bd1f026
Merge pull request #70 from RaoFoundation/dev
Feb 11, 2024
a8da485
Update docs for allowed model types.
Feb 11, 2024
d06c77e
Merge pull request #71 from RaoFoundation/doc_update
Feb 11, 2024
e18fdd4
Add tool for running a benchmark
Feb 13, 2024
4cf9e0b
Remove test notebook
Feb 13, 2024
f94fc93
Merge pull request #72 from RaoFoundation/benchmarks
Feb 13, 2024
28a1afe
Allow larger models after a defined block
Feb 14, 2024
81c8b78
Increase max repo size
Feb 14, 2024
d8a7bdc
Add gpt2-large to benchmark
Feb 16, 2024
2234478
Merge pull request #73 from RaoFoundation/block-max
Feb 16, 2024
937afae
Merge pull request #74 from RaoFoundation/add-gpt2-large
Feb 16, 2024
7f1ec1e
Merge pull request #75 from RaoFoundation/dev
Feb 16, 2024
4e5cc6d
Update README.md
dougsillars Feb 16, 2024
7140510
Load model in the subprocess to avoid pickling
Feb 21, 2024
0f237c2
Fix missing method
Feb 21, 2024
725365f
Bump ttl to 150 seconds
Feb 21, 2024
3ab1102
Bump tranformers version
Feb 21, 2024
dd26bcc
Merge pull request #78 from RaoFoundation/bump-transformers
Feb 21, 2024
abb7496
Track total eval perf
Feb 21, 2024
6122352
Don't bump spec version
Feb 21, 2024
7c5fe35
Clean-up vali-perf notebook
Feb 21, 2024
4091575
Merge pull request #77 from RaoFoundation/qol
Feb 21, 2024
98a21b5
Revert "Merge pull request #77 from RaoFoundation/qol"
Feb 22, 2024
4fccab7
Merge pull request #80 from RaoFoundation/undo-77
Feb 22, 2024
c935923
Increase alpha. Log weight failures
Feb 22, 2024
c06992f
Merge pull request #81 from RaoFoundation/alpha
Feb 22, 2024
d2faaec
Merge pull request #79 from RaoFoundation/dev
Feb 22, 2024
5409309
Update model size on downloads based on block.
Mar 17, 2024
8c13811
Use optimizations at new block for inference.
Mar 18, 2024
c8cb2b8
Limit model types based on block.
Mar 18, 2024
e8206a7
Run inference with sequence length based on block.
Mar 18, 2024
9f8ae23
Doc updates.
Mar 18, 2024
3f5748c
Adjust temperature to prioritize top 1 model.
Mar 19, 2024
bd7501c
Adjust to only keep 10 best models + eval up to 15 new per loop.
Mar 19, 2024
90b870f
Check for updates to models with incentive first.
Mar 19, 2024
28916ff
Remove notebook and update cadence for check.
Mar 19, 2024
ea91667
Update to only 6 min, 14 max models by default.
Mar 19, 2024
c83a787
Fix docs + increase time for eval + adjust sample model parameters.
Mar 19, 2024
4957e80
Refactor to use ModelParameters + pass sequence length.
Mar 20, 2024
e520ff1
Rename to Model Criteria for clarity.
Mar 20, 2024
d8af206
Update docs to point to correct line for ModelCriteria.
Mar 20, 2024
2e9d6dd
Check generated outputs before calculating losses.
Mar 22, 2024
82e74a3
Send inputs to the same device as the model.
Mar 22, 2024
7eb4b4e
Refactor check out to a helper function.
Mar 22, 2024
1177610
Bump spec version to force reload of models.
Mar 22, 2024
6160d49
Pass tokenizer eos token id to remove warning message.
Mar 22, 2024
d80f965
Start iterator at 200 for fresh start.
Mar 22, 2024
b4d1207
Merge pull request #86 from RaoFoundation/disallow_attn
Mar 22, 2024
706f659
Merge pull request #87 from RaoFoundation/dev
Mar 22, 2024
99afe25
Update to use 6.9 params, 8192 seqeuence length, and block 2735661.
Mar 23, 2024
56a3713
Update to 24 pages and add clarify TFLOPs required.
Mar 23, 2024
0f26862
Update documentation on vali requirements and flash-attn requirements.
Mar 24, 2024
fadbe82
Merge branch 'dev' into next_milestone
Mar 24, 2024
7ae6d0c
Merge pull request #83 from RaoFoundation/next_milestone
Mar 24, 2024
5213654
Merge branch 'dev' into eval_loop_adjustments
Mar 24, 2024
3c1c44a
Merge pull request #76 from dougsillars/main
Mar 24, 2024
99e0588
Merge pull request #84 from RaoFoundation/eval_loop_adjustments
Mar 24, 2024
fd4681c
Add a new tokenizer for 7B
Mar 21, 2024
cd9819a
Bump to 6 minute timeouts and go back to random iterator start.
Mar 24, 2024
fe2a0c3
Update to 4k seq length + lower pages + adjust tokenizer.
Mar 24, 2024
fca0dd4
Pass pad token id to avoid instantiating new tokenizer every loss com…
Mar 24, 2024
732f904
Add Model Criteria for block 0 and improve logging.
Mar 24, 2024
4309982
Calculate average loss correctly in log_step.
Mar 25, 2024
e8bfe81
Move to GPT4 tokenizer instead of GPT3_5.
Mar 27, 2024
0771aaa
Push switchover block out by a week.
Mar 27, 2024
c0cf96c
Merge pull request #88 from RaoFoundation/update_tokenizer
Mar 28, 2024
18f0056
Merge pull request #89 from RaoFoundation/dev
Mar 28, 2024
d9fe3a1
Raise threshhold for unreasonable output and keep models with weights.
Mar 28, 2024
ae1fd35
Also prioritize keeping higher weights when filtering.
Mar 28, 2024
8b1e8bb
Adjust output lengths and check reptitiveness for all outputs.
Mar 29, 2024
bf5cc6e
Handle failures to load tracker state gracefully.
Mar 29, 2024
43b2428
Also test redownloading works as expected.
Mar 29, 2024
24f4b76
Merge pull request #91 from RaoFoundation/handle_corrupt_state
Mar 29, 2024
e72efee
Refactor model prioritization for clarity + correctness.
Mar 29, 2024
63271e1
Handle failures to load uids to eval state gracefully.
Mar 29, 2024
2b71a5d
Wipe tracker state in case of no uids to eval.
Mar 29, 2024
8a73df8
Also wipe the state in case of multiple bad restarts.
Mar 29, 2024
63bd73e
Merge pull request #90 from RaoFoundation/improve_model_check
Apr 1, 2024
d0716fd
Merge pull request #93 from RaoFoundation/eval_state
Apr 1, 2024
fee6b41
Retry evaluation for discarded models with incentive periodically.
Apr 1, 2024
9347268
Merge pull request #94 from RaoFoundation/retry_incentive
Apr 1, 2024
2c377cf
Merge pull request #95 from RaoFoundation/dev
Apr 1, 2024
9a6e0c0
Initialize uids_to_eval as set().
Apr 2, 2024
5a36e47
Fix docstring
steffencruz Apr 3, 2024
feb620c
Enable uploading a model with bfloat 16.
Apr 12, 2024
1e6e6ef
Add 7b models to the benchmark script
Apr 12, 2024
7b8e7b5
Default to upload with b16 for manual upload.
Apr 12, 2024
b1247e8
Merge pull request #96 from RaoFoundation/type_fix
Apr 12, 2024
6948e7a
Merge pull request #100 from RaoFoundation/benchmark-7b
Apr 12, 2024
57e5f82
Merge pull request #98 from RaoFoundation/upload_arg_opt
Apr 12, 2024
2477d4a
Merge branch 'dev' of github.com:RaoFoundation/pretraining into steff…
steffencruz Apr 13, 2024

11 changes: 10 additions & 1 deletion .gitignore
@@ -1,4 +1,13 @@
# VS Code
.vscode/

test-models/

# Exclude the Miner's directory for saving the models.
local-models/
neurons/pretraining_model/


# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
@@ -9,7 +18,6 @@ neurons/wandb/

# C extensions
*.so
**.ipynb

# Distribution / packaging
.Python
@@ -82,6 +90,7 @@ target/

# Jupyter Notebook
.ipynb_checkpoints
*.ipynb

# IPython
profile_default/
222 changes: 21 additions & 201 deletions README.md
@@ -1,237 +1,57 @@
<div align="center">

# **Bittensor Training Subnet** <!-- omit in toc -->
# **Bittensor Pretrain Subnet** <!-- omit in toc -->
[![Discord Chat](https://img.shields.io/discord/308323056592486420.svg)](https://discord.gg/bittensor)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

---

## Bittensor Incentivized Pretraining<!-- omit in toc -->

[Discord](https://discord.gg/bittensor) • [Network](https://taostats.io/) • [Research](https://bittensor.com/whitepaper)
[Leaderboard](https://huggingface.co/spaces/RaoFoundation/pretraining-leaderboard) • [Discord](https://discord.gg/bittensor) • [Network](https://taostats.io/subnets/netuid-9/) • [Research](https://bittensor.com/whitepaper)
</div>

---

# Introduction

Bittensor subnet 9 rewards miners for producing pretrained models with the GPT2 architecture on the Falcon Refined Web dataset. It acts like a continuous benchmark where miners are paid out for attaining the best losses on randomly sampled pages of that dataset. The reward mechanism works as follows:

1. Miners train and periodically host their model weights on a wandb account which is linked to their miner through the neurons/miner.py code.
2. Validators periodically check and pull the latest hosted models.
3. Validators run a continuous eval on the pulled models and apply the validation system outlined in neurons/validator.py.

## Validation

Miners are evaluated based on the number of times their loss on a random batch during a 360-block epoch is lower than that of all other miners.
To perform well, miners must attain the lowest loss on the largest number of random batches sampled from the 900M-page, 3T-token Falcon Refined Web dataset.

All models are open and accessible via a wandb [project](https://wandb.ai/opentensor-dev/openpretraining), and this repo contains tools for downloading them and then
serving them on your own miner. The drive to find the best model as early as possible is ensured by having validators record the best global miner per epoch and apply
an ```epsilon``` reduction to that miner's loss when calculating wins per batch.

Pseudocode for the algorithm can be read below:
```python
epsilon = 0.03 # best miner epsilon reduction.
while True:
    wins = {} # Count of wins per batch per miner

    # Run continuous scoring until the epoch is over.
    while epoch_not_over( block ):
The following documentation assumes you are familiar with basic Bittensor concepts: Miners, Validators, and incentives. If you need a primer, please check out https://docs.bittensor.com/learn/bittensor-building-blocks.

        # Fetch random sample of batches to evaluate models on
        batches = get_random_sample_of_batches_from_falcon()

        # Fetch and or update models during this step.
        models = get_and_update_models_from_miners()
Bittensor subnet 9 rewards miners for producing pretrained Foundation-Models on the Falcon Refined Web dataset. It acts like a continuous benchmark whereby miners are rewarded for attaining the best losses on randomly sampled pages of Falcon given a consistent model architecture. The reward mechanism works as follows:

        # Compute losses for each batch on subset and count wins per miner
        for batch in batches:
1. Miners train and periodically publish models to hugging face and commit the metadata for that model to the Bittensor chain.
2. Validators download the models from hugging face for each miner based on the Bittensor chain metadata and continuously evaluate them, setting weights based on the performance of each model against the Falcon dataset. They also log results to [wandb](https://wandb.ai/opentensor-dev/pretraining-subnet).
3. The Bittensor chain aggregates weights from all active validators using Yuma Consensus to determine the proportion of TAO emission rewarded to miners and validators.

            # Find the miner with the lowest loss on the batch.
            best_uid, best_loss = None, float('inf')
            for miner_uid, model in enumerate( models ):
                loss = get_loss_for_model_on_batch( model, batch )
                if miner_uid == epoch_global_min_uid: loss *= epsilon
                if loss < best_loss:
                    best_uid = miner_uid
                    best_loss = loss

            # Increment the number of wins for the miner with the lowest loss on this batch.
            wins[ best_uid ] += 1

        # Assign epoch_global_min_uid to the miner uid with the lowest loss across all epoch batches.
        # This miner now attains a single-epoch advantage for attaining the lower loss first.
        epoch_global_min_uid = get_miner_with_lowest_loss_on_all_epoch_batches()

    # End epoch.
    # Weights are computed based on the ratio of wins a model attains during the epoch.
    weights = zeros()
    for miner_uid in wins.keys():
        # Adds a communistic +1 score for all active miners.
        weights[ miner_uid ] = (wins[ miner_uid ] + 1) / sum(wins.values())

    # Set weights on the chain.
    set_weights( weights )
```
See the [Miner](docs/miner.md) and [Validator](docs/validator.md) docs for more information about how they work, as well as setup instructions.
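
As a rough illustration of step 2 in the list above, the snippet below sketches how a validator might download one miner's model from Hugging Face at the exact revision committed on chain. This is a minimal sketch under stated assumptions, not this repository's actual code; the `repo_id`, `revision`, and `local_dir` values are assumed to come from the already-parsed chain metadata and local configuration.

```python
# Minimal sketch (not the repo's real implementation): fetch a miner's model
# at the exact Hugging Face revision announced in its on-chain metadata.
from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM

def load_miner_model(repo_id: str, revision: str, local_dir: str):
    # Pinning the revision ensures the evaluated weights match the commit
    # the miner registered on chain.
    path = snapshot_download(repo_id=repo_id, revision=revision, local_dir=local_dir)
    return AutoModelForCausalLM.from_pretrained(path)
```

In practice the validator also applies the repository-size and model-type checks mentioned in the commit history above before loading anything.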

---

## Installing
## Incentive Mechanism

Before continuing, make sure you have at least Python 3.8. If you have issues installing Python on your machine, we recommend using conda as explained [here](https://bittensor.com/documentation/getting-started/installation). Once Python is installed, install *this* repository as below:
```bash
# Installs this local repository using python.
git clone https://github.com/unconst/pretrain-subnet.git
cd pretrain-subnet
python -m pip install -e .
```

---
Bittensor hosts multiple incentive mechanisms through which miners are evaluated by validators for performing actions well. Validators perform the process of evaluation and 'set weights', which are transactions into Bittensor's blockchain. Each incentive mechanism in Bittensor is called a 'subnet' and has an identifier (this particular mechanism has subnet uid 9). Weights and the amount of TAO held by the validators become inputs to Bittensor's consensus mechanism, called Yuma Consensus. YC drives validators towards a consensus: agreement about the value of the work done by miners. The miners with the highest agreed-upon scores are minted TAO, the network's digital currency.

## Subtensor

Your node will run better if you are connecting to a local Bittensor chain entrypoint rather than using Opentensor's.
We recommend running a local node as follows and passing the ```--subtensor.network local``` flag to all following commands, i.e. for miners + validators.
To install and run a local subtensor node, follow the commands below with [Docker and Docker Compose](https://docs.docker.com/engine/install/) already installed.
```bash
# Installs your local subtensor chain endpoint and runs it on your machine.
git clone https://github.com/opentensor/subtensor.git
cd subtensor
docker compose up --detach
```
Miners within this subnet are evaluated based on the number of times the model they have hosted has a lower loss than another model on the network when randomly sampling from the near infinite Falcon Refined Web pretraining dataset. To perform well, miners must attain the lowest loss on the largest number of random batches. Finding the best model and delta at the earliest block ensures the most incentive.
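
For intuition, the sketch below shows one hypothetical way the win-counting described above could be turned into weights: every sampled batch is 'won' by the model with the lowest loss, the incumbent best model gets a small epsilon advantage (0.03 in the removed pseudocode above, written here as a (1 - epsilon) multiplier), and weights are proportional to each model's share of wins. It is not the shipped validator code.

```python
# Hypothetical sketch of win-based weighting; not the actual validator implementation.
# `losses[uid][b]` holds miner `uid`'s loss on sampled batch `b`; `best_uid_so_far`
# is the earliest/best model that receives the epsilon advantage.
EPSILON = 0.03

def compute_weights(losses: dict[int, list[float]], best_uid_so_far: int) -> dict[int, float]:
    num_batches = len(next(iter(losses.values())))
    wins = {uid: 0 for uid in losses}

    for b in range(num_batches):
        def adjusted(uid: int) -> float:
            loss = losses[uid][b]
            # The incumbent's loss is discounted slightly, so a challenger must
            # beat it by a clear margin to take the win.
            return loss * (1 - EPSILON) if uid == best_uid_so_far else loss

        winner = min(losses, key=adjusted)
        wins[winner] += 1

    total = sum(wins.values()) or 1
    # Weights are proportional to the share of batches each model wins.
    return {uid: wins[uid] / total for uid in wins}
```

For example, `compute_weights({1: [2.0, 2.0], 2: [1.99, 2.2]}, best_uid_so_far=1)` returns `{1: 1.0, 2: 0.0}`: the challenger's marginally lower loss on the first batch is not enough to overcome the incumbent's advantage.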

---

## Registration
## Getting Started

Miners + validators require a Bittensor coldkey and hotkey pair registered to netuid 9 before they can participate in the network.
To create a wallet for either your validator or miner, run the following command in your terminal. Make sure to save the mnemonics for
both keys and store them in a safe place.
```bash
# Creates your miner/validator cold + hotkey keys.
btcli w create --wallet.name ... --wallet.hotkey ...
btcli w list # to view your created keys.
```
TL;DR:
1. [Chat](https://discord.gg/bittensor)
2. [Leaderboard](https://huggingface.co/spaces/RaoFoundation/pretraining-leaderboard)

Registering a miner or a validator on subnet 9 requires the participant to `recycle` TAO to pay for entrance. To register your key, run the
following command. Before continuing, make sure you have enough TAO to register.
```bash
# Registers your cold and associated hotkey to netuid 9.
btcli s register --wallet.name ... --wallet.hotkey ... --netuid 9
```
---
This repo's main conversation is carried out in the Bittensor [Discord](https://discord.gg/bittensor). Visit the 'pretraining' channel to ask questions and get real-time feedback. You can view the ongoing operation of the incentive mechanism, the best miners (see 'incentive'), and the most in-consensus validators (see 'vtrust') using this [taostats link](https://taostats.io/subnets/netuid-9/). The table shows all 256 participant UIDs with corresponding YC stats and earnings.

## Wandb
See [Miner Setup](docs/miner.md#getting-started) to learn how to set up a Miner.

Miners and validators make **heavy use** of Weights & Biases (wandb) in order to share model state and validation information. Both miners and validators must obtain
a wandb account from [wandb](https://wandb.ai/home) along with their wandb API key, which can be found by following the instructions [here](https://docs.wandb.ai/quickstart).

Models hosted by miners and corresponding validator information for runs can be found in this open wandb [project](https://wandb.ai/opentensor-dev/openpretraining). You can get access to all valid, signed, and recent miner runs from other participants on the network as follows:

```python
>>> import pretrain
>>> import bittensor as bt
>>> meta = bt.subtensor(network = 'local' ).metagraph(9)
# Get all valid runs.
>>> miner_runs = pretrain.get_miner_runs( meta )
{
238: {
'uid': 238,
'hotkey': '5CchHAvd95HtTaxfviiC36wt1HFXU73Xq9Aom7NDZJnAiG8v',
'emission': 0.02,
'run': <Run opentensor-dev/openpretraining/63j2ps12 (finished)>,
'model_artifact': <File model.pth () 312.5MiB>,
'timestamp': 1699448922
},
239: {
'uid': 239,
'hotkey': '5CSczy1dp4EpvLARaVbgvq8DST6oJgqmSTTQJZ8iXhJpKwdZ',
'emission': 0.01,
'run': <Run opentensor-dev/openpretraining/qp0w790l (finished)>,
'model_artifact': <File model.pth () 312.5MiB>, 'timestamp': 1699448504
}
...
# Download the model from one of the runs above, e.g. uid 238.
>>> model = pretrain.model.get_model()
>>> miner_runs[238]['model_artifact'].download( replace=True, root=<path to model> )
>>> model_weights = torch.load( <path to model> )
>>> model.load_state_dict( model_weights )
```

You can download all validation data from wandb which can be used to evaluate how miners are performing on each individual page of the Falcon Refined Web dataset.
```python
>>> import pretrain
>>> import bittensor as bt
>>> meta = bt.subtensor(network = 'local' ).metagraph(9)
# Get all valid runs.
>>> vali_runs = pretrain.get_validator_runs( meta )
{
240: {
'uid': 238,
'hotkey': '5CchHAvd95HtTaxfviiC36wt1HFXU73Xq9Aom7NDZJnAiG8v',
'stake': 123121,
'run': <Run opentensor-dev/openpretraining/63j2ps12 (finished)>,
},
...
}
dataframe = vali_runs[240]['run'].history()
...
```

---

## Mining

The mining script can be run in multiple ways. In all cases, it uploads a model to wandb which will be evaluated by validators.

By default, the miner trains from scratch and posts the model periodically as its loss decreases on Falcon.
```bash
python neurons/miner.py --wallet.name ... --wallet.hotkey ... --num_epochs 10 --pages_per_epoch 5
```

Alternatively, you can scrape a model from an already performing miner on the network by passing its run id. This starts the training process from the checkpoint of another
miner. See this [page](https://wandb.ai/opentensor-dev/openpretraining) for the run_ids of other miners or use the above tools.
```bash
python neurons/miner.py --wallet.name ... --wallet.hotkey ... --num_epochs 10 --pages_per_epoch 5 --load_run_id ...
```

The miner can automatically search wandb for runs which perform well, using the best-scored model as its initial checkpoint. The pretraining
subnet is *PRO* model sharing. We recommend miners scrape other participants' models, and do so often.
```bash
python neurons/miner.py --wallet.name ... --wallet.hotkey ... --num_epochs 10 --pages_per_epoch 5 --load_best
```

Passing the ```--device``` option allows you to select which GPU to run on. You can also use ```--continue_id``` to continue from a training run you have already started.
The model you train will be hosted on wandb. You can always view this model and others by visiting https://wandb.ai/opentensor-dev/openpretraining/runs/ where all participant
models are shared.

We strongly recommend you read, understand, and adapt the miner code in ```neurons/miner.py``` to your needs. For any serious attempt to earn emission on this subnet you will
likely NEED to do this.
See [Validator Setup](docs/validator.md#getting-started) to learn how to set up a Validator.

---

## Validating
## Feedback

Validators can be run as follows. Pass your wallet hotkey and coldkey to the script. Note that validation requires a working GPU.
In version release/2.0.1 you need a GPU with at least 20GB of RAM.
We welcome feedback!

```bash
python neurons/validator.py
--wallet.name YOUR_WALLET_NAME
--wallet.hotkey YOUR_WALLET_HOTKEY
--device YOUR_CUDA_DEVICE
--wandb.on
```
---

# Auto-update PM2 + CRON

```bash
echo '* * * * * git -C <path to pretrain-subnet repo> pull' >> /etc/crontab
pm2 start neurons/validator.py --name sn9_validator --interpreter python3 --watch -- --wallet.name my_wallet ...
pm2 start neurons/miner.py --name sn9_miner_1 --interpreter python3 --watch -- --wallet.name my_wallet ...
pm2 start neurons/miner.py --name sn9_miner_2 --interpreter python3 --watch -- --wallet.name my_wallet ...
```
If you have a suggestion, please reach out on the Discord channel, or file an Issue.

---
