v2.17.6 #621

Merged 82 commits on Mar 12, 2025
Commits (82)
1a77872
Add fallback
dbobrenko Feb 20, 2025
4299e3f
Add fallback json
dbobrenko Feb 20, 2025
4ed705e
Enhance fallback
dbobrenko Feb 20, 2025
cd7a215
Clean up
dbobrenko Feb 20, 2025
d4f6985
Fix loop runner issues
dbobrenko Feb 20, 2025
670f85c
Fix offset datetimes in loop runner
dbobrenko Feb 20, 2025
4317ddc
Check if website is dict
bkb2135 Feb 20, 2025
a96a85b
Merge pull request #618 from macrocosm-os/fix-attr-errors-in-web-retr…
dbobrenko Feb 20, 2025
e2d4be4
Increasing Miner Availability frequency
richwardle Feb 20, 2025
f3c1448
Merge branches 'staging' and 'staging' of github.com:macrocosm-os/pro…
richwardle Feb 20, 2025
a8949ad
Update .gitignore
bkb2135 Feb 20, 2025
63e59d9
Delete past_websites.csv
bkb2135 Feb 20, 2025
089bd8a
Fix deps issue; add uids fallback
dbobrenko Feb 20, 2025
3aee09b
Add proper fallback
dbobrenko Feb 20, 2025
099f4dc
Poetry lock
dbobrenko Feb 20, 2025
9eb7413
Run pre-commit
dbobrenko Feb 20, 2025
bd58940
Merge pull request #620 from macrocosm-os/api/fallback
dbobrenko Feb 20, 2025
feabaa2
Flattening lists in staging
richwardle Feb 20, 2025
2fcf2ea
Merge branch 'staging' of github.com:macrocosm-os/prompting into staging
richwardle Feb 20, 2025
56af3bd
Merge pull request #619 from macrocosm-os/Remove-Past-Websites
bkb2135 Feb 20, 2025
cec4b57
api/prod
dbobrenko Feb 20, 2025
8ae8925
Default to random miners
bkb2135 Feb 20, 2025
92d478d
Improving CoT prompt engineering
richwardle Feb 20, 2025
5d52d2e
Merge branch 'staging' into hotfix/default-to-random-miners
bkb2135 Feb 20, 2025
10290db
Remove Notebooks
bkb2135 Feb 20, 2025
93f89bf
Precommit Fixes
bkb2135 Feb 20, 2025
a583f3e
Add one-shot example
bkb2135 Feb 20, 2025
0e56e0f
Merge pull request #623 from macrocosm-os/remove-notebooks
bkb2135 Feb 20, 2025
5d9e453
Teardown wandb at exit
bkb2135 Feb 24, 2025
69565a5
Linting
bkb2135 Feb 24, 2025
0b19292
Merge branch 'staging' into hotfix/default-to-random-miners
bkb2135 Feb 24, 2025
6da14ce
Merge pull request #622 from macrocosm-os/hotfix/default-to-random-mi…
bkb2135 Feb 24, 2025
57e6205
Handle Exit with excepts
bkb2135 Feb 24, 2025
34cba84
Linting
bkb2135 Feb 24, 2025
4b161a3
Don't share unnecessary miner failure
richwardle Feb 24, 2025
930eeb2
Clean Up Random error raise
richwardle Feb 24, 2025
9e4693c
Silence Some Web Retrieval Logging
richwardle Feb 24, 2025
19b51a4
finish wandb in mp loop
richwardle Feb 25, 2025
afb286c
Adding cache (#625)
Hollyqui Feb 25, 2025
2533323
use wandb sdk to terminate runs
bkb2135 Feb 25, 2025
2cef7a2
Make WebRetrievalRewardModel hashable (#629)
Hollyqui Feb 25, 2025
834cd24
Add API systemd start/stop scripts (#626)
dbobrenko Feb 25, 2025
4558912
Precommit Fix
bkb2135 Feb 25, 2025
4055c6d
Making Reward Function Async & Various Optimisations (#628)
Hollyqui Feb 27, 2025
034ed18
SN1-406-improve-api-docs (#627)
richwardle Feb 27, 2025
dad7cb9
Add Past Websites Files (#630)
richwardle Feb 27, 2025
665bfc1
Separate Prompting, Remove TTI Endpoint, Add JSON Flag
bkb2135 Feb 28, 2025
97096e7
Add Await To Inference Reward Model
richwardle Feb 28, 2025
6ae46ce
Precommit Changes
richwardle Feb 28, 2025
15c951d
Precommit Fix
bkb2135 Mar 3, 2025
cb29ad8
Improving TTI Final Prompt, Add Unittests for Prompts
richwardle Mar 3, 2025
2464424
Merge branch 'SN1-423-restructure-prompting' of github.com:macrocosm-…
richwardle Mar 3, 2025
4039cf2
Remove Comments
bkb2135 Mar 3, 2025
2f96a50
Await Reward Models
richwardle Mar 3, 2025
a6132c0
Add Next Action For Final Prompt
richwardle Mar 3, 2025
fc470bf
Add Detailed Log For Scoring Response Failed
richwardle Mar 3, 2025
5948dbb
Precommit Fixes
bkb2135 Mar 3, 2025
f801392
Readd exiting with system exit code 1
bkb2135 Mar 3, 2025
8b9dea8
Merge pull request #632 from macrocosm-os/fix/reward-coroutine-attrib…
bkb2135 Mar 3, 2025
baa3f98
Fix pydantic types; move autoawq to poetry
dbobrenko Mar 4, 2025
baa1dc0
Run pre-commit
dbobrenko Mar 4, 2025
93e70d8
Check if past websites is not empty
dbobrenko Mar 4, 2025
9a5ebb3
Simplify Prompt Structure
richwardle Mar 4, 2025
0245a2a
Fix Unittest and Precommit
bkb2135 Mar 4, 2025
3f7b463
Merge pull request #635 from macrocosm-os/fix/pydantic-poetry
dbobrenko Mar 4, 2025
3e39b90
Merge pull request #633 from macrocosm-os/SN1-423-restructure-prompting
bkb2135 Mar 4, 2025
b641085
Add Async Timeout
richwardle Mar 5, 2025
a4e3132
Precommit Fix
richard-wardle Mar 5, 2025
467dab1
Merge pull request #636 from macrocosm-os/fix/web_retrieval_timeout
bkb2135 Mar 5, 2025
ebfe369
fix timeout
richwardle Mar 6, 2025
9f3bfcb
Merge pull request #637 from macrocosm-os/fix/pass_timeout_through
bkb2135 Mar 6, 2025
48d0092
Merge pull request #624 from macrocosm-os/features/teardown-wandb-at-…
bkb2135 Mar 10, 2025
b588037
Hotfix HF model initialization
dbobrenko Mar 11, 2025
18770f7
Add params to miner model
dbobrenko Mar 11, 2025
5bf219d
Run pre-commit
dbobrenko Mar 11, 2025
2ac95a3
Fix settings device
dbobrenko Mar 11, 2025
5ed22db
Merge pull request #638 from macrocosm-os/hotfix/SN1-431-model-init
dbobrenko Mar 11, 2025
747a563
Allow langflow arguments to be properly parsed by completion endpoint
bkb2135 Mar 12, 2025
f399305
Default to Inference in Serializer
bkb2135 Mar 12, 2025
ea38621
Allow multiple api schemas
richwardle Mar 12, 2025
67b6cb0
Linting
richwardle Mar 12, 2025
07eba87
Merge pull request #640 from macrocosm-os/SN1-433-update-serializer-t…
bkb2135 Mar 12, 2025
1 change: 1 addition & 0 deletions .gitignore
@@ -183,3 +183,4 @@ wandb
.vscode
**/api_keys.json
weights.csv
past_websites.csv
6 changes: 5 additions & 1 deletion neurons/miners/epistula_miner/miner.py
@@ -45,7 +45,11 @@ def __init__(self):
},
)
if SHOULD_SERVE_LLM:
self.llm = ReproducibleHF(model_id=LOCAL_MODEL_ID)
self.llm = ReproducibleHF(
model_id=LOCAL_MODEL_ID,
device=shared_settings.NEURON_DEVICE,
sampling_params=shared_settings.SAMPLING_PARAMS,
)
else:
self.llm = None

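The miner now passes the device and sampling parameters into the constructor explicitly instead of letting ReproducibleHF read them from global settings. A minimal usage sketch of the new call shape, with stand-in values where the real ones come from shared_settings:

from prompting.llms.hf_llm import ReproducibleHF

# Stand-ins for shared_settings.NEURON_DEVICE and shared_settings.SAMPLING_PARAMS;
# the actual values are loaded from the miner/validator configuration.
NEURON_DEVICE = "cuda:0"
SAMPLING_PARAMS = {"temperature": 0.7, "max_new_tokens": 256}

llm = ReproducibleHF(
    model_id="hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4",
    device=NEURON_DEVICE,
    sampling_params=SAMPLING_PARAMS,
)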
18 changes: 15 additions & 3 deletions neurons/validator.py
@@ -72,7 +72,17 @@ async def spawn_loops(task_queue, scoring_queue, reward_events):
logger.debug(f"Number of tasks in Scoring Queue: {len(scoring_queue)}")
logger.debug(f"Number of tasks in Reward Events: {len(reward_events)}")

asyncio.run(spawn_loops(task_queue, scoring_queue, reward_events))
try:
asyncio.run(spawn_loops(task_queue, scoring_queue, reward_events))
except Exception as e:
logger.info(f"Terminating loop process: {e}")
finally:
logger.info("Cleaning up resources...")

# Ensure wandb is closed properly
if settings.shared_settings.WANDB_ON:
wandb.finish()
logger.info("WandB run finished.")


def start_api(scoring_queue, reward_events):
@@ -150,19 +160,21 @@ async def main():
f"Metagraph hasn't been updated for {current_block - last_update_block} blocks. "
f"Staled block: {current_block}, Last update: {last_update_block}"
)
sys.exit(1)
break # Exit the loop
step += 1

except KeyboardInterrupt:
logger.info("KeyboardInterrupt detected. Shutting down gracefully...")
except Exception as e:
logger.error(f"Main loop error: {e}")
raise
finally:
wandb.teardown()
# Clean up processes
for process in processes:
if process.is_alive():
process.terminate()
process.join()
sys.exit(1)


# The main function parses the configuration and runs the validator.
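The shutdown changes above follow a standard pattern: run the event loop inside try/except, then release external resources and reap child processes in finally. A stripped-down sketch of that pattern, independent of the validator internals:

import multiprocessing as mp
import time

import wandb


def worker() -> None:
    time.sleep(1)  # stands in for a long-running validator loop


processes = [mp.Process(target=worker) for _ in range(2)]
try:
    for p in processes:
        p.start()
    for p in processes:
        p.join()
except KeyboardInterrupt:
    print("KeyboardInterrupt detected. Shutting down gracefully...")
finally:
    wandb.teardown()  # closes the wandb service even if no run was started
    for p in processes:
        if p.is_alive():
            p.terminate()
            p.join()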
85 changes: 32 additions & 53 deletions poetry.lock

Large diffs are not rendered by default.

File renamed without changes.
54 changes: 21 additions & 33 deletions prompting/llms/hf_llm.py
@@ -4,38 +4,38 @@
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, PreTrainedModel, pipeline

from shared.settings import shared_settings


class ReproducibleHF:
def __init__(self, model_id="hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4", **kwargs):
"""
Initialize Hugging Face model with reproducible settings and optimizations
"""
# Create a random seed for reproducibility
# self.seed = random.randint(0, 1_000_000)
# self.set_random_seeds(self.seed)
def __init__(
self,
model_id: str = "hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4",
device: str = "cuda:0",
sampling_params: dict[str, str | float | int | bool] | None = None,
):
"""Deterministic HuggingFace model."""
self._device = device
self.sampling_params = {} if sampling_params is None else sampling_params
self.model: PreTrainedModel = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.float16,
low_cpu_mem_usage=True,
device_map="cuda:0",
device_map=self._device,
)

self.tokenizer = AutoTokenizer.from_pretrained(model_id)
self.valid_generation_params = set(
AutoModelForCausalLM.from_pretrained(model_id).generation_config.to_dict().keys()
)

self.llm = pipeline("text-generation", model=self.model, tokenizer=self.tokenizer)

self.sampling_params = shared_settings.SAMPLING_PARAMS

@torch.inference_mode()
def generate(self, messages: list[str] | list[dict], sampling_params=None, seed=None):
"""
Generate text with optimized performance
"""
def generate(
self,
messages: list[str] | list[dict[str, str]],
sampling_params: dict[str, str | float | int | bool] | None = None,
seed: int | None = None,
) -> str:
"""Generate text with optimized performance."""
self.set_random_seeds(seed)

inputs = self.tokenizer.apply_chat_template(
@@ -44,14 +44,13 @@ def generate(self, messages: list[str] | list[dict], sampling_params=None, seed=
add_generation_prompt=True,
return_tensors="pt",
return_dict=True,
).to(shared_settings.NEURON_DEVICE)
).to(self._device)

params = sampling_params if sampling_params else self.sampling_params
filtered_params = {k: v for k, v in params.items() if k in self.valid_generation_params}

# Generate with optimized settings
outputs = self.model.generate(
**inputs.to(shared_settings.NEURON_DEVICE),
**inputs,
**filtered_params,
eos_token_id=self.tokenizer.eos_token_id,
)
@@ -61,21 +60,10 @@ def generate(self, messages: list[str] | list[dict], sampling_params=None, seed=
skip_special_tokens=True,
)[0]

# logger.debug(
# f"""{self.__class__.__name__} queried:
# prompt: {messages}\n
# responses: {results}\n
# sampling params: {params}\n
# seed: {seed}
# """
# )

return results if len(results) > 1 else results[0]

def set_random_seeds(self, seed=42):
"""
Set random seeds for reproducibility across all relevant libraries
"""
def set_random_seeds(self, seed: int | None = 42):
"""Set random seeds for reproducibility across all relevant libraries."""
if seed is not None:
random.seed(seed)
np.random.seed(seed)
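With the seed and sampling parameters now explicit arguments, reproducibility can be verified directly from the call site. A usage sketch, assuming the model fits on the given device; the prompt is illustrative:

from prompting.llms.hf_llm import ReproducibleHF

llm = ReproducibleHF(
    model_id="hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4",
    device="cuda:0",
)
messages = [{"role": "user", "content": "Name three prime numbers."}]

# The same seed and sampling parameters should yield identical completions.
first = llm.generate(messages, sampling_params={"max_new_tokens": 32}, seed=42)
second = llm.generate(messages, sampling_params={"max_new_tokens": 32}, seed=42)
assert first == second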
6 changes: 3 additions & 3 deletions prompting/llms/model_manager.py
@@ -65,9 +65,9 @@ def load_model(self, model_config: ModelConfig, force: bool = True):
GPUInfo.log_gpu_info()

model = ReproducibleHF(
model=model_config.llm_model_id,
gpu_memory_utilization=model_config.min_ram / GPUInfo.free_memory,
max_model_len=settings.shared_settings.LLM_MAX_MODEL_LEN,
model_id=model_config.llm_model_id,
device=settings.shared_settings.NEURON_DEVICE,
sampling_params=settings.shared_settings.SAMPLING_PARAMS,
)

self.active_models[model_config] = model
2 changes: 1 addition & 1 deletion prompting/rewards/date.py
@@ -89,7 +89,7 @@ def date_score(self, reference: str, completion: str) -> float:
score = 0
return score

def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
async def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
"""Compute difference scores given a completion and reference pair.

Args:
2 changes: 1 addition & 1 deletion prompting/rewards/exact_match.py
@@ -28,7 +28,7 @@ def normalize_timing(timing: float, timings: float) -> float:


class ExactMatchRewardModel(BaseRewardModel):
def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
async def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
"""
Calculates rewards based on an exact match of the response with the reference string.

2 changes: 1 addition & 1 deletion prompting/rewards/float_diff.py
@@ -55,7 +55,7 @@ def math_score(reference: str, completion: str) -> float:
except Exception:
return 0.0

def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
async def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
"""Compute difference scores given a completion and reference pair."""
rewards = []
timings = []
6 changes: 3 additions & 3 deletions prompting/rewards/inference_reward_model.py
@@ -5,7 +5,7 @@


class InferenceRewardModel(BaseRewardModel):
def reward(
async def reward(
self,
reference: str,
response_event: DendriteResponseEvent,
@@ -14,5 +14,5 @@ def reward(
) -> BatchRewardOutput:
"""Gives an exact reward of 1 if the response matches the reference, 0 otherwise"""
if model_id:
return ExactMatchRewardModel().reward(reference, response_event)
return RelevanceRewardModel().reward(reference, response_event)
return await ExactMatchRewardModel().reward(reference, response_event)
return await RelevanceRewardModel().reward(reference, response_event)
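Making reward a coroutine lets this dispatcher simply await whichever concrete model applies, and it lets independent scoring work run concurrently. A toy sketch of that benefit; the class and timings here are hypothetical:

import asyncio


class SlowRewardModel:
    async def reward(self, reference: str, completion: str) -> float:
        await asyncio.sleep(0.1)  # stands in for embedding or LLM work
        return 1.0 if completion == reference else 0.0


async def score_all(reference: str, completions: list[str]) -> list[float]:
    model = SlowRewardModel()
    # All completions are scored concurrently: ~0.1 s total rather than 0.1 s each.
    return await asyncio.gather(*(model.reward(reference, c) for c in completions))


print(asyncio.run(score_all("42", ["42", "41", "42"])))  # [1.0, 0.0, 1.0]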
8 changes: 5 additions & 3 deletions prompting/rewards/multi_choice.py
@@ -29,8 +29,8 @@ def safe_load_json(json_string: str) -> dict[str, float]:
cleaned_json_string = re.sub(r'"\s*\n\s*"', r'""', cleaned_json_string)
try:
return {k.upper(): v for k, v in json.loads(cleaned_json_string).items()}
except json.JSONDecodeError as e:
raise ValueError(f"Invalid JSON data: {e}")
except Exception:
return None

def process_predictions(self, predictions: dict[str, float]) -> dict[str, float]:
if not all(isinstance(value, (int, float)) for value in predictions.values()):
@@ -56,12 +56,14 @@ def letter_reward(self, reference: str, completion: str) -> float:
def logit_reward(self, reference: str, completion: str) -> float:
try:
loaded_json = self.safe_load_json(completion)
if not loaded_json:
return None
valid_choices = self.process_predictions(loaded_json)
return valid_choices.get(reference.upper(), 0.0)
except ValueError:
return None

def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
async def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
rewards = []
timings = []

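The parsing change above trades a raised ValueError for a None return, which logit_reward then propagates to signal an unscorable completion. A simplified sketch of that contract (the real method also strips code fences and repairs concatenated strings before parsing):

import json


def safe_load_json(raw: str) -> dict[str, float] | None:
    # Tolerant parse: any malformed payload yields None instead of raising.
    try:
        return {k.upper(): float(v) for k, v in json.loads(raw).items()}
    except Exception:
        return None


assert safe_load_json('{"a": 0.7, "b": 0.3}') == {"A": 0.7, "B": 0.3}
assert safe_load_json("not json") is None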
2 changes: 1 addition & 1 deletion prompting/rewards/penalty.py
@@ -13,7 +13,7 @@ class PenaltyModel(BaseRewardModel):
def name(self) -> str:
return "penalty"

def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
async def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
"""Penalises miner if they do not respond."""
rewards = []
timings = []
2 changes: 1 addition & 1 deletion prompting/rewards/relevance.py
@@ -28,7 +28,7 @@ def init_model(self) -> "RelevanceRewardModel":
self.embedding_model = MODEL
return self

def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
async def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
"""Calculate the cosine similarity between sentence embeddings of the reference and completions.

We subtract a baseline score which is what an empty string would get (a failed completion).
10 changes: 5 additions & 5 deletions prompting/rewards/reward.py
@@ -69,10 +69,10 @@ class BaseRewardModel(ABC, BaseModel):
weight: float = 1.0

@abstractmethod
def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
async def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
raise NotImplementedError("You must implement the reward method")

def apply(
async def apply(
self,
response_event: DendriteResponseEvent,
reference: str | None = None,
Expand All @@ -83,7 +83,7 @@ def apply(
) -> WeightedRewardEvent:
t0 = time.time()
comparator = reference if reward_type == "reward" else challenge
batch_rewards_output: BatchRewardOutput = self.reward(comparator, response_event, task=task, **kwargs)
batch_rewards_output: BatchRewardOutput = await self.reward(comparator, response_event, task=task, **kwargs)
batch_rewards_time = time.time() - t0

return WeightedRewardEvent(
@@ -136,7 +136,7 @@ def final_rewards(cls, reward_events: list[WeightedRewardEvent]) -> list[float]:
return cls.sum_rewards(reward_events)

@classmethod
def apply(
async def apply(
cls,
response_event: DendriteResponseEvent,
reference: str,
@@ -147,7 +147,7 @@ def apply(
reward_events = []
for weighted_reward in cls.reward_definitions:
reward_events.append(
weighted_reward.apply(
await weighted_reward.apply(
reference=reference,
response_event=response_event,
challenge=challenge,
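apply now awaits the model's reward while still timing it, so the recorded duration covers the awaited scoring work. A condensed sketch of the weighted-apply pattern with the event bookkeeping stripped out:

import asyncio
import time


class WeightedModel:
    def __init__(self, weight: float = 1.0):
        self.weight = weight

    async def reward(self, reference: str, completions: list[str]) -> list[float]:
        await asyncio.sleep(0.05)  # placeholder for real scoring work
        return [1.0 if c == reference else 0.0 for c in completions]

    async def apply(self, reference: str, completions: list[str]) -> list[float]:
        t0 = time.time()
        rewards = await self.reward(reference, completions)
        elapsed = time.time() - t0  # includes the awaited scoring time
        print(f"scored {len(completions)} completions in {elapsed:.3f}s")
        return [self.weight * r for r in rewards]


print(asyncio.run(WeightedModel(weight=0.5).apply("a", ["a", "b"])))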
2 changes: 1 addition & 1 deletion prompting/rewards/rouge.py
@@ -22,7 +22,7 @@ def rouge_score(self, reference, completion):
return 0.0
return self.rouge.get_scores(reference, completion, avg=self.avg)[0][self.ngram][self.metric]

def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
async def reward(self, reference: str, response_event: DendriteResponseEvent, **kwargs) -> BatchRewardOutput:
"""Compute ROUGE scores given a completion and reference pair."""
rewards = []
timings = []
4 changes: 2 additions & 2 deletions prompting/rewards/scoring.py
@@ -21,7 +21,7 @@ class TaskScorer(AsyncLoopRunner):

is_running: bool = False
thread: threading.Thread = None
interval: int = 10
interval: int = 0
scoring_queue: list | None = None
reward_events: list | None = None

@@ -76,7 +76,7 @@ async def run_step(self) -> RewardLoggingEvent:

# and there we then calculate the reward
reward_pipeline = TaskRegistry.get_task_reward(scoring_config.task)
reward_events = reward_pipeline.apply(
reward_events = await reward_pipeline.apply(
response_event=scoring_config.response,
challenge=scoring_config.task.query,
reference=scoring_config.task.reference,
3 changes: 1 addition & 2 deletions prompting/rewards/streaming.py
@@ -17,8 +17,7 @@ def __init__(self, max_tokens_per_chunk: int, **kwargs):
super().__init__()
self.max_tokens_per_chunk = max_tokens_per_chunk

def reward(self, _: str, response_event: DendriteResponseEvent) -> BatchRewardOutput:
"""Compute difference scores given a completion and reference pair."""
async def reward(self, reference: str, response_event: DendriteResponseEvent) -> BatchRewardOutput:
"""Compute difference scores given a completion and reference pair."""
rewards = []
timings = []