
Releases: macrocosm-os/data-universe

Release 1.8.1

17 Mar 16:35
443c293

Enhanced On-Demand API Release Announcement
New X/Twitter Data Enrichment Feature
We're excited to announce a significant upgrade to the Data Universe On-Demand API! Starting tomorrow, miners will be able to test this enhanced functionality on testnet using their existing hotkeys.
What's New
The enhanced API now delivers substantially richer X/Twitter content, including:

Comprehensive User Metadata

User display names, verification status, follower counts
Profile details and engagement metrics

Complete Tweet Context

Full engagement metrics (likes, retweets, replies, quotes, views)
Tweet classification (reply, quote, retweet)
Conversation threading information

Rich Media Support

Media URLs and content types
Support for photos and videos

Enhanced Value

More valuable data for validators and users
Better content analysis possibilities

How to Get Started
Miners can test this enhanced functionality on testnet starting tomorrow using their existing hotkeys. Installation is simple and backward compatible: the enhanced scraper will be available for all X/Twitter requests, while Reddit functionality remains unchanged.
Implementation Benefits

Higher Quality Data: Deliver richer, more valuable content to validators
Competitive Edge: Enhanced content can lead to better scores in validation
Future-Ready: Positioning for upcoming data quality measurements

Miner Policy

To launch Gravity as a commercial product, we need to adhere to legal guidelines. The miner policy gives SN13 a legal basis and outlines the measures miners should follow when scraping.
The miner policy is now displayed in the SN13 docs. It prohibits scraping harmful or illegal content and outlines the legal responsibilities of data collection. Please read it and, if needed, make the appropriate changes immediately.
Datasets uploaded to Hugging Face now display the Macrocosmos Miner Policy in dataset cards.
API Improvements

[CONTINUE HERE FOR ON-DEMAND API]

New Endpoint: list_hf_repo_names
Returns the list of distinct miner Hugging Face repos currently stored by the Validator.

Release 1.8.0

03 Mar 22:08
e26dd8c

Key Enhancements

API Database Stability Fix
Fixed critical issues with the API key database system
Implemented more robust database initialization
Added improved error handling for database operations
Enhanced On-Demand Data Verification

Added validator verification when miners return empty results
Penalizes miners who fail to return data that actually exists
Provides users with data even when miners fail to deliver

Release 1.7.9

27 Feb 17:33
23d5cd6

In this update, we improved on-demand data requests to deliver better results to API users and to support future collaborations with other subnets, which will use this feature.

On-Demand Request Changes:

  • Now queries 5 miners instead of just 1 for each request (random coldkeys)
  • Added consistency checks between miner responses
  • Implemented occasional validation of returned data (5% of requests)
  • Added small credibility penalties for miners who return bad data
  • Improved handling of empty results and non-existent data queries
  • Better selection logic to return the most reliable data to users

The process is as follows:

  • Select up to 5 diverse miners from the top 60% of performers (by coldkey)
  • Query all selected miners with the same request
  • Check consistency among responses (within 30% of median)
  • Validate data in 5% of cases or when consistency is poor
  • Apply small credibility penalties for bad data (0.01-0.05)
  • Choose best data to return from the following:
    a. Validated miners with highest score
    b. Consistent miners with most data
    c. Median response when inconsistent
  • Return unique results to API user
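The consistency and validation steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the validator's actual code: the function names are hypothetical, "consistency" is assumed to compare the number of rows each miner returned, and "poor consistency" is assumed to mean that at least one miner falls outside the 30% band.

```python
import random
from statistics import median
from typing import Dict, List


def consistent_miners(counts: Dict[str, int], tolerance: float = 0.30) -> List[str]:
    """Return miners whose row count lies within `tolerance` of the median."""
    if not counts:
        return []
    med = median(counts.values())
    if med == 0:
        # With a zero median, only empty responses agree with it.
        return [m for m, c in counts.items() if c == 0]
    return [m for m, c in counts.items() if abs(c - med) <= tolerance * med]


def should_validate(counts: Dict[str, int], rng: random.Random,
                    sample_rate: float = 0.05, tolerance: float = 0.30) -> bool:
    """Validate a random ~5% of requests, or always when consistency is poor."""
    # Assumption: any outlier at all counts as "poor consistency".
    poor = len(consistent_miners(counts, tolerance)) < len(counts)
    return poor or rng.random() < sample_rate
```

For example, with row counts of 100, 95, 110, 40, and 105, the median is 100, so the accepted band is 70 to 130 rows and only the miner that returned 40 rows is flagged as inconsistent.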

Release 1.7.85

25 Feb 19:38
33f5225

In this update:

  • Removed labels longer than 140 characters from Dynamic Desirability uploads and retrieval.
  • Fixed a datetime fromisoformat error that occurred when the commit date was more than 19 hours old.

No action needed from miners.

Release 1.7.84

20 Feb 15:40
11b25be
  • A label weight can have a maximum value of 5 when incentivized by dynamic desirability
  • Changed the label length limit from 32 to 140 characters
  • Filtered out "Unexpected header key encountered" log messages
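The two label limits above can be illustrated with a small helper. This is a minimal sketch: the constant and function names are hypothetical, and clamping an over-limit weight (rather than rejecting the submission) is an assumption, not confirmed behavior.

```python
from typing import Dict

MAX_LABEL_LENGTH = 140  # characters, per this release
MAX_LABEL_WEIGHT = 5.0  # per-label cap, per this release


def clean_labels(label_weights: Dict[str, float]) -> Dict[str, float]:
    """Drop over-long labels and clamp each weight to the cap."""
    cleaned = {}
    for label, weight in label_weights.items():
        if len(label) > MAX_LABEL_LENGTH:
            continue  # label exceeds the length limit
        # Assumption: weights above the cap are clamped, not rejected.
        cleaned[label] = min(float(weight), MAX_LABEL_WEIGHT)
    return cleaned
```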

Release 1.7.83

19 Feb 17:06
dfb2fb1

Temporarily remove parquet check.
Change base miner code to upload data every 17h.
Increase max total dynamic desirability value from 100 to 250

Hotfix of < vs >

17 Feb 22:34
89fe222

Hotfix of < vs >

HF validation, clean the logs, add additional HF checks.

17 Feb 22:23
76aff7a

Fixed small dTAO bugs with OnDemand requests.
Ignored the deprecated datetime.utcnow() warning to avoid spamming validator logs.
HF validation:
  • If a miner uploads the same HF parquet files, validation fails.
  • If a miner has not added any data to its HF repo, validation fails.
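One way to picture the two HF checks is a content-hash comparison over a miner's uploaded files. This is a hedged sketch only: the function name and the in-memory `files` mapping are hypothetical, and the validator may detect duplicates by a different method.

```python
import hashlib
from typing import Dict, Tuple


def validate_uploads(files: Dict[str, bytes]) -> Tuple[bool, str]:
    """Fail when the repo is empty or two files have identical content."""
    if not files:
        return False, "no data in repo"
    seen: Dict[str, str] = {}  # content digest -> first file name seen
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen:
            return False, f"{name} duplicates {seen[digest]}"
        seen[digest] = name
    return True, "ok"
```

Hashing the file contents rather than comparing names means a re-upload under a new file name is still caught.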
Happy Mining!

dTAO update.

13 Feb 21:41
e72e4ed

In addition to the dTAO update, we have added the rao_vali_permit flag, which lets you set the minimum threshold at which validators are recognized and at which miners will respond to their requests.

API improvements, Direct data querying. Update on new task On-Demand Data Streaming.

12 Feb 21:12
2b25e68

Release 1.7.8: API improvements, Direct data querying. Update on new task On-Demand Data Streaming.

Hello @dataverse・13! We are excited to announce the latest update, which adds on-demand data streaming capabilities to SN13. This feature lets validators request small chunks of data (<1k rows) from miners through the API. It also lets API users query data from the network directly.

Key Features

On-Demand Data Streaming

  • Validators can request data directly from top miners
  • Supports both X (Twitter) and Reddit data sources
  • Built-in validation and reward system

API Bucket Querying

  • Validators can request any time bucket from the network via the API.

New Scoring Info for Miners

  • Pre-implemented on-demand handling in miner template
  • Freedom to customize scraping implementations
  • Automatic reward system:
    • Top 50% of miners by trust participate in validation
    • 50% chance of validation per request
    • Successful validation: +0.1% credibility boost
    • Failed validation: 1% credibility decrease
    (A link to scoring.md, which is still in development, will be added here.)
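The reward rules above might be sketched as follows. This is an illustration only: the names are hypothetical, and treating the percentages as absolute changes to a credibility value clamped to [0, 1] is an assumption, since the release note does not say how the adjustment is applied.

```python
import random
from typing import Callable

VALIDATION_PROBABILITY = 0.5  # 50% chance of validation per request
SUCCESS_BOOST = 0.001         # +0.1% on successful validation
FAILURE_PENALTY = 0.01        # -1% on failed validation


def update_credibility(credibility: float, passed: bool) -> float:
    """Apply the boost or penalty and clamp the result to [0, 1]."""
    delta = SUCCESS_BOOST if passed else -FAILURE_PENALTY
    return min(1.0, max(0.0, credibility + delta))


def maybe_validate(credibility: float, check: Callable[[], bool],
                   rng: random.Random) -> float:
    """With 50% probability, run `check` and adjust credibility."""
    if rng.random() >= VALIDATION_PROBABILITY:
        return credibility  # request was not sampled for validation
    return update_credibility(credibility, check())
```

Under this reading a failure costs ten times what a success earns, so a miner needs roughly ten passed validations to recover from one failed one.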

Other Updates:

HF Updates:

  • Added miner hotkey and UID in Hugging Face scoring logs.

Dynamic Desirability Updates:

  • Added endpoint to validator API to get the latest unscaled JSON desirability submission for a given hotkey.
    • If no hotkey is specified, returns the current total dynamic list used by the validator.
  • Used subtensor.commit to update retrieval and upload functions to be compatible with the dTAO update.
  • Small config change to MinerEvaluator to be compatible with the dTAO update.
  • Increased total label scale factor pool for the dynamic list to 100.
    • (All dynamically incentivized labels across all validators will have their scale factors add up to 100).
  • Added JSON formatting when displaying current dynamic desirability list for readability improvements in wandb logs.
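The scale factor pool described above can be pictured as a proportional rescaling. This is a minimal sketch assuming simple proportional scaling of raw label weights; the function name is hypothetical, and the validator's actual aggregation across validators may differ.

```python
from typing import Dict


def scale_to_pool(raw_weights: Dict[str, float], pool: float = 100.0) -> Dict[str, float]:
    """Rescale label weights so that they sum to `pool`."""
    total = sum(raw_weights.values())
    if total <= 0:
        # Nothing to distribute: every label gets a zero scale factor.
        return {label: 0.0 for label in raw_weights}
    return {label: pool * weight / total for label, weight in raw_weights.items()}
```

For instance, raw weights of 1 and 3 become scale factors of 25 and 75, which add up to the pool of 100.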

Happy Scraping!