
By participating as a miner on Subnet 13, you are agreeing to adhere to our Miner Policy below.
Macrocosmos Miner Data Compliance Policy
Version 1.0, March 2025
As a miner on Macrocosmos Subnets, you play a crucial role in handling various types of data. This summary outlines your potential obligations under the UK General Data Protection Regulation (UK GDPR) should you inadvertently collect personal data, and the expectations regarding prohibited content and acceptable use of Macrocosmos subnet code. We expect any participant (however they may define themselves or their involvement) in our subnet ecosystems to adhere to the guidelines set out in this policy. Deliberate and consistent violation of these guidelines may result in Macrocosmos seeking to limit your ability to participate, support for your participation and/or your incentive rewards. Miners and all other participants are responsible for their own legal and regulatory compliance procedures and are encouraged to seek advice if in any doubt as to how to proceed. Macrocosmos is available to provide informal guidance if required (see contact information below).
While Macrocosmos does not directly collect or process data, and seeks to avoid incentivising any collection or interaction with personal data, as a miner, you may be subject to GDPR obligations if your activities result in the inadvertent or accidental collection of personal data. We recommend that you put in place appropriate policies and procedures to accommodate this eventuality and set out below a summary of key responsibilities:
- Lawful Basis for Processing
- You must ensure there is a lawful basis for collecting and processing any personal data (e.g., consent, legitimate interests, or legal obligation).
- Transparency
- Inform individuals about how their data is being collected, processed, and stored.
- Provide clear privacy information, including the purpose and lawful basis for processing.
- Data Minimization
- Only collect and process data that is strictly necessary for your stated purpose.
- Avoid collecting sensitive personal data unless absolutely required and lawful.
- Upholding Individuals’ Rights
- Be prepared to handle requests or objections from individuals regarding their right to access, rectify, limit processing of or delete their personal data in compliance with GDPR.
- Data Security
- Implement robust security measures to protect any data you collect from unauthorized access or breaches.
- Ensure encryption and secure storage practices are in place.
- Breach Reporting
- Notify the appropriate data protection authority (e.g., ICO) within 72 hours of becoming aware of a personal data breach.
As a miner, you are strictly prohibited from collecting, processing, or transmitting the following types of content:
- Illegal Content
- Child abuse material or exploitation.
- Hate speech, extremist propaganda, or content inciting violence.
- Content related to human trafficking or modern slavery.
- Copyrighted Material
- Content protected by copyright unless you have explicit permission or a valid license to use it.
- Explicit or Harmful Content
- Pornographic or explicit imagery.
- Content promoting self-harm, suicide, or drug abuse.
Acceptable Use Expectations
Macrocosmos expects all miners to:
- Comply with platform-specific terms of service and relevant laws, including GDPR.
- Use Macrocosmos subnets for ethical and lawful purposes only.
- Regularly review and update your data collection and processing practices to ensure compliance with legal and ethical standards.
- Immediately cease processing any flagged or prohibited material and report concerns to the appropriate authorities where required.
Commitment to Support
Macrocosmos is committed to supporting miners in understanding and meeting their GDPR obligations. To help you navigate these requirements and ensure compliance, we provide the following guidance and resources:
- Overview of GDPR Requirements
- The UK Information Commissioner’s Office (ICO) provides a comprehensive guide to GDPR obligations, including lawful bases for processing, data minimization, and security requirements:
- ICO Guide to GDPR
- Lawful Basis for Processing Data
- Understand the six lawful bases for processing personal data as defined under GDPR:
- ICO Lawful Basis Guide
- Transparency and Privacy Notices
- Guidance on providing clear and accessible privacy notices to individuals:
- ICO Privacy Notice Checklist
- Handling Data Subject Rights
- Information on responding to requests from individuals to access, rectify, or delete their personal data:
- ICO Individual Rights Guide
- Data Security and Minimization
- Best practices for securing personal data and ensuring data minimization:
- ICO Security Guidance
- Reporting Data Breaches
- Guidance on recognizing and reporting data breaches to the ICO within the required 72-hour window:
- ICO Guide to Data Breach Reporting
- UK GDPR Key Definitions
- A quick reference guide to key GDPR definitions and principles:
- ICO Key Definitions
- Data Protection Impact Assessments (DPIAs)
- Information on when and how to conduct a DPIA for high-risk data processing activities:
- ICO DPIA Guidance
- FAQs on GDPR Compliance
- Practical answers to common GDPR compliance questions:
- GDPR FAQs from the European Data Protection Board
- Regular Updates and Communication
- Macrocosmos will provide updates on GDPR-related requirements and best practices through periodic communications.
- Consultation Support
- If miners have specific questions or require clarification on GDPR obligations, Macrocosmos offers a support channel to address these concerns: [email protected]
- Training Materials
- Macrocosmos may share training materials and resources from time to time to help miners enhance their understanding of GDPR compliance.
Data is a critical pillar of AI and Data Universe is that pillar for Bittensor.
Data Universe is a Bittensor subnet for collecting and storing large amounts of data from a wide range of sources, for use by other Subnets. It was built from the ground up with a focus on decentralization and scalability. There is no centralized entity that controls the data; the data is stored across all Miners on the network and is queryable via the Validators. At launch, Data Universe is able to support up to 50 Petabytes of data across 200 miners, while only requiring ~10GB of storage on the Validator.
The Data Universe documentation assumes you are familiar with basic Bittensor concepts: Miners, Validators, and incentives. If you need a primer, please check out https://docs.bittensor.com/learn/bittensor-building-blocks.
In the Data Universe, Miners scrape data from a defined set of sources, called DataSources. Each piece of data (e.g. a webpage, BTC prices), called a DataEntity, is stored in the miner's database. Each DataEntity belongs to exactly one DataEntityBucket, which is uniquely identified by its DataEntityBucketId, a tuple of: where the data came from (DataSource), when it was created (TimeBucket), and a classification of the data (DataLabel, e.g. a stock ticker symbol). The full set of DataEntityBuckets on a Miner is referred to as its MinerIndex.
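The relationship between these terms can be sketched in a few lines of Python. This is an illustrative model only, not the subnet's actual classes: the field names, the hourly `time_bucket_for` helper, and the choice to store a byte count per bucket are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class DataEntityBucketId:
    """Identifies a bucket by where, when, and what kind of data it holds."""
    source: str            # DataSource the data came from
    time_bucket: int       # TimeBucket (assumed here: whole hours since the epoch)
    label: Optional[str]   # DataLabel, e.g. a stock ticker symbol


def time_bucket_for(created_at: datetime) -> int:
    # Hypothetical helper: assumes TimeBuckets are one hour wide.
    return int(created_at.timestamp() // 3600)


@dataclass(frozen=True)
class DataEntity:
    """A single item of scraped data."""
    uri: str
    created_at: datetime
    source: str
    label: Optional[str]
    content: bytes

    def bucket_id(self) -> DataEntityBucketId:
        return DataEntityBucketId(self.source, time_bucket_for(self.created_at), self.label)


def build_miner_index(entities: Iterable[DataEntity]) -> Dict[DataEntityBucketId, int]:
    """Summarize a miner's store as bucket id -> total content size in bytes."""
    index: Dict[DataEntityBucketId, int] = {}
    for entity in entities:
        key = entity.bucket_id()
        index[key] = index.get(key, 0) + len(entity.content)
    return index
```

Because every DataEntity maps to exactly one DataEntityBucketId, two items from the same source, hour, and label land in the same bucket, and the index stays small even when the underlying store is large.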
Validators periodically query each Miner to fetch their latest MinerIndexes and store them in a local database. This gives the Validator a complete understanding of all data that's stored on the network, as well as which Miners to query for specific types of data. Validators also periodically verify the correctness of the data stored on Miners and reward Miners based on the amount of valuable data the Miner has. Validators log to wandb anonymously by default.
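One way to picture how stored MinerIndexes tell a Validator which Miners to query is an inverted lookup from bucket to Miners. The sketch below is a simplified assumption, not the Validator's real storage layer: bucket ids are plain strings and miner ids are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_bucket_to_miners(miner_indexes: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Invert miner -> buckets into bucket -> miners holding that bucket."""
    lookup: Dict[str, Set[str]] = defaultdict(set)
    for miner_id, bucket_ids in miner_indexes.items():
        for bucket_id in bucket_ids:
            lookup[bucket_id].add(miner_id)
    return dict(lookup)


# Hypothetical indexes reported by two miners.
reported = {
    "miner_a": ["source1/hour100/labelX", "source1/hour101/labelX"],
    "miner_b": ["source1/hour100/labelX"],
}
holders = build_bucket_to_miners(reported)
```

With this lookup, finding the Miners to ask for a given bucket is a single dictionary access, for example `holders["source1/hour100/labelX"]`.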
Miners can optionally upload their local stores to HuggingFace for public dataset access. This data is anonymized for privacy purposes, to comply with the Terms of Service of each data source. See the HuggingFace docs for more information on HuggingFace uploads. In the future, publicly uploading data to HuggingFace will be required.
See the Miner and Validator docs for more information about how they work, as well as setup instructions.
As described above, each Miner reports its MinerIndex to the Validator. The MinerIndex details how much and what type of data the Miner has. The Miner is then scored on two dimensions:
- How much data the Miner has and how valuable that data is.
- How credible the Miner is.
Not all data is equally valuable! There are several factors used to determine data value:
Fresh data is more valuable than old data, and data older than a certain threshold is not scored.
As of Dec 11th, 2023, data older than 30 days is not scored. This threshold may increase in the future.
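A freshness rule of this kind could look like the sketch below. Only the 30-day cutoff comes from the text above; the linear decay between "brand new" and the cutoff, and the function name, are assumptions chosen for illustration and may differ from the Validator's actual curve.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=30)  # cutoff stated above; older data is not scored


def freshness_factor(created_at: datetime, now: datetime) -> float:
    """Return a multiplier in [0, 1]: 1.0 for new data, 0.0 past the cutoff."""
    age = max(now - created_at, timedelta(0))  # treat future timestamps as age 0
    if age > MAX_AGE:
        return 0.0
    # Assumed linear decay; timedelta / timedelta yields a float.
    return 1.0 - age / MAX_AGE


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
half_aged = freshness_factor(now - timedelta(days=15), now)
expired = freshness_factor(now - timedelta(days=31), now)
```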
Data Universe defines a DataDesirabilityLookup that defines which types of data are desirable. Data deemed desirable is scored more highly. Unspecified labels get the default_scale_factor of 0.5 meaning they score half value in comparison.
The DataDesirabilityLookup will evolve over time, but each change will be announced ahead of time to give Miners adequate time to prepare for the update.
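As a rough illustration, a desirability lookup can be modeled as a mapping from (DataSource, DataLabel) to a scale factor, with a fallback for anything unlisted. The 0.5 default comes from the text above; the table entries and the `scale_factor` function are hypothetical and do not reflect the real DataDesirabilityLookup contents.

```python
from typing import Dict, Optional, Tuple

DEFAULT_SCALE_FACTOR = 0.5  # default for unspecified labels, as stated above

# Hypothetical entries: (DataSource, DataLabel) -> scale factor.
EXAMPLE_LOOKUP: Dict[Tuple[str, Optional[str]], float] = {
    ("example_source", "example_label"): 1.0,
}


def scale_factor(
    source: str,
    label: Optional[str],
    lookup: Dict[Tuple[str, Optional[str]], float] = EXAMPLE_LOOKUP,
) -> float:
    """Return the listed scale factor, or the default for unlisted labels."""
    return lookup.get((source, label), DEFAULT_SCALE_FACTOR)
```

Under these assumptions, data with a listed label keeps its full value, while data with an unlisted label is worth half as much.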
Data that's stored by many Miners is less valuable than data stored by only a few. The value of a piece of data decreases in proportion to the number of Miners storing it.
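A minimal sketch of this duplication rule, assuming the simplest reading where a bucket's value is divided evenly by the number of Miners holding it. The bucket keys, miner ids, and raw values are invented for the example, and the real scorer may weight duplicates differently.

```python
from collections import Counter
from typing import Dict


def deduplicated_scores(indexes: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """indexes maps miner id -> {bucket id: raw value}; returns per-miner totals."""
    holders: Counter = Counter()
    for buckets in indexes.values():
        holders.update(buckets.keys())  # count how many miners hold each bucket
    return {
        miner: sum(value / holders[bucket] for bucket, value in buckets.items())
        for miner, buckets in indexes.items()
    }


# Hypothetical example: both miners store bucket "a"; only m1 stores bucket "b".
scores = deduplicated_scores({
    "m1": {"a": 100.0, "b": 50.0},
    "m2": {"a": 100.0},
})
```

Here the shared bucket "a" is worth 50 to each Miner instead of 100, while the unique bucket "b" keeps its full value for m1.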
Validators remain suspicious of Miners and so they periodically check a sample of data from each Miner's MinerIndex, to verify the data correctness. The Validator uses these checks to track a Miner's credibility, which it then uses to scale a Miner's score. The scaling is done in such a way that it is always worse for a Miner to misrepresent what types and how much data it has.
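The idea of credibility-scaled scoring can be sketched as follows. The moving-average update, the `alpha` value, and the exponent are all assumptions picked for illustration; the text above only states that credibility is tracked from validation checks and used to scale the score.

```python
def update_credibility(current: float, passed: bool, alpha: float = 0.15) -> float:
    """Exponential moving average of check results (alpha is an assumed value)."""
    result = 1.0 if passed else 0.0
    return alpha * result + (1.0 - alpha) * current


def scaled_score(raw_score: float, credibility: float, exponent: float = 2.5) -> float:
    """Scale the raw score by credibility raised to an assumed exponent > 1."""
    return raw_score * credibility ** exponent


# An honest miner reports 100 units and keeps full credibility.
honest = scaled_score(100.0, 1.0)
# A miner claiming twice as much, whose credibility has dropped to 0.5.
dishonest = scaled_score(200.0, 0.5)
```

With an exponent above 1, a drop in credibility costs more than the inflated claim gains in this example: the over-reporting miner ends up well below the honest one.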
As you can see from the above, Data Universe rewards diversity of data (storing 200 copies of the same data isn't exactly beneficial!).
To help understand the current data on the Subnet, the Data Universe team hosts a dashboard (https://shorturl.at/Ca5uu), showing the amount of each type of data (by DataEntityBucketId) on the Subnet. Miners are strongly encouraged to use this dashboard to customize their Miner Configuration, to maximize their rewards.
See Miner Setup to learn how to set up a Miner.
See Validator Setup to learn how to set up a Validator.
- A Validator API to allow other Subnets to query the data.
- More data sources
DataDesirabilityLookup: A defined list of rules that determine how desirable data is, based on its DataSource and DataLabel.
DataEntity: A single "item" of data collected by a Miner. Each DataEntity has a URI that the Validators can use to retrieve the item from its DataSource.
DataEntityBucket: A logical grouping of DataEntities, based on its DataEntityBucketId.
DataEntityBucketId: The unique identifier for a DataEntityBucket. It contains the TimeBucket, DataSource, and DataLabel.
DataLabel: A label associated with a DataEntity. Precisely what the label represents is unique to the DataSource. For example, for a Yahoo finance DataSource, the label is the stock ticker of the finance data.
DataSource: A source from which Miners scrape data.
Miner Credibility: A per-miner rating, based on how often they pass data validation checks. Used to heavily penalize Miners who misrepresent their MinerIndex.
Miner Index: A summary of how much and what types of data a Miner has. Specifically, it's a list of DataEntityBuckets.
We welcome feedback!
If you have a suggestion, please reach out to @arrmlet, @ewekazoo or any of the broader Macrocosmos Team on the Discord channel, or file an Issue.