1 of 39

SEPTEMBER 12, 2024 REPORT


Table of contents

Highlights of the month
TL;DR All Subnets
Comms and PR - Aug & Sep, 2024
Apex - Subnet 1
Pre-training - Subnet 9
Data Universe - Subnet 13
Protein Folding - Subnet 25
Finetuning - Subnet 37


MULTIPLE COMPETITIONS HAVE BEEN LAUNCHED FOR SUBNET 9, including the 14B model competition!

New DASHBOARD is available

Subnet 9 WHITEPAPER was published!

SN9 PRE-TRAINING

FINE-TUNING

SUBNET 25

Decentralized Protein Folding just got an upgrade

Released a new OpenMM-based simulation technique on Testnet to improve the reproducibility and capabilities of the future TaoFold

SN25 PROTEIN FOLDING

RELEASE 2.8.0

FOR SUBNET 1

Coming 2024-09-17
Modifications to the synapses and new task types with separate sub-competitions were added to the subnet to improve the variety and quality of miners' models

Programming Task
Web Retrieval
Inference Task

SN1 APEX

RELEASE 2.0.0

FOR SUBNET 37
New sub-competition launched, based on a synthetic MMLU-like dataset generated from Subnet 1

SN37 FINETUNING


TL;DR All Subnets

SUBNETS PLANS & ACHIEVEMENTS

Modify the reward landscape to incentivise specialisation, make the inference task effective, and bring each competition closer to winner-takes-all
Enhance subnet stability by improving the testing process

SUBNET

LLM task enhancements for better model quality: Programming Task, Web Retrieval, Inference Task
A new synapse (Availability Synapse) added, and existing synapses modified, to provide task-type data

IN PROGRESS

COMPLETED

Enhancement of the new leaderboard front end for better user experience
Adding a Benchmark
New strategy development and experimenting with 100B+ models

Stabilised the current competitions - 700M, 3B, 7B, 14B
5 miners are already in the 14B competition!
Removed 7B* competition
Decaying epsilon function activated - papers published

Implementing a scoring mechanism change to incentivise miners to upload datasets to Hugging Face
Research and design on dataset quality - Data Duplication
New design of the Dashboard Mockup

Research on Dynamic Desirability
Implemented validation scripts for HF datasets from X and Reddit
Researching long-term solutions for Twitter validation to sort out issues with Apify availability

Expanding the protein set to extend Protein Folding simulation capabilities in TaoFold
Launching OpenMM in main with community meeting
Wandb logging improvement for better user experience

Upgrade simulator to SOTA - implementing OpenMM in the Testnet
Successful experiments with reproducibility
R&D on PDB conversions to expand the protein set

Further promotion of the subnet by integrating into more front-ends to increase the visibility of subnet performance
Producing a POC of possible products built on the subnet service
Stabilising Version 2.0.0

Start of 2nd sub-competition based on a synthetic MMLU-like dataset generated from the Text Prompting subnet (SN1)
Implemented Dynamic Epsilon function


WHITE PAPER is published

2 months, 70,000 proteins folded: Our roadmap for scaling Subnet 25

Fine-tuning, finely tuned: How SN37 is delivering SOTA fine-tuning on Bittensor

COMPLETED

Comms and PR - August & September

PUBLISHED ARTICLES & MEDIA RESOURCE ACTIVITIES

How we’re expanding the Data Universe

biggest and freshest sources of open-source social media datasets on Hugging Face

More organic queries are coming to Subnet 1!

Mentions of WHITEPAPERS in media

LinkedIn
X
Substack

The Epsilon Experiments: How thresholds incentivise intelligence on SN9

Decentralized Protein Folding just got an upgrade


SUBNET 1 APEX UPDATE


BENCHMARK

BENCHMARKING THE NETWORK ON MMLU

The miners are performing very well and are not overfitted


Q3 ROADMAP

Release of Organic Scoring

Ensures that miners provide genuine responses, improving the quality of interactions and allowing us to officially bring Chattensor online

Release Chattensor Public Beta (chat.macrocosmos.ai) to allow users to interface directly with the network again
Make SN1 more accessible for miners, improve subnet scalability and prepare for future changes:

Comprehensive Codebase Refactor
Pruning of Low-Value Tasks

Enhance the API to allow better task-specific access to the models hosted on SN1
Add higher-value tasks that allow us to incentivize the development of SOTA models
Modify the reward landscape to allow miners to specialise in individual tasks, leading to higher quality models

COMPLETE


Q3 ROADMAP

MAJOR MILESTONES IN SN1 APEX DEVELOPMENT


SEPTEMBER DELIVERABLES COMPLETE

DETAILED DELIVERY FOR SN1 APEX DEVELOPMENT


SEPTEMBER DELIVERABLES IN PROGRESS

DETAILED DELIVERY FOR SN1 APEX DEVELOPMENT


chat.macrocosmos.ai

GitHub - macrocosm-os/language-models

Prompting Subnet Dashboard

Macrocosmos | Substack

Bittensor Whitepapers

Bittensor Guru podcast

RESOURCES

LINKS TO THE RESOURCES RELEVANT TO SUBNET 1


SUBNET 9 PRE-TRAINING UPDATE


Q3 ROADMAP

7B, 7B* and 700M parameter competitions added to the competition pool with the current dataset - 13 Aug 2024
Sample packing deprecated when computing the loss: each validation query is now derived from a single sample. Released our first report comparing top miners' models to state-of-the-art models on standard benchmarks and introduced the 3B parameter competition (same dataset) - 13 Aug 2024
The 14B competition added to the competition pool 2 weeks earlier than planned! 27 Aug 2024 - 42% of the total reward pool
Changes to encourage miners to submit their absolute best models, by 3 Sep 2024, to incentivise collaboration on model improvements
Validation set becomes more diverse by 17 Sept 2024. Instead of using a single dataset for validation, multiple datasets are utilized. This change aims to encourage miners to diversify their training sets
Validation set becomes larger (by making the evaluation pipeline faster) by 24 Sept 2024, reducing variance between validators.

COMPLETE
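
As a rough illustration of why a larger, multi-dataset validation set reduces variance between validators, the sketch below pools per-sample losses from several datasets and reports the standard error of the mean loss. The dataset names, loss values, and the choice to pool all samples with equal weight are assumptions for illustration only; the report does not describe the actual evaluation pipeline.

```python
import math
import statistics


def mean_and_stderr(losses):
    """Return the mean loss and the standard error of that mean."""
    mean = statistics.fmean(losses)
    stderr = statistics.stdev(losses) / math.sqrt(len(losses))
    return mean, stderr


def pooled_eval(per_dataset_losses):
    """Pool per-sample losses from every dataset (equal weight per sample)."""
    pooled = [x for losses in per_dataset_losses.values() for x in losses]
    return mean_and_stderr(pooled)


# Hypothetical per-sample losses; one loss per validation sample.
small = {"dataset_a": [2.0, 2.4]}
large = {
    "dataset_a": [2.0, 2.4, 2.0, 2.4],
    "dataset_b": [2.0, 2.4, 2.0, 2.4],
}

mean_small, se_small = pooled_eval(small)
mean_large, se_large = pooled_eval(large)
print(f"small set: mean={mean_small:.3f} stderr={se_small:.3f}")
print(f"large set: mean={mean_large:.3f} stderr={se_large:.3f}")
```

With the same spread of per-sample losses, the larger set gives a smaller standard error, so two validators that draw different samples are more likely to agree on a model's score.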


Q3 ROADMAP

MAJOR MILESTONES IN SN9 PRE-TRAINING DEVELOPMENT


SEPTEMBER DELIVERABLES COMPLETE

DETAILED DELIVERY FOR SN9 PRE-TRAINING DEVELOPMENT


SEPTEMBER DELIVERABLES IN PROGRESS

DETAILED DELIVERY FOR SN9 PRE-TRAINING DEVELOPMENT


PRE-TRAINING DASHBOARD AND ENHANCED MOCKUP


GitHub - macrocosm-os/pretraining

Leaderboard

Macrocosmos | Substack

WHITEPAPER for subnet 9

RESOURCES

LINKS TO THE RESOURCES RELEVANT TO SUBNET 9


SUBNET 13 DATA UNIVERSE UPDATE


Q3-Q4 ROADMAP AND HIGHLIGHTS

Extend Open Source and Accessibility
This comprises revamping rewards and data desirability to be flexible and change with current interests

Hugging Face Datasets Available with rewards landscape changes - COMPLETED
Additional Data Sources - IN PROGRESS

A Dynamic Subnet for a Dynamic Landscape
This comprises revamping rewards and data desirability to be flexible and change with current interests

We’re upgrading from a static data desirability lookup to dynamic voting! - IN PROGRESS
Dynamic Desirability will allow validators to direct scraping to desired topics, data sources, and users, with voting power granted based on their stake. - IN PROGRESS
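
A minimal sketch of how stake-weighted voting on data desirability could work, assuming each validator submits a table of label weights and that voting power is simply proportional to stake. The function name, label strings, and the normalisation scheme are hypothetical; the report does not specify the actual mechanism.

```python
from collections import defaultdict


def aggregate_desirability(votes, stakes):
    """Combine per-validator label weights into one stake-weighted table.

    votes:  {validator: {label: weight}}  each validator's preferences
    stakes: {validator: stake}            used as voting power
    """
    total_stake = sum(stakes.get(v, 0.0) for v in votes)
    if total_stake <= 0:
        return {}
    combined = defaultdict(float)
    for validator, prefs in votes.items():
        power = stakes.get(validator, 0.0) / total_stake
        norm = sum(prefs.values())
        if norm <= 0:
            continue  # skip validators with an empty or all-zero vote
        for label, weight in prefs.items():
            # Each validator's vote is normalised so it spends exactly
            # its share of the total voting power.
            combined[label] += power * weight / norm
    return dict(combined)


# Hypothetical validators, stakes, and labels.
votes = {
    "validator_a": {"r/bittensor": 3.0, "#ai": 1.0},
    "validator_b": {"#ai": 1.0},
}
stakes = {"validator_a": 300.0, "validator_b": 100.0}

table = aggregate_desirability(votes, stakes)
for label, weight in sorted(table.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {weight:.4f}")
```

Normalising each validator's vote before weighting keeps a validator from gaining influence by submitting larger raw numbers; only stake changes voting power in this sketch.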

Improve Subnet Quality of Life
This involves regular upkeep and upgrades to the repo according to miner/validator feedback

Creating a new version of Dashboard for User Experience improvement - IN PROGRESS
Gauging community interests and values through community calls and polls - IN PROGRESS


Q3 ROADMAP

MAJOR MILESTONES IN SN13 DATA UNIVERSE DEVELOPMENT


SEPTEMBER DELIVERABLES COMPLETE

DETAILED DELIVERY FOR SN13 DATA UNIVERSE DEVELOPMENT


SEPTEMBER DELIVERABLES IN PROGRESS

DETAILED DELIVERY FOR SN13 DATA UNIVERSE DEVELOPMENT


GitHub - macrocosm-os/data-universe

Miner-side Dashboard

HF Dataset Explorer Dashboard

Macrocosmos | Substack

RESOURCES

LINKS TO THE RESOURCES RELEVANT TO SUBNET 13


SUBNET 25 PROTEIN FOLDING


Q3 ROADMAP AND HIGHLIGHTS

Ensure the validity of simulations

MVP for Folding Product TaoFold

Mockups are in the POC with research teams
Leverage beta test feedback from experts, validators, and general users to refine our product

Upgrade simulator to SOTA

Implement integration with OpenMM - on Testnet since Sep 06, 2024!
Stress-test the system to prove that scalability is possible

Implement first iteration of the Dashboard as a product Front End


MAJOR MILESTONES IN SN25 PROTEIN FOLDING DEVELOPMENT

Q3 ROADMAP


SEPTEMBER DELIVERABLES COMPLETE

DETAILED DELIVERY FOR SN25 PROTEIN FOLDING DEVELOPMENT


SEPTEMBER DELIVERABLES IN PROGRESS

DETAILED DELIVERY FOR SN25 PROTEIN FOLDING DEVELOPMENT


TAOFOLD MOCKUP

MVP ITERATION OF EXTERNAL PRODUCT


GitHub - macrocosm-os/folding

Protein Folding Dashboard

Macrocosmos | Substack

RESOURCES

LINKS TO THE RESOURCES RELEVANT TO SUBNET 25


SUBNET 37 FINETUNING


Q3 ROADMAP

Improve the subnet 37 framework and incentive structure and polish the frontend

Prohibit models that can’t be further trained - COMPLETED
Polish the front-end leaderboard to improve user experience - COMPLETED

Further promotion of the subnet by integrating into more front-ends to increase the visibility of subnet performance - IN PROGRESS
Expanding the framework with new data sources and improved evaluation mechanisms

Partnering with Subnet 1 to evaluate using a multiple-choice dataset derived from Wikipedia - COMPLETED

Q4 -> Producing a POC of possible products built on the subnet service


SEPTEMBER DELIVERABLES COMPLETE

DETAILED DELIVERY FOR SN37 FINETUNING DEVELOPMENT

New release supports the start of our 2nd competition starting on block 3790400! This competition is based on a synthetic MMLU-like dataset generated from the Text Prompting subnet (SN1). For more details, see Release 2.0.0

Subnet Improvements

New Multi-choice competition introduced
Dynamic Epsilon implemented for all competitions starting on block 3790400
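
To make the idea of a dynamic (decaying) epsilon concrete, here is a minimal sketch under stated assumptions: epsilon is the relative margin by which a newer model must beat an older one, and it shrinks linearly with the older model's age in blocks. The start value, end value, decay window, and the comparison rule are all hypothetical; the report does not give the real function or its parameters.

```python
def decayed_epsilon(current_block, model_block,
                    eps_start=0.005, eps_end=0.0005, decay_blocks=50_400):
    """Linearly decay epsilon from eps_start to eps_end as a model ages.

    All three parameters are illustrative placeholders, not subnet values.
    """
    age = max(0, current_block - model_block)
    frac = min(1.0, age / decay_blocks)
    return eps_start + (eps_end - eps_start) * frac


def challenger_wins(incumbent_loss, challenger_loss, epsilon):
    """A newer model wins only if it beats the older loss by the margin."""
    return challenger_loss < incumbent_loss * (1.0 - epsilon)


start_block = 3_790_400  # block mentioned in the report; used here as an example
fresh = decayed_epsilon(start_block, start_block)
aged = decayed_epsilon(start_block + 50_400, start_block)

# The same small improvement fails against a fresh incumbent
# but succeeds once the incumbent's epsilon has decayed.
print(challenger_wins(2.0, 1.995, fresh))
print(challenger_wins(2.0, 1.995, aged))
```

The design intent in this sketch is that a freshly uploaded model is protected against near-copies, while an old model can eventually be displaced by a small but genuine improvement.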

Validator Improvements

Reduced the number of top models kept for each competition. This allows validators to process more new models in each eval loop and have shorter eval loops during regular operations


SEPTEMBER DELIVERABLES IN PROGRESS

DETAILED DELIVERY FOR SN37 FINETUNING DEVELOPMENT

Further promotion of the subnet by integrating into more front-ends to increase the visibility of subnet performance
Producing a POC of possible products built on the subnet service


GitHub - macrocosm-os/finetuning

GitHub - macrocosm-os/taoverse

Finetuning Leaderboard

Macrocosmos | Substack

RESOURCES

LINKS TO THE RESOURCES RELEVANT TO SUBNET 37


FOR MORE DETAILS OR ANY QUESTIONS ON THE CURRENT REPORT OR PRODUCT, PLEASE REACH OUT TO

Elena Nesterova elena.nesterova@macrocosmos.ai

Alma Schalen alma.schalen@macrocosmos.ai