Training Process
The training process is essential for adapting models to new voice data and improving the quality of the Text-to-Speech (TTS) output. This process ensures the integration of user-specific voice characteristics while maintaining privacy and efficiency across decentralized nodes.
Step 1: Client Data Submission and Smart Contract Initialization
The client specifies the model characteristics and submits training data (e.g., voice samples). This data is encrypted before submission, ensuring privacy.
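The client-side encryption step can be sketched as follows. This is an illustrative stand-in, not the protocol's actual scheme: the HMAC-based keystream stands in for a real authenticated cipher such as AES-GCM, and all function and variable names here are hypothetical.

```python
import hashlib
import hmac
import secrets

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudorandom keystream from key + nonce (toy stand-in for AES-CTR)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]

def encrypt(key: bytes, plaintext: bytes) -> tuple[bytes, bytes]:
    """Encrypt locally; only the (nonce, ciphertext) pair leaves the client."""
    nonce = secrets.token_bytes(16)
    ct = bytes(a ^ b for a, b in zip(plaintext, keystream(key, nonce, len(plaintext))))
    return nonce, ct

def decrypt(key: bytes, nonce: bytes, ct: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(ct, keystream(key, nonce, len(ct))))

# The client encrypts voice samples before submission, keeping the key private.
client_key = secrets.token_bytes(32)
sample = b"raw PCM voice sample bytes"
nonce, blob = encrypt(client_key, sample)
restored = decrypt(client_key, nonce, blob)
```

The key stays with the client; in Step 3 a temporary decryption key is shared with the selected nodes for the duration of the training session only.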
The smart contract manages the distribution of training tasks and holds client payment in escrow until successful validation.
Step 2: Training Cost Calculation
The system estimates the training cost by selecting a set of training nodes and validator nodes. Node selection balances each node's quality score against its individual cost.
Step 3: Client Authorization and Payment
If the client accepts the quoted cost, they must transfer payment tokens to the smart contract.
The client provides temporary decryption keys to selected nodes, allowing them to use the data for this specific training session only.
Step 4: Training Process Execution
Training nodes use optimization algorithms to adjust the model weights according to the provided voice samples and details.
During training, nodes perform intermediate evaluations using test samples to ensure the model is converging towards the desired output, sharing only the encrypted evaluation metrics with validators.
Step 5: Validation
Validators assess the accuracy and consistency of the updated model by applying it to a set of test inputs and comparing the results against expected outputs.
Validators reach a decision using the following voting mechanism:
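A common form for such a mechanism is a weighted majority vote, where each validator's vote counts in proportion to its stake or reputation. The weighting and the two-thirds threshold below are illustrative assumptions:

```python
def validate(votes: dict[str, bool], weights: dict[str, float],
             threshold: float = 2 / 3) -> bool:
    """Accept the model if the approving validators' combined weight
    reaches the threshold fraction of total voting weight."""
    total = sum(weights.values())
    approving = sum(weights[v] for v, approved in votes.items() if approved)
    return approving / total >= threshold

# Hypothetical session: two of three validators approve the trained model.
votes = {"v1": True, "v2": True, "v3": False}
weights = {"v1": 1.0, "v2": 1.5, "v3": 1.0}
accepted = validate(votes, weights)  # approving weight 2.5 of 3.5 total
```

If the vote passes, the smart contract releases the escrowed payment; otherwise the training result is rejected.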
Step 6: Reward and Penalty Distribution
Once the training is validated, rewards and penalties are distributed.
Nodes that successfully complete the training receive an increase in their reputation score, which raises their probability of being selected for future tasks.
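The settlement step might look like the sketch below: escrowed tokens are split among nodes whose work passed validation, and reputations are adjusted. The reputation increment, penalty, and equal-split payout are hypothetical parameters, not the protocol's actual values:

```python
def settle(escrow: float, outcomes: dict[str, bool], reputation: dict[str, float],
           gain: float = 0.05, penalty: float = 0.1) -> dict[str, float]:
    """Split escrowed payment among validated nodes and update reputations.

    gain / penalty are illustrative reputation deltas; reputation is
    clamped to [0, 1].
    """
    winners = [n for n, ok in outcomes.items() if ok]
    payouts = {n: (escrow / len(winners) if n in winners else 0.0) for n in outcomes}
    for n, ok in outcomes.items():
        if ok:
            reputation[n] = min(1.0, reputation[n] + gain)
        else:
            reputation[n] = max(0.0, reputation[n] - penalty)
    return payouts

# Hypothetical session: node t1 passed validation, node t2 did not.
rep = {"t1": 0.8, "t2": 0.5}
payouts = settle(100.0, {"t1": True, "t2": False}, rep)
```

Failed nodes forfeit their share of the escrow and lose reputation, which lowers their selection probability in Step 2 of future sessions.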
Step 7: Weights Encryption and Distribution
Once validated, the updated model is distributed across training nodes in the network for future TTS tasks, ensuring consistency and availability.
The updated model is encrypted before being stored and requires client permission for any future use, ensuring continuous privacy control.
Clarification: Processor nodes can use different models to perform the TTS task, so each training node must specify which model its weights are compatible with. For efficiency, a single node could act as both a training node and a processor node for its own model. However, we believe a modular approach enables greater decentralization and scalability.