Data availability in a blockchain refers to the ability of nodes on the network to access and verify the transaction data included in each produced block.
Data on a blockchain is not stored in a single location. Instead, it is replicated across multiple nodes in the network. Data availability ensures that every participant can retrieve the transaction data needed to validate blocks and transactions. If the transaction data is withheld, or if it can be tampered with, nodes can no longer validate blocks independently, which creates a vulnerability that malicious actors can exploit.
There are two primary types of data availability:
Newer data availability solutions enable nodes to verify that data is present without downloading the entire dataset. Two innovations in this field are Data Availability Sampling and Data Availability Committees.
Data Availability Sampling (DAS): DAS works by having nodes conduct multiple rounds of random sampling, each requesting a small portion of the block data. With every successful round, a node becomes more confident that the full data is available. Once the node reaches a predetermined confidence level (e.g., 99%), it considers the block data available.
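The relationship between sampling rounds and confidence can be sketched in a few lines of Python. This is a minimal illustration, not any protocol's actual implementation. It assumes that an attacker would have to withhold at least a fixed fraction of the block's chunks to make the data unrecoverable (50% here, a figure often associated with erasure-coded blocks, and an assumption that does not come from the text above). The names `confidence_after`, `samples_needed`, `sample_block`, and `chunk_available` are hypothetical.

```python
import math
import random
from typing import Callable, Optional


def confidence_after(samples: int, withheld_fraction: float = 0.5) -> float:
    """Confidence that data is available after `samples` successful rounds.

    Assumption: if data were unavailable, at least `withheld_fraction` of the
    chunks would be missing, so each random sample would fail with at least
    that probability.
    """
    return 1.0 - (1.0 - withheld_fraction) ** samples


def samples_needed(target: float, withheld_fraction: float = 0.5) -> int:
    """Smallest number of successful rounds that reaches `target` confidence."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - withheld_fraction))


def sample_block(
    chunk_available: Callable[[int], bool],
    total_chunks: int,
    target: float = 0.99,
    withheld_fraction: float = 0.5,
    rng: Optional[random.Random] = None,
) -> bool:
    """Sample random chunk indices; accept the block only if every request succeeds.

    `chunk_available` is a hypothetical callback standing in for a network
    request that returns True when the chunk at that index is served.
    """
    rng = rng or random.Random()
    for _ in range(samples_needed(target, withheld_fraction)):
        index = rng.randrange(total_chunks)
        if not chunk_available(index):
            return False  # a single missing sample is enough to reject
    return True
```

Under these assumptions, `samples_needed(0.99)` works out to 7 rounds, because each successful round halves the chance that a block with half of its chunks withheld has gone unnoticed.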
Data Availability Committees (DACs): A DAC is a permissioned group of nodes responsible for providing data availability to a blockchain. Its members commit to storing copies of input data and to serving that data upon request.
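The store-and-serve role of a committee can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the design of any particular DAC: data is identified by its SHA-256 hash, every member keeps a full copy, and a client accepts the first response whose hash matches. The names `CommitteeMember`, `commitment`, and `retrieve` are hypothetical.

```python
import hashlib
from typing import Dict, List, Optional


def commitment(data: bytes) -> str:
    """Identify a piece of data by its SHA-256 digest (an assumed scheme)."""
    return hashlib.sha256(data).hexdigest()


class CommitteeMember:
    """A hypothetical committee node that stores data and serves it on request."""

    def __init__(self, name: str, online: bool = True) -> None:
        self.name = name
        self.online = online
        self._store: Dict[str, bytes] = {}

    def store(self, data: bytes) -> str:
        """Keep a copy of the data and return the digest it is filed under."""
        digest = commitment(data)
        self._store[digest] = data
        return digest

    def fetch(self, digest: str) -> Optional[bytes]:
        """Return the stored data, or None if offline or the data is unknown."""
        if not self.online:
            return None
        return self._store.get(digest)


def retrieve(committee: List[CommitteeMember], digest: str) -> Optional[bytes]:
    """Ask members in turn; accept the first response that matches the digest."""
    for member in committee:
        data = member.fetch(digest)
        if data is not None and commitment(data) == digest:
            return data
    return None  # no member served valid data: the data is unavailable
```

In this model the data stays retrievable as long as at least one member is online and honest, which shows why users of a DAC have to trust the committee's membership.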
Data availability protocols aim to provide a decentralized and scalable off-chain data availability layer.
Celestia: A decentralized data availability network that acts as a separate data availability layer for other blockchains. Celestia is not responsible for executing or settling transactions; it only ensures that transaction data is available for anyone to access and verify.
Polygon Avail: A modular blockchain that separates data availability from execution and consensus. It leverages validity proofs based on KZG polynomial commitments to enable scalable data availability.