Abstract

From a client and developer perspective, using a blockchain can be broken into three high-level steps: (1) computation tasks are submitted to the chain in the form of transactions, (2) the blockchain processes the transactions, and (3) clients inspect the resulting state.

For a highly scalable platform such as Flow, which we envision processing tens of thousands to millions of transactions per second, the events emitted during a single block’s computation and the resulting state changes will be beyond the capacity of any single server to process. The Flow network itself shoulders this massive computation and data load via its horizontally scalable, pipelined architecture. However, the approach commonly adopted in the blockchain space, namely running a single computer that replicates the entire state and computation, becomes intractable at Flow’s scale.

In the following, we describe Flow’s long-term vision for how clients and developers of the Flow network can reliably and trustlessly query and replicate the subset of the global state that is relevant to them. We emphasize that the design is Byzantine fault tolerant (BFT) and that clients can receive correctness proofs for all data they are interested in, if they so desire.

Furthermore, Flow’s design enables smartphone-sized light clients that can operate entirely trustlessly without needing to download or store chain history.

Central concepts and terminology:

Long-term goals

Flow is architected to scale beyond 1 million transactions per second. You can read the details in this paper.

  1. A short- to mid-term goal is to support (a) the state exceeding 1 petabyte in size (snapshot at one specific height) and (b) the network producing gigabytes of event data every few seconds.

  2. Trustless operations, including trustless data egress, are a central value proposition of Flow. In all likelihood, some data egress functionality will be technically intractable to implement in a trustless manner, too large an engineering lift relative to its utility, or not economically viable for clients to use due to the disproportionate cost of the proofs.

    Nevertheless, we strive to keep the set of data egress functions that are only available via trusted sources as small as possible. The important data egress functionality must be available in a trustless manner. If core data egress functions were available only through trusted or centralized entities, we would substantially weaken the value of Flow as a decentralized, trustless platform.

  3. We want to support trusted data egress as well, because omitting proofs substantially reduces hardware and energy consumption. Trust relationships based on legal or economic incentives are very common, and we should allow clients to rely on such relationships to improve efficiency and decrease cost.

  4. We want to enable developers to run their own edge nodes at low cost, allowing them to locally access the subset of state and event data that is relevant to them. Edge nodes must eventually be able to obtain their input data in a trustless manner for all prevalent use cases.
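
To make the difference between trusted and trustless data egress in goals 2 to 4 more concrete, the following is a minimal sketch of how an edge node might accept a piece of state either as-is (trusted) or only after checking an inclusion proof against a known state commitment (trustless). It uses a generic binary Merkle tree over SHA-256 purely for illustration; this is an assumption and not a description of Flow's actual state commitment or proof format. All function names (`build_levels`, `prove`, `verify`, `read_register`) and the example values are hypothetical.

```python
import hashlib
from typing import List, Optional, Tuple

# A proof is a list of (sibling_hash, sibling_is_left) pairs, leaf level first.
Proof = List[Tuple[bytes, bool]]


def leaf_hash(value: bytes) -> bytes:
    # Domain-separate leaves from inner nodes with a one-byte prefix.
    return hashlib.sha256(b"\x00" + value).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(values: List[bytes]) -> List[List[bytes]]:
    """Return all tree levels, leaves first. Assumes a non-empty input.

    An unpaired node at the end of a level is paired with itself
    (a simplification chosen for this sketch).
    """
    level = [leaf_hash(v) for v in values]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels: List[List[bytes]], index: int) -> Proof:
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof: Proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # unpaired node was hashed with itself
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(root: bytes, value: bytes, proof: Proof) -> bool:
    """Recompute the root from the value and proof; compare to the commitment."""
    current = leaf_hash(value)
    for sibling, sibling_is_left in proof:
        current = node_hash(sibling, current) if sibling_is_left else node_hash(current, sibling)
    return current == root


def read_register(root: bytes, value: bytes, proof: Optional[Proof]) -> bytes:
    """Trusted mode (proof is None) accepts the value as delivered.

    Trustless mode checks the proof and rejects data that does not match.
    """
    if proof is not None and not verify(root, value, proof):
        raise ValueError("proof does not match the state commitment")
    return value


if __name__ == "__main__":
    state = [b"account-a:10", b"account-b:25", b"account-c:7"]
    levels = build_levels(state)
    root = levels[-1][0]
    value = read_register(root, state[2], prove(levels, 2))
    print(value.decode())
```

In this sketch the trusted path skips hashing entirely, which mirrors the efficiency argument in goal 3, while the trustless path costs one hash per tree level for each value read. The proof grows with the depth of the tree rather than with the size of the state, which is the property that would let a small edge node check data drawn from a very large state.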