Project Catalyst and Voltaire bring power to the people

Establishing a long-term future for Cardano growth has begun with a treasury and democratic voting in the Catalyst project

10 September 2020 Bingsheng Zhang 7 mins read

Designing a groundbreaking proof-of-stake blockchain means it is vital to ensure that the system is self-sustainable. This will allow it to drive growth and maturity in a truly decentralized and organic way. Voltaire is IOHK’s way of establishing this capability, allowing the community to maintain the Cardano blockchain while continuing to develop it by proposing and implementing system improvements. This puts the power to make decisions in the hands of ada holders.

Rigorous research lies at the heart of building a solid blockchain. July’s Shelley summit included a presentation on the importance of funding for the growth of Cardano. This was based on joint research by Lancaster University and IOHK into the notion of a treasury system and an effective, democratic approach to funding Cardano’s long-term development. IOHK has now applied treasury mechanism capabilities in Project Catalyst, which combines research, social experiments, and community consent to establish an open, democratic culture within the Cardano community.

A democratic approach

With the rapid growth of blockchain technology, the world is witnessing the emergence of platforms across a variety of industries. Technological growth and maturity are essential for long-term blockchain sustainability, so growth and system enhancements must be supported and funded. A democratic approach is an integral part of the blockchain ecosystem because it allows sustainability decisions to be made collaboratively, without relying on a central governing entity. The governing and decision-making process must therefore be collective. This will allow users to understand how improvements are made, who makes decisions, and ultimately where the funding comes from to make these choices.

Long-term sustainability

There are several ways to raise capital for development purposes. Donations, venture capital funding, and initial coin offerings (ICOs) are the most common. However, although such models may work for raising initial capital, they rarely ensure a long-term funding source or predict the amount of capital needed for development and maintenance. In addition, these models suffer from centralized control, making it difficult to find a consensus meeting the needs and goals of everyone.

To establish a long-term funding source for blockchain development, some cryptocurrency projects apply taxation, taking a percentage of fees or rewards and accumulating them in a separate pool – a treasury. Treasury funds can then be used for system development and maintenance. In addition, treasury reserves appreciate as the value of the cryptocurrency grows, providing another potential source of funds.

However, funding systems are often at risk of centralization when making decisions on guiding development. In these systems, only certain individuals in the organization or company are empowered to make decisions on how to use available funds and for which purposes. Considering that the decentralized architecture of blockchain makes it inappropriate to have centralized control over funding, disagreement can arise among organization members and lead to complex disputes.

Treasury systems and Cardano

A number of treasury systems have arisen to address these problems. These systems may consist of iterative treasury periods during which project funding proposals are submitted, discussed, and voted on. However, common drawbacks include poor voter privacy or weak ballot submission security. In addition, the soundness of funding decisions can be compromised if master nodes are subject to coercion, or a lack of expert involvement can encourage irrational behavior.

As a third-generation cryptocurrency platform, Cardano was created to solve the difficulties encountered by previous platforms.

Cardano aims to bring democracy to the process, giving power to everyone and so ensuring that decisions are fair. For this, it is crucial to put in place transparent voting and funding processes. This is where Voltaire comes in.

The paper on treasury systems for cryptocurrencies introduces a community-controlled, decentralized, collaborative decision-making mechanism for sustainable funding of blockchain development and maintenance. This approach to collaborative intelligence relies on ‘liquid democracy’ – a hybrid of direct and representative democracy that provides the benefits of both systems.

This approach enables the treasury system to take advantage of expert knowledge in a voting process, as well as ensuring that all ada holders are granted an opportunity to vote. Thus, for each project, a voter can either vote directly or delegate their voting power to a member of the community who is an expert on the topic.

To ensure sustainability, the treasury system is controlled by the community and is refilled constantly from potential sources such as:

  • some newly-minted coins being held back as funding
  • a percentage of stake pool rewards and transaction fees
  • additional donations or charity

Because funds are being accumulated continually, it will be possible to fund projects and pay for improvement proposals.
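
As a back-of-the-envelope illustration, here is a toy Haskell model of how such a treasury could accrue funds each epoch from those sources; the field names and the single 'cut' parameter are assumptions for this sketch, not Cardano's actual treasury parameters:

```haskell
-- A toy model of the refill sources listed above; all names are illustrative.
data EpochInflow = EpochInflow
  { mintedHoldback  :: Double  -- newly minted coins held back as funding
  , poolRewards     :: Double  -- total stake pool rewards this epoch
  , transactionFees :: Double
  , donations       :: Double  -- additional donations or charity
  }

-- Accumulate the treasury: a fixed cut of rewards and fees, plus the
-- held-back coins and donations, every epoch.
treasuryInflow :: Double -> EpochInflow -> Double
treasuryInflow cut e =
    mintedHoldback e
  + cut * (poolRewards e + transactionFees e)
  + donations e
```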

So, the funding process can consist of ‘treasury periods’, with each being divided into the following phases:

  • pre-voting
  • voting
  • post-voting

During each period, project proposals may be submitted, discussed by experts and voters, and finally voted for to fund the most essential projects. Even though anyone can submit a proposal, only certain proposals can be supported depending on their importance and desirability for network development.

Voting and decision making

To understand which project should be funded first, let’s discuss the process of decision-making.

Ada holders who participate in treasury voting include scientists and developers, management team members, and investors and the general public. Each of these may have different imperatives for the growth of the system, and that is why there has to be a way to make these choices and desires work together.

For this, the voting power is proportional to the amount of ada someone owns; the more ada, the more influence in making decisions. As part of the liquid democracy approach, as well as direct yes/no voting, an individual may delegate their voting power to a member of the community who is an expert on the topic. In this case, the expert will be able to vote directly on the proposal they regard as the most important. After the voting, project proposals may be scored based on the number of yes/no votes and be shortlisted; the weakest project proposals will be discarded. Then, shortlisted proposals can be ranked according to their score, and the top-ranked proposals will be funded in turn until the treasury fund is exhausted. Dividing the decision-making process into stages in this way allows consensus to be reached on the priority of improvements.
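
As a concrete illustration, here is a minimal Haskell sketch of that staged process. The names, the simple yes-minus-no score, and the stop-when-exhausted rule are all assumptions for this example, not the actual Catalyst algorithm:

```haskell
import Data.List (sortBy)
import Data.Ord  (Down (..), comparing)

-- Score proposals by stake-weighted yes/no votes, discard the weakest,
-- rank the rest, and fund from the top until the treasury is exhausted.
-- (Delegated votes are assumed already folded into the yes/no tallies.)
data Proposal = Proposal
  { title :: String
  , cost  :: Integer  -- requested funding
  , yes   :: Integer  -- stake-weighted yes votes
  , no    :: Integer  -- stake-weighted no votes
  }

score :: Proposal -> Integer
score p = yes p - no p

fundProposals :: Integer -> [Proposal] -> [Proposal]
fundProposals budget proposals = go budget ranked
  where
    shortlist = filter ((> 0) . score) proposals      -- drop the weakest
    ranked    = sortBy (comparing (Down . score)) shortlist
    go remaining (p : ps)
      | cost p <= remaining = p : go (remaining - cost p) ps
    go _ _ = []                                       -- treasury exhausted
```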

To ensure voter privacy, the research team has invented an ‘honest verifier zero-knowledge proof for unit vector encryption with logarithmic size communication’. Zero-knowledge techniques are mathematical methods used to verify things without revealing any underlying data. In this case, the zero-knowledge proof means that someone can vote without revealing any information about themselves, other than that they are eligible to vote. This eliminates any possibility of voter coercion.

Treasury prototypes have been created at IOHK for benchmarking. Implementing the research as the basis of Voltaire will help to deliver reliable and secure means for treasury voting and decision-making. Project Catalyst is an experimental treasury system that combines proposal and voting procedures, focusing on the establishment of a democratic culture within the Cardano community. Initially, Cardano’s treasury will be refilled from a percentage of stake pool rewards, ensuring a sustainable treasury source. Other blockchains have treasury systems, but IOHK’s combines complete privacy through zero-knowledge proofs, liquid democracy through expert involvement and vote delegation, and participation for all, not just a governing entity. This should encourage participation, incentivization, and decentralization for making fair and transparent decisions.

It is also important to note that this treasury mechanism can be implemented on a variety of blockchains, not just Cardano. Implementing it for Ethereum Classic has already been proposed. In the process, treasury systems can help everyone to understand how a network will develop.

After a successful closed user-group trial that started this summer, Project Catalyst will very soon be opened up to its first public beta program. Although it is still early days for Cardano on-chain governance, we look forward to a bright future, with the community lighting the way. So, please follow the blog for updates on Voltaire and how Project Catalyst is paving the way to Cardano’s sustainability.

The decline and fall of centralization

This week marks the first step in the road to the full decentralization of Cardano, as stake pools begin to take responsibility for block production. Here’s what the journey will look like.

14 August 2020 Kevin Hammond 12 mins read

Full decentralization lies at the heart of Cardano’s mission. While it is not the only goal that we're focused on, in many ways, it is a goal that will enable and accelerate almost every other. It is integral to where we want to go as a project.

It is also where the philosophical and technical grounding of the entire Cardano project meets its community, in very real and tangible ways. This is why we have done a lot of thinking on how to achieve decentralization effectively, safely, and with the health of the ecosystem front of mind.

Defining decentralization

Let’s start by explaining what we mean by decentralization. This is a word that is fraught with challenge, with several competing meanings prevalent in the blockchain community.

For us, decentralization is both a destination and a journey. Shelley represents the first steps toward a fully decentralized state; from the static, federated approach of Byron to a fully democratic environment where the community not only runs the network, but is empowered and encouraged to take decisions through an on-chain framework of governance and voting.

True decentralization lies at the confluence of three essential components, working together in unison.

  • Networking - where geographically distributed agents are linked together to provide a secure and robust blockchain platform.
  • Block production - where the work of building and maintaining the blockchain is distributed across the network to a collection of cooperating stake pools.
  • Governance - where decisions about the blockchain protocol and the evolution of Cardano are taken collectively by the community of Cardano stakeholders.

Only when all these factors exist within a single environment can true decentralization be said to have been achieved successfully.

Key parameters that affect decentralization

Let's talk about d, maybe.

The d parameter performs a pivotal role in controlling the decentralization of block production. Decentralization is a spectrum, of course, rather than an absolute. In simple terms, d controls ‘how’ decentralized the network is. For example, at one extreme, d=1 means that block production is fully centralized. In this state, IOG’s core nodes produce all the blocks. This was how Byron operated.

Conversely, once d=0, and decentralized governance is in place and on chain, ‘full’ decentralization will have been achieved. At this point, stake pool operators produce all the blocks (block production is 100% decentralized), the community makes all the decisions on future direction and development (governance is decentralized), and a healthy ecosystem of geographically distributed stake pools are connected into a coherent and effective network (the network is decentralized). We will have reached our decentralization goal.

The journey that d will take from 1 to 0 is a nuanced one that requires a careful balance between the action of the protocol and the reaction of the network and its community. Rather than declining instantly, d will go through a period of ‘constant decay’ where it is gradually decremented until it reaches 0. At this point Cardano will be fully decentralized. This gradual process will allow us to collect performance data and to monitor the state of the network as it progresses towards this all-important point. A parameter-driven approach will help provide the community with transparency and a level of predictability. Meanwhile, we’ll be monitoring the results carefully; there will always be socio-economic and market factors to consider once ‘in the wild’.

How will the d parameter change over time?

The evolution from 1 to 0 is relatively simple:

When d=1, all blocks are produced by IOG core nodes, running in Ouroboros Byzantine Fault Tolerance (OBFT) mode. No blocks are produced by stake pool operators (running in Ouroboros Praos mode). All rewards go to treasury.

When d=0, the reverse becomes true: every block will be produced by stake pools (running in Praos mode), and none by the IOG core nodes. All rewards go to stake pools, once the fixed treasury rate is taken.

In between these extremes, a fraction of the blocks will be produced by the core nodes, and a fraction by the stake pools. The precise amounts are determined by d. So when d reaches 0.7, for example, 70% of the blocks will be produced by the core nodes and 30% will be produced by stake pools. When d subsequently reaches 0.2, 20% of the blocks will be produced by the core nodes, and 80% by the stake pools.

It is important to note that regardless of the percentage of blocks that are produced by the stake pools, however, once d < 1, all the rewards will go to stake pools in line with the stake that they hold (after the fixed treasury percentage is taken), and none to the core nodes. This means that IOG has absolutely no incentive to keep the d parameter high. In fact, when d reaches zero, IOG will be able to save the costs of running the core nodes, which are not insubstantial.
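
To make the arithmetic concrete, here is a minimal Haskell sketch of the split and reward rule described above; the function names are ours, purely for illustration:

```haskell
-- d is the fraction of blocks produced by IOG core nodes;
-- stake pools produce the rest.
blockSplit :: Double -> Int -> (Int, Int)
blockSplit d totalBlocks = (coreBlocks, totalBlocks - coreBlocks)
  where coreBlocks = round (d * fromIntegral totalBlocks)

-- e.g. blockSplit 0.7 1000 == (700, 300)

-- Once d < 1, all post-treasury rewards go to stake pools, whatever the split.
rewardsGoToPools :: Double -> Bool
rewardsGoToPools d = d < 1
```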

Like many other ada holders, IO Global is currently running a number of stake pools on the mainnet. As the creator of the Cardano platform, IO Global naturally has a significant stake in its success from fiscal, fiduciary, and security aspects, and this success will be built on a large number of effective and decentralized pools. As a commercial entity, IO needs to generate revenue from its stake, while recognizing the part it needs to play within an ecosystem of stake pools, helping to grow and maintain the health of the network as we move towards full decentralization. In the medium term, we will follow a private/public/community delegation approach, similar to the one we adopted on the ITN, spreading our stake across both IOG and community pools. In the short term, however, we are running IOG pools on the mainnet, establishing a number of our own pools that can take some of the load from our core nodes. Using our stake and technical expertise to secure and stabilise the network is an important element at first, but one that will become less important as the d parameter decreases. The road to decentralization will offer many opportunities for pools of all sizes to establish themselves and thrive along the way.

The key milestones of the d journey

d<1.0 (Move away from centralization)

The first milestone happened on August 13 at the boundary of epochs 210 and 211, when the d parameter first dropped below 1.0. At this point, IOG's core nodes started to share block production with community stake pools. This marks the beginning of the road to full decentralization.

d=0.8 (Stake pools produce 20% of blocks)

At 0.8, more pools (double the number compared with d=0.9) will get the opportunity to create blocks and establish themselves. At this level, pools won’t suffer in the rankings as long as they create at least one of their allocated blocks, and they will earn rewards for doing so. This way, we believe we can start growing the block-minting proportion of the network at low network risk.

d<0.8 (Stake pool performance taken into account)

The next major milestone will happen when d drops below 0.8. Below that level, each pool's performance will be taken into account when determining the rewards that it receives. Above that level, however, the pool’s performance is ignored. The reason for this is to avoid unfairness to pools when they are only expected to produce a few blocks.

d<0.5 (Stake pools produce the majority of blocks)

When d drops below 0.5, stake pools will produce the majority of blocks. The network will have reached a tipping point, where decentralization is inevitable.

Before taking this dramatic step, we will ensure that two critical features are in place: peer-to-peer (P2P) pool discovery and protocol changes to enable community voting. These will enable us to make the final push to full and true decentralization. The recently announced Project Catalyst program was the first step in this concurrent journey to full on-chain governance.

d=0 (Achieve full decentralization)

As soon as the parameter reaches 0, the IOG core nodes will be permanently switched off.

IOG will continue to run its own stake pools that will produce blocks in line with the stake they attract, just like any other pools. But these will no longer have any special role in maintaining the Cardano network. It will also, of course, delegate a substantial amount of its stake to community pools. Simultaneously, the voting mechanism will be enabled, and it will no longer be possible to increase d and ‘re-centralize’ Cardano.

At this point in time, we will have irrevocably entered a fully decentralized Cardano network. Network + block production + on-chain governance = decentralization.

Rate of constant decay

The progressive decrement of d is known as constant decay. The gradual decrease will give us the chance to monitor the effects of each decrement on the network and to make adjustments where necessary. As the parameter decreases, more stake pools will also be able to make blocks, since the number of blocks that are made by the pools will increase, and less stake will then be required for each block that is made.
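
In code, ‘constant decay’ amounts to a simple schedule like the following sketch; the per-epoch step here is hypothetical, since each actual decrement will be a deliberate operational decision informed by the factors listed below:

```haskell
-- A hypothetical decay schedule: decrease d by a fixed step until it
-- reaches 0 (assumes step > 0; floating-point rounding aside).
decaySchedule :: Double -> Double -> [Double]
decaySchedule step = go
  where
    go d | d <= 0    = [0]
         | otherwise = d : go (d - step)

-- e.g. decaySchedule 0.1 0.3 == [0.3, 0.2, 0.1, 0.0] (approximately)
```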

The key factors driving this decrease will be:

  • The resilience and reliability of the network as a whole.
  • The number of effective block-producing pools.
  • The amount of the total stake that has been delegated.

Here’s our current thinking on what implementation might look like. We will decrement d gradually, and then likely pause before dropping the parameter below 0.5 to ensure that the two key conditions described above are met:

  • The implementation of the new peer-to-peer pool discovery mechanism has been released and is successfully in use;
  • We have successfully transitioned through the first hard fork in the Shelley era, which will introduce the basis for community voting on protocol parameters, and other important protocol changes.

We will resume the countdown to d=0 at a similar rate, pausing again if necessary before finally transitioning to d=0 in March 2021.

Other factors that affect decentralization: Saturation threshold

A second parameter – k – is used to drive growth in the number of pools by encouraging delegators to spread their stake. By setting a cap on the amount of stake that earns rewards (the saturation threshold), new delegators are directed towards pools that have less stake. In ideal conditions, the network will stabilise towards the specific number of pools that have been targeted. In practice, we saw from the ITN that many more pools than this number were supported by the setting that we chose.

The k parameter was set to 150 at the Shelley hard fork. This setting was chosen to balance the need to support a significant number of stake pools from the start of the Shelley era against the possibility that only a small number of effective pools would be set up by the community. In due course, it will be increased to reflect the substantial number of pools that have emerged in the Cardano ecosystem since the hard fork. This will spread stake, and so block production, among more pools. The overall goal in choosing the setting of the parameter will be to maximise the number of sustainable pools that the network can support, so creating a balanced ecosystem. Achieving this requires carefully balancing the opportunity to run a block-creating pool, open to as many pools as want to take part, against the raw economics of running a pool (from bare-metal servers, to cloud services, to people’s time), taking into account the rewards that can be earned from the actively delegated stake. Changing this parameter will therefore be done with a degree of caution so that we ensure the long-term success of a fully decentralized Cardano network. We’re now looking carefully at early pool data and doing some further modelling before making the next move.
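
The mechanics of the cap can be sketched in a few lines of Haskell. This shows only the saturation idea, not the full rewards formula, which involves several more parameters:

```haskell
-- With total stake s and target pool count k, stake above s/k earns the
-- pool no additional rewards, so delegators are nudged towards less
-- saturated pools.
saturationThreshold :: Double -> Int -> Double
saturationThreshold totalStake k = totalStake / fromIntegral k

-- The stake that actually counts towards a pool's rewards.
effectiveStake :: Double -> Int -> Double -> Double
effectiveStake totalStake k poolStake =
  min poolStake (saturationThreshold totalStake k)
```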

d and pool rewards

Two questions remain: What is the effect of d on the rewards that a pool can earn, and can this parameter ever be increased?

Regarding rewards, as long as a pool produces at least one block, the value of the parameter has absolutely no effect on the rewards that a pool will earn – only on the number of blocks that are distributed to the pools. So if a pool has exactly 1% of the stake, it will earn precisely 1% of the total rewards, provided that it maintains its expected performance.

Finally, while d could in theory be increased, there would need to be a truly compelling reason to do so (a major protocol issue or a fundamental threat to network security, for example). We would never envision actually doing this in practice. Why? Simply because we want to smoothly and gradually reduce the parameter to 0 in order to achieve our objective of true decentralization. We’ll be making this journey carefully but with determination, step by step. If each step is taken thoughtfully and with confidence, we should not need to retrace any of them. When d becomes 0, the centralized IOG servers will finally be switched off, and Cardano will become a model of decentralized blockchain that others aspire to.

Conclusion

The decline of centralized entities coincides with Cardano's rise towards full and true decentralization. In the near future, the Cardano blockchain will be solely supported and operated by a strong community of stake pools whose best interest is the health and further development of the network.

This journey, which began with Shelley and the implementation of the d parameter, will take Cardano through a path of evolutionary stages in which the network will become progressively more and more decentralized, as d decays. The journey will only end when the blockchain enters a state of irrevocable decentralization, a moment in time that will see networking, block production, and governance operating in harmony within a single environment.

Blockchain governance - from philosophy and vision to real-world application

Robust and effective community governance lies at the very heart of Cardano's decentralized vision, and Project Catalyst will test the theory this summer.

5 August 2020 Olga Hryniuk 8 mins read

Taking a decentralized approach to governance and decision-making is proving to be more efficient in many spheres than the centralized, authority-based model that became the norm in so many areas in the last century. At IOHK, we believe that blockchain technology offers a way to encourage participation in collective action. And we are building just such a system, so Cardano can grow in a fair and decentralized way, with an open governance system that encourages global participation by granting ada holders the power to make decisions.

Decentralization is core to global governance

As we’ve been working on this, the pandemic crisis has exposed weaknesses in our globalized economy and made it clear that everyone needs to reconsider the ways we collaborate in the face of challenging international situations. Over recent decades, the world has become ever more connected through digital infrastructure and social platforms. Therefore, robust tools and new behavior patterns are now necessary to improve the way we collaborate.

In the past, large collective challenges had to be solved in a centralized manner by high-level actors governing from the ‘top’ down. In that governance model, power, authority, and control were decided and exercised at management level. This could be a chief executive, a president, or even a dictator determining the ‘best’ course of action. In this centralized system, once a decision is made, it becomes the law of the land, and new behaviors are enforced. However, the top-down model is inefficient for solving global-scale challenges. Dr Mihaela Ulieru, an adviser on digital ecosystems organization and transformation, spoke at the recent Cardano Virtual Summit about her vision of ‘bottom-up’ organic governance. There, she pointed out that hierarchical structures are rigid and less productive. Furthermore, they cannot deal efficiently with emerging complexity.

A centralized governance model depends on the limited knowledge, expertise, and understanding of a single individual or a body of actors. Decisions must then proliferate through the system to deal with emerging problems. This generates an inflexible response that lacks on-the-ground information from the people affected by a particular event. Therefore, the more complex and widespread a problem is, the less able a top-down organization is to deal with it in a way that works for most people. So the question arises: how do we create a system that responds to emerging problems, aids decision-making, and remains all-inclusive?

Dr Ulieru reminds us that the bottom-up approach has been used in peer-to-peer networks to great effect, allowing network participants to collaborate in a way that reflects the desires and needs of the community. SoScial.Network, launched by Jon Cole after Hurricane Harvey hit the southern US in 2017, is one example. This network enabled people and communities to gather and offer help to each other by providing aid to disaster victims. Another social network, ChefsForAmerica, delivers food to hospitals, the disadvantaged, and those in need. AgeWell created a peer-to-peer care delivery model that improves the well-being and health of elderly people, keeping them in their homes while reducing their healthcare costs. Such activity, organized by individuals in a decentralized manner, can solve challenges faster and collaboratively.

Cardano decentralization

For effective collaboration on decision-making, Cardano offers a decentralized network built on a blockchain for a higher level of security and transparency. It takes the peer-to-peer networking concept further and builds it into a global infrastructure. For Cardano to establish a solid governance model, it is important to ensure that everyone can participate in a transparent, accountable, and responsive way. As decentralization empowers individuals and communities to take action and collaborate, anyone can suggest a change, or an activity to be initiated, whether this is for social good or for technological progress.

The question is, how do we decide which change is crucial and what exactly will benefit everyone? Crucially, who will pay for its realization? To solve these issues, Cardano is establishing global voting and treasury systems to fund the blockchain’s long-term development. Hence, decision-making and funding are two crucial components of governance. At IOHK, we have worked to solve these issues and to provide the tools that empower decentralized governance and improve all our systems.

Voltaire and ada holders

For Cardano, a decentralized governance model should grant all ada holders the ability to decide what changes should be made for the ecosystem to grow and mature. When building an ecosystem for the next century rather than the next decade, the self-sustainability of the network becomes vital. Since individuals in the Cardano ecosystem are most affected by the decisions made about the protocol, it is important for them to understand how those decisions are made and how they are paid for, as well as how to participate in that process.

Voltaire is the era in Cardano development that deals with decentralized governance and decision-making. Voltaire focuses on the Cardano community’s ability to decide on software updates, technical improvements, and project funding. To provide the final components that turn Cardano into a self-sustainable blockchain, we need to introduce:

  • A treasury system: a continuous source of funding to develop the Cardano blockchain.
  • Decentralized software updates: the process enabling decentralized, open participation for fair voting on decisions about system advancements.

In line with that, IOHK’s engineers are implementing tools and social experiments to enable Cardano’s governance framework. Hence, we are working on:

  • Cardano improvement proposals (CIPs): a social communication system for describing formal, technically-oriented standards, codes, and processes that provide guidelines for the Cardano community in a transparent and open-source way.
  • Project Catalyst: an experimental treasury system combining proposal and voting procedures.

Project Catalyst focuses on making the treasury system a reality by establishing a democratic culture for the Cardano community. This goal will be achieved by a combination of research, social experiments, and community consent. Catalyst enables the community to make proposals, vote on them, and fund them according to their feasibility, auditability and impact.

It is important to note that everyone has an equal right to propose changes and system enhancements, and that everyone is encouraged to collaborate on innovative and powerful proposals.

To ensure that everyone gets a say, we are establishing a ballot submission system where people can propose improvements. This may be a proposal to sustain a system, to market it, or perhaps a suggestion to improve an existing protocol. In addition, IOHK wants to enable our community to create products or applications aimed at solving challenges within the market.

To suggest such contributions, people can submit ballots to the community. After the ballots are submitted, voting will decide which proposals can move forward.

Voting is the exercise of influence to decide which proposals get accepted or rejected. Only by encouraging everyone to participate can we ensure that preferences are determined in a democratic way. The value of Catalyst, though, lies not only in the technology improvement, but also in enabling the community to learn how to collaborate, make good decisions, and essentially, generate great proposals. So, decisions are taken through a robust set of voting protocols that we are developing. These will be evaluated by the community with the help of members who can put themselves forward as ‘experts’ to help explain proposals.

Although any ada holder can submit a proposal, the voting power of an individual is proportional to the amount of ada that is ‘locked’ into the system when people register. By voting, participants will influence decisions on which proposals to fund for network development and maintenance. This could also lead to the creation of educational materials, marketing strategies, technical improvements processes, and many other ideas.
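
A minimal sketch of that stake-weighting rule, with invented names (the real registration and tallying pipeline is more involved), might look like this:

```haskell
import qualified Data.Map.Strict as Map

-- Each registered voter's weight is proportional to the ada they locked.
type VoterId  = String
type Lovelace = Integer

votingPower :: Map.Map VoterId Lovelace -> VoterId -> Rational
votingPower locked voter =
  fromIntegral (Map.findWithDefault 0 voter locked) / fromIntegral total
  where total = max 1 (sum (Map.elems locked))  -- avoid division by zero
```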

Another feature of Catalyst is privacy. As in any political election, this is key to preventing voter coercion and helping defend against corruption. So, Cardano is looking to implement a zero-knowledge proof for each vote. This cryptographic technique grants privacy by allowing the vote to be verified without it being publicly revealed.

The treasury funds proposals

A crucial aspect of a self-sustainable network is understanding who will fund decisions and proposals. To meet these needs, Cardano is introducing a self-sustainable treasury system to fund blockchain enhancements and maintenance. The treasury will be refilled every epoch from a portion of stake pool rewards and, later, from newly minted coins, a percentage of transaction fees, and additional donations or charity.

Project Catalyst will maintain this constant source of funding. A steady income stream will support initiatives and improvements proposed by ada holders, while at the same time rewarding and incentivizing people who dedicate their time and effort to making productive decisions. This will enable the establishment of a decentralized system that promotes participation, collaborative decisions, and incentivization. Beyond funding, Catalyst ensures that funds are used well, facilitating the flow of information so the community is able to assess proposals while addressing the most important needs of the ecosystem.

IOHK is building a decentralized financial system that can address the world's needs. Voltaire is the first step in this direction. Using a collection of concepts, tools, and experiments, we are creating a fully decentralized ecosystem that will democratize our blockchain, and open it to the world.

Want to help shape the future of Cardano? Project Catalyst is looking for 25 community members to join a focus group on the direction of the program, pitch their ideas, and test the platform and voting. If you have an idea that might be suitable for funding, or would like to join a panel assessing community ideas, apply today! Please ensure you have a Telegram and an email account to apply by midnight UTC on Friday, August 7.

Improving Haskell’s big numbers support

Work by IOHK engineers is part of the latest release of the Glasgow compiler

28 July 2020 Sylvain Henry 9 mins read

Haskell is vital to IOHK’s work in ensuring Cardano is a secure blockchain for the future of decentralized finance. As part of this, we use the language to develop the Plutus smart contract platform, and you may have read about the training courses run by our engineers, including a Haskell course in Mongolia this month. Smart contract applications are a powerful way for a distributed network to generate value, allowing individuals and businesses to agree to conditions and automatically execute exchanges of information and wealth, without relying on third parties. Plutus contracts contain a substantial amount of code that is run off the chain, on users’ computers. To make it easy to create portable executables for these, we want to compile the off-chain Haskell code into JavaScript or WebAssembly. To reach that goal, we take part in the development of the Glasgow Haskell Compiler (GHC), GHCJS (Haskell to JavaScript compiler) and Asterius (Haskell to WebAssembly compiler).

Recently we have been working on improving GHC’s support for big numbers, ie, numbers larger than 64 bits (both the Integer and Natural types in Haskell). We have developed an implementation of big number operations in Haskell that is faster than the previous one (integer-simple). We have also improved the way GHC deals with the different implementations, making it more robust and easier to evolve. These contributions are part of the latest GHC release, version 9.0.

Background

Every Haskell compiler has to support arbitrary-precision integers (see Haskell 98 and Haskell 2010 reports, section 6.4). GHC is no exception and as such it provides the ubiquitous Integer and Natural types (‘big numbers’ for the rest of this post).
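
For readers new to Haskell, a tiny example shows what arbitrary precision means in practice; the values below are illustrative:

```haskell
import Numeric.Natural (Natural)

-- Integer and Natural are arbitrary precision: they never overflow, so
-- results like 2^4848 are exact. (A 4,848-bit number occupies 76 64-bit words.)
main :: IO ()
main = do
  let big = 2 ^ (4848 :: Int) :: Integer
      nat = product [1 .. 50] :: Natural   -- 50! also exceeds 64 bits
  print (big `mod` 1000003)
  print nat
```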

Until now, GHC could use one of two packages for this support:

  • integer-gmp: use GNU MP library (GMP) (via FFI), LGPL. Good performance.
  • integer-simple: Haskell implementation, BSD3. Bad performance.

Choosing one or the other depended on license, performance, and cross-compilation considerations. In some cases, GMP’s LGPL license can be problematic, especially if you use static linking, which is required on some platforms, including Windows. When it comes to performance, integer-simple is sometimes several orders of magnitude slower than integer-gmp, as discussed below. And with cross-compilation, some target platforms may not support GMP, such as JavaScript.

The situation was already unfortunate but there were additional issues:

  1. Each implementation had its own way of representing big numbers (array of unboxed words or list of boxed words). GHC was aware of the selected implementation and produced different code for each. It could lead to bugs – even ‘insanity’! – when big numbers were exchanged between processes compiled with different implementations. Moreover, it made the compiler code more complicated because it had to deal with this discrepancy (eg, when inspecting heap objects at runtime in GHCi).
  2. Similarly, because of the different internal representations, there are at least 71 packages on Hackage (among them some widely used ones such as bytestring and text) that explicitly depend on integer-gmp package or that provide a flag to depend either on integer-gmp or integer-simple. It is a maintenance burden because each code path could have specific bugs and should be tested on CI.

All this meant that every new big number implementation was a daunting task. First, the interface to implement was very large (Integer and Natural types and all their operations). Then, we needed to ensure that GHC’s rewrite rules (constant folding) still worked for the implementation, the packages mentioned above needed to be fixed, and new Cabal flags added (by the way, Cabal flags are Boolean so they can’t be easily extended to support more than two options). Finally, GHC’s build system needed to be modified. No wonder it never happened.

Fortunately, most of these issues are now fixed in the latest release, GHC 9.0.

The ghc-bignum package

Starting from GHC 9.0, big numbers support is provided by a single package: ghc-bignum. This provides a Haskell implementation of big numbers (native-backend) that is faster than integer-simple’s (performance figures are given below), is also BSD3-licensed, and uses the same representation of big numbers as integer-gmp.

Now the different big number implementations are considered internal backends of the ghc-bignum library, and there should be no observable difference between backends, except for performance. To enforce this, we even have a meta-backend used during tests that performs every operation with two backends (native-backend and another selected one) and checks that their results are the same.

A pure Haskell implementation can't really expect to beat the performance of heavily optimized libraries such as GMP, hence integer-gmp has been integrated into ghc-bignum as a backend (gmp-backend).

Adding a big numbers backend is now much easier. The interface to implement is minimal and is documented. A new backend doesn’t have to provide the whole implementation up front: operations provided by native-backend can be used to fill the holes while another backend is being developed. The test framework doesn’t have to be rewritten for each backend and results can be checked automatically against native-backend with the meta-backend mentioned above. We hope backends using other libraries, such as OpenSSL libcrypto integers or BSDNT, will be developed in the near future.

The ghc-bignum package also has a third ffi-backend that doesn’t provide an implementation per se but performs FFI calls for each operation. So ghc-bignum must be linked with an object providing the implementation or the compiler should replace these calls with platform-specific operations (eg, JavaScript/CLR/JVM big numbers operations) when this backend is selected. It is similar to what the Asterius compiler was doing – replacing GMP FFI calls with JavaScript BigInt calls – but in a cleaner way because GMP isn’t involved any more.

A major advantage of ghc-bignum is that all the backends use the same representation for big numbers: an array of words stored in little-endian order. This representation is also used by most big numbers libraries. Formerly, integer-simple was a notorious exception because it used a Haskell list to store the words, partly explaining its poor performance. Now, any package wanting to access the representation of the big numbers just has to depend on ghc-bignum. Cabal flags and CPP code are no longer required to deal with the different implementations. However, conditional code may be needed during the transition from integer-* packages to ghc-bignum.
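
To illustrate the shared representation (not the ghc-bignum API itself), here is a small sketch that decomposes a non-negative Integer into little-endian 64-bit limbs:

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word64)

-- Break an Integer into 64-bit words, least significant first,
-- mirroring the array-of-words layout described above.
limbs :: Integer -> [Word64]
limbs 0 = []
limbs n = fromIntegral (n .&. 0xFFFFFFFFFFFFFFFF) : limbs (n `shiftR` 64)
```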

To make the transition easier, the integer-gmp-1.1 package is still provided but it has been rewritten to depend on ghc-bignum and to provide some backward-compatible functions and pattern synonyms. Note, however, that some functions that were only available in integer-gmp-1.0.* (eg, prime number test, extended GCD) have been removed in integer-gmp-1.1. We expect these very specific functions to be exported by packages such as hgmp instead. Alternatively, someone could implement Haskell versions of these functions into native-backend.

GHC code has been simplified and made faster. Big numbers types and constructors are now known to the compiler (‘wired-in’), in the same way as other literals, so GHC doesn’t have to read interface files each time it wants to generate code using them. The unified representation avoids any need for two code paths, which makes the code more robust and easier to test.

Performance

We have compared the performance of native-backend against the latest integer-simple and integer-gmp implementations (Figure 1). We measured the time needed to compute basic operations.

Platform was Linux 5.5.4 on Intel Core i7-9700K CPU running at 3.60GHz. The three GHC bindists have been built with Hadrian using ‘perf’ flavor. integer-gmp and integer-simple bindists are built from commit 84dd96105. native-backend bindist is built from ghc-bignum branch rebased on commit 9d094111.

Computations have been performed with positive integers of the following sizes: small – 1 word (64-bit); medium – 8 words (512-bit); big – 76 words (4,848-bit).
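
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of such a micro-benchmark using the criterion library; the harness, group names, and constants are illustrative, not the exact benchmark we ran:

```haskell
import Criterion.Main (bench, bgroup, defaultMain, whnf)

-- The sizes mirror those above: 1 word, 8 words, and 76 words.
main :: IO ()
main = defaultMain
  [ bgroup "Integer multiply"
      [ bench "small"  $ whnf (* small)  small
      , bench "medium" $ whnf (* medium) medium
      , bench "big"    $ whnf (* big)    big
      ]
  ]
  where
    small, medium, big :: Integer
    small  = 2 ^ (63 :: Int) - 1
    medium = 2 ^ (511 :: Int) + 1
    big    = 2 ^ (4847 :: Int) + 1
```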

Figure 1. Native-backend and integer-gmp are faster than integer-simple in almost all cases (note the logarithmic scale)

Figure 1 shows that native-backend and integer-gmp are always faster than integer-simple. The only exceptions are when we add or subtract a small number (1 word) from a big one as these operations are particularly well suited for integer-simple’s bignum representation (a list) because the tail of the list remains unchanged and is shared between the big numbers before and after the operation. With the other representation, the tail has to be duplicated in memory.

Division with integer-simple performs so badly that native-backend is 40 times faster in all the tested cases.

Sometimes, native-backend is faster than integer-gmp (eg, addition/subtraction with a small/medium number). The GMP library is probably at least as good as native-backend, but the latter avoids FFI calls, which may explain the better performance. Otherwise, native-backend is still slower than integer-gmp, but these results are expected because it only implements basic algorithms and doesn’t use vectorised or otherwise optimised processor instructions.

When it comes to GHC performance tests, we took as our baseline GHC HEAD compiled with integer-gmp and compared the results with ghc-bignum's gmp-backend. There are no regressions. Noticeable improvements in memory use are seen in Table 1.

Next, we compared metrics obtained with native-backend to those obtained with GHC HEAD built with integer-simple. The new Haskell implementation results in noticeable changes (Table 2). Note that the first four tests were disabled with integer-simple because they took too long or failed to complete (heap overflow) but they all passed with native-backend. Also, T10678, the final test in the table, performs a lot of additions of a small number to a big number. As we saw above, this is the only case for which integer-simple representation is better: in most iterations only the head of the list is modified without duplicating the tail. It is reflected again in this result.

Finally, we compared native-backend with gmp-backend: it tells us how far our Haskell implementation is from our best backend. Noticeable changes are reported in Table 3.

Conclusion

We are very proud to have made this contribution to GHC. IOHK and the whole community now benefit from the improved performance and robustness of the compiler. Programs that use the new Haskell implementation of big numbers (native-backend) should get a performance boost when they switch from integer-simple.

We are also eager to see what the community will do with this work. In particular, it should now be easier to create backends based on other libraries. There is also a lot of room for optimization in native-backend. We also encourage everyone to test this new implementation and to report any issue on GHC’s bug tracker.

Cardano’s path to decentralization

Three mini-protocols are vital to the network’s operation

9 July 2020 Marcin Szamotulski 6 mins read

The next releases of Cardano and the Ouroboros protocol contain changes that guide us towards decentralization and the Shelley era. This Deep Dive post explains how we are approaching this phase. With the release of the Praos algorithm for Shelley, which introduces the staking process, stake pools can be set up so that ada owners can delegate their stake. The networking team is now focused on two features that will enable us to run a fully decentralized system. Let me first briefly describe how the networking is designed and engineered and give an overview of where we are at the moment. This post will start at the top of our abstractions and go down the stack. Hopefully, this will be an interesting journey through our design.

Typed protocols

At the very top of the stack is IOHK’s typed-protocols framework, which allows us to design application-level protocols. The top-level goal of protocol design for Ouroboros is to distribute chains of blocks and transactions among participants in the network and that is achieved by three mini-protocols:

  • chain-sync is used to efficiently sync a chain of headers;
  • block-fetch allows us to pull blocks;
  • tx-submission is used to submit transactions.

All three mini-protocols were carefully designed after considering the threats that can arise when running a decentralized system. This is very important, because cyber attacks are very common, especially against targets that present strong incentives. There is a range of possible attacks at this level that we need to be able to defend against, and one type that we were very careful about is resource-consumption attacks. To defend against such attacks, the protocols allow the consumer side to stay in control of how much data it will receive, and ultimately keep use of its resources (eg, memory, CPU, and open file descriptors) below a certain level.
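
To give a flavour of this, here is a much-simplified sketch in the spirit of our typed-protocols framework; the state and message names are invented for illustration, and the real chain-sync protocol has more states and messages. The key point is that the consumer must send MsgRequestNext before the producer may reply, so the consumer stays in control of the data flow:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Protocol states: whose turn it is, and what may be said next.
data St = StIdle   -- the consumer may request the next header
        | StBusy   -- the producer owes a reply
        | StDone   -- the exchange is over

data Header = Header { headerNo :: Int }

-- Messages indexed by the state they move the protocol from and to;
-- the type checker rules out, say, a reply that was never requested.
data Message (from :: St) (to :: St) where
  MsgRequestNext :: Message 'StIdle 'StBusy           -- consumer pulls
  MsgRollForward :: Header -> Message 'StBusy 'StIdle -- producer replies
  MsgDone        :: Message 'StIdle 'StDone
```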

If you are interested in more details about typed-protocols, we gave talks and ran workshops at Haskell events last year, and these were very well received by the engineering community. In particular, see Duncan Coutts’ talk at Haskell eXchange and the workshop I ran at Monadic Party.

Role of the multiplexer

TCP/IP protocols form the most ubiquitous protocol suite deployed on the internet. They are also some of the most studied protocols and are available on almost every operating system and computer architecture, so are a good first choice for our purposes. TCP/IP gives us access to a two-way communication channel between servers on the internet. The only high-level requirement of typed-protocols is an ordered delivery of network packets, which is guaranteed by the TCP protocol.

Operating systems limit the number of connections at any one time. For example, Linux, by default, can open 1,024 connections per process, but on macOS the limit is just 256. To avoid excessive use of resources we use a multiplexer. This allows us to combine communication channels into a single one, so we can run all three of our mini-protocols on a single TCP connection. Another way to save resources is to use the bi-directionality of TCP: this means that one can send and receive messages at both ends simultaneously. We haven't used that feature in the Byron Reboot era, but we do want to take advantage of it in the decentralized Shelley era.
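
Conceptually, the multiplexer chops each mini-protocol’s stream into tagged, bounded segments and interleaves them over the one connection. A toy framing sketch (not the real network-mux wire format) could look like this:

```haskell
import qualified Data.ByteString as BS

-- Each segment sent over the single TCP connection carries a tag saying
-- which mini-protocol it belongs to, so the three can be interleaved.
data MiniProtocolId = ChainSync | BlockFetch | TxSubmission
  deriving (Show, Eq, Enum)

data MuxFrame = MuxFrame
  { frameProtocol :: MiniProtocolId
  , framePayload  :: BS.ByteString  -- a bounded chunk of that protocol's stream
  }
```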

The peer-to-peer governor

We want to use bi-directional connections, running all three mini-protocols in both directions, so we need a component that is aware of which connections are currently running. When a node connects to a new peer, we can first check whether it already has an open connection with that peer, which would be the case if the peer had connected to it already. But this is only one part of the connection management that we will need.

Another requirement comes from the peer-to-peer governor. This part of the system is responsible for finding peers and choosing which of them to connect to. Making a connection takes some time, depending on factors such as the quality of the network connection and the physical distance. Ouroboros is a real-time system, so it is good to hide some of that latency. It wouldn't be good if the system was under pressure and yet still needed to connect to new peers; it's much better if the system maintains a handful of spare connections that are ready to take on any new task. A node should be able to make an educated decision about which existing connections to promote to get the best performance. For this reason we decided to have three types of peer:

  • cold peers are known to the node, but no network connection has been established;
  • warm peers have a connection, but it is only used for network measurements, and none of the node-to-node mini-protocols is run;
  • hot peers have a connection that is being used by all three node-to-node mini-protocols.

A node can potentially know about thousands of cold peers, maintain up to hundreds of warm peers, and have tens of hot peers (20 seems a reasonable figure at the moment). There are interesting and challenging questions around the design of policies that will drive decisions for the peer-to-peer governor. The choice of such policies will affect network topology and alter the performance characteristics of the network, including performance under load or malicious action. This will shape the timeliness of block diffusion (parameterized by block size) and of transaction distribution. Since running such a system has many unknowns, we'd like to roll it out in two phases. For the first phase, which will be released in a few weeks (probably shortly after Praos, also known as the Shelley release), we want to be ready with all the peer-to-peer components but still running in a federated mode. In addition, we will deliver the connection manager together with a server accepting connections, and integrate both with the peer-to-peer governor. In this phase, the peer-to-peer governor will be used as a subscription mechanism. Running various private and public testnets, together with our extensive testing, should give us enough confidence before releasing this to mainnet.
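
Expressed as a data structure, the governor’s targets might look like the following sketch, using the rough numbers above; the field names are invented for illustration:

```haskell
-- Targets for how many peers to hold in each state.
data PeerTargets = PeerTargets
  { targetCold :: Int  -- peers we merely know about
  , targetWarm :: Int  -- established, measurement-only connections
  , targetHot  :: Int  -- connections running all three mini-protocols
  }

exampleTargets :: PeerTargets
exampleTargets = PeerTargets
  { targetCold = 1000
  , targetWarm = 100
  , targetHot  = 20   -- 'seems a reasonable figure at the moment'
  }
```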

In the second phase, we will extend the mini-protocols with a gossip protocol. This will allow exchange of information about peers, finalize network quality measures, and plug them into the block-fetch logic (which decides from whom to download a block) as well as the peer-to-peer governor. At this stage, we would like to design and run some experiments to discover how peer-to-peer policies shape the network, and check how they recover from any topologies that are suboptimal (or adversarial).

I hope this gives you a good sense of where we are with the design and implementation of decentralization for Cardano, and our roadmap towards the Shelley era. You can follow further progress in our weekly reports.

This is the third of the Developer Deep Dive technical posts from our software engineering teams.