What is our release strategy for Cardano?
A new development approach from IOHK engineers will bring benefits to users
1 December 2017 4 mins read
Since the release of mainnet, IOHK engineers have not only started work planning the new features of Shelley, the next big release, but have also been evolving the way they work. The biggest change is that the team developing Cardano will follow a new release cycle where updates are more regularly made to the live software. This is now possible because we have the foundation of Byron to build on. This approach will benefit the community because it means we can deliver features to users sooner, and also has other advantages, such as allowing us to design very detailed tests for the code. Considerable planning has been under way to help the team shift to this approach, and we are confident it will bring good results. This blog post explains what is happening behind the scenes. Cardano involves collaboration from several development teams working towards a common vision following the same roadmap. The development teams collaborating on the project include Core SL (Settlement Layer), Networking, Cardano Blockchain Explorer, Wallet Backend, and Daedalus. Development follows the agile software development approach, which allows us to continuously inspect, adapt and improve the way we build and ship Cardano. Agile is an iterative approach where software is built incrementally from the start of the project, instead of delivering it all at once near the end, as we did with Byron, the launch of mainnet.
The development teams working on Shelley now follow synchronized two-week sprints, which are a set period of time during which specific work has to be completed and made ready for review. There are daily meetings to coordinate the teams and resolve any issues. The product backlog – where the prioritized list of features are stored and is essentially all the work that needs to go into developing Cardano – is being managed in an issue tracker tool. Individual teams have their own sprints that take their work from this shared product backlog.
After a two-week sprint, we will deliver a testnet release. After a number of testnet releases there will be a mainnet release based on feedback from the community and internal quality assurance testing.
Our goal is to reduce the time between mainnet deployments. To release early and often, we’re adopting time-based releases as opposed to feature-based releases. Time-based release strategies are used in many commercial and open source projects, for example, Linux. Moving to time-based releases means no more waiting for features to be ready; instead we merge only the features that are ready at the time of the release. This doesn’t include patches, which are updates to improve the software or fix bugs, which will continue to be released when they are ready.
One of the problems with the "big bang" release approach we followed for Byron is that it creates a bottleneck due to problems coordinating all the features slated for release at the same time. Decoupling features from releases, where possible, means that in future we will not have to delay a release because we are waiting for one feature to be ready. It also means that we can focus on getting smaller features out of the door and into the hands of our community quicker. The shorter feedback loops allow us to gather valuable feedback from the community on the direction that we are taking Cardano. It allows users, contributors and stakeholders to prepare and plan around a predictable timeframe.
However, scaling this agile approach across several teams working on the same product brings merging and integration challenges. We are using a well known, successful git branching model called git-flow to help manage this.
Once a development sprint is finished, our DevOps (development and operations) team branches off the code from the develop branch into the release branch. Changes are deployed to internal staging and once testing approves the changes, they are redeployed to the testnet. Once we decide to release mainnet, the release branch is merged to the master branch and tagged. The master branch is therefore always in stable state as the mainnet. We’re working to align our processes to enable moving towards more frequent, scheduled releases. Once a suitable release schedule is decided it will be shared with the community. The process described here is not an end state. We strive to continuously improve, so updates to our release process can be expected as we fine tune our software development processes.
Artwork, Mike BeepleEthereum Classic community comes together at successful Hong Kong summit
Principles and plans for growth unite participants at event organised by Grayscale
22 November 2017 10 mins read
Since day one, Ethereum Classic has been driven by principles. What began as a disparate group of people who came together after the DAO hack and the consequent hard fork on Ethereum has now become a strong and growing community, united by a belief in immutability and that code is law. So it was tremendously exciting to see everyone gather at the first Ethereum Classic Summit held last week and watch the community move beyond the realm of instant messaging to interacting in person and making new friends. There were many highlights of the two day summit in Hong Kong. It was a moment to reflect on how much Ethereum Classic has achieved in a little over a year, developing its identity, building its technology and securing its future. Grayscale, which provides Bitcoin and Ethereum Classic investment trusts, has provided much momentum by organising this event.
Barry Silbert, founder, explained that Grayscale will put a third of its ETC fees for three years into supporting the growth of ETC though development, marketing and the community, and the ETC Cooperative has been established to do this. Barry told IOHK: “We wanted to do our part and help support increased communication, collaboration and bring together all the various stakeholders at least once a year. I’ve been super excited to see the participation, the level of engagement, the quality of the speakers and participants and think this year has been a huge success.”
What might come a surprise to some is that Ethereum Classic is not a direct competitor to Ethereum. In his keynote speech on the first day of the summit, Charles Hoskinson, founder and CEO of IOHK, and former CEO of Ethereum, said: “Frankly, we compete far more with Bitcoin. To me, Ethereum Classic is basically a digital commodity, it’s digital gold. It’s really exciting because it can give us a very interesting design space that we can explore that’s totally independent of the design space that Ethereum is pursuing. We don’t have to embrace proof of stake, Plasma or the fork of the week. Instead we can start with principles and we can grow from principles.”
Charles argued that two separate philosophies had always sat uncomfortably together in Ethereum, even before the community split. “We actually had two Ethereums from the beginning but we didn’t know that. We had two philosophies at the same time. One was this notion that code is law, that code can’t care, it doesn’t matter what you’re running and it doesn’t matter about the consequences. But there was this other philosophy, the notion of a world computer, the idea that you could have universal infrastructure that would be a beautiful computation layer that could be added to the internet. But the fundamental problem is that these things are philosophically so different they can’t coexist. When you’re the world computer you have to fork, you have to embrace mutability and have a lot of flexibility in your platform.”
Now more than a year after the split, Ethereum Classic has developed its own identity. It had a great asset in its principled and warm spirited community, Charles pointed out and championed the development work being done. “We see a lot of great things coming down the pipeline. It’s got to a point where we can prove beyond reasonable doubt that we can keep up with Ethereum Foundation and in some cases we’re writing better software. As a comparison, Bitcoin Core has about 110,000 to 120,000 lines of C++ code. The Mantis client does more and it’s 10,000 lines of code. It’s twelve times more concise. “The Sputnik VM, pulling the VM out and having it as a standalone, is a phenomenal idea, it’s a great piece of engineering. It’s far better than anything coming out of the Ethereum Foundation.”
A highlight of the conference was hearing about the steady pace of development of the ETC protocol. The eight-strong team of engineers that make up ETCDEV team are working on implementing the monetary policy, a flexible software developer kit that can be used to build applications on Ethereum Classic, and making performance and reliability improvements to the Ethereum Virtual Machine. Isaac Ardis, Go developer from ETC Dev, outlined the direction and purpose of the Emerald Project, which is not just a wallet but a library and suite of tools for third party developers. Licensing projects under Apache 2.0 makes them commercial-friendly, and easy to implement.
Igor Artamanov, CTO of ETCDEV, told IOHK that he believed Ethereum Classic would rise to be in the top three cryptocurrencies. “We’ll continue our work on open tools for developers who will build apps on top of ETC blockchain. Our goal is to set up development processes and help with building various tools for engineers and open source libraries around ETC.” A theme of the conference was formal verification, and on that topic Igor said: “It will be cool to have. We are not experts in formal verification, but if someone will decide to introduce it to SputnikVM, we’ll appreciate that.” On monetary policy, Matt Mazur, advisor to the ETC Dev team, gave a clearly explained presentation on ECIP 1017, the proposal that contains changes such as capping the amount of ETC that can be created, in contrast to Ethereum’s uncapped supply.
There was news from the Grothendieck team, the IOHK developers that are dedicated to ETC and who have over the past year built the Mantis client for Ethereum Classic. Alan McSherry, team lead, gave details about integration with the Daedalus wallet. It will be fantastic for the community to have another option for a wallet, where they can securely store their ETC. “The Daedalus release is going to happen hopefully in December. That’s taking the Daedalus wallet, the UI that was built for Cardano, and putting it on top of our Mantis client, our ETC node, so you’re able to use a familiar UI with Ethereum Classic.”
Alan McSherry also offered a look at what the team are thinking about in terms of Mantis development in 2018. With regard to K Framework, the important research to create formal semantics of the EVM done by Professor Grigore Rosu at the University of Illinois, and funded by IOHK, he said: “With any luck, that will result in the possibility of writing Scala contracts that can be compiled to run on the EVM, or a slightly different version of the EVM, and we’ll have the formal verification and it’ll be a question of trying to create formal proofs around those contracts.” IOHK research into sidechains could allow the possibility of creating contracts to allow interactions between ETC and Ethereum and ETC and Bitcoin, while other features on the horizon for ETC include zero knowledge proofs and anonymous addresses, he said.
And Alan Verbner, developer on the team, gave a reminder of why the team use Scala. “Functional programming allows you to create less and more secure code and that’s why we are trying to build in Mantis compact code that is safe and you can check its correctness. Our client has 10,000 lines of code, it’s easier to read and easier to access. We are building open source software so we like people to look into it and analyse it to look for bugs and security issues. You have fewer lines of code to test.”
The theme of principles arose again when it came to hearing from Ethereum Classic miners. We heard how mining pools chose ETC because of the core principles of the community, proving the attractiveness of the code is law philosophy. There was good news from the 91Pool, the first group that ever mined on ETC, when Meicy Mei announced the group has purchased a building in Shanghai to serve as a dedicated hub for the ETC community in China, with a cafe and bar that will host meet ups.
The importance of coming together as a community to debate these issues cannot be underestimated. The results of these discussions will drive the development of Ethereum Classic. Sharing this view, Barry told IOHK: “I think in the digital currency space there’s a little too much interaction done through social media and through anonymous forums and it’s important to get people together to meet each other, challenge each other, and do it in a way that is respectful and that fosters communication and collaboration. So I do think it’s important to have summits like this. I absolutely expect that we’ll do it again next year and hopefully the year after that will become an annual event that everybody in the community looks forward to.”
"Ethereum Classic is the most decentralized coin, and the most distributed without a significant holder of the token supply." - via Meicy Mei from 91 Pool #ETCSummit #etc
— Ethereum Classic (@eth_classic) 14 November 2017
Over 29K views from our global community who tuned in to our live stream.#ETCSummit was trending on Twitter in Hong Kong during the summit with the amount of engagement from our community.
— Ethereum Classic (@eth_classic) 15 November 2017
Thank you all for the support and we are excited about the future. #ETC #ETCisComing pic.twitter.com/ncGL1CHWJ2
Writing a High-Assurance Blockchain Implementation
Guest blog from Edsko de Vries, who is working on the high assurance implementation of the Ouroboros blockchain protocol
3 November 2017 9 mins read
Writing a High-Assurance Blockchain Implementation - Input Output
In our previous blog post Cryptocurrencies need a safeguard to prevent another DAO disaster we discussed the need for high assurance development of cryptocurrencies and their underlying blockchain protocols. We also sketched one way in which one might go about this, but we did not give much detail. In this follow-up blog post we delve into computer science theory a bit more, and in particular we will take a closer look at process algebras. We will have no choice but omit a lot of detail still, but hopefully we can develop some intuition and provide a starting point for reading more.
Compositionality
When we want to study computer programs as mathematical objects that we can manipulate and analyze, we need to write them in a programming language for which we have a well understood mathematical theory. If this mathematical theory inherits much of what we know from elementary mathematics, that's an additional benefit. For example, we learn in school that:
for every number n, n + n = 2 * n
Shockingly, in many programming languages even this most basic of equations does not hold. For example, in the language C, when we define
int f() {
return 5;
}
int g() {
int x;
scanf("%d", &x);
return x;
}
we have that while f() + f() is the same as 2 f() (both equate to 10), g() + g() is certainly not the same as 2 g(): the former asks the user for two numbers and adds them together, the latter asks the user for a single number and multiplies it by two. In the language Haskell this distinction between an honest-to-goodness number, and a program that returns a number is explicit (this is what we mean when we say that Haskell is a pure language), and for numbers basic mathematical equations such as the above hold true. Knowing that for every number n, n + n = 2 n in Haskell would still not be particularly useful without another property of Haskell: when we see the expression n + n anywhere in a program, we can (almost) always replace it with 2 n and know that our program still behaves in the same way: we say that Haskell is referentially transparent. This is key: it means that we can analyze bits of our program at a time and know that our conclusions are also correct in the wider context of the whole program. In other words, it means that we can do compositional reasoning: we can understand the larger program by understanding its components individually. This is essential when dealing with anything but toy examples.
Process Algebra
Of course, even in Haskell we can read from files, write to network sockets, etc. and while the type system makes sure that such programs are distinguished from pure values, reasoning about such programs is nonetheless more difficult because we do not have a good mathematical theory that covers all of these "effects". It is therefore useful to identify a subset of effects for which we do have a mathematical theory. Different kinds of applications need different subsets of effects. For some applications a process algebra is a good match. A process algebra is a programming language in which we can express systems of concurrent programs that communicate with each other, along with a mathematical theory. There are many different kinds of process algebras, and they differ in the details. For example, some provide synchronous point to point communication, others asynchronous; a few provide broadcast communication. So what would be an analogue to a law like n + n = 2 * n in such a language? Well, there are multiple answers, but one answer is bisimilarity, which intuitively allows us to say that "two processes behave the same". For example, consider the two following processes
p1 m = forever $ do
bcast m
bcast m
p2 m = forever $ do
bcast m
Both p1 and p2 consist of an infinite loop; p1 broadcasts message m twice within that loop, and p2 only once. While these two processes may look different, they really aren't: both broadcast message m indefinitely. To make "they aren't really different" more precise, we can try to prove that p1 and p2 are bisimilar. Intuitively, this means that we have to show that any action that p1 can take, p2 can also take, and they are still bisimilar after taking those actions. The proof for p1 and p2 would look something like this, where a red line indicates "can do the same action here":
So why is this useful? When two processes are bisimilar, you know three things: they can do the same actions, they will continue to be able to do the same actions, and most importantly, this is compositional: if p1 is bisimilar to p2 then we can replace p1 with p2 in (most) contexts. As before, this is essential: it means we can understand and analyze a larger program by understanding and analyzing its components.
Psi-calculus
Process algebra is a large field in computer science, and many different kinds of algebras have been studied and are being studied. Some of the earliest and perhaps most well-known are the Calculus of Communicating Systems (CCS) and Communicating Sequential Processes (CSP). These days one of the most important process algebras is the pi calculus; argueably, the pi calculus is to concurrent programming what the lambda calculus is to functional programming. While the lambda calculus forms the theoretical basis for functional programming, nobody actually writes any code in the lambda calculus; it would be akin to writing in assembly language. Instead we write programs in a much more sophisticated functional language like Haskell. Similarly, the pi calculus is a "core" calculus, good for foundational research but not so good for actually writing programs. Instead, we will use the psi calculus which is a generalization of the pi calculus with more advanced features, but still with a well understood theory. Let's consider an example of bisimilarity in the psi calculus, similar to the previous example but slightly more interesting. Do you think we should regard processes p3 and p4 as "behaving the same"?
p3 m = forever $ do
k <- newKey ; bcast (encrypt k m)
k' <- newKey ; bcast (encrypt k' m)
p4 m = forever $ do
k <- newKey ; bcast (encrypt k m)
What about process p5 and p3?
p5 m = forever $ do
k <- newKey
bcast (encrypt k m)
bcast (encrypt k m)
It turns out process p3 and p4 are bisimilar, and the proof is almost the same as before
They both send out an infinite number of messages, each message encrypted with a new key. Although p3 and p4 will generate different keys, from the outside we cannot tell them apart because we don't know anything about those keys. Process p5 and p4 are however not bisimilar:
After p5 sends out a message encrypted with some key k, it then sends that message again encrypted with the same key; p4 cannot follow.
Ouroboros Praos
One of the nice things about the psi calculus is that we can embed it as a domain-specific language (DSL) in Haskell. We use this to define a model of the Ouroboros Praos blockchain protocol, which we can both execute and run (it's just Haskell after all), and even test using QuickCheck (Haskell's randomized testing tool), but moreover is also amenable to formal verification using the psi calculus metatheory.
We start by expressing the algorithm as it is defined in the Ouroboros Praos paper in our Haskell psi calculus embedding. Although we cannot formally verify that this algorithm matches the one in the paper, the correspondence is close enough that it can easily be verified by hand. We can run this implementation and experiment with it. This first implementation will do various things that, while we can implement them, we would not want to do in a real implementation. So we then make a number of small, incremental adjustments, that get us closer and closer to an actual high performance implementation. Each of the individual adjustments should be small enough that a human reader can understand how we go from one implementation to the next and convince themselves that we haven't changed anything fundamental.
Each of these algorithms can be run, so that we can test that all of the adjustments we make to the original algorithm don't change any fundamental properties. Since we express the algorithm in terms of a well-understood process calculus, we can also take some of these adjustments and formally prove that we haven't changed anything fundamental. Bisimulation is but one tool in the theoretical toolbox here; there are many others. The end result is that we have a bridge from the high level specification of the algorithm in the academic literature to the actual low level implementation that we implement, with high assurance that what we actually run indeed corresponds to the algorithm defined and proved correct by the cryptographers.
Download and Contributing
If you want to follow the work in progress, you can find the psi calculus development of the Praos blockchain protocol at Praos Formalization. One way in which you could contribute would be to read the document, compare it to the Praos algorithm in the literature, and convince yourself that indeed they do the same thing; after all, this is a step we cannot formally prove. Although the subsequent adjustments are in principle formally provable, in practice we may only have resources to do that for a few steps; so here too the more people read the document and try to poke holes in the arguments, the better. If you want to do more than verify what we have done, we would be happy to discuss the possibilities; contact us here. We look forward to hearing from you!
Introducing the Cardano roadmap
Next steps of development are charted ahead of further information releases
31 October 2017 9 mins read
Cardano has been an incredibly challenging and fun project to work on involving many teams throughout the world with different skills and opinions on design, process and quality. Our core technology team consists of Well Typed, Serokell, Runtime Verification, Predictable Network Solutions and ATIX, with IOHK leading them. Then we have external auditors such as Grimm, RPI Sec and FP Complete ensuring quality and holding us accountable to delivering what we have promised. When dealing with a consortium, it’s important to be aligned not only on day to day affairs, but on the broader engineering philosophy of why and how things should be constructed, as well as the pace of development. I’d like to spend a few paragraphs exploring the principles that guide our roadmap.
First, we can only grow as fast as our community. At the moment, more than half of our community has never used a cryptocurrency before. This reality means that while it would be awesome to introduce complex features like stake pools, blockchain based voting and subscribable checkpoints, they would be underutilized and therefore ineffective until the community catches up. To this end, we’ve provisioned resources towards a help desk tour, a dedicated support channel and trying to interact with our community as much as possible as they grow. This is tremendously time consuming, but it really does force our software to become simpler and more useable.
This effort challenges design assumptions, such as what the capabilities of the average node in our system should be. Throughout the next few months, a large part of our focus will be on answering questions, debugging software, improving the user experience and community education. The output will be that more people use Daedalus comfortably and back up their Ada, are able to use exchanges according to best practices, and understand why Cardano is a special project. It also means our software will slowly become more bug free and compatible with different configurations.
Second, we believe firmly in the vision of Satoshi that these networks must be resilient and distributed. Resilience means we cannot optimize around a collection of trusted or usually reliable nodes to maintain the network. Distributed means that wherever possible, every node contributes to propagating the network and its DNA. This creates tremendous design challenges. One is at the mercy of the weakest links when replication is the tool used for resilience because it dramatically slows the network as more users come online. While we feel that a protocol like Ouroboros is perfectly suited to properly balance these conflicting demands, we admit that general network and database capacities aren’t quite ready for this task. The sole – and temporary – advantage that Cardano enjoys here is that we are new and won’t feel the pain of scale for the next year.
Yet we must remain vigilant in our efforts to address these concerns systematically as we have done with Ouroboros. Therefore, many of the most significant network improvements will be scheduled for deployment in late 2018 and throughout 2019. Third, there is a difficult balance between research and deployment. We have great protocols and protocol developers. We also have a unique advantage in being able to quickly submit papers for peer review in order to promptly validate them or fail fast, instead of enduring difficult lessons of hindsight that can result in the loss of funds. However, we cannot allow the urge to have something ready soon for commercial advantage to drown out proper process. With Ouroboros for example, we have a serious and ongoing formal specification effort using Psi Calculus to model and remove all ambiguity from our consensus protocol. This effort has been fruitful in dramatically increasing our understanding of all the benefits and sins of Ouroboros, but will not yield working code until the second half of 2018.
We deployed a rigorously built protocol for Byron, but one that does not enjoy a formal specification. There is always the chance we did something wrong, as with all software projects. There is always the chance we haven’t followed the intent of the scientists, despite our best engineering efforts. The broader point is that as our protocols increase in complexity, interdependence and with the use of more exotic cryptographic primitives, finding a balance will become more challenging. The saving grace we have is that many of our security assumptions are somewhat composable, we use great methods like Red-Black teams, peer review and formal specs, and using Haskell forces deeper conversations about intent.
The question of when to introduce smart contracts has become the most difficult issue of our roadmap. IOHK has brought some of the best minds from the programming language theory world, such as Professor Rosu and Professor Wadler, into the area of smart contracts. They have watched the world evolve from simple computers measuring performance in the megahertz and kilobytes of RAM to the internet age of big data, AI on demand and nearly unlimited processing power.
It’s an extraordinary honor and privilege to work with people with so much wisdom, hindsight and proven contributions to the fabric of computer science. Yet we also have the demands of Ethereum and the other systems that while lacking rigorous foundations have proven to attach a large following and much mindshare. We won’t accept the model of smart contracts is a closed matter and that these issues are now just a matter of optimization. We won’t accept that society is anywhere close to mass adoption of this new paradigm. It’s the beginning and it would be foolish to say Ethereum’s model is the one to back. Yet we acknowledge the need for something like that model. Therefore, we’ve decided to increase our allocation of resources towards two parallel tracks of research.
One is focused on fixing the issues that Ethereum’s hastily driven design process has yielded to the despair of those who have lost funds as a result. The other is focused on the ontology of smart contracts in general as well as different computational models that achieve similar ends without necessarily involving the new complexity or cost that Ethereum introduces. Professor Rosu’s team at the University of Illinois and the partnership with Runtime Verification is focused on the former effort. The work of Professor Wadler’s team is focused on the future and broader theory. We hope to establish more foundational research in the field of smart contracts, as Professor Aggelos Kiayias did with the GKL model and Ouroboros.
Next, there is the interplay between good partners like Bittrex, Ledger and others who are consumers of our APIs and have to deploy our software for thousands to use. The reality is that all interfaces are best guesses until they are used and then must change to conform with reality, not best intentions. We’ve already seen lots of room for improvement for our middleware layer, Daedalus’s design and also new features that would be helpful in making integration simpler and more cost effective. To this end, we’d like our first light clients to come from third parties instead of directly from IOHK, to force our documentation to improve and our software to become less arcane. Much of the effort towards Shelley (Cardano’s next major release) will be focused towards these types of improvements. They create a wonderful feedback loop that helps us make better decisions that benefit larger groups of users. Finally, the Cardano roadmap isn’t the property of IOHK, Cardano Foundation or Emurgo. It belongs to the community. When a cryptocurrency is new, it needs good shepherds to guide and steer, but we cannot allow the ecosystem to be ruled by a beneficent dictator or oligarchy.
This requirement creates some issues about scope. We have a clear idea of what needs to be done over the next 18 months to realize some of the technical requirements of Cardano from scalability to interoperability, but we cannot operate as a dictatorship pushing along and telling everyone to accept it. Cardano belongs to those who hold Ada. Thus we’ve decided to execute with three parallel paths. First, Shelley is our path to the full decentralization of Cardano. This must be done and we have decided to publish in full what Shelley entails. Next, there is the issue of Cardano Improvement Proposals (CIPs) and gathering consent for moving forward on whichever one is decided upon. We have invested a great deal of effort into building a solid voting system and will be releasing it for public review soon. This includes a standard for CIPs.
After Shelley, we will be proposing all changes to Cardano as CIPs and adding increasingly better democratic mechanisms to gather consent for these changes. The burden should be on us to inform and rally the community behind a direction. We cannot allow a technocracy to form alongside a cult of personality of the good leaders. Cardano must have the legitimacy of consent from its users. Finally, the roadmap needs to be often reviewed and updated. It cannot be a static document so we are going to run a monthly clock and update it frequently. We are also going to create a channel for people to proposal alternative roadmaps. As our users become more sophisticated and we accrue more enterprise partnerships, we fully expect divergent paths to be proposed and our hope is that they are better than our own. As a last thought, a cryptocurrency is only as good as the community behind it. We’ve been humbled by how amazing, patient and helpful our community has been. Our hope is that the roadmap is something we can build together over time and it becomes one of our strongest pillars.
To learn more about the project see the recent whiteboard video. Filmed on location at IOHK's Blockchain technology lab, Tokyo Institute of Technology - 東京工業大学
Designing a new virtual machine and universal language framework
Guest blog from Professor Rosu explains the work being done in a partnership between Runtime Verification and IOHK
26 October 2017 8 mins read
Designing a new virtual machine and universal language framework - Input Output
The IELE and K team Mathematical rigor and good design of programming languages and underlying virtual machines are critical for the success of blockchain technologies and applications. Indeed, decades of accumulated evidence show that formal techniques and their early adoption in the design of computing systems can significantly increase the safety, security and dependability of such systems. Moreover, when paired with good user interfaces that hide the mathematical complexity, such techniques can also increase the effectiveness, elegance and quality of code development. A good example is the recent success of functional programming languages and of automated theorem provers or constraint solvers.
Runtime Verification, a University of Illinois start-up founded by computer science Professor Grigore Rosu, has been recently awarded a research and development contract by IOHK to design a next generation virtual machine and a universal language framework to be used as core infrastructure for future blockchain technologies. The formal analysis and verification technology employed in this project was initiated by Prof. Rosu and his collaborators back in 2001, when he was a NASA scientist, and has been improved over more than 15 years of research and development both in his Formal System Laboratory (FSL) at the University of Illinois at Urbana-Champaign and at Runtime Verification, with generous funding of more than $10M from organisations including NSF, NASA, DARPA, NSA, Boeing, Microsoft, Toyota, and Denso. It is about time that aircraft grade, software analysis technology used for mission critical software gets deployed to smart contracts, the blockchain and cryptocurrencies. The project will be executed by a team of Runtime Verification experts led by Prof Rosu, who will work closely with students at the University of Illinois – also funded by IOHK – and with the IOHK research and development team.
IELE – A Register-Based Virtual Machine for the Blockchain
Based on learnings from defining KEVM, our semantics of EVM in K, we will design and define a new VM, which we call IELE (after the Mythological Iele). Unlike the EVM, which is a stack-based machine, IELE will be a register-based machine, like LLVM. It will have an unbounded number of registers and will also support unbounded integers. There are some tricky but manageable aspects with respect to gas calculation, a critical part of the design.
Here are the forces that will drive the design of IELE:
To serve as a uniform, lower-level platform for translating and executing smart contracts from higher-level languages, which can also interact with each other by means of an ABI (Application Binary Interface). The ABI will be a core element of IELE, and not just a convention on top of it. Also, unbounded integers and an unbounded number of registers will make compilation from higher-level languages more straightforward and elegant and, looking at the success of LLVM, more efficient in the long term. Indeed, many of the LLVM optimizations are expected to carry over. For that reason, IELE will follow the design choices and representation of LLVM as much as possible. The team includes an advanced PhD student from Vikram Adve’s lab at the University of Illinois, where LLVM was created, and who is an expert in LLVM.
To provide a uniform gas model, across all languages. The general design philosophy of gas calculation in IELE is “no limitations, but pay for what you consume”. For example, the more registers a IELE program uses, the more gas it consumes. Or the larger the numbers computed at runtime, the more gas it consumes. The more memory it uses, in terms of both locations and size of data stored at locations, the more gas it consumes. And so on.
To make it easier to write secure smart contracts. This includes writing requirements specifications that smart contracts must obey as well as making it easier to develop automated techniques that mathematically verify/prove smart contracts correct wrt to such specifications. For example, pushing a possibly computed number on the stack and then jumping to it regarded as an address makes verification hard, and thus security weaker, with current smart contract paradigms. We will have actual labels in IELE, like in LLVM, and structured jumps to those labels. Also, avoiding the use of a bounded stack and not having to worry about stack or arithmetic overflow will make specification and verification of smart contracts significantly easier.
Like KEVM, the formal semantics of EVM that we previously defined, validated and evaluated using the K framework, the design of IELE will also be done in a semantics-based style, using K. Together with a fast (LLVM-based) execution backend for K, it is expected that the interpreter obtained automatically from the semantics of IELE will be sufficiently efficient to serve as a reference implementation of IELE.
K as a Universal Language for the Blockchain
Besides EVM, several languages have been given a complete formal semantics in K, including C, Java, and JavaScript. Several others will be given K semantics as part of this project, including IELE itself, Solidity and Plutus. We want to allow all these languages, and possibly more as they are given K semantics, to be used for writing smart contracts. For that, we need to enhance the capabilities and implementation of the K platform itself, which will increase the practicality of all new and existing K semantics. We will implement several performance improvements not yet taken beyond proofs of concept, and develop production-quality implementations of analysis features such as the K symbolic execution engine, the semantics-based compiler (SBC), and the program verifier, which currently have only prototype-quality implementations in the academic K project.
Faster Execution
We plan to develop a concrete execution backend to K that will be at least one order of magnitude faster than the current one. The current one is based on translation to OCaml; we plan on translating to LLVM and specializing the pattern matcher to the specific patterns that occur in semantics. We believe it will be possible to execute programs against our KEVM semantics as efficiently as the reference C++ EVM implementation(!). If that will indeed be the case, and we strongly believe it will, then this will mark an unprecedented moment in the history of programming languages, when a language implementation automatically derived from a formal semantics of the language can serve as a realistic implementation of that language. While this was proved as a concept with toy languages, it has never been proven to work with real languages in practice. The K technology has reached a point where this is possible now. And not only to execute programs, or smart contracts, but also to reason about them, because a formal semantics, unlike an interpreter, can also be used for formal verification.
Semantics-Based Compilation
One of the most challenging components of the K framework that will be built in this project is what we call semantics-based compilation. The following picture shows how SBC works:
We have implemented a rough prototype and were able to make it work with a simple imperative language, which we call IMP. Here is an example:
The program to the left is transformed, using the semantics, into a much simpler program that looks like an abstract machine. The four states represent the “instructions” of the new language L’, and the edges are the new semantic rules of L’. As seen, the semantics of the various sequences of instructions has been symbolically summarized, so that the amount of computation that needs to be done at runtime is minimized and everything that can be done statically is hardwired in the new semantics of L’, so all done before the program is executed. Preliminary experiments are encouraging, confirming our strong belief that the resulting SBC programs will execute one order of magnitude, or more, faster:
These improvements to the K framework will not only yield a reasonably efficient prototype of executing smart contracts on IELE, but, more importantly, will give us an approach to write smart contracts in any programming languages that have a formal semantics in K.
We, the K team at RV and at UIUC, are very excited to pursue this new project with IOHK. This gives us a unique chance to demonstrate that the K technology is ready to transit to the real world, in a space where security and trust in computation are paramount. It almost feels like smart contracts are the opportunity that K was waiting for all along, like what it was designed and implemented for.
Search blog
Recent posts
Oasis Pro deal will give developing world better access to financial markets by Anthony Quinn
26 September 2021
Cardano fund injecting $6m to support Africa’s pioneers by Anthony Quinn
25 September 2021
Cardano to integrate Chainlink oracles for real-time market data by Tim Harrison
25 September 2021