In the third post in this series (part 1, part 2) on proof refinement, I'm going to show you how to properly handle bidirectionality in an elegant way. The technique we'll use is the replacement of lists and functions with a data structure called a telescope. This post will use Haskell exclusively, because of the limitations of JavaScript in presenting these things elegantly. I've put together repl.it REPLs for this blog post so you can play around with the code. You can find them here: Addition 1, Addition 2, Addition 3, Lambda Calculus.
Consider the type that represented a successful decomposition of a problem judgment into subproblems, when working in the proof system for addition: ([Judgment], [Nat] -> Nat). The list of judgments represents the subproblems, and the function represents how to compute the result of the main problem from the results of the subproblems. This was problematic to generalize, though, because it meant that all of the subproblems had to be independent. You couldn't use the result of solving an earlier subproblem to state what a later problem was. Information flowed strictly from subproblems out to main problems, never into other subproblems.
This of course was because the subproblems were all given at the same time, and the results were all simultaneous arguments to the function that computed the main result. For instance, we might have a decomposition that looked like ([j0,j1,j2], f) where f = \[r0,r1,r2] -> ..., and we would solve these basically as f [solve j0, solve j1, solve j2]. What could we do to make it possible for later problems to depend on earlier results, though? Well, we could produce subproblems and consume results one at a time. In fact, instead of the above pair, why not have (j0, \r0 -> (j1, \r1 -> (j2, \r2 -> ...)))? This has the type (Judgment, Nat -> (Judgment, Nat -> (Judgment, Nat -> Nat))), which is specific to the case where the list of subproblems has exactly three subproblems in it. This doesn't generalize well, but we can notice the obvious recursive pattern and instead define
data Problems = Done Nat
              | SubProblem Judgment (Nat -> Problems)
Here, we're either done, and we have a resulting number, or we have a subproblem to solve, and a way of getting from the result of solving it to some more problems. Now of course, decomposing actually produced a Maybe ([Judgment], [Nat] -> Nat), so we really ought to define this type to account for the Nothing case as well:
data Problems = Fail
              | Done Nat
              | SubProblem Judgment (Nat -> Problems)
Our decompositions now will look mostly the same, but slightly different:
decomposePlus12 :: Nat -> Nat -> Problems
decomposePlus12 Zero y = Done y
decomposePlus12 (Suc x) y = SubProblem (Plus12 x y) (\z -> Done (Suc z))

decomposePlus13 :: Nat -> Nat -> Problems
decomposePlus13 Zero z = Done z
decomposePlus13 (Suc x) (Suc z) = SubProblem (Plus13 x z) (\y -> Done y)
decomposePlus13 _ _ = Fail

decompose :: Judgment -> Problems
decompose (Plus12 x y) = decomposePlus12 x y
decompose (Plus13 x z) = decomposePlus13 x z
Finding a proof is pretty easy now too, because we can just define it in terms of a second function that handles problems more generally. Dropping the reconstruction of a proof tree, we have:
findProof :: Judgment -> Maybe Nat
findProof j = solveProblems (decompose j)

solveProblems :: Problems -> Maybe Nat
solveProblems Fail = Nothing
solveProblems (Done x) = return x
solveProblems (SubProblem j f) =
  do x <- findProof j
     solveProblems (f x)
The interesting thing here is how we solve problems. If we fail, well, we've failed, so there's nothing to return. If we've finished, we've finished and so there's a number to return. But what if we have a subproblem? Well, we simply find a proof for it, computing the result as x, and then use the result of that to get more problems to solve, and solve those.
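To make this concrete, here is a short trace of a small example (my own illustration; it assumes the Nat and Judgment definitions from the earlier posts), computing 1 + 1 via Plus12:

-- findProof (Plus12 (Suc Zero) (Suc Zero))
--   = solveProblems (decompose (Plus12 (Suc Zero) (Suc Zero)))
--   = solveProblems (SubProblem (Plus12 Zero (Suc Zero)) (\z -> Done (Suc z)))
--   = do x <- findProof (Plus12 Zero (Suc Zero))   -- x = Suc Zero
--        solveProblems (Done (Suc x))
--   = Just (Suc (Suc Zero))                        -- i.e. 1 + 1 = 2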
Generalizing
Having established the general shape of this approach, we can now move on to generalizing the pattern involved. The first move we'll make is to observe that we might want to generalize the type of judgments to index them by the type of their result. After all, we might also want to include predicates in the class of possible judgments, where there are no useful return values at all, just (). So we can generalize Judgment, and in turn, Problems, like so:
data Judgment r where
  Plus12 :: Nat -> Nat -> Judgment Nat
  Plus13 :: Nat -> Nat -> Judgment Nat

data Problems r where
  Fail :: Problems r
  Done :: r -> Problems r
  SubProblem :: Judgment s -> (s -> Problems r) -> Problems r
As soon as we do this, we discover that Problems looks an awful lot like a monad, and indeed, it is!
instance Functor Problems where
  fmap f Fail = Fail
  fmap f (Done x) = Done (f x)
  fmap f (SubProblem p g) = SubProblem p (fmap f . g)

instance Applicative Problems where
  pure = Done
  pf <*> px = pf >>= \f -> px >>= \x -> return (f x)

instance Monad Problems where
  return = Done
  Fail >>= g = Fail
  Done x >>= g = g x
  SubProblem p f >>= g = SubProblem p (\x -> f x >>= g)
This monad instance basically just codes up concatenation of problems. With lists of judgments, we could simply concatenate them, but what do we do with the functions that construct results? Here instead we say that if we have one sequence of problems that produces some result, and from that result we can compute another sequence of problems, then we can just dig around in the first sequence and replace its Done node (which ends the sequence of problems with the result) by the problems we would get. We thus get a single big sequence of problems.
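As a small illustration (mine, not from the original post), here is what bind does to two concrete telescopes; the Done node of the first is replaced by whatever telescope the continuation builds from its result:

-- A sketch: chain two additions, feeding the first result into the second.
twoAdditions :: Problems Nat
twoAdditions =
  SubProblem (Plus12 (Suc Zero) (Suc Zero)) Done
    >>= \r -> SubProblem (Plus12 r (Suc Zero)) Done

-- twoAdditions behaves as the single telescope
--   SubProblem (Plus12 (Suc Zero) (Suc Zero))
--              (\r -> SubProblem (Plus12 r (Suc Zero)) Done)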
This monadic interface also gives us a really elegant way of writing these telescopes:
subProblem :: Judgment r -> Problems r
subProblem j = SubProblem j (\x -> Done x)

decomposePlus12 :: Nat -> Nat -> Problems Nat
decomposePlus12 Zero y = return y
decomposePlus12 (Suc x) y =
  do z <- subProblem (Plus12 x y)
     return (Suc z)

decomposePlus13 :: Nat -> Nat -> Problems Nat
decomposePlus13 Zero z = return z
decomposePlus13 (Suc x) (Suc z) =
  subProblem (Plus13 x z)
decomposePlus13 _ _ = Fail
Let's add in a full ternary predicate version of our Plus to see how this works with other kinds of returned values:
data Judgment r where
  Plus12 :: Nat -> Nat -> Judgment Nat
  Plus13 :: Nat -> Nat -> Judgment Nat
  Plus123 :: Nat -> Nat -> Nat -> Judgment ()

decomposePlus123 :: Nat -> Nat -> Nat -> Problems ()
decomposePlus123 Zero y z =
  if y == z
    then return ()
    else Fail
decomposePlus123 (Suc x) y (Suc z) =
  subProblem (Plus123 x y z)
decomposePlus123 _ _ _ = Fail
Readers who are especially familiar with functional programming idioms will observe that this is a variety of free monad construct, namely, the call-response tree variety.
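For readers who want that connection spelled out, here is a hedged sketch of the generic construct (the names are mine): Problems is this call-response tree over the Judgment functor, plus an extra Fail node.

data CallResponse f r where
  Return :: r -> CallResponse f r
  Call   :: f s -> (s -> CallResponse f r) -> CallResponse f r

-- Problems r is essentially CallResponse Judgment r extended with Fail,
-- and its Monad instance is the standard one for this free construction.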
And now, what parts of this really depend on the problem domain of addition? Well, clearly Judgment, because that defines what the problems are in the first place. And of course, as a result of that, the various decomposition functions. But not much else, provided we have some means of abstracting over those. Namely: the Problems type can be generalized away from Judgment, and findProof can be generalized away from the implementation of decompose, by way of a type class.
data Problems (j :: * -> *) (r :: *) where
  Fail :: Problems j r
  Done :: r -> Problems j r
  SubProblem :: j s -> (s -> Problems j r) -> Problems j r

subProblem :: j r -> Problems j r
subProblem j = SubProblem j (\x -> Done x)

class Decomposable j where
  decompose :: j r -> Problems j r

findProof :: Decomposable j => j r -> Maybe r
findProof j = solveProblems (decompose j)

solveProblems :: Decomposable j => Problems j r -> Maybe r
solveProblems Fail = Nothing
solveProblems (Done x) = Just x
solveProblems (SubProblem j f) =
  do x <- findProof j
     solveProblems (f x)
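For the addition judgments, the corresponding instance just dispatches to the decomposition functions we already wrote (a sketch; their result types simply pick up the extra index, e.g. Problems Judgment Nat):

instance Decomposable Judgment where
  decompose (Plus12 x y)    = decomposePlus12 x y
  decompose (Plus13 x z)    = decomposePlus13 x z
  decompose (Plus123 x y z) = decomposePlus123 x y z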
Having abstracted this far, we now can extract this into a little library and use this for bidirectional proof systems in general. Let's now tackle the simply typed lambda calculus.
Simply Typed LC
Because we've extracted out the proof refinement toolkit, we need to only give definitions for the judgments and decomposition of our lambda calculus. This is a great simplification from the setting before. We can now express that type checking is a judgment that produces no interesting information, but that type synthesis will give us some type information:
data Judgment r where
  Check :: [(String,Type)] -> Program -> Type -> Judgment ()
  Synth :: [(String,Type)] -> Program -> Judgment Type
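The Program and Type types come from the earlier posts in the series. For reference, here is a sketch consistent with the constructors used below; TyBase is a stand-in for base types of my own invention, not necessarily what the original posts use:

data Type = TyBase String       -- stand-in base types (hypothetical)
          | Arr Type Type       -- function types
          | Prod Type Type      -- product types
  deriving (Eq)                 -- Eq is needed for the check-via-synthesis fallback

data Program = Var String
             | Ann Program Type
             | Lam String Program
             | App Program Program
             | Pair Program Program
             | Fst Program
             | Snd Program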
Our decompositions are now more interesting as well, and hopefully a bit more insightful:
decomposeCheck :: [(String,Type)] -> Program -> Type -> Problems Judgment ()
decomposeCheck g (Pair m n) (Prod a b) =
  do subProblem (Check g m a)
     subProblem (Check g n b)
decomposeCheck g (Lam x m) (Arr a b) =
  subProblem (Check ((x,a):g) m b)
decomposeCheck g m a =
  do a2 <- subProblem (Synth g m)
     if a == a2
       then return ()
       else Fail
decomposeSynth :: [(String,Type)] -> Program -> Problems Judgment Type
decomposeSynth g (Var x) =
  case lookup x g of
    Nothing -> Fail
    Just a -> return a
decomposeSynth g (Ann m a) =
  do subProblem (Check g m a)
     return a
decomposeSynth g (Fst p) =
  do t <- subProblem (Synth g p)
     case t of
       Prod a b -> return a
       _ -> Fail
decomposeSynth g (Snd p) =
  do t <- subProblem (Synth g p)
     case t of
       Prod a b -> return b
       _ -> Fail
decomposeSynth g (App f x) =
  do t <- subProblem (Synth g f)
     case t of
       Arr a b ->
         do subProblem (Check g x a)
            return b
       _ -> Fail
decomposeSynth g m = Fail
instance Decomposable Judgment where
  decompose (Check g m a) = decomposeCheck g m a
  decompose (Synth g m) = decomposeSynth g m
And we're done! That is the full definition of the type checker for the simply typed lambda calculus with pairs and functions! It has the benefit of being fairly straightforward to read.
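As a quick sanity check, here's a hedged usage sketch (the example term and the TyBase stand-in are mine): synthesizing the type of an annotated swap function.

exampleTy :: Maybe Type
exampleTy =
  findProof (Synth [] (Ann swap (Arr (Prod a a) (Prod a a))))
  where
    swap = Lam "p" (Pair (Snd (Var "p")) (Fst (Var "p")))
    a    = TyBase "A"

-- exampleTy == Just (Arr (Prod (TyBase "A") (TyBase "A"))
--                        (Prod (TyBase "A") (TyBase "A")))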
Conclusion
This wraps up the series of blog posts on proof refinement. One limitation of this approach is that errors are uninformative, but we can actually modify this toolkit to provide not just informative errors (Either instead of Maybe), but highly informative context-aware errors that know what subproblems are being worked on. Another limitation is that the above toolkit only works when there is at most one result from the bottom-up direction. That is to say, either a judgment has no proofs, and so there's no bottom-up result, or it has exactly one proof and thus one bottom-up result. But we might have multiple such results: for instance, we might have instead built a system for addition that has the first two arguments of Plus as the bottom-up results (i.e. solutions for Plus3 c), and we'd like to be able to get out all pairs (x,y) such that x + y = c for fixed c. There are plenty of those pairs, so we had better be able to get some kind of list-like results. We also might imagine some other kind of system where, in the course of constructing a proof, we need to invent something out of thin air, such as a new name for a variable. In that kind of setting we'd like to have a proof system that could make use of some state for the collection of generated names. I'll look at both of these limitations in future blog posts.
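For a taste of the first modification, here's a hedged sketch (mine, not from the series) of an Either-based solver; since Fail carries no information in the current type, the error message is necessarily generic:

solveProblemsE :: Decomposable j => Problems j r -> Either String r
solveProblemsE Fail             = Left "no rule applies"
solveProblemsE (Done x)         = Right x
solveProblemsE (SubProblem j f) =
  do x <- solveProblemsE (decompose j)
     solveProblemsE (f x)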
If you have comments or questions, get in touch. I'm @psygnisfive on Twitter, augur on freenode (in #cardano and #haskell).
This post is the third part of a three part series, the first post is Proof Refinement Basics, and the second post is Bidirectional Proof Refinement.
A Crypto on the Edge of Forever
28 December 2017 7 mins read
Now that the dust has settled after travelling to more than 20 countries, dozens of conferences, major events and community meet and greets this year, I’ve finally had the time to reflect on the progress of the Cardano project as well as some of the lessons I’ve learned. It’s honestly been the most challenging year of my life, filled with drama, stress, death and some unbelievably cruel people. It’s also been one of the most rewarding and joyful years, having had the chance to meet thousands of passionate and kind fans, technologists and scientists – I can see the inspiration that Charles Dickens had when he said it was the best of times and the worst of times. The reality is that the internet and in particular the cryptocurrency space can be a really toxic place if you allow it to get to you. There were times after reading some blog post or comment on Reddit that I seriously questioned if this effort was worth it. I can understand why Mike Hearn left Bitcoin.
But I’ve never been here for the short term, it’s always been the dream of finding a way to get financial services to the three billion people who don’t have them using technology that was only a dream a generation ago. And I think we are making great progress there.
In January of 2017, Cardano was still mostly in a very early alpha stage. We had tremendous engineering difficulty getting Haskell, our DevOps and the new protocols such as Ouroboros and Scrape to play nicely together. Or rather, it was a constant learning curve of how to tame the three-headed dragon of research, decentralized teams and exotic programming languages while managing the expectations of a huge community.
As an aside, Cardano has one of the fastest growing and most intelligent fanbases. We actively invited people who care about formal methods, peer review and functional programming to come and see what we are working on. These people aren’t swayed by jargon or flashy marketing. They were born with bullshit detectors in their cribs.
I’ve gained significant strength and a much needed boost in morale from interacting with our community. For example, one member asked about how we were verifying the proofs in the Ouroboros paper and I posted a link to Kawin’s Isabelle repo. Most would simply say ‘that’s nice’ and move on. This member took the time to read the code and mentioned we had a long way to go with specific examples.
For most people, Isabelle is a name followed by a lake in Minnesota. For our community, some can actually read the code and comment on it. That’s a rare gift and it’s the privilege of a lifetime to be in this kind of environment (we ended up hiring the person who commented on the code).
Moving through the months, Cardano moved from the lab to a series of testnets to eventually being released in September. Dealing with these transitions gave us a newfound appreciation for just how many different computer and network configurations exist. I can almost feel a Windows force ghost whispering “I told you so” in a smug voice.
We designed Byron (the September release of Cardano) to be the minimum viable product necessary to test the concepts Cardano is built upon. We wanted to run Ouroboros in a production setting to see epochs function properly. We wanted extensive logging of both the edge nodes and relays to see how our network is being used. We wanted to have third parties play with our APIs and tell us where we screwed up (boy did they ever!). We wanted to test the update system a few times.
Overall, the experiment has been a tremendous success. There are several thousand edge nodes concurrently connected to the network. There are several exchanges and other third parties using our software in the harshest possible way. There is a wealth of data flowing in that is giving us a much better sense of what we need to do to make Cardano better.
Since launch, we’ve already pushed three updates to the network without incident. We’ve started a very rapid redesign of our middleware and its associated APIs to make it easier for third parties to integrate. We’ve started a series of systematic improvements to our network stack that will be finished with the Shelley release that should dramatically improve things.
However, what excites me most about 2018 is that Cardano is starting to open up to the world. Delegation and staking will be rolled out all throughout Q1 and Q2 in coordination with the community. Soon we’ll have a testnet running IELE allowing developers to play around with our smart contract model for the first time. And we’ll be deploying our first verified protocol with Praos thereby engaging the formal methods community.
Constantly living in the moment, one tends to eschew Cardano’s vast scope in exchange for the problem of the week. But looking at our ever growing Cardano whiteboard series demonstrates how many brilliant people wake up every morning thinking about how to solve the problems of scalability, interoperability and sustainability. These aren’t just hypothetical lectures. They are backed by papers, funding and developers working full time.
Then there are the new things. The work by Professor Rosu and Runtime Verification on K and semantics based compilation isn’t just really smart competitive differentiation, it’s literally moving the chains of the entire field of programming language theory. The Cardano project is creating a financial incentive to have correct by construction infrastructure from virtual machines to compilers. Our success means you don’t have to handwrite this code ever again – not just in a cryptocurrency context but in a general context.
Our research efforts at Tokyo Tech under Professors Mario Larangeira and Bernardo David on multiparty computation are rapidly bringing these protocols into practical use. Kaleidoscope and Royale are case studies on how to achieve everything that Ethereum does off chain, in a low-latency setting, privately and at a scale of millions of concurrent users each in their own domain. Further abstractions will push this work into more useful domains like decentralized exchange. And eventually DApp developers will be able to integrate these protocols into their code via libraries.
Professor Bingsheng Zhang’s research on treasuries and voting is groundbreaking. It gives our project the ability to open a discussion about how changes to cryptocurrencies should be proposed, debated, approved and funded. What’s most special here is the interdisciplinary nature of the effort that can draw from political science, game theory, sociology, open source software governance and computer science. There is something for everybody.
Moving into 2018, we are going to open this discussion up by both engaging the community directly and by holding a conference in Switzerland. More details will be published later, but the basic idea is that this area isn’t a Cardano problem. It’s a cryptocurrency problem. And there are many great projects from Dash to Pivx who are trying to solve it in a novel way. We ought to talk to each other.
I could continue to enumerate our research efforts (there’s a lot more to write), but I think the point has been made. Cardano isn’t a cryptocurrency as much as it is a movement of minds who are frustrated with the way technology works in practice.
The functional programming community has for decades had great solutions to many of the problems plaguing modern developers, but they have been historically ignored. Our RINA guys, if given a chance, could build a much better and more fair internet. Layering protocol development with formal methods extracts a much cleaner and more meaningful design process where ambiguity and hand waving are slain.
What Cardano has given us is a chance to answer "if only the world worked this way" with "why not?" We have the freedom to dream again and the freedom to try new things without asking permission. I even have a chance to work with my heroes like Phil Wadler. 2018 is going to be one hell of a year.
Thanks for reading.
Daedalus Wallet launches for Ethereum Classic
The Grothendieck team’s Daedalus integration for the Mantis client is now live
22 December 2017 5 mins read
I am pleased to be able to write about the latest release from Team Grothendieck in conjunction with Team Daedalus. It's called the Daedalus release and it combines the functionality of the highly thought out Daedalus Wallet and the Mantis Ethereum Classic Client.
The previous release of the Mantis client was the beta release in August and since then we've been busy both supporting the Daedalus integration and improving the code base, making it ready for production.
We are now delighted to make our first release candidate available. Download the Daedalus Mantis integration for Windows 10 and MacOS.
Make sure to check the downloaded binary file against the fingerprint listed on the download page!
The installer installs both Mantis backend and Daedalus wallet and sets up a secure connection between them using an SSL certificate.
The Mantis part of the integration has been packaged with a Java Virtual Machine included to allow fast and easy setup of the client. This initial release has focused on the basic wallet-to-wallet transfer function and enjoys the safety features Daedalus is known for. This is still very new software; it is not recommended to use this wallet for any high-value transactions, because we consider this release to be a live test. For users who already use Daedalus Wallet for Cardano, these are currently separate products, one targeting ETC users and one targeting Cardano users, and they should not be installed side by side. Near-future releases will provide a more integrated experience.
Ethereum Classic, for all its strengths, suffers from the same synchronization issue as other blockchains – it takes an impractically long time. Reducing this time will be a priority for future releases, but at the moment it's quite impractical to wait for days to sync up the chain, and a synchronized chain is required to allow use of the wallet.
The solution for this is to use our prepackaged bootstrap databases. Anyone familiar with the beta release will remember the bootstrap databases we provided to users to circumvent the download wait.
The easiest way to get a bootstrap database is through the installer. The installer will download a bootstrap database, check that there's enough free space (~33GB) and then check the downloaded file’s fingerprint. All that being done, it will unzip the file to the appropriate database folder and then clean up after itself. This process can take as little as fifteen minutes or in the worst case several hours depending on the network and disk resources available. The compressed bootstrap file itself is of the order of 10GB.
The log file, located by default in the $HOME/.mantis/logs folder, gives a step-by-step account of the download process.
Upon successful completion, starting the Daedalus wallet will show an almost synchronized database which should take less than an hour to complete its synchronization, depending on how many days pass between when the database was created and installation time.
The second way to install the bootstrap database is to make sure Mantis is stopped and then download the file and unzip it into the $HOME/.mantis/leveldb folder. Remember to delete or move the previous contents of the folder. When Daedalus is restarted, it will pick up the database and the wallet should be up to date in less than an hour. Again, the time will depend on how old the bootstrap database is at the time of download.
To protect yourself against MITM attacks, always compare the hash of the bootstrap database against the hash published on the Mantis download website.
macOS users must use the manual bootstrap method for now!
Users should be aware that stopping the wallet also stops the Mantis backend and so ends the synchronization process.
As well as the Daedalus Mantis integration we also announce release candidate 1 of the Mantis command line client. This is a continuation of the command line client we released in August. It is almost identical but where the default configuration is set up for Daedalus, the command line default is set up for "normal" use. Normal use just means that any client can connect to Mantis over HTTP rather than over HTTPS. It also is available for use on Linux and contains no prepackaged JVM.
Apart from the wallet integration, CPU mining has been added to the feature list. In the last release we provided integration with an external miner; this version allows Mantis to perform CPU mining by simply turning on a flag in the configuration.
At about 17.30 GMT on Monday 11 December the first block reward reduction took place, the visible effect of ECIP 1017 and Mantis handled it as expected. After block 5,000,000 the reward for mining a block was reduced by 1 ETC. It was vitally important that all clients implemented this change to prevent a network split.
Peer discovery, that is, maintaining a gradually changing list of good peers by asking our peers for their peers and so on, is another feature added since the beta release.
I witnessed a lot of hard work going into this code base and I'm very pleased for the whole Mantis team, Team Daedalus and our DevOps troops who wrestled tirelessly with our various continuous integration pipelines. Their hard work has culminated in a software release they can be very proud of.
Christmas is just around the corner here in Ireland and will be a welcome break. But January 2018 will be here soon and we will start our security audit of the Mantis code base – and more technology related adventure beckons!
IELE: A New Virtual Machine for the Blockchain
Specialized smart contract execution on the blockchain
15 December 2017 8 mins read
Runtime Verification (RV) is proud to release their first version of IELE, a new virtual machine for the blockchain.
What is IELE?
IELE is a variant of LLVM specialized to execute smart contracts on the blockchain. Its design, definition and implementation have been done at the highest mathematical standards, following a semantics-first approach with verification of smart contracts as a major objective. Specifically, we have defined the formal syntax and semantics of IELE using the K framework, which in turn gives us an executable reference model in addition to a series of program analysis tools, including a program verifier. K was created by our team during the last 15 years and incorporates the state of the art in language design, semantics and formal methods. The design of IELE was based on our experience with formally defining dozens of languages in K, but especially on recent experience and lessons learned while formally defining two other virtual machines in K, namely:
- KEVM, our semantics of the Ethereum Virtual Machine (EVM); and
- KLLVM, our semantics of LLVM; the latest version of the LLVM semantics will be made public when complete and published, but an earlier version is available.
Unlike the EVM, which is a stack-based machine, IELE is a register-based machine, like LLVM. It has an unbounded number of registers and also supports unbounded integers. To get a feel for what IELE programs look like, here are two of them (these have not been verified yet and may change):
- erc20.iele - an IELE implementation of an ERC20 token
- forwardingWallet.iele - a wallet implementation that creates and calls into another contract
Design Rationale
Here are the forces that drove the design of IELE:
To serve as a uniform, lower-level platform for translating and executing smart contracts from higher-level languages. The contracts can interact with each other by means of an ABI (Application Binary Interface). The ABI is a core element of IELE, and not just a convention on top of it. The unbounded integers and unbounded number of registers should make compilation from higher-level languages more straightforward and elegant and, looking at the success of LLVM, more efficient in the long term. Indeed, many of the LLVM optimizations are expected to carry over. For that reason, IELE followed the design choices and representation of LLVM as much as possible. The team also includes LLVM experts from the University of Illinois (where LLVM was created).
To provide a uniform gas model across all languages. The general design philosophy of gas calculation in IELE is "no limitations, but pay for what you consume". For example, the more registers an IELE program uses, the more gas it consumes. Or the larger the numbers computed at runtime, the more gas it consumes. The more memory it uses, in terms of both locations and size of data stored at locations, the more gas it consumes. And so on.
To make it easier to write secure smart contracts. This includes writing requirements specifications that smart contracts must obey as well as making it easier to develop automated techniques that mathematically verify / prove smart contracts correct with respect to such specifications. For example, pushing a possibly computed number onto the stack and then jumping to it as if it were an address makes verification hard, and thus security weaker, with current smart contract paradigms. IELE has named labels, like LLVM, and jump statements can only jump to those labels. Also, IELE avoids the use of a bounded stack, and not having to worry about stack or arithmetic overflow makes specification and verification of smart contracts significantly easier.
Like KEVM, the formal semantics of EVM that we previously defined, validated and evaluated using the K framework, the design of IELE was also done in a semantics-based style, using K. Together with a fast (LLVM-based) execution backend for K that is still under development, it is expected that the interpreter obtained automatically from the semantics of IELE will be sufficiently efficient to serve as a reference implementation of IELE.
What's next?
To achieve the full potential of IELE, we plan to next work on the following:
Efficient backend for K, so that K semantics, including IELE, can be executed with acceptable performance. As discussed in our KEVM paper, the current version of K can execute the EVM semantics at performance that stays within an order of magnitude of the performance of the reference C++ implementation of the EVM. We believe that we can improve the execution performance of K by one order of magnitude. If this is achieved, then there is no incentive to implement IELE in an ad-hoc way: the K executable semantics of IELE will also be its implementation, so it will be correct by construction and thus implementation defects of the VM itself cannot be exploited anymore. Also, IELE itself would be easier to maintain and future versions will be easier to deploy.
Compilers/Translators from Solidity and Plutus to IELE. Writing smart contracts directly in IELE is a bit more feasible than in the EVM, because IELE follows the LLVM IR which was designed to be human-readable, but the IELE code is still low-level and thus error-prone. To properly test IELE and gain confidence in its overall design and capabilities, we will implement a compiler/translator from Solidity to IELE, also in K. Since Plutus rises as a star among the functional programming languages for smart contracts and since we are defining a formal semantics of Plutus as well, a compiler from Plutus to IELE will be developed immediately after Solidity's.
Semantics-based compilation. In addition to improving K's performance, we plan to implement a tool that we call semantics-based compiler on top of K. See our previous blog post for details. The idea is to take a programming language semantics L and a program P in L, and generate (using symbolic execution heavily) a new language semantics L' which is a specialization of L for P. We expect at least one order of magnitude increase in performance. More importantly, this will give us a uniform mechanism to translate any programs in any programming languages that have a K semantics to IELE, thus making IELE and K into a universal platform for executing smart contracts in any language.
Deploy IELE on the Cardano blockchain.
Technical Details and Download
IELE is thoroughly commented and freely available under the UIUC license (as permissive as the MIT license) on Github:
- IELE repository on Github
In addition to the two IELE programs mentioned above, erc20.iele and forwardingWallet.iele, which show that IELE code is human readable, the following links into the github repo will give a good idea of what IELE is and how it differs from EVM and LLVM:
- iele-syntax.md - the complete formal syntax of the IELE language.
- iele.md - the complete formal executable semantics of the IELE language
- Design.md - the design rationale of IELE, as well as detailed comparisons with LLVM and EVM
- iele-gas.md - the current gas model of IELE (which is still to be tuned as we develop compilers to IELE)
Get involved
In the spirit of open source, community-driven development, we will be holding all IELE discussions on our channels
- #IELE:matrix.org on Riot
- runtimeverification/iele-semantics on Gitter
We encourage any interested parties to engage us, ask questions, contribute code, or build experience with our tools. We are also always looking for contributors able to work on documentation, efficient install/quickstart process for new developers, and more examples and tests. We are hiring, and will be sure to keep an eye open for helpful contributors!
We will also be posting updates on our brand new Twitter page @rv_inc, which we hope any interested developers will follow and interact with.
Let's build more secure smart contracts for everybody, together!
Acknowledgements
We warmly thank IOHK for their generous funding support of both IELE and KEVM. IELE, in particular, would not have been possible without IOHK's support, its continuous research meetings, and the stimulating technical discussions we had with their research team.
We also thank the K team who defined the KEVM semantics (see technical report, too) and verified smart contracts for ERC20 compliance. Their effort and non-trivial proofs at the EVM level led to the quest for a new VM with better support for verification of smart contracts.
Artwork, Mike Beeple
Simplicity
Only once in my life have I encountered a programming language that was too simple to use. That was Lispkit Lisp, developed by Peter Henderson, Geraint Jones, and Simon Jones, which I saw while serving as a postdoc at Oxford, 1983–87, and which despite its simplicity was used to implement an entire operating system. It is an indictment of the field of programming languages that I have not since encountered another system that I consider too simple. Until today. I can now add a second system to the list of those that are too simple, the appropriately-titled Simplicity, developed by Russell O'Connor of Blockstream. It is described by a paper here and a website here.
The core of Simplicity consists of just nine combinators: three for products (pair, take, and drop), three for sums (injl, injr, and case), one for unit (unit), and two for plumbing (iden and comp). It is thoroughly grounded in ideas from the functional programming, programming language, and formal methods communities.
When I call Simplicity too simple it is intended as a compliment. It is delightful to see full adders and cryptographic hash functions cobbled together using just products, sums, and units. It is eye-opening to see how far one can get without recursion or iteration, and how this enables simple analyses of the time and space required to execute a program. It is a confirmation to see a system with foundations in category theory and sequent calculus. Now I know what to say when developers respond to my talk "Categories for the Working Hacker" by asking "But how can we use this in practice?"
The system is accompanied by a proof of its correctness in Coq, which sets a high bar for competing systems. O'Connor even claims to have a proof in Coq that the Simplicity implementation of SHA-256 matches the reference specification provided by Andrew Appel's Verified Software Toolchain project (VST), which VST proved corresponds to the OpenSSL implementation of SHA-256 in C.
At IOHK, I have been involved in the design of Plutus Core, our own smart contract scripting language, working with Darryl McAdams, Duncan Coutts, Simon Thompson, Pablo Lamela Seijas, and Grigore Rosu and his semantics team. We have a formal specification which we are preparing for release. O'Connor's work on Simplicity has caused us to rethink our own work: what can we do to make it simpler? Thank you, Russell!
That said, Simplicity is still too simple, and despite its emphasis on rigour there are some gaps in its description.
Jets
A 256-bit full adder is expressed with 27,348 combinators, meaning addition in Simplicity requires several orders of magnitude more work than the four 64-bit addition instructions one would normally use. Simplicity proposes a solution: any commonly used sequence of instructions may be abbreviated as a "jet", and implemented in any equivalent manner. Hence, the 27,348 combinators for the 256-bit full adder can be ignored, and replaced by the equivalent four 64-bit additions.
All well and good, but this is where it gets too simple. No one can afford to be inefficient by several orders of magnitude. Hence, any programmer will need to know what jets exist and to exploit them whenever possible. In this sense, Simplicity is misleadingly simple. It would be clearer and cleaner to define each jet as an opcode. Each opcode could still be specified by its equivalent in the other combinators of Simplicity, but programs would be more compact, faster to execute, and—most important—easier to read, understand, and analyse accurately. If one ignores jets, the analyses of time and space required to execute a program, given toward the end of the paper, will be useless—off by orders of magnitude. The list of defined jets is given nowhere in the paper. Nor could I spot additional information on Simplicity linked to from its web page or findable by a web search. More needs to be done before Simplicity can be used in practice.
Gaps
It's not just the definition of jets which is absent from the paper, and cannot be found elsewhere on the web. Lots more remains to be supplied.
- Sections 2.4, 2.5, 3.2 claim proofs in Coq, but apart from defining the semantics of the nine combinators in Appendix A, no Coq code is available for scrutiny.
- Section 2.5 claims a representation of Simplicity terms as a dag, but it is not specified. Lacking this, there is no standard way to exchange code written in Simplicity.
- Section 4.4 defines an extended semantics for Simplicity that can read the signature of the current transaction, support Merklised abstract syntax trees, and fail when a transaction does not validate. It also lifts meanings of core (unextended) Simplicity programs to the extended semantics. However, it says nothing about how the seven combinators that combine smaller Simplicity programs into bigger ones act in the extended semantics! It's not hard to guess the intended definitions, but worrying that they were omitted from a paper that aims for rigour.
- Section 3 provides a Bit Machine to model the space and time required to execute Simplicity. The model is of limited use, since it ignores the several orders of magnitude improvement offered by jets. Further, the Bit Machine has ten instructions, enumerated on pages 10–12, but the list omits the vital "case" instruction which appears in Figure 2. Again, it's not hard to guess, but worrying it was omitted.
Michelson
A second language for scripting blockchains is Michelson. It is described by a paper here and a website here. (Oddly, the website fails to link to the paper.)
I will offer just one word on Michelson. The word is: "Why?" Michelson takes many ideas from the functional programming community, including higher-order functions, data structures such as lists and maps, and static type safety.
Currently, it is also much more thoroughly described and documented than Simplicity. All of this is to be commended.
But Michelson is an inexplicably low-level language, requiring the programmer to explicitly manipulate a stack. Perhaps this was done so that there is an obvious machine model, but Simplicity offers a far superior solution: a high-level model for programming, which compiles to a low-level model (the Bit Machine) to explicate time and space costs.
Or perhaps Michelson is low-level to improve efficiency. Most of the cost of evaluating a smart contract is in cryptographic primitives. The rest is cheap, whether compiled or interpreted. Saving a few pennies of electricity by adopting an error prone language—where there is a risk of losing millions of dollars in an exploit—is a false economy indeed. Premature optimisation is the root of all evil.
The language looks a bit like all the bad parts of Forth and Lisp, without the unity that makes each of those languages a classic. Lisp idioms such as CAAR and CDADAR are retained, with new ones like DUUP, DIIIIP, and PAAIAIAAIR thrown in.
There is a fair set of built-in datatypes, including strings, signed and unsigned integers, unit, product, sum, options, lists, sets, maps, and higher-order functions. But there is no way for users to define their own data types. There is no way to name a variable or a routine; everything must be accessed by navigating a data structure on the stack.
Some operations are specified formally, but others are left informal. For lists, we are given formal rewriting rules for the first three operators (CONS, NIL, IF_CONS) but not the last two (MAP, REDUCE). Type rules are given in detail, but the process of type inference is not described, leaving me with some questions about which programs are well typed and which are not. It reminds me of a standard problem one sees in early work by students—the easy parts are thoroughly described, but the hard parts are glossed over.
If I have understood correctly, the inference rules assign types that are monomorphic, meaning each term has exactly one type. This omits one of the most successful ideas in functional programming, polymorphic routines that act on many types. It means back to the bad old days of Pascal, where one has to write one routine to sort a list of integers and a different routine to sort a list of strings.
Several of these shortcomings are also shared by Simplicity. But whereas Simplicity is intended as a compilation target, not to be read by humans, the Michelson documentation includes a large collection of examples suggesting it is intended for humans to write and read.
Here is one of the simpler examples from the paper.
{ DUP ; CDAAR ; # T
  NOW ;
  COMPARE ; LE ;
  IF { DUP ; CDADR ; # N
       BALANCE ;
       COMPARE ; LE ;
       IF { CDR ; UNIT ; PAIR }
          { DUP ; CDDDR ; # B
            BALANCE ; UNIT ;
            DIIIP { CDR } ;
            TRANSFER_TOKENS ;
            PAIR } }
     { DUP ; CDDAR ; # A
       BALANCE ;
       UNIT ;
       DIIIP { CDR } ;
       TRANSFER_TOKENS ;
       PAIR } }
The comment # T is inserted as a reminder that CDAAR extracts variable T, and similarly for the other variables N, B, and A. This isn't the 1950s. Why don't we write T when we mean T, instead of CDAAR? WHY ARE WE WRITING IN ALL CAPS?
In short, Michelson is a bizarre mix of some of the best and worst of computing.
Conclusion
It is exciting to see ideas from the functional programming, programming languages, and formal methods communities gaining traction among cryptocurrencies and blockchains. While there are shortcomings, it is fantastic to see an appreciation of how these techniques can be applied to increase reliability—something which the multi-million dollar exploits against Ethereum show is badly needed. I look forward to participating in the conversations that ensue!
Postscript
The conversation has begun! Tezos have put up a page to explain Why Michelson. I've also learned there is a higher-level language intended to compile into Michelson, called Liquidity.