As I expect to cut back on blogging soon, I’m delighted to report that Eric Drexler now blogs, at Metamodern.com. Of his first dozen posts, my favorite is on the promise of formal proof in math and digital systems:
Did Robin post earlier that he was going to cut back on blogging so that it's not a surprise to anybody?
Re: You lose some of the value of strong static typing when you make it optional.
Static typing sucks for most things - languages at that level should be prototype-based - but there are some cases where it is useful. Therefore, it should be optional - provided the syntactic cost is not too great. IMO, the syntactic cost is negligible - if you treat type safety as just another sort of contract.
A lot of people are saying that incremental progress gets made (especially in Haskell), but others on this thread seem to deny that incremental progress is possible. I really like Barry Kelly's beginning about crypto, but his last two paragraphs seemed basically wrong. Yes, there aren't many mathematically oriented people writing business code, so we need tools for everyone else, but incremental progress is being made on more usable tools.
Phil Goetz, just because a lot of people went down the same wrong road doesn't mean there aren't other routes. It's a sign that it's premature to design languages for provable correctness, but progress is being made on fragments, like type safety.
Unit testing has improved the quality of software immensely. Test-driven development (or behavior-driven development, which is really only slightly different) consists of writing tests (or specifications that are automatically tested) before the actual tested code is written.
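As a minimal sketch of the test-first workflow (the function name `slugify` and its behavior are just an illustrative assumption, not anything from the post): the tests are written first, fail because nothing exists yet, and then the implementation is written to make them pass.

```python
import unittest

# TDD step 1: write the tests before the implementation exists.
class TestSlugify(unittest.TestCase):
    def test_lowercases_and_hyphenates(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_strips_punctuation(self):
        self.assertEqual(slugify("Hi, there!"), "hi-there")

# TDD step 2: write just enough code to make the tests pass.
def slugify(text: str) -> str:
    # Replace non-alphanumeric characters with spaces, then join words with hyphens.
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in text)
    return "-".join(word.lower() for word in cleaned.split())
```

Run with `python -m unittest <file>`; in strict TDD you would see the tests fail with a `NameError` before `slugify` is written at all.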
IMO, the big improvements are coming with more microlanguages on VMs. Programming languages (like Java) become popular for some good reasons, then get locked in due to network effects. It's hard to displace them, and it's hard to improve upon them significantly without breaking backwards compatibility. But nowadays with the JVM and CLR, you have multiple languages running in the same VM, and many can talk to each other. Think Java's type system sucks? You can code in Scala and still call your old Java code, use Java deployment systems (Tomcat, Glassfish), and all that. Like dynamic languages? Then you can code in JRuby, Groovy or Jython and talk to your old Java code too. Lock-in is no longer as big of a hurdle as it was.
As some other people have mentioned, functional languages seem to be becoming more popular. I'd really like to see Scala and F# take off, but I wouldn't bet that they will.
On the subject of formal proofs, I pretty much agree with Doug S.
Tim Tyler: That's how a lot of programmers feel, but you lose some of the value of strong static typing when you make it optional. Note that for the purpose of this post any language where you can force invalid conversions (such as C or Java) doesn't really count as "strong static typing".
Dynamic languages have their place, and I've written plenty of Python. Java is just horrible, because the type system is strong enough to get in your way but doesn't protect you from doing fishy stuff like casting things to and from Object.
In a language that strictly requires strong, static typing across the board you have more guarantees about the behavior of code you aren't familiar with, compilers can produce faster code, and the code is made more tractable to static analysis. Ideally, errors such as possible null references and invalid type casts should be caught at compile time, not runtime. The main issue is to make the type system simultaneously as strong as possible and as unobtrusive as possible.
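To make the null-reference point concrete, here is a hedged sketch in Python (the `find_user` example is mine, not from the thread): the annotations are enforced by a static checker like mypy rather than the compiler, but the principle is the same — a possibly-absent value has a different type than a definitely-present one, and the checker forces you to handle the `None` case before use.

```python
from typing import Optional

def find_user(user_id: int, table: dict[int, str]) -> Optional[str]:
    # The return type says explicitly: this may be None.
    return table.get(user_id)

def greet(name: str) -> str:
    return "hello, " + name

users = {1: "ada"}
maybe_name = find_user(2, users)
# greet(maybe_name)  # a checker like mypy rejects this: Optional[str] is not str
if maybe_name is not None:
    print(greet(maybe_name))  # narrowed to str, so this passes the checker
```

In a language like Java the equivalent mistake compiles fine and surfaces as a NullPointerException at runtime, which is exactly the class of error being complained about here.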
Peanut Gallery: I agree completely. In my previous post feel free to translate mention of "high-level abstractions, strong type safety, and absolute referential transparency" as "I think Haskell is pretty nifty". It may be worth noting that Microsoft has hired a lot of Haskell guys and C# has moved in that direction with lambdas, some limited type inference, lazy sequences, &c. So that's at least one mainstream, "respectable" language taking direction from Haskell...
Programming is the art of figuring out what you want so precisely that even a machine could do it.
As others have pointed out, once you know enough that you can tell the difference between a correct program and an incorrect program, you've already constrained your design enough that actually writing the code becomes trivial by comparison. If you have a specification that's precise enough to use formal methods on, the hard work has already been done.
The irony, of course, is that current trends in programming languages are often moving away from strong/static/safe type systems, towards weaker type systems in languages such as Python.
Off topic - but: languages should not force programmers to use one or the other paradigm: static typing is just dynamic typing with a contract imposed on the types.
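The "typing as an optional contract" idea can be sketched in a few lines of Python (the `typechecked` decorator below is my own illustrative sketch, not a real library): undecorated functions stay fully dynamic, while decorated ones have their annotations enforced as a runtime contract.

```python
import inspect

def typechecked(fn):
    """Enforce a function's parameter annotations at call time.

    This makes the typing contract opt-in: undecorated functions
    stay fully dynamic; decorated ones get their arguments checked."""
    sig = inspect.signature(fn)
    hints = fn.__annotations__

    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            if expected is not None and not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}, "
                                f"got {type(value).__name__}")
        return fn(*args, **kwargs)
    return wrapper

@typechecked
def area(width: float, height: float) -> float:
    return width * height

area(3.0, 4.0)   # fine: contract satisfied
# area("3", 4.0) # raises TypeError: the contract is violated
```

Of course this checks at runtime rather than compile time, so it is closer to contracts than to true static typing, which is exactly the comment's point.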
In the research community, there is still lots of work and progress going on with regard to type systems, and much of that work will definitely make its way into production languages eventually. The advantages are too great to ignore, and a dynamically typed language can never be as safe or as fast as a statically typed language. I think the future of programming languages lies in making languages with improved type systems derived from things like Hindley-Milner (Haskell, ML) as easy to use as languages with type systems like Python.
For example, Philip Wadler and Robby Findler are working on being able to rigorously and safely combine typed and untyped sections of code in a single program, which would give you the advantages of a good type system (type safety, security, performance, etc.) and still allow you to use untyped code when that is needed. Additionally, languages like Haskell that provide a static type system with type inference (which infers types for you rather than requiring you to specify them yourself as in Java...) provide many of the advantages of a dynamically typed language like Python while still being type-safe, more secure, and faster. (See 38:17 and on in Faith, Evolution, and Programming Languages for information about the work on "evolutionary types".)
In the long term, I think we'll see stronger and richer type systems than anything that exists today, but with better inference and much better tool support. The main reason that dynamic languages are so popular at present is because of their ease of use and convenience, which translates into productivity and happy programmers. Better inference on the part of the type system combined with good IDE support could make a statically typed language almost as convenient while still providing huge advantages in terms of reliability, security, and performance.
anon raises a good point: "current trends in programming languages are often moving away from strong/static/safe type systems, towards weaker type systems in languages such as Python."
Another factor that can make programs dramatically easier to formally reason about is maintaining referential transparency, but most current languages pay little attention to this issue.
A language and environment supporting high-level abstractions, strong type safety, and absolute referential transparency would allow for a lot more formal verification than languages like Java, which do mediocre to poorly on all three...
loqi is completely correct. Formal software verification is in widespread use already, under the name of "type safety": most type systems in common use can only prevent certain limited classes of implementation errors, but others are sophisticated enough to encode a formal specification and verify that its implementation is correct.
There is even a well-known correspondence between strongly typed algorithms and constructive formal proofs: correctly implementing a formal spec is equivalent to providing a constructive proof of the corresponding theorem.
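That correspondence (Curry–Howard) can be shown in a one-line sketch in Lean, where the same term is read both as a program and as a proof:

```lean
-- Curry–Howard: a program of this type is a proof of the proposition.
-- The proposition A → B → A is proved by the constant-function program.
theorem k_combinator (A B : Prop) : A → B → A :=
  fun a _ => a
```

The function that takes two arguments and returns the first is, read logically, a proof that A implies (B implies A).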
Last time I paid any attention to formal verification of computer programs was in the 1980s. This was basically what I saw:
1. Person decides to implement a way of proving the correctness of computer programs.
2. Person develops grammar for re-expressing code written in Pascal (since these were academics in the 1980s) in their formalism.
3. Person expands their formalism until it covers nearly all of Pascal.
4. Person has then re-developed Prolog without extralogicals.
5. People trying to use Person's formalism rediscover why Prolog is unusable without extralogicals.
If you want formally verifiable programs, the only solution is to program using some variation on Prolog without extralogicals. Have fun.
It is worth noting that the most recent method to overcome the software correctness problem is UNIT TESTING.
In that, you basically create the specification as a testing procedure, then write the final code and use the testing procedure to verify it.
There are many advantages to this process, but the best part IMO is that the test is bidirectional. You can have bugs in testing procedure or in the final code, but very likely it is not the SAME bug. So both tend to be debugged at once.
Also, you can have more than one testing procedure, and, most importantly, a testing procedure can be written by somebody else.
This might have interesting implications for AI as well, BTW. One thing that comes to mind is creating only the testing procedure and developing some sort of algorithm capable of building an algorithm that matches it :) Or use Cyc to create the testing procedure, and create some sort of network of values and an algorithm to process them in a feedback loop to match Cyc's database.
I agree with Barry Kelly that this is likely fantasy.
The mud that we're building on isn't the mud of bad software. It's of incomplete and evolving understanding, both of what is and what we want to do.
Drexler mentions compilers and microprocessors as something that he hopes we can formally prove. That would make no effective difference to average users, as compilers and microprocessors are already sufficiently reliable. My computer already doesn't crash and doesn't get infected, because I'm not using a for-profit operating system. Windows is shaky because Microsoft is not there to make software that they're happy with; they're just making the software that they can get you to buy.
As far as I've seen, formal proof of software is hoping to polish a few very shiny pebbles that they've picked off the continent-spanning mountain range of the software that people actually need and use. They're welcome to dream their dreams, but I don't expect them to make much practical difference this century.
Formal methods focus on proving implementations correct with respect to specifications, but those specifications are necessarily more abstract than the only real specification, the source code, and thus they are focusing on the wrong problem.
Except, a "more abstract" specification is preferable, as long as it can be reliably translated into a correct implementation. The reverse direction (implementation -> specification) isn't the point. I doubt many people find it productive to translate assembly language into Python.
If, given a formal specification of a problem, you can produce a correct implementation, then you are a successful compiler. Do you believe there exist specifications for which humans can produce a correct encoding, but computers cannot? If not, that appears inconsistent with your view that formal methods are focusing on the wrong problem.
Barry Kelly, your example is very poorly chosen. No programmer worthy of the name believes that a 32-bit unsigned int in C (for example) is supposed to be a natural number obeying the Peano axioms, so this is hardly a case of failing to implement a specification correctly. If you don't want integers modulo 2^32 or whatever, most languages have some form of memory-bounded integral type, but nobody thinks those are supposed to represent the natural numbers either, since memory is exhaustible and the natural numbers are not.
Barry Kelly doesn't go far enough. Let me sidestep, for a moment, the difficulty of providing formal specifications for a wide variety of programs. Formal methods rely on assumptions that aren't valid.
Consider Peano's axioms 6 and 7:

6. For every natural number n, S(n) is a natural number.
7. For every natural number n, S(n) ≠ 0. That is, there is no natural number whose successor is 0.
At least one of these is provably false in every programming language. When N is the type's maximum representable value, S(N) either throws an error, wraps around to zero or a negative number, or otherwise returns something that isn't a natural number. No computer can represent every natural number; it will break and violate either #6 or #7 in the process.
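A quick sketch of the axiom-7 violation, simulating a 32-bit unsigned machine integer in Python via `ctypes` (the helper name `succ_u32` is mine):

```python
import ctypes

def succ_u32(n: int) -> int:
    """Successor function S(n) on a 32-bit unsigned machine integer."""
    # c_uint32 truncates to the low 32 bits, just like the hardware would.
    return ctypes.c_uint32(n + 1).value

MAX_U32 = 2**32 - 1
print(succ_u32(0))        # 1, as Peano expects
print(succ_u32(MAX_U32))  # 0: here S(n) = 0 for some n, violating axiom 7
```

An implementation that instead raised an overflow error at MAX_U32 would violate axiom 6 (S(n) would not exist), which is the dilemma being described.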
Formal methods would go a long way to ensuring stability and security, but they're not a magic bullet.