I'd say that the development of specialized programming languages is responsible for much of what is observed. These languages are designed to address many of the problems of previous languages and the constraints they placed on how you had to think about a problem. When a language is designed to get out of your way and let you focus on higher-level constructs, you'd be surprised how much more you can do with it.
I'm also leaving out the incredible amount of work that modern compilers must do to translate purely imaginary constructs invented for a particular application into something the hardware can run.
The stuff you can do with a few lines of something as widely known as Java or Python is trivial to write compared to what one had to write to accomplish the same thing in, say, FORTRAN - and that assumes FORTRAN can somehow, through sheer volume of code, accomplish the same task at all.
It's like asking your kid to mow the yard with tweezers. I'll take my ExMark 36" commercial mower any day.
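To make the Java/Python point above concrete, here is a minimal sketch (my own toy example, not anything from the original comment): a complete Python program that prints the ten most common words in a text file, where "words.txt" is just a placeholder file name. The standard library handles the tokenizing, counting, and sorting; the equivalent in early FORTRAN would mean hand-rolling the string handling, the counting structure, and the sort.

```python
# Toy illustration: ten most common words in a file, in a handful of lines.
# "words.txt" is a placeholder file name for the example.
import re
from collections import Counter

with open("words.txt", encoding="utf-8") as f:
    words = re.findall(r"[a-z']+", f.read().lower())

for word, count in Counter(words).most_common(10):
    print(f"{count:6d}  {word}")
```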
> Finding the asymptote is high school math, not a field of study.
More playing with words, more ignoring my historical point. Why aren't you addressing my original point? (And calculus is not 'a field of study'? OK then...)
In what sense is Brutus correct that "Programming theory used to not consider asymptotic time to be an important field of study"? Can you provide any evidence for this bizarre claim - like an absence of asymptotics from Knuth, or no Turing Awards before 1980 for asymptotic improvements, etc.?
If you can't, then I'm going to stop replying, because debating minutiae of how 'Big O isn't *really* correct' is irrelevant to the OP, to my original comment, and to Brutus's claim.
It's hardly a "field of study". Algorithms are a field of study, proofs of the minimum number of operations required for something are a field of study, and so on. Finding the asymptote is high school math, not a field of study.
Also, plenty of theorists are most notable for decreasing a constant factor in some algorithm that is already at its theoretical best asymptotically (e.g. sorting). And plenty of important problems (e.g. breaking cryptographic functions) are O(1) by definition; it's just that the constant is incredibly huge.
Yes, it is important, but look at what he said:
> There's another meta-level point: Programming theory used to not consider asymptotic time to be an important field of study.
How is this even remotely accurate? Stop dodging the question and quibbling about whether big O is exactly right, and answer the question: how did 'programming theory used to not consider asymptotic time to be an important field of study'? Because this seems to be exactly the opposite of the historical reality, and you admit as much:
> at least, for the number of key operations (e.g. comparisons).
Which of course are exactly what you would want to be doing asymptotics on, the 'key operations'...
Maybe ordinary programmers spent most of their time worrying about constant-factors and bit-banging, but the theorists didn't.
Dave, this is a pretty interesting thing to think about. Lately I feel that there has been an inherent Moore's Law bias pervading all software start-ups since Thiel invested in Facebook.
Perhaps it's a rational part of valuation. However, I do know that most of the initial and major investors of the early and biggest tech startups since 2006 all drink from the same futurist punch bowl.
I think Moore's Law will become less true if competition becomes unnecessary and the world population declines.
I think this issue is a lot simpler than what you're making it out to be. These "artisans" are otherwise known as "grad students"--although I'm sure they'd appreciate the former moniker just as well.
The university system for Computer Science programs is very analogous to your description of large corporations achieving large-scale economies through coordination and investment. Most open source projects begin as thesis research on how to build a better mousetrap. This is desirable in academia because it takes considerably less overhead to fund narrowly scoped research in algorithm design and complexity than hardware-based projects.
This research also achieves coordination across disciplines. For example, the generalizations of the von Mises distributions and asymmetric exponential power distributions were collaborative efforts that ended up as lengthy proofs for grad students in statistics. Then the computer science students who were interested in turning these into software packages were able to take those theses, apply them to data sets, identify areas where computational efficiency could be increased, and then back out more publishable material from that.
With a consistent supply of students and funding for STEM fields, it only makes sense that algorithmic efficiency would increase just like hardware. Efficiency begets efficiency. And this field is also consistent with the "publish or perish" doctrine.
More examples of research in algorithm design turning into open source projects: most popular math packages (e.g. Weka), agent-based modeling packages, pseudo-random number generators, CUDA wrappers, and database packages (e.g. Berkeley DB).
None of the above magically materialized from lonely artists sitting with their slab of marble and chisel. Someone in the highly competitive and interdependent academic universe needed to publish.
Terminology is important when you try to argue with Brutus...
If you look at e.g. "The Art of Computer Programming", or other old books, you frequently see a full formula for the run time of an algorithm, or at least for the number of key operations (e.g. comparisons).
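For what it's worth, here is a small sketch (my own illustration, using insertion sort as a stand-in for the kind of algorithm analyzed that way) of counting the "key operations" - comparisons - exactly, as those old analyses do: on reverse-sorted input, insertion sort makes exactly n(n-1)/2 of them.

```python
# Count element comparisons made by insertion sort and check the exact
# worst-case formula n*(n-1)/2 on reverse-sorted input.
def insertion_sort_comparisons(a):
    a = list(a)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1          # one comparison of key against a[j]
            if a[j] > key:
                a[j + 1] = a[j]       # shift the larger element right
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

n = 100
assert insertion_sort_comparisons(range(n, 0, -1)) == n * (n - 1) // 2
```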
As has been pointed out since the start, observing that Moore's law is probably the first part of a sigmoid is about as useful as knowing that 'in the long run', the market is always rational. That may well be, but in the long run we are all dead.
Um... so what? You're quibbling over terminology; who cares whether it's 'fairly idiosyncratic', it's how CS operates and has operated for half a century, which is the point. Characterizing the asymptotics, proving bounds, and improving bounds are big parts of studying algorithms, and have been since early on, as I said.
The software you speak of here isn't really groundbreaking. Existing programs could easily design 5nm chips; it's always a question of waiting for the physicists and chemists to develop new lithography techniques.
http://science.energy.gov/~...
Looks like the graph is on page 127. It's surely not the only thing in those 300 pages of relevance to this discussion, it just happens to be the one bit that stood out enough to get reused in a couple talks I attended.
Do you know where it can be found?
"Maybe progress in hardware is really driven behind the scenes by progress in algorithms; new algorithms are what really enables each new generation of computer hardware. But that sure isn’t the story I’ve heard."
I don't have much personal experience with semiconductor manufacturing, but one of my former coworkers worked in semiconductor CAD and would talk about these issues frequently. My understanding is that many of the advances in lithography and other fabrication techniques go hand-in-hand with software to generate masks and automate the design of ever larger circuits with ever smaller feature sizes. Reducing the feature size (i.e. what Moore's law describes) introduces new opportunities for subtle quantum interference effects that need to be compensated for. As chips get more complex, they need to be laid out by a computer as well. Quality control is another issue. Blown up to a size at which a human can see individual features, a modern CPU is enormous. It's far too big for humans to check, so the process needs to be automated, algorithmically identifying errors in fabrication by analyzing enormous image files.
I think these are for the most part extremely special-purpose algorithms, but they do depend on advances in general scientific computation, as well as image processing, databases, and combinatorial optimization techniques.
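As a rough illustration of the inspection part (my own toy sketch, not real fab software; the file names and threshold are placeholders), the basic idea is to compare a scanned image of the die against a reference rendering and flag pixels that differ too much - real inspection pipelines are of course far more sophisticated and operate at enormous scale.

```python
# Toy defect detection: diff a die scan against a reference rendering.
# File names and the threshold are placeholders for the sake of the sketch.
import numpy as np

scan = np.load("die_scan.npy").astype(float)                # grayscale scan of the fabricated die
reference = np.load("reference_render.npy").astype(float)   # expected image rendered from the mask

diff = np.abs(scan - reference)
defect_pixels = np.argwhere(diff > 25.0)                    # placeholder intensity threshold

print(f"{len(defect_pixels)} pixels differ beyond the threshold")
for row, col in defect_pixels[:20]:                         # report the first few locations
    print(f"possible defect near row {row}, col {col}")
```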
Also note that once you get it down to something like O(N log N), if the problem requires at least looking at every item, there's simply no room for massive improvement left, none whatsoever.
As for multicore, for most software it is not really worthwhile to optimize for it, and the kinds of software that are worth optimizing for multicore (physics, graphics, equation solving of various kinds, other simulation, etc.) have been running in parallel since the 1980s or earlier. And in the majority of software, I think the hardware speed was used to cut down programmer effort.
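The kind of parallelism being described is often "embarrassingly parallel": independent runs farmed out across cores. A minimal sketch, using a Monte Carlo calculation as a stand-in for real physics or solver code:

```python
# Minimal sketch of an embarrassingly parallel workload: independent
# simulation runs distributed across cores with the standard library.
import math
import random
from multiprocessing import Pool

def simulate(seed):
    """Stand-in for a real simulation: a crude Monte Carlo estimate of pi."""
    rng = random.Random(seed)
    hits = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0 for _ in range(100_000))
    return 4.0 * hits / 100_000

if __name__ == "__main__":
    with Pool() as pool:                 # one worker per CPU core by default
        estimates = pool.map(simulate, range(8))
    print(f"mean estimate of pi: {sum(estimates) / len(estimates):.4f} "
          f"(true value {math.pi:.4f})")
```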
This is basic Unix philosophy: http://www.faqs.org/docs/ar... Rob Pike's rules 2 and 3 are:
>Rule 2. Measure. Don't tune for speed until you've measured, and even then don't unless one part of the code overwhelms the rest.
>Rule 3. Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don't get fancy. (Even if n does get big, use Rule 2 first.)
That is: we have better algorithms now entirely because we finally need them.
Would our current algorithms be of *practical* use for the same sort of problems 20 years ago? Or would the constants have overwhelmed them? I'm sure there will be a few cases where this is true, but I would strongly suspect that in the vast majority of cases, the new algorithms wouldn't be an improvement for the old *use cases*.
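In the spirit of Rule 3 and the constants question above, here is a small experiment (my own sketch; the crossover point depends entirely on the machine and the implementations): for small n, a dumb insertion sort with tiny constants tends to beat a pure-Python merge sort, and only as n grows does the O(n log n) algorithm pull ahead.

```python
# Time a simple O(n^2) insertion sort against an O(n log n) merge sort
# at a few sizes to see where the constants stop dominating.
import random
import timeit

def insertion_sort(a):
    a = list(a)
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

def merge_sort(a):
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

for n in (8, 64, 1024):
    data = [random.random() for _ in range(n)]
    t_ins = timeit.timeit(lambda: insertion_sort(data), number=50)
    t_mrg = timeit.timeit(lambda: merge_sort(data), number=50)
    print(f"n={n:5d}  insertion={t_ins:.4f}s  merge={t_mrg:.4f}s")
```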