A Free Lunch in a Mousetrap

Yes, I know that the title of this paper sounds like gobbledygook. This is because the new book by William A. Dembski [1] titled No Free Lunch – Why Specified Complexity Cannot Be Purchased Without Intelligence (D-NFL), some parts of which I am going to discuss in this paper, makes no more sense than the title of this article. I tried to find in that book something which would enable me to say a few words favorable to at least a fraction of Dembski's new publication. I could not find it. In my review [2] of Dembski's earlier publications including his book The Design Inference (TDI) [3], I criticized the latter in quite unambiguous terms. In my view, Dembski's new book (which he claims is a sequel to TDI) is even worse than TDI. D-NFL has been highly acclaimed by Dembski's colleagues and supporters. In fact, however, confusing statements, contradictory definitions, and even elementary errors, as well as unnecessary mathematical exercises, abound in this book. A substantial portion of D-NFL simply reiterates, often verbatim, Dembski's earlier publications, although many critics have demonstrated the multitude of weaknesses in Dembski's position. On the other hand, there are some new elements in this new book compared with Dembski's earlier books and papers. Unfortunately, these new elements are mostly characterized by the same penchant for using self-coined terms, pretentious claims of important insights or discoveries without proper substantiation, and an all-too-obvious subordination of the discourse to preconceived beliefs. What these beliefs are can be seen from the first two quotations from Dembski, placed right under the title of this article. Note that the third quotation plainly negates the first two.
Whereas in the third quotation Dembski claims that his intelligent design theory is purely scientific and not tied to any religious doctrine, so it can legitimately be discussed within the framework of a scientific dispute, the first two quotations clearly show the real religious motivation and therefore the goal of his allegedly scientific discourse.

I have no intention of providing a comprehensive review of Dembski's new book. I don't believe it deserves one, although it surely will be highly praised and used by intelligent design (ID) adherents as an allegedly very sophisticated and mathematically rigorous substantiation of the ID theory in their ongoing war against genuine science. I will discuss only a few selected points in Dembski's new book. However, the absence of discussion of some other parts of that book in no way signifies that I agree with them or find any merit in them. Though I feel that Dembski's new book should not be completely ignored, lest the ID adherents claim that their opponents have nothing to say in response, a comprehensive review would be a waste of time and effort.

Among the new elements found in D-NFL we see Dembski arguing with some of his critics. Some of that material is a repetition of arguments in his earlier papers (for example his dispute with Robert Pennock). Other arguments, however, seem to appear for the first time (like his replies to Wesley Elsberry, Gert Korthof, Howard Van Till, and John McDonald). On the other hand, Dembski still seems to ignore some other critics of his theories. For example, among the names of people with whom Dembski has had "direct contacts," listed on page xxiv, we find the names of Eli Chiprout and Richard Wein. However, Dembski does not say a single word about the strong critique of his work by Chiprout or Wein. Then, Dembski refers to a book by Del Ratzsch and thanks the latter for providing the example of the Oklo uranium mine (page 26). However, Dembski fails to even mention that the same quoted book by Ratzsch contains serious critical remarks that address a number of items in Dembski's book The Design Inference. Other critics whom Dembski seems to ignore are Professor Massimo Pigliucci and myself. Whereas Dembski is certainly not under obligation to reply to his critics, the absence of a response is usually viewed as an admission of the lack of good counter-arguments. Hence, Pigliucci, Wein, Chiprout, and some other critics seem to be entitled to interpret the absence of replies from Dembski as his tacit acknowledgment of having nothing to say in response. Of course, there is also another possible explanation: that Dembski is so enchanted by his own achievements that he disdainfully ignores our critical remarks, viewing them as well below the level of sophistication befitting his brilliant work. I leave to the readers the choice between these two explanations.

2. Dembski defends Behe's mousetrap example

A substantial fraction of Dembski's new book is devoted to a fierce defense of Behe's concept of irreducible complexity [4]. This is easy to understand. Dembski and Co. insist that their activity is not motivated by religious predisposition but is genuinely scientific and based on legitimate evidence of design. Of course, among the opuses of design proponents there are also theological or nearly theological works wherein they explicitly reveal the religious (usually Christian) foundation of their approach, but in his latest book Dembski denies that his theory is anything but an unbiased analysis, supported by mathematical arguments which prove the concept of intelligent design regardless of any religious connotations. However, perusing the literary production of "design theorists" reveals a paucity of evidence in favor of their position but an abundance of casuistry. Behe's concept of irreducible complexity seems to be almost the sole case which may be capable of being presented as a scientific argument based on biochemical evidence. Therefore intelligent design creationists like Dembski desperately need to prove Behe's thesis. If Behe's thesis is refuted (as, in my view, it should be) then the design creationists are left with nothing which would constitute even a semblance of a scientific discourse.

In my article [5] I reviewed Behe's argument in detail and came to the conclusion that his concept is contrary to logic and to facts. I will not repeat this argument here but will address Dembski's defense of Behe.

In my article [5] I did not devote any substantial space to the critique of Behe's favorite example – that of a mousetrap which he presented as an example of irreducible complexity and therefore as a model of a biological cell from the viewpoint of that concept. I did not do it because it seemed to be a secondary point and proving the inadequacy of a particular example did not seem as important as unearthing the principal flaws in Behe's concept. I wrote a little more about Behe's mousetrap model in another paper [6] where I discussed the question of models in science in general and pointed to the faults of Behe's model.

Behe's model – his mousetrap – was debunked by John H. McDonald in a posting [7]. McDonald demonstrated how Behe's five-part mousetrap can be gradually reduced to a four-part, three-part, two-part and finally one-part contraption, each preserving the ability to catch mice, albeit not as well as the five-part construction. Therefore, the mousetrap does not seem to be irreducibly complex, contrary to Behe's assertion.

I readily concede that demonstrating the weakness of Behe's particular example does not in itself disprove his thesis. It only shows that he happened to suggest a bad model. It may, though, presumably cast a shadow on Behe's overall image and hence potentially undermine Behe's overall idea in the eyes of those readers who have not yet made up their minds in regard to the ID theory. That is something ID creationists are not willing to accept. Therefore Dembski devoted many pages of his new book not only to defending Behe's overall concept, but also to an attempt to overturn McDonald's particular debunking of Behe's mousetrap model.

When McDonald demonstrated the inadequacy of Behe's mousetrap example, the proper behavior for Behe the scientist would have been either to offer a reasonable rejoinder or to concede that his example was not very well thought out. In the latter case he might still insist that the inadequacy of his example does not translate into the inadequacy of his entire theory, and such a position, which is legitimate, could be discussed on its own terms. Behe and Dembski chose a different approach – to defend Behe's example by using casuistry. (For a partial discussion of Behe's response to McDonald, see the appendix to the article at Irreducible Contradiction.) This is one more illustration of the typical behavior of ID proponents. They seem to be interested not so much in establishing the truth as in winning the dispute at any cost, by whatever means available.

In order to see the lack of substantiation in Dembski's defense of the mousetrap model, let us quote from Behe. In his book Darwin's Black Box Behe writes: "If any of the components of the mousetrap (the base, hammer, spring, catch, or holding bar) is removed, then the trap does not function. In other words, the simple little mousetrap has no ability to trap a mouse until several separate parts are all assembled. Because the mousetrap is necessarily composed of several parts, it is irreducibly complex."

Note that Behe did not suggest any additional conditions the trap must meet to be characterized as irreducibly complex. The only feature determining, according to Behe, the irreducible complexity of the trap was that the removal of any of its parts must make the remaining set of parts incapable of catching mice, and that the trap cannot be functional until all five parts are in place.
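Behe's criterion, as quoted above, amounts to a simple single-removal test, which can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `is_irreducibly_complex` and the predicates `behe_claim` and `with_counterexample` are hypothetical names introduced here, not anything found in Behe's or Dembski's texts. The first predicate merely encodes Behe's assertion that only the complete trap works; the second assumes, for illustration, that one reduced trap still catches mice (which part is removed is an arbitrary choice).

```python
from typing import Callable, FrozenSet

Parts = FrozenSet[str]

def is_irreducibly_complex(parts: Parts, functions: Callable[[Parts], bool]) -> bool:
    """Single-removal test: the full system works, and removing any one part breaks it."""
    if not functions(parts):
        return False
    return all(not functions(parts - {p}) for p in parts)

# The five parts named in the quotation from Behe.
BEHE_TRAP: Parts = frozenset({"base", "hammer", "spring", "catch", "holding bar"})

def behe_claim(parts: Parts) -> bool:
    # Hypothetical predicate encoding Behe's assertion: only the complete trap works.
    return parts == BEHE_TRAP

def with_counterexample(parts: Parts) -> bool:
    # Hypothetical predicate in which one four-part trap also works
    # (the removed part is chosen arbitrarily for illustration).
    return parts == BEHE_TRAP or parts == BEHE_TRAP - {"catch"}

print(is_irreducibly_complex(BEHE_TRAP, behe_claim))           # True
print(is_irreducibly_complex(BEHE_TRAP, with_counterexample))  # False
```

The verdict depends entirely on what the predicate is allowed to count as a working reduced trap; the test itself says nothing about whether the remaining parts may be reshaped.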

Confronted with McDonald's spectacular counter-example, Dembski resorts to a transparently contrived argument purported to show that McDonald's example does not prove the inadequacy of Behe's model. Dembski's main argument concentrates on the particular shapes of the mousetrap parts, which are slightly different in each of McDonald's simplified versions. When McDonald removes this or that part of the mousetrap, he slightly modifies the shape of the remaining parts, thus preserving the trap's functionality albeit diminishing the trap's quality. It is obvious that the process can be reversed. One can start with the simplest one-part trap, then gradually add more parts, improving its ability to catch mice at every step of the procedure. In Dembski's view, the modifications of the parts' shape at each stage of McDonald's procedure make his example irrelevant to Behe's thesis. Look again at the above quotation from Behe. He did not mention the additional condition introduced by Dembski – that the removal of the trap's parts must not be accompanied by any modification of the shape of the remaining parts. Perhaps Behe indeed had in mind the notion that in an irreducibly complex system the parts remaining after the removal of some other part must retain their initial shape. However, he did not spell out such a condition, so McDonald was not constrained by it at the time he suggested his counterexample. Furthermore, and more importantly, why shouldn't the shape of parts in the "reduced" versions of the mousetrap change? McDonald's example satisfied all of Behe's originally formulated conditions – it showed that, contrary to Behe's position, parts of the trap can be removed and the remaining contraption can be made to preserve functionality (albeit at a lower level of fitness). This fully debunked Behe's assertion as it was originally formulated by Behe.

Is the requirement added by Dembski – that the parts remaining after the removal of a certain other part retain their shape – indeed relevant to the discrimination between irreducible and reducible complexities? Remember that Behe's example of the alleged irreducible complexity of a mousetrap was suggested as an illustration of his concept of irreducible complexity as applied to biological systems. The Darwinian theory of evolution includes as an inseparable part the concept of gradual changes resulting in the slow accumulation of features advantageous for organisms. If we stick to Behe's example as a model of a biological system, the evolution of a mousetrap in McDonald's scheme from a one-part to a five-part contraption comprises steps which are not really small. Actually each step of McDonald's scheme may be viewed as the sum of many smaller steps wherein the parts gradually change their shape and at a certain stage of the evolution another part is added. At a certain step of that evolutionary process, some of the already existing parts of the system may again change their shape, in particular make it simpler, if such a simplification of shape is not detrimental to the organism's fitness. The mousetrap's parts, if it is at all viewed as a model of a biological system, can modify their shape at every step of the evolution, preserving the trap's ability to work at a higher level of fitness. Though McDonald's scheme shows only four steps in the path from a one-part to the five-part contraption, actually that path may have comprised many more intermediate steps not shown. Therefore the modification of the parts' shape at every step from a one-part to the five-part contraption in no way diminishes the power of McDonald's example, which decisively debunks Behe's statement quoted above.

Moreover, if the requirement of unchangeable shape of all parts of the mousetrap were included in Behe's original formulation, McDonald, as well as anybody else with some engineering experience, could have shown a single-part, a two-part, then a three-part, four-part and finally five-part mousetrap wherein the shapes of the original parts would not change at all in any of the steps building up to the final five-part contraption. This would result in a five-part trap whose parts would have shapes different from those shown in Behe's picture, but such a five-part trap would be as good as that by Behe, even if its parts would not have the simplest shape possible. In living organisms, organs with unnecessarily complex shapes are common, testifying to their evolutionary history.

The above considerations have recently found confirmation, as McDonald updated his scheme of a gradually developing mousetrap. This new version of McDonald's example is not discussed in Dembski's book, apparently because it appeared after the book was submitted for publication. In his new (animated) scheme, McDonald shows a series of mousetraps starting with a simple bent wire serving as a primitive mousetrap. Step by small step, by adding one more part at each step, and gradually modifying the functions of the trap's parts, as the parts, originally optional, become necessary, McDonald illustrates how a process of the trap's development may proceed. While in McDonald's example each modification of the device was done by the designer (McDonald), in the process of evolution a similar progress from a primitive to a more complex device could have been governed by a combination of mutations and natural selection. McDonald's updated scheme shows the lack of substantiation in Dembski's attempt to defend Behe's model.

McDonald states that his example does not represent the actual process of biological evolution. Dembski grasps at that statement, using it to insist that McDonald's example does not prove that Behe's position is wrong or that Behe's mousetrap model is bad. He says: "the problem is that his progression of mousetraps has little connection to biological reality." You can say that again, Dr. Dembski. What you pretend not to notice is that the reason the progression of mousetraps does not represent biological reality is simply that Behe's example has very little to do with biological reality. Obviously, when debunking Behe's model, which has little to do with biological reality, McDonald had no need and no way to provide an example which would be more relevant to biological reality than Behe's original model was. Neither Behe's model nor McDonald's counterexample has much to do with biological reality. Dembski is prepared to happily forgive this from Behe but not from McDonald. This is typical of Dembski's selective logic.

The difference between McDonald and Behe is that the former realizes and freely admits that his scheme does not adequately represent biological evolution, whereas Behe tries to unduly use his model as an illustration of biological reality. What McDonald's scheme does very well is show the lack of substantiation in Behe's statement about the irreducible complexity of a mousetrap. Given Dembski's education and his "formidable intelligence" (in the words of one of Dembski's admirers found in a blurb in the D-NFL book) it is hard to believe that he himself does not realize the fallacy of his attack on McDonald and of his defense of Behe. A much more plausible assumption seems to be that he is not interested in an unbiased evaluation of arguments but only in winning the dispute at any cost, by whatever means he can contrive.

3. Dembski salvages the irreducible complexity

Dembski's rendition of the dispute between Behe and his detractors is replete with inaccuracies. I don't think, though, that a detailed analysis of these inaccuracies is important or interesting. Such an analysis would be proper if Dembski's work were a scientific monograph summarizing research reported in peer-reviewed scientific journals and discussed at scientific conferences. Since, however, Dembski chose to address his book to a general audience and thus eschewed the peer-reviewing process, it seems better to analyze just the most salient points of Dembski's defense of Behe's concept of irreducible complexity.

On page xvii Dembski says: "I am not a fan of notation-heavy prose and avoid it whenever possible." However, just leafing through his new book reveals that the quoted statement is contrary to facts. Like his preceding book, the new one is chock-full of mathematical symbols, more often than not adding nothing of substance to his discourse.

A good example is found on pages 271-279, in a section titled "The Logic of Invariants." Here Dembski resorts to his favorite method of discussion – presenting a convoluted chain of arguments in a heavily symbolic form, incomprehensible to a general reader, who must be impressed by the mathematical sophistication of Dembski's discourse, and thus conclude that an expert of such intellectual caliber, with his PhD degree in mathematics, surely must know what he is talking about. Actually, all this mathematical discourse is largely irrelevant to the concept of irreducible complexity.

Dembski explains the concept of an invariant in a heavily symbolic form wherein his style is unnecessarily complicated, so that it requires considerable effort even for a reader trained in mathematics to comprehend what Dembski actually means to say. I suggest that an average reader try to decipher the following passage on page 274: "...define a function Invar on Ω (for definiteness assume Invar is real-valued, i.e., Invar takes values in the real numbers R). Let A = { r ∈ R | there exists some natural number n and some X in Init such that Invar (φⁿ)(X) = r} and B = { r ∈ R | there exists some Y in Term such that Invar (Y) = r }..." etc. The quoted passage is just a fraction of a much longer exercise in heavily symbolic discussion. If it is an example of Dembski's attempts to avoid "notation-heavy prose," I wonder what he considers to be really notation-heavy prose. The sole purpose of that convoluted discussion, heavily loaded with mathematical symbols, seems to be providing a definition of the concept of an invariant.

However, those readers who have enough experience with mathematics to have comprehended Dembski's notation-heavy paragraphs certainly know what an invariant is and need no explanation. Those readers who don't know what an invariant is presumably are also not prepared to digest the mathematical exercise on page 274, which therefore does not seem to serve any useful purpose.

In any case, the definition of an invariant could be done in one simple sentence. For example, for the purpose of Dembski's discourse it would be sufficient to say that an invariant of a certain procedure is a quantity whose value does not change in that procedure. (For example, entropy is an invariant of a reversible adiabatic process.)

After having devoted considerable space and effort to introducing his mathematically loaded definition of an invariant, Dembski makes no use of that definition anywhere afterward. Instead, he tries to apply the concept of an invariant as a tool for what he calls proscriptive generalization, one more of his self-coined terms. Proscriptive generalization simply means that certain processes or events are claimed to be impossible based on some general consideration rather than on a detailed analysis of the factors preventing the occurrence of these processes or events. What Dembski asserts can essentially be spelled out as the following simple statement: if it is found that in a certain process a quantity which is an invariant of that process would actually change, then such a process must be considered impossible.
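How little machinery that statement requires can be seen from a toy sketch in Python. Everything in it is an assumption made for illustration: the states are integers, the process step `phi` adds 2, and the invariant `invar` is parity; none of these choices comes from Dembski's text. The sets `a` and `b` mimic the sets A and B of the passage quoted from page 274, with A truncated at a finite number of steps and with n = 0 included (whether Dembski's "natural number" includes zero is not stated in the quotation).

```python
def invar(x: int) -> int:
    """Hypothetical invariant: the parity of the state."""
    return x % 2

def phi(x: int) -> int:
    """Hypothetical process step: add 2, which never changes parity."""
    return x + 2

def reachable_invariant_values(init, steps):
    """Values of invar over all states reached from init in 0..steps applications of phi."""
    values = set()
    for x in init:
        for _ in range(steps + 1):
            values.add(invar(x))
            x = phi(x)
    return values

init = {0, 4}   # hypothetical initial states (all even)
term = {7}      # hypothetical terminal state (odd)

a = reachable_invariant_values(init, steps=50)  # counterpart of set A, truncated at 50 steps
b = {invar(y) for y in term}                    # counterpart of set B

# If A and B share no value, no initial state can ever reach a terminal state.
print(a, b, a.isdisjoint(b))  # {0} {1} True
```

Because `invar` really is unchanged by `phi`, the set `a` is nothing more than the invariant's values on the initial states, which is the one-sentence definition given earlier in another guise.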

Whereas the gist of the above statement itself meets no objection, it is actually of very little informative value, and the lengthy and convoluted discourse by Dembski aimed at arriving at that assertion is plainly redundant and seems only to serve as a scientific-like embellishment.

Indeed, when Dembski turns to his defense of Behe's concept of irreducible complexity, all he says in regard to his preceding lengthy mathematical exercise is that irreducible complexity is "an invariant for the Darwinian process of random variation and natural selection." To make such a statement there was no need for all of the preceding mathematical-looking exercise. More important, though, is that the above statement, which purports to reflect Behe's position, cannot be taken for granted. It requires proof and none has been provided by either Behe or Dembski.

Apparently aware of the strong objections to Behe's concept from many professional biologists, Dembski uses a clever device – he admits that Behe's concept is not faultless. He writes: "Behe's idea of irreducible complexity is neither exactly correct nor wrong.... Instead it is salvageable." (page 280). Contrary to his statement, Dembski actually tries to prove that Behe's idea is indeed correct, and asserts that this must become clear if only Behe's original definition of irreducible complexity is slightly fixed. Hence, in order to salvage the concept of irreducible complexity, which is needed by the ID advocates to substantiate their otherwise arbitrary conceptual system, Dembski, who has never admitted a single error in his own output, is even prepared to sacrifice to a certain extent the sterling reputation of his cohort Behe.

Having quoted Behe's definition, Dembski then proceeds to repair it in five consecutive steps. Here is Behe's original definition quoted by Dembski:

Definition IC_init – "A system is irreducibly complex if it is 'composed of several well-matched, interacting parts that contribute to the basic function, wherein the removal of any one of the parts causes the system to effectively cease functioning'."

And here is the final (salvaged) definition of irreducible complexity suggested by Dembski as the result of his five-step salvaging effort:

Definition IC_final – "A system performing a given basic function is irreducibly complex if it includes a set of well-matched, mutually interacting parts such that each part in the set is indispensable to maintaining the system's basic, and therefore original, function. The set of these indispensable parts is known as the irreducible core of the system."

Looking at the "salvaged definition" of Irreducible Complexity according to Behe-Dembski immediately reveals that it is not a proper definition of anything. It can be argued that the fallacies of the above quasi-definition can be forgiven in the case of Behe, who is a qualified biochemist but not a professional logician (although, as a scientist, he is expected to offer a definition that is at least reasonably logical). However, Dembski, with his collection of advanced degrees including a PhD in philosophy, should have known better than to suggest a definition which is contrary to the elementary requirements of logic. Of course there is not much surprising in that, since from his previous book The Design Inference we already know that formulating definitions is by no means Dembski's forte.

The above definition of Irreducible Complexity purports to be a deductive definition, whose standard form necessarily is that of a triad containing the following elements: 1) A general concept, which is supposed to be known to all participants of the discourse and interpreted by all of them in the same way, 2) Qualifiers, which point to those features of the concept to be defined that distinguish it from all other concepts also encompassed by the general concept spelled out in point 1, and 3) The target of the definition, i.e., the concept to be defined. One of the requirements for a proper deductive definition is that whatever belongs in item 2 cannot simultaneously belong in either item 1 or item 3. Otherwise the definition would turn out to be circular and therefore void of informative value.

Another requirement for a proper definition is that the general concept in item 1 and the qualifiers in item 2 must be precisely predefined according to a consensus among all the participants of the discourse.

It is easy to see that the Behe-Dembski "definition" of IC above fails on both counts.

In that definition the target of definition, i.e., the concept to be defined, is IC, the famous Irreducible Complexity. Rather than directly defining IC, the above definition defines a system which is irreducibly complex. Such a substitution is legitimate because it is exactly what we are interested in – a definition of an irreducibly complex system, from which, if desired, the definition of Irreducible Complexity per se can be inferred.

In the Behe-Dembski definition above, the general concept (item 1 of the triad) is "a system performing a given basic function." Already this item does not meet the requirements for being the general concept known to all and interpreted in exactly the same way. Unfortunately, this concept has no definite meaning and allows for various interpretations. What is a "basic" function? How can it be distinguished from a "non-basic" one? Moreover, what is a "system"? How should the boundaries of a system be defined, separating it from whatever is beyond the system? Behe and Dembski offer no definitions of these constituents of item 1. There is no general consensus regarding the precise meaning of these terms. In many situations the boundaries of a system can be chosen in many different ways.

Item 1 in the Behe-Dembski definition would be legitimate and meaningful if the concepts of a "system" and of a "basic function" were defined beforehand. It is a common logical flaw for the general concept (item 1 of the triad) to have no commonly accepted definition, which renders the definition meaningless.

The target of the above definition is an "Irreducibly Complex System." The qualifiers are defined there as the requirement that a system "includes a set of well-matched, mutually interacting parts such that each part in the set is indispensable to maintaining the system's basic, and therefore original, function."

Again, the sub-definition of the qualifiers does not meet the requirements of being known and identically interpreted by all. When are the parts "well-matched" and when are they not "well-matched?" The answer is uncertain. What is well-matched for John may be poorly matched for Mary. However, even if the concept of "well-matched parts" were unambiguously defined and known to all, this would not save the B-D definition. The reason for that is the egregious mix-up of items 1 and 2. The sub-definition of the qualifiers refers to the same "basic function" which is a part of item 1. This makes the definition circular and therefore void of informative value. Indeed, to verify that the parts are indispensable, we are told to test whether or not they serve the "basic function." On the other hand, the "basic function" is a part of the general concept, and therefore cannot serve also as a qualifier. To be functional, the qualifiers must be independent of the general concept.

The conclusion: both Behe's original and Dembski's final (salvaged) formal definitions of Irreducible Complexity are logically deficient and have little, if any, meaning. Of course, Behe and Dembski may limit themselves to more informal, partially descriptive definitions, which is a legitimate manner of discussion, but their attempt to offer a strict formal definition can hardly be viewed as successful.

Now let us discuss the above definitions of IC from an informal viewpoint.

The comparison of Behe's original ("initial") definition with the final, "salvaged" definition by Dembski boils down to, first, the introduction of the concept of "irreducible core" of the system and, second, to the assertion that a system of reduced complexity must retain the "basic (and therefore original)" function in order for the original system to be considered not irreducibly complex.

Behe's definition did not contain an indication that in an irreducibly complex system not all of its parts may be necessary for its proper functioning. Dembski's "salvaged" definition allows for the existence of some parts of a system which can be removed without eliminating its functionality. However, if a system includes a set of parts all of which are necessary for maintaining its basic (i.e. original) functionality, which set Dembski calls its "irreducible core," then the system is irreducibly complex.

The "salvaged" definition does not seem to add anything of substance to Behe's original concept, which, according to Dembski, is "neither exactly correct nor wrong." The question of whether all parts of the system are necessary for its original functionality or only those included in an irreducible core is of no significance. The boundaries of a system are often set arbitrarily. The irreducible core may just as well be viewed as itself the system under consideration. The only purpose of Dembski's modification is to enable him to reject an argument pointing to a system which retains its functionality after the removal of some of its parts by simply asserting that either the removed part did not belong to the "irreducible core," or that the functionality of the reduced system is not equivalent to the "basic (i.e. original)" one.

Dembski's salvaging argument does not really save Behe's concept because it has nothing to do with the actual critique of that concept.

First, Behe has never proved that even a single biochemical system he described was indeed irreducibly complex according to his definition. Dembski's argument that the reduced system must perform exactly the same "basic" function is an arbitrary requirement. Biologists tell us that in the course of evolution many systems changed their functionality. A biochemical system which in modern organisms clots blood could very well, in some of its preceding, simpler forms, have performed a different function or even no function at all, acquiring its ability to do this or that job only at a certain stage of evolution.

Dembski also offers a number of rebuttals of the critique of Behe's concept by Kenneth Miller, Niall Shanks, Karl Joplin and Russell Doolittle. Except for Shanks, these critics, unlike Dembski, are professional biologists who criticize Behe's ideas from the biological standpoint. Since I am not a biologist, I leave the discussion of Dembski's attack on the above listed experts to those better versed in biology than myself. However, there is one point in Dembski's attack on Doolittle which has little to do with biology and everything to do with the ethics of a scientific discussion. On page 281 Dembski discusses the dispute between Behe and Doolittle wherein he essentially reiterates the argument used by Behe himself in a paper published in the collection [8]. In that paper Behe claimed that after he replied to Doolittle's critique [9] of Behe's work, Doolittle conceded being wrong in his interpretation of the experiment by Bugge et al. [10]. I have contacted Professor Doolittle and asked him to confirm that he indeed acknowledged his error in the interpretation of the experiment in question. Professor Doolittle unequivocally and energetically denied having provided a reason for such an assertion by Behe. In his book, Dembski essentially repeats Behe's claim, saying that "Doolittle's counterexample failed." This assertion is not based on evidence. When I see the methods of discussion employed by Behe and Dembski in this case, I feel that every statement by these writers must be carefully examined, since there seems to be a good chance they did not verify their sources diligently enough.

4. Dembski's explanatory filter revisited

In his new book [1] Dembski devotes a considerable part of his discussion to his "Explanatory Filter" (hereafter referred to as EF). To my knowledge, this is at least the sixth time Dembski has published a description of the device which allegedly enables one to reliably distinguish among the three causal antecedents of an event: necessity (also referred to as regularity or law), chance, and design. In Dembski's scheme, these three causal antecedents of an event are supposed to cover all possibilities and to be mutually exclusive.

In my earlier discussion [2] of Dembski's "Design Theory" I reviewed many parts of Dembski's previous discourse, including a rather detailed analysis of his EF, and offered a number of arguments against the validity and reliability of that allegedly powerful analytical tool. In this review of D-NFL I will not repeat the arguments detailed in [2]. I will, though, provide some additional examples showing the inadequacy of EF, and will also discuss some rather telltale alterations of Dembski's presentation of his theory in [1] as compared to his five earlier renditions of EF.

EF comprises three so-called "nodes," which are three steps of analysis aimed at determining whether an event is due to law (regularity, necessity), chance, or design.

At the first node, according to Dembski, one estimates the probability of the event in question and if this probability turns out to be "large," the event is attributed to law (regularity, necessity). The term "large" was not quantitatively defined by Dembski. If the probability is "not large," whatever this means quantitatively, it passes to the second node. Here the decision is to be made whether the probability of the event is "intermediate" or "small." Again, these terms have not been defined quantitatively. If the probability of the event is determined to be "intermediate," the event is attributed to chance. If the probability is "small" the event passes to the third node, where the final judgment is to be made whether the event must be attributed to chance or to design.
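The three nodes just described can be written out as a short decision procedure. The following Python sketch is only an illustration: the function name and the two numerical cutoffs are hypothetical, since Dembski gives no quantitative definition of "large," "intermediate," or "small."

```python
# A minimal sketch of the three nodes of EF as described above.
# LARGE and SMALL are hypothetical placeholders; the source gives no numbers.
LARGE = 0.9
SMALL = 1e-6


def explanatory_filter(probability: float, specified: bool,
                       large: float = LARGE, small: float = SMALL) -> str:
    """Return 'law', 'chance', or 'design' for an event."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # Node 1: a "large" probability is attributed to law (regularity, necessity).
    if probability >= large:
        return "law"
    # Node 2: an "intermediate" probability is attributed to chance.
    if probability > small:
        return "chance"
    # Node 3: a "small" probability is attributed to design only if the
    # event is also specified; otherwise it is attributed to chance.
    return "design" if specified else "chance"


print(explanatory_filter(0.95, specified=False))
print(explanatory_filter(0.5, specified=True))
print(explanatory_filter(1e-9, specified=True))
```

Note that the procedure has to be handed a value of `probability` before it can do anything at all. Where that value is supposed to come from is exactly the question the filter leaves unanswered.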

I will now discuss separately the first and the second nodes of EF on the one hand, and the third node on the other hand. The reason for such a separation is that Dembski's analytical procedure is essentially the same for the first and the second nodes but is rather different for the third node. At the first and the second nodes the only criterion according to Dembski's scheme is the event's probability, whereas at the third node the criterion is two-fold, comprising low probability and specification.

In my previous analysis of EF I argued that the procedure suggested by Dembski for the first and the second nodes of EF is unrealistic thus making his entire triad-like scheme meaningless. As I argued there, the actual procedure is necessarily opposite to Dembski's scheme. For example, at the first node of EF, according to Dembski, we conclude that an event resulted from regularity (law, necessity) if we find that its probability is large. To see that Dembski's prescription is meaningless, consider the relationship between two concepts – the large probability of an event and a law (regularity, necessity) which determines the occurrence of that event. It does not take a "formidable intelligence" (which, according to a blurb in D-NFL, characterizes Dembski) to see the simple fact – of these two concepts, law (regularity, necessity) is the cause and high probability is the consequence of the law (regularity, necessity). Therefore it is contrary to elementary logic to suggest that the attribution of an event to law (regularity, necessity) can result from a prior estimate of the event's probability. If one does not know about the existence of a law (regularity, necessity) one has no way to conclude that the event's probability is high.

Dembski's scenario, according to which we in some mysterious way conclude that the event's probability is high, is unrealistic because we cannot estimate the probability of an event unless we know its causal history. This history is a necessary part of the background knowledge without which the event's probability cannot be estimated. In particular, to conclude that the event's probability is "high" we have to first ascertain that the event was caused by law (regularity, necessity), not the other way around, as Dembski suggested. Likewise, at the second node of EF, we cannot assert that the event's probability is either "intermediate" or "small," unless we know the event's causal history, which is again contrary to Dembski's unrealistic scheme.

Let me suggest an example illustrating the above argument. Imagine that a small detachment of American soldiers in Afghanistan moves through rugged mountainous terrain. They enter a narrow gorge between two steep rocky slopes. When they are in the middle of the gorge, a large stone rolls down from the slope and hits the path close to the soldiers.

Now let us try to apply EF to this event. According to Dembski, there are three and only three distinct possibilities: the rock fall was due either to regularity, or to chance, or to design. If such rock falls happen in this gorge regularly, say once every few minutes, the event in question has to be attributed to regularity (law, necessity) and therefore its probability is estimated as high. If such rock falls occur not very often, say once in a couple of days, the knowledge of the frequency of such events is what enables one to conclude that the probability of that event was "intermediate." Finally, if such falls of stones occur in this gorge extremely rarely, say once in ten to fifteen years, the event, according to Dembski's EF, is to be analyzed within the framework of the third node, wherein the discrimination between chance and design has to be made by looking for a possible pattern, i.e. specification. Design in this case may mean that some Taliban fighters deliberately pushed the rock from the crest of the ridge in order to hit the soldiers.

Note, though, that in each case the probability is estimated based on the knowledge of the causal history of the event. Dembski's scheme prescribes the opposite procedure – first the probability is somehow "read off the event," and based on the value estimated, the event is attributed to one of the three causes. This procedure is unrealistic.
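To make the dependence on causal history concrete, here is a small Python sketch that converts a known frequency of rock falls into the probability of at least one fall while the soldiers are in the gorge. It assumes, purely for illustration, that falls follow a Poisson process; the function name, the half-hour crossing time, and the three rates are hypothetical values chosen to match the three cases above.

```python
import math


def prob_at_least_one_fall(falls_per_hour: float, hours_in_gorge: float) -> float:
    """Probability of one or more falls during the crossing (Poisson assumption)."""
    if falls_per_hour < 0 or hours_in_gorge < 0:
        raise ValueError("rate and duration must be non-negative")
    return 1.0 - math.exp(-falls_per_hour * hours_in_gorge)


# Hypothetical background knowledge about the gorge (illustrative values only).
HOURS_IN_GORGE = 0.5
RATES_PER_HOUR = {
    "once every few minutes": 60 / 3,            # one fall every 3 minutes
    "once in a couple of days": 1 / 48,          # one fall every 48 hours
    "once in ten years": 1 / (10 * 365 * 24),    # one fall every 10 years
}

for history, rate in RATES_PER_HOUR.items():
    p = prob_at_least_one_fall(rate, HOURS_IN_GORGE)
    print(f"{history}: {p:.6g}")
```

The only input that distinguishes the three cases is the rate, which is the known history of the gorge. Without that number the function cannot be evaluated, and observing a single fall supplies no such number.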

At the first node of EF, according to Dembski, the assumption of regularity at work is to be tested.

Obviously the occurrence of the event itself does not provide any clue as to whether its probability is high, intermediate or small. If the soldiers continue their trip, they will never find out whether the probability of the described event was large, intermediate, or small. Using Dembski's own terminology, the probability of an event cannot be "read off the event."

Assume, though, that the soldiers were warned that in that gorge rocks fall regularly and hit the path. Now they possess the necessary background knowledge: they know in advance that the rocks fall regularly. They have prior knowledge of a regularity (law, necessity) and therefore estimate the probability of the actual event (the falling stone) as high. Its probability was high because it was due to regularity. In accordance with Dembski's unrealistic scheme, though, the soldiers should first have somehow estimated the probability of a rock's falling from that slope and, having found it to be large (how?), decided that a regularity was at work. In fact, to ascribe a large probability to an event, the knowledge about its being due to law (regularity, necessity) must precede the estimation of probability.

Now imagine the soldiers walk through another gorge and again a large stone rolls down at the moment they are in the gorge's middle section. Again, there is no way they can follow Dembski's scheme and estimate the event's probability without knowing the history of rock falls in that gorge. Assume that this time they were warned that in this gorge rocks fall from time to time, but not too often, say once in so many hours. Now the probability of the event can be estimated as being neither very high nor very small. This "intermediate" probability was estimated based on the known history of similar events, whereas in Dembski's scheme the probability estimate is supposed to be made without any such knowledge. In Dembski's scheme the event is attributed to chance because its probability was found to be "intermediate." How this probability could have been "read off the event," i.e., estimated on its own, without utilizing the "background knowledge" which in this case must necessarily include the history of rock falls, remains Dembski's secret. The actual procedure is opposite to that suggested by Dembski: the probability is estimated as not large enough to attribute the event to regularity based on the background knowledge available, not the other way around as Dembski's scheme requires. If no such background knowledge is available, no useful estimate of probability is possible, thus rendering the first and the second nodes of Dembski's EF meaningless.

Since the first and the second nodes of EF in Dembski's scheme are contrary to elementary logic, the triad-like structure of his EF collapses. The remaining part of EF, its third node, is, however, a different story requiring a little more detailed discussion.

5. The third node of the explanatory filter

Let us look at Dembski’s treatment of the third node in D-NFL.

In all five previous renditions of his EF, Dembski suggested that the discrimination between chance and design is made at the third node of EF by two criteria. One is the event’s probability (this time, unlike at the two preceding nodes, estimated upon the definite assumption that the event was due to chance) and the other is "specification."

As I argued in my previous review of Dembski’s theory, the concept of specification, which was presented by Dembski in a heavily symbolic form, actually could be rather simply defined as "a subjectively recognizable pattern." This simple definition follows from all those examples Dembski provided in his previous publications. Dembski, however, chose to cloak this concept in a convoluted mathematical mantle.

To infer design according to Dembski, an event, besides being improbable on a chance hypothesis (the latter not being specified) must also display "specified complexity," or, for short, specification. (The concept of "specified complexity," which is assigned a great importance in Dembski’s theory, is discussed in detail in another section of this review.)

Specification, according to Dembski’s previous renditions of his theory, in turn comprises two necessary components, one called detachability and the other delimitation. Detachability, in turn, according to Dembski’s previous renditions, necessarily comprises two sub-components, one named conditional independence of the background knowledge and the other tractability.

This multi-component scheme has been criticized from various viewpoints by various reviewers of Dembski’s publications. In particular, I argued against the excessively convoluted way these concepts were presented by Dembski, pointing out that the concept of specification in itself does not require all that mathematical symbolism and that all these components of specification in Dembski’s rendition actually serve no useful role. Apparently the critique has had some effect after all, since in [1] Dembski offers a discussion of specification different in certain respects from his previous opuses. Of course Dembski never admits that anything was wrong in his previous publications or that he was influenced by criticism. However, when discussing specification in [1], Dembski introduces two alterations of his earlier discourse.

As mentioned above, in Dembski’s earlier rendition of specification, the concept encompassed three components, conditional independence of background knowledge (denoted CINDE), tractability (denoted TRACT), and delimitation (denoted DELIM).

In the new rendition found in D-NFL, TRACT is no longer a constituent of detachability. On page 66 in [1], Dembski says: "…I have retained the conditional independence but removed the tractability condition." In The Design Inference Dembski spent considerable effort to justify the inclusion of TRACT into his concept of detachability, using both plain words and mathematical symbolism plus examples illustrating the importance of that alleged insight into the design inference. In my previous discussion [2] of Dembski’s The Design Inference, I argued that all this convoluted structure of the design inference, and of the concept of specification in particular, was excessive, while the concepts of tractability, delimitation, etc., played no useful role and served only to embellish the discourse with a complex mathematical façade. Although in [1] Dembski does not explicitly admit any faults of his earlier argument, he actually does it implicitly, by removing from the concept of detachability its TRACT component which he previously viewed as necessary and important. Instead of frankly admitting that he goofed, Dembski says now that tractability is not really necessary within detachability but rather should be moved to his Generic Chance Elimination Argument (GCEA).

The question of whether tractability is indeed a useful part of the GCEA or is as useless there as it is in detachability is a separate issue. The fact is that Dembski implicitly admits his previous error but is reluctant to say this directly.

Furthermore, there is one more alteration of Dembski’s earlier discussion of specification. A fate worse than that of TRACT befell another component of Dembski’s earlier scheme, the one he denoted DELIM. Whereas in [1] Dembski at least points out the deletion of TRACT from detachability, he does not mention at all the elimination of DELIM, which in his earlier treatment was deemed a necessary and important component of specification.

In his new book, DELIM has disappeared. Dembski does not explain what happened to that feature which he previously claimed to be a necessary part of specification.

Obviously, if that term is not mentioned any longer when specification is being discussed, it was not really a necessary component of specification, which negates Dembski’s convoluted discussion of DELIM in his previous book. Normally in a scientific publication such alterations of the author’s earlier position are explained and when appropriate, the errors or inadequacy of the earlier argument are admitted. Of course D-NFL is not really a scientific book, although Dembski wants it to be accepted as such.

If we remove from EF the first and the second nodes, will the remaining third node provide a reliable tool to attribute an event either to design or to chance? As I argued in my previous discussion [2] of Dembski’s work, the third node is not a reliable tool. Dembski himself admits that EF can yield false negatives, i.e., attribute an event to chance when it was actually designed. He insisted, though, that when EF attributes an event to design this is reliable, i.e., that EF does not yield false positives. Since Dembski made this claim, many examples of false positives produced by EF have been demonstrated. Dembski does not address these examples in D-NFL.

As I argued previously, one of the main faults of Dembski’s scheme is his attributing to specification the status of a kind of magic. There is, though, nothing magical in that concept. In the examples of false negatives specification was not discerned but the event was obviously designed. In the examples of false positives the specification seemed to be present but the event was due to chance.

Specification, as follows from Dembski’s own examples, is nothing more than a subjectively recognized pattern. It can be illusory or real, but it has no exclusive status among many factors pointing either to design or to chance. Recognition of a pattern (which necessarily is subjective) affects the estimate of the event’s probability, but so do many other factors.

I discussed this thesis in detail in my previous discussion [2] of Dembski’s work.

In conclusion of this section, let us look again at the example discussed at the beginning, that of a stone falling into a gorge, and review it within the scope of the third node of EF.

At the third node, the event of low probability (estimated on a chance hypothesis) has to be attributed to design if specification is discovered. In one of Dembski’s own examples, if in an archery competition the arrow hits a small target painted on a wall, the event is specified. In the case of the stone hitting the path in a gorge, if the stone falls exactly at the spot where the soldiers are at that moment, the event is likewise specified. In the case of archery competition, specification is in that the target has a specific shape and location, unique among all other locations on the wall. In our example, the soldiers happen to be at a specific spot, unique among all other parts of the gorge, so this example is exactly like Dembski’s. Hence, if the likelihood of a chance occurrence of the event is estimated as very small (based on the known history of the gorge) and the stone falls exactly on the spot where the soldiers happen to be at that moment, we have to attribute the event to design, i.e. to assume that some enemy fighters deliberately pushed the stone down the slope to hit the soldiers. Yes, this is a reasonable assumption. However, this assumption still may happen to be wrong.

Indeed, once in a while stones do fall in that gorge. It happens very rarely, but it is not an impossible event. Could such a coincidence happen that a stone falls exactly at the moment the soldiers are where it lands? Of course, it could. Why, then, do we attribute the event to design? Because the probability of such a coincidence is very small and only because of that. Why is the probability of the event in question estimated as being very small? Since stones fall in that gorge from time to time, albeit very rarely, the probability that at some moment a stone would fall upon some spot within the gorge is 100%. What makes the probability of the actual event so small? Specification -- the choice of a specific spot and time.

My point is that all that specification does is decrease the probability of the event in question. Yes, specification (any specification, i.e., any choice of a specific event out of the multitude of all possible events, and not necessarily the particular kind of specification meeting Dembski’s criteria) always decreases the estimate of probability of an event. But so do many other factors which may have nothing to do with specification. For example, if the soldiers know that a reconnaissance detachment has investigated the slopes flanking the gorge and found that all the stones above the path are rather strongly embedded in the ground, this would greatly decrease the likelihood of a chance fall of rocks, and this decrease of likelihood would have nothing to do with specification, which in this case lies in the unique location and timing of the event. On the other hand, if the soldiers knew that right behind that steep rocky slope which flanks the gorge there is a village whose inhabitants are Taliban sympathizers who boasted that they would harm American soldiers at every opportunity, and that there is an easy access path from that village to the crest of the ridge hovering over the gorge, this knowledge would not affect the probability of a chance fall of rocks, but would greatly increase the estimated likelihood of design. Hence, the ratio of likelihoods of chance and design would drastically change in favor of design. This increase of the estimated likelihood of design would also have nothing to do with specification, which lies in the unique location and timing of the event. Likewise, the knowledge that the winner of an archery contest happened to be a novice who has never before succeeded in winning the competition would greatly decrease the likelihood of design and shift the ratio of likelihoods of chance and design in favor of chance.
On the other hand, if the winner was a world champion with a record of hitting the target 99% of the time, this would greatly increase the likelihood that his success was due to design as compared with the likelihood of chance, and this conclusion would have nothing to do with specification (which in this case, according to Dembski, lies in the unique location and small size of the target).
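The way such background knowledge shifts the balance between chance and design can be sketched numerically. The Python fragment below uses the odds form of Bayes' rule, which is one conventional way to formalize a "ratio of likelihoods" and is my framing for illustration, not anything found in Dembski's text. The function name and every number are hypothetical.

```python
def odds_design_vs_chance(prior_odds: float,
                          likelihood_given_design: float,
                          likelihood_given_chance: float) -> float:
    """Odds of design versus chance after the event (Bayes' rule, odds form)."""
    if likelihood_given_chance <= 0:
        raise ValueError("likelihood under chance must be positive")
    return prior_odds * likelihood_given_design / likelihood_given_chance


# Hypothetical inputs. The specification (the stone lands where and when the
# soldiers are) is the same in all three cases; only other knowledge changes.
baseline = odds_design_vs_chance(1.0, 0.5, 1e-4)

# Reconnaissance found the stones firmly embedded: a chance fall is less likely.
embedded_stones = odds_design_vs_chance(1.0, 0.5, 1e-6)

# A hostile village with easy access to the crest: design is more plausible
# beforehand, while the likelihood of a chance fall is unchanged.
hostile_village = odds_design_vs_chance(10.0, 0.5, 1e-4)

print(f"baseline:        {baseline:.3g}")
print(f"embedded stones: {embedded_stones:.3g}")
print(f"hostile village: {hostile_village:.3g}")
```

In both modified cases the odds move toward design although nothing about the specification has changed, which is the point made above: specification is one input among many to the estimate.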

At this point it seems reasonable to go back to the first node of Dembski’s filter and see how the knowledge of the village behind the ridge or of the archer’s record affects the procedure.

In the case of the village inhabited by Taliban sympathizers, the answer is obvious. The knowledge of the existence of the village in question does not imply a regularity is at work. Indeed, the villagers do not regularly climb up the slope and drop rocks into the gorge. The event therefore reaches the third node where the likelihood of design exceeds that of chance because of the knowledge about the Taliban village.

In the case of an archer the situation is different. The knowledge of the champion archer’s record points to regularity, since the champion archer regularly succeeds in hitting the target. Actually, though, this has little to do with Dembski’s scheme. In his scheme, we first estimate the probability of an event and if it is found to be large we attribute it to regularity. In fact, the procedure is opposite to Dembski’s scheme. We know that the archer in question regularly hits the target and therefore we estimate the probability of his success as large. In other words, in that case the event is attributed to regularity even before it enters the filter, so that design is inferred without using the filter at all. In this case specification (i.e., the unique location of the recognizable target) plays no role whatsoever in the inference to design.

The role of specification is not that of an independent factor besides the small probability as Dembski’s alleged "crucial insight" asserts. Specification, regardless of whether or not it meets Dembski’s criteria, always diminishes the probability estimated on a chance hypothesis, but in that it is not any different from many other factors affecting the estimate of probability.

Design inference may be very plausible, but still remains probabilistic and its plausibility is due only to the low estimated likelihood of chance being the causal antecedent of an event. As I argued in my previous review of Dembski’s work, he correctly states that a small probability in itself is not a sufficient reason to infer design. However, adding specification does not remedy the situation because the latter adds nothing qualitatively different from other factors affecting the estimate of probability. Therefore design inference is necessarily probabilistic, with or without specification, although it may be in certain cases extremely plausible.

6. Is complexity equivalent to low probability?

I would like to review one more example which not only illustrates the deficiencies of Dembski's EF scheme, but, moreover, shows the fallacy of one of his most fundamental assumptions – that complexity of an event translates into its low probability.

I am indebted to Jeffrey Shallit for pointing (in a private communication) to a website where references are given to books [11, 12, 13]. In these books a rarely observed phenomenon is described: sometimes freezing water forms unusual flat triangular crystals of snow. This phenomenon is observed quite rarely and the mechanism for the formation of such crystals is unknown.

Of course, even though the detailed mechanism is unknown, it is obvious that the formation of the mentioned crystals of an unusually simple shape is predetermined by certain weather conditions under which the appropriate thermodynamic potential of water/snow has a minimum, i.e., it can be said that the formation of such crystals is due to a law of physics.

Dembski maintains that since the formation of crystals is predetermined by law, his EF will not err by attributing the formation of any crystal to design. For example, on page 12 in [1] Dembski wrote: "Another concern is that filter will assign to design regular geometric objects like the star-shaped ice crystals that form on a cold window. This criticism fails because such shapes form as a matter of physical necessity simply in virtue of the properties of water (the filter therefore assigns crystals to necessity and not to design)."

Let us notice, first, that the above statement actually contradicts Dembski's scheme, according to which the attribution of an event to law (necessity) occurs in the first node of his filter if the probability of that event is found to be high. Indeed, when stating that his filter reliably attributes the formation of crystals to law, he actually bases his conclusion on his prior knowledge of the existence of a law rather than on the estimation of the event's probability which his scheme prescribes. He does not seem to notice that the procedure he himself employs is opposite to that implied in his EF.

Turning again to the flat triangular crystals, note that although their formation is indeed predetermined by the laws of physics, it depends on the occurrence of certain weather conditions. Such conditions occur very rarely. We do not possess the knowledge which would enable us to predict when and why such conditions may occur. The occurrence of the weather conditions necessary for the creation of flat triangular crystals is a typical chance event. Therefore, contrary to Dembski's scheme, the occurrence of flat triangular crystals is also a chance event.

Look now at another aspect of the described situation, pointed out by Shallit. Since the occurrence of flat triangular crystals is a very rare event, its probability is very small. On the other hand, since these crystals have a precisely defined shape, the event is specified. The combination of low probability with specification is, according to Dembski's uncompromising theory, a reliable marker of design. In fact, though, the occurrence of these crystals is due not to design but to a combination of chance and law (such a combination is not at all envisioned in Dembski's theory, where only individual actions of either law, chance, or design are recognized). As Shallit has correctly pointed out in his private communication, this is an example of a false positive which, as Dembski vigorously asserts, his filter never produces.

Now let me point out the most serious refutation of Dembski's thesis illustrated by the above example. According to Dembski's theory, complexity is just another face of low probability. Statements to this effect are scattered all over his books and papers, including D-NFL. However, in the above example the least probable form of a crystal, the flat triangular one, is also the simplest of all observed shapes of such crystals. This example illustrates the fallacy of Dembski's position, according to which complexity necessarily translates into low probability. A simpler event may very well turn out to be less probable than a more complex one, which leaves Dembski's theory unsubstantiated.

7. Dembski suggests a fourth law of thermodynamics

In his very popular book [14] first published some fifty years ago, the well known writer Martin Gardner offered five features typical of the literary production of what he called "cranks." One such feature is "a tendency to write in a complex jargon, in many cases making use of terms and phrases he himself has coined." Gardner wrote further that a crank does not have to be a dunderhead. In some cases an obvious crank may nevertheless be quite "capable of developing incredibly complex theories. He will be able to defend them in books of vast erudition, with profound observations, and often liberal portions of sound science. His rhetoric may be enormously persuasive. All the parts of his world usually fit together beautifully, like a jig-saw puzzle. It is impossible to get the best of him in any type of argument. He has anticipated all your objections. He counters them with unexpected answers of great ingenuity."

Complex jargon with many self-coined terms is found in abundance in Dembski's publications, including his newest book [1]. As to his ability to develop complex theories which are sprinkled here and there with portions of sound science, and which display erudition, Dembski seems to possess such abilities as well.

A crank's use of complex jargon replete with self-coined terms often finds its most salient expression in suggesting allegedly fundamental laws hitherto unknown in science, and Dembski has an obvious propensity to do so. Perhaps the most vivid example of Dembski's extraordinary claims is his announcement of a discovery of an additional law of thermodynamics. On page 169 Dembski writes: "The traditional three laws of thermodynamics are each proscriptive generalizations, that is they each make an assertion about what cannot happen to a physical system." Leaving aside the gist of that statement (which certainly can be disputed), we cannot fail to notice that Dembski seems to have forgotten certain simple facts from an introductory course in thermodynamics: there are not three but four "traditional" laws of thermodynamics. By a peculiar historical twist they were named the zeroth, the first, the second, and the third laws, so, although there are four of them, none is named the Fourth Law. Of course, this flop on Dembski's part does not seem to be very important, but it makes a reader pause and consider whether Dembski's statements, at least in that part where the new law of thermodynamics is suggested, should be taken with caution.

It is a very rare situation when a scientist is lucky enough to discover a new law which is then accepted by the scientific community and becomes a part of the arsenal of science. It seems to be a more common situation when a new law is suggested but dies out after the scientific community reviews it and finds it unsubstantiated. It is a much more common situation when some allegedly important law is claimed within the framework of pseudo-science. That is why the claim of a discovery of a new law of science more often than not invokes skepticism and suspicion that this is just a case of pseudo-science. I believe the alleged Fourth Law of thermodynamics claimed by Dembski, as well as his underlying Law of Conservation of Information (LCI), are examples of the latter situation. In this article I intend to substantiate that conclusion.

The laws of science differ in importance and in the extent of the generalization of the observed phenomena. For example, scientist S claims that in the course of her research she found that at a pressure of x Pascal, a certain metal A melts at the temperature of T Kelvin. Other researchers try to reproduce her results and obtain data which, within the margin of a reasonably small error, confirm the claim of S. Then the law establishing that the melting point of metal A at x Pascal is about T Kelvin is postulated and is accepted by the scientific community as a reasonable approximation of reality. This is a legitimate law of science, which, however, will hardly earn S a Nobel prize. There are, though, other types of laws, those which constitute very far-reaching generalizations of a wide variety of phenomena. The four laws of thermodynamics are of the latter type. They are among the most general statements about nature known in science. If a scientist managed to indeed introduce a Fourth Law of thermodynamics, thus making a total of five laws in that science, this would constitute a great achievement.

Why are there four laws of thermodynamics but not three or, say, two? The reason that the four laws of thermodynamics cannot be reduced to three, or two, or one is that these laws are not derivable from each other. The second law is not a consequence of the zeroth law or of the first law, and the first law does not entail the second or the third law, etc. If a new law of thermodynamics is to be discovered, it necessarily must be independent of the four accepted laws. If a newly suggested law simply reformulates a concept which has already been covered by one of the four existing laws, then it is not a new law of thermodynamics.

Of course, another requirement for a supposedly new law of thermodynamics is that it must not contradict any of the four laws of thermodynamics already accepted in that science.

I intend to show that the Fourth Law of thermodynamics suggested by Dembski fails on both counts. First, it covers phenomena which have already been covered by the second law of thermodynamics and therefore, even if it were correct in itself, it would not constitute a new law but would at best be just another way to state essentially the same postulate already adopted in science. Second, the situation with Dembski's allegedly possible new law of thermodynamics is worse still, because it actually contradicts the Second Law of thermodynamics.

Let me start with the first point. The Fourth Law of thermodynamics suggested by Dembski is a generalization of what he calls the Law of Conservation of Information (LCI for short).

In my previous detailed discussion [2] of Dembski's earlier publications I pointed to the flaws which, in my view, are present in Dembski's discourse related to information. I will repeat here briefly some of the points of that critique.

Dembski's treatment of information was also subjected to critique by several other authors, including Victor Stenger [15], Matt Young [16] and others.

On page 140 of [1], Dembski offers the following definition of information I associated with an individual event A:

$$I(A) = -\log_2 P(A) \qquad (1)$$

where P(A) is the probability of event A.

Formula (1) as such was not given in Shannon's classical work [17] which was the real foundation of information theory. However, this formula can be formally derived from Shannon's formula which defines information as a change of entropy if we apply the latter to an individual event. Formula (1) is not peculiar to Dembski's discourse and can be found in textbooks and even in encyclopedias. For example, on page 55 in the textbook [18] we find exactly the same formula (1) for information. Likewise, an article on Information Theory by Professor George R. Cooper in Van Nostrand Scientific Encyclopedia (1976 edition) also contains the same expression as a definition of information of an individual event. Information theory has undergone substantial development since Shannon's classical contribution. While formula (1) does not plainly contradict Shannon's fundamental concepts, in modern information theory it is viewed as simplistic, and information is defined in various more sophisticated ways. The quantity expressed by formula (1) is sometimes referred to as "self-information" (see, for example, Entropy and Information Theory). Since Dembski has been acclaimed as not just an information theorist but an "Isaac Newton of information theory," his treatment of information should be expected to be on a level well above a simplistic, amateurish approach.

For the sake of discussion, let us accept Dembski's formula (1) as a working definition of information which can be adequate if we do not require that discussion to be at the frontier of the modern development of information theory.

A completely different question, though, is whether or not Dembski uses formula (1) properly, and I agree with critics (see, for example, [15, 16]) that his treatment of information, including his use of formula (1), has a number of faults. An example will be discussed a few lines down (the example with the word METHINKS).

In his previous book The Design Inference [3] Dembski did not discuss his theory in detail from the viewpoint of information. In his other book Intelligent Design [19], which was of a more popular type, there is a chapter on information (wherein he first suggested his Law of Conservation of Information). In his latest book [1], Dembski offers a rather detailed discussion of information and, unlike in the earlier presentation of his views, discusses the concept of Shannon's entropy. The mathematical expression used by Dembski for entropy (page 131) is

$$H(a_1, \ldots, a_n) \overset{\text{def}}{=} -\sum_{i} p_i \log_2 p_i \qquad (2)$$

Various versions of essentially the same expression, all stemming from Shannon's original work [17], are commonly used and meet no objection. Dembski defines entropy as "the average information per character in a string." Again, I have no objection to that definition, which is in agreement with Shannon's definition. I believe, though, that this definition dooms to failure Dembski's attempt to introduce a Fourth Law of Thermodynamics based on his LCI. Like information I defined by formula (1), entropy in information theory is also measured in bits (or, more often, in bits per symbol, when specific entropy is used). If natural logarithms are used instead of logarithms to the base 2, the units for entropy and information are called nats.
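As a minimal sketch of formula (2), assuming the probabilities p_i are supplied as a plain list (the function name and the four-symbol distribution are illustrative and do not come from Dembski's book), the entropy can be computed in bits or, by switching the logarithm base, in nats:

```python
import math

def shannon_entropy(probs, base=2.0):
    """Average information per symbol, H = -sum(p_i * log(p_i)).

    base=2 gives bits per symbol; base=math.e gives nats per symbol.
    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Hypothetical four-symbol source, chosen only for illustration.
example = [0.5, 0.25, 0.125, 0.125]
h_bits = shannon_entropy(example)            # about 1.75 bits per symbol
h_nats = shannon_entropy(example, math.e)    # the same quantity in nats
print(h_bits, h_nats, h_nats / math.log(2))  # last value converts nats back to bits
```

Dividing the value in nats by ln 2 recovers the value in bits, which is all that the change of units amounts to.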

In many books on information theory one can find statements according to which entropy is one of the measures of information and Dembski's quotation of equation (2) is in agreement with that approach. Sometimes these two terms – entropy and information - are used interchangeably (Shannon himself was not very stringent in unequivocal usage of these terms). Some writers prefer to use for H the term uncertainty instead of entropy (or Shannon's uncertainty) and the term surprisal instead of information for the quantity I. Regardless of the preferred usage of terms and whatever the nuances in the interpretation of entropy are, the essence of that concept is the same.

Before discussing entropy, let us look at some of Dembski's examples wherein he estimates the information carried by a certain string of characters. On page 166, Dembski estimates the complexity of the word METHINKS. The formula used by Dembski for what he in this case calls complexity is exactly the same as he previously introduced for information, namely –log₂ P (formula 1).

Note that the term complexity is used by Dembski in this example in a different sense than his own definition of complexity found in his previous book [3]. In that book, Dembski defined complexity as "the best available estimate" of how difficult it is to solve a problem at hand. Now, discussing the complexity of the word METHINKS, he uses for complexity formula (1), which he introduced a few pages earlier for information I and whose connection to the difficulty of solving a problem seems to be rather far-fetched.

Here is how Dembski calculates the complexity of the word METHINKS. Since this word has 8 characters drawn from the English alphabet, which comprises 26 letters plus a character for space, a total of 27 characters, Dembski estimates the probability P of that word's occurrence as (1/27)⁸. The negative logarithm to the base 2 of this number is about 38, so Dembski concludes that the complexity of that word is 38 bits. (Actually the expression used by Dembski is that "the complexity of the word METHINKS is bounded by -log₂1/27⁸"; the emphasis on "bounded" is mine.)
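The arithmetic behind the 38-bit figure can be reproduced with a short sketch. It assumes, as Dembski's estimate does, that each of the 8 characters is drawn independently and uniformly from 27 symbols; the variable names are illustrative.

```python
import math

ALPHABET_SIZE = 27   # 26 letters plus the space character
WORD = "METHINKS"

# Probability of the word if every character is drawn independently
# and uniformly from the 27-symbol alphabet.
p_word = (1 / ALPHABET_SIZE) ** len(WORD)

# Formula (1): I = -log2 P
bits = -math.log2(p_word)
print(f"P = {p_word:.3e}, -log2 P = {bits:.2f} bits")
```

The result is a little over 38 bits, which is simply 8 times log₂ 27.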

Hence in his calculation Dembski treats what he calls complexity by applying a formula introduced a few pages earlier for information. His estimate assumes the uniform distribution of 27 characters in what he calls a "reference class of possibilities" so that each of the 8 characters in that word is assumed to have the same probability of appearing in the string. Such an assumption is justified only for a string of 8 characters randomly drawn from a stock which has an unlimited supply of all 27 possible characters. (The randomized texts obtained in such a way are sometimes referred to as "monkey" texts, because of the famous example of monkeys randomly hitting the keys on a typewriter). In the case of an urn technique, formula (1) is applicable if each letter, after having been drawn from the urn, is returned to the urn.

However, if the word METHINKS is a part of a message received through a communication channel, then it has to be expected to be a part of a natural language's vocabulary. On page 164, i.e., just two pages before calculating the complexity of the word METHINKS, Dembski wrote about transmission of information "from one link to another," about "the textual transmission of ancient manuscripts," "transmission of texts" (page 165) and the like. He never indicates that on page 166 he discusses the occurrence of the word METHINKS in a different way, as a result of a random selection of letters from an unlimited stock of all 27 symbols, except by using the term "bounded."

The actual distribution of characters in English texts is not uniform. (This is always true for meaningful texts, and often also for gibberish. For example, if the letters in a meaningful text are randomly permuted, the permuted text most often will become meaningless, but the probability distribution of letter frequencies in the permuted text will remain the same as it was in the original meaningful text. The behavior of such randomized texts as compared with meaningful texts, was analyzed in detail in a posting [28].)
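The remark about permuted texts is easy to check with a small sketch. The sample phrase and the fixed random seed are arbitrary choices made for illustration.

```python
import random
from collections import Counter

text = "METHINKS IT IS LIKE A WEASEL"

# Randomly permute the characters; a fixed seed keeps the run repeatable.
rng = random.Random(0)
chars = list(text)
rng.shuffle(chars)
permuted = "".join(chars)

# The multiset of characters, and hence every letter frequency, is unchanged.
same_frequencies = Counter(text) == Counter(permuted)
print(permuted)
print(same_frequencies)
```

Shuffling changes only the order of the characters, so the comparison of the two frequency counts holds for any permutation, meaningful or not.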

If the letters of the word METHINKS occur within a message in a natural language, the letter E has the maximum probability (about 12%) of appearing in any location of the word, the letter T has a slightly smaller probability, etc. In this case the use of formula (1), as was done in this case by Dembski while claiming that he calculated complexity, is wrong. Formula (1), which is legitimate for "self-information" associated with an individual event, can be formally used for a series of events only if their probability distribution is uniform. In the latter case, however, information associated with each individual event (defined by formula 1) would coincide with entropy defined by formula (2), which equals the average information. Then what Dembski actually calculated under the label of complexity, turns out to formally be the entropy of the word in question assuming a uniform distribution of letters. This mess of terms is rather typical of Dembski's discourse. Since Dembski makes no statement to the contrary, the actual distribution of characters has to be assumed to be non-uniform, so that a proper calculation of entropy should be done in this case using formula (2).
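The relation between formulas (1) and (2) can be illustrated with a sketch. The uniform case uses the 27-symbol alphabet from the text. The skewed distribution is a hypothetical stand-in, not real English letter statistics: it puts 12% on one symbol (the rough figure quoted for E) and spreads the rest evenly.

```python
import math

def self_information(p):
    """Formula (1): information of a single event with probability p, in bits."""
    return -math.log2(p)

def entropy(probs):
    """Formula (2): average information per symbol, in bits."""
    return sum(p * self_information(p) for p in probs if p > 0)

# Uniform case: 27 equally likely characters.
uniform = [1 / 27] * 27
h_uniform = entropy(uniform)
i_uniform = self_information(1 / 27)

# Hypothetical skewed case: one symbol at 12%, with the remaining 88%
# spread evenly over the other 26 symbols.
skewed = [0.12] + [0.88 / 26] * 26
h_skewed = entropy(skewed)

print(h_uniform, i_uniform)  # these two coincide
print(h_skewed)              # smaller than the uniform value
```

Only in the uniform case does the per-symbol value from formula (1) equal the average from formula (2); any departure from uniformity lowers the entropy.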

However, even if Dembski used formula (2), thus accounting for the non-uniform distribution of symbols, it still would not be sufficient for a correct estimate of the word's entropy (which he calls in this case complexity). Not only is the probability distribution non-uniform, the word in question is also a part of a meaningful English vocabulary, so the probability distributions of digrams, trigrams, etc., are also non-uniform. Furthermore, natural languages possess redundancy which substantially decreases the entropy of meaningful texts. As Shannon already showed, the first order entropy of a meaningful English text is about one bit/character, hence the first order entropy of the word METHINKS is about eight bits, rather than the thirty-eight bits of Dembski's estimate of the complexity's bound (not to mention that Dembski's estimate ignores the entropies of orders higher than 1, the existence of multiple symbols other than just the 26 letters of the alphabet, etc.).

Now consider a different situation in which the word METHINKS can occur either as a part of a message arriving through a communication channel or through the urn technique wherein, though, the elements of the phase space are not individual letters but whole words. For example, imagine that the urn holds every one of the entries from the unabridged Webster's dictionary. It means the urn holds about 315,000 words, each as an indivisible unit. Since each word occurs only once, every word has the same probability of being randomly pulled out of the urn. The probability that the word METHINKS happens to be the one randomly chosen is then P = 1/(3.15×10⁵). Then, using formula (1) suggested by Dembski for information, we find that the information (which he also calls complexity) obtained when the word in question has been pulled out is –log₂P = 18.3 bits instead of the 38 bits of Dembski's estimate. This number, though, obviously has nothing to do with the complexity associated with the word in question.
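The dictionary-urn figure can be checked with a sketch that applies formula (1) to a single draw from 315,000 equally likely words and compares it with the letter-by-letter estimate. The constant name is illustrative; the word count is the approximate one used above.

```python
import math

DICTIONARY_WORDS = 315_000   # approximate entry count used in the text

p_word = 1 / DICTIONARY_WORDS
bits_urn = -math.log2(p_word)          # formula (1) for the word-level urn

bits_letters = 8 * math.log2(27)       # letter-by-letter estimate for comparison

print(f"word urn: {bits_urn:.1f} bits, letter draw: {bits_letters:.1f} bits")
```

The same word thus yields roughly 18.3 bits under one procedure and roughly 38 bits under the other.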

If the urn holds unequal numbers of words, say, the distribution of words in the urn is determined by the frequencies of their occurrence in English texts, the probability of the word METHINKS will be different from that estimated for the uniform distribution, and the information calculated by formula (1) will not only be different from either 18.3 bits, 38 bits, or 8 bits, but will also have nothing to do with complexity.

It is possible to define a procedure wherein a random occurrence of the word METHINKS results in information much exceeding 38 bits of Dembski's estimate and also has nothing to do with complexity. For example, assume a word is randomly chosen out of all the words found in all the books in the Library of Congress. What is in this case the probability that the randomly chosen word turns out to be METHINKS? If the total number of words in all the books in the library is N, and the word METHINKS happens among them X times, the probability in question is P=X/N. Obviously, N is a very large number while X is a relatively small one since the word METHINKS is a rare one. In such a procedure the information, if defined by formula (1), will be much larger than 38 bits of Dembski's suggested "bound."
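How the result depends on X and N can be sketched as follows. The counts below are placeholders invented for illustration; they are not measured figures for the Library of Congress or for the frequency of the word METHINKS. The sketch only shows that formula (1) depends on the ratio N/X, and that it exceeds the uniform-letter figure whenever that ratio is larger than 27⁸.

```python
import math

def information_bits(occurrences, total_words):
    """Formula (1) with P = X / N, where X is the number of occurrences."""
    if not 0 < occurrences <= total_words:
        raise ValueError("need 0 < occurrences <= total_words")
    return math.log2(total_words / occurrences)

# Ratio N / X at which the result equals the uniform-letter figure of 8 * log2(27).
threshold_ratio = 27 ** 8

# Placeholder counts for illustration only; they are not measured values.
hypothetical_x = 1_000
hypothetical_n = hypothetical_x * threshold_ratio * 16   # 16 times rarer than the threshold

# Exceeds 8 * log2(27) by log2(16) = 4 bits.
print(information_bits(hypothetical_x, hypothetical_n))
```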

This shows that the estimate of information associated with a certain word may be very different depending on the probability distribution so defining the information bound requires first defining which probability distribution is considered. Dembski's calculation of the "information bound" is valid only for a specific (uniform) probability distribution in a specific procedure (random choice of individual letters). Moreover, information associated with a certain word, if defined by formula (1), cannot be simply translated into complexity. In his discourse, Dembski does not clearly define the discussed situation, uses different terms in a haphazard way and thus creates a mess of concepts and definitions.

Now back to the discussion of the meaning of entropy.

The main founder of information theory Claude Shannon introduced the term entropy for the quantity H (formula 2) because this quantity seemed to behave like its namesake in thermodynamics. After Shannon's classical paper [17] appeared in 1948, the legitimacy of his term and its relation to the thermodynamic entropy were discussed at length and ultimately a consensus was reached accepting the term as appropriate, and informational entropy was accepted as not just a namesake of the thermodynamic entropy but as essentially the same quantity albeit traditionally measured in different units. (Thermodynamic entropy is measured in Joule/Kelvin, whereas informational entropy in bits or bits/symbol, or also in nats or nats/symbol. In theoretical physics entropy is often considered a dimensionless quantity [20].)

In order to clarify that thermodynamic and informational entropies are essentially the same, let us review the concept of entropy as it is interpreted in thermodynamics.

Every professor of physics who has ever taught thermodynamics knows that students usually rather easily accept the concept of energy but often have a hard time comprehending the concept of entropy. In fact, however, the concept of energy is one of the most mysterious concepts in science whereas entropy is a quite simple one. Perhaps the different attitude to these two concepts on students' parts is due to the fact that the term energy is commonly used in the non-scientific speech and students are simply accustomed to it while entropy is a purely scientific term with no use in the everyday vernacular. I believe that nobody really understands what energy is. This concept has no definition. The most fundamental law of physics, the law of energy conservation in its general form, is a rather mysterious postulate which says that there is some quantity we call energy, whose total amount in the universe is constant, although we don't know what this amount is and it can very well equal zero. There are many aspects of that law which are obscure (whose discussion is beyond the scope of this paper). Nevertheless, it seems to be accepted by almost everybody without much pain.

The First Law of thermodynamics is the particular form of the law of energy conservation for macroscopic systems.

The Second Law of thermodynamics deals with entropy. Although entropy has a rather simple and transparent definition, this concept is often absorbed by students only with considerable difficulty.

I see no need to delve here into the detailed history of the development of the entropy concept. The real meaning of that quantity was not immediately properly interpreted when it was first introduced in thermodynamics by Clausius. It took a considerable effort by a number of outstanding scientists in the last quarter of the 19th century to clarify that concept, which was successfully done within the framework of statistical physics. When this was done (and one of the crucial insights was provided by Ludwig Boltzmann) it transpired that entropy can in fact be interpreted as a simple concept of an extremely versatile character.

Without discussing many nuances of that concept, its most concise and universal definition is as follows: entropy is a measure of the degree of randomness (disorder) in a system which comprises many constituent elements. The more disordered is the conglomerate of whatever elements the system encompasses, the larger is its entropy.

It is easy to see the extreme versatility of that concept. It does not matter what the physical nature of the system's constituent elements is. Initially introduced in thermodynamics, entropy was meant to characterize thermodynamic systems. For example, it can characterize the degree of disorder in a gas occupying a certain volume. The gas consists of a large number of identical molecules and its entropy is a quantity which is a measure of disorder (randomness) in the distribution of those molecules over the volume. However, entropy is not a property of those molecules. Their physical nature has nothing to do with the degree of randomness characterizing their distribution in space. The choice of units for entropy is not restricted by the properties of that quantity but can be different depending on convenience. Since entropy is not a physical property of a body or of any other conglomerate of constituent elements, it has no "natural" units which therefore can be assigned at will. Historically, the choice of Joule/Kelvin (or, say cal/⁰C) as units for thermodynamic entropy was made not because these units are somehow intrinsic for entropy but because the initial definition of entropy S by Clausius was in the form of dS=dQ/T where Q stands for heat and T for temperature. As it was realized later, Clausius's entropy is just a particular choice for that function out of an infinite number of possible functions all of which could serve as entropy as well. The sole requirement for a function to be capable of serving as thermodynamic entropy is that it stays constant in a reversible adiabatic process. 
When Boltzmann discovered the statistical meaning of entropy, thus tying it to probabilities, he wished to preserve the quantitative values of entropy matching Clausius's entropy, so he introduced a coefficient (named the Boltzmann coefficient) which has a certain value expressed in Joule/Kelvin (or cal/⁰C), thus converting the dimensionless logarithm of (thermodynamic) probability into Clausius's entropy. However, the multiplication of the logarithm of probability by the Boltzmann coefficient, while being a clever and very useful device, was in fact an arbitrary choice as far as the meaning of entropy itself is concerned. Entropy can just as well be used as a dimensionless measure of randomness (and indeed is often used in such a way in theoretical physics [2]) or measured in any arbitrary units if this facilitates its use. The meaning of entropy does not depend on the units chosen for it, as the essence of that quantity is in no way related to the physical properties of a system. It can be equally applied to estimate the degree of disorder in a gas occupying a certain volume, in a long string of characters, in a DNA strand, in a large gathering of people, and in an infinite number of other systems each comprising many constituent elements. Regardless of what those constituent elements are, the behavior of entropy is determined by the same laws, of which the 2nd law of thermodynamics is the most widely known.
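That the choice of units is a matter of multiplying by a constant can be shown with a short sketch. It assumes the standard relation S = k ln W and the SI value of the Boltzmann coefficient; the function names are illustrative.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23   # SI value of the Boltzmann coefficient, J/K

def bits_to_nats(bits):
    """Change the logarithm base from 2 to e."""
    return bits * math.log(2)

def nats_to_joule_per_kelvin(nats):
    """S = k * ln(W): multiplying a dimensionless logarithm by k gives J/K."""
    return nats * BOLTZMANN_J_PER_K

def bits_to_joule_per_kelvin(bits):
    return nats_to_joule_per_kelvin(bits_to_nats(bits))

one_bit = bits_to_joule_per_kelvin(1.0)
print(f"1 bit corresponds to about {one_bit:.3e} J/K")
```

One bit corresponds to k ln 2, a little under 10⁻²³ J/K, which is why informational entropies look negligibly small when expressed in thermodynamic units.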

As quoted above, Dembski acknowledges that entropy is just the average information. Therefore whatever new law of thermodynamics he may suggest, as long as it deals with information, it necessarily deals with entropy. If that is the case, the new law not only must not contradict the 2nd Law but may become a legitimate new law only if it adds something not already covered by the 2nd Law. Given the universal character of the 2nd Law of thermodynamics, whose validity extends well beyond its original home, thermodynamics, it is very hard to discover a new law relating to entropy which would add anything not covered by the 2nd law of thermodynamics. Regardless of whether Dembski's Fourth Law is correct or not, it deals with information and hence it deals with entropy, whose fundamental behavior is covered by the 2nd law. Even if the new law about information (i.e., about entropy) turns out to be correct, the chance that it will shed light on some hitherto unknown features of entropy's behavior is very slim. Most probably, the supposed Fourth Law, even if true, can only be a consequence of the 2nd law or some corollary to it.

As I will argue now, however, the possible Fourth Law of thermodynamics suggested by Dembski is in fact wrong because it is based on what he calls the Law of Conservation of Information. As I will argue, LCI is neither a law of conservation nor a law about information. Dembski's LCI is an unsubstantiated and contradictory statement which belongs in pseudo-science.

Let us first discuss whether LCI can indeed be named a conservation law. This seems to be a secondary point, but I will discuss it because Dembski obviously attaches a considerable significance to the name of his supposed new law, devoting many words to the justification of the name he gave to it.

It seems a platitude to say that a conservation law must necessarily be about something that is conserved. Conservation laws are important in physics. The most fundamental law of physics is the law of energy conservation. This law is an example of that rare type of conservation laws which are unconditional. The total amount of energy in the universe, according to that law, is constant, so it is conserved regardless of any changes of anything else in any processes. It encompasses energy in all of its multiple forms, including the energy of the rest mass. A much more common type of a conservation law is that which is conditional. For example, the law of momentum conservation in Newtonian mechanics asserts that the momentum of a system is conserved if no external forces act upon that system. The quantity which is conserved, according to that law, is momentum, which is quite rigorously defined within the framework of Newtonian mechanics. This conservation law is conditional because it asserts that the momentum of a system is conserved only under the condition that no external forces act upon the system. However, this law, first, quite clearly states that something is indeed conserved, second, clearly defines what is conserved and, third, clearly defines under which conditions it is conserved.

It can also be said that each conservation law is about the conditions ensuring that a certain quantity is an invariant of a certain process. Momentum of a macroscopic mechanical system is an invariant of such process in which no external forces act upon the system in question.

There are also many laws in science which are not conservation laws. In particular, out of the four laws of thermodynamics only the First Law is a conservation law. It is a particular form of the energy conservation law applicable to macroscopic systems. The three other laws of thermodynamics are not conservation laws. This includes the 2nd Law. The 2nd Law of thermodynamics has a number of formulations, but essentially it deals with entropy. This law does not state that entropy is conserved (although it is conserved in the particular case of the so-called reversible adiabatic processes; the concept of a reversible process is an abstraction since no real process is reversible). The 2nd Law of thermodynamics states that the entropy of a closed ("isolated") system cannot decrease. In any spontaneous process the entropy of a closed system can either increase or remain constant. Therefore, the 2nd law of thermodynamics is not considered and is not named a conservation law.

Dembski's alleged LCI – Law of Conservation of Information, is not about something that is conserved. It states that a quantity he calls Complex Specified Information (CSI) can either decrease or remain constant in what he calls "a closed system of natural causes." Dembski provides no definition of the "closed system of natural causes." I discussed this point in my previous review [2] of Dembski's work. However, at this time I am only discussing whether or not his LCI can indeed be interpreted as a conservation law, regardless of whether LCI is correct or not. In [1] Dembski spends many words to justify the inclusion of the word conservation into his LCI. All these words are pure casuistry because they cannot change the simple fact – his LCI is not about conservation of anything. CSI, according to LCI, can decrease, hence there is no reason to name LCI a law of conservation which it obviously is not.

Now let us discuss whether or not LCI is about information.

In several of his earlier publications Dembski discussed an example (based on the famous example originally suggested by Richard Dawkins) wherein he pointed out the difference between two strings of letters of equal length. One of these strings spells a phrase from Hamlet, "METHINKS IT IS LIKE A WEASEL," and the other is a string of gibberish of the same length (28 characters if space is counted among the characters).

Note that the 1st order entropy of the above meaningful quotation from Hamlet, according to classical information theory, is about 28 bits, whereas the 1st order entropy of a random string of 28 characters taken from the English alphabet which comprises 26 characters plus one more for space, is almost five times larger, i.e., about 135 bits. If, though, we include in the set of available symbols also numerals, commas, colons, periods, semi-colons, exclamation and question marks, mathematical symbols etc., the entropy of the string will be larger. Moreover, if we wish to account for the total entropy rather than for only the 1st order entropy (which will be discussed later in this article) the numbers will be larger by about 30 to 40%. Note also that the entropy of a random string and, hence, also the amount of information brought by that string to a receiver, is larger than it is for a meaningful string of the same length.
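The comparison of the two 28-character strings can be reproduced with a sketch. It assumes a uniform 27-symbol alphabet for the random string and takes the figure of about one bit per character for meaningful English from the text; the constant names are illustrative.

```python
import math

LENGTH = 28          # characters in the quoted phrase, spaces included
ALPHABET_SIZE = 27   # 26 letters plus space

# Random string: each character drawn uniformly, log2(27) bits per character.
random_bits = LENGTH * math.log2(ALPHABET_SIZE)

# Meaningful English text: about one bit per character, the figure used in the text.
BITS_PER_CHAR_ENGLISH = 1.0
meaningful_bits = LENGTH * BITS_PER_CHAR_ENGLISH

print(f"random: {random_bits:.1f} bits, meaningful: {meaningful_bits:.0f} bits, "
      f"ratio: {random_bits / meaningful_bits:.2f}")
```

With these assumptions the random string comes out at roughly 133 bits, close to the rounded figure of about 135 quoted above, and the ratio is a little under five.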

According to Dembski, the quotation from Hamlet must be attributed to design (of course, nobody would argue against this obvious conclusion) whereas the string of gibberish of the same length is attributed rather to chance. What is the reason, according to Dembski's criteria of design, for the above conclusions? In order to be attributed to design, asserts Dembski, the event (in this case the occurrence of the string of characters) must have low probability (which in this example is one in so many billions) and be specified (in this case to be a recognizable meaningful English phrase). Low probability, according to Dembski, is equivalent to large complexity (statements to this effect are found in many places in Dembski's books and papers). Hence, in the example in question, Dembski obviously finds CSI, his Complex Specified Information, in the quotation from Hamlet but not in the string of gibberish of the same length. What is actually the difference between the two strings? It is that one is a meaningful English phrase, i.e., it displays a recognizable pattern to those who possess at least some minimal knowledge of English, whereas the other string is meaningless, i.e., displays no recognizable pattern (although, unbeknownst to us, it may happen to be a meaningful text in some other language we are not familiar with). If we rely on the above example, it seems that what Dembski means by CSI is equivalent to the recognizable meaningfulness of the string. If we accept this interpretation of CSI, obviously this concept is not what is called information in information theory. The concept of information in that theory has nothing to do with the semantic contents of a message. The definition of information adopted by Dembski himself (formula 1 above) also has nothing to do with the meaningfulness of a string.
Hence, the interpretation of CSI, as it follows from Dembski's own example, shows that CSI, despite its name, is not information even in Dembski's own interpretation of the latter term. If we were to stick to Dembski's concept of CSI as rendered in his previous publications, we would have to conclude that CSI, despite the inclusion in it of the term information, is not information at all and therefore the LCI in its earlier form was no more about information than it was about conservation.

In his new book [1], however, Dembski discusses the CSI and LCI in terms sometimes rather different from those seemingly following from the example of two strings of characters.

Actually there seem to be several differing interpretations of CSI within the same new book by Dembski.

One of these interpretations is based on Dembski's discrimination between what he calls conceptual and physical information. On page 137 we find the following statement: "In practice, there are two sources of information – intelligent agency and physical processes." Immediately after that sentence Dembski tries to protect his flanks by diluting the strict meaning of the above statement. He writes: "This is not to say that these sources of information are mutually exclusive – human beings, for instance, are both intelligent agents and physical systems. Nor is this to say that these sources of information exhaust all logically possible sources of information -- it is conceivable that there could be nonphysical random processes that generate information."

The last quoted sentence leaves a reader confused about what actually is Dembski's position. The possibility of the existence of what he calls nonphysical random sources of information seems to negate his initial assertion about the two sources of information. Dembski proceeds, though, to distinguish between what he calls conceptual and physical information. On page 139 he offers definitions of these two kinds of information:

Conceptual Information: Intelligent agent S identifies a pattern and thereby conceptually reduces the reference class of possibilities.

Physical Information: Event E occurs and thereby reduces the reference class of possibilities.

These definitions allow for various interpretations, but it seems that conceptual information is used by Dembski as a more general concept than just the semantic contents of a string of characters. The meaning of the first definition seems to be that any recognizable pattern, if identifiable by an intelligent agent, would point to design. Obviously, though, if a string is a meaningful text in a language familiar to the intelligent agent, it will, according to Dembski's definition, carry conceptual information, although conceptual information is not limited to strings of characters. Therefore, if we deal with texts, the term conceptual information seems to coincide with the meaningfulness of that text. The term text can have a very wide interpretation. A multitude of systems can be encoded by a string of zeros and ones and hence be represented by a text. If that is so, then conceptual information is not what is named information in information theory.

Therefore Dembski's term conceptual information, despite the inclusion of the word information, is not information in the sense of information theory.

Apparently aware that his term conceptual information is often synonymous with the semantically meaningful contents of a message, Dembski tries to salvage that term as allegedly denoting a different concept by devoting a separate section (pages 145-146) to the discussion of what he calls semantic information. He asserts in that section that semantic information is not a part of CSI. However, the concepts of semantic information and conceptual information, although not fully synonymous, partly overlap. [Note that Dembski's discussion of semantic information seems to indicate that he is not familiar with the recent developments in the algorithmic theory of probability, which have crossed the boundaries of information theory and have made some promising progress toward distinguishing between noise and useful information (see for example [21]). Also, Dembski's assertion that what he calls semantic information does not submit to "mathematical and logical analysis" (page 147) is incorrect. It would be true if it related only to information theory. But information theory is just one part of science. Contrary to Dembski's assertion, the semantic contents of texts predetermine statistically discernible patterns. In particular, a method of statistical analysis of texts named Letter Serial Correlation, which enables one to distinguish between semantically meaningful texts and gibberish, was developed by Brendan McKay and the author of this review [22]].

Dembski uses the term conceptual information in his definition of Complex Specified Information (CSI), also referred to as Specified Complexity (SC). This definition is given on page 141:

Complex Specified Information: The coincidence of conceptual and physical information where the conceptual information is both identifiable independently of the physical information and also complex.

This definition of CSI comprises several points, each of which requires some deciphering. There is no need, though, for such deciphering at this time, since now we are only interested in an evaluation of whether CSI is defined in a way fitting its use in Dembski's Law of Conservation of Information.

The point of interest in this respect is that Dembski's concept of CSI as defined above incorporates, as an inseparable part, conceptual information, which is not information in the sense of information theory. Therefore CSI is not information either.

Hence, the acclaimed Law of Conservation of Information suggested by Dembski is neither about conservation nor about information.

With his typical inconsistency, in other sections of his book Dembski offers a different interpretation of LCI, which introduces a new element absent from his earlier discussions, although this new interpretation has its roots in his previous publications, where he suggested the so-called Universal Probability Bound of 10^-150.

According to the new interpretation of CSI given in [1], CSI is indeed information after all, but to qualify as CSI the amount of information must be not less than 500 bits.

The new formulation of LCI given in [1] seems to become a statement asserting that both stochastic processes and algorithms are capable of generating specified information up to 500 bits but no more than that. The threshold value of 500 bits is based on Dembski's "universal probability bound" of 10^-150.

To provide a feeling for what the seemingly minuscule probability bound of 10^-150 means, let us note that a random string of characters drawn from the English alphabet (not including numerals, punctuation marks and spaces) carries over 500 bits of information once its length exceeds 106 letters, since each letter of a 26-letter alphabet carries log2(26), or about 4.7, bits. A semantically meaningful English text carries over 500 bits if its length exceeds about 500 characters, which is about 1/6 of an average single-spaced typewritten page. (Actually these estimates are good only for the first-order entropy, which, however, constitutes the main portion of the total entropy of a text. The total entropy includes L-1 terms, where L is the text's length expressed in the number of characters. For example, the second-order entropy can be expressed by the same formula as formula (2), except that p_i denotes the probability of a digram, that is, of a combination of any two characters, rather than of an individual character, and the sum has to be divided by 2. If the total entropy is considered, the length of a randomized string of characters drawn from the English alphabet and carrying 500 bits is only about 60 letters. For a meaningful English text that number is close to 300 characters.)
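The arithmetic behind the 500-bit figure can be checked with a short sketch. It assumes the simplest possible model, in which each of 26 letters is drawn independently and with equal probability; the function names are hypothetical and serve only this illustration.

```python
import math

def bits_from_power_of_ten(exponent):
    """Information, in bits, of an event whose probability is 10**(-exponent)."""
    return exponent * math.log2(10)

def min_random_letters(threshold_bits, alphabet_size=26):
    """Smallest length of a uniformly random string carrying more than threshold_bits.

    Assumption: symbols are independent and equiprobable, so each one
    contributes log2(alphabet_size) bits.
    """
    bits_per_symbol = math.log2(alphabet_size)
    return math.floor(threshold_bits / bits_per_symbol) + 1

upb_bits = bits_from_power_of_ten(150)    # the bound 10^-150 expressed in bits
letters = min_random_letters(500)         # letters needed to pass 500 bits

print(f"10^-150 corresponds to about {upb_bits:.1f} bits")
print(f"a random letter string passes 500 bits at length {letters}")
```

Under these assumptions the bound of 10^-150 comes out slightly below 500 bits, and the required string length lands close to the figure quoted above. The estimate for meaningful text is not reproduced here, because it depends on an empirical entropy value for English that this sketch does not model.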

Dembski offers his new definition of the Law of Conservation of Information on page 160. This definition provides a good example of Dembski's style which is characterized by convoluted renditions of rather simple concepts. Here is the quotation from page 160:

"Law of Conservation of Information. Given an item of CSI, call it B=(T₂, E₂), for which E₂ arose by natural causes, any event E₁ causally upstream from E₂ that under the operation of natural causes is sufficient to produce E₂ belongs to an item of CSI, call it A(T₁,E₁), such that

(LCI_csi)   I(A&B) = I(A) mod UCB   (3)

where by definition the quantity of information in an item of specified information is the quantity of information in the conceptual component (i.e., I(A) =_def I(T₁) and

I(A&B) =_def I (T₁ & T₂))."

If an average reader is puzzled by the above definition, with its collection of constituent concepts piled upon each other, such a reader can be consoled that he is not alone. First note that the abbreviation UCB stands for "Universal Complexity Bound," which, Dembski explains, "throughout this book we take to be 500 bits of information." He also explains that the abbreviation "mod" stands for "modulo," which "refers to the wiggle room within which I(A) can differ from I(A&B)." Dembski elaborates by saying: "To say that these two quantities are equal modulo UCB is to say that they are essentially the same except for a difference no greater than UCB." (Note that this use of the term modulo differs from its standard use in mathematics.) Regarding the notation T, Dembski explains it on pages 141-142 as follows: "The event E ... is an outcome that occurred via some physical process. The target T... is a pattern identified by the intelligent agent S without recourse to the event E. ... both T and E denote events. The ordered pair (T,E) now constitutes specified information provided that the event E is included in the event T and provided that T can be identified independently of E (i.e., is detachable from E)."
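The difference between the two meanings of "modulo" can be shown in a few lines. The sketch below contrasts the standard congruence relation with the tolerance reading described in the quoted passage. The function names and the sample values (in bits) are hypothetical, chosen only for illustration.

```python
UCB_BITS = 500  # the "Universal Complexity Bound" as quoted in the text

def congruent_modulo(a, b, n):
    """Standard mathematical usage: a and b leave the same remainder on division by n."""
    return (a - b) % n == 0

def equal_within_tolerance(a, b, tolerance=UCB_BITS):
    """Usage described in the quoted passage: a and b differ by no more than the tolerance."""
    return abs(a - b) <= tolerance

# Two hypothetical pairs of information values.
close_pair = (1000, 1300)   # differ by 300 bits
far_pair = (1000, 2000)     # differ by 1000 bits, a multiple of 500

for a, b in (close_pair, far_pair):
    print(a, b, congruent_modulo(a, b, UCB_BITS), equal_within_tolerance(a, b))
```

The two relations disagree on both pairs: values 300 bits apart are "equal" in the tolerance sense but not congruent, while values 1000 bits apart are congruent modulo 500 but far outside the tolerance.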

Whereas the quoted passages illustrate Dembski's propensity for unnecessary quasi-scientific esoteric language, for an average reader who is still in the dark regarding what exactly Dembski's definition of LCI means, Dembski also provides a few more definitions in plain words. On pages 159-160 we read: "If a natural cause produces some event E₂ that exhibits specified complexity, then for every antecedent event E₁ that is causally upstream from E₂ and that under the operation of natural causes is sufficient to produce E₂, E₁ likewise exhibits specified complexity." Now, this is a bit simpler than the notation-laden definition quoted above. It lacks, though, that part of the full definition which is about the wiggle room. Actually Dembski permits the natural causes which produce a consequent event E₂ to add information to that already contained in an antecedent event E₁, but only if the additional information does not exceed 500 bits. On page 161 he says: "Because small amounts of specified information can be produced by chance, this 500-bit tolerance factor needs to be included in the Law of Conservation of Information." This seems to be a small step in the right direction on Dembski's part, because his earlier formulation did not allow for any wiggle room, as he asserted that "natural cause cannot generate CSI" without exception.

8. Can functions add information?

It is instructive to look at some passages in D-NFL which precede the definition of the LCI. On pages 151-154 we find a lengthy discussion of whether or not functions can add information. Perhaps it would be better to use the term "algorithm" instead of function, but function is the term chosen by Dembski. On page 152 we read: "Functional relationships at best preserve what information is already there, or else degrade it – they never add to it." Now jump over two pages. On page 154 we read: "I have just argued that when a function acts to yield information, what the function acts upon has at least as much information as what the function yields. This argument, however, treats functions as mere conduits of information, and does not take seriously the possibility that function might add information." Dembski proceeds with an example of a function which adds information and concludes the passage as follows: "Here we have a function that is adding information. Moreover, it is adding information because the information is embedded in the function itself."

Here is quintessential Dembski: within two pages he makes two important-sounding statements, the second of which completely negates the first. So which statement is correct: the one asserting that functions "never add" information, or the one asserting that some functions add information embedded in the functions themselves?

Dembski has a goal: to prove something he takes as true before even considering arguments for or against it, namely that CSI can be created only by an intelligent agent. Since his own example with functions seems to contradict this thesis, he resorts to mathematical-looking acrobatics in which simple facts are obscured by esoteric notation. At the end of page 154 and the beginning of page 155 he offers what can only be viewed as a quasi-mathematical trick aimed at allegedly reconciling his two irreconcilable statements.

To perform his trick Dembski introduces a new operator U which comprises both the initial information i (source information) and the initial function f (which can add information embedded in it to the initial information i). He insists that, unlike f, U does not add information. How the inclusion of f in a composite function U makes f lose its ability to add information is not explained. If f can add information embedded in it, no mathematical trick like making it a part of a composite function can eliminate its ability to add information, regardless of whether it does so as a stand-alone function or as a constituent of the composite function U.

One of his statements on page 155 is: "...distinction between functions and information is not hard and fast." The intrinsic meaning of that statement seems to be Dembski's secret. For anybody who is not Dembski's admirer, the concepts of information and of function are quite distinct. Having concluded that functions do not after all add information (a conclusion which is necessary to support his preconceived thesis), Dembski says: "Formula (*) confirms this as well." It is rather odd to hear from a mathematician that a formula confirms something. Formulas in themselves neither confirm nor negate anything. Any formula is just a statement made in mathematically symbolic form. Formulas, like any statements, are either postulated or derived. If a formula is postulated, it obviously does not confirm anything. If a formula has been derived, it was obtained via a certain logical procedure, in a mathematically compressed form, starting from a certain premise. Therefore a formula is only as good as its premise. A formula in itself cannot confirm anything beyond whatever was already assumed in the premise, although it may shed additional light on the premise. In particular, formula (*) (page 152) is as follows:

I(A&B) = I(A) + I(B|A)

This formula is a consequence of, first, the formula for the probability of two events A and B both actually occurring (when A and B are not independent events) and, second, of the definition of information as a negative logarithm of probability. It does not in any way confirm or negate Dembski's thesis, according to which CSI can only be created by intelligent agents, or even his narrower thesis that functions do not add information (the latter is actually rejected elsewhere by Dembski himself).
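That the formula is nothing more than the product rule rewritten in logarithmic form can be verified numerically. The following sketch uses hypothetical probabilities (exact powers of two, so the arithmetic is exact) and the definition of information as the negative base-2 logarithm of probability.

```python
import math

def info_bits(p):
    """Information associated with probability p, in bits: -log2(p), written as log2(1/p)."""
    return math.log2(1 / p)

# Hypothetical probabilities, chosen only for illustration.
p_a = 1 / 8            # P(A)
p_b_given_a = 1 / 4    # P(B|A)

# Product rule for events that are not independent: P(A&B) = P(A) * P(B|A).
p_a_and_b = p_a * p_b_given_a

lhs = info_bits(p_a_and_b)                      # I(A&B)
rhs = info_bits(p_a) + info_bits(p_b_given_a)   # I(A) + I(B|A)

print(lhs, rhs)
```

The two sides agree for any choice of probabilities, because the logarithm of a product is the sum of the logarithms. The identity holds whatever the events are, which is why it cannot by itself support or refute any claim about the origin of information.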

There is one more rather convincing indication of the fallacy of Dembski's LCI. I intended to provide this additional consideration myself, but I was too late: Richard Wein had already formulated it and shared it with me in a private communication. Wein's critique of D-NFL has now been made public [25]. Although Wein and I have very different backgrounds, education and experience, so that we approach Dembski's work from different vantage points, his argument against LCI in this specific case turned out to be quite close to what I had in mind but was too slow to develop. At my request, Richard kindly permitted me to use his argument in this paper.

It goes as follows. Wein first quotes the already quoted equation (*):

I(A&B) = I(A) + I(B|A)   (*)

In this equation, A and B denote events of which A is the cause of B and B the consequence of A. According to Dembski, since A entails B, the combination of events A and B carries no more information than was already carried by A alone, i.e.

I(B|A) = 0, or I(A&B)=I(A)

As mentioned before, equation (*) is a consequence of going from probabilities to information via a logarithmic transformation.

(I'd like to add to Wein's argument that to reconcile equation (*) with Dembski's definition of LCI, it would be necessary to assume not that

I(B|A) =0,

but rather that (using Dembski's own notation mod)

I(B|A)=0 mod UCB.

This is just one more example of Dembski's inconsistency.)

As Wein points out, I(B|A) in equation (*) "is not Dembski's specified information (SI)! The problem is that I(B|A) is just P(B|A) transformed, and P(B|A) is the true conditional probability of the event, which in this case is 1. SI, on the other hand, is based on the assumption of a uniform probability distribution, regardless of the true probability of the event."
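The gap between the two quantities in Wein's argument can be put in numbers. The sketch below is only an illustration under stated assumptions: the event B is taken to be one of 2^600 outcomes (a hypothetical figure), the cause A is taken to produce B with certainty, and the variable names are invented for this example.

```python
import math

def info_bits(p):
    """Information associated with probability p, in bits."""
    return math.log2(1 / p)

# Hypothetical size of the outcome space to which B belongs: 2**600 outcomes.
N_BITS = 600

# True conditional probability when A is sufficient to produce B.
p_b_given_a = 1.0
conditional_information = info_bits(p_b_given_a)    # the I(B|A) of equation (*)

# Probability assigned to B if a uniform distribution over all 2**600
# outcomes is assumed, ignoring the causal link between A and B.
p_b_uniform = 2.0 ** -N_BITS
uniform_information = info_bits(p_b_uniform)

print(conditional_information, uniform_information)
```

With these numbers the conditional term in equation (*) is zero, while the uniform-distribution figure is 600 bits, above the 500-bit bound. The equation constrains only the first quantity, so it says nothing about the second.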

I believe the combination of Wein's argument with my preceding argument effectively lays to rest any claims of legitimacy of Dembski's LCI.

Pages 152-154 are full of "notation-heavy prose" (using Dembski's own expression) which he allegedly tries to avoid (page xvii). This segment of his book is saturated with such terms as "homomorphism of Boolean algebras" and the like. All these piles of mathematical notations are irrelevant to his thesis. They serve no useful role except for impressing readers with the alleged sophistication of Dembski's discourse.

Note that in all of the reviewed discussion Dembski always refers to information rather than to Complex Specified Information. As discussed before, CSI is actually not information in the sense of information theory. Since Dembski's Law of Conservation of Information, despite its name, actually asserts something about CSI rather than about information, the whole discussion on pages 151–154 does not seem to be related to his subsequent discussion of LCI.

Dembski constantly switches between CSI and information without commenting on the switch. This makes all of his discourse on information and LCI inconsistent and rather confusing. To understand what exactly he means by this or that statement, the reader must constantly be on the alert, and it seems impossible to discuss Dembski's thesis in a consistent way. Overall, his conclusion, repeated many times throughout his book, that CSI cannot be created other than by an intelligent agent, remains utterly arbitrary.

In view of the above it can be asserted that, whichever of the several mutually contradictory interpretations of Dembski's LCI is chosen, this alleged law makes no sense. Dembski's suggestion that his LCI can be generalized as the Fourth Law of Thermodynamics has no basis in fact. His supposed Fourth Law would actually contradict the Second Law of Thermodynamics. The Second Law asserts that entropy in a closed system cannot decrease, while Dembski's LCI states that information in a closed system cannot increase. Since average information and entropy are tied together according to Dembski's own definition, his alleged new law cannot be taken seriously, although we can expect that from now on his cohorts will trumpet the alleged fundamental discovery of a new law of nature by the great mathematician and philosopher Dembski.

There are many more parts in Dembski's new book dealing with a variety of topics. For example, on pages 49-51 Dembski purports to correct alleged weaknesses in Fisher's statistical theory of hypothesis testing by generalizing it (without any reasonable substantiation for such a claim). Reviewing all of them would require much more time and effort than they deserve.

In his new book Dembski continues to adhere to an obviously incorrect idea that complexity is inextricably tied to low probability. This idea contradicts a variety of facts, as well as the much more reasonable definitions of Kolmogorov complexity [23] and computational complexity. I have previously offered examples illustrating that it is simplicity rather than complexity which points to low probability [as in the case of irregularly shaped (i.e., complex) pebbles vs a perfectly spherical (i.e., quite simple in shape) piece of stone (see [2] and [4])]. Another example showing that high complexity does not necessarily mean low probability was discussed above in the section on EF, where the case of flat triangular ice crystals was reviewed.
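A rough computational illustration of how complexity and probability come apart is sketched below. It rests on two assumptions that are not part of the argument above: compressed length (via Python's standard zlib module) is used as a crude, computable stand-in for Kolmogorov complexity, which it can only bound from above, and both strings are assumed to come from a model in which every letter is equally probable.

```python
import math
import random
import zlib

LENGTH = 10_000
LETTERS = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
rng = random.Random(0)   # fixed seed so the run is repeatable

# A highly regular string and an irregular one of the same length.
regular = b"A" * LENGTH
irregular = bytes(rng.choice(LETTERS) for _ in range(LENGTH))

# Under the assumed uniform model, any specific string of this length has
# the same probability, 26**(-LENGTH); its logarithm is computed here.
log2_probability = -LENGTH * math.log2(len(LETTERS))

# Compressed size as a proxy for descriptive complexity.
size_regular = len(zlib.compress(regular, 9))
size_irregular = len(zlib.compress(irregular, 9))

print(f"log2 probability of either string: {log2_probability:.0f}")
print(f"compressed size, regular string:   {size_regular} bytes")
print(f"compressed size, irregular string: {size_irregular} bytes")
```

Both strings are equally improbable under this model, yet one compresses to a tiny fraction of the other. The sketch does not reproduce the pebble or ice-crystal examples; it shows only that an equal probability is compatible with very different complexity.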

9. Are NFL theorems relevant for Dembski's thesis?

It is time to say a few words about the title of Dembski's new book. It has been borrowed from the name of a set of mathematical theorems proven a few years ago by David Wolpert and William Macready (NFL theorems).

As far as the purely mathematical essence of the NFL theorems is concerned, these theorems are proven beyond doubt. However, every mathematical theorem, however logically impeccable, is true only to the extent that the premise accepted for its derivation is fulfilled. The NFL theorems are applicable only when certain conditions are fulfilled which constitute a part of the premise on which the proof of these theorems was based. For example, the NFL theorems are applicable only to the so-called "black box" algorithms. There are certain other conditions which limit the area of applicability of these fine mathematical results. There are situations in which the NFL theorems are either inapplicable or at least require an investigation of their applicability. One such case is that of biological evolutionary algorithms. Before trying to apply the NFL theorems to his theory of the solely intelligent origin of CSI, Dembski should have performed a detailed analysis to find out whether or not the NFL theorems can be legitimately applied to his case. He did not do that, simply assuming that the NFL theorems work for biological evolutionary algorithms. Dembski thus applied these theorems to a case where their use is plainly unjustified.

A critique of Dembski's use of the NFL theorems has been offered by several authors (for example in [24, 25]). Recently David Wolpert, one of the co-authors of the NFL theorems, wrote a brief review [27] of Dembski's NFL book in which he dismissed Dembski's discourse as mathematically vague (in Wolpert's terms, "written in jello").

I offer a detailed analysis of Dembski's misuse of the NFL theorems in [32]. Here I present a brief exposition of the main point of my critique.

The NFL theorems assert that any two search algorithms "perform" equally well if their "performance" is averaged over all possible "fitness functions." From that Dembski concludes that no algorithm can outperform a random sampling (or "blind search"). Since a random sampling, in order to produce complex organisms from a much simpler progenitor, needs an enormous number of trials and therefore an enormously long period of time, then, if we accept Dembski's conclusion from the NFL theorems, no evolutionary algorithm can succeed in producing complex organisms within the period of time available for evolution. Hence, concludes Dembski triumphantly, the NFL theorems prove the impossibility of Darwinian evolution.

Without arguing about the reliability of Dembski's final triumphant conclusion about Darwinian evolution, I can categorically assert that such a conclusion does not at all follow from the NFL theorems.
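The averaging at the heart of the NFL theorems can be made concrete on a toy problem small enough to enumerate completely. The sketch below is an illustration under assumptions of its own, not a rendering of Wolpert and Macready's proof: the search space has four points and three fitness values, "performance" is taken to be the best value found after a given number of evaluations, and both search rules are hypothetical.

```python
from itertools import product

DOMAIN = range(4)    # four candidate points (toy search space)
VALUES = range(3)    # three possible fitness values: 0, 1, 2

def fixed_order_search(f, steps):
    """Evaluate points 0, 1, 2, ... in a fixed order; return the best value seen."""
    return max(f[x] for x in range(steps))

def adaptive_search(f, steps):
    """Start at point 0, then pick each next point based on the last value seen."""
    visited = [0]
    while len(visited) < steps:
        unvisited = [x for x in DOMAIN if x not in visited]
        # Hypothetical rule: after a top value try the nearest unvisited
        # point, otherwise jump to the farthest one.
        if f[visited[-1]] == max(VALUES):
            visited.append(unvisited[0])
        else:
            visited.append(unvisited[-1])
    return max(f[x] for x in visited)

def total_performance(search, steps):
    """Sum of the best value found, taken over every possible fitness function."""
    all_functions = product(VALUES, repeat=len(DOMAIN))   # 3**4 = 81 functions
    return sum(search(f, steps) for f in all_functions)

for steps in (1, 2, 3, 4):
    print(steps, total_performance(fixed_order_search, steps),
          total_performance(adaptive_search, steps))

# On one particular fitness function the two rules behave differently.
needle = (0, 0, 0, 2)
print(fixed_order_search(needle, 2), adaptive_search(needle, 2))
```

Summed over all 81 fitness functions the two rules score identically for every number of steps, which is what the theorems state. On the single function in the last lines they do not, and nothing in the averaged result forbids that.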

As mentioned above, the NFL theorems relate only to algorithms' performance averaged over all possible fitness functions. These theorems say nothing about algorithms' relative performance on specific classes of fitness functions. In fact, various algorithms perform very differently on specific fitness landscapes, and the NFL theorems in no way prohibit this. In Dembski's book there are many examples of such situations, which he strangely seems not to perceive as contrary to his thesis. For example, Dawkins's algorithm generating a phrase from Shakespeare reaches its target in only about 40 iterations. A random sampling would, as Dembski himself points out, take about 10^40 iterations. This is clear outperformance! The same is observed in other examples Dembski refers to: the search for an optimal antenna shape, a checkers-playing algorithm, etc.
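A cumulative-selection search of the kind attributed to Dawkins above can be sketched in a few lines. The target phrase is the well-known one from Dawkins's example. Everything else (brood size, mutation rate, the rule of keeping the parent in the running, the random seed) is an assumption made for this illustration, so the number of generations it takes should not be read as a reproduction of the figure of about 40 iterations.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "   # 26 letters plus the space: 27 symbols

def score(candidate):
    """Number of positions at which the candidate matches the target."""
    return sum(c == t for c, t in zip(candidate, TARGET))

def mutate(parent, rate, rng):
    """Copy the parent, replacing each character by a random symbol with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else c
                   for c in parent)

def weasel(offspring=100, rate=0.05, seed=0, max_generations=10_000):
    """Cumulative selection: each generation keeps the best of parent and brood."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in TARGET)
    generations = 0
    while parent != TARGET and generations < max_generations:
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        parent = max(brood + [parent], key=score)   # parent kept, so the score never drops
        generations += 1
    return parent, generations

final, generations = weasel()
blind_search_space = len(ALPHABET) ** len(TARGET)   # 27**28 candidate strings

print(final)
print("generations used:", generations)
print(f"size of the space a blind search samples from: {blind_search_space:.1e}")
```

The size of the space, 27^28, is on the order of 10^40, which matches the figure for random sampling quoted above. The cumulative search needs a number of generations that is smaller by dozens of orders of magnitude, which illustrates the difference in performance on this particular fitness function.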

Therefore Dembski's attempt to utilize the fine mathematical result of Wolpert and Macready, their NFL theorems, is unsubstantiated.

Overall, Dembski's new book is a hodge-podge of unsubstantiated but quite pretentious claims and unnecessary quasi-mathematical exercises serving no useful purpose, and it displays many features of pseudo-science so eloquently described by Gardner [14].

Finally, I would like to say again that the discussion of Dembski's work in this article, as well as in [2], addresses a reader who has no special training in information theory, mathematical statistics and related disciplines. A more detailed discussion, which can be comprehended only by readers with some mathematical background, is given in several publications, among which I recommend articles by Richard Wein [25, 28], John Wilkins and Wesley Elsberry [30], and Wesley Elsberry and Jeffrey Shallit [31]. Except for the difference in the targeted audience and thus in the level of mathematical sophistication of the discourse, I share the views of these authors in their critique of Dembski's literary production.

10. Acknowledgments

I am indebted to Brendan McKay, Matt Young, and Richard Wein for useful comments and to Wein and Jeffrey Shallit for the permission to use some of their material before it was made public.

11. Appendix

(In May 2002, Dembski posted a lengthy response [26] to Wein's paper [25]. It was replete with irrelevant ad hominem remarks. Dembski often used the word rubbish to characterize Wein's arguments, but otherwise it was mostly a repetition of the material Dembski had offered in his earlier publications. It failed to refute Wein's critique. Wein nevertheless decided to respond [28] to Dembski's failed refutation. Wein's excellent response speaks for itself, showing the complete absence of substance in Dembski's piece [26]. In June 2002, Dembski published one more response [29], this time to Wein's article [28]. Dembski's new rebuttal [29] is a remarkable document. It displays Dembski's enormous ego and arrogance. Again, it is replete with insulting personal remarks, references to Nobel laureates who all love Dembski, and scoffing advice to Wein as to what the latter's behavior should be. Leo Tolstoy wrote that the actual value of a human being is a fraction wherein the numerator is that person's talents and the denominator is that person's opinion of himself. If the denominator is very large, the fraction approaches zero. In Dembski's case, the numerator may be reasonably large, but the denominator is enormous.) What a waste!

12. References

1. William A. Dembski. No Free Lunch. Why Specified Complexity Cannot Be Purchased without Intelligence. Rowman & Littlefield Publishers, 2002.

2. M. Perakh, A Consistent Inconsistency.

3. W. Dembski, The Design Inference, Cambridge University Press, 1998.

4. Michael J. Behe, Darwin's Black Box, The Biochemical Challenge to Evolution, Simon and Schuster, 1996.

5. M. Perakh, Irreducible Contradiction.

6. M. Perakh, Science In the Eyes Of a Scientist.

7. John H. McDonald, A reducibly complex mousetrap.

8. W. Dembski, in coll. Science and Evidence for Design in Universe, Ignatius Press, 2000.

9. Russell Doolittle, in Boston Review, February-March 1997.

10. T.H. Bugge et al, Cell, 87, 709-719, 1996.

11. W. Tape, Atmospheric Halos. Antarctic Research Series, v. 69. American Geophysical Union, 1994.

12. W.A. Bentley and W. J. Humphreys, Snow Crystals (Dover, 1962).

13. U. Nakaya, Snow Crystals: Natural and Artificial, Harvard University Press, 1954.

14. Martin Gardner, Fads and Fallacies in the Name of Science, Dover Publications, 1957 (originally the book was published by G.P. Putnam's Sons in 1952 under the title In the Name of Science).

15. Victor J. Stenger, Messages from Heaven.

16. Matt Young, How to Evolve Specified Complexity by Natural Means.

17. Claude E. Shannon, A Mathematical Theory of Communication, Bell System Tech. J., July 1948 and October 1948.

18. Richard E. Blahut, Principles and Practice of Information Theory, Addison-Wesley Publishing Co., 1990.

19. W. Dembski, Intelligent Design, The Bridge Between Science & Theology, InterVarsity Press, 1999.

20. L. Landau and E. Lifshitz, Statistical Physics (Moscow: Gosfizmatizdat, 1971). In Russian (an English translation is available).

21. Paul Vitanyi, Meaningful information, in Front for the Mathematics ArXiv.

22. Mark Perakh and Brendan McKay, Study of Certain Statistical Properties of Meaningful Texts as Compared to Randomized Conglomerates of Letters.

23. A. N. Kolmogorov, Three Approaches to the Quantitative Definition of Information, in Problemy Peredachi Informatsii (in Russian), 1(1), 1965. An English translation was published under the same title.

24. Jeffrey Shallit (University of Waterloo, Ontario, Canada). Private communication.

25. Richard Wein, Not a Free Lunch But a Box of Chocolates, see also on this site.

26. William A. Dembski, Obsessively Criticized but Scarcely Refuted: A Response to Richard Wein.

27. David Wolpert (Santa Fe Institute), William Dembski's treatment of the No Free Lunch theorems is written in jello.

28. Richard Wein, Response? What Response?.

29. William A. Dembski, ARN Discussion Forum: Dembski Responds to Wein's Response.

30. John S. Wilkins and Wesley R. Elsberry, The Advantages of Theft over Toil: The Design Inference and Arguing from Ignorance, Biology and Philosophy, 16 (2001): 711

31. Wesley Elsberry and Jeffrey Shallit, Information Theory, Evolutionary Computation, and Dembski's "Complex Specified Information." A preprint made available through a private communication.

32. Mark Perakh, There Is a Free Lunch After All. (A chapter in the collection to be published, editors Matt Young and Taner Edis).

* * *

Location of this article: http://talkreason.org/articles/dem_nfl.cfm