Software Mythbuster

I just read the book The Leprechauns of Software Engineering by Laurent Bossavit. It talks about a few common “facts” of software development that are not facts at all. They are merely anecdotal. Unfortunately, they are heavily used to support other claims.

These facts originate in some older papers that are referenced from other papers, which are referenced from other papers, which are… I guess you get the idea.

Through all this referencing, the original information changed and gained an authority that is not justified.

Laurent tracked down the primary sources (listing the references) and their hypotheses, showing that our “facts” are based on very limited data that magically gets generalized, and that some papers used obscure metrics which compare apples and oranges.

The “facts” he dissects in detail are the Cone of Uncertainty, the 10x variation in software developers, and the cost of defects.

While I never took these “facts” and their pictures as “real facts” down to the numbers (more as some kind of trend), it is still surprising that their supporting data is so weak. We do not really know whether the trends they describe are real. There is too little data to “verify” their claims, even if they feel right.

Definitely an interesting read!

Escaping Fun: replaceAll ("\\\\", "\\\\\\\\")!

There is a small escaping bug in cucumber-jvm. The Java-generated Groovy step snippets do not properly escape the escape character \ in the step's regular expression.

Currently it generates:

Given(~'^I have (\d+) cukes in my "([^"]*)" belly') { int arg1 ->
    // Express the Regexp above with the code you wish you had
    throw new PendingException()
}

which should be:

Given(~'^I have (\\d+) cukes in my "([^"]*)" belly$') { int arg1 ->
    // Express the Regexp above with the code you wish you had
    throw new PendingException()
}

Because Cucumber generates code snippets, we have to escape the escape character in the snippet output too, i.e. the (\\d+). I have modified the Groovy snippet generation before, so it should be an easy fix. Or so I thought. ;-)

It was not a big issue, but it took me longer than expected to understand, because escaping the escape character with replaceAll() is a bit confusing at first.

Escaping the escape character (\) gets interesting if it is a regular expression: it needs to be escaped again. All this endless escaping turns into this stupid piece of code:

public String escapePattern(String pattern) {
    return pattern.replaceAll ("\\\\", "\\\\\\\\");
}

The method above gets a regular expression string for a step as input (as I see it in the debugger):

"^I have (\\d+) cukes in my \"([^\"]*)\" belly$"

Actually the real string is just:

^I have (\d+) cukes in my "([^"]*)" belly$

And what we would like to see as the final regular expression is:

^I have (\\d+) cukes in my "([^"]*)" belly$

We just want to replace \ with \\.

replaceAll takes a regular expression pattern (String) as the first parameter so we have to escape the \ twice to match it:

  • \ => \\ because \ is the escape character for regular expressions
  • \\ => \\\\ because \ is the escape character for Strings

\ is also a special character in the replacement (second) parameter of replaceAll, so we have to escape \ twice again:

  • \\ => \\\\ because \ is the escape character in the replacement parameter
  • \\\\ => \\\\\\\\ because \ is the escape character for Strings

Which finally leads to this stupid line: pattern.replaceAll ("\\\\", "\\\\\\\\"); !
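To make the two escaping levels visible, here is a minimal sketch that prints what the Java literals really contain and runs the replaceAll call on the step pattern from above. The class name EscapeLevels is only for illustration and is not part of cucumber-jvm.

```java
class EscapeLevels {

    // Same call as in the post: regex-based replacement of \ with \\.
    static String escapePattern(String pattern) {
        return pattern.replaceAll("\\\\", "\\\\\\\\");
    }

    public static void main(String[] args) {
        // The Java literal "\\\\" is the two-character string \\ .
        // Read as a regex, \\ matches one literal backslash.
        System.out.println("\\\\".length());      // 2

        // The Java literal "\\\\\\\\" is the four-character string \\\\ .
        // Read as a replacement, \\\\ emits two literal backslashes.
        System.out.println("\\\\\\\\".length());  // 4

        String step = "^I have (\\d+) cukes in my \"([^\"]*)\" belly$";
        System.out.println(step);                 // ^I have (\d+) cukes in my "([^"]*)" belly$
        System.out.println(escapePattern(step));  // ^I have (\\d+) cukes in my "([^"]*)" belly$
    }
}
```

Counting the characters that survive each level (Java compiler first, regex engine second) is usually the quickest way to check such a line.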

The replaceAll call can be simplified by using String.replace (CharSequence target, CharSequence replacement) (available since Java 1.5). It does not use regular expressions, which allows us to drop one level of escaping:

pattern.replace ("\\", "\\\\");

This is a lot easier to understand, and it is also the final solution for the pull request :-)
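As a quick check, the following sketch compares the literal replace with a regex-based variant that lets the JDK add the extra escaping level through Pattern.quote and Matcher.quoteReplacement. The class and method names are made up for this example; the pull request itself only uses replace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplace {

    // Literal replacement, as in the final fix: no regex involved.
    static String withReplace(String pattern) {
        return pattern.replace("\\", "\\\\");
    }

    // Regex-based alternative: the JDK quotes the target and the replacement,
    // so only the String-level escaping has to be written by hand.
    static String withQuoting(String pattern) {
        return pattern.replaceAll(Pattern.quote("\\"), Matcher.quoteReplacement("\\\\"));
    }

    public static void main(String[] args) {
        String step = "^I have (\\d+) cukes in my \"([^\"]*)\" belly$";
        System.out.println(withReplace(step));  // ^I have (\\d+) cukes in my "([^"]*)" belly$
        System.out.println(withQuoting(step));  // ^I have (\\d+) cukes in my "([^"]*)" belly$
    }
}
```

Both variants should produce the same string; the quoting helpers are mainly useful when the target or replacement is not a fixed literal and replaceAll cannot be avoided.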