And the winners are…


Yanick Champoux, whose code completed the task in a mere 234 seconds (3 minutes, 54 seconds), wins the speed race. Tommy’s unofficial entry takes 2nd place, trailing by 40 seconds, followed closely by the reference design and Tim King.


Joel Berger wins for taking only 130 MB of memory, but he traded that off for an almost 9-minute run time. Notably, Yanick’s high-speed solution took only about 300 MB and came in second among entries that produced the correct output. Reini Urban’s entry didn’t produce the correct output, but scored an impossibly low 10 KB, which we believe can only be accounted for by a measurement error (even though it is repeatable). A trivial “hello world” Perl program will consume ~20 MB, and Reini had some self-reported figures around 30 MB, which we are inclined to believe are accurate and consider an accomplishment.


Joel Berger once again wins for accomplishing the correct solution in only 130 lines of original code. Hats off to Bruce Gray for an impressively short 6-line Perl 6 entry. It didn’t produce the correct output and did not run in the allotted time, but it captured the spirit of a basic duplicate finder. Yanick also takes second place here among the correctly working entries, with 266 lines of code.


Joel Berger also had the lowest Perl::Critic score of the working entries, with 104 violations reported. Joakim Lagerqvist, with a little more work to fix his output, could have easily won with a Perl::Critic score of only 2. Yanick once more takes second place with a Perl::Critic score of 266. Reini Urban, with a not-quite-working solution, also did quite well in this category, while Tommy and Tim both aimed for more modern frameworks (Moose and MOP) that don’t play nicely with Perl::Critic.

Best documentation

Tim King takes the win for best documentation. He had at least boilerplate POD for all 7 of his modules, and full POD with descriptions, synopses, and method definitions for 3 of the modules, plus full POD for the main script and a separate file documenting the theory of operation (architecture). Additionally, his code was well commented. Yanick came in second with good POD coverage and moderate code comments. Of the entries with incorrect output, Joakim Lagerqvist did a good job with POD in his script and main module, an architecture description, and decent code comments.

Most features

Tim King’s entry offered the most features, supporting 12 command line options, of which 8 were deemed to be useful for an end user. This code supported an optional progress indicator, the ability to use a config file, selectable deduping algorithms, and the ability to dedupe multiple paths. Joel and Yanick both offered fairly few command line options, while Joakim Lagerqvist and Reini Urban both had a decent count of 6 useful options, some of which allowed you to tune the deduping to be optimal for a given file system. (Arguably good for development, but less useful for end users.) Bruce Gray’s solution was more of a prototype with no options, and even hard-coded the path to dedupe.

Best effort

For this category we used a count of non-trivial commits submitted to GitHub as a (far from perfect) proxy for the amount of effort contestants put into their entries. Tim King led the pack with 41 commits, followed by Yanick Champoux with 26. It’s almost its own accomplishment that Joel Berger managed to create a working solution that won in several categories in only 6 commits. Reini Urban had a good showing with 20 commits, and Joakim Lagerqvist with 9. Bruce Gray eschewed using a VCS, but with an entry of merely 6 lines, we don’t blame him for that choice.

Packaged Application

Rather than declaring a winner in this category, our rules stated that anyone who took the time to create a packaged application would get recognition. Joakim Lagerqvist, Reini Urban, and Tim King all provided packaged, easy-to-install applications, with the latter two also including unit tests.