Perl Traps for Python Programmers

Introduction

This page will no longer be updated, since I managed to convince my boss that we would be relieved of a lot of debugging problems by using Python instead. The notes below are still relevant, though.
This will be filled with examples of simple traps (and simple complexity) in Perl when coming from Python and other programming languages.
I have been programming in Perl for about half a year. I started working on this page when I was an absolute beginner in Perl, but not in programming. This page will reflect my experiences with Perl as I fell into the traps. A bunch of minor Perl annoyances are mentioned too. Perl will mostly be compared to Python which I have programmed in since 2000 and which hardly gave any surprises when I learned it.
Please note: I do not mind that complex programs can be difficult to understand from a quick glance, but I do hate simple programs with no easy explanations. Most of the code snippets below will fall into that category.
I have talked to other programmers with a lot more experience (in both Perl and other languages) about these traps, and plausible explanations are hard to find. What I find really worrying is that these simple pieces of Perl code can baffle programmers who use Perl daily. It seems that the main problem with Perl is its extreme flexibility, "There's More Than One Way To Do It", which is also Perl's greatest asset. I very much prefer the simple and predictable behaviour of Python. Interestingly, several Perl style guides urge the programmer to leave out a lot of Perl's flexibility in the interest of readability and maintainability.

The code is tested on the following versions under Windows and RedHat Linux: (ActiveState) Perl 5.8 and Python 2.3

Simple traps

Implicit variables

There are a couple of special variables that are often used as shortcuts to allow for very short and clean code. The variable $_ is used everywhere. Here are two simple ways of printing an array. Be careful not to confuse $_ with @_ when using elements in arrays: it won't give you a syntax error, just no output.

@my_array = ('a', 'b', 'c');
foreach (@my_array) { print; }         # clean syntax
foreach (@my_array) {                  # to show $_
	print $_; 
}
foreach (@my_array) {                  # will not print anything
	print @_; 
}

Simple functions: Implicit parameters

Functions in Perl are deceptively simple because they rely on implicit information in special variables - both as parameters and as return values. The problem with "implicit" is that the information is hidden from plain sight in the source code. The following examples show this.

sub print_implicit_param {
	($_) = @_;               # @_ contains parameters, 
	print;                   # $_ is often the current value
}

sub print_leftover {
	print , "\n";            # This is not a syntax error, but the
	                         # "\n" has no effect. It still prints $_.
}

print_implicit_param('XXX');       # will print XXX
print_leftover( ('AAA', 'BBB') );  # will print XXX again
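
For comparison, here is a minimal Python sketch of the same two calls. It assumes Python 3 (this page was tested with Python 2.3), and the function names are made up for illustration. The closest thing Python has to @_ is *args, and there is no hidden current value for a function to fall back on.

```python
def print_param(*args):
    # *args collects every positional argument into a tuple, much like @_.
    (value,) = args      # explicit unpacking: ValueError unless exactly one was passed
    print(value)
    return value


def print_leftover(*args):
    # There is no hidden "current value": the function sees only its own arguments.
    return args


print_param('XXX')                      # prints XXX
print(print_leftover('AAA', 'BBB'))     # prints ('AAA', 'BBB'), not XXX
```

Passing the wrong number of arguments to print_param fails loudly instead of silently printing something left over from an earlier call.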

Simple functions: Implicit return values

Functions return the value of the last evaluated expression even with no explicit return. The following code will print: Sum: 8.

sub sum {
	($add1, $add2) = @_;
	$add1 + $add2;
}

print 'Sum: ' . sum(4, 4) . "\n";

The return value automagically returns to the caller. Of course this code will break when another programmer tries to figure out what the value is before the function returns.

sub sum {
	($add1, $add2) = @_;
	$sum = $add1 + $add2;
	print $sum;              # prints 8
}

print 'Sum: ' . sum(4, 4) . "\n"; # prints: Sum: 1

This will naturally print the value 1 - the value returned by print. A surprising thing for a Python programmer is that the parentheses in the parameter unpacking line are not optional. The following function will print the value 2 (the length of the parameter list).

sub sum {
	$add1, $add2 = @_;     # lacking ()
	$sum = $add1 + $add2;
	print $sum;            # prints 2
}

print 'Sum: ' . sum(4, 4) . "\n"; # prints: Sum: 1

The most annoying part is the silent handling of this bug:

sub sum {
	$add1, $add2 = @_;
	$sum = add1 + add2; # lacking $
	print $sum;         # prints 0
}

print 'Sum: ' . sum(4, 4) . "\n";   # prints: Sum: 1

This typical Pythonesque typo does not result in a syntax error, but prints 0: the unquoted barewords add1 and add2 are treated as strings, and strings that do not look like numbers count as 0 when added (string concatenation uses the . operator, not + as in many other languages).
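
To show what I am used to, here is a small Python sketch of the same mistakes. It assumes Python 3, and the function names (add_ok, add_no_return, add_typo) are hypothetical. A missing return gives None rather than the last evaluated value, and a misspelled name raises NameError when the line runs.

```python
def add_ok(add1, add2):
    # Parameters are named explicitly and the result is returned explicitly.
    return add1 + add2


def add_no_return(add1, add2):
    # Without a return statement the function returns None, never the last value.
    total = add1 + add2


def add_typo(add1, add2):
    # 'ad2' is a deliberate typo: Python raises NameError when this line runs.
    return add1 + ad2


print("Sum:", add_ok(4, 4))           # Sum: 8
print("Sum:", add_no_return(4, 4))    # Sum: None

try:
    add_typo(4, 4)
except NameError as exc:
    print("NameError:", exc)
```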

Optional or non-optional parentheses in function calls and elsewhere

The lacking parentheses in the sum example above are confusing due to two things: 1) in Python, the parentheses in the same context are optional; 2) in Perl, lots of other parentheses are optional. The built-in function time() does not take any parameters and can therefore be called without parentheses:

$time = time;
print $time;    #print current time
print($time);

Sometimes you still need to print using parentheses - see example below.
This feature combined with optional quoting of barewords can give some interesting interpretations - for example in hashes (same as Python dictionaries).

%hash = (
    info, 'Connection established',    # unquoted word
    time, '19:48',                     # function call
    date, '2004-05-12'                 # unquoted word
);

Good luck retrieving your values (without iterating over the keys). The problem gets bigger the more modules are used and the more functions are imported into the module's namespace. Perl does offer an alternative to the comma in this context: the => operator, which automatically quotes an unquoted key string. To call a function you then need to type the parentheses, as in:

%hash = (
    info   => 'Connection established',    # unquoted word
    time() => '19:48',                     # function call
    date   => '2004-05-12'                 # unquoted word
);

The easiest thing to do is actually to just type everything out explicitly, as one is forced to in a lot of other languages. The syntactic flexibility actually hinders software development in this case.
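
Typing everything out is what Python forces on you anyway. A minimal sketch, assuming Python 3; the variable names (entry, stamped, broken) are made up for illustration:

```python
import time

entry = {
    "info": "Connection established",
    "time": "19:48",        # a plain string key; no function is called
    "date": "2004-05-12",
}
print(entry["time"])        # 19:48

# Using a function's result as a key has to be written as an explicit call.
stamped = {time.time(): "19:48"}

# An unquoted word is a variable lookup, so an undefined one fails immediately.
try:
    broken = {info: "Connection established"}
except NameError as exc:
    print("NameError:", exc)
```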

Difficult to distinguish operators

Perl has a lot of operators; some of them are obvious, easy to remember and easy to spot, and some are not. Here the only difference between the two lines is a dot versus a comma:

@my_array = ('a', 'b', 'c');
print @my_array . "\n";       # prints 3
print @my_array , "\n";       # prints abc

Non-polymorphic operators

One of the reasons Perl has so many operators is that similar operations on different datatypes require different operators. Numbers are added with +, strings are concatenated with . (dot). Numbers are compared with <, >, and ==; strings with 'lt', 'gt', and 'eq'. But in some cases strings will be converted to integers, which allows more flexibility and confusion. See below.
We also have || and 'or', && and 'and'. The differences between these are not cosmetic: they do have significance and even different precedence. The problem is: which operator is used for which datatypes, and what are the precedence rules in more complex cases? Combine that with the optional parentheses and...

Sometimes even using the wrong operator for your datatypes won't give you an error message, just a wrong value.

Even very simple cases can be surprising:

print 2 == 2;        # prints 1
print "\n"; 
print ('2' eq '2');  # prints 1
print "\n"; 
print ('3' gt 2);    # prints 1
print "\n"; 
print ('3' lt 2);    # prints nothing, not even zero
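
For comparison, a rough Python sketch of the same comparisons. It assumes Python 3, where comparing a string with a number raises TypeError (Python 2 did not raise here), and compare_mixed is a hypothetical helper written only for this example:

```python
def compare_mixed(left, right):
    """Return left > right, or the exception name if Python refuses to compare."""
    try:
        return left > right
    except TypeError as exc:
        return type(exc).__name__


print(2 == 2)                  # True
print('2' == '2')              # True
print('10' < '9')              # True: strings compare character by character
print(10 < 9)                  # False
print(compare_mixed('3', 2))   # TypeError
```

The same operators work for both datatypes, and a false result prints as False rather than as nothing.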

Locale/Unicode traps

When using a utf-8 locale (such as da_DK.utf-8 under Linux), the ord function gives you the value of the first byte, not the value of the first character:

ord('ø'); # all these return 0xc3 (195) under utf-8
ord('æ');
ord('å'); 

These all return the same byte, 195 (0xc3), when running under utf-8. In contrast, the Python ord function will raise an exception complaining about the length of the string; the string must be one byte long. I guess the problem in Perl is that ord() takes a string of any length and only returns the first byte.
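
The shared first byte is easy to see from Python. This is a sketch assuming Python 3, where strings hold characters rather than bytes (unlike the Python 2.3 used for this page); first_utf8_byte is a hypothetical helper name.

```python
def first_utf8_byte(char):
    # Encode one character as UTF-8 and return the value of its first byte.
    return char.encode("utf-8")[0]


# The letters ø, æ and å, written as escapes so the file encoding does not matter.
for char in "\u00f8\u00e6\u00e5":
    print(hex(ord(char)), hex(first_utf8_byte(char)))
# 0xf8 0xc3
# 0xe6 0xc3
# 0xe5 0xc3

# ord() refuses anything that is not exactly one character long.
try:
    ord("\u00f8\u00e6")
except TypeError as exc:
    print("TypeError:", exc)
```

All three letters are encoded as two bytes starting with 0xc3, which is the value the Perl snippet above reports.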

String handling

The print function is very flexible in Perl, especially when combined with easy-to-use string interpolation. I find that this is also a huge weakness. The following pieces should show problems with varying kinds of string interpolation; here I would expect the text "first::last" to be printed. What exactly is it that interpolation of :: does? (:: is Perl's package separator, so $first:: is parsed as a package-qualified variable name, which is why the value of $first seems to vanish.)
Take a look at perldoc for more scary examples.

$first = 'first';
$last = 'last';

print $first . '::' . $last . "\n";      # OK: first::last
print "$first" . '::' . "$last" . "\n";  # OK: first::last
print "$first" . "::" . "$last" . "\n";  # OK: first::last
print "$first::" . "$last" . "\n";       # Uh-oh: last, -w gives: use of uninitialised value
print "$first::$last\n";                 # Uh-oh: last, -w gives: use of uninitialised value
print "$first" . '::' . "$last\n";       # OK: first::last
print "$first:\:$last\n";                # OK: first::last
print "$first\:\:$last\n";               # OK: first::last

By the way, I prefer Python's way of handling string printing with automatic newlines - especially since Perl only interpolates in double quotes ("), not in single quotes ('), so a: print '$first:\n' . "$first"; does not give the newline. It is only a minor problem, but one does get to type "\n" a lot of times during a day.
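
For comparison, a small Python sketch of building the same string. It assumes Python 3.6 or later for the f-string line; the % form also exists in older versions. The variable names are only for illustration.

```python
first = "first"
last = "last"

with_percent = "%s::%s" % (first, last)   # placeholders are filled from an explicit tuple
with_fstring = f"{first}::{last}"         # braces delimit each name, so :: stays literal
not_formatted = "$first::$last"           # an ordinary string: nothing is substituted

print(with_percent)     # first::last
print(with_fstring)     # first::last
print(not_formatted)    # $first::$last
```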

Print/Calculations

I don't know whether the problem here is the print function, the calculation, or a combination. In this case I would like to get the value 54 printed.

print (4 + 5) * 6 . "\n";      # prints: 9 and does not print a newline
print +(4 + 5) * 6 . "\n";     # prints: 54
print int(4 + 5) * 6 . "\n";   # prints: 54
print((4 + 5) * 6 . "\n");     # prints: 54

and the following, which I wrote while trying to provoke a syntax error:

print (4, 5) . "\n";           # prints: 45 but no newline
print (4, 5) x 6 . "\n";       # prints: 45 but no newline
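
A rough Python 3 counterpart, as a sketch only: print is an ordinary function there (in the Python 2.3 used for this page it was a statement), so the same misplaced parentheses end in an exception instead of a silently dropped calculation.

```python
result = (4 + 5) * 6
print(result)                # 54

try:
    print(4 + 5) * 6         # prints 9, then fails: print() returned None
except TypeError as exc:
    message = str(exc)
    print("TypeError:", message)
```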

Printing array contents

Here I would like a short way of printing array contents as in Python: print array (similarly in Java and JavaScript).

my @array = (
	'pos0', 'pos1', 'pos2', 'pos3',
);
print "Contents: \n", @array, "\n";        # prints: pos0pos1pos2pos3
print "Contents: \n", "@array", "\n";      # prints: pos0 pos1 pos2 pos3

The interpolating quotes insert spaces between the array elements, which is nice in this case. But when the array contains lines read from a file, you still get the extra spaces, which has baffled more than one experienced Perl programmer. One programmer started debugging his code to find out where the extra spaces came from.
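
The effect is easy to reproduce in Python, where the separator has to be written out. This is only a sketch (Python 3 assumed, made-up data): interpolating "@array" joins the elements with Perl's list separator, a single space by default, which corresponds to " ".join below.

```python
lines = ["first line\n", "second line\n", "third line\n"]

joined_with_space = " ".join(lines)   # like "@array": a space between elements
joined_plain = "".join(lines)         # like print @array: no separator

print(repr(joined_with_space))   # 'first line\n second line\n third line\n'
print(repr(joined_plain))        # 'first line\nsecond line\nthird line\n'
```

Every line after the first starts with a space, because the separator lands right after the newline that ends the previous element.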

Simple complexity

What is the use of "local"? And why is it not possible to use it together with "use strict"? The following code was intended to print "global", "local", "global" separated by newlines. The first version gives an interesting error message; the other two work as intended, but sacrifice either "use strict" or the use of local.

After reading the relevant chapters in Programming Perl and some course notes, and talking to Perl users, it is still not clear to me how "local" can be used. It may just be me, but I find that a keyword which is difficult to understand may be a bad idea.

Version 1: Error: Global symbol "$loc" requires explicit package name

use strict;
my $global = 'global';
print $global . "\n";
sub localtest {
	local $loc;
	$loc = 'local';
	print $loc . "\n";
}
localtest();
print $global . "\n";

Version 2: No error, no local

use strict;
my $global = 'global';
print $global . "\n";
sub localtest {
	my $loc;
	$loc = 'local';
	print $loc . "\n";
}
localtest();
print $global . "\n";

Version 3: No error, no strict 'vars'

use strict;
my $global = 'global';
print $global . "\n";
sub localtest {
	no strict 'vars';
	local $loc;
	$loc = 'local';
	print $loc . "\n";
}
localtest();
print $global . "\n";
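
For what it is worth, this is the behaviour I was after, sketched in Python 3 with made-up names. It only shows the intended output and says nothing about what Perl's local really does.

```python
label = "global"


def local_test():
    label = "local"      # assignment inside a function creates a new local name
    return label


seen = [label, local_test(), label]
print("\n".join(seen))   # global, local, global on separate lines
```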

