To try and make the perl tools good and consistent, here is a list of best
practices used within the modules. (Whether they are best can of course be
debated, but what's listed is what is currently used.)

1. Always use strict and warnings first thing.

A lot of constructs are legal in perl for backward compatibility and ease of
writing one-liners. However, these constructs are a frequent source of bugs in
real code. All modules and binaries should use strict and warnings to ensure
that these checks are enabled. (There is a unit test in the module grepping
the source to ensure this.) Thus, pretty much the first thing in all perl
files should be:

    use strict;
    use warnings;

2. Use perl modules.

We want to group functionality into multiple files in perl too. A perl module
is just another perl file with a .pm extension, which minimally can look
something like this:

Yahoo/Vespa/VespaModel.pm:

    package Yahoo::Vespa::VespaModel;

    use strict;
    use warnings;

    my %CACHED_MODEL; # Prevent multiple fetches by caching results

    return 1;

    sub get {
        ...
    }

Yahoo/Vespa/Bin/MyBinary.pl:

    use strict;
    use warnings;
    use Yahoo::Vespa::VespaModel;

    my $model = Yahoo::Vespa::VespaModel::get();

2a. Module install locations.

Perl utilities are installed under $VESPA_HOME/lib/perl5/site_perl

2b. Aliasing namespace.

Perl's namespace handling is limited. It's not like in C++, where we can be in
the storage::api namespace and thus address something in the storage::lib
namespace as lib::foo, or refer to something else in the same namespace
without any prefix. Thus, if the user of the VespaModel module above were
Yahoo::Vespa::MyLib, it still has to address VespaModel with the full path by
default.

It is possible to create aliases in perl to help with this. Using an alias,
the MyBinary.pl code above could look like:

    ...
    use Yahoo::Vespa::VespaModel;
    BEGIN { *VespaModel:: = *Yahoo::Vespa::VespaModel:: ; }

    my $model = VespaModel::get();

The alias declaration doesn't look very pretty, but it can help keep the code
simple.

2c. Exporting members into the user's namespace.

Another option, instead of using long prefixed names or aliasing, is to export
names into the caller's namespace. This can be done in a module by doing
something like this:

Yahoo/Vespa/VespaModel.pm:

    package Yahoo::Vespa::VespaModel;

    use strict;
    use warnings;

    BEGIN {
        use base 'Exporter';
        our @EXPORT = qw( getVespaModel );
        our @EXPORT_OK = qw( otherFunction );
    }

    my %CACHED_MODEL;

    return 1;

    sub getVespaModel { ... }
    sub otherFunction { ... }

Yahoo/Vespa/Bin/MyBinary.pl:

    use strict;
    use warnings;
    use Yahoo::Vespa::VespaModel;

    my $model = getVespaModel();

In this example, the getVespaModel function is imported by default, while
otherFunction is not, but can be included optionally. You can specify what to
include by adding arguments to the use statements:

    use Yahoo::Vespa::VespaModel;    # Import defaults
    use Yahoo::Vespa::VespaModel (); # Import nothing
    # Import other function but not getVespaModel
    use Yahoo::Vespa::VespaModel qw( otherFunction );

(The qw(...) operator just generates a list from a whitespace separated
string. Writing qw( foo bar ) is equivalent to writing ('foo', 'bar').)

You can also export/import variables, but then you need to prefix the names
with the type, as in "our @EXPORT = qw( $number @list %hash );". Note that the
names inside qw() are separated by whitespace only, not commas.

Note that you should prefer to export as few functions as possible, as they
can clash with names used in the caller. Also, the tokens you do export should
have fairly descriptive names to reduce the chance of this happening. An
exported name does not have a module name attached to it to give context.
Thus, if you don't export you can for instance use Json::encode, but if you do
export you likely need to call the function encodeJson or similar instead.

2d. Prefer private variables (my instead of our)

When variables are declared with 'my' they become private to the module, and
you know outsiders can't alter them. This makes debugging easier, as there are
fewer possibilities for what can happen.

2e. Prefer calling functions or exported variables rather than referencing
global variables in a module from the outside.

Referencing non-declared variables in another module does not seem to create
compiler warnings, nor does using private (my) declared variables. Thus it's
better to refer to imported variables or call a function, such that the
compiler will tell you when this doesn't work anymore.

2f. Put all function declarations at the bottom.

When a perl module is loaded, the code within it is run. If that code doesn't
return true, the module fails to load. Thus, traditionally, perl modules often
end with 1; (equivalent to return 1;) to ensure this. However, this means you
have to read through the entire module to look for module level code that is
run. By calling exit(...) in the main program before the function
declarations, and doing return 1; in modules before the function declarations,
it is easier for the reader to see that you haven't hidden other code between
the function declarations. (Unless you've hacked it into a BEGIN{} block to
force it to run before everything else.)

2g. Make it easy to reinitialize in unit tests.

By putting initialization steps in a separate init function, rather than doing
it on load, unit tests can easily call it to reinitialize the module between
tests. This also separates the declarations of what exists from the
initialization, so it is easier to see what variables are there.

3. Confess instead of die.

The typical perl assert is use of the 'die' function, as in:

    defined $foo or die "We expected 'foo' to be defined here";

The Utils package contains a confess function to be used instead (wrapping an
external dependency), which does the same as 'die' but adds a stacktrace too,
such that when it is encountered, it is much easier to find the culprit.

4. Do not call exit() in libraries.

We want to be able to unit test all types of functions, including
functionality that makes the application abort and exit. The Utils package
defines an exitApplication function that is mocked for unit tests. Assertion
types of exits with die/confess can also be caught in unit tests.

5. Code conventions.

  - Upper case, underscore divided, module level variables.
  - Camel case function names.
  - Four space indent.

6. Naming function arguments.

For perl, a function call is just a call to a subroutine with a list
containing whatever arguments were given, called @_. Using this list directly
makes the code hard to read. Naming the variables makes this a bit easier:

    sub getVespaModel { # (ConfigServerHost, ConfigServerPort)
        return Json::parse(Http::get("http://$_[0]:$_[1]/foo"));
    }

    sub getVespaModel { # (ConfigServerHost, ConfigServerPort) -> ObjTree
        my ($host, $port) = @_;
        return Json::parse(Http::get("http://$host:$port/foo"));
    }

In the latter example it is easier to read the code.

The argument comment is something I usually add to function declarations to
make them look better with vim folding. When I fold functions in vim, the
function above will look like

    +-- 4 lines: sub getVespaModel (ConfigServerHost, ConfigServerPort) -> ObjTree

Using such a convention it is thus easier to read the code, as you may be able
to see all your other function declarations while working on the function you
have expanded.

6b. Functions with many arguments.

If you create functions with loads of parameters you can end up with a messy
function, and a hard time adjusting all the uses of it when you want to
extend it.
At these times you may use hashes to name the arguments, such that the order
is no longer important:

    sub getVespaModel { # (ConfigServerHost, ConfigServerPort) -> ObjTree
        my $args = $_[0];
        return Json::parse(Http::get(
                "http://$$args{'host'}:$$args{'port'}/foo"));
    }

    getVespaModel({ 'host' => 'myhost', 'port' => 80 });

Using this trick, you can have defaults for various arguments, which users who
don't care can leave out, rather than having to pass undef at many positions
to ensure the order of parameters is correct. Note however, that this looks a
bit more messy in the function itself, and it makes it more important to
comment which arguments are actually handled and which ones are not optional.
I prefer to try and have short argument lists instead.

7. Constants

Sometimes you want to declare constants, valid flag values for instance. You
can of course just declare global variables, but you have no way of ensuring
that they never change, which can be confusing. To define constants you can do
the following:

    use constant MY_FLAG => 8;

This constant is referred to without the usual $ prefix too, so it is easy to
distinguish it from variables. These constants can also be exported, enabling
you to create function calls like:

    MyModule::foo("bar", OPTION_ARGH | OPTION_BAZ);

Though this of course pollutes the caller's namespace again, so the caller has
to specifically not import them if there would otherwise be a name clash.

8. Libraries not in search path

Sometimes people install perl libraries in non-default locations. If this is
temporary, you can fix it by adding the directory to PERLLIB on the command
line. If it is permanent, the recommended way to find the libraries is to add
the directory to the search path where you include it, like the Yahoo
installation for the JSON library:

    use lib '$VESPA_HOME/lib64/perl5/site_perl/5.14/';
    use JSON;

9. Perl references

In perl you can create references to variables by prefixing a backslash '\'.

    my @foo ; my $listref = \@foo;
    my $var ; my $scalarref = \$var;
    my %bar ; my $hashref = \%bar;

You can also create references to lists and hashes directly:

    my $listref = [ 1, 2, 4 ]; # [] instead of () to get ref instead of list.
    my $hashref = { 'foo' => 3, 'bar' => 'hmm' }; # {} instead of ()

To check what a variable is you can use the ref() function:

    ref($scalarref) eq 'SCALAR'
    ref($listref) eq 'ARRAY'
    ref($hashref) eq 'HASH'
    ref($var) eq ''    # ref() returns an empty string for non-references

To dereference a reference you can add a deref clause around it:

    my @foo = @{ $listref };
    my %bar = %{ $hashref };
    my $scalar = ${ $scalarref };

If the inside of the clause is simple, you can also omit the braces:

    my $scalar = $$scalarref;
    my %bar = %$hashref;
    my $value = $$hashref{'foo'};

You can also dereference using the -> operator:

    my $value = $hashref->{'foo'};
    my $value2 = $listref->[3]; # Element at index 3, counting from 0

The -> operator is typically used when traversing object structures.

10. Perl structs

Perl object programming requires some blessing and doesn't look that awesome,
so I typically mostly program functionally. However, at the bare minimum one
needs to be able to create some structs to contain data that isn't bare
primitives.

Perl's Class::Struct module implements a way to define structs in a simple
fashion, without needing to know how bless works, module inheritance and so
forth. An example use case here is Yahoo::Vespa::ClusterState:

    use Class::Struct;

    struct( ClusterState => {
        globalState => '$',
        distributor => '%',
        storage => '%'
    });

    struct( Node => {
        group => '$',
        unit => 'State',
        generated => 'State',
        user => 'State'
    });

    struct( State => {
        state => '$',
        reason => '$',
        timestamp => '$',
        source => '$'
    });

    # Some file using it.
    use Yahoo::Vespa::ClusterState;

    my $clusterState = new ClusterState;
    $clusterState->globalState('UP');
    my $node = new Node;
    $node->group('Foo');
    $clusterState->distributor('0', $node);
    ...
    my $group = $clusterState->distributor->{'0'}->group;
    my $nodetype = 'storage';
    my $storageGroup = $clusterState->$nodetype->{'0'}->group;

Some notes:
  - The names of the structs are automatically imported. Thus you don't need
    to worry about prefixing or aliasing, but be aware that the names can
    collide with names used by the user.
  - $, % or @ indicates whether the content is a scalar, hash or list. A class
    name instead indicates that the content must be an instance of that other
    struct.
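
To try out the struct notes above in isolation, here is a minimal
self-contained sketch. The Endpoint and Service structs and their fields are
hypothetical names made up for this illustration; they are not part of the
Vespa modules. It only relies on the core Class::Struct module, and shows one
field of each kind: scalar ($), list (@), hash (%) and another struct.

```perl
use strict;
use warnings;
use Class::Struct;

# Hypothetical structs, for illustration only.
struct( Endpoint => {
    host => '$',
    port => '$'
});

struct( Service => {
    name      => '$',        # Scalar
    endpoints => '@',        # List
    tags      => '%',        # Hash
    primary   => 'Endpoint'  # Must be an Endpoint instance
});

my $endpoint = Endpoint->new(host => 'myhost', port => 80);
my $service = Service->new(name => 'search');

$service->primary($endpoint);
$service->endpoints(0, $endpoint); # Set list element at index 0
$service->tags('env', 'test');     # Set hash element with key 'env'

# With an index or key, the accessor returns that element.
# Without arguments, it returns a reference to the whole list or hash.
print $service->name, "\n";
print $service->primary->host, "\n";
print $service->endpoints(0)->port, "\n";
print $service->tags->{'env'}, "\n";

# A struct typed field croaks when given something of the wrong type.
my $accepted = eval { $service->primary('not an object'); 1; };
print "Non-Endpoint value was rejected\n" unless $accepted;
```

Since the croak can be caught with eval, such type errors can be exercised in
unit tests the same way as other die/confess style assertions.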