Published 2006-08-21 16:17:08

[** not proof read yet ;) - so send corrections not complaints ;) **]
As my freelancing work has scaled up into a full on company (of one still). I have had to evolve from being a one-man coding band into a team leader/manager. This introduced, over the last few years, a slightly different way of working.

Phase I

Specification writing, normally involving doing Inkscape interface designs, along with a huge gnumeric code specification, detailing the database design, along with explaining all classes and templates need for a project.

Phase II

Outsourcing, either to third parties, or to team members within the office I work with. Including a roughly 50% complete code review.

Phase III

The oh sh*t, it's getting urgent, we better fix the delivered code and make it work well enough for the client to start testing.

Phase IV

The what seems like never ending bug finding and fixing stage.

Where I am at present.

Two of the larger projects I have at present are in Phase II & III. One is being forced to go live, the other is halfway through development. Both being coded by different teams, however, what is interesting is that the issues that come up in the code reviews both, have been pretty similar.

The specification stage, perhaps the most labourious, and sometimes mind numbing if it gets large, is always interesting to look back at. The more I do it, in hindsight, It tends to make you realize, there is no such thing as a perfect specification. It always ends up as a best guess, based on what you think is needed and what you think the end user probably wants.

Dealing with end users is always fun, thankfully, this process tends to deal with the end user in two seperate phases - The Specification and the Bug fixing. (Although what normally happens is that after the bug fixing there is another process called changes - which gets dealt with almost as a new project)

What has interested me alot in the last few weeks, has been, to see how far some parts of the specification I wrote and designed, had to be dramatically changed During Phase III. It seems that however much thought and consideration went into the concepts, there is no way to totally predict how the look and feel translates to trying to actually get things done using the application. I've spent alot of time doing mass culling of parts that I spend considerable time trying to specify down to some detail, as when I sat down to test it, It was completely confusting.

Anyway, what I really want to go into, is some of the interesting code that comes out of the code review. Both half way through the project, and the complete overhall that has to be done as you push a project live. So you can demand payment (or at least beg for it..)

Coding Standards

Thankfully, I have managed to cobble together a quite large project coding standard to try and keep projects on track. Without them, Stages III and IV would become unmanagable, as tracing exactly how code was implemented so that both changes and refactoring would become completly impossible.

Unfortunatly, however clear the standards are (or you thought they are), developers have a nack of not following them, either out of habit or just the pressure to deliver so they can get paid.. - So a certian part of the halfway Phase II code review ends up just reiterating where they have missed those elements.

Common mistakes include

Not following the 'no if without braces rule'
not cloning DataObjects, as we develop in both PHP4 and test in PHP5
creating javascript libraries with functions in them.
creating PHP includes with functions in them
Not making the most of the automated url rewriting that FlexyFramework and HTML_Template_Flexy do for CSS, javascript and images.
Using mysql_real_escape, rather than the more portable DataObjects' escape() method. (which really wraps PEAR DB/MDB2)
Incorrect Indentation in javascript files.
passing strings around as error messages notices, rather than using error and notice flags, so the text of the message is on the templates, rather than in the PHP code.
Implementing magic constructors on the base controller...
using $do->whereAdd('somevar=1') when $do->somevar = 1 will do..
adding ?> at the end of the file ;) - It's not needed, and will come back to bite you one day.
Hard coding email subjects etc. in the PHP code, rather than using email templates.
Adding too many config options - especially file locations.

Reviewing the code also is a great opportunity to update and rewrite some of the standards so that things like this should not happen any more. (Although I wonder if making it bigger means there is even less chance they will follow it)

using 'print' rather than 'echo'
for xmlHttpRequest called code, using 'exit' in the get()/post() methods, rather than trying to get clever with content-type, and modifying the output method.
don't do the page processing in the output method()
stopping the use of javascript shortcuts - ie:

function ID(n) { return document.getElementById(n); }

using libraries that where not specified.. or using non-PEAR libraries when PEAR ones are available... - or creating libraries and plonking them in the Application Tree, with the wrong file/class mapping.
using commandset/command in XUL rathar than simply oncommand and the Javascript call.
using cwd(), rather than using dirname(__FILE__), for determinig the path for reading data files.
creating new object/array's when cloning dataobjects will work.
working with session variables from classes outside the one you are in (in The Framework, classes own a session variable $_SESSION[__CLASS__]..., and other classes should not access it directly).
not implementing a 404 return for unknown get's on the main page.
passing get/post requests back up to the main page using parent::get()...
Hard coding options (eg. enums) into controller classes (eg. currencies or statuses etc.) - put them in the DataObjects, so other pages can get the same list.
Overlaying config var's into the controller as raw properties..
Doing PHP calls in the template that involve alot of Database calls (that are critical and could fail) - Which is why we design code to do the processing before any output occurs, so we can redirect and handle issues gracefully.
Trying to get clever with sessions to pass errors about, when internal redirectors are intended to send notices back to the next page. (eg. HTML_FlexyFramework::run('',array('notice'=>array('user_updated_ok'=>true)));
excessive nesting of if/while/if/...else etc. - kill that nesting, return, continue or exit from conditions early, eg. negative test, and return, rather than positive test, and if/else.. nested many levels.
Consistantancy withdataobject methods, in general, dont do updates to a dataobject on a call, unless the method hints' that it might do it... -> eg. $do->approve() or $do->approveAndUpdate(); ...
Avoid silly code... $obj->is_verified = Project::isAuth() ? 0 : 1 .. either just cast it (int) , or let DataObjects worry about it...
Making lines so long you can't read them... Creating an instance of an object to call a method that 'could' be static, is not a bad thing, MyProject_DataObjects_Sometable::callaMethod() or $x= DB_DataObject::factory('Sometable'); $x->callaMethod();
Expecting me to chmod directories hidden away in the application. If you need to create temporary or semi temporary files, use the Template engines's temp directory, rather than trying to create them in the Code source somewhere (so that changing file permissions on svn up/checkout is not needed) - We can always change our default temp directory with set ini_set(session.save_path) in the bootstrapper..
ConfigData/default.ini are generally depreciated in favour of using the bootstrap to define the default options - and we have a slightly different bootstrap defined for our servers.
Assuming it's ok to delete stuff! instead, it should be a basic assumtion, that nothing get's deleted from databases (eg. it always get's flagged), so dont write code that actually deletes it, and always check the deleting flag when searching...
Forgetting to extend the Project base class - Everything extends the base class, even code run via cron jobs.

And sometimes, you realize that the rules you set down may not be totally the best idea...
In one case I advocated using $do->setFrom() rather than seeing a big list of map's between $POST[] / or something else and a dataobject. However, in some cases either building the object slowly and clearly or listing what you send to setFrom() makes the code considerably clearer.

In more detail

In the extended entry I'll look a bit closer at some of these deviations from the ideal..

Print rather than Echo

This one's more for consistancy, (we've used echo everywhere before, it's it's one character shorter ;)
Grepping for echo in a project, is also quite a handy test for any kind of data injection, since most output is our projects is done via template engines. Grepping for :h and echo will find most potential holes. (althoug grepping for print_r / var_dump is also a good idea..)

Exiting on XMLHttpRequest calls.

Most of the XUL based code, and alot of the new HTML/XUL hybrid code I specifiy uses XMLHttpRequest calls to send update/delete/add requests,

Rather than using standard HTML POST's on the form, we build a POST request manually(well with a simple library), and send it via XMLHttpRequest, this we we can test the return, and if there was a problem, we dont have to re-render the form, or re-open a popup.

Unfortunatly since alot of the more recent project are confidential, there have not been many examples I've been able to show the developers of how to do this.

In standard FlexyFramework operation, you would run get/post, and expect it either to set up all the variable for the template to be rendered, or redirect you to another page. So for a XMLHttpRequest that was expecting 'OK' or an error message, the last stage of the FlexyFramework does not fit in with the desired behaviour.

One of the changes that had to be made, was removing some logic that the developers had put in. - They came up with the idea of sending ?def=html, def=txt, def=xml, def=xul etc. to determine the return type, and then overloaded the output method to either die there, and then added header def's to every page.

The correct way to handle this was to use the extension of the $page->template variable (eg. fred.html=>html, fred.xul => xul) to determine type for html & xul,

Then for text and xml output use headers() + exit; at the end of the get/post requests for text/xml output. (although txt does not really need a header)

function get() 
{
        do stuff();
        if (some_error_occurs()) {        
             echo "The error message";
             exit;
        }
        echo "OK";
	exit;
}

// or with XML:
function get() 
{
        do stuff();
        if (some_error_occurs()) {        
             echo "The error message";
             exit;
        }
	header('Content-type: text/xml');
        echo $xml;
	exit;
}

Dont do the page processing in the output method

Since every controller class extends the projects base class, and the base class usually implements get($str), and tests if $str is not empty -> hence 404 - not found. A page implementing the input/output in the output() method will accidentaly call the get() method and return 404 - causing XMLHttpRequest caller to think there is a problem. - Basically they should just have used the get()/post() methods.

Avoiding shortcut's in Javascript

Pretty much applies to PHP as well, by creating a load of shortcuts, you make the code very difficult to follow, being a little verbose may take a few extra characters, but will save the debugging effort later no end.
This is one of those things, that once you get started, you usually dont know when to stop, resulting in build a huge javascript library that is sneakly included on every page. - making you umm and ahh, every time you see a strange looking function call.

Using Libraries that where not specified or non-PEAR ones.

While PEAR is not perfect, fixing things in PEAR is feasible, alot of the code is very well tested andusually kept update with latest PHP versions. Hence and warnings and errors have been fixed.

In both projects external libraries appeared as dependancies, sajax, and HTML2PDF. For the first, I consider that HTML_Javascript should provide all the necessary features that where being utilized from sajax (we where only outputing json data!), and since I know HTML_Javascript inside out, fixing or working around issues would be easier in the long run than introducing a rarely used library into a projects.

For the second we ran into huge compatibility problems with newer PHP4 and PHP5 versions with HTML2PDF, and although I had agreed to use it when we disucussed it during development. It proved to be more trouble than it was worth, especially as the perl html2ps worked perfectly well.

Using oncommand, rather than commandset/command

In XUL, you can group commands into command sets, and use command="xxx" rather than oncommand="callsomething()". however, this just makes mapping an action to some code involve an unneccessary step of deduction, that rarely adds much value. I suspect oncommand is more suited to XUL based xulrunner app's rather than XUL apps that work in a similar way to HTML applications.

dirname(FILE) or cwd()

In general I regard cwd as a bit hit or miss - sometimes it worked, and sometimes it doesnt - and I dont have time to read the manual to work out why, when a perfecty good alternative exists - dirname(__FILE__) gives predictable results, so that's the way the rule get's laid down. Existing proven methods over doing things differently...

Cloneing dataobject arrays over creating simple structs.

One of the pieces of code that was used to store data prior to rendering looked a bit like

while($do->fetch()) { 
   $ret[] = (object) array('user'=>$do->user, 'id' => $do->id));
}

While it's a little less efficient memory wise, the reality was that the whole data from dataobject would probably be needed later (as things like emails that use the data where only drafts, and would probably require more details)

so the code should have said

while($do->fetch()) { 
   $ret[] = clone($do);
}

dont hard code options into the controller class

unfortunaly, one of the things that was added to the base controller class of the project early on, and has been difficult to remove is this little gem

function __construct() {
   $opts= PEAR::getStaticProperty('MyProject','options');
   foreach($opts as $k=>$v) {
         $this->$k = $v;
   }
}

This has the horrible consequence of hiding a potentail variable mis-used (eg. size is an option and we use $this->size later for something else). The correct way to do something like this would be.

function loadProjectOptions() {
   $opts= PEAR::getStaticProperty('MyProject','options');
   $this->projectOptions = $opts;
}

And calling that from any get() method on pages that needed it, in a similar ilk, not calling that from the template is also frowned apon, (eg. {loadProjectOptions()} ..... ) as if the method did database calls, you could end up with some strange results.

Well, that's all for now.. Thankfully the code in general on both projects is pretty good, so maintaining, and growing the code to meet future needs, has like older projects should save me considerable time, when that infamous request comes in 12 months later... 'can you change this?'...

Mentioned By:
www.planet-php.net : Planet PHP (93 referals)
google.com : april (88 referals)
www.phpdeveloper.org : PHPDeveloper.org: Alan Knowles' Blog: Code Reviewing. (78 referals)
google.com : december (76 referals)
www.phpdeveloper.org : PHPDeveloper.org: PHP News, Views, and Community (38 referals)
google.com : php code review (7 referals)
google.com : php freaks code (6 referals)
google.com : mdb2 mysql_real_escape (5 referals)
www.phpfreaks.com : PHP Help: Alan Knowles' Blog: Code Reviewing. (4 referals)
planet.debian.org.hk : Debian HK : Debian @ Hong Kong (4 referals)
google.com : Get Laid Code review (4 referals)
www.phpfreaks.com : PHP Help: PHP Freaks Articles (3 referals)
syntux.net : Advanced Caching Technique - Block Randomization - Don’t say geek say syntux! - Your freedom is worth more than you thin (3 referals)
google.com : "get laid code" (3 referals)
google.com : code reviewing (3 referals)
google.com : GENERAL CODE REVIEW IN PHP (3 referals)
google.com : html2pdf php (3 referals)
google.com : javascript send a post rather than get (3 referals)
google.com : php map post variables to classes (3 referals)
google.com : "get laid code" review (2 referals)

Comments

Iterative development

i usually only write up 30% of what is needed into the specs .. actually for the most part the specs are what i put in the offer. then as we work through each of the todo items i add some more infos to the specs. the first half way working implementation gets shown to the customer, specs are updated and the code is updated. this way we work through all the todo items. we always have a test version with the latest code (usually not older than 7-14 days) online so the customer cannot claim at project end, that this is all wrong. obviously we tell the customer that the more he participates, the better the final product will be .. its his choice. this way the bug fix and change phase is usually pretty short.

#0 - Lukas ( Link) on 2006-08-21 17:58:18 Delete Comment

SantosJ

I have also written out long documents detailing what I wanted in a project and either the specs completed changed halfway through or I couldn't exactly do it the way I wanted. They are really awesome to have in case you forget something or need to draw some inspiration for what the 'big picture is'.

#1 - SantosJ ( Link) on 2006-08-21 21:32:48 Delete Comment

wiki

One thing that really paid off was using a wiki with version history for the specs and even documentation writing. It makes collaboration with the customer very easy and fits well with an iterative approach.

#2 - Lukas ( Link) on 2006-08-22 04:05:12 Delete Comment

Related

Code Reviewing.

Phase I

Phase II

Phase III

Phase IV

Where I am at present.

Coding Standards

In more detail

Print rather than Echo

Exiting on XMLHttpRequest calls.

Dont do the page processing in the output method

Avoiding shortcut's in Javascript

Using Libraries that where not specified or non-PEAR ones.

Using oncommand, rather than commandset/command

dirname(FILE) or cwd()

Cloneing dataobject arrays over creating simple structs.

dont hard code options into the controller class

Comments

Add Your Comment

Related

Phase I

Phase II

Phase III

Phase IV

Where I am at present.

Coding Standards

In more detail

Print rather than Echo

Exiting on XMLHttpRequest calls.

Dont do the page processing in the output method

Avoiding shortcut's in Javascript

Using Libraries that where not specified or non-PEAR ones.

Using oncommand, rather than commandset/command

dirname(__FILE__) or cwd()

Cloneing dataobject arrays over creating simple structs.

dont hard code options into the controller class

Comments

Add Your Comment

Follow us on

OUR BLOG

dirname(FILE) or cwd()