Category Archives: Software Development

All posts on and about software development

Why is software such a monster?

One of the questions I ask first when I give talks to prospective students or public groups is “What is Software Engineering?”, and then get their responses. Usually the answers are similar: “programming”, “building an app”, “writing code” – all along the technical development line.

My next step (not that it does much for student numbers) is to tell them they’re wrong, that people focus on the programming and development aspects but in fact software engineering is not just programming in the same way that bridge building is not just bricklaying. If you were to say someone was a “Bridge Engineer” it could be they did everything from designing, testing, model making, laying foundations, sinking piles, suspending cables, or, yes indeed, bricklaying. In fact I say, no doubt smugly and patronisingly, “Software Engineering isn’t a job or a profession, it’s a whole industry”. The next 20 minutes or so are then filled as I labour the point and talk about the roles from requirements engineering through programmer to systems engineer, before they troop dejectedly out for a free lunch.

But this got me thinking; what if we did build bridges in the same way we build software?

Certainly there’s no doubt software engineering has advanced in leaps and bounds since the Software Crisis of the 1960s, and a great deal has changed since Fred Brooks famously categorised software projects as “werewolves” (monsters that can turn around at any time and bite you!). But, advances aside, software projects can and do still fail, and the world is full of semi-functional legacy systems nobody dares to touch for fear of the house of cards collapsing.

Sure, in the formalised world of big business and large projects, generally we now expect projects to succeed. Good requirements engineering (knowing what you should actually be building!), software design (a sensible design to meet the goals and evolve), iterative development and agile approaches (building it a piece at a time, checking in with the customer on a regular basis, responding to change), testing (making sure it actually works!), and use of fancy modern cloud infrastructure (making it someone else’s responsibility to keep the lights on) have made this happen. But is this how most software is developed? As we’ve democratised software development, made a million open access resources and put a computer in every pocket, we are now in a position where most coding is being done in bedrooms or in small offices by individuals or tiny teams. Is this all being done in adherence with best practice on requirements traceability and agile principles? Probably not.

Are even we as professional software engineers eating our own dog food? How many utility scripts or programs in our home directories could be referred to as “quick dirty hacks”?

So, what if we built bridges the same way we (a lot of the time) build software?

We’d start with a seemingly simple requirement – I need to cross a river.

Bridge-1

A simple solution then – drop some big rocks in which can act as stepping stones.

Bridge-2

Brilliant! This solves the immediate problem and our gallant hero can cross to market and buy some bronze, or a chicken, or something.

However in the cauldron of innovation that is humanity someone goes and invents the wheel, and creates a cart. After trying to attach it to the aforementioned chicken, an unimpressed pig, and some crows, someone tries a horse and the winning formula is found. Except horses and wheels can’t use stepping stones (the requirements have evolved).

Simple answer – lay some planks over the stones.

Bridge-3

Time passes, the wheel is voted “best thing prior to sliced bread” four years in a row, and horses and carts cross our bridge back and forth. But there’s no stopping progress, and some lunatic builds a steam-powered cart on rails known as a train, much to the anger of the horses’ union. The train is much heavier than the cart, can’t go up and down banks, and needs rails so we modify the bridge again. We raise the deck, lay the rails, building on top of the planks resting on the stepping stones.

Bridge-4

Trains, it turns out, are way more popular than horse-drawn carts, and make much better settings for murder mysteries. As their number and use increases they get heavier and heavier, faster and faster. No longer will the rickety beams hold up so the bridge is reinforced over and over, as each new train comes onto the line.

Bridge-5

In time even steam becomes passé, some joker goes and puts the new-fangled internal combustion engine in its own little chariot and the age of the motor-vehicle is born.

Quick as a flash the bridge solution is found, and a new suspended roadway is bolted on the top!

Bridge-6

Evolved from stepping stones, still standing on those very foundations, and being used heavily day in, day out. It works, mostly, taking more volume and type of traffic than ever before, outliving by far the stepping stone engineer and many afterwards who cared for and modified it through the centuries.

Here we are, bang up to date, with a legacy bridge that has evolved over the years and with the majority of the engineers dead and gone, some from the black death.

Now Dave is responsible for maintaining the bridge, the hodge-podge of obsolete half-forgotten technologies, built by people long since gone using techniques long since lost to the sands of time.

One day, carrying out his usual bridge duties, he sees a cable he thinks is loose…

Bridge-7A

And being a conscientious sort he gives it a good old tighten up.

Bridge-7B

So the bridge collapses, spectacularly, probably just as a busload of orphans and accompanying nuns are travelling across. There is outcry, rage, incomprehensible gibberish of outrage on twitter, fingers pointed at the government, promises of enquiries, and a desire on all sides to find someone to blame. The burden of blame naturally falls on the obvious suspect, the man with the smoking wrench in hand, Dave.

Bridge-Result

We know who to blame, and everyone can rest assured it was definitely, and uniquely, his fault. He rots in jail, flowers are laid at the old bridge site, and we start all over again.

But is this really how software is developed?

I believe so, yes. Certainly more often than it should, and more often than we would like to admit. All too often software is quickly (and dirtily!) built to solve the minimum problem right now, and even at the time we say things like “when I get time I’ll come back and do this properly” or “well this will do for now until we replace it/the need goes away”, and how often do we do that? We could just as well say “well this will do until pigs fly to the moon and prove it’s made of cheese”.

But the initial quick dirty solution isn’t the problem. The issue comes with uncontrolled evolution. As surely as night follows day (follows night, follows day, etc.) requirements will change; we in the IT industry exist in a state of constant flux and nothing stands still for long. By its very nature successful software, and successful systems, will be required to face change and evolve to meet new requirements continually. When we “hack” these requirements onto our original “quick and dirty” solution, which was only ever intended for short-term use, we rapidly introduce all sorts of unintended complexity and integration issues – we make ourselves a codethulu or screaming tower of exceptions.

Yet we still rarely learn our lesson, because once we’ve hacked one new set of requirements in and it works, sort of, in a “don’t press G on the third input field or it’ll crash” kinda way, we stride away feeling like code-warrior bosses and don’t look back. Until the next inevitable change.

So Brooks is still right?

Well obviously – the man is a genius after all. The essential difficulties he identified with software are still present in most cases.

  • Complex – Check! More complex than ever really. As computer power grows so does our demand on what we can do with it. Of course if we properly decompose a problem down we can reduce the complexity, but that’s doing things properly.
  • Subject to external conformity – yes siree! External stuff is changing all the time and it’s expected that the software can be tweaked to conform; we don’t change the printer spec to make it work with the software, we change the software.
  • Changeable – more than ever! Change is a natural part of software and entirely unavoidable, unless you’re MySpace.
  • Invisible – Unlike bridges we can’t see software, we can’t kick it, or go and marvel at its grandeur, or prod a few bits to understand its function. Yes, now we have UML and various design tools, but only if we do things properly which, often, we don’t.

But most of the time even unplanned development projects deliver, so haven’t things got better?

Ah yes, what I like to call the “fallacy of the initial success” (catchy name, right?). If I set out to build a software system to do X and I stick at it, I normally expect the system will eventually do X. The project probably won’t fail partway through into a pit of recrimination and blame.

This is because we now have some amazing tools at our disposal. Web scripting languages combined with HTML and a browser mean anyone can build a program with a UI. Memory managed (garbage collected) languages mean who cares about freeing resources, or causing hardware resets by accessing the wrong address in RAM. Stack Overflow will answer most queries for us, and the world is full of talented developers who put their libraries and source free and available online for us to use. So to sit down and build something to do X, no problem! Everything is rosy and it’s home for tea and medals!

The problem however is Y and Z, new requirements that follow on with apparent inevitability. I’ve been in the business long enough to know when I say to myself “it’ll only need to do X” that I am blatantly lying to myself.

My hypothesis then is that: rather than solving the problem with software development, we’ve mainly just transferred the risk from the front-end (initial development) to the back-end (maintenance and modification) of the development lifecycle.

It’s true we may no longer have werewolves in every project, ready to turn into beasts and sink their teeth into us – the stuff of nightmares, but we certainly now have our fair share of zombies. Many of these are the relatively benign shuffling-gait “braaaaaains…” type, easily avoided or picked off at our leisure, but there’s also a large number of buttock-clenchingly scary 28-days style vicious running zombies. These are the myriad legacy programs we’re surrounded by, just waiting for the opportunity to smash their way out of the coffin and chase us down for breakfast brains.

So what’s the solution?

All the above is based on one premise – that the lovely and excellent principles of proper software engineering (requirements gathering, analysis, design that is flexible, extensible, reusable, standardised, highly cohesive and loosely coupled, agile development, prototyping, stakeholder engagement, iterative methods) aren’t being followed. Sadly I think that is the case most of the time; if you are following good practice then of course nothing can go wrong [ha!], or at least, we hope, not this particular mess.

So the simple solution is to follow good practice. To try and take a little more time in the design and implementation of solutions, consider the future maintainability and reuse potential; build software to last years not days or months. Listen to the little bit of us that screams with impotent rage when we “just quickly bodge this”.

We also shouldn’t ever be afraid to tear it down and start again. Iterative does not always mean incremental; if something is becoming unfit for purpose, can’t we find the time to invest in sorting it properly? How much time do we really spend on preventative maintenance compared to how much we should? Is it fire-fighting or careful fireproofing to avoid the emergency?

If you want to fix the roof, better do it while the sun is shining.

There are attempts at automated approaches to help us manage these zombie programs, and in fact a large part of my research is on this kind of area. But don’t worry, if we all start coding properly – engineering not just programming – tomorrow, there’s still plenty of zombies out there already created, so you’re not doing me out of a job. My confrontational attitude and willingness to tell prospective students they’re wrong when I ask them a trick question will do that for me.

Good documentation is another good idea (assuming you’re not following a good practice that tells you documentation is the very spawn of Satan) but remember: documentation can end up evolving like code, and we end up with too much of it telling us too little, unsure where to find the little nugget of information we need amongst the volumes. Dave (the bridge man) had plenty of blueprints, and if he understood them all he would have known not to tighten the fateful wire, but how was he to know?

Dave Blueprint scroll

 

David Cutting is a senior research associate at the Tyndall Centre at UEA, associate tutor, partner at Verrotech, CTO of tech startup Gangoolie, and alleged software engineer (purveyor of much shoddy half-built freeware) – he does not eat his own dog food often enough. He holds both a PhD and an MSc in computer science and by that gives even the poorest student hope. This article is a cut-down version of a presentation entitled “Software! It’s Broken” he gives when anyone will listen. Artwork by Justin Harris and Amy Hunter.

All content is (C) Copyright 2015-2017 David Cutting (dcutting@purplepixie.org), all rights reserved.

cotravel at SyncTheCity 2014: a git history

Coming up for a year ago now I took part in the inaugural Sync the City event in Norwich, and guess what – it’s happening again!

Last year we undertook the challenge to design, build/develop and deploy our business cotravel in 54 hours. This culminated in a working system (with proper back end, usable APIs, web interface etc), several actual business agreements with local taxi companies, and finally an excellent presentation by Rod.

The event garnered quite a bit of press (in which we got a mention!) and has been the subject of some other reflective blogging by team members. Sadly I can’t take part this year (some nonsense about a PhD I have to finish…), but it got me looking back.

During the two days of (frantic) development we used a git repo on BitBucket to manage our source code between the development machines, test, and live environments. This got me thinking it would be nice to visualise the development process mining the repository, if only I had the tools and computing power. Then I remembered I totally do! Analysing software systems, including from repository logs, is kinda what we do. So, waiting for a long experiment to finish, I realised I could quickly bodge a couple of our tools together and see what I could get.

During the development period (evening of 20th November to evening of 22nd November) there were two developers: David (me), and Adam (not me). So I plotted commits by hour both per developer and in total, from the very first commit at 17:28 on the 20th to the last at 16:46 on the 22nd.
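The counting itself is simple enough to sketch. The following is not the tooling we actually used, just a minimal illustration in C++ that assumes the log has been exported one commit per line as "author|date", for example with git log --pretty=format:"%an|%ad" --date=iso. The names commits_per_hour and HourCounts are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Maps an hour label such as "2014-11-21 06" to a number of commits.
typedef std::map<std::string, int> HourCounts;

// Count commits per developer per hour from lines of the assumed form
// "author|YYYY-MM-DD HH:MM:SS ...". Lines that do not fit are skipped.
std::map<std::string, HourCounts> commits_per_hour(const std::vector<std::string>& lines)
{
    const std::size_t hour_length = 13; // length of "YYYY-MM-DD HH"
    std::map<std::string, HourCounts> counts;
    for (std::size_t i = 0; i < lines.size(); ++i)
    {
        const std::size_t bar = lines[i].find('|');
        if (bar == std::string::npos || lines[i].size() < bar + 1 + hour_length)
            continue; // not an "author|date" line
        const std::string author = lines[i].substr(0, bar);
        const std::string hour = lines[i].substr(bar + 1, hour_length);
        assert(hour.size() == hour_length); // guaranteed by the size check above
        ++counts[author][hour];
    }
    return counts;
}
```

Summing the per-developer counts for each hour label gives the total line on the plot.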

cotravel-all

And there we have it, from 0:00 on the 20th to 23:59 on the 22nd. As you can see the main splurge of work (technical term) was on the 21st from 06:00 to 22:00, which is when we built the majority of the functionality. On the 22nd we again got an early start, but nowhere near the commit intensity and it petered off anyway, after go live at 14:00 on the 22nd to the presentations.

Two interesting peaks on the 21st – the first was mainly Adam putting together the agreed UI elements and repeatedly pushing it up for all to see. The second peak was me trying (and generally failing) to master the Facebook API, as it needed domains working etc and so had to be pushed to a live server to test (until I found a workaround which I really should document).

Of course once we went live…

Cotravel Lives!

We then had all manner of other stats available to us, for example a log of visitors to the site and the number of API calls made from our snazzy web 2.0 front-end to my dodgy interfaces. We went live at 14:00 and kept track every hour until 20:00.

Cotravel Live Calls (k)

Here’s the graph showing (in thousands) the page views and (much higher) volume of API calls made to the system from 14:00 to 20:00, when we finally collapsed.

Overall Sync the City was a great experience, a lot of fun and a great way to meet interesting people and do ninja coding (plenty of opportunities to DevOps the **** out of it, or as we used to call it: making live changes to the code while users are still interacting).

Good luck to anyone taking part this year!

Working With Big Numbers

Recently a friend asked me some questions:

“97 raised to the power of 242 has equalled infinity on every calculator I’ve used, but it’s not infinite just very big. Why do they say infinity? And what is 97^242?”

The answer to the first part is easy: precision (and hence the maximum value) is limited. Since this isn’t a basic intro to binary and computing we won’t go into it, but just say that in the good old days this would have resulted in the number simply wrapping. Modern devices and systems detect this overflow and show a special case result, usually Infinity/Inf or sometimes Not-a-Number/NaN.

Big number calculation in R

97^242 in R

96^242 in Matlab

97^242 in Matlab

Above are two common tools (R and Matlab), both running on 64-bit Linux and overflowing for 97^242. The point of overflow differs between tools: the OSX Calculator can handle 97^71 but overflows at 96^72, whereas both Matlab and R will handle 97^155 but not 97^156.
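The Matlab and R limit is easy to reproduce, assuming (as I believe is the case) that both store ordinary numbers as IEEE 754 double-precision values, whose largest finite value is roughly 1.8E+308. The sketch below is C++ rather than either tool, and overflows_double is a made-up helper name.

```cpp
#include <cassert>
#include <cmath>

// Returns true when base^exponent is too large for a double, in which
// case std::pow yields infinity rather than a wrapped-around value.
// Assumes IEEE 754 doubles; overflows_double is a hypothetical helper.
bool overflows_double(double base, int exponent)
{
    assert(base > 1.0 && exponent >= 0); // only growing powers are of interest here
    return std::isinf(std::pow(base, exponent));
}
```

Under that assumption overflows_double(97.0, 155) is false, while overflows_double(97.0, 156) and overflows_double(97.0, 242) are both true.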

Ok well Google has a calculator function so maybe we can just ask it for 97^242?

Google for 97^242

Alas, no. But maybe if we trick it with 97+242 to get its calculator up and then use that?

Google calculator

97^242 in Google Calc

Nope. So how do we go about trying to calculate 97^242?

There are approaches we can use to estimate it (one of the most promising being looking for differences between powers of 100 and 97 then extrapolating, maybe something for another blog post) but we want an exact answer. As shown by Matlab/R, and by common logic, built-in types are just too small and will overflow.
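For a rough idea of the size, one estimate that stays well inside a double uses logarithms, since log10(97^242) = 242 × log10(97). Note this is a different technique from the powers-of-100 idea mentioned above, and estimate_power and Magnitude are names invented for this sketch.

```cpp
#include <cassert>
#include <cmath>

// A number written in scientific notation: mantissa * 10^exponent.
struct Magnitude
{
    double mantissa; // in the range [1, 10)
    long exponent;   // power of ten
};

// Estimate base^power without ever forming the huge value, using
// log10(base^power) = power * log10(base). estimate_power is a
// hypothetical helper name, not part of any library.
Magnitude estimate_power(double base, long power)
{
    assert(base > 0.0); // the logarithm is only defined for positive bases
    const double log_value = power * std::log10(base);
    const double whole = std::floor(log_value);
    Magnitude result;
    result.mantissa = std::pow(10.0, log_value - whole);
    result.exponent = static_cast<long>(whole);
    return result;
}
```

Calling estimate_power(97.0, 242) gives an exponent of 480 and a mantissa of about 6.29, which tells us how big the number is but only its first few digits.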

The solution comes, as with so much in life, from GNU in the GNU Multiple Precision Arithmetic Library aka GNU MP Bignum.

Once installed (yum install gmp-devel on Fedora) it’s just a case of hacking together some C to calculate the result (note the below isn’t supposed to be efficient, more transparent):

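#include <gmp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int ac, char** av)
{
	mpz_t number,start; // arbitrary-precision integers
	unsigned long int startnumber = 97;
	int i=0;
	mpz_init(number);
	mpz_init(start);

	mpz_set_ui(start,startnumber); // start = 97
	mpz_set(number, start); // number = 97, i.e. 97^1

	// multiply by 97 a further 241 times to reach 97^242
	for (i=0; i<(242-1); ++i)
	{
		mpz_mul(number, number, start);
	}

	gmp_printf ("Answer: %Zd\n\n", number);

	// release the memory held by the GMP integers
	mpz_clear(number);
	mpz_clear(start);

	return 0;
}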

Build this with: gcc bignum.c -lgmp -o bignum

And voila:

Answer: 6291579554172660514180168586029512181759771859911909633079235697774386086528343277488812886056338013920280508647975158853848809035553070842805751211101339655910548731303652360707362342079349547320620109301210997985503312350525910702941569606402987567610468491227904389508486082138580406254059512446817845870945561908178074689723504831108854735558785367285467641732532222094514773911007516550984383178257067496267923472962067697340265687768345831925754550553895144516227499602812609

Which for those who can’t be bothered to count digits is rounded as 6.29E+480.
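If installing GMP isn’t an option, the same answer can be had by doing the long multiplication by hand on a list of decimal digits. This is only a sketch of the idea, far slower than GMP and limited to small multipliers; multiply_in_place and big_power are names made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Multiply a number held as base-10 digits (least significant first)
// by a small positive factor, carrying as in long multiplication on paper.
void multiply_in_place(std::vector<int>& digits, int factor)
{
    int carry = 0;
    for (std::size_t i = 0; i < digits.size(); ++i)
    {
        const int value = digits[i] * factor + carry;
        digits[i] = value % 10;
        carry = value / 10;
    }
    while (carry > 0) // append any remaining carry as new leading digits
    {
        digits.push_back(carry % 10);
        carry /= 10;
    }
}

// Return base^exponent as a decimal string. big_power is a hypothetical
// helper for illustration only.
std::string big_power(int base, int exponent)
{
    assert(base > 0 && exponent >= 0);
    std::vector<int> digits(1, 1); // the number 1
    for (int i = 0; i < exponent; ++i)
        multiply_in_place(digits, base);

    std::string text;
    for (std::size_t i = digits.size(); i > 0; --i) // most significant digit first
        text += static_cast<char>('0' + digits[i - 1]);
    return text;
}
```

big_power(97, 242) returns the full decimal string, and its length saves counting the digits by eye.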

Though of course after all this loading of libraries and writing of C it turns out that although Google couldn’t answer it, naturally Wolfram|Alpha could:

Wolfram Alpha Calculation

Wolfram|Alpha Calculates 97^242 with Ease!

C/C++ CGI File Upload

A long time ago when I still had (some) hair and hadn’t bitten the PHP bullet I played around with C++ CGIs. Owing to a lack of then available HOW-TO docs I went on to write a (badly written and error-filled) CGI in C/C++ HOW-TO and also a CGI Variable Wrapper. The HOW-TO did what it said on the tin and the wrapper provided an easy API to read/write GET and POST variables as well as cookies.

Surprisingly both the HOW-TO and the wrapper are still in use and I get contacted from time to time with queries. The most common query regards file upload, which the wrapper doesn’t support. To illustrate a simple file upload I cobbled together a quick and dirty C example which I’ve provided via email ever since.

So here, for general reference, is my demonstration C code. Please note this is very untested and not at all robust, even dodgier than my usual fare. I keep toying with the idea of finding time to do a proper job either of a standalone file upload API or integrating support into the CGI wrapper. All of this is really just for kicks though as there are better solutions available.

/** Very rough-and-ready CGI file upload in C/C++
    This is demonstration code only really and, of course,
    no liability accepted for anything!

    David Cutting
    http://www.purplepixie.org/dave/
    http://blog.purplepixie.org/
    dcutting [at] purplepixie [dot] org

    Code Copyright DMC 22/05/2011
**/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
FILE *out;
char *rawdata; // pointer for rawdata
char *data=NULL; // will be an offset of rawdata after the two newlines
char *lengthstr; // CONTENT_LENGTH as set by the web server
unsigned long length; // length of data to be read used for malloc
unsigned long writelength; // use for the length of the write
char *pos; // used in the loop

printf("Content-type: text/html\n\n<HTML><PRE>"); // for debug output
// Various bits of debug output are included - comment out for live
// and replace with whatever output you'd like to send to the client

lengthstr=getenv("CONTENT_LENGTH");
if (lengthstr==NULL) // not being run as a CGI POST handler
{
 printf("No CONTENT_LENGTH");
 return 1;
}
length=strtoul(lengthstr,NULL,10);

printf("Content Length: %lu\n",length);

rawdata=(char *)malloc(length+1); // malloc required buffer (cast so it also builds as C++)
out=fopen("/var/www/test.jpg","wb+"); // open the output file
if ( (rawdata==NULL) || (out==NULL) )
{
 printf("Unable to allocate buffer or open output file");
 return 1;
}

length=fread(rawdata, 1, length, stdin); // read into buffer, keeping the count actually read
writelength=length;

// now comes the loop, there are better ways but not that I can find quickly enough
for (pos=rawdata; (length>4) && (pos<(rawdata+length-4)); pos++)
{
 writelength--; // decrement the write length
 printf("%c %d\n",pos[0],pos[0]); // used for debug output (comment out for live)
 if ( (pos[0]==13) && (pos[1]==10) && (pos[2]==13) && (pos[3]==10) && ( (pos[4]<32)||(pos[4]>127) ) ) // pattern to find a blank line (CR LF CR LF) followed by a non-printable byte
 {
  data=pos+4; // move data pointer forward 4 to start of actual data
  printf("Found\n"); // another debug line - comment out for live
  writelength-=3; // decrement writelength by three (done one already above for this loop)
  break; // stop searching
 }
}

if (data==NULL) // no upload data found, so nothing to write
{
 printf("No file data found");
 fclose(out);
 free(rawdata);
 return 1;
}

printf("Writelength: %lu\n",writelength); // yet another debug

// write the data to the file
fwrite(data, 1, writelength, out);

// close the file
fclose(out);

free(rawdata); // free memory

printf("Upload Complete"); // debug - comment out for live

return 0; // exit
}
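As the comment in the loop admits, there are better ways to find the start of the data. One possibility, sketched here in C++ and not tested against a real web server, is to treat the first line of the body as the multipart boundary, look for the blank line that ends the part headers, and stop at the next boundary. It assumes a single uploaded file in a multipart/form-data body, and extract_upload is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Pull the payload out of a multipart/form-data body holding one file.
// Assumes the body starts with the boundary line; returns false if the
// expected markers are missing. extract_upload is a hypothetical helper.
bool extract_upload(const std::string& raw, std::string& payload)
{
    const std::size_t line_end = raw.find("\r\n");
    if (line_end == std::string::npos || line_end == 0)
        return false;
    const std::string boundary = raw.substr(0, line_end); // e.g. "--XYZ"

    // The part headers finish at the first blank line (CR LF CR LF).
    const std::size_t header_end = raw.find("\r\n\r\n", line_end);
    if (header_end == std::string::npos)
        return false;
    const std::size_t start = header_end + 4;

    // The payload runs up to the CR LF that precedes the next boundary.
    const std::size_t stop = raw.find("\r\n" + boundary, start);
    if (stop == std::string::npos)
        return false;
    assert(stop >= start); // find() never returns a position before start

    payload = raw.substr(start, stop - start);
    return true;
}
```

Unlike the loop above this also drops the trailing boundary, so the payload is only the uploaded bytes. std::string copes with the zero bytes found in binary files, provided the buffer is built with its length rather than as a C string.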

 

Variable Length Arguments in C++, Java, and PHP

Normally in software development we define methods with a given number of parameters (and their type in some languages). Quite often however we want to be able to deal with different numbers of arguments and there are two widely used approaches: different methods and default parameters.

The different-methods approach (overloading) relies on the call to a function being matched not just on the name of the method but also on the count (and type if applicable) of the parameters. So if we wanted a method that could accept one or two integers in C++ we could define two methods:

void SomeFunction(int);
void SomeFunction(int, int);

So if we called SomeFunction(1) the first would be used, SomeFunction(1,2) would use the second.

Default parameters allow us to define some of the parameters as optional, along with the default values to use if they are not passed, so the definition:

void SomeMethod(int a, int b=0);

Would accept one or two integer parameters. SomeMethod(1,2) would have a=1 and b=2 whereas SomeMethod(1) would use the default value and so b=0.
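Putting both approaches together in something that compiles might look like the sketch below. The function bodies and return values are invented purely so the calls can be checked (the declarations above return void), and CheckCalls is a made-up name.

```cpp
#include <cassert>
#include <string>

// Overloading: the call is matched on the number (and type) of arguments.
std::string SomeFunction(int)      { return "one-argument version"; }
std::string SomeFunction(int, int) { return "two-argument version"; }

// Default parameter: b takes the value 0 unless the caller supplies one.
int SomeMethod(int a, int b = 0)
{
    return a + b;
}

// Each assertion states which version or value the call resolves to.
void CheckCalls()
{
    assert(SomeFunction(1) == "one-argument version");
    assert(SomeFunction(1, 2) == "two-argument version");
    assert(SomeMethod(1, 2) == 3); // a=1, b=2
    assert(SomeMethod(1) == 1);    // a=1, b defaults to 0
}
```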

This is all very good and highly useful in a variety of situations but suppose you wanted to handle any number of parameters, from a very small set (or zero) to a large number. Using either of these techniques would require a lot of additional coding, creating a method for each length or the longest set of default parameters imaginable.

This is where the concept of variable length arguments for a method comes in; we want to be able to define a method and accept an arbitrary number of arguments which it can process (please note in most if not all cases the best option for this would be to pass something like a Vector in C++ or an array in PHP, but best practice is not the point of the exercise).

Let’s consider a problem.

We want to have a LineShape function. This function takes a series of Points (a simple class just containing an X and Y coordinate). In a proper system it would then start with the first point and draw a line to each consecutive one but for our example we just want it to print a list of the points it will draw to/from in order.

This could be two points (a single line) or a complex shape of an undetermined total number of points (again note the caveat above that a Vector/List/Array would be the best and safest way to do this in the real world).

So for our implementation we need:

  • A simple Point class
  • A method (LineShape) that takes an arbitrary number of Points and prints out the coordinates
  • Code to create a set of Points and pass them to LineShape

How to do this varies from language to language and, as you might assume, it’s hardest and most dangerous in C++ (because of its lack of type safety), slightly easier in Java and PHP (Java because of its high type safety and PHP because of its lack of any enforced typing).

Variable Length Arguments in C++

To implement in C++ we make use of the va_ functionality provided in stdarg.h. The function is defined as taking the number of parameters passed (int) and then the parameters themselves represented by “...”.

We read the parameter count and then iterate through reading each in turn with va_arg and specifying the type to be used. Note in C++ you must specify the number of parameters being passed when calling the method.

#include <iostream>
#include <stdarg.h>
using namespace std;

// Declare the variable-length argument method
void LineShape(int, ...);

// Our simple Point class
class Point
{
public:
	int x;
	int y;
	Point(int ix, int iy)
	{
		x = ix;
		y = iy;
	}
};

// Definition of LineShape
void LineShape(int n_args, ...)
{
	va_list ap; // arg list
	va_start(ap, n_args); // start on list

	for (int i=1; i<=n_args; i++) // iterate
	{
		// read next argument as Point*
		Point* a = va_arg(ap, Point*);
		cout << a->x << "," << a->y << "\n";
	}
	va_end(ap);
}

int main(int ac, char **av)
{
	Point *a = new Point(10,10);
	Point *b = new Point(15,15);
	Point *c = new Point(10,20);
	LineShape(3, a, b, c);
	// free the Points created with new
	delete a;
	delete b;
	delete c;
	return 0;
}

There you have it in C++ (well actually using C libraries); but don’t do it (see above).
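For comparison, here is roughly what the recommended container-based alternative could look like. It is a sketch only: LineShape returns its text instead of printing it, purely so the result is easy to check, and LineShapeExample is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// The same simple Point class as before.
class Point
{
public:
    int x;
    int y;
    Point(int ix, int iy) : x(ix), y(iy) {}
};

// Type-safe alternative to the va_arg version: the vector carries its
// own length, so there is no separate count to get out of step, and
// only Points can be passed.
std::string LineShape(const std::vector<Point>& points)
{
    std::ostringstream out;
    for (std::size_t i = 0; i < points.size(); ++i)
        out << points[i].x << "," << points[i].y << "\n";
    return out.str();
}

// Usage: the same three points, with no count argument needed.
void LineShapeExample()
{
    std::vector<Point> points;
    points.push_back(Point(10, 10));
    points.push_back(Point(15, 15));
    points.push_back(Point(10, 20));
    assert(LineShape(points) == "10,10\n15,15\n10,20\n");
}
```

The objects are held by value here, so nothing needs to be created with new or cleaned up afterwards.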

Variable Length Arguments in Java

In Java it’s a lot easier as the functionality is built into the language. Additionally you don’t need to pass the number of parameters and also the type is determined for the entire set of parameters (in our case Point).

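Note that a public class must live in its own file, so Point goes in a separate file (Point.java); declaring it as a non-static inner class of Var would not work, as it could not then be instantiated from the static main method.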

So our Point class is:

public class Point
{
	public int x;
	public int y;
	Point(int ix, int iy)
	{
		x = ix;
		y = iy;
	}
}

And the main Var.java contains:

public class Var
{
	public static void LineShape(Point... points)
	{
		for (Point p: points)
		{
			System.out.println(p.x + "," + p.y);
		}
	}

	public static void main(String args[])
	{
		Point a = new Point(10,10);
		Point b = new Point(15,15);
		Point c = new Point(10,20);
		LineShape(a,b,c);
	}
}

So in Java we just need to declare a method with Type... name and then iterate through the array in a for loop.

Variable Length Arguments in PHP

Variable length arguments aren’t quite as built into PHP as they are in Java (where they are an actual language construct), but PHP natively provides functions to support variable length parameters to methods, such as func_num_args (number of arguments) and func_get_args (arguments as an array).

<?php
class Point
{
	public $x;
	public $y;
	public function __construct($ix, $iy)
	{
		$this->x = $ix;
		$this->y = $iy;
	}
}

function LineShape()
{
	$num_args = func_num_args();
	$arg_list = func_get_args();
	for ($i=0; $i < $num_args; $i++)
	{
		$point = $arg_list[$i];
		echo $point->x.",".$point->y."\n";
	}
}

$a = new Point(10,10);
$b = new Point(15,15);
$c = new Point(10,20);

LineShape($a, $b, $c);
?>

 

PHP Dynamic Factories

Design patterns are common solutions to commonly-encountered problems in software development. Of these the most widely used are creational patterns – methods of creating objects, most notably the factory pattern. The factory pattern is used for the centralised instantiation of related class objects.

Let’s illustrate this with an example using the idea of shapes. We want to have an abstract base Shape class and then concrete classes derived from the base class of specific (or concrete) shapes.

abstract class Shape
{
	abstract function Draw();
}

class Square extends Shape
{
	function Draw()
	{
		echo "Square";
	}
}

class Circle extends Shape
{
	function Draw()
	{
		echo "Circle";
	}
}

We would, without any of this factory nonsense, instantiate the classes directly:

$square = new Square();
$circle = new Circle();

Using the standard factory method rather than instantiating directly we would create classes to handle the instantiation for us. First we define an abstract ShapeFactory class and then specific concrete factories to create individual shapes. So adding to our previous code:

abstract class ShapeFactory
{
	abstract function Create();
}

class SquareFactory extends ShapeFactory
{
	function Create()
	{
		return new Square();
	}
}

class CircleFactory extends ShapeFactory
{
	function Create()
	{
		return new Circle();
	}
}

So now, rather than creating instances directly we first create a concrete factory instance and then use that to create the concrete shape:

$squareFactory = new SquareFactory();
$circleFactory = new CircleFactory();

$square = $squareFactory->Create();
$circle = $circleFactory->Create();

This is all very well and good (and design patterns in general are a very powerful and useful tool) but it fails to take advantage of PHP’s powers of runtime adaptability, the ability to change and update code behaviour and functionality during execution (at runtime).

For example, having these factories provides us with a standard way to create shapes, but how would we add new shapes easily? With the factory pattern we would still need code modification: new shapes require a new factory, and for this factory to be explicitly used to create objects.

With the wonder that is PHP however we can be more flexible, more dynamic.

Consider the possibility we want to be able to create any type of shape from a factory.

As an example you could have a ShapeMaker class which took a text string for the type and returned an appropriate object (this is known as a parameterized factory):

class ShapeMaker
{
	public function Create($type)
	{
		if ($type == "circle")
			return new Circle();
		else if ($type == "square")
			return new Square();
		else
			return null;
	}
}

This would allow the creation of shapes as follows:

$maker = new ShapeMaker();
$square = $maker->Create("square");
$circle = $maker->Create("circle");

But we would really like to go further than this; we want to create a dynamic factory, one in which shapes are simply registered with a type and an associated class name, and then created by passing the type.

For this to work we would need the following:

  • A list (array) of registered types and their class names
  • A method to register new types
  • A method to create an object of the given type
  • A fall-back; what to do if a creation request is made for a type that doesn’t exist

To make this all even easier to use, in our example the class will be an abstract class using static members and methods (though of course the same idea holds true for a non-abstract class using non-static members; it would just need to be instantiated first).

So, using the same shape code but replacing the factory code with the following:

abstract class ShapeFactory
{
	protected static $types = array();

	public static function Register($type, $class)
	{
		self::$types[$type]=$class;
	}

	public static function IsRegistered($type)
	{
		return isset(self::$types[$type]);
	}

	public static function Create($type)
	{
		// Fall-back: an unknown type, or a type whose class has not
		// been loaded, gives null rather than a fatal error
		if (isset(self::$types[$type]) &&
		class_exists(self::$types[$type]))
		{
			$class = self::$types[$type];
			return new $class();
		}
		else
			return null;
	}
}

ShapeFactory::Register("square","Square");
ShapeFactory::Register("circle","Circle");

We create a ShapeFactory class with the ability to register, check the existence of, and create shape classes. All that remains to do is to register the Square and Circle classes with type names (the lower case versions). Once this is done they can be created with:

$square = ShapeFactory::Create("square");
$circle = ShapeFactory::Create("circle");

This creates Square and Circle objects using the dynamic type identifiers.
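The fall-back deserves a quick look as well. Because Create() returns null for anything it cannot build, calling code should check the result before using it. A minimal sketch, repeating a cut-down Shape and ShapeFactory so it runs on its own; the "hexagon" and "ghost" types and the NoSuchShape class name are made up purely to trigger the fall-back:

```php
<?php

abstract class Shape
{
	abstract function Draw();
}

class Square extends Shape
{
	function Draw()
	{
		echo "Square";
	}
}

abstract class ShapeFactory
{
	protected static $types = array();

	public static function Register($type, $class)
	{
		self::$types[$type] = $class;
	}

	public static function IsRegistered($type)
	{
		return isset(self::$types[$type]);
	}

	public static function Create($type)
	{
		if (isset(self::$types[$type]) && class_exists(self::$types[$type]))
		{
			$class = self::$types[$type];
			return new $class();
		}
		return null;
	}
}

ShapeFactory::Register("square", "Square");
// Registered, but no such class is defined, so Create() gives null
ShapeFactory::Register("ghost", "NoSuchShape");

// "hexagon" was never registered at all
foreach (array("square", "hexagon", "ghost") as $type)
{
	$shape = ShapeFactory::Create($type);
	if ($shape === null)
		echo "Cannot create: " . $type;
	else
		$shape->Draw();
	echo "\n";
}
```

Returning null is only one choice; throwing an exception or returning a default shape would work too. Either way, the calling code never has to name a concrete class.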

So what is the advantage of this?

Well, imagine now that we wish to add a new shape to our code. This shape (Triangle) is contained in a file, triangle.php, which may or may not be included at runtime with an include_once. We may want to limit the inclusion of Triangle to an add-on pack or to specific users, so we have logic to decide whether triangle.php should be included.

All triangle.php now needs to contain is:

class Triangle extends Shape
{
	function Draw()
	{
		echo "Triangle";
	}
}

ShapeFactory::Register("triangle","Triangle");

And then, if it is included, the triangle type automatically becomes available from the ShapeFactory. If ShapeFactory had been extended (very easy) to return a list of possible shapes, and that list was used to populate a UI component, then triangle would now appear, could be selected, and could be used as a type identifier to create a concrete shape of the correct type (Triangle) at runtime.

The idea of the dynamic factory can be taken much further, with list provision or description fields for example, and is in certain purplepixie.org products such as FreeDESK.
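As a rough sketch of those extensions (not taken from FreeDESK; GetTypes(), GetDescription() and the third Register() parameter are hypothetical names chosen here), the factory could store a description alongside each class name and hand back the list of registered types. Create() is left out to keep the snippet short:

```php
<?php

abstract class ShapeFactory
{
	// type => array("class" => ..., "description" => ...)
	protected static $types = array();

	public static function Register($type, $class, $description = "")
	{
		self::$types[$type] = array(
			"class" => $class,
			"description" => $description
		);
	}

	// Hypothetical: type identifiers in registration order
	public static function GetTypes()
	{
		return array_keys(self::$types);
	}

	// Hypothetical: description for a type, or null if not registered
	public static function GetDescription($type)
	{
		if (isset(self::$types[$type]))
			return self::$types[$type]["description"];
		else
			return null;
	}
}

ShapeFactory::Register("square", "Square", "Four equal sides");
ShapeFactory::Register("circle", "Circle", "Perfectly round");
// What triangle.php would add if it were included
ShapeFactory::Register("triangle", "Triangle", "Three sides");

// Something like this could populate a drop-down list
foreach (ShapeFactory::GetTypes() as $type)
	echo $type . ": " . ShapeFactory::GetDescription($type) . "\n";
```

The UI code only ever asks the factory what exists, so a newly included shape shows up without any change to the list-building code.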

Another layer of abstraction could also easily be added: all the functionality of the ShapeFactory is generic to any dynamic factory, with only the list of types and classes being unique (if this were a non-static class these could simply be different instances of a general DynamicFactory class). In the static example we create a DynamicFactory and then extend from it product-specific dynamic factories to house the lists of specific types:

<?php

abstract class Shape
{
	abstract function Draw();
}

class Square extends Shape
{
	function Draw()
	{
		echo "Square";
	}
}

class Circle extends Shape
{
	function Draw()
	{
		echo "Circle";
	}
}

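abstract class DynamicFactory
{
	// A static property declared here is shared by every subclass,
	// so the list is keyed by the name of the factory class the call
	// was made on (get_called_class() needs PHP 5.3 or later). That
	// way each product-specific factory keeps its own list of types.
	protected static $types = array();

	public static function Register($type, $class)
	{
		self::$types[get_called_class()][$type]=$class;
	}

	public static function IsRegistered($type)
	{
		return isset(self::$types[get_called_class()][$type]);
	}

	public static function Create($type)
	{
		$factory = get_called_class();
		// Fall back to null for unknown types or missing classes
		if (isset(self::$types[$factory][$type]) &&
		class_exists(self::$types[$factory][$type]))
		{
			$class = self::$types[$factory][$type];
			return new $class();
		}
		else
			return null;
	}
}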

class ShapeFactory extends DynamicFactory { }

ShapeFactory::Register("square","Square");
ShapeFactory::Register("circle","Circle");

$a_square = ShapeFactory::Create("square");
$a_circle = ShapeFactory::Create("circle");

$a_square->Draw();
echo "\n";
$a_circle->Draw();
echo "\n";

?>
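One thing worth keeping in mind when splitting a static factory into a base class and product-specific subclasses: in PHP a static property declared in a parent class is shared by all of its subclasses unless a subclass redeclares it. The sketch below uses made-up class names (SharedFactory, SeparateFactory and friends) to show that behaviour, and one way round it using late static binding (static::, PHP 5.3 or later) with a redeclared property:

```php
<?php

abstract class SharedFactory
{
	protected static $types = array();

	public static function Register($type, $class)
	{
		self::$types[$type] = $class;
	}

	public static function GetTypes()
	{
		return array_keys(self::$types);
	}
}

class ShapeFactoryA extends SharedFactory { }
class ColourFactoryA extends SharedFactory { }

ShapeFactoryA::Register("square", "Square");
ColourFactoryA::Register("red", "Red");

// Both entries come back: there is only one $types array
echo implode(",", ShapeFactoryA::GetTypes()) . "\n";

abstract class SeparateFactory
{
	protected static $types = array();

	public static function Register($type, $class)
	{
		// static:: resolves to the class the call was made on
		static::$types[$type] = $class;
	}

	public static function GetTypes()
	{
		return array_keys(static::$types);
	}
}

class ShapeFactoryB extends SeparateFactory
{
	// Redeclaring gives this factory its own storage
	protected static $types = array();
}

class ColourFactoryB extends SeparateFactory
{
	protected static $types = array();
}

ShapeFactoryB::Register("square", "Square");
ColourFactoryB::Register("red", "Red");

// Only the shape entry comes back this time
echo implode(",", ShapeFactoryB::GetTypes()) . "\n";
```

With a single factory subclass the difference never shows, which is why it is easy to miss.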

Some links to source files for download:

classic.php : Shows the classic factory pattern

dynamic.php : Shows the first dynamic iteration along with the maker class

dynamic2.php : Shows the finalised version