SSH SOCKS Proxy Control

One of the many amazing features of SSH is its various forms of traffic forwarding. You can forward ports (local or remote) and even set up a network tunnel with -w, and you can also use SSH to set up a SOCKS proxy for you. This creates a SOCKS proxy server on your local machine which directs web traffic through a remote machine once the proxy is configured in your browser/system.

This is incredibly useful for accessing general web content, for example on an otherwise firewalled network. It has the added bonus of encrypting traffic between you and the remote system, handy if you're worried about local snoopers or outgoing firewalls.

Starting a SOCKS proxy is easy: ssh -D PORT username@remotehost

This will start a proxy which can be accessed via PORT on localhost and connects through to remotehost as username. Various other options are useful too, so the most common invocation is:

ssh -D PORT -f -C -q -N username@remotehost

Here: -D, as before, says start the proxy on this port; -f means go to the background after authentication; -C compresses traffic (usually beneficial); -q enables quiet mode (no output to the user); and -N means don't run a remote command, just make the connection.
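Once it's running you can test the proxy from the command line before touching any browser settings, for example with curl (assuming the proxy is on port 8123):

curl --socks5-hostname localhost:8123 https://example.com/

The --socks5-hostname variant also sends DNS lookups through the tunnel, which is usually what you want.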

This is the command I commonly used to connect to different sites. To make things easier I had a number of scripts in my bin directory with the values hard-coded. Opening became easy, but closing the connection required looking through a process list and stopping the right ssh process.

Given that I only wanted to run this in userspace (not system-wide or as part of startup etc.), but did want to be able to bring the link up and down easily without holding a tty (so running in the background), I put together a quick bash script inspired by the old init.d services:

#!/bin/bash

SERVER=some.remote.server
USERNAME=username
PORT=8123

function vpn_pid(){
  # Find the PID of the ssh process running a dynamic forward on our port
  pgrep -f "ssh -D ${PORT}"
}

case $1 in

up)
  echo "Starting VPN as ${USERNAME}@${SERVER}"
  ssh -D ${PORT} -f -C -q -N ${USERNAME}@${SERVER}
  echo "VPN Started"
  ;;
down)
  echo "Stopping VPN"
  PID=$(vpn_pid)
  if [ -z "$PID" ]
  then
    echo "SOCKS VPN does not appear to be running"
  else
    echo "Killing PID ${PID}"
    kill ${PID}
    echo "Kill signal sent"
  fi
  ;;
status)
  PID=$(vpn_pid)
  if [ -z "$PID" ]
  then
     echo "SOCKS VPN does not appear to be running"
  else
     echo "SOCKS VPN appears to be running, pid=${PID}"
  fi
  ;;
*)
  echo "Usage vpn up|down|status";
  ;;
esac

This script, imaginatively called vpn (even though, yes, I know it's not actually a VPN), lets me start, check, and stop the proxy very easily.
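Save it as vpn somewhere in your PATH, make it executable (chmod +x vpn), and then it's just:

vpn up
vpn status
vpn down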

Posted here just because!

RPi @ QUB (Raspberry Pi Wireless at Queen’s University Belfast)

Raspberry Pis are great, but there can be some limitations in the cut-down GUI utilities they come with. One of these is joining enterprise/corporate WiFi networks, the sort that require username/password authentication rather than a single pre-shared key.

Luckily there are quite a lot of guides out there for different networks here and there (including eduroam), but it took me a few goes to get a Pi with a nano network adaptor working on the QUB WiFi (_QUB_WiFi), so here's the configuration you need. (Note this works for me; it could well be I don't need all the options, but with this it auto-connects in about 15 seconds.)

You need to edit the /etc/wpa_supplicant.conf file and add a network section as follows:

network={
    ssid="_QUB_WiFi"
    key_mgmt=WPA-EAP
    eap=PEAP
    identity="*QUB-STAFFSTUDENT-NUMBER*"
    password="*QUB-NETWORK-PASSWORD*"
    phase1="peaplabel=0"
    phase2="auth=MSCHAPV2"
    disabled=0
    mode=0
}

To test this you can run: sudo wpa_supplicant -i wlan0 -c /etc/wpa_supplicant.conf

And have a look at the output. Note that once it's connected you'll have to run sudo dhclient wlan0 on another terminal to get an IP address (on a normal login after bootup it'll do all of that for you).


Why is software such a monster?

One of the questions I ask first when I give talks to prospective students or public groups is “What is Software Engineering?”, and then I get their responses. Usually the answers are similar: “programming”, “building an app”, “writing code” – all along the technical development line.

My next step (not that it does much for student numbers) is to tell them they’re wrong: people focus on the programming and development aspects, but in fact software engineering is not just programming in the same way that bridge building is not just bricklaying. If you were to say someone was a “Bridge Engineer” it could be they do anything from designing, testing and model making to laying foundations, sinking piles, suspending cables, or, yes indeed, bricklaying. In fact I say, no doubt smugly and patronisingly, “Software Engineering isn’t a job or a profession, it’s a whole industry”. The next 20 minutes or so are then filled as I labour the point and talk about the roles from requirements engineer through programmer to systems engineer, before they troop dejectedly out for a free lunch.

But this got me thinking; what if we did build bridges in the same way we build software?

Certainly there’s no doubt software engineering has advanced in leaps and bounds since the Software Crisis of the 1960s, and a great deal has changed since Fred Brooks famously categorised software projects as “werewolves” (monsters that can turn around at any time and bite you!). But, advances aside, software projects can and do still fail, and the world is full of semi-functional legacy systems nobody dares to touch for fear of the house of cards collapsing.

Sure, in the formalised world of big business and large projects, we now generally expect projects to succeed. Good requirements engineering (knowing what you should actually be building!), software design (a sensible design to meet the goals and evolve), iterative development and agile approaches (building it a piece at a time, checking in with the customer on a regular basis, responding to change), testing (making sure it actually works!), and use of fancy modern cloud infrastructure (making it someone else’s responsibility to keep the lights on) have made this happen. But is this how most software is developed? As we’ve democratised software development, made a million open access resources, and put a computer in every pocket, we are now in a position where most coding is done in bedrooms or small offices by individuals or tiny teams. Is this all being done in adherence with best practice on requirements traceability and agile principles? Probably not.

Are even we as professional software engineers eating our own dog food? How many utility scripts or programs in our home directories could be referred to as “quick dirty hacks”?

So, what if we built bridges the same way we (a lot of the time) build software?

We’d start with a seemingly simple requirement – I need to cross a river.

[Image: Bridge-1]

A simple solution then – drop in some big rocks which can act as stepping stones.

[Image: Bridge-2]

Brilliant! This solves the immediate problem and our gallant hero can cross to market and buy some bronze, or a chicken, or something.

However, in the cauldron of innovation that is humanity, someone goes and invents the wheel and creates a cart. After trying to attach it to the aforementioned chicken, an unimpressed pig, and some crows, someone tries a horse and the winning formula is found. Except horses and wheels can’t use stepping stones (the requirements have evolved).

Simple answer – lay some planks over the stones.

[Image: Bridge-3]

Time passes, the wheel is voted “best thing prior to sliced bread” four years in a row, and horses and carts cross our bridge back and forth. But there’s no stopping progress, and some lunatic builds a steam-powered cart on rails known as a train, much to the anger of the horses’ union. The train is much heavier than the cart, can’t go up and down banks, and needs rails, so we modify the bridge again: we raise the deck and lay the rails, building on top of the planks resting on the stepping stones.

[Image: Bridge-4]

Trains, it turns out, are way more popular than horse-drawn carts, and make much better settings for murder mysteries. As their number and use increase they get heavier and heavier, faster and faster. No longer will the rickety beams hold up, so the bridge is reinforced over and over as each new train comes onto the line.

[Image: Bridge-5]

In time even steam becomes passé; some joker goes and puts the new-fangled internal combustion engine in its own little chariot and the age of the motor vehicle is born.

Quick as a flash the bridge solution is found, and a new suspended roadway is bolted on the top!

[Image: Bridge-6]

Evolved from stepping stones, still standing on those very foundations, and being used heavily day in, day out. It works, mostly, taking more volume and variety of traffic than ever before, outliving by far the stepping-stone engineer and the many after who cared for and modified it through the centuries.

Here we are, bang up to date, with a legacy bridge that has evolved over the years, the majority of its engineers dead and gone, some from the Black Death.

Now Dave is responsible for maintaining the bridge, the hodge-podge of obsolete half-forgotten technologies, built by people long since gone using techniques long since lost to the sands of time.

One day, going about his usual bridge duties, he sees a cable he thinks is loose…

[Image: Bridge-7A]

And being a conscientious sort he gives it a good old tighten up.

[Image: Bridge-7B]

So the bridge collapses, spectacularly, probably just as a busload of orphans and accompanying nuns is travelling across. There is outcry, rage, incomprehensible gibberish of outrage on Twitter, fingers pointed at the government, promises of enquiries, and a desire on all sides to find someone to blame. The burden of blame naturally falls on the obvious suspect, the man with the smoking wrench in hand: Dave.

[Image: Bridge-Result]

We know who to blame, and everyone can rest assured it was definitely, and uniquely, his fault. He rots in jail, flowers are laid at the old bridge site, and we start all over again.

But is this really how software is developed?

I believe so, yes. Certainly more often than it should, and more often than we would like to admit. All too often software is quickly (and dirtily!) built to solve the minimum problem right now, and even at the time we say things like “when I get time I’ll come back and do this properly” or “well this will do for now until we replace it/the need goes away”, and how often do we do that? We could just as well say “well this will do until pigs fly to the moon and prove it’s made of cheese”.

But the initial quick dirty solution isn’t the problem. The issue comes with uncontrolled evolution. As surely as night follows day (follows night, follows day, etc.) requirements will change; we in the IT industry exist in a state of constant flux, and nothing stands still for long. By its very nature successful software, and successful systems, will be required to face change and evolve to meet new requirements continually. When we “hack” these new requirements onto our original “quick and dirty” solution, only ever intended for short-term use, we rapidly introduce all sorts of unintended complexity and integration issues – we make ourselves a codethulu or a screaming tower of exceptions.

Yet we still rarely learn our lesson, because once we’ve hacked one new set of requirements in and it works – sort of, in a “don’t press G on the third input field or it’ll crash” kind of way – we stride away feeling like code-warrior bosses and don’t look back. Until the next inevitable change.

So Brooks is still right?

Well obviously – the man is a genius after all. The essential difficulties he identified with software are still present in most cases.

  • Complex – Check! More complex than ever, really. As computing power grows, so does our demand on what we can do with it. Of course if we properly decompose a problem we can reduce the complexity, but that’s doing things properly.
  • Subject to external conformity – yes siree! External stuff is changing all the time and it’s expected that the software can be tweaked to conform; we don’t change the printer spec to make it work with the software, we change the software.
  • Changeable – more than ever! Change is a natural part of software and entirely unavoidable, unless you’re MySpace.
  • Invisible – Unlike bridges we can’t see software; we can’t kick it, or go and marvel at its grandeur, or prod a few bits to understand its function. Yes, now we have UML and various design tools, but only if we do things properly, which, often, we don’t.

But most of the time even unplanned development projects deliver, so haven’t things got better?

Ah yes, what I like to call the “fallacy of the initial success” (catchy name, right?). If I set out to build a software system to do X and I stick at it, I normally expect the system will eventually do X. The project probably won’t fail partway through into a pit of recrimination and blame.

This is because we now have some amazing tools at our disposal. Web scripting languages combined with HTML and a browser mean anyone can build a program with a UI. Memory-managed (garbage-collected) languages mean nobody need care about freeing resources or causing hardware resets by accessing the wrong address in RAM. Stack Overflow will answer most queries for us, and the world is full of talented developers who put their libraries and source code online, free and available for us to use. So to sit down and build something to do X: no problem! Everything is rosy and it’s home for tea and medals!

The problem however is Y and Z, new requirements that follow on with apparent inevitability. I’ve been in the business long enough to know when I say to myself “it’ll only need to do X” that I am blatantly lying to myself.

My hypothesis, then, is that rather than solving the problems of software development we’ve mainly just transferred the risk from the front end (initial development) to the back end (maintenance and modification) of the development lifecycle.

It’s true we may no longer have werewolves in every project, ready to turn into beasts and sink their teeth into us – the stuff of nightmares – but we certainly now have our fair share of zombies. Many of these are the relatively benign shuffling-gait “braaaaaains…” type, easily avoided or picked off at our leisure, but there’s also a large number of buttock-clenchingly scary 28 Days Later-style vicious running zombies. These are the myriad legacy programs we’re surrounded by, just waiting for the opportunity to smash their way out of the coffin and chase us down for breakfast brains.

So what’s the solution?

All the above is based on one premise: that the lovely and excellent principles of proper software engineering – requirements gathering, analysis, design (flexible, extensible, reusable, standardised, highly cohesive and loosely coupled), agile development, prototyping, stakeholder engagement, iterative methods – aren’t being followed. Sadly I think that’s the case most of the time. Of course if you are following good practice then nothing can go wrong [ha!], or at least we can hope to avoid this particular mess.

So the simple solution is to follow good practice: take a little more time over the design and implementation of solutions, consider future maintainability and reuse potential, and build software to last years, not days or months. Listen to the little bit of us that screams with impotent rage when we “just quickly bodge this”.

We also shouldn’t ever be afraid to tear it down and start again. Iterative does not always mean incremental; if something is becoming unfit for purpose, can’t we find the time to invest in sorting it out properly? How much time do we really spend on preventative maintenance compared to how much we should? Is it fire-fighting, or careful fireproofing to avoid the emergency?

If you want to fix the roof, better do it while the sun is shining.

There are attempts at automated approaches to help us manage these zombie programs, and in fact a large part of my research is in this kind of area. But don’t worry: even if we all start coding properly – engineering, not just programming – tomorrow, there are still plenty of zombies out there already created, so you’re not doing me out of a job. My confrontational attitude and willingness to tell prospective students they’re wrong when I ask them a trick question will do that for me.

Good documentation is another good idea (assuming you’re not following a good practice that tells you documentation is the very spawn of Satan), but remember: documentation can end up evolving like code. We end up with too much of it telling us too little, unsure where to find the little nugget of information we need amongst the volumes. Dave (the bridge man) had plenty of blueprints, and if he had understood them all he would have known not to tighten the fateful wire – but how was he to know?

[Image: Dave with blueprint scroll]


David Cutting is a senior research associate at the Tyndall Centre at UEA, associate tutor, partner at Verrotech, CTO of tech startup Gangoolie, and alleged software engineer (purveyor of much shoddy half-built freeware) – he does not eat his own dog food often enough. He holds both a PhD and an MSc in computer science and by that gives even the poorest student hope. This article is a cut-down version of a presentation entitled “Software! It’s Broken” he gives when anyone will listen. Artwork by Justin Harris and Amy Hunter.

All content is (C) Copyright 2015-2017 David Cutting (dcutting@purplepixie.org), all rights reserved.

A day at Whitespace

On Tuesday I decanted myself down to Whitespace Norwich for the day, a “not-for-profit co-working space”. Apparently “with a fast-growing network of entrepreneurs, investors, mentors and influencers, the Whitespace network encourages collaboration and growth within a supportive and like-minded community of tech and digital innovators”. Blimey.

To bolster its enterprise engagement and community profile the School of Computing Sciences at UEA, which is one of the places I hang out, has a desk there. So I thought I’d try it out.

[Image: Whitespace in Jarrolds Building]

Whitespace takes up a floor in a classic old building by the river in Norwich, very close to the city centre and offering some nice views.

[Image: View from the Whitespace stairs]

Having had a quick tour of the facilities (well-appointed kitchens and a games area) I got set up.


Nice comfortable chair and plenty of power; it turned out the 5GHz WiFi wasn’t bad either.


The only small issue I had was monitor poverty… used as I am to two 24-inch monitors plus my laptop screen… and the desk looked a bit empty.


Other than my screen real estate issues all went well; I even met and talked to a few interesting people (which for a computer scientist is quite something). Unfortunately I couldn’t capitalise on the networking as I was up against it as usual. So I’ll be back again when I’m not quite as busy.

There was a nice energetic buzz around and plenty of groovy people doing no doubt groovy tech stuff. It certainly seemed like the kind of place a new startup could grow and thrive.

Being an open-plan shared working space meant there was some noise, but this wasn’t really distracting. I did put headphones in after a while, but only because I wanted to listen to some music rather than because I needed to drown anything out. It was certainly quieter than a PhD lab.

All in all a really great place to work, a good resource for UEA to use, and somewhere I’ll try my hardest to make use of again.

The Good

  • Friendly welcome
  • Nice kitchen and other resources
  • Fast WiFi
  • Chairs
  • Big desks
  • Centre manager on hand for any queries
  • Lovely setting (building + views)
  • Handy setting (so central!)

The Bad

  • The only issue I had was heat. It was a very hot, muggy day and there’s no air-con (it’s been suggested on the whiteboard). Everyone else seemed to have desk fans; I did not… but copious cold Diet Pepsi helped. Note: my UEA office would have been just as stifling, but I do have a fan there.


FreeNATS on Github

After promises dating back many years and a couple of false starts, the FreeNATS development codebase is now fully hosted on github.com (purplepixie/freenats)!

This will, unlike previous code releases, be the active base for development and includes the new updated build/release system as well.

The new build system has been used to build 1.15.0a, which has been pushed out as the current development version (otherwise it shouldn’t change anything at all).

Find FreeNATS on github at: https://github.com/purplepixie/freenats

Containerised FreeNATS

As a trial to see if there’s demand for a pre-packaged container of FreeNATS, a pre-alpha release of a Docker container has been made available.

The image can be found on the docker hub (purplepixie/freenats) and the project on github (purplepixie/docker-freenats).
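If you just want to give it a spin, something along these lines should do it (a sketch – the port mapping here is my assumption, so check the wiki page below for the actual run options):

docker pull purplepixie/freenats
docker run -d -p 8080:80 purplepixie/freenats

Then point a browser at http://localhost:8080/.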

Links and details are available on the FreeNATS Docker wiki page, feedback welcome.


Facebook PHP SDK Tips

Hmm, so I wrote this in November 2014… and didn’t publish it. Then I found it again in November 2015. I don’t know why it was left in draft – usually because there’s something horribly wrong with it. But… it seems OK, and I haven’t played with the FB API since then, so I doubt I can add anything (or even be sure it still applies), hence… PRESSING PUBLISH:

Over the last two days I’ve been at a startup event called SyncTheCity, building a little startup called cotravel (website, twitter, facebook). One of the supposedly easy tasks was to integrate the Facebook SDK in PHP to allow Facebook logins.

I blame lack of sleep and time pressure but actually this was a pain in the behind. However I have two tips for future sufferers:

1. Development Domain

Facebook will only accept requests from domains you have linked. In our case this meant our live domain. We develop locally on our machines (127.0.0.1 or localhost), but you can’t register those as valid domains.

However, what you can do is create an A record in your domain pointing to 127.0.0.1, so you have local.yourdomain.com resolving to 127.0.0.1. You can then add local.yourdomain.com to Facebook’s domains for your app and off you go.
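You can check the record is live with something like:

dig +short local.yourdomain.com

which should print 127.0.0.1.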

2. URL is required at both ends

The docs say you must pass a redirect_uri to redirect to when calling – but you must also pass the SAME URI on return, or else death and doom result.

This means that your two files (http://domain.com/sender.php and http://domain.com/recipient.php) must have code like:


<?php // sender.php: send the user off to Facebook to log in (assumes the v4 SDK is autoloaded)
use Facebook\FacebookSession;
use Facebook\FacebookRedirectLoginHelper;

session_start();
FacebookSession::setDefaultApplication('appid', 'appsecret');
$URL = "http://domain.com/recipient.php"; // where Facebook will send the user back
$helper = new FacebookRedirectLoginHelper($URL);
$params = array(); // any permission scopes you want, e.g. array('email')
$loginUrl = $helper->getLoginUrl($params);
header("Location: ".$loginUrl);

And recipient.php:


<?php // recipient.php: handle the redirect back from Facebook
use Facebook\FacebookSession;
use Facebook\FacebookRedirectLoginHelper;
use Facebook\FacebookRequestException;

session_start();
FacebookSession::setDefaultApplication('appid', 'appsecret');
$URL = "http://domain.com/recipient.php"; // must be the SAME URL as passed in sender.php
$helper = new FacebookRedirectLoginHelper($URL);
try {
    $session = $helper->getSessionFromRedirect();
} catch(FacebookRequestException $ex) {
    echo "Facebook Authentication Error";
} catch(\Exception $ex) {
    echo "Facebook Authentication Failure";
}

Note that in the above this is the line that differs from the example:


$helper = new FacebookRedirectLoginHelper($URL);

i.e. you must pass the URL!

Long-Held HTTP Polling with AMP

Wow found this in a dark corner of a hard drive so, better late than never, here you go:

A technical working paper on the evaluation of Long-Held HTTP Polling for PHP/MySQL Architecture.

Abstract

When a web client needs to periodically refresh data held on the server there are generally two approaches. Interval polling (“short polling”), which is most commonly used, where the client repeatedly reconnects to the server for updates, and a technique in which the HTTP connection is kept open (“long polling”). Although work exists investigating the possibilities of long polling few if any experiments have been performed using an Apache MySQL PHP (AMP) stack. To determine the potential effectiveness of long polling with this architecture an experiment was designed and conducted to compare update response times using both methods with a variety of polling intervals. Results clearly show a marked improvement in timings with long polling, with the mean response time down to 0.38s compared to a mean of just under the polling interval for short polling. Issues of the complexity and load implications of wide use of long polling are discussed but were outside the remit of this experiment owing to a lack of resources.
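The essence of the long-held approach is that the server holds the request open until it actually has something new to report, rather than answering immediately. A minimal PHP sketch of such an endpoint (check_for_update() is a hypothetical helper doing the actual MySQL query, and the timings are illustrative):

<?php
// Long-held polling endpoint (sketch only)
set_time_limit(70);   // let the script run longer than the hold time
$timeout = 60;        // maximum seconds to hold the connection open
$start = time();
$data = null;
while ((time() - $start) < $timeout)
{
    $data = check_for_update(); // hypothetical: rows newer than the client's last timestamp
    if ($data !== null)
        break;                  // something changed: respond immediately
    usleep(250000);             // otherwise sleep 250ms and check again
}
header("Content-Type: application/json");
echo json_encode($data);        // null on timeout, so the client simply re-polls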

Notes

Note that this is basically a subset of work performed for my MSc dissertation.

Links

Available from Purplepixie Labs: http://labs.purplepixie.org/content/cutting-long-held-php-2015.pdf

Purplepixie Permalink: http://go.purplepixie.org/sleepjax-long-held-http-request


cotravel at SyncTheCity 2014: a git history

Coming up for a year ago now I took part in the inaugural Sync the City event in Norwich, and guess what – it’s happening again!

Last year we undertook the challenge to design, build/develop, and deploy our business cotravel in 54 hours. This culminated in a working system (with a proper back end, usable APIs, web interface etc.), several actual business agreements with local taxi companies, and finally an excellent presentation by Rod.

The event garnered quite a bit of press (in which we got a mention!) and has been the subject of some other reflective blogging by team members. Sadly I can’t take part this year (some nonsense about a PhD I have to finish…), but it got me looking back.


During the two days of (frantic) development we used a git repo on BitBucket to manage our source code between the development machines and the test and live environments. This got me thinking it would be nice to visualise the development process by mining the repository, if only I had the tools and computing power. Then I remembered I totally do! Analysing software systems, including from repository logs, is kinda what we do. So, while waiting for a long experiment to finish, I realised I could quickly bodge a couple of our tools together and see what I could get.

During the development period (evening of 20th November to evening of 22nd November) there were two developers: David (me), and Adam (not me). So I plotted commits by hour both per developer and in total, from the very first commit at 17:28 on the 20th to the last at 16:46 on the 22nd.
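(If you fancy doing something similar, a reasonably recent git will give you the raw per-hour counts without any special tooling – a rough one-liner, and the format strings are just one way of slicing it:

git log --pretty='%an %ad' --date=format:'%d %Hh' | sort | uniq -c

which prints a commit count per author per day and hour.)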

[Graph: commits per hour, per developer and in total]

And there we have it, from 00:00 on the 20th to 23:59 on the 22nd. As you can see, the main splurge of work (technical term) was on the 21st from 06:00 to 22:00, which is when we built the majority of the functionality. On the 22nd we again got an early start, but with nowhere near the commit intensity, and it petered out anyway after go-live at 14:00 on the 22nd, ahead of the presentations.

There are two interesting peaks on the 21st. The first was mainly Adam putting together the agreed UI elements and repeatedly pushing them up for all to see. The second was me trying (and generally failing) to master the Facebook API, as it needed working domains etc. and so had to be pushed to a live server to test (until I found a workaround, which I really should document).

Of course once we went live…

Cotravel Lives!

We then had all manner of other stats available to us, for example a log of visitors to the site and the number of API calls made from our snazzy web 2.0 front-end to my dodgy interfaces. We went live at 14:00 and kept track every hour until 20:00.

[Graph: Cotravel live page views and API calls, in thousands]

Here’s the graph showing (in 1000s) the page views and the (much higher) volume of API calls made to the system from 14:00 to 20:00, when we finally collapsed.

Overall Sync the City was a great experience: a lot of fun and a great way to meet interesting people and do some ninja coding (plenty of opportunities to DevOps the **** out of it or, as we used to call it, making live changes to the code while users are still interacting).

Good luck to anyone taking part this year!

Research Paper

Late notice, but a journal paper has been published around my research in the International Journal on Advances in Software.

This can be downloaded from the publisher but, as it’s open access, can also be downloaded directly from here: An Extensible Benchmark and Tooling for Comparing Reverse Engineering Approaches (http://labs.purplepixie.org/content/cutting-noppen-ijais-2015.pdf).

[Image: title page of the article]