I am thinking about renaming this series to ‘The Bonsai Mysteries’.  How frustrating it is to think that you are beginning to know how something works, only to find that it has suddenly malfunctioned (i.e. leaves dropping off) without good reason!

My mini tree started its journey with me in my newly decorated home office, then the leaves started to go yellow and drop off.  I thought it might be the central heating so, when the weather was mild, I found it was happier in the conservatory… until, as the weather got warmer, the leaves started to turn yellow and drop off.   So, I moved it into the garden.  All good until today, the warmest day of the year so far, when I found that the leaves were dropping off without even having the decency to warn me by turning yellow first!

The constant care and attention required by ‘tiny timber’ is quite stressful.  Especially as I don’t want to rename any of this series to ‘How I killed my Bonsai’!  Finding a business analogy for that doesn’t bear thinking about.

So, the analogy that sprang to mind whilst contemplating my latest Bonsai issue was around resilience – both in terms of architecture and the personal resilience of your IT team.

Architectural Resilience

Just as I have tried many different environmental changes for my Bonsai, so will you have changed your IT infrastructure to ensure its resilience.  How do you know that you have got it right?  There is a balance between being realistic about the disasters your infrastructure may face and the budget that you may have available for protecting it.

If we think about a standard on-premises, private cloud or co-located (your servers stored in a data centre) infrastructure, there are some essentials, including:

  1. Backup and server snapshots – regular full backups, daily incremental backups and server snapshots all stored at a different location to the main infrastructure
  2. UPS – Uninterruptible power supplies (very large battery packs) for the server kit so that the servers have time to shut down ‘gracefully’ in the event of a power outage
  3. If in a data centre, all services covered by secondary sources to which they can fail over (e.g. alternative power to the building, alternative data connectivity routes from disparate suppliers)
  4. Security – secure routing of data around your infrastructure and any devices connected to it as well as protection against viruses and data loss.

Depending on the size of the business and the criticality of various elements of your architecture, it may be necessary to have a Disaster Recovery (DR) or Business Continuity environment waiting in the wings.  This is effectively a copy of some, or all, of your environment.  The copy is updated periodically, perhaps daily, hourly or down to the minute, depending on business need and resources (yes, you guessed it, the more “live” it is the more money it costs).  The maximum age of the data you are prepared to fall back on is known as the Recovery Point Objective or ‘RPO’.  In the event of an issue with the live environment, the copy can be fired up so that users can log into it and carry on working within a certain amount of time, known as the Recovery Time Objective or ‘RTO’: for example, all core systems back online within 8 hours.
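To make the two objectives concrete, here is a minimal sketch in Python of how you might check the results of a fail-over test against them.  The function names, the dates and the one-hour RPO are purely illustrative assumptions; only the 8-hour RTO comes from the example above.

```python
from datetime import datetime, timedelta


def meets_rpo(last_copy: datetime, failure: datetime, rpo: timedelta) -> bool:
    """True if the work lost (time since the DR copy was last updated) is within the RPO."""
    return failure - last_copy <= rpo


def meets_rto(failure: datetime, back_online: datetime, rto: timedelta) -> bool:
    """True if the time taken to get users working again is within the RTO."""
    return back_online - failure <= rto


# Hypothetical fail-over test: all timings below are made up for illustration.
failure = datetime(2024, 6, 1, 14, 0)       # live environment fails
last_copy = datetime(2024, 6, 1, 13, 15)    # DR copy was last updated 45 minutes earlier
back_online = datetime(2024, 6, 1, 20, 30)  # users working again 6.5 hours later

print("RPO met:", meets_rpo(last_copy, failure, timedelta(hours=1)))
print("RTO met:", meets_rto(failure, back_online, timedelta(hours=8)))
```

The point of separating the two checks is that they fail independently: a DR environment can come up quickly but with stale data, or hold fresh data but take too long to bring online.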

Several Cloud providers are very loud in pronouncing that they have multiple data centres, but they are often less clear on the time to enact failover and how fresh the data will be when the secondary centres kick in.

Public cloud hosting, through suppliers such as Microsoft Azure and AWS, has the benefit of being able to provide backup and DR solutions as part of the purchased architecture and this takes the onus of implementing bespoke solutions away from your IT team. That being said, these solutions will still require some administration.

In any of these scenarios, the key success criteria are research and regular testing.

Had I fully researched the watering and environmental requirements for my Bonsai (which I confess I have still not done in detail) and had I been a bit more attentive, I might have avoided the leaf dropping incidents… or just bought a cactus which is more akin to my skills and needs as a gardener!

Obviously, the consequences of not doing your research to protect your IT infrastructure are somewhat more dramatic.  Every infrastructure is different, and there will be aspects of it that require more or less resilience depending on certain factors.  The expertise of specialists in this area is invaluable in ensuring that you have considered all of the issues, potential events and likely outcomes.  It’s also quite easy to overlook which systems need to be protected by disaster recovery systems.  A common example is the telephony system.  Another is email: whilst the Microsoft Azure public cloud service includes add-ons that can be used to back up the environment, the Microsoft 365 service which is used by some firms to host their email requires separate backup solutions.

As far as testing is concerned, a supplier can, with the best will in the world, provide a service and confirm what your expectations should be.  But unless you have tested it yourself, you will not know whether those expectations will be met, and in the event of a disaster insult will be added to injury if they are not and the disaster cannot be mitigated.

Changes to your live environment will impact upon your protective measures.  For each change you should ask whether you need to update the backup, whether you should include this change in your DR environment, whether the area of change is covered by security provisions, etc.  Consideration of these critical elements should therefore form part of a formal Change Control process.   In addition, you should perform a full fail-over test to your DR environment at least annually (including a partial restore from backup) as well as testing your cyber breach plan.
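As an illustration, the questions above could be captured as a simple checklist attached to each change request.  This is only a sketch in Python: the `ChangeRequest` class, its field names and the example change are hypothetical, and a real Change Control process would normally live in your service desk tooling rather than in code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChangeRequest:
    """A change to the live environment, with the three resilience questions from the text."""
    description: str
    backup_reviewed: bool = False    # Does the backup need updating?
    dr_reviewed: bool = False        # Should the change be included in the DR environment?
    security_reviewed: bool = False  # Is the area of change covered by security provisions?


def outstanding_checks(change: ChangeRequest) -> List[str]:
    """Return the resilience questions that have not yet been answered for this change."""
    checks = {
        "backup": change.backup_reviewed,
        "DR environment": change.dr_reviewed,
        "security provisions": change.security_reviewed,
    }
    return [name for name, done in checks.items() if not done]


# Hypothetical example: a new server has been added to the backup but nothing else.
change = ChangeRequest("Add new document management server", backup_reviewed=True)
print(outstanding_checks(change))
```

A change would only be approved once the list of outstanding checks is empty.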

Lastly, if you are hosted in a data centre, you should ensure that you understand how your hosting provider and the data centre are testing their own disaster recovery plans.

Personal Resilience

After 30 years in varying roles in IT I seem to have developed 4-inch-thick reinforced steel skin but I am often reminded that others do not have the benefit of that.

A good IT service is very much the least that is expected by a business, and IT teams are rarely congratulated for achieving this.  However, when there is an issue, it can often be either critical to an individual’s ability to work or, at the extreme, critical to the business’s ability to operate.

In addition to this, IT staff tend to be highly technical people who may take any failing of a system or process to be a reflection on them personally.

In my experience, resilience is most tested where an IT team does not have the full understanding and support of the business.  I come back to a recurrent theme of my articles – communication.  The IT team should have a conduit to de-mystify who they are and what they do.  Most IT projects are firm-wide projects with input from IT, and should be communicated in this way so that the whole firm becomes engaged in achieving the objectives – particularly where that means working closely with IT.

Lastly, it is important that the business understands the context in which IT staff work – technology changes happen daily.  Your IT team will be running to keep up at all times.  They won’t and can’t know everything, and will need time to look outward, time to train and learn, and time to consolidate and absorb.  Confidence and resilience go hand-in-hand.

 

In summary

Returning to the subject of my Bonsai, I enacted my DR plan (soaked its roots in water) and accepted that I need to revise my recovery point objective (check the leaves more often), and if it is still too much for me then there is always outsourcing…

Written by…

Cathy Kirby
