I am thinking about renaming this series ‘The Bonsai Mysteries’. How frustrating it is to think that you are beginning to know how something works, only to find that it has suddenly malfunctioned (e.g. leaves dropping off) without good reason!
My mini tree started its journey with me in my newly decorated home office, then the leaves started to go yellow and drop off. I thought it might be the central heating so, when the weather was mild, I found it was happier in the conservatory… until, as the weather got warmer, the leaves started to turn yellow and drop off. So, I moved it into the garden. All good until today, the warmest day of the year so far, when I found that the leaves were dropping off without even having the decency to warn me by turning yellow first!
The constant care and attention required by ‘tiny timber’ is quite stressful. Especially as I don’t want to rename any of this series to ‘How I killed my Bonsai’! Finding a business analogy for that doesn’t bear thinking about.
So, the analogy that sprang to mind whilst contemplating my latest Bonsai issue was around resilience – both the resilience of your architecture and the personal resilience of your IT team.
Architectural Resilience
Just as I have tried many different environmental changes for my Bonsai, so will you have changed your IT infrastructure to ensure its resilience. How do you know that you have got it right? There is a balance between being realistic about the disasters your infrastructure may face and the budget that you may have available for protecting it.
If we think about a standard on-premises, private cloud or co-located (your servers stored in a data centre) infrastructure, there are some essentials, including:
- Backup and server snapshots – regular full backups, daily incremental backups and server snapshots all stored at a different location to the main infrastructure
- UPS – Uninterruptible power supplies (very large battery packs) for the server kit so that it has time to shut down ‘gracefully’ in the event of a power outage
- If in a data centre, all services covered by secondary sources to which they can fail over (e.g. alternative power to the building, alternative data connectivity routes from disparate suppliers)
- Security – secure routing of data around your infrastructure and any devices connected to it as well as protection against viruses and data loss.
Depending on the size of the business and the criticality of various elements of your architecture, it may be necessary to have a Disaster Recovery (DR) or Business Continuity environment waiting in the wings. This is effectively a copy of some, or all, of your environment. The copy is updated periodically, perhaps daily, hourly or down to the minute depending on business need and resources (yes, you guessed it, the more “live” it is, the more money it costs). How far behind the live environment the copy is allowed to be is known as the Recovery Point Objective or ‘RPO’. In the event of an issue with the live environment, the copy can be fired up so that users can log into it and carry on working within a certain amount of time, known as the Recovery Time Objective or ‘RTO’: for example, all core systems back online within 8 hours.
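To make the two objectives concrete, here is a minimal sketch in Python of how a DR test result might be checked against them. The function names, dates and the one-hour RPO are illustrative assumptions rather than part of any particular product; only the 8-hour RTO comes from the example above.

```python
from datetime import datetime, timedelta


def meets_rpo(last_sync: datetime, failure: datetime, rpo: timedelta) -> bool:
    """True if the data lost (time since the DR copy was last updated) is within the RPO."""
    return failure - last_sync <= rpo


def meets_rto(failure: datetime, restored: datetime, rto: timedelta) -> bool:
    """True if users were working again within the RTO."""
    return restored - failure <= rto


# Hypothetical incident timeline (all values are made up for illustration).
last_sync = datetime(2021, 3, 30, 13, 15)  # DR copy last updated
failure = datetime(2021, 3, 30, 14, 0)     # live environment fails
restored = datetime(2021, 3, 30, 20, 30)   # users can log into the DR environment

# 45 minutes of data lost, checked against an assumed 1-hour RPO.
print(meets_rpo(last_sync, failure, timedelta(hours=1)))
# 6.5 hours offline, checked against the 8-hour RTO used as the example above.
print(meets_rto(failure, restored, timedelta(hours=8)))
```

The point of the sketch is that the RPO looks backwards from the failure (how much work is lost) while the RTO looks forwards (how long until people are working again), so each needs to be agreed and tested separately.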
Several Cloud providers are very loud in pronouncing that they have multiple data centres, but they are often less clear on the time needed to enact failover and how fresh the data will be when the secondary centres kick in.
Public cloud hosting, through suppliers such as Microsoft Azure and AWS, has the benefit of being able to provide backup and DR solutions as part of the purchased architecture and this takes the onus of implementing bespoke solutions away from your IT team. That being said, these solutions will still require some administration.
In any of these scenarios, the key success criteria are research and regular testing.
Had I fully researched the watering and environmental requirements for my Bonsai (which I confess I have still not done in detail) and had I been a bit more attentive, I might have avoided the leaf dropping incidents… or just bought a cactus which is more akin to my skills and needs as a gardener!
Obviously, the consequences of not doing your research to protect your IT infrastructure are somewhat more dramatic. Every infrastructure is different, and there will be aspects of it that require more or less resilience depending on certain factors. The expertise of specialists in this area is invaluable in ensuring that you have considered all of the issues, potential events and likely outcomes. It’s quite easy to overlook which systems need to be protected by disaster recovery systems too. A common example is the telephony system. Similarly, whilst the Microsoft Azure public cloud service includes add-ons that can be used to back up the environment, the Microsoft 365 service, which some firms use to host their email, requires a separate backup solution.
As far as testing is concerned, with the best will in the world, a supplier can provide a service and confirm what your expectations should be, but only testing will show whether those expectations are met. In the event of a disaster, insult will be added to injury if they are not and the disaster cannot be mitigated.
Changes to your live environment will impact upon your protective measures. For each change you should ask whether you need to update the backup, whether you should include this change in your DR environment, whether the area of change is covered by security provisions, etc. Consideration of these critical elements should therefore form part of a formal Change Control process. In addition, you should perform a full fail-over test to your DR environment at least annually (including a partial restore from backup) as well as testing your cyber breach plan.
Lastly, if you are hosted in a data centre, you should ensure that you understand how your hosting provider and the data centre are testing their own disaster recovery plans.
Personal Resilience
After 30 years in varying roles in IT I seem to have developed 4-inch-thick reinforced steel skin but I am often reminded that others do not have the benefit of that.
A good IT service is very much the least that is expected by a business and IT teams are rarely congratulated for achieving this. However, when there is an issue, it can often be either critical to that individual’s ability to work or, at the extreme, critical to the business’s ability to operate.
In addition to this, IT staff tend to be highly technical people who may take any failing of a system or process to be a reflection on them personally.
In my experience, resilience is most tested where an IT team does not have the full understanding and support of the business. I come back to a recurrent theme of my articles – communication. The IT team should have a conduit to de-mystify who they are and what they do. Most IT projects are firm-wide projects with input from IT, and should be communicated in this way so that the whole firm becomes engaged in achieving the objectives – particularly where that means working closely with IT.
Lastly, it is important that the business understands the context in which IT staff work – technology changes happen daily. Your IT team will be running to keep up at all times. They won’t and can’t know everything, and they will need time to look outward, time to train and learn, time to consolidate and absorb. Confidence and resilience go hand-in-hand.
In summary
Returning to the subject of my Bonsai, I enacted my DR plan (soaked its roots in water) and accepted that I need to revise my recovery point objective (check the leaves more often), and if it is still too much for me then there is always outsourcing…