Operational Readiness
Overview
Test Events
Test Events are real events of a similar nature (sport or non-sport) that take place in the venues and provide an early opportunity to both assess and improve the readiness of the IT services and workforce as part of a real event. It is expected that the systems and services will use real users and real data. While the primary objective is to achieve the targeted service level requirements, Test Events also provide an opportunity to test services, people, procedures, tools and communications in a live environment although with lower visibility than the final major event.
Test Events can validate assumptions and planning across a range of topics including people, procedures, spaces, stakeholder expectations, communications, system functionality and equipment needs, as shown below.
Service Continuity & Disaster Recovery
Technology service continuity supports overall business continuity management (BCM) by ensuring that technology services can be resumed within agreed timescales following a disaster or crisis. It is triggered (invoked) when a service disruption occurs on a scale that is greater than can be managed with normal response and recovery practices such as incident and major incident management. This typically includes the unavailability of a complete facility such as a data centre or operations centre.
The essential components of Service Continuity and Disaster Recovery planning include:
A Business Impact Assessment (BIA) by business owners to identify the recovery priority for technology services, categorised by tier and restoration order. The BIA helps to align customer expectations during a disaster.
A risk assessment to identify potential external hazards that may affect technology services such as earthquake, fire, supplier failure, utility outage, network availability or cyber attack.
A Technology service continuity plan that describes in-scope facilities, service categorisation by tier, target recovery times (RTO) and data restoration points (RPO), definition of disaster, organisation structure and roles, invocation process and call tree, communications and collaboration spaces, run book procedures for each technology service, pre-disaster and post-disaster guidelines.
A number of Disaster Recovery tests for the in-scope facilities and services.
A tool to manage the preparation, execution and follow-up of disaster recovery test cases.
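As a minimal sketch of how the service categorisation in a continuity plan might be recorded, the Python below models each service with a tier, an RTO and an RPO, derives a restoration order, and checks a Disaster Recovery test result against the targets. The class and function names, the tier numbers and all time values are illustrative assumptions, not figures from this guide.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ServiceTarget:
    """Recovery targets for one technology service (all values hypothetical)."""
    name: str
    tier: int        # 1 = restore first
    rto: timedelta   # maximum acceptable time to restore the service
    rpo: timedelta   # maximum acceptable window of lost data


def restoration_order(services):
    """Sort by tier, then by shortest RTO within a tier."""
    return sorted(services, key=lambda s: (s.tier, s.rto))


def meets_targets(service, recovery_time, data_loss):
    """True if a DR test result is within both the RTO and the RPO."""
    return recovery_time <= service.rto and data_loss <= service.rpo


# Example services and targets, invented for illustration only.
services = [
    ServiceTarget("Official website", 2, timedelta(hours=4), timedelta(hours=1)),
    ServiceTarget("Accreditation", 1, timedelta(hours=1), timedelta(minutes=15)),
    ServiceTarget("Ticketing", 1, timedelta(hours=2), timedelta(minutes=15)),
]

for s in restoration_order(services):
    print(s.tier, s.name, s.rto)
```

In practice the tiers and targets would come from the BIA agreed with business owners, and the measured recovery time and data loss would come from each Disaster Recovery test.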
Training & Tabletops
The following outlines an overall major events technology training approach:
Training material can be prepared and delivered in person, via video or through an e-learning platform. We highly recommend the use of a good e-learning tool, as it enables the material to be presented in a dynamic way, with quizzes that reinforce the learning, while tracking the progress and completion of each person to be trained.
The scope of the training material should be aligned with the processes, procedures and tools in use and may include the following as an example:
General induction
Technology service landscape
TEAP Process
Incident management / Major incident management / Service request management
ITSM Tool
Venue Technology operations
Customer Service Excellence
Radio usage
Workforce management
Change management
A simple and effective LMS tool is highly recommended to make training content widely available, track team progress and set up quizzes to test knowledge.
A tabletop is a scenario-based exercise in which knowledge of operating processes and procedures is tested in a group setting. By reviewing the scenarios jointly, the team can learn, challenge, question and finally come to a shared understanding of how to deal with technology issues during the event. Some example tabletop scenarios include:
Incident Severity Assessment
Loss of redundant network connection to a venue
Website registration form not working
Slow internet access at HQ premises
Ticketing system not operational
VIP user is unable to print
Communications & Escalations
A Severity 2 incident is about to breach its SLA
Media / broadcast complain that their equipment is not allowed onsite (not previously registered via Test & Tag)
Frequency of communications to MOC for a Severity 1 incident
Staffing
TOC Duty Manager late for shift start
Network engineer in venue gets sick 2 hours into their shift
Additional IT Technician support capacity needed at the media centre
2 IT Helpdesk staff resigned
TOC Director is requested to attend multiple executive briefing sessions throughout the day
Change management & service interruption
CER UPS needs to be shut down for 1 hour maintenance
Upgrade of network management OS
Primary node restart needed for Accreditation database
Emergency OS security patch on all Microsoft servers
Incident Severity Changes
Incident WiFi outage initially assessed at Severity 2, but only 10 users impacted
Electronic access control at venue is down at 3am
VIP client requests to “make it a P1!”
Equipment Requests
Shared printer move request to another room
Additional 75” TV requested for VIP space
Media representative requests move of shared kiosk
Out of Scope Services
Contractor calls technology service desk and asks for an extra mouse
Media representative calls technology service desk and asks for support with their own laptop
Wrong Data
Incorrect information on official website / mobile app (results, news, etc)
…and many more.
Technology Rehearsals
The primary focus of the Technology Rehearsals is to test people and processes in place to handle operational situations at the venues, TOC and other technology facilities.
A Technology Rehearsal simulates the real operations of the event with technology team members taking their assigned event time roles and dealing with both normal and abnormal situations that may occur during the event.
Objectives
Give the team an opportunity to learn their event time roles and their team interactions.
Test people's knowledge of the processes, procedures and tools that will be used to deliver technology support for the event.
Test communication channels between the TOC and the MOC, including crisis management processes and escalations.
Identify areas of improvement to ensure that the teams are able to cope with any situation that might arise during the event.
Timing
Rehearsals are most useful when the majority of services, people, processes and venues are available. In reality this can be difficult as many elements may only be fully available just prior to the start of the event. Holding a technology rehearsal without a high level of availability of all of the above components can be a waste of effort.
However, as a rule of thumb for a major event with a 4-5 year technology planning cycle, two Technology Rehearsals would be organised, around 4 months and 2 months prior to the event. In some cases it can make sense to hold only a single Technology Rehearsal (TR), with the other covered by tabletop exercises. The Technology Rehearsal presents the single largest learning opportunity to ensure event time operational readiness.
Preparation
Preparing a Technology Rehearsal will typically include the following activities:
Confirming the scope of services to be included.
Planning the participation of service providers, organising committee, and other stakeholders, together with their roles and responsibilities.
Assembling and training an independent team who will trigger scenarios at agreed times across all locations and observe the response of the impacted technology teams.
Preparing Technology Rehearsals scenarios and reviewing with stakeholders.
Identifying required policies and procedures and ensuring their readiness.
Defining a detailed timeline for Technology Rehearsal execution (i.e. the TR execution plan).
Confirming the facilities and venues to be included.
Preparing the tools that support Technology Rehearsals, including uploading all scenarios into the tool.
Preparing and facilitating briefing sessions.
Providing training to teams participating.
Defining Technology Rehearsal organisation and staffing roster.
Coordinating with all stakeholders to ensure that all pre-requisites are in place including service readiness, environment readiness and data readiness.
Execution
A technology rehearsal would typically take place over 3 days, with a day before for briefing and final preparation, and the day after for debriefing.
An independent team will execute scenarios in line with the TR execution plan, ensuring that all Technology Rehearsal scenarios are updated in the Operational Readiness Tool and opening issues where required.
Based on input from the team executing scenarios and data from the ITSM tool, the TR organising team will measure Technology Rehearsal performance and prepare a debrief to review performance metrics, open issues, lessons learned and improvements required.
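The kind of summary that could feed the debrief can be sketched as follows. This is an assumption about how scenario outcomes might be recorded, not a description of the Operational Readiness Tool or any ITSM tool: the `ScenarioResult` fields and the metric names are hypothetical, and the sample outcomes are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioResult:
    """Outcome of one rehearsal scenario (hypothetical record layout)."""
    scenario: str
    executed: bool              # was the scenario triggered as planned?
    handled_as_expected: bool   # did the team follow the agreed procedure?
    open_issue: bool            # was an issue opened for follow-up?


def debrief_summary(results):
    """Aggregate scenario outcomes into simple debrief metrics."""
    executed = [r for r in results if r.executed]
    passed = [r for r in executed if r.handled_as_expected]
    return {
        "planned": len(results),
        "executed": len(executed),
        "handled_as_expected": len(passed),
        "pass_rate": len(passed) / len(executed) if executed else 0.0,
        "open_issues": [r.scenario for r in results if r.open_issue],
    }


# Invented outcomes for four scenarios.
results = [
    ScenarioResult("Ticketing system not operational", True, True, False),
    ScenarioResult("Loss of redundant network connection to a venue", True, False, True),
    ScenarioResult("VIP user is unable to print", True, True, False),
    ScenarioResult("Emergency OS security patch", False, False, True),
]

summary = debrief_summary(results)
print(summary)
```

A real debrief would normally add qualitative observations and lessons learned alongside counts like these.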
Mobilisation Planning
Part of operational readiness is to ensure that the technology workforce have the other practical requirements that are needed to support effective delivery of services.
These activities will ensure that the technology team are aware of when they are due to be on shift, know what they have to wear, know how to get to and from the venue, are able to access the areas required to perform their roles and have access to catering.
The scope of mobilisation activities includes:
Resource planning & shift roster
Accreditation & access rights
Accommodation (where needed)
Transportation
Catering
Uniforms
Parking
These are generically termed ‘Mobilisation Activities’, and success involves establishing effective working relationships with other areas of the organisation to ensure that the technology team requirements are understood and can be provided for the Event.
By far the most time-consuming part of mobilisation planning is resource planning and shift rostering for hundreds, if not thousands, of technology staff across the organiser and service providers. It follows this sequence:
Confirming the event time organisation and roles for TOC, Venues, roaming team, remote support groups and others.
Confirming the operating hours for each role (e.g. 9x5, 16x7, 24x7, etc.) based on the support required for end user groups and daily event start and finish times.
Confirming the shift patterns to be used (e.g. 2-shift, 3-shift, etc.) in alignment with HR policies or local law.
Confirming the resource demand based on the shift patterns and event duration.
Gathering the assigned names for each role to meet the resource demand.
Preparing, agreeing and publishing the roster.
Managing roster changes on a daily basis.
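The demand calculation in the sequence above can be sketched as simple arithmetic. The sketch below assumes that each person works at most one shift per day and that HR policy caps the number of shifts one person may work over the event; the role names, position counts, the 16-day duration and the 12-shift cap are all hypothetical values chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleDemand:
    """Inputs for one event time role (all values are illustrative)."""
    role: str
    positions: int        # seats to fill at any one time
    hours_per_day: int    # operating hours, e.g. 24 for 24x7
    shifts_per_day: int   # shift pattern, e.g. 3 for a 3-shift pattern


def shift_length(d):
    """Hours per shift implied by the operating hours and shift pattern."""
    return d.hours_per_day / d.shifts_per_day


def shift_slots(d, event_days):
    """Total shifts that must be staffed for the role over the event."""
    return d.positions * d.shifts_per_day * event_days


def headcount(d, event_days, max_shifts_per_person):
    """Minimum number of named people needed to cover every shift slot."""
    return math.ceil(shift_slots(d, event_days) / max_shifts_per_person)


EVENT_DAYS = 16               # assumed event duration
MAX_SHIFTS_PER_PERSON = 12    # assumed HR limit over the event

duty_manager = RoleDemand("TOC Duty Manager", positions=1, hours_per_day=24, shifts_per_day=3)
technician = RoleDemand("Venue IT Technician", positions=4, hours_per_day=16, shifts_per_day=2)

for d in (duty_manager, technician):
    print(d.role, shift_length(d), shift_slots(d, EVENT_DAYS),
          headcount(d, EVENT_DAYS, MAX_SHIFTS_PER_PERSON))
```

The headcount here is a lower bound only. A publishable roster also has to respect rest periods, individual availability and handover overlaps, which is why gathering names and managing daily changes remain separate steps.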
An effective way of establishing collaboration for mobilisation activities across staff, service providers and the respective functional areas providing services is by establishing a Technology Readiness Working Group.
Access rights need to be agreed with the Accreditation team to ensure that technology staff and contractors can enter the areas where technology incidents occur; otherwise, the resolution SLAs cannot be met.
Operations Centres
As outlined in the support model, the Technology Operations Centre (TOC) is the facility where all key technology areas collaborate during the operational periods of the major event (including Test Events and Technology Rehearsals).
The TOC is an escalation, support and coordination point for technology-related issues that cannot be solved by venue technology teams and for event-wide technology infrastructure and services.
Prior to a TOC being stood up, the support team can operate in an early-stage virtual TOC (vTOC) model, where rotating duty managers and a TOC Director are assigned on an on-call basis.
There will be initial iterations of the TOC that will be made operational to support Test Events, Technology Rehearsals and Event ramp-up, prior to the full TOC operations. This enables the continuous improvement of processes and tools well ahead of the Event.
Remote Operations Centres - These provide a remote 3rd level support capability dedicated to a specific service provider or domain. While physically separated from the main Technology Operations Centre, they will be fully integrated into the operations support model through the use of common processes, policies, procedures, tools, service levels and organisation. Examples may include a Cyber Security Operations Centre (CSOC) or Network Operations Centre (NOC).
Alternate TOC - The Alternate TOC (ATOC) will be used in case of a disaster at the primary TOC location. The ATOC will have a reduced number of positions based on continuation of essential roles.
The TOC Director is empowered to invoke the use of the ATOC instead of the TOC, in coordination with the MOC, as described in the service continuity strategy.
In addition to the TOC layout, there will be at least one meeting area nearby for collaboration / decision making in major incident or crisis situations.