Australian Government
Department of Home Affairs

Major Incident Review
Compromised - Cargo, Traveller and other Border Systems unavailable - Monday 29th April, 2019

Priority: P1
Outage Period: 6 hours 47 minutes

Description:
From 0558hrs on Monday 29th April 2019 users reported being unable to access a range of systems, including [redacted]. Users were prompted with 'Cannot reach this page'. A major incident was declared and a number of technical teams were engaged to investigate; this was managed via a Service Restoration Team.

The cause of the incident was identified as a hardware failure, specifically a Line Card on Network Distribution Switch 1 at the [redacted] Data Centre. To restore services, the faulty card was removed and minor patch configurations were performed. This action restored services and confirmation was received from various business areas including airports. A restart of the JVMs for s. 47E(d) was required. This was likely due to current known errors for which regular restarts are needed.

Services were restored from a technical perspective at 1115hrs. Final confirmation of restoration was received from business areas at 1244hrs; some local reboots of Smartgates were required to trigger connections to the network. A 24 hour period of monitoring was undertaken to ensure stability of services and no issues were raised during this time. The incident was resolved and closed at 1500hrs on Tuesday 30 April 2019.

There was also an impact on an active major incident for s. 47E(d), reference [redacted]. Actions for this incident needed to be suspended as this outage limited the ongoing investigations.

Business Impact:
An unscheduled outage of multiple IT systems occurred; this included s. 47E(d). This outage resulted in a risk to national security, trade enforcement, the migration system, and border protection. This outage affected processing times at international airports, and was subject to open source media reporting.
Processing of international cargo was halted because ABF resources were redirected to process passengers manually for the duration of the outage.

Ongoing activities:
- During this incident, it was identified that a configuration issue prevented failover to a second switch. A proactive major incident was raised on Wednesday 1 May because of confirmation of a lack of failover/redundancy. An emergency change was successfully completed at 0030hrs on Thursday 2 May to configure the second switch to enable failover.
- A Major Incident Review was completed on Friday 3 May and this has identified a number of points for clarification and immediate next steps.
- IBM have prepared a risk assessment and recommendation to action the replacement of faulty hardware and complete a test of the failover. Failover testing of the redundancy put in place under the above-mentioned P2 is being managed under a P1 Problem record PM4001051. This will be planned and agreed with ABF.

Identification:
- 0558hrs First call regarding outage of National Intelligence System received at the IT Service Desk.
- 0600hrs Service Desk advised Unisys MIM and called through to IBM Service Desk for report of P3.
- 0638hrs s. 22(1)(a)(ii) called [redacted] to advise that Arrivals/Departures/s. 47E(d) were all unavailable. [redacted] then contacted Unisys MIM and advised that a Major Incident should be raised based on the business impact.
- 0706hrs Pageout for P2 was then sent at 07:10hrs (07:06hrs raised) by Unisys MIM.
- 0730hrs Major Incident escalated as a P1.

Prioritisation:
- Major Incident was initially raised as a P2 at 07:09hrs and upgraded to a P1 at 07:30hrs. Noting the significant impact to business, a Service Restoration Team was stood up. In total, 5 checkpoint teleconference dial-ins were conducted throughout the day while the Major Incident was in flight.

Investigation and diagnosis
- 0745hrs WIO Technician identified a number of servers that were down. At this point, the issues were isolated to IBM.
- 0832hrs IBM carried out health checks and stage investigation.
- 0841hrs Verified that a Line Card was down on the Network Distribution Switch 1 at the [redacted] and that the assumed failover to Switch 2 did not occur. Although traffic was moving across to Switch 2, it was not moving out to the WAN.

Resolution activities
IBM performed the following activities in an attempt to fail over to Switch 2:
- 0855hrs Shut down the VLANs to Switch 1 so failover to Switch 2 could occur. The VLANs were seen on Switch 2; however, there was no traffic flowing through Switch 2 to the Wide Area Network (WAN).
- 1009hrs Agreed to turn off Switch 1 to force a physical failover to Switch 2, in case some corruption in Switch 1 was preventing the failover from occurring. Once Switch 1 was turned off, Switch 2 could see the VLANs as before; however, there was still no WAN network traffic.
- 1105hrs Given the continuing issues with failing over to Switch 2, Switch 1 was restarted and services were moved from the failed Line Card to a working Line Card on Switch 1, which involved physically moving cables between the two cards. When the relocation of the cables was complete, traffic started to flow on Switch 1. This resolved the issue and allowed traffic to flow over the WAN.
- 1115hrs Majority of airports reported restoration of services, with the remainder resolved by following SOPs and performing system restarts.

Resolution confirmation
- All impacted systems and services were coming back online around 1115hrs and associated checks and balances were being undertaken by business to confirm services were restored.
- Traveller Cargo needed to perform a restart of the JVM for s. 47E(d). The restart of s. 47E(d) fixed some issues; however, full functionality was dependent on an IBM fix that was to be deployed into PROD later that evening.
- IT and business stakeholders advised all affected systems were back up and running by 1245hrs and that they would continue to monitor. All systems appeared to be stable by 1457hrs.

Incident monitoring period
- Major Incident was placed into monitoring for a further 24 hours to confirm stability. At 1500hrs the next day, Major Incident was placed into a resolved state and closed.

Immediate post incident action
- Requested Post Incident Reports from IBM and Unisys.
- Daily Major Incident Summary: 01 May 2019 provided to Senior Executive.

Action items from this review
- IBM to draft a risk assessment as part of their implementation plan in relation to the above-mentioned P3 Incidents;
- Unisys MIM to confirm other reports of Incidents called through to the IT Service Desk between 0630-0710hrs;
- Unisys MIM to update consolidated Unisys PIR, i.e. what times the Teleconferences were held, and when updates were provided to the broader group;
- IBM to provide update on when they contacted the Service Desk about Citrix issue;
- IBM, Cargo & Trade, ABF Operations Systems Management, s. 22(1)(a)(ii) to review Cargo Processing between 1114-1230hrs;
- ICT SM to have a discussion with ABF Operations Systems Management to understand the greater impact of the issues;
- Change Management to review the time-frame of lodged change post-incident and the change process;
- IBM to advise what existing monitoring system triggered an alert for the line card failure and what methods are in place for IBM to receive alerts when access is unavailable via Citrix;
- IBM to perform failover testing in PROD, once the routing change is deployed there;
- Problem Ticket to be raised to investigate failover of Smartgate Arrivals and Departures in similar scenarios; and
- IBM Problem Management and Network Assurance to assess changes put through for the lower environments, with an idea to move into PROD - change records required to validate time frames.
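As a cross-check on the figures in this review, the stated outage period of 6 hours 47 minutes matches the gap between the first call at 0558hrs and the stakeholder confirmation at 1245hrs. The short Python sketch below reproduces that arithmetic. It is illustrative only and not part of the original review; the helper name `hhmm_gap` is hypothetical, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime, timedelta


def hhmm_gap(start: str, end: str) -> timedelta:
    """Elapsed time between two same-day 24-hour 'HHMM' timestamps."""
    fmt = "%H%M"
    return datetime.strptime(end, fmt) - datetime.strptime(start, fmt)


# Timestamps taken from the review text.
first_report = "0558"       # first call received at the IT Service Desk
technical_restore = "1115"  # services restored from a technical perspective
business_confirm = "1245"   # stakeholders advised systems were back up

outage = hhmm_gap(first_report, business_confirm)
technical = hhmm_gap(first_report, technical_restore)

print(f"Outage period (first call to business confirmation): {outage}")
print(f"Time to technical restoration: {technical}")
```

The same helper can be applied to other pairs of timestamps in the review, for example the interval between the P2 being raised and its escalation to P1.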
Released by Department of Home Affairs under the Freedom of Information Act 1982

07:12 – BROKER restarted and now back up processing. At this point other s. 47E(d) batch jobs which were stalled commenced running again.
07:14 – IBM contacted MIM and advised that the s. 47E(d) connections were showing down in Sydney but all other Airports seemed to be up and running and shouldn't be having issues. IBM technician advised this would need to be investigated by the application team (Border Mainframe).
07:25 – MIM engaged BOC to validate widespread impact given IBM's update. BOC advised all Airports were still having issues.
07:29 – Border Mainframe technician advised they had successfully restarted BROKER. After restarting BROKER, s. 47E(d) was available and s. 47E(d) was working as intended.
Border Mainframe advised there was a backlog of 30,400 Expected Movements to be processed.
08:13 – Border Mainframe technician advised backlog had dropped to 27,500 Expected Movements.
09:00 – s. 47E(d) backlog of Expected Movements was cleared but airport departure gates were still affected because the records also needed to be loaded into s. 47E(d).
09:07 – Border Mainframe technician advised the backlog of Expected Movements had been loaded into s. 47E(d).
09:28 – MIM engaged Sydney Control Room to test Smartgates given the data was believed to have cleared. Sydney Airport staff advised the issue was still ongoing.
09:30 – All parties had attended DORM; it was revealed that change C4534173 impacted BROKER communications.
09:58 – MIM engaged Melbourne Airport Control Room to test Smartgates given the data was believed to have cleared. Sydney Airport staff advised the issue was still ongoing.
09:59 – MIM re-engaged Border Mainframe technician to advise the error message was still being received.
10:09 – Border Mainframe technician advised that the data had been loaded into s. 47E(d) but at that time there were still 30,000 jobs that needed to be transferred to s. 47E(d) to resolve the issue.
10:59 – Backlog at approximately 23,000
11:39 – Backlog at approximately 15,000
12:24 – Backlog at approximately 8,000
12:46 – Backlog at approximately 5,500
13:04 – Border Mainframe confirmed that backlog had cleared.
13:05 – MIM engaged ABF – Advised will contact ABOC and instruct to contact Airports to test.
13:12 – ABF confirmed all Airports operating as intended.
13:17 – MIM resolved P1 incident IM4594683.
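The backlog readings reported between 10:09 and 13:04 in the timeline above can be turned into approximate drain rates. The Python sketch below is illustrative only and not part of the original record. It assumes the "approximately" figures are exact, and it treats the backlog as zero at 13:04, the time it was confirmed cleared, so the final interval is a rough estimate. The names `readings`, `minutes_between` and `drain_rates` are hypothetical.

```python
from datetime import datetime

# (time, jobs remaining) as reported in the timeline; the 13:04 value of 0
# is an assumption based on "backlog has cleared" being confirmed then.
readings = [
    ("10:09", 30000),
    ("10:59", 23000),
    ("11:39", 15000),
    ("12:24", 8000),
    ("12:46", 5500),
    ("13:04", 0),
]


def minutes_between(t0: str, t1: str) -> float:
    """Minutes elapsed between two same-day 'HH:MM' timestamps."""
    fmt = "%H:%M"
    delta = datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)
    return delta.total_seconds() / 60


def drain_rates(samples):
    """Jobs cleared per minute for each pair of consecutive readings."""
    return [
        (t0, t1, (n0 - n1) / minutes_between(t0, t1))
        for (t0, n0), (t1, n1) in zip(samples, samples[1:])
    ]


rates = drain_rates(readings)
for start, end, per_min in rates:
    print(f"{start}-{end}: about {per_min:.0f} jobs/min")

# Average rate across the whole reported window.
overall_per_min = (readings[0][1] - readings[-1][1]) / minutes_between(
    readings[0][0], readings[-1][0]
)
print(f"Overall: about {overall_per_min:.0f} jobs/min")
```

Interval rates of this kind show whether the backlog was draining steadily or in bursts, which can help when estimating a restoration time during an incident.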
**Please update the classification of the document if information provided in the background is above FOUO**

MEDIA ENQUIRY

Subject: Sydney Airport Delays
Deadline: ASAP
s. 47F(1)
Enquiry Received (Time & Date): 8:27am 15 July 2019
Media Officer: s. 22(1)(a)(ii)
Media Ph: 02 6264 2244

QUESTION / ISSUE
I understand there's significant delays at Sydney airport (and possibly nationwide) with security and e-gates systems down. Can I receive a statement and more information on what these delays ASAP?

• A number of Australian Border Force (ABF) and Department of Home Affairs IT systems impacted by an earlier outage have now been restored.
• The Department is continuing work to bring all systems back online, ensure the integrity of the systems and resolve any ongoing issues.
• Additional ABF staff have been deployed to process passengers at international airports and to minimise delays in cargo processing where possible.
• While the addition of staff has seen reduced delays at some airports, passengers are still encouraged to arrive at airports early to allow additional time for processing.
• Cargo processing is continuing, though some delays can be expected as staff work through the backlog.
• We appreciate the patience of passengers and businesses impacted by these outages.

BACKGROUND (not for public release)
The information below is classified and should not be publicly released without the authority of the Australian Border Force.
A short, unclassified brief providing background/context to the incident/issue/event which may not be clear from the rest of the document; the background must detail actions taken by agency/departments/other stakeholders in the information environment, propaganda by adversaries/interest groups and highlight sensitive considerations. The background may point to further correspondence on a higher classification system if required.

FOR OFFICIAL USE ONLY

RESPONSE

CLEARANCE: Title Time/Date drafted Time DD Month 2019 Cleared by Full Name Title Position s. 22(1)(a)(ii) Director, ABF Media CoS to Commissioner Time/Date cleared Time DD Month 2019 Time DD Month 2019 Time DD Month 2019 Tony Smith Drafted by

From: ABF Media
To: s. 47F(1)
Cc: ABF Media
Subject: RESPONSE: Airport systems [SEC=UNCLASSIFIED]
Date: Monday, 29 April 2019 3:04:11 PM

UNCLASSIFIED

Good afternoon s. 47F(1)

Please see below an updated statement. Please attribute to an Australian Border Force spokesperson.

A number of Australian Border Force (ABF) and Department of Home Affairs IT systems impacted by an earlier outage have now been restored.

The Department is continuing work to bring all systems back online, ensure the integrity of the systems and resolve any ongoing issues.

Additional ABF staff have been deployed to process passengers at international airports and to minimise delays in cargo processing where possible.

While the addition of staff has seen reduced delays at some airports, passengers are still encouraged to arrive at airports early to allow additional time for processing.

Cargo processing is continuing, though some delays can be expected as staff work through the backlog.

We appreciate the patience of passengers and businesses impacted by these outages.
Australian Border Force Media & Communications
Media line: 02 6264 2211
E: media@abf.gov.au

UNCLASSIFIED

s. 47F(1)

Thank you,
s. 47F(1)

From: ABF Media
To: s. 47F(1)
Cc: ABF Media
Subject: RESPONSE: UPDATED: Media enquiry: verifying cause and attribution of Monday ABF / Smartgate outage [SEC=UNCLASSIFIED]
Date: Tuesday, 30 April 2019 4:53:48 PM
Attachments: image001.png

UNCLASSIFIED

s. 47F(1)

Good afternoon,

Please attribute the following to a spokesperson from the Australian Border Force (ABF).

All IT systems are now back online and the Department and ABF are continuing to test and monitor systems to prevent further issues.

Passenger processing is occurring as normal. The ABF has also deployed additional resources to ensure cargo is cleared in a timely manner.

The outage of IT systems yesterday was not directly related to Smartgates, but was linked to a back end network issue that impacted a number of systems, including those used to process passengers and cargo.

Regards

Media Operations
Australian Border Force
Media line: 02 6264 2211
E: media@abf.gov.au

s. 47F(1)

s. 47F(1)

From: ABF Media
To: s. 22(1)(a)(ii)
Cc: ABF Media
Subject: RE: FOR INPUT/CLEARANCE: SECURITY DELAYS - AIRPORT [DLM=For-Official-Use-Only]
Date: Monday, 15 July 2019 9:28:37 AM
Attachments: 190715 EN Sydney Airport Delays Various.docx
s. 22(1)(a)(ii)

For-Official-Use-Only

Hi all,

Please see the below for lines we used while the previous incident was ongoing:

· A number of Australian Border Force (ABF) and Department of Home Affairs IT systems impacted by an earlier outage have now been restored.
· The Department is continuing work to bring all systems back online, ensure the integrity of the systems and resolve any ongoing issues.
· Additional ABF staff have been deployed to process passengers at international airports and to minimise delays in cargo processing where possible.
· While the addition of staff has seen reduced delays at some airports, passengers are still encouraged to arrive at airports early to allow additional time for processing.
· Cargo processing is continuing, though some delays can be expected as staff work through the backlog.
· We appreciate the patience of passengers and businesses impacted by these outages.

Grateful if you can advise if you are happy with the above and they are more relevant to this particular issue?

Kind regards,

s. 22(1)(a)(ii)
Public Affairs Officer, Media Operations
Media & Engagement Branch
Executive Coordination
Department of Home Affairs
Media line: 02 6264 2244
P: s. 22(1)(a)(ii)
E: media@homeaffairs.gov.au

For-Official-Use-Only

From: ABF Media
Sent: Monday, 15 July 2019 9:20 AM
To: s. 22(1)(a)(ii) @homeaffairs.gov.au
Cc: Media Operations; s. 22(1)(a)(ii); s. 22(1)(a)(ii) @homeaffairs.gov.au; s. 22(1)(a)(ii) @homeaffairs.gov.au; s. 22(1)(a)(ii); ABF Media; @homeaffairs.gov.au
Subject: FOR INPUT/CLEARANCE: SECURITY DELAYS - AIRPORT [DLM=For-Official-Use-Only]

For-Official-Use-Only

Hi s. 22(1)(a)(ii)

We have received multiple enquiries about alleged delays at Sydney Airport due to smart gate outages.
Grateful if you can provide any available information on this topic and advise if we are able to reuse the same lines as last time we had a delay?

· All IT systems are now back online and the Department and ABF are continuing to test and monitor systems to prevent further issues.
· Passenger processing is occurring as normal. The ABF has also deployed additional resources to ensure cargo is cleared in a timely manner.
· The outage of IT systems yesterday was not directly related to Smartgates, but was linked to a back end network issue that impacted a number of systems, including those used to process passengers and cargo.

Kind regards,

s. 22(1)(a)(ii)
Public Affairs Officer, Media Operations
Media & Engagement Branch
Executive Coordination
Department of Home Affairs
Media line: 02 6264 2244
P: s. 22(1)(a)(ii)
E: media@homeaffairs.gov.au

For-Official-Use-Only

s. 47F(1)

s. 47F(1)