Key points
- News roundup as worldwide IT outage hits airports, business, NHS and TV
- CrowdStrike update triggered Microsoft system errors - prompting apology
- More than 5,000 flights cancelled worldwide
- Expert explains how security 'arms race' led to crash
- Could take 'weeks' for systems to recover, expert warns
- Most GPs in England disrupted | 'Critical incident' at hospital trust
- In-depth analysis: The firm behind the world's worst IT outage | The costly cautionary tale of how CrowdStrike came to dominate
- Data & Forensics: Search data reveals scale of global IT disruption
- Watch tonight: A Sky News special on the crisis at 8pm
Goodnight - here's your evening summary
That's it for our live coverage of the global IT outage today.
Services from airlines to healthcare, shipping and finance have been coming back online after computer systems were disrupted for hours.
Even with the glitch fixed, companies were dealing with backlogs of delayed and cancelled flights and medical appointments, missed orders and other issues that could take days to resolve.
Businesses also face questions about how to avoid future blackouts triggered by technology meant to safeguard their systems.
An earlier software update by global cybersecurity firm CrowdStrike, one of the largest operators in the industry, had triggered systems problems that grounded flights, forced broadcasters off air and left customers without access to services such as healthcare or banking.
It was not a security incident or cyber attack, according to the firm and the UK's National Cyber Security Centre.
The outage shone a spotlight on CrowdStrike, an $83bn company that is not a household name but has more than 20,000 subscribers around the world including Amazon and Microsoft.
The UK government responded with its COBRA emergency team.
Here's a look at some of the services affected...
Travel
- As of 8pm, more than 5,000 flights were cancelled across the globe - out of 110,000 scheduled;
- There were long queues at multiple UK airports, but Heathrow and Edinburgh say operations are now returning to normal;
- Ryanair told customers whose flights have been cancelled to leave airports;
- Some airlines issued handwritten tickets, while some airports - like Belfast International - relied on whiteboards to update passengers;
- Many US carriers grounded their planes, while airports worldwide were impacted in Spain, Singapore, Hong Kong, Australia, Germany and elsewhere;
- Train services were also affected - including operators such as Avanti West Coast, Great Western Railway, Southern and Thameslink.
Hospitals and emergency services
- Some people were experiencing difficulties booking appointments at GP surgeries, with practices across England affected, according to the NHS;
- Pharmacies warned disruption could continue over the weekend;
- The Royal Surrey NHS Foundation Trust declared a critical incident as IT issues affected its services;
- A few hospitals warned of delays and disruption, but others said services were running normally;
- The London Ambulance Service said it experienced "huge increases" in the number of calls to its 999 and 111 services following the outage;
- NHS Blood and Transplant urged people to keep their blood donor appointments, as there remains an "urgent need for O negative blood".
Businesses
- Major UK supermarkets including Tesco, Sainsbury's, Asda, Morrisons and Waitrose reported issues with online services;
- One Waitrose in Hampshire was accepting cash only, in an example of what was thought to be a wider issue;
- Many businesses were left with issues with their payroll software, which could potentially pose problems for companies that pay weekly.
CrowdStrike boss warns 'bad actors' will try to exploit crash
The CEO of CrowdStrike has warned "adversaries and bad actors" will try to exploit the crash.
George Kurtz encouraged "everyone to remain vigilant" and ensure they are engaging with official CrowdStrike representatives.
He promised "full transparency" on how the crash occurred.
The company was working on a technical update and root cause analysis that will be shared with everyone.
The outage was caused by a defect found in a Falcon content update for Windows hosts, he said, meaning Mac and Linux hosts were not impacted.
"All of CrowdStrike understands the gravity and impact of the situation," he said.
"As we resolve this incident, you have my commitment to provide full transparency on how this occurred and steps we’re taking to prevent anything like this from happening again."
Three minute read: The costly cautionary tale of how CrowdStrike came to dominate, and the perils of complexity
By Ed Conway, economics and data editor
This wasn’t supposed to happen.
We were told that as the internet matured, this kind of thing - a single error causing a domino effect taking out millions of machines - was supposed to become less and less likely. There would be more and more servers and cables distributed in more and more places, making a single point of failure increasingly unlikely.
Instead, what today’s episode - in which an update from a company called CrowdStrike to customers using its services around the world essentially broke the Windows operating system on their computers - has underlined is that often the more complex a system becomes, the more vulnerable it is to collapse.
The great irony, of course, is that CrowdStrike’s raison d’être is to prevent moments like this from happening. The company’s Falcon Sensor is a product used to prevent cyber attacks - a complex program best thought of as a kind of super anti-virus package, which, in order to do its job, gets privileged access to more parts of your machine than regular software.
But it so happens the latest update to Falcon Sensor, uploaded overnight to computers around the world, had a dodgy bit of code in it, which caused Windows machines to crash.
Right now, it looks as if the only way it can be resolved is by technicians rebooting each machine and manually deleting a particular file (C-00000291*.sys - since you asked).
In other words, spare a thought for your company’s technicians, because they’re about to have a long weekend.
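For the technically curious, the manual fix described above boils down to finding files whose names match the quoted pattern and deleting them. Here is a minimal sketch of that step in Python. The folder holding the file is not given here, so it is passed in as a parameter, and the function names are hypothetical; this is an illustration, not CrowdStrike's official remediation procedure.

```python
from pathlib import Path

# Filename pattern quoted above. The folder that holds the file is NOT
# given in the article, so the caller must supply it (assumption).
FAULTY_PATTERN = "C-00000291*.sys"


def find_faulty_files(directory: Path) -> list[Path]:
    """Return the files in `directory` whose names match the quoted pattern."""
    return sorted(p for p in directory.glob(FAULTY_PATTERN) if p.is_file())


def remove_faulty_files(directory: Path, dry_run: bool = True) -> list[Path]:
    """Report matching files, deleting them only when dry_run is False."""
    matches = find_faulty_files(directory)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

The dry run is on by default so a technician can see what would be deleted before anything is removed.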
The more complex we get, the more vulnerable we are
But perhaps the most striking lesson from the episode is a more ancient one, laid out by historian Joseph Tainter in his 1988 book The Collapse Of Complex Societies.
The more complex societies and systems become, the more vulnerable they are to collapse. Tainter was referring to examples like the fall of Rome or the collapse of ancient Mesopotamian civilisation, but one could just as easily apply the logic to modern examples.
Lurking beneath Tainter’s thesis was the point that often, in a complex society or organisation, actors might make decisions which seem sensible but, due to the complexity of the system and their inability to understand it, could actually make it more vulnerable.
Consider the subprime crisis which triggered the financial crisis of 2008. Mortgages were packaged and repackaged into assets sold, eventually, to banks - which had little understanding of their actual value and their risks. The more complex the system became, the less able people were to comprehend how exposed they were to a catastrophic failure, and the more vulnerable the entire edifice was to collapse.
How all roads led to CrowdStrike
Now let’s ponder the current IT malaise. Let’s ask ourselves: how did it come to be that so many companies around the world had the very same bit of software installed on their systems, making them vulnerable to the very same lines of duff code?
After all, the vast majority of people working at the companies affected will never have heard of CrowdStrike. Like the bankers presiding over the financial crisis, they had no idea of the potential vulnerabilities lying within their systems.
But in recent years, as businesses have become more and more concerned about the risk of cyber attacks, they have begun to implement cyber security checks and regulations.
These often took the form of a checklist some poor operative had to fill out: how many computers have you got? What operating system? Are they all online?
What forms of cyberprotection do they have? And so on.
Now, this might sound like frustrating red tape to many of you, but the reality is that these days some companies stipulate that anyone doing business with them must have fulfilled all the items on the checklist.
So all of a sudden, salespeople trying to do a deal would discover that they couldn’t do it without complying with the checklist. The company’s financial survival depended on being able to tick the boxes!
And invariably one of the boxes in those checklists was: do you have an endpoint detection and response (EDR) solution? And if you didn’t have an EDR solution (or, more likely, didn’t know what one was) then invariably you googled EDR and looked for the world’s biggest provider, which just so happened to be… CrowdStrike.
Perhaps you spoke to your IT provider and insisted that you needed an EDR. Perhaps they said: "Oh I wouldn’t do that if I were you" - but then… no EDR, no sale. This is a stylised example, of course, but you see how this kind of thing can happen.
And hence, gradually and imperceptibly, a large proportion of the world’s companies came - mostly unbeknownst to their leaders - to be running the very same piece of software with direct access to the most privileged parts of their computers.
And then all it took was a few lines of code and all of those machines were instantly dead - or rather, they faced the 'Blue Screen of Death'.
A costly cautionary tale
So there’s a reminder here about the risks of complexity. It's way too early to put a figure on how much disruption this episode has caused and how much economic damage it has wrought. The short answer is almost certainly: a lot.
Millions of people around the world have been unable to travel, to communicate, to transact. It may well transpire that it has put lives at risk, given it has affected many doctors’ ability to do their job.
Perhaps the best thing that can be taken from today’s chaos is that it might just serve as a cautionary tale which could make our computers that bit safer and more stable in the future.
It might remind bosses that cyber security decisions are more than box-ticking exercises - and sometimes installing cyber security software can backfire.
It reminds us how dangerous it is if everyone in the world is relying on the same provider.
It reminds us about the need for redundancy - to have backup systems.
It reminds us of the dangers of complexity.
This probably won’t come as much consolation if you’re one of those people whose holiday plans have been disrupted or whose business has been messed around by the IT outage today.
But it’s something.
Pharmacies warn disruption likely over weekend
The National Pharmacy Association has warned disruption is likely to continue through the weekend as outlets deal with a backlog of medicine deliveries.
Pharmacies reported issues with accessing prescriptions from GP surgeries and said this would affect the delivery of medicines to patients.
Patients with "minor ailments" were also being sent to pharmacies from GP surgeries earlier today, according to the Independent Pharmacies Association.
No "serious patient safety issues" have been identified during the outage, Health Secretary Wes Streeting said, urging people to "bear with" GPs after disruption to appointment bookings and other services.
The global IT outage has impacted the EMIS Web system, NHS England says, which is understood to be used by about 60% of practices in England.
The programme enables GP practices to book appointments, examine records and help with admin.
Around 3,700 GP practices may be affected, the Press Association reported.
Professor Kamila Hawthorne, chairwoman of the Royal College of GPs, said: "Our members are telling us that today's outage is causing considerable disruption to GP practice bookings and IT systems – practices using EMIS IT systems appear to be particularly affected.
"Outages like this affect our access to important clinical information about our patients, as well as our ability to book tests, make referrals, and inform the most appropriate treatment plan."
There were issues with administrative systems in some hospitals while some ambulance services reported a surge in demand.
How the IT outage caused chaos around the world - from Brazil to Poland
The disruption caused by the outage has been truly global - here are just a few examples.
Brazil
Bradesco, one of the main banks in Brazil, notified its users via its app that digital services were unstable due to a global cyber outage, but its ATMs were working normally.
Azul Airlines, a Brazilian low-cost airline, said its check-in systems were affected, causing occasional flight delays.
Japan
Universal Studios Japan in Osaka, western Japan, said the global system outage will affect ticket sales at the park over the weekend. The park said its ticket booth sales will not be available on Saturday and Sunday.
Canada
The outage grounded some flights, disrupted hospitals and backed up border crossings in Canada.
Porter Airlines said it was cancelling its flights for several hours because of the outage.
Air Canada, Canada's largest airline, said there is no major impact to its operations, adding that it's monitoring the situation closely.
University Health Network, one of Canada's largest hospital networks, said that some of its systems had been impacted by the outage.
Windsor Police reported long delays at both the Canada-US border crossings at the Ambassador Bridge and the Detroit-Windsor tunnel.
Sri Lanka
The National Centre for Cyber Security in Sri Lanka says four information technology companies there have been affected.
Switzerland
Landings at Switzerland's Zurich Airport have returned to normal after being suspended earlier in the day.
Germany
A German regional grocery chain, Tegut, temporarily shut its 340 stores in the country this morning due to the impact on cash register systems. By early afternoon, more than half of the stores were open again.
South Africa
In South Africa, at least two major banks said they experienced service disruptions as customers complained they weren't able to make payments using their bank cards at grocery stores and gas stations or use ATMs. Both said they were able to restore services hours later.
Poland
Baltic Hub, a major container hub in the Baltic port of Gdansk, Poland, says it's battling problems resulting from the global system outage. Entry gates are temporarily closed and business has been suspended, the Baltic Hub said in a statement.
More than 5,000 flights cancelled worldwide
We have some updated figures now on the number of cancelled flights.
As of 8pm, 167 flights scheduled to depart UK airports have been cancelled, aviation analytics company Cirium said.
This equates to 5.4% of scheduled departures, the firm said.
Some 171 flights due to land in the UK were cancelled.
Globally 5,078 flights, or 4.6% of those scheduled, have been cancelled.
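As a rough cross-check on these figures, the total number of scheduled flights can be backed out from each cancelled count and its percentage share. The short Python sketch below does that arithmetic. The function name is hypothetical, and because Cirium's percentages are rounded, the results are approximate: roughly 110,000 flights scheduled worldwide, in line with the figure quoted earlier, and about 3,100 scheduled UK departures.

```python
def implied_scheduled(cancelled: int, cancelled_share_percent: float) -> float:
    """Back out total scheduled flights from a cancelled count and its share."""
    return cancelled / (cancelled_share_percent / 100)


# Figures as reported by Cirium at 8pm.
global_scheduled = implied_scheduled(5078, 4.6)
uk_departures = implied_scheduled(167, 5.4)

print(f"Implied flights scheduled worldwide: about {global_scheduled:,.0f}")
print(f"Implied scheduled UK departures: about {uk_departures:,.0f}")
```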
Someone approved wrong file - and they're having an 'extraordinarily bad day', says expert
The crash occurred because the wrong file was distributed to computers, says an IT security expert.
Human error will have played a part because the faulty file must have been approved at some stage in the process, says Tim Rawlins, director of the NCC Group, an organisation which secures business data.
CrowdStrike will be "tearing their hair out" trying to find the cause of the crash. Only they will know why the wrong file was uploaded, but it will come out.
"I imagine somebody there is having an extraordinarily bad day," he said.
"It is really unfortunate, but imagine you are the person who is responsible for going: 'Right, here is the file, we have made all the changes, we've done all the testing, push it to the machine to do the distribution [and] the distributor has either grabbed the wrong file or the wrong file has been given to it.'"
Asked if human error was involved, he said: "It is probably a fully automated system but at some stage there will have been a person in the loop. Somebody would have gone 'yes I approve this one to go'.
"Who knows where that mistake is. It will come out. I'm sure CrowdStrike will be tearing their hair out trying to find that issue."
How did the mistake happen?
Mr Rawlins explained that the software involved, known as endpoint detection and response (a more comprehensive package than simple anti-virus), which protects computer systems from threats, went awry.
Update packages are constantly created and then uploaded and pushed out to "endpoints" - the computers - but it appears the distribution system took and pushed out "the wrong file".
"That's the file that is full of zeros, as people describe it - there is nothing in there for the system to operate," said Mr Rawlins.
"And that has caused the system to glitch, which has led to this blue screen of death that everyone is talking about."
Constant security updates have to be released "because the bad guys and girls are constantly changing their attacks".
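Mr Rawlins's description of a file "full of zeros" points to the kind of basic sanity check that could sit in front of a distribution system. The Python sketch below shows one such check, refusing any update file that is empty or made up entirely of zero bytes. It is purely illustrative: the function names are hypothetical and nothing here reflects how CrowdStrike's own release process actually works.

```python
def is_all_zero_bytes(payload: bytes) -> bool:
    """True if the payload is non-empty and every byte is 0x00."""
    return len(payload) > 0 and payload.count(0) == len(payload)


def safe_to_distribute(payload: bytes) -> bool:
    """Hypothetical pre-release gate: reject empty or all-zero files."""
    return len(payload) > 0 and not is_all_zero_bytes(payload)
```

A check like this would only catch the crudest kind of bad file; a file with plausible but wrong content would pass straight through.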
One computer will report a dodgy new file to CrowdStrike, which will then need to tell all its other computers how to stop it.
"It's this classic arms race," said Mr Rawlins.
Analysis: IT outage an early wake-up call for PM
The IT crash has been an "early wake-up call" for the prime minister, perhaps damaging his honeymoon period, says chief political correspondent Jon Craig.
The outage overshadowed Volodymyr Zelenskyy addressing the cabinet - the first foreign leader to do so in person since Bill Clinton, he says on our special programme, Crash: The Global IT Outage.
The government will have to take a "very critical look" at precautions for this type of issue.
Craig adds there is every chance there will be a COBRA meeting with ministers over the weekend.
"The prime minister will want to be better prepared for a crisis of this sort or a different sort in the future," he says.
"I'm sure there will be a big inquest after today and the PM will bang a few heads together."
Analysis: This crash will happen again - it's time to rethink
This will happen again, science and technology editor Tom Clarke says on our special programme, Crash: The Global IT Outage.
Continual updates and fixes are required to keep systems safe from cyberattacks, he explains.
If they are to work, those updates must be pushed out globally and immediately - or else leave networks vulnerable.
"Really the answer is: People have to rethink the way they are going to manage when they don't have access to their IT."
Much of everyday business needs to be connected, and backup systems are a question of cost, says Clarke.
Many shops and restaurants are cashless, but it may be time to think about "how do we revert to some other form of payment if we lose that".
Sky News special programme tonight
Tune in soon for a special programme on the global IT outage, including the latest developments and analysis.
Crash: The Global IT Outage will air on Sky News at 8pm.
It's free to watch on TV and you can catch it on YouTube as well.
We'll be covering the key moments right here.