Knowing how to manage incidents is a critical element for every security environment.
Incident analysis begins with the forensic investigation and ends with the report delivered to the Incident Manager.
The task involves digital forensic investigators, malware analysts and network operators.
Only by evaluating the network streams and identifying how the attacker infected the systems, spread through the network or exfiltrated information is it possible to understand what the cybercriminals were up to.
Some organizations (Mandiant, for example) have written and developed sets of indicators that can help derive the basic information about a compromise in order to locate malicious artifacts throughout the organization.
However, in my experience as an Incident Response Team Leader, the process is always more complex: it is strongly tied to the victim, its organization and its security capability, and just as strongly to the type of attack and the organization behind the attack itself.
It is extremely complex today to bring the parties responsible for the incident to justice.
But let’s look at some examples.
The malware wail
It is a Thursday morning ahead of a sunny weekend, and the online ticket office of a big transportation company is already full of users booking their weekend travels.
In just a few moments the website of the ticket office starts to slow down and its memory consumption climbs.
The monitoring stations begin to alert the internal personnel about weird behavior on the front-end of the online ticket platforms. In about twenty minutes, despite attempts to tune the load balancers and improve the responsiveness of the Front-End, the online ticket office is blocked.
The Security personnel, alerted to the situation, try to work out whether the firewall or the Intrusion Prevention Systems have noticed anything strange, but just to understand whether the problem is security-related the operators would have to copy the logs of the last hour and dissect every possible connection… a three-hour job, to say the least.
In the meantime, the Network personnel are looking at the configuration of the load balancers and the systems and have already confirmed that there is nothing out of the ordinary, except for the slowness of the Front-End.
The Company ICT managers, looking to resolve the issue as quickly as possible, ask the Network Operators and the Security Operators to reduce the inspection and security measures on the front-end to a minimum… but the problem persists, despite a slight improvement in the responsiveness of the whole platform.
So the Company asks its Internet Service Provider whether it has changed anything during the last 48 hours, but the provider does not answer quickly, trying to buy more time to evaluate the situation.
Unfortunately, by limiting the inspection and the controls over the incoming traffic, the Company has done exactly what the attackers were hoping for… and so they start their real attack against the Back-End of the ticket office…
Three days later, with the online platforms still showing a series of problems and weird behaviours, some users complain that their credit card codes have been stolen by unknown cybercriminals after being used in the online ticket office.
They have all booked a ticket from the transportation company in the last seven days, but have not paid by credit card on any other online service.
The number of complaints rises to more than 300 in about six hours, too many to be just a coincidence.
The Security Manager asks the Computer Emergency Response Team to start a deep analysis, even though the Company Management continues to think that the problem lies with the Service Provider.
The Incident Response Team Leader divides the engagement into two different tasks:
- Front-End Systems analysis
- Back-End Systems analysis
In about three hours, looking at the Front-End traffic streams, the Network specialist of the CERT identifies a strange, repeated series of connections from a small group of IP addresses originating from Seoul (South Korea).
The streams are an apparently correct sequence of HTTP and HTTPS traffic, but instead of requesting simple access to web pages (through the load balancers), the connections are requesting a lot of data from the web forms of the ticket office.
Meanwhile, the Back-End Team has discovered that the Database of the public infrastructure, despite being correctly segmented behind a two-tier model of communication flows with the Front-End, appears overwhelmed by several processes originating from the Front-End.
These processes are, apparently, normal requests carrying a long series of Database operations in their payload.
In fact, the Back-End Team, by comparing the average CPU and Memory consumption of the previous month with the CPU and Memory consumption of the latest five days, identifies important discrepancies.
Basically, in the latest four days the CPU and Memory resources occupied by processes on the Back-End have grown from 20% to more than 60% on a daily basis, with peaks of 80% on many occasions.
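A comparison of this kind is easy to automate. The following Python fragment is only a minimal sketch: the sample values, the function name `flag_deviations` and the "twice the baseline" threshold are assumptions made for illustration, not figures or tools from the incident.

```python
from statistics import mean

def flag_deviations(baseline, recent, ratio=2.0):
    """Return (day_index, value) pairs from `recent` that exceed
    `ratio` times the mean of the `baseline` samples."""
    base = mean(baseline)
    return [(i, v) for i, v in enumerate(recent) if v > base * ratio]

# Hypothetical daily average CPU usage of the Back-End, in percent.
previous_month = [19, 21, 20, 22, 18, 20, 21, 19, 20, 20]
latest_days = [22, 61, 64, 80, 66]

# Days whose usage is more than double the monthly baseline.
suspicious = flag_deviations(previous_month, latest_days)
print(suspicious)
```

A fixed ratio is the simplest possible rule; a real monitoring station would probably also take the time of day and the normal weekly load pattern into account.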
By comparing the data collected from both teams the CERT is now able to identify what is really going on: a Slow Loris DDoS attack. But this does not explain the stolen Credit Cards…
It takes a further 40 hours of comparative analysis of the Firewall, IPS and internal network streams to identify the second part of the attack.
Exploiting the reduced defences in place after the first wave of the Slow Loris attack, the cybercriminals have used an already compromised internal computer (a laptop) with a custom backdoor (a variant of Cybergate RAT), operating through a chained, unauthorized Company proxy (on TCP port 3128), to access the Back-End. Then, through the internal credentials of the laptop owner, they have dumped the Database and slowly transferred its tables to a public dropzone through HTTP streams.
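An unauthorized proxy of this kind can often be spotted in internal flow records. The Python fragment below is a rough sketch under stated assumptions: the record layout (source IP, destination IP, destination port), the addresses and the function name `unauthorized_proxy_users` are all hypothetical, and only the port number 3128 comes from the case.

```python
from collections import Counter

PROXY_PORT = 3128  # the port used by the unauthorized proxy in this case

def unauthorized_proxy_users(flows, approved_proxies):
    """Count flows per internal source that reach TCP port 3128
    on a destination that is not an approved proxy."""
    hits = Counter()
    for src, dst, dport in flows:
        if dport == PROXY_PORT and dst not in approved_proxies:
            hits[src] += 1
    return hits

# Hypothetical flow records: (source IP, destination IP, destination port).
flows = [
    ("10.0.5.23", "10.0.9.40", 3128),  # laptop talking to an unknown proxy
    ("10.0.5.23", "10.0.9.40", 3128),
    ("10.0.7.11", "10.0.1.5", 3128),   # traffic to the approved proxy
    ("10.0.7.12", "10.0.2.8", 443),    # unrelated HTTPS traffic
]
approved = {"10.0.1.5"}

suspects = unauthorized_proxy_users(flows, approved)
print(dict(suspects))
```

The check depends on keeping an accurate list of approved proxies, which is a procedural matter more than a technical one.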
The Incident Response report, a 150-page book, is an example of street-level tools well exploited by cybercriminals.
The Blackhats behind the attack have not been identified, but at least the economic impact of the fraud has been limited and the users have been refunded…
Another teeny-weeny malware case…
A big oil Company has shared some strategic plans with one of its subcontractors operating in emerging markets, concerning a new set of oil rig licenses it is planning to obtain from the local National Company after two years of intense political effort.
The subcontractor has been informed because it has the essential knowledge to help the Company design and set up the rigs.
The subcontractor has signed a very tight Non-Disclosure Agreement for the case, but it has worked with the Oil Company for more than a decade; there is no doubt about its trustworthiness.
However, two days before the agreement is due to be closed, the subcontractor's internal network records a strange set of performance hiccups, especially on the Restricted File Servers located in the Server Farm at its headquarters. The problems are unresponsiveness and poor I/O performance during the night backups, enough to trigger the SNMP Monitoring Station about forty times in two hours.
The Sys Admin of the servers checks the logs, the processes and the available resources and finds nothing. Nevertheless, to ensure proper monitoring of the situation he activates a subset of monitoring processes with his own credentials, starting them manually on both file servers.
He is unsure whether the problem was caused by a failed update procedure for one of the latest patches distributed by the local patch management system.
He also calls the Net Admin to tell him about the strange behaviour of the File Servers and to ask whether the network guys have modified anything in the last three to five days.
The Net Admin denies any modification and assures the Sys Admin that he will investigate further. In fact the Net Admin immediately checks the local network and routers and finds nothing. The intranet is strictly regulated by static routing without complex rules, and everything seems to be operating normally.
The ICT Manager, informed by the personnel about the weirdness, calls the Security Manager to ask for his cooperation.
The Security personnel, informed of the incidents, contact the Sys Admin and the Net Admin and carry out no further investigation, concluding that the issue is down to patch misconfiguration errors. No more analysis will be made that day.
When the day of the agreement arrives, the Oil Company receives a call from its representatives. Something has changed the mind of the local Government. The license will not be issued that day, and the drilling permits will probably be granted only through a public auction.
All the efforts to obtain the permits early and privately have been sunk, and a complex negotiation is about to begin. But what has changed the Minister's mind out of the blue?
Later the same day, the representatives call the Company again to inform it that the licenses will be given, the next day, to its biggest competitor without an auction. The Prime Minister himself has awarded the competitor the extraction license.
The Company immediately calls the subcontractor. The agreement was known only to a restricted number of individuals in both Companies and just a few big local political figures. How has the competitor been bold and capable enough to beat them in just a few days, without warning?
The Subcontractor managers swear that they have not given the information to others and that their plans have been preserved in the most secure location in their Company.
The next day, friendly faces in the entourage of the local Prime Minister tell the Company representatives that a few days ago the Prime Minister was contacted by their competitor, and that something must have happened, because the Prime Minister himself then met the Minister of the Environment and the Minister of Industry about the oil rig concessions.
They also say that the competitor has made an offer only slightly better than the original one made by the Company, but good enough to convince the Prime Minister…
A week later, the Sys Admin of the Subcontractor, during a routine cleaning of local logs, discovers a scary set of entries in the SQL Database Event Logs.
The first weird entry is an access by the Backup Operator account at about 9:00 PM ten days earlier. Weird because the Backup Operator is a bot that is always started at 2:00 AM and is tied to the backup processes carried out at night.
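A logon of a scheduled service account outside its usual time window is exactly the kind of anomaly that a small script can report every morning. The following Python fragment is a sketch only: the account names, the timestamps, the 1:30 AM to 5:00 AM window and the function name `outside_window` are assumptions for illustration, and the window is assumed not to cross midnight.

```python
from datetime import datetime, time

def outside_window(logons, account, start, end):
    """Return the logon timestamps of `account` whose time of day
    falls outside the inclusive window [start, end]."""
    return [
        ts for acct, ts in logons
        if acct == account and not (start <= ts.time() <= end)
    ]

# Hypothetical logon records: (account name, timestamp).
logons = [
    ("backup_operator", datetime(2012, 3, 5, 2, 0)),   # regular night backup
    ("backup_operator", datetime(2012, 3, 6, 21, 0)),  # 9:00 PM, off schedule
    ("sysadmin", datetime(2012, 3, 6, 21, 5)),         # different account
]

odd = outside_window(logons, "backup_operator", time(1, 30), time(5, 0))
print(odd)
```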
But the scariest entries are a set of unsuccessful login attempts made from the SQL Admin account between 9 PM and 5 AM the same night.
In fact, by correlating these events with the Domain "pre-authentication failed" messages, the Admin discovers a set of login attempts on about five servers of the Restricted Area.
The domain controller logs record a long strip of code 675 events in the Event Log:
These records contain the username and IP address of a workstation normally used by several users to manage backup and restore of data in the Restricted Area.
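The correlation the Sys Admin performs by hand, grouping the pre-authentication failures by the workstation they come from, can be sketched as follows. This is an illustration under assumptions: the events are represented as already-parsed dictionaries with hypothetical field names (`event_id`, `client_ip`, `user`), the sample values are invented, and the threshold of three failures is arbitrary.

```python
from collections import defaultdict

def preauth_failures_by_source(events, threshold=3):
    """Group event-675 records by client IP and return the sources with
    at least `threshold` failures, together with the usernames tried."""
    by_ip = defaultdict(list)
    for ev in events:
        if ev["event_id"] == 675:  # pre-authentication failed
            by_ip[ev["client_ip"]].append(ev["user"])
    return {
        ip: {"failures": len(users), "users": sorted(set(users))}
        for ip, users in by_ip.items()
        if len(users) >= threshold
    }

# Hypothetical, already-parsed domain controller events.
events = [
    {"event_id": 675, "client_ip": "10.0.8.15", "user": "sqladmin"},
    {"event_id": 675, "client_ip": "10.0.8.15", "user": "sqladmin"},
    {"event_id": 675, "client_ip": "10.0.8.15", "user": "backup_operator"},
    {"event_id": 672, "client_ip": "10.0.8.15", "user": "netop1"},  # other event ID, ignored
    {"event_id": 675, "client_ip": "10.0.3.7", "user": "jsmith"},   # single failure, below threshold
]

report = preauth_failures_by_source(events)
print(report)
```

A single workstation that fails pre-authentication for several different accounts in one night is a much stronger signal than the same number of failures spread across many machines.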
Immediately the Sys Admin calls the Security Team and forwards them all the logs explaining the situation.
Three days and lots of coffees later, the Security Manager arranges a meeting with the board to show what his team has discovered: basically they have been the target of an attack carried out by professionals. The attackers used several bulletproof VPSs to jump into their network and steal their classified data.
There are several reasons why they were not able to identify the attack.
Basically they were compromised through a vulnerable laptop used by the Network Team to patch or manage its systems via serial console connection. The laptop, an old Windows 2000 workstation, was used by the team because it was the only laptop with a native serial (COM) interface. It was attacked directly when it was left turned on and connected straight to the Internet over the weekend two weeks earlier, after a scheduled maintenance.
The attackers exploited the system and then left a keylogger (Dracula Logger) on the machine in persistence mode. In this way they collected the credentials of a couple of Network Operator domain accounts, useful for accessing the Restricted Area of the Data Center.
They also jumped to the Maintenance Workstation, a Windows XP SP3 machine used by Network and Backup Operators.
With some Domain Accounts in their hands, the attackers tried to force access to the File Servers and the SQL Databases, but initially they did not succeed.
So they tried to copy some instances of the SQL Database, thus generating the network issues, but essentially without result, since the SQL Database was encrypted.
The real problem was the capture of the Sys Admin's credentials when he started the additional monitoring tools through the Maintenance Workstation. With this piece of the puzzle they were finally able to steal files and private data.
The Security Manager concludes that they have relied too much on Network logs and Firewalls to enforce Security. In fact nothing was recorded by their IDS during the SQL brute-force attack, because the encryption of the login packets for Database login created a network black hole, leaving the IDS unable to track user credentials when applications authenticate.
The meeting ends with the resignation of the Security Manager.
What to do?
How can we catch attacks like the ones I’ve described?
There is a lot to do to improve our general responsiveness against what the market calls Advanced Persistent Threats (APT). To be honest I don’t like the name: APT is too generic and misses the real capabilities that mark attacks like the ones I’ve shown earlier.
I prefer the name Advanced Attack Patterns, because the attackers use specific strategies and because quite often they don’t want to stay persistent, to reside in the victim’s network. Instead they adopt subtle strategies that rely on multiple stages. They don’t want to remain in the target network longer than they need to.
This means that the adoption of Exploit Kits, Trojans and Keyloggers is part of a complex canvas where it is up to the attacker to choose one tool rather than another.
This does not mean that Exploit Kits or Trojans are not a weapon of choice in such attacks, but they are chosen only if the target can reasonably be exploited with these tools rather than with custom versions of other advanced tools.
However, what I think we should do is enforce a relatively complex Security strategy that forces the attacker to play by ear, to improvise.
Normally attacks like the ones I’ve described are carried out by patient and skilled people, but even for the most skilled blackhat the worst scenario is the one where he has to act without a proper plan, without a strategy that makes him comfortable. The risk is being caught, or at least losing money and time by alarming the victim.
But how can we force the attacker to play on our turf?
In my opinion there are several ways; the most important is Company awareness and readiness, in a word, having a Computer Emergency Response Team that really works.
Enforce a verification lifecycle
In both of the cases I’ve described, the lack of knowledge or the inappropriate adoption of Security procedures gave the attacker the chance to fulfil his goals, to steal restricted or private data.
In my opinion, critical events are normally generated by a dangerous mix of attacking skill and inaccurate reaction. Often the reaction to minor incidents that could be seen as a prologue to the real attack is carried out in an incoherent way, underestimating the real threat.
At other times, by not correlating the events, the Security or Network operators do not see the attack, and their reactions generate more entropy, making the subsequent analysis extremely hard.
All this means that a proper lifecycle of testing should be put in place not only for evaluating the technologies and the infrastructures, normally tasks carried out by Vulnerability Assessments and Penetration tests, but also for checking the procedures, the awareness and the readiness of the Company personnel.
In my experience, testing the Company with a simulation of an ICT incident once in a while ensures an improved level of reactiveness not only for the Security teams, but for the entire Company.
But the management of incident tests and the readiness of the entire Company should be placed in the hands of the Incident Response Team, the internal structure that has to play a role during critical situations.
More space and responsibilities to the Incident Response team
In a world where a DDoS can be arranged and carried out in just a few minutes, or a computer can fall victim to a drive-by download in just a few seconds, being ready to face malicious threats is imperative for every mid-sized to large Company.
And today readiness cannot be ensured with technology alone. It is essential to have at least a small but skilled internal team of Security experts that can be called in when problems arise.
In my experience this is invaluable.
It remains to note that the Team should be made responsible for the actions taken during the incident. At the same time, it should be given room, in terms of operational freedom and the availability of proper communication channels with all the other internal and external structures.
In fact, it is extremely important to have a direct link with all Company Third-Party ICT providers and to ensure the highest operational capabilities.
A very good paper about the subject can be downloaded here:
More capabilities for Incident Response
To give the Incident Response more operational capabilities, a proper set of procedures and toolkits should be made available.
In my experience the procedures should be highly customized for the environment, because each Company has its own set of rules and policies.
As for technology and toolkits, they can be divided into three areas:
- Early Warning tools
- Inspection tools
- Mitigation tools
Early Warning means all the operational awareness that every IRT should keep constantly updated.
This means following the Security information flows based on online news, exploit updates, malware analysis and early warning systems. In this field my team, for example, has developed a platform called Sybil that collects and checks several potential attack patterns via honeypots and sandboxes and can inform the IRT about the newest threats or the massive diffusion of malware campaigns.
The Inspection Tools are forensic, system and network analysis tools useful during the incident. They can be divided into two groups: Centralized tools and Field kits.
The Centralized tools category includes log inspectors, correlators, SIEMs and monitoring tools; even the Antivirus Console could be considered part of it.
However, one invaluable tool in the Centralized category is the Sandbox environment. By studying the behaviour of a piece of malware in a sandbox environment the team can understand the strategy adopted by the attacker or, at least, identify the modifications introduced to the victim system by the malware and plan a set of corrective measures.
Field Kits, instead, are sets of Linux distros, such as CAINE or DEFT Linux, that are invaluable in forensic investigations. Some Field Kits are prepared on Windows systems and include FTK or EnCase applications, but such kits are really expensive and, in my opinion, money does not always mean value and capability.
Nevertheless, having such tools can be very useful, especially when the victims are smartphones or tablets.
Mitigation Tools relate to reaction capabilities, for example the IPS or Firewall console, but also tools that can be used to quarantine an environment.
This category, however, is the one most closely tied to the specific environment and is the one that should be governed by very tight and clear procedures in order to avoid misunderstandings and errors during the incident handling and the sanitization of compromised systems.
As you can see, I’ve not named specific tools and technologies, because it is up to each team to define its preferred choices.
My advice is simply to push further the idea of Incident Response as a real focus of the Security strategy of every corporate environment.
Because today, lacking proper management of potential threats means that the Company is sitting on a time bomb and using the timer to synchronize its clocks…