SECTION 3
The Stages of Incident Response
Incident response proceeds through multiple stages. Each stage should be performed in sequence with the integrity of the system in mind. From ensuring the company/organization is properly prepared for the inevitable incident to the complete and successful prosecution of a malicious insider or external “hacker,” every incident response stage must be followed and completed. We present two different methodologies for Incident Response, both of which are used in the industry to deliver Incident Handling and Response activities.
METHODOLOGY #1
There are seven stages for incident response in this methodology: Preparation, Identification, Containment, Investigation, Eradication, Recovery, and Follow-Up. As noted above, each stage should be performed in sequence, and all of them must be completed.
Preparation
Preparation is critical to the successful identification, handling, and recovery for any computer incident. Preparation includes knowing how to identify the start of an incident, how to recover from it, and how to get back to normal business operations sooner. This preparation includes the creation and approval of well-established corporate security policies, including warning banners, user privacy expectations, an established incident notification process, the development of an incident containment policy, the creation of incident handling checklists, ensuring the corporate disaster recovery plan is up to date, and making sure the security risk assessment process is functioning and active.

Another important part of preparation is training: training for the incident responders (SIRT team members) and training for the organization’s system administrators. Types of training include operating system support, specialized investigative techniques, incident response tools usage, and corporate environmental procedure requirements. An additional focus of specialized training includes decision-making efforts to determine if and when local law enforcement authorities should be contacted during investigations.

Computer Incident Response and Forensics Team Management. DOI: http://dx.doi.org/10.1016/B978-1-59749-996-5.00003-0 © 2014 Elsevier Inc. All rights reserved.

A third area important to preparation is having predeployed incident handling assets. These assets include:
A. Sensors, probes, and monitors on critical systems. These automated mechanisms allow monitoring of:
   1. System applications with a service and network monitoring program;
   2. CPU utilization, disk space, processes, and other pertinent metrics, with investigation of any unusual activity;
   3. Access to the application (allows and denies).
B. Tracking databases on core systems, based upon minimum security baselines (snapshots of the systems or networks during normal operations) and the corporate Configuration Management Data Base (CMDB).
C. Active audit logs for all server and network components, including where the logs are retained, how often the logs are updated, who controls them, etc.

It is best to have automated mechanisms employed to increase the availability of support features to users as well as to signal the need for a possible criminal investigation.

One final part of preparation is a set of questions you always need to put to your users so they know what is expected:
■ What action are they to take or not take at identification of an incident?
■ Who do they call?
■ When do they call?
■ What information do they provide when calling?
■ Who should they notify or not notify about the possible incident?
Providing a basic response template for users to fill in when reporting an event or incident is a good way to ensure the right information is provided in a timely manner.
Identification
The second stage of Incident Response is proper Identification of the incident. Is the event simply an unusual activity, or can you identify it as suspicious? If so, what are the surrounding activities? Are there multiple reports of issues on the network, or is it confined to one machine or location? Some of the areas to check include:
1. Suspicious Entries in System or Network Accounting
The accounting logs for each location should provide indicators of unusual actions in the system or network activities.
2. Excessive Unsuccessful Login Attempts (>3 per user)
Security controls are usually designed to deny access once three unsuccessful logins are attempted in a row. Repeated failures are an indicator of potentially malicious attackers trying to access systems or networks without having compromised credentials.
3. Unexplained New User Accounts
Compromised administrative accounts are often used to create new accounts that give malicious users access to systems and networks.
4. Unexplained New Files
New files are often added to the system directories in the expectation that users are not aware of the normal file set within these directories.
5. Unfamiliar File Names
Modern malware often loads files onto a system hard drive with names which are not normally found in a modern operating system. These files can be identified and evaluated quickly once discovered.
6. Modifications to file names and/or dates, especially in system executable files
“Hackers” often rename malicious executable files with the names of normal files and then place them into the expected directories of the operating system.
7. Intrusion Detection System (IDS) Alerts or Alarms
Network- and host-based Intrusion Detection Systems can often give the responder indications of potentially malicious or harmful activity, so they should be reviewed and examined during initial response efforts to identify a possible path of attack and/or access.
8. External notifications (either outside users, customers, or emergency personnel)
Often the first indication of a problem comes from external sources, such as the local Computer Emergency Response Team (CERT) or outside users indicating they have received a strange or unusual indicator.
9. E-mail flood from unsuspecting helper
E-mail flooding, often an indicator of a worm event, can show a responder the methods and techniques used to gain ingress into the network and should be used to instigate a response.

All of these potential incident indicators can help you determine whether an event really is an incident or just an unusual event. Who noticed the suspicious or unusual event, and when did they notice it? What has happened or has changed? These are a couple of the questions asked when an incident occurs, and they help identify the event or, potentially, the incident. The reporting person, the end user or the system administrator, should fill out a standardized computer incident report. This report should contain multiple fields wherein the reporting person identifies the time and date of occurrence, the location of occurrence, the machine/system/network involved, and the activity under way when the event occurred. This form provides valuable initial identification data for ensuring the proper handling of the event. For example, if a system/computer has an apparent, identifiable virus/malware infection, this could be reflected on the reporting form by the type of end-user interaction and response reported.

The point of incident response and the gathering of all the facts and data is to evaluate the extent of the unauthorized disclosure and/or distribution; determine its origin; identify the account(s) involved (if applicable); identify the potential agencies, information systems, and/or network(s) affected (if applicable); identify the type(s) of file formats involved in the incident (e.g., e-mail, Web posting, and database); document all actions taken; and gather any evidence processed.
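The standardized computer incident report described above could be modeled as a small record type. The following is only a minimal sketch: the class name `IncidentReport`, its field names, and the `missing_fields` helper are illustrative assumptions, not a form prescribed by the text. The fields mirror the ones the text lists (time and date, location, machine/system/network involved, and the activity under way).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class IncidentReport:
    """Hypothetical standardized computer incident report (illustrative only)."""
    reporter: str            # end user or system administrator filing the report
    occurred_at: datetime    # time and date of occurrence
    location: str            # location of occurrence
    asset: str               # machine/system/network involved
    activity: str            # activity under way when the event occurred
    observations: str = ""   # optional free-text notes on what was noticed

    def missing_fields(self) -> list[str]:
        """Return the names of required text fields that were left blank."""
        required = ("reporter", "location", "asset", "activity")
        return [name for name in required if not getattr(self, name).strip()]


if __name__ == "__main__":
    report = IncidentReport(
        reporter="jdoe",
        occurred_at=datetime(2014, 1, 15, 9, 30),
        location="",
        asset="workstation WS-114",
        activity="opening an e-mail attachment",
    )
    print(report.missing_fields())
```

A check such as `missing_fields` is one way to make sure a report is complete before it is handed to the response team; a real form would follow the organization's own policy.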
NOTE: The primary and paramount step in the initial identification of an event or incident is the absolute requirement that all investigative activities, all initial inspections, and all initial responses are performed AFTER a complete bit-stream image copy of the system under investigation has been created. This process can be completed quickly with the proper tools and support equipment; therefore, it is imperative to have these available at the start of the identification stage of the incident response process, not later on during the containment or eradication steps. The logs to be evaluated, as well as the e-mails and other supporting data, should also be digitally copied off the suspect system(s). This is the most important requirement to ensure proper and successful prosecution of any resultant civil or criminal case.
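One common way to show that a copy is faithful to its source is to compare cryptographic hashes of the two. The sketch below assumes the image and the copied logs are ordinary files that can be read from Python; it does not create a bit-stream image itself, and the function names `sha256_of` and `copy_matches_source` are hypothetical. SHA-256 is chosen here as an assumption; the text does not name an algorithm.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_matches_source(source: Path, copy: Path) -> bool:
    """True when the copy has the same SHA-256 digest as the source."""
    return sha256_of(source) == sha256_of(copy)
```

Recording the digest at the moment of copying, and again before each examination, gives the responder a simple way to demonstrate that the working copy has not changed.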
Some of the other activities that assist in identification are correlation of events through system logs, network logs, firewall logs, and application logs. Attention to detail is the primary attribute here when looking for potential malicious code, such as a “key-logger” or “rootkit,” backdoors and trapdoors placed into a system by a “hacker” or even an insider, or some other malicious mobile code. Other areas to look at:
■ Are there other indications of a problem or an issue, such as IDS alerts and alarms, or third-party e-mail recipients complaining about excessive e-mail from your domain, which would indicate a worm is running on your network?
■ Excessive unsuccessful login attempts per user, unexplained new accounts or files, modification of system executable files, etc.
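As an illustration of the "excessive unsuccessful login attempts" indicator, the sketch below counts failed logins per user and reports those above the threshold of three mentioned earlier. It assumes the log entries have already been parsed into simple records; the `LoginEvent` type and the function name are hypothetical, and real log formats vary by system.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class LoginEvent(NamedTuple):
    """Hypothetical parsed log entry: who tried to log in and whether it worked."""
    user: str
    success: bool


def users_over_threshold(events: Iterable[LoginEvent], threshold: int = 3) -> dict[str, int]:
    """Return users with more than `threshold` unsuccessful logins, with their counts."""
    failures = Counter(event.user for event in events if not event.success)
    return {user: count for user, count in failures.items() if count > threshold}


if __name__ == "__main__":
    sample = [LoginEvent("alice", False)] * 4 + [LoginEvent("bob", False)] * 3
    sample.append(LoginEvent("bob", True))
    print(users_over_threshold(sample))
```

This simple count ignores time windows and consecutive-failure rules; those refinements depend on the organization's own lockout policy.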
The basic requirement for incident response is the identification of the type of incident that has occurred or is occurring at the time of response. The six levels for classification of incidents are provided here. This list originated with the US-CERT and National Institute of Standards and Technology (NIST) organizations for use with governmental systems but has been modified to encompass both public and private organizations. The list includes the reporting requirements, under generally accepted security practices, for each type of incident (Table 3.1).

Table 3.1 Incident Categories by Type of Incident

Category | Name | Incident Description
Level 1 | Unauthorized access | In this category an individual gains logical or physical access without permission to a department/agency/corporate network, system, application, data, or other resource (e.g., physical documents). This category includes any breach of Personally Identifiable Information (PII) or Privacy data. This type of incident should be reported to the responsible corporate or organizational office as soon as the incident is identified.
Level 2 | Denial of service (DoS) | An attack that successfully prevents or impairs the normal authorized functionality of networks, systems, or applications by exhausting resources. This activity includes being the victim of or participating in the DoS. The DoS attack is focused on not allowing users to access the computing resources needed to accomplish their tasks. Most current router devices have mitigation techniques built in to their operating systems, so this type of incident is either directed against older equipment or focused on large-scale attacks from multiple sources.
Level 3 | Malicious code | Successful installation of malicious software (e.g., virus, worm, Trojan horse, or other code-based malicious entity) that infects an operating system or application. This level today has by far the largest number of incidents reported. Most of the antivirus and malware security vendors are reporting large increases in the number and deployment of malware across the entire Internet. Departments and Companies are NOT usually required to report malicious logic that has been successfully quarantined by antivirus software.
Level 4 | Improper usage | A person violates acceptable computing use policies. This is the typical classification for an insider threat or disgruntled employee incident within the organization. This type of incident typically causes large-scale losses but is often isolated to one network or location and carries the potential to harm an information system or enterprise through destruction, disclosure, modification of data, and/or denial of service… There are potentially major activities which require incident handling with this type of incident.
Level 5 | Scans/probes/attempted access | This category includes any activity that seeks to access or identify a corporation or department computer, open ports, protocols, service, or any combination for later exploit. This activity does not directly result in a compromise or denial of service. This type of incident can be caused by vulnerability scanning tools, network mapping tools, as well as penetration testing tools. This incident level can be related to expected or unexpected scans, tests, automated equipment evaluations, as well as outside reconnaissance of networks and machines.
Level 6 | Investigation incident | This category covers unconfirmed incidents that are potentially malicious or anomalous activity deemed by the organization/corporation to warrant further review. Once this incident is determined to require additional investigation, this level remains in effect during the entire investigation. This is the level for all levels of investigations, criminal, civil, and administrative, as well as forensics investigations.
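When incidents are tracked in software, the six categories of Table 3.1 can be encoded as an enumeration so that every record carries exactly one level. This is a minimal sketch; the class name `IncidentLevel` and the `label` helper are illustrative assumptions, while the level numbers and names come from the table.

```python
from enum import IntEnum


class IncidentLevel(IntEnum):
    """The six incident categories of Table 3.1 (member names are illustrative)."""
    UNAUTHORIZED_ACCESS = 1
    DENIAL_OF_SERVICE = 2
    MALICIOUS_CODE = 3
    IMPROPER_USAGE = 4
    SCANS_PROBES_ATTEMPTED_ACCESS = 5
    INVESTIGATION = 6


def label(level: IncidentLevel) -> str:
    """Format a level for reports, e.g. for use in a ticket title."""
    readable = level.name.replace("_", " ").capitalize()
    return f"Level {level.value}: {readable}"


if __name__ == "__main__":
    for level in IncidentLevel:
        print(label(level))
```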
Containment
Limiting the scope and magnitude of the incident is of paramount interest. There are two primary areas of coverage here: protect and keep available critical computing resources where possible, and determine the operational status of the infected computer, system, or network. Protecting resources includes immediately changing administrative passwords, reevaluating and determining trust relationships between systems and networks, and, if you are a governmental agency, contacting your local computer emergency response team to check the scope of the incident. Once again, the initial record of all logs for review, examination, and investigation needs to consist of bit-stream, hash-verified copies of the originals, not the originals themselves, for legal purposes and potential future criminal or civil litigation.

To determine the operational status of your infected system and/or network, you have three options, based upon immediate appearances when the incident is discovered and on the type of processing device involved:
A. Disconnect the system from the network and allow it to continue stand-alone operations. For example, if a workstation is contaminated, responders should disconnect it from the network immediately by removing the physical connection (e.g., RJ-45 cable or fiber); the workstation remains in operational mode.
B. Shut down everything immediately. For example, if the monitor of a server or workstation shows active file deletions when the responder arrives at the scene, the responder immediately assesses whether the files are required, are important, or could provide evidence in response efforts. If so, the workstation or server is turned off to preserve the remaining files before the deletion activities are completed, which retains some system integrity before all files are deleted.
C. Continue to allow the system to run on the network and monitor the activities.
All three options are available at the beginning of an incident response and should be determined as quickly as possible to allow movement to the next stage, which is investigation of the incident.
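The choice among the three options can be written down as a small decision function so that responders apply it consistently. The sketch below is an assumption-laden illustration: the inputs (`active_deletion`, `files_matter`, `contaminated`) and the use of option C as the fallback are choices made for this example, since the text does not state when monitoring should be preferred.

```python
from enum import Enum


class ContainmentOption(Enum):
    """The three containment options A, B, and C described above."""
    DISCONNECT = "A: disconnect from the network and continue stand-alone"
    SHUT_DOWN = "B: shut down immediately"
    MONITOR = "C: leave on the network and monitor"


def choose_containment(active_deletion: bool, files_matter: bool, contaminated: bool) -> ContainmentOption:
    """Pick an option from what the responder observes on arrival (illustrative rules)."""
    # Option B: files are being deleted and they are required, important, or evidentiary.
    if active_deletion and files_matter:
        return ContainmentOption.SHUT_DOWN
    # Option A: the machine is contaminated, so isolate it but keep it running.
    if contaminated:
        return ContainmentOption.DISCONNECT
    # Option C as the fallback is an assumption of this sketch, not a rule from the text.
    return ContainmentOption.MONITOR
```

In practice the decision also depends on the type of device and on organizational policy, so a function like this is better treated as a checklist aid than as an authority.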
Investigation (Break-off point for Forensics)
This step is the initial investigative step necessary to determine the breadth and scope of the incident. Once all the bit-stream copies of the drives, external storage, real-time system memory, network device logs, system logs, application logs, and other supporting data are assembled, a systematic review of each piece of evidence is conducted. Yes, at this time, these components and drives are considered evidence, at least until each has been evaluated and determined not to be relevant or germane to the incident investigation. Some of the areas for investigation include:
■ How much? Where did it go? How widespread? (Determining the extent of infection or penetration.)
■ Was it network- or host-based, i.e., computer/user system access?
■ What data was accessed?
■ Where did the suspect/malicious activity go?
■ What happened to this device/component?
■ When did it occur?
■ In what log was it used or found?
■ Who did it?
■ Was it an inside threat or outside attack?
■ What do the log reviews reveal?
Document each step of the investigation, especially since external threats may require law enforcement involvement. This is often the point of separation between the return-to-normal-operations actions taken from a business perspective and the detailed investigative actions performed by forensics personnel, and potentially even law enforcement personnel, if the incident warrants legal action.
Eradication
Eradication is getting rid of the problem. Once the investigation shows there are no further external or internal actions necessary to complete the containment or investigation, the Eradication step is initiated. Its two important parts are cleanup and notification. Cleanup usually consists of running your antivirus software, uninstalling infected software, rebuilding the entire operating system or replacing the entire hard drive, and reconstituting the network communications and equipment. Notification always includes relevant personnel, both above and below the SIRT manager in the reporting chain. Getting the system back into operation activates the next stage, Recovery.
Recovery
Recovery is returning the system/network/component to normal business operations. There are always two steps to any recovery effort. The first involves service restoration, based upon implementation of corporate contingency plans; the second involves system and/or network validation, testing, and certifying the system as operational. Each system or component which has been determined to have been compromised must be recertified as both operational and secure, so as not to reintroduce the problem into the operating environment of the organization after initial recovery.
Follow-Up
After the incident is completed, there are questions for the incident response team and the corporation to answer. These include:
A. Was there sufficient preparation?
B. Did detection occur promptly?
C. Were communications adequate?
D. What was the financial or informational cost of the incident?
E. How can we prevent it from happening again?
In this stage the follow-on review is done as an objective evaluation. Computer security mistakes are our best hope for improvement, because progress is measured by making new mistakes instead of the same ones over and over again.
METHODOLOGY #2
The second methodology for Incident Response was devised by the US Government’s NIST organization. It is straightforward and simple in its design, but detailed in implementation.
[Figure: NIST incident response life cycle, showing four phases: Preparation; Detection and analysis; Containment, eradication, and recovery; Postincident activity.]
Preparation
The first step in this methodology is the Preparation Stage. The purpose of this stage is twofold: to create an incident response mechanism within the organization, and to install a minimum security baseline in the IT and network infrastructure of the organization so as to set criteria that help prevent incidents from occurring in the first place. This first part includes security engineering activities for the network, systems, and the various applications running on the systems and networks. Reviews of security products, services, and systems before installation are considered important because they can establish the base level of security in the multiple areas of administrative, technical, and operational security controls. The incident response team’s level of security experience and subject matter expertise is vital to this effort.
Detection and Analysis
The second stage of this Incident Response Methodology is the Detection and Analysis Stage. This is the phase where the actual detection of an incident occurs. There are many methods available today which allow for the automated detection of possible events and incidents. Software- and hardware-based devices and components can detect changes in network traffic patterns, changes in file directory structures and the sizes of the files themselves, or even the behavior of files on the servers or network. However, even in today’s interconnected world, the most common way to receive notification of a problem is from the users themselves. They will know when a program is running exceptionally slowly, or when a directory which should be open is unavailable, etc.

There are many incident indicators which will need to be reviewed, prioritized, and evaluated each day in a large commercially active organization. So it does become important to have the right people with the right skills at the right location and time who are able to provide the needed actions in response to each incident. These skilled responders will read the “signs” of an activity or event to see if it indicates an incident needing review. Some of these “signs” can indicate a potential area of concern, something to watch for in the near future, and some can indicate current activity which needs immediate attention. The skill of the responders in their methods of identification of incidents is critical in this stage to ensure the correct incident events are responded to and the false-positive ones are identified and properly evaluated.
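Detecting changes in directory structure and file sizes amounts to comparing the current state of a directory tree with a known-good baseline. The sketch below shows one simple way to do that; the function names `snapshot` and `compare` are hypothetical, and hashing file contents with SHA-256 is an assumption added here so that same-size modifications are also caught.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, tuple[int, str]]:
    """Map each file's path (relative to root) to its size and SHA-256 digest."""
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()  # whole-file read; acceptable for a small sketch
            result[path.relative_to(root).as_posix()] = (len(data), hashlib.sha256(data).hexdigest())
    return result


def compare(baseline: dict, current: dict) -> dict[str, list[str]]:
    """List files added, removed, or changed since the baseline snapshot."""
    added = sorted(current.keys() - baseline.keys())
    removed = sorted(baseline.keys() - current.keys())
    changed = sorted(name for name in baseline.keys() & current.keys()
                     if baseline[name] != current[name])
    return {"added": added, "removed": removed, "changed": changed}
```

A responder could take a snapshot during normal operations, store it off the monitored system, and run `compare` against a fresh snapshot when an event is reported. Unexplained entries under "added" or "changed" are candidates for further review, not proof of an incident.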
Containment, Eradication, and Recovery
Containment of the incident is usually the next major step in the response once the nature of the incident has been identified. As a part of this containment, the team leader’s decision-making becomes paramount to allow the right resources to be applied to the effort, to properly gain management support during the response, and to “secure the data,” which, of course, is the primary goal of any response action. Various questions need to be answered during this stage by the responders and their managers, such as:
a. Should we shut the system off?
b. Should we disconnect the network from the machine?
c. Should we disable certain ports, protocols, or services first?
Depending upon what kind of incident it is, some other considerations are then weighed for eradication during the response. The type of incident will, in itself, provide these areas. These considerations include:
■ Potential damage to and theft of resources.
■ Need for evidence preservation.
■ Service availability (e.g., network connectivity, services provided to external parties).
■ Time and resources needed to implement the strategy.
■ Effectiveness of the strategy (e.g., partially/fully contains the incident).
■ Duration of the solution (e.g., emergency workaround to be removed in 4 h, temporary workaround to be removed in 2 weeks, permanent solution).
Full eradication of the cause of the incident is the next goal within this framework for response. Eradication actions could include deletion of the malicious software or code snippet, disabling certain accounts on the system, closing the applicable firewall ports, etc. Remember, based on the type of incident, full eradication may not be needed and could actually cause further damage, so the method of eradication is also an important consideration during this stage.

Recovery back to full business operations is the ultimate goal of this stage. Recovery activities will include restoring the affected system from full, uninfected backups, rebuilding systems from scratch, hardening systems to prevent further occurrence of the incident, adding new or expanded security parameters on boundary devices, changing administrative passwords, increasing the logging of events immediately after the incident to ensure full recovery, and increasing the monitoring of the network and system, since an incident is often followed by further events and attacks in its aftermath.
POST-INCIDENT ACTIVITY
In the full arena of incident response, one of the most important activities is the learning that takes place after the event. The results of the response should ideally always be improving, and this postincident effort is the key to ensuring the learning process takes place. Each team and team member should review the effort, the techniques used, the timing of the response, the threat realized, and the support actions taken, with an eye to improving the response the next time. Each incident response team should evolve to reflect new threats, improved technology, and lessons learned. These lessons-learned actions can dramatically improve the security of the organization, as well as improve the incident handling and response mechanisms within the response team.
The National Institute of Justice has produced a document entitled Electronic Crime Scene Investigation: A Guide for First Responders,¹ which discusses the techniques taught to law enforcement personnel when responding to incident scenes. The document describes four basic steps:
Secure and Evaluate the Scene
When securing and evaluating the scene, the first responder should:
■ Follow departmental policy for securing crime scenes.
■ Immediately secure all electronic devices, including personal or portable devices.
■ Ensure that no unauthorized person has access to any electronic devices at the crime scene.
■ Refuse offers of help or technical assistance from any unauthorized persons.
■ Remove all persons from the crime scene or the immediate area from which evidence is to be collected.
■ Ensure that the condition of any electronic device is not altered.
■ Leave a computer or electronic device off if it is already turned off.
Document the Scene
Documentation of a crime scene creates a record for the investigation. It is important to accurately record the location of the scene; the scene itself; the state, power status, and condition of computers, storage media, wireless network devices, mobile phones, smart phones, PDAs, and other data storage devices; Internet and network access; and other electronic devices. The first responder should be aware that not all digital evidence may be in close proximity to the computer or other devices.

The initial documentation of the scene should include a detailed record using video, photography, and notes and sketches to help recreate or convey the details of the scene later. All activity and processes on display screens should be fully documented. Documentation of the scene should include the entire location, including the type, location, and position of computers, their components and peripheral equipment, and other electronic devices. The scene may expand to multiple locations; first responders should document all physical connections to and from the computers and other devices.

Record any network and wireless access points that may be present and capable of linking computers and other devices to each other and the Internet. The existence of network and wireless access points may indicate that additional evidence exists beyond the initial scene.

¹ Electronic crime scene investigation: a guide for first responders. 2nd ed. National Institute of Justice; April 2008 [NCJ 219941].
Perform Evidence Collection
The first responder must have proper authority (such as plain view observation, consent, or a court order) to search for and collect evidence at an electronic crime scene. The first responder must be able to identify the authority under which he or she may seize evidence and should follow agency guidelines, consult a superior, or contact a prosecutor if a question of appropriate authority arises.

Digital evidence must be handled carefully to preserve the integrity of the physical device as well as the data it contains. Some digital evidence requires special collection, packaging, and transportation techniques. Data can be damaged or altered by electromagnetic fields such as those generated by static electricity, magnets, radio transmitters, and other devices. Communication devices, such as mobile phones, smart phones, PDAs, and pagers, should be secured and prevented from receiving or transmitting data once they are identified and collected as evidence.

Business environments frequently have complicated configurations of multiple computers networked to each other, to a common server, to network devices, or a combination of these. Securing a scene and collecting digital evidence in these environments may pose challenges to the first responder. Improperly shutting down a system may result in lost data, lost evidence, and potential civil liability. The first responder may find a similar environment in residential locations, particularly when a business is operated from the home. In some instances, the first responder may encounter unfamiliar operating systems or unique hardware and software configurations that require specific shutdown procedures.
Package, Transport, and Store the Collected Digital Evidence
Digital evidence, and the computers and electronic devices on which it is stored, is fragile and sensitive to extreme temperatures, humidity, physical shock, static electricity, and magnetic fields. The first responder should take precautions when documenting, photographing, packaging, transporting, and storing digital evidence to avoid altering, damaging, or destroying the data.
Packaging Procedures
All actions related to the identification, collection, packaging, transportation, and storage of digital evidence should be thoroughly documented. When packing digital evidence for transportation, the first responder should:
■ Ensure that all digital evidence collected is properly documented, labeled, marked, photographed, video recorded or sketched, and inventoried before it is packaged. All connections and connected devices should be labeled for easy reconfiguration of the system later.
■ Remember that digital evidence may also contain latent, trace, or biological evidence and take the appropriate steps to preserve it. Digital evidence imaging should be done before latent, trace, or biological evidence processes are conducted on the evidence.
■ Pack all digital evidence in antistatic packaging. Only paper bags and envelopes, cardboard boxes, and antistatic containers should be used for packaging digital evidence. Plastic materials should not be used when collecting digital evidence because plastic can produce or convey static electricity and allow humidity and condensation to develop, which may damage or destroy the evidence.
■ Ensure that all digital evidence is packaged in a manner that will prevent it from being bent, scratched, or otherwise deformed.
■ Label all containers used to package and store digital evidence clearly and properly.
■ Leave cellular, mobile, or smart phone(s) in the power state (on or off) in which they were found.
■ Package mobile or smart phone(s) in signal-blocking material such as Faraday isolation bags, radio frequency-shielding material, or aluminum foil to prevent data messages from being sent or received by the devices. (First responders should be aware that if inappropriately packaged, or removed from shielded packaging, the device may be able to send and receive data messages if in range of a communication signal.)
■ Collect all power supplies and adapters for all electronic devices seized.
FIGURE 3.1 Flowchart for evidence collection. The flowchart contains the following elements.
Start: Secure scene and move everyone away from computers and electronic devices.
Decision points (each answered YES or NO):
■ Is the computer powered on?
■ Are law enforcement personnel with specific computer seizure training available?
■ Is the system a networked business environment?
■ Are destructive processes running?
■ Is information of evidence value visible onscreen?
Actions:
■ DO NOT turn the computer or device on.
■ STOP! DO NOT turn computer or device off. Contact personnel trained in network seizure.
■ Request assistance and follow recommendations of personnel with specific digital evidence seizure training.
■ Thoroughly document and photograph all information on the screen.
■ Remove power cord from back of computer and connected devices.
■ Label all connections on computers and devices as well as cables and power supplies.
■ Locate and secure all evidence within the scope of authority for the specific circumstances.
■ Document, log, and photograph all computers, devices, connections, cables, and power supplies.
■ Log and secure all evidence according to agency policies pending forensic examination.
Note: Destructive processes can be any functions intended to obliterate data on the hard drive or data storage device. Terms like “format”, “delete”, “remove”, and “wipe” can be indicative of destructive processes. Document these indicators in reports.

Transportation Procedures
When transporting digital evidence, the first responder should:
■ Keep digital evidence away from magnetic fields, such as those produced by radio transmitters, speaker magnets, and magnetic mount emergency lights. Other potential hazards that the first responder should be aware of include seat heaters and any device or material that can produce static electricity.
■ Avoid keeping digital evidence in a vehicle for prolonged periods of time. Heat, cold, and humidity can damage or destroy digital evidence.
■ Ensure that computers and electronic devices are packaged and secured during transportation to prevent damage from shock and vibration.
■ Document the transportation of the digital evidence and maintain the chain of custody on all evidence transported.
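The chain-of-custody documentation called for during transportation can be kept as an append-only history attached to each evidence item. The sketch below is illustrative only: the class names `CustodyEntry` and `EvidenceItem`, their fields, and the sample values are assumptions, and an agency's own forms and policies take precedence.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CustodyEntry:
    """One hand-off or action in the life of an evidence item (immutable once written)."""
    timestamp: datetime
    handler: str
    action: str      # e.g. "collected", "transported", "stored"
    location: str


@dataclass
class EvidenceItem:
    """Hypothetical evidence record carrying its own custody history."""
    label: str
    description: str
    history: list[CustodyEntry] = field(default_factory=list)

    def record(self, handler: str, action: str, location: str,
               timestamp: Optional[datetime] = None) -> CustodyEntry:
        """Append a custody entry; defaults to the current UTC time."""
        entry = CustodyEntry(timestamp or datetime.now(timezone.utc), handler, action, location)
        self.history.append(entry)
        return entry

    def current_handler(self) -> Optional[str]:
        """Return who last handled the item, or None if nothing is recorded."""
        return self.history[-1].handler if self.history else None


if __name__ == "__main__":
    item = EvidenceItem("E-001", "laptop with power adapter")
    item.record("Officer A", "collected", "scene")
    item.record("Officer B", "transported", "vehicle")
    print(item.current_handler())
```

Making each entry immutable and only ever appending to the history reflects the intent of a custody record: earlier entries are never edited, only added to.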
Storage Procedures
When storing digital evidence, the first responder should:
■ Ensure that the digital evidence is inventoried in accordance with the agency’s policies.
■ Ensure that the digital evidence is stored in a secure, climate-controlled environment or a location that is not subject to extreme temperature or humidity.
■ Ensure that the digital evidence is not exposed to magnetic fields, moisture, dust, vibration, or any other elements that may damage or destroy it.
■ If more than one computer is seized as evidence, all computers, cables, and devices connected to them should be properly labeled to facilitate reassembly if necessary.
It is interesting to see what the Law Enforcement Officer (LEO) community is taught from a management perspective: where the focus points are, and how the procedures are developed and portrayed to the technicians who are on scene collecting the evidence. The basic flowchart for evidence collection is shown in Figure 3.1.