Network Security, April 1997
Enhanced Firewall Infrastructure Testing Methodology

Philip R. Moyer

The demand for effective firewall penetration testing procedures is increasing, not decreasing, as more and more organizations connect their networks to the Internet. Organizations are also increasing their reliance on the Internet for real, bottom-line profit. Upper management, now more than ever, is asked to make risk decisions based on incomplete information. More and more frequently, upper management calls for penetration tests as a mechanism for evaluating their overall risk. Most penetration tests run today are too weak for that purpose. I would like to present a methodology for conducting thorough penetration tests against firewall infrastructure systems.

Introduction

Early in 1996 (see Network Security, March 1996), Moyer and Schultz (Moyer, 1996) presented a methodology for conducting penetration tests against firewalls. This methodology has brought consistent success for Moyer and Schultz. Results from numerous tests show that this is a sound methodology for testing resistance to external threats, as reported by Schultz (Schultz, 1997). Experience shows, however, that external threats, while significant, are not the most likely avenue for compromise in an organization’s networks. The most likely cause of information loss in organizations is, and always has been, insider theft or compromise. Existing penetration testing techniques usually do not address this insider threat. Other experience and consideration have prompted me to extend the initial penetration testing methodology to incorporate other threat vectors.
©1997 Elsevier Science Ltd
Goals, constraints and validity

The general goal of a penetration test is to evaluate a system’s resistance to attack. An ideal penetration test would be one conducted against a fully configured and fielded system that had been in operation for some weeks before the test. In addition, an ideal penetration test would incorporate methods of detecting testing oversight, or methods for compromising the firewall which the penetration team did not use. Conducting a perfect penetration test, however, is a time-consuming project; results may not be available for three or four months. In addition, granting the penetration team free rein over all connected networks is, in itself, a risky proposition. It is important to recognize at the outset that the results will, for pragmatic reasons, be skewed towards the safe end of the security spectrum. Organizations conducting, or hiring someone to conduct, penetration tests should evaluate the expected threats in each specific operating environment and tailor the penetration test to those threats only. As will be obvious later in this article, an all-out penetration test will, in all cases, succeed. Very few firewall systems will, for example, maintain the correct security posture if the attacker is the firewall administrator. It is certainly possible that a firewall will need to resist such attacks; intra-network firewalls within some government agencies would meet that criterion.

A penetration test, like static security checking tools such as Tiger or COPS, is a snapshot security assessment. It is only valid for the time at which the attackers attempt the penetration. The mere act of administering the firewall system changes the security. We cannot logically draw conclusions about the current security state simply because similar conditions existed at the time of the test. This methodology, however, incorporates procedures to mitigate this weakness.

Any deviation from the ideal penetration testing conditions introduces uncertainty into the final result. The firewall design, implementation plan, administration procedures, and overall architecture will all influence the degree to which initial conditions will alter the validity or relevance of the final test analysis. A firewall system or infrastructure, which includes the firewall and any closely
connected systems, is a complex, non-linear, tightly-coupled system. Deviation from the ideal test conditions will reduce, to an unknowable though potentially significant degree, the confidence we can have in the final results (Perrow, 1984).

Security exposure introduced through the implementation process is an excellent example of invalid results obtained by testing under imperfect conditions. A firewall may be perfectly secure during the test process because it has not yet been connected to production support systems, such as Web servers running on Unix machines. After the penetration test, when management is falsely confident in the firewall’s capabilities, the system may be connected to these Unix machines. In the course of testing the network capabilities, the firewall may be reconfigured to support additional services. Or perhaps the Unix system itself shares a trust relationship with an internal system, and runs a faulty HTTP daemon. These are examples of changes to the system that can take place after the penetration tests and will significantly alter the security posture of the system. The lesson we can draw from this example is that penetration tests should be conducted against systems that are as close to ‘finished’ as possible. Any change to the system after the penetration test will degrade the overall validity of the results. Because firewalls and support systems are complex by nature, we cannot predict with any confidence how significant that degradation may be.

The scope of the test can also influence the validity of the final results. Restricting the attackers to known IP addresses, a single,
pre-identified set of external subnets, or external-only attacks will degrade the test’s validity. The facts tell us that insiders will attack firewalls. If the penetration team is prohibited from testing internal threat vectors, the final results will be skewed away from the actual state towards the safer end of the security spectrum. The facts tell us that internal systems can be, and frequently are, used to attack firewall or DMZ systems. If the penetration team is prohibited from indirect attacks, in which the team first penetrates an internal system, then uses it to attack the firewall, the results will be similarly skewed to the safer end of the spectrum. Again, the entire firewall infrastructure, not just the firewall itself, must be as close to production as possible to maximize validity.

Dissemination of knowledge regarding the test will influence the validity of the test. Systems support personnel should not have prior knowledge of when or how the test will be conducted. Any prior knowledge will cause the systems and security administrators, perhaps subconsciously, to be more wary of any security events. They are more likely to catch the penetration team. Certainly, the security systems and administrators have a vested interest in playing ‘catch the intruder’ during a penetration test. This behaviour will skew the results towards the safer end of the security spectrum.
Existing methodology

Moyer and Schultz (Moyer, 1996) describe a fielded penetration testing methodology. This methodology incorporates four attack levels. These levels are: (1) Information Gathering; (2) Probes; (3) Attacks; (4) Internal Compromise.

These levels correspond to observed attacker behaviours. In Level 1, the attack team will gather publicly available information about the target networks. In Level 2, the attack team probes the target’s security perimeter. In Level 3, the attack team perpetrates actual penetration attempts against the firewall and support systems. If the team is successful at Level 3, it proceeds to Level 4, in which team members attempt to subvert the firewall from within its own support infrastructure.
Omissions

The Moyer-Schultz methodology has two functional omissions. First, this methodology is, by design, biased towards external attackers. After all, most systems administrators and organization executives are concerned about external attackers. Arguably, external attackers pose the greatest non-tangible risk to an organization. Having the company’s name on the front page of the Wall Street Journal is terrific if the company has just fielded a wonderful new product; it is not so terrific if hackers have maliciously edited the company’s Web site. On the other hand, experience and research show that the greatest tangible threat is from insiders. The existing test methodology inadequately tests the firewall infrastructure’s resistance to internal attack.

Second, the Moyer-Schultz methodology does not address those attacks that are theoretically possible, but which the penetration team does not test. Every test has a list of potential attacks that are not
exercised, for a variety of reasons. For example, DNS Cache Corruption attacks may be very effective for attacking firewall infrastructure systems that support some kind of transitive trust relationship with other machines. DNS Cache Corruption attacks, however, also pose a high risk to the organization’s ongoing operations. The network penetration team will usually refrain from exercising these attacks. Other potential attack channels may simply be unavailable to the penetration team. For example, partner or vendor networks may be connected to the target’s firewall system. Attacks that are unsuccessful from the Internet may be extremely successful from a partner’s more trusted network. The existing methodology inadequately examines the existing design and configuration for theoretical attack channels.
Revised methodology - whole infrastructure analysis

I have created a new methodology, based on my previous work, that corrects these omissions. This new methodology is an infrastructure penetration analysis, rather than a firewall penetration analysis. The distinction is important because while a firewall may be resistant to attacks, the entire infrastructure, which is the firewall plus all critical support systems, may be vulnerable. The new methodology tests the entire infrastructure.

The methodology has four tiers of activity. Tier One is the external attacker. It is very similar to the existing Moyer-Schultz methodology. Tier Two is the internal attacker. Tier Three tests
synergistic attacks. Tier Four analyzes the design and support procedures for potential weaknesses. Firewall designers, integrators, and engineers should, using the methodology as a framework, be able to accurately predict the results from any third-party penetration test. The designer should be able to identify all existing vulnerabilities in the system. Comparing the field results with the theoretical results is a good measure of the testing organization’s skill and expertise.
Operational assumptions

The infrastructure penetration methodology is based on several operational assumptions. Running a test with different initial conditions will have lower, possibly significantly lower, validity. Bear in mind that these initial conditions are tailored for sites that have the highest possible security requirements. Sites with lower expected threat levels, or less to lose from a penetration, will be able to accept less-than-perfect initial conditions. Each site should, as mentioned earlier, tailor the penetration test for their specific risk environment. Under perfect conditions, and in high-risk environments, the following initial assumptions would apply.

• The penetration team has little or no prior knowledge of the network. The target organization requesting the penetration test should not release, or offer, network diagrams, IP address lists, administrator phone lists, and other useful information to the penetration team. Likewise, a professional team will not accept such information before they begin Tier Two activities. Personnel who reveal this information to a penetration team have greater interest in some goal other than evaluating the security in the test infrastructure. This assumption is easy to achieve; it should never be violated.

• The administrators, both security and system, in the target network have little or no prior knowledge about the penetration test timing, methodology, or techniques. Personnel who alert their internal staff when a penetration test begins skew the results; this indicates they have greater interest in some goal other than evaluating the true security in the system. This assumption is easy to achieve; it should never be violated.

• The penetration team will be attacking production systems. Admittedly, this is somewhat difficult to achieve. It should, however, be emulated as closely as possible to maximize validity. Sometimes production systems cannot be hooked up to the test infrastructure until the test is completed; this may be a prudent restraint. In these cases, it may be possible to emulate production systems in some fashion.

• The penetration team will be free to attack any internal system, using any technique. This is even more difficult to achieve than the previous assumption. Pragmatically, any system that may in some way be used to compromise the test infrastructure should be included in the test.

• Someone inside the incident response architecture at the target organization is fully aware of the testing methodology and procedures. This person can circumvent the normal reporting procedures; this is important to avoid having CERT/CC or law enforcement alerted when the penetration team is conducting operations and thus avoid undue embarrassment to all parties.

• The penetration test will succeed at some level. Management, or the sponsoring department or organization, should be aware, in advance, that only the most secure firewall systems and administrative procedures will prevent all attacks. For example, when the penetration team is testing whether the firewall system administrator can subvert the firewall, the expected result is a successful penetration. Architectures that resist penetration even by the caretakers of the machines are few and far between.
Tier One - external attacker

There are four activity levels within Tier One. They are Collection, Probe, Attack and Subvert. This is very similar to the entire Moyer-Schultz methodology. Other references describe these levels adequately; this paper will touch on them only briefly.

Collection

During Collection, the penetration team builds as much information about the target systems as possible. The first stop is the Network Information Centre (NIC) for information from the whois database. The second stop is the target’s Web site. The third stop is an information broker, such as Nexis, and the SEC Web site. Each of these resources provides much useful information about the target.

Probe

At the Probe level, the penetration team undertakes the first activities that we would reasonably expect the target organization to detect. This activity may be conducted by hand, or by some automated probe tool such as SATAN. Probes include DNS zone transfers from authoritative servers, address space scans, stealth scanning, host port scanning, and service connection probes. Naturally, I cannot discuss specific procedures and tools.

Attack

The penetration team actually attempts intrusions into the target systems at the Attack level. The specific tools vary from team to team, but all commonly detected attack mechanisms should be used. Again, I must refrain from discussing particulars at this stage. Popular and easily targeted services are NIS, NFS, and Sendmail, though others are useful as well. Increasingly, attackers can successfully use the HTTP daemon to compromise a system.

Subvert

Finally, the penetration team attempts to subvert the services on the firewall itself. The team can only accomplish this if they have been successful at penetrating the systems at the Attack level.

The expected result from Tier One testing is failure to penetrate. Firewall architecture and engineering is a reasonably mature technology. Any commercial firewall system, when configured correctly, should be resistant to Tier One penetration attempts. On some occasions, however, as reported by Schultz, intermediate systems will be misconfigured. These design or implementation errors may open unexpected exposures in the infrastructure.
Tier Two - internal attacker

There are essentially three kinds of internal attacker. The User is an unprivileged user on some internal system. The Developer is a programmer or DBA working in support of internal systems. The Administrator is a privileged internal user.

In addition to the three kinds of internal attacker, there are two Internal Zones of Attack. The Common Zone comprises all the systems not critical to the security of the test infrastructure. For example, a Human Resources Novell server is, or should be, in the Common Zone for a firewall system. The Proximate Zone comprises the firewall and all security-critical support systems. For example, a Proximate Zone may consist of the firewall, a Web server, an E-mail exchange host, a firewall administration workstation, the development systems supporting the E-mail host and Web server, and any administrative hosts supporting the firewall administration workstation.
Note here that the Proximate Zone is usually significantly larger than the system owners would first think. In the above example, the system owners would probably identify the firewall and Web server as Proximate Zone hosts. Depending on the organization’s skill, they might identify the mail exchange host and the firewall administration workstation. Few organizations have the security savvy, however, to correctly identify all the dependencies between support systems. For example, most organizations would not identify an rdist master as a Proximate Zone host. From a security perspective, however, such a machine is a Trojan horse insertion vector, and must be included in the Tier Two attacks.

The order of the Tier Two penetrations is important. The penetration team should test the Common Zone first, followed by the Proximate Zone. The team should conduct on-site interviews between the Common Zone and the Proximate Zone attacks to develop a full understanding of Proximate Zone architecture, networks and services.

Be aware that restricting the scope of Tier Two and higher attacks will alter the validity of the end results. Organizations that have a blanket policy of trusting their entire administration staff to behave themselves are, perhaps knowingly, exposing themselves to significant risk. Systems administration staff have been known to subvert security measures. The three critical elements for any information system attack are opportunity, ability and motive. Systems administration staff frequently have all three.

The tools and methods used in Tier Two are very similar to those in Tier
One. The difference is that the penetration team starts with the one thing most external attackers will not have: access. Access is the first, and, unfortunately, usually the last significant hurdle an external attacker needs to overcome before having essentially free run of the target network. In Tier Two, the penetration team already has this access, just as a real inside attacker would. In many cases, it will not take long for the penetration team to escalate their privilege on the internal network. The critical test is whether the team can circumvent the firewall infrastructure from the inside.

For the Common Zone, the expected result is failure to penetrate. Success in Tier Two, Common Zone indicates an insecure trust relationship between critical infrastructure systems and the inside network. These can be very subtle indeed. For example, a public CD-ROM production machine could allow an attacker to subvert a read-only Web site!

The expected result from Tier Two, Proximate Zone penetration attempts is successful penetration. Frequently, mere users in the Proximate Zone systems will be able to subvert the system. This highlights the need for, and general lack of, strong user access controls in Proximate Zone systems. Certainly, developers can insert Trojan horses into Proximate Zone systems. This means, by transitive trust, anyone who can break into a developer’s account, or sniff a developer’s password off an internal subnet, may be able to circumvent the security in the system.

Success in Tier Two does not necessarily present a problem for the target organization. Perhaps an architecture that permits successful Tier Two penetrations is adequate for the organization and its goals. This, however, is a risk decision for the target organization. The penetration team can only say whether a given penetration vector is viable in the target architecture; it is the organization’s responsibility to determine if it is or is not acceptable.
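The observation that the Proximate Zone is usually larger than its owners think can be illustrated with a small sketch. It assumes, purely for illustration, that the on-site interviews produce a map from each host to the hosts it depends on or trusts; the host names and the `proximate_zone` helper below are hypothetical and are not part of the methodology itself. The Proximate Zone is then approximated as everything reachable from the obviously critical hosts.

```python
from collections import deque


def proximate_zone(seed_hosts, depends_on):
    """Return every host reachable from the seeds by following dependency edges."""
    zone = set(seed_hosts)
    queue = deque(seed_hosts)
    while queue:
        host = queue.popleft()
        for dependency in depends_on.get(host, ()):
            if dependency not in zone:
                zone.add(dependency)
                queue.append(dependency)
    return zone


# Hypothetical inventory: host -> hosts that can change or influence it.
depends_on = {
    "firewall": ["fw-admin-ws"],
    "web-server": ["web-dev", "rdist-master"],
    "mail-exchange": ["mail-dev"],
    "fw-admin-ws": ["admin-host"],
    "hr-novell": [],
}

# The hosts the system owners would name without prompting.
seeds = ["firewall", "web-server", "mail-exchange"]
zone = proximate_zone(seeds, depends_on)
print(sorted(zone))
```

In this invented inventory the rdist master ends up in the Proximate Zone even though nobody named it, while the Human Resources server stays in the Common Zone. The result is only as complete as the dependency map that feeds it.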
Tier Three - collusive attacks

Tier Three comprises collusive attacks, wherein an insider, working with an outsider, attempts to either circumvent the system, or transmit information through the target infrastructure. It is entirely conceivable that the penetration team will see Tier One failure, and Tier Two success, followed by an external attacker successfully penetrating the architecture by working synergistically with insiders.

An excellent example of a Tier Three attack is protocol tunnelling or piggybacking over HTTP. The insider opens an ostensibly permissible HTTP connection from an inside server to a compatriot’s hostile external daemon running on port 80. The firewall infrastructure, seeing a permitted outbound HTTP connection, allows the connection. The inside ‘client’, however, is really a telnet daemon in disguise. It opens a connection to the external server, through which the external attacker can execute code on the internal ‘client’ machine. Infrequently monitored HTTP proxies can be abused in a similar fashion. The attacker in this case simply encases the telnet session in the data channels within HTTP so it looks correct to the intermediate proxy. This piggybacking mechanism results in a lower bandwidth channel, but it is still possible. As you can imagine, Tier Three attacks can take place
through any supported communications or service channel permitted through the firewall infrastructure. The expected result from Tier Three is a successful penetration.
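The expected results stated for the first three Tiers can be tabulated and compared against what a test actually finds. The following is a minimal sketch; the stage labels, the `unexpected_results` helper and the sample field results are illustrative assumptions, with `True` meaning that the penetration succeeded.

```python
# Expected outcome per stage, as stated in the text (True = penetration succeeds).
EXPECTED = {
    "Tier One": False,
    "Tier Two, Common Zone": False,
    "Tier Two, Proximate Zone": True,
    "Tier Three": True,
}


def unexpected_results(observed):
    """Return the stages whose observed outcome differs from the expected one."""
    return {
        stage: outcome
        for stage, outcome in observed.items()
        if stage in EXPECTED and EXPECTED[stage] != outcome
    }


# Hypothetical field results from one test.
observed = {
    "Tier One": False,
    "Tier Two, Common Zone": True,
    "Tier Two, Proximate Zone": True,
    "Tier Three": True,
}
surprises = unexpected_results(observed)
print(surprises)
```

Here the only surprise is a successful Common Zone penetration, which the text associates with an insecure trust relationship between critical infrastructure systems and the inside network.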
Tier Four - oversight detection

Tier Four is designed to catch any possible penetration channels the penetration team misses, for whatever reason, during the first three Tiers. The first step is a design review of the entire firewall infrastructure. The second step is an implementation review. The third step is a review of the administrative procedures. The final step is a brief analysis of external legal exposures.

Design Review

During Design Review, the penetration team looks for possible attack channels in the original design specification for the system infrastructure. The team will interview designers, implementers and administrators to determine if a Tier One, Two, or Three attack may be possible through a previously unknown or untested channel. For example, Tier Three attacks via a socket availability covert channel may fail miserably through an Internet connection. It may be possible, however, to run the very same attack successfully from the target organization to the target organization’s advertising firm through a more trusted network connection.

Implementation Review

During the Implementation Review, the penetration team looks for differences between what should be, according to the design documentation, and what is, as observed in the target
systems. This will highlight any possible change control failures or procedural mistakes in the implementation phase of building the infrastructure. For example, network administrators may permit all traffic between certain protected networks in order to troubleshoot a network problem, then forget to re-enable the packet filters. Unfortunately, this is not an uncommon problem; system owners are frequently surprised at the traffic that is permitted through their own firewall systems.

One critical step in the design and implementation reviews is to draw the Common, Proximate, Firewall, and External Zones. Tests against production systems indicate that a common problem is that the Proximate and Common Zones overlap. This permits internal attackers to compromise security-critical systems. Another common problem is that the External and Firewall Zones overlap. Usually this means an external router is somehow exposed to attack. Naturally, this is undesirable. Some of these design or implementation problems may not be detected by penetration teams working through the first three Tiers.

Administrative Procedure Review

The penetration team looks for change control, backup, Web page generation, user access, or other administrative problems during the Administrative Procedure Review. Internal attackers can exploit, sometimes with shocking ease, procedural weaknesses. For example, some organizations separate the backup and recovery functions from the system administration function. It is possible, in such an architecture, for a user to walk up to an operations technician, hand them a tape, and ask that certain
files be restored on the firewall. Unless good procedures are in place, the operations staff may very well restore, or install, unauthorized software or services onto a security-critical system.
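The zone-drawing step of the design and implementation reviews lends itself to a simple mechanical check once each zone is written down as a set of hosts. The sketch below is an assumption about how one might record the zones; the host names and the `zone_overlaps` helper are hypothetical.

```python
from itertools import combinations


def zone_overlaps(zones):
    """Return the hosts shared by each pair of zones that overlap."""
    overlaps = {}
    for (name_a, hosts_a), (name_b, hosts_b) in combinations(zones.items(), 2):
        shared = set(hosts_a) & set(hosts_b)
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


# Hypothetical zone membership, as drawn during the review.
zones = {
    "Common": {"hr-novell", "web-dev"},
    "Proximate": {"web-server", "web-dev", "fw-admin-ws"},
    "Firewall": {"firewall", "external-router"},
    "External": {"external-router", "isp-gateway"},
}

for pair, hosts in zone_overlaps(zones).items():
    print(pair, sorted(hosts))
```

With this invented membership the check reports the two overlaps the text singles out: a development host sitting in both the Common and Proximate Zones, and an external router shared between the Firewall and External Zones.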
Legal considerations

In some organizations, trusted outside connections all converge on a single router. Firewalls are reasonably good at detecting IP spoofing attacks against themselves. Incorrectly configured external routers, however, may permit such attacks between the target organization’s connected partners and vendors. This may create many side effects, such as downstream liability or competitive advantage issues.
Mitigating time sensitivity

Firewall penetration tests tend to be snapshots of the system security posture at a given point in time. The penetration team can mitigate this weakness by re-conducting the Tier Four analysis three to six months after the initial penetration test is complete. This step provides critical information regarding actual, as opposed to procedural, management in the firewall infrastructure. It is educational, and sometimes amusing, to compare the firewall configuration, as documented, with the firewall configuration, as actually observed in production. Without good change management procedures, it is very likely that the firewall will be configured differently than the administrators believe. The nature of the changes will provide insight into the probability of a security-significant, but undocumented, change to the system.
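The comparison between the configuration as documented and the configuration as observed can be sketched as a set difference. The rule representation below, a tuple of action, protocol, source, destination and port, is an assumption made for illustration and does not correspond to any particular firewall product; the `config_drift` helper and the sample rules are likewise hypothetical.

```python
def config_drift(documented, observed):
    """Compare two rule sets and report the rules that differ."""
    documented, observed = set(documented), set(observed)
    return {
        # In production but not in the documentation.
        "undocumented": sorted(observed - documented),
        # In the documentation but not in production.
        "missing": sorted(documented - observed),
    }


# Hypothetical rules: (action, protocol, source, destination, port).
documented_rules = [
    ("permit", "tcp", "any", "web-server", 80),
    ("permit", "tcp", "any", "mail-exchange", 25),
    ("deny", "ip", "any", "any", 0),
]
observed_rules = [
    ("permit", "tcp", "any", "web-server", 80),
    ("permit", "ip", "partner-net", "any", 0),
    ("deny", "ip", "any", "any", 0),
]

drift = config_drift(documented_rules, observed_rules)
for kind, rules in drift.items():
    for rule in rules:
        print(kind, rule)
```

An undocumented permit rule of the kind shown here is the sort of troubleshooting leftover described under the Implementation Review. A real comparison would also need to account for rule ordering, which this sketch ignores.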
Conclusions

Properly conducted penetration testing is a viable mechanism for providing a practical, “where the rubber meets the road” assessment of a given firewall implementation’s overall strength. Management should remember, however, to clearly and thoroughly understand the limitations of a given test when considering the results.
References

Moyer, P.R. and Schultz, E.E., 1996. A Systematic Methodology for Firewall Penetration Testing. Network Security, March 1996.
Perrow, C., 1984. Normal Accidents: Living With High-Risk Technologies. Basic Books, pp. 72-79, 93, 94.
Schultz, E.E., 1997. When Firewalls Fail: Lessons Learned From Firewall Testing. Network Security, February 1997.

Managing Network Security - Part 5: Risk Management or Risk Analysis

Fred Cohen

Over the last few years, computing has changed to an almost purely networked environment, but the technical aspects of information protection have not kept up. As a result, the success of information security programmes has increasingly become a function of our ability to make prudent management decisions about organizational activities. This series of articles takes a management view of protection and seeks to reconcile the need for security with the limitations of technology.

Huh?

As one of my co-workers once said: “risk analysis... risk management... it’s all the same thing”. I tactically retreated and made strategic plans to provide additional information later. This is later.

• Risk analysis - at least classical risk analysis - consists of (1) gathering facts, assumptions and estimates and (2) making calculations based on that information to generate results including expected loss and the cost effectiveness of various mitigation techniques.

• Risk management - at least as it is commonly practised - consists of (1) gathering facts, assumptions and estimates, and (2) making decisions about which risks to take.

The risk analysis people may chime in here and tell me that risk analysis is HOW risk management makes these decisions. I can only speak of this from my experience. I have been involved in many management decisions and I have never seen anyone make a management decision relating to information protection based solely on the result of a quantitative risk analysis. Risk analysis may contribute to the decision process, but ultimately, that’s not how decisions are made.

So how are they made? If you were in my office, you would see me chuckle as I wave my hands about in response. But before you rush off and figure it’s all smoke and mirrors, I’d better tell you that it’s really not all just hand waving. In fact, as you will see, hand waving plays just as much a part in risk analysis for information protection as it does in risk management. The real difference is that in risk analysis, the hand waving is hidden within calculations, while in risk management we wave our hands in front of everyone and call it good (or bad) judgement.
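To make the calculation side of classical risk analysis concrete, here is a minimal sketch of an expected-loss computation. It assumes the usual textbook form, in which each threat contributes its estimated probability multiplied by its estimated loss; every figure, threat name and helper function below is invented for illustration. The estimates themselves are exactly where the hand waving hides.

```python
def expected_loss(threats):
    """Sum of estimated probability times estimated loss over all threats."""
    return sum(t["probability"] * t["loss"] for t in threats)


def net_benefit(before, after, mitigation_cost):
    """Reduction in expected loss bought by a mitigation, minus its cost."""
    return expected_loss(before) - expected_loss(after) - mitigation_cost


# Invented yearly estimates; none of these numbers comes from the article.
before = [
    {"name": "web site defacement", "probability": 0.5, "loss": 10_000},
    {"name": "insider data theft", "probability": 0.25, "loss": 40_000},
]
# The same threats after a hypothetical mitigation halves the insider risk.
after = [
    {"name": "web site defacement", "probability": 0.5, "loss": 10_000},
    {"name": "insider data theft", "probability": 0.125, "loss": 40_000},
]

print(expected_loss(before))
print(net_benefit(before, after, mitigation_cost=2_000))
```

Changing any one probability estimate changes the answer in direct proportion, so the result carries no more certainty than the guesses that went into it.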
Network risk analysis

Nobody really knows an exact way to analyse risks and risk mitigation strategies in a networked environment, but by applying standard risk assessment techniques, we can create a framework for analysis.